Lists/Queues/Stacks in Linux Kernel

If you have read my previous posts you must have been seen that I’ve started some articles on Linux kernel internals. I attempt to explain some topics around kernel beginning from fundamental and simple parts of the kernel and moving to advanced concepts. Personally the reason that I like playing with kernel is that I’m not satisfied with using high level codes and using operating system services, instead I want to know how exactly whatever I do with computers work down to the lowest level which is the kernel codes. In this article I introduce implementation of lists in Linux. Then I use these kernel lists to make a queue and also a stack. Linux kernel makes heavy use of data structures. Almost all important data in kernel space are stored and managed using data structures. These data structures have been designed with sustaining efficiency in mind and in fact the functions related to the data structures modifications in the kernel are acceptably efficient. Lists in Linux kernel are circular doubly linked list, that is each element of the list points to the next and also the previous element.


   .-------------------------------------
   .                                     |
   .      ----------      ---------      |
   v      |        v      |       v      |
   --------        --------       --------
   | Node |        | Node |       | Node |
   --------        --------       --------
   |      ^        |      ^       |      ^
   |      ---------       --------       .
   |                                     .
   |                                     .
   ---------------------------------------

The interesting implementation feature of Linux kernel lists is that there is only one generic list node type that any list can use as its node member. The list node itself should be contained in another structure which can be defined to be anything. To understand what I mean look at the figure below.


          ------------------------           ------------------------
          |                      |           |                      |
          |    custom struct     |           |    custom struct     |
          |     definitions      |           |     definitions      |
          |                      |           |                      |
  prev    |    ..............    |   next    |    ..............    |  next                      
<--------------| list_head |-------------------->|  list_head |-------------->  ...
          |    ..............<--------------------..............    |           
          |                      |   prev    |                      |
          ------------------------           ------------------------
                 my_struct                           my_struct

Suppose that I want to have a list of a custom structure named “my_struct”. my_struct can have any number of fields of any type. Now if I want to connect the instances of my_struct structure I can leverage the kernel list implementation. Each list node in the kernel is of type “struct list_head”. To make the list of my_struct instances I should have an element of this type defined inside my_struct. This is the interesting part; the nodes are inside the actual structure not that the structure members (in other words node data) are inside the list node. “struct list_head” type is defined in “/include/linux/types.h”. There are some functions and macros in the kernel that work on the variables of this type (effectively tampering with kernel lists). First ones are the initialization macros. You can initialize a list by defining a variable of type list_head through LIST_HEAD macro. This macro and other list related macros and methods reside at “include/linux/list.h”. If you wanna initialize a node in a custom structure (like my_struct introduced previously) you can use LIST_INIT_HEAD macro. In the following examples we use the former to initialize our list, the reason is that the first element of the list is considered a general pointer to the list as a whole and is not needed to be contained in the wrapping structure.  The rest of the nodes that are members of my_struct are added to list and don’t need going through initialization phase. To add a node to list you can use list_add function. This function gets 2 arguments. The first one is the address of the new node to add to the list. The second one is the node that the new node is going to be appended immediately after. Another function that we use in our example is list_add_tail that works similar like to list_add with the sole difference that it inserts the new node before the node specified by the second argument not after. list_add works like this:

It receives a new node (which is going to be added to our list). We call this node “new”. It also receives the node after which new is going to be added; we call this node target. list_add then calls __list_add function. This function gets the new node to be added (the node we named new) and two other nodes between which new is going to be inserted. Say that they are node number 1 and node number 2. Remember lists in kernel are doubly linked lists so even a list with a single node (the head of the list) has its “prev” and “next” pointers pointing to a valid entry which in this case is the list head itself. The function __list_add puts the new node after “1” and before “2”, making all the three nodes’ pointers have correct values. list_add call __list_add like __list_add(new,target,target->next) and list_add_tail calls the same function but slightly differently; __list_add(new,target->prev,target). Clear?
So this is how list_add(new,target) works:


                                                                -----
       .-------------.                                  .------>| 2 |-------.
       |             v                                  |       -----       .
     -----         -----                                |                   v
     | H |         | 1 |      Add node "2"            -----               -----
     -----         -----     =============>           | H |               | 1 |
       |             ^                                -----               -----
       ---------------                                  ^                   |
                                                        ---------------------

And this is how list_add_tail(new,target) works:


                                                                -----
       .-------------.                                  .------>| 2 |-------.
       |             v                                  |       -----       .
     -----         -----                                v                   |
     | H |         | 1 |      Add node "2"            -----               -----
     -----         -----     =============>           | H |               | 1 |
       |             ^                                -----               -----
       ---------------                                  |                   ^
                                                        ---------------------

To traverse the list you should move through the pointers of the members. There is a macro in kernel that simplifies this task and it’s also defined in the previously mentioned list.h file.

/**
* list_for_each_prev - iterate over a list backwards
* @pos: the &struct list_head to use as a loop cursor.
* @head: the head for your list.
*/
#define list_for_each_prev(pos, head) \
for (pos = (head)->prev; pos != (head); pos = pos->prev)

If you use the list head to start traversing you will get the results in the reversed order compared to the one they have been added to the list. This is how I made an “stack”. But if you add each new element to end of the list using list_add or add each new node to the list head using the function “list_add_tail”, traversing the list would yield a FIFO list or a queue. The reason should be clear if you carefully look at the schematics I have provided above. Now to see the real execution of what I have talked about and test our guess, I’ve prepared a code and put it in the kernel by wrapping them inside a custom syscall. Then by calling the syscall from userspace we see the result of running the code. the full code inside the syscall (which I named list_test) is a follows:


struct my_struct{
    int n;
    struct list_head node;
};

SYSCALL_DEFINE0(list_test){
    int i;
    struct list_head *h;
    struct list_head *last;
    LIST_HEAD(head);
 
    //Make a stack 
    printk("Making a stack\n");    
    for (i=0;i<3;i++){ //Let's only add 3 members struct my_struct *new_node = (struct my_struct *)kmalloc(sizeof(struct my_struct),GFP_KERNEL); new_node->n = (i+1);
        list_add(&new_node->node,&head);
        
    }
    //now read the memebers
    printk("Members: \n");    
    list_for_each(h,&head){
        struct my_struct *element = list_entry(h,struct my_struct,node);
        printk("%d ",element->n);
    }
    //clean up
    list_for_each(h,&head){
        struct my_struct *element = list_entry(h,struct my_struct,node);
        kfree(element);
    }
    
    //Now make a queue
    printk("Making a queue/method 1\n");
    head.prev=&head;
    head.next=&head;
    for (i=0;i<3;i++){ //Let's only add 3 members struct my_struct *new_node = (struct my_struct *)kmalloc(sizeof(struct my_struct),GFP_KERNEL); new_node->n = (i+1);
        list_add_tail(&new_node->node,&head);
        
    }
    //now read the memebers
    printk("Members: \n");    
    list_for_each(h,&head){
        struct my_struct *element = list_entry(h,struct my_struct,node);
        printk("%d ",element->n);
    }
    //clean up
    list_for_each(h,&head){
        struct my_struct *element = list_entry(h,struct my_struct,node);
        kfree(element);
    }
    
    
    
    printk("Making a queue/method 2\n");
    head.prev=&head;
    head.next=&head;
    last=&head;
    for (i=0;i<3;i++){ //Let's only add 3 members struct my_struct *new_node = (struct my_struct *)kmalloc(sizeof(struct my_struct),GFP_KERNEL); new_node->n = (i+1);
        list_add(&new_node->node,last);
        last=&new_node->node;
    }
    //now read the memebers
    printk("Members: \n");    
    list_for_each(h,&head){
        struct my_struct *element = list_entry(h,struct my_struct,node);
        printk("%d ",element->n);
    }
    //clean up
    list_for_each(h,&head){
        struct my_struct *element = list_entry(h,struct my_struct,node);
        kfree(element);
    }
    printk("returning");
    return 0;
}

After compiling the kernel and running it, we get this result from running the code:


 Making a stack
 Members:
 3
 2
 1
 Making a queue/method 1
 Members:
 1
 2
 3
 Making a queue/method 2
 Members:
 1
 2
 3

 

In future articles I’ll show an example on how to work with processes descriptors. These descriptors are also implemented as linked list and having learned what explained in this post is helpful to understand the related concepts.

One thought on “Lists/Queues/Stacks in Linux Kernel

  1. Howdy! I’m at work browsing your blog from my new iphone 4!

    Just wanted to say I love reading your blog and look forward to all your posts!
    Keep up the outstanding work!

Leave a Reply

Your email address will not be published. Required fields are marked *