CS360 Lecture notes -- Doubly linked lists

  • Jim Plank
  • Directory: /blugreen/homes/plank/cs360/notes/Dlists
  • This file: http://www.cs.utk.edu/~plank/plank/classes/cs360/360/notes/Dlists/lecture.html

    Compiling

    In order to use the dlist library, you should include the file "dlist.h", which can be found in /blugreen/homes/plank/cs360/include. Instead of including the full path name in your C file, just do:
    #include "dlist.h"
    
    and then compile the program with:
    gcc -I/blugreen/homes/plank/cs360/include
    
    Also when you link your object files to make an executable, you need to include /blugreen/homes/plank/cs360/objs/dlist.o.

    The makefile in this directory does both of these things for you. When you look over the file dlistex.c, make sure you figure out how to compile it so that it finds dlist.h, and so that the compilation links dlist.o.


    Dlists

    The dlist library manipulates doubly-linked lists, which use the following C structure for each node in the list:
    typedef struct dlist {
      struct dlist *flink;
      struct dlist *blink;
      void *val;
    } *Dlist;
    
    The fields of this structure are as follows:

  • Flink -- a pointer to the next node in the list
  • Blink -- a pointer to the previous node in the list
  • Val -- a pointer to the node's value.

    All Dlists have a ``sentinel'' node. This is a dummy node at the head of the list -- it makes writing code much easier. The lists are always circular as well -- meaning that the last node points back to the sentinel node.

    To use a dlist, you declare it as follows:

      Dlist d;
    
    Then, to create one, you call make_dl():
      d = make_dl();
    
    Now d is a pointer to an empty list, which consists of one node -- the sentinel. This node's flink and blink fields point to itself, and the val field is uninitialized:
        d--->|-------|<--\
             | flink |---/
             | blink |---\
             | val=? |<--/
             |-------|
    
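    To insert new nodes into a list, you use the procedures dl_insert_b() and dl_insert_a(). dl_insert_b() inserts a new node before the node you give it, and dl_insert_a() inserts one after it. Suppose we have the empty list d as above. Then we can insert two strings "1" and "2" into the list as follows: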
       dl_insert_b(d, "1");
       dl_insert_b(d, "2");
    
    The first dl_insert_b() puts "1" at the end of the list, and the second one puts "2" at the end of the list. After these calls are complete, the list looks as follows:
       /----------------------------------------------------\
       | /-----------------------------------------------\  |
       | |   d --> |-------|   |-------|   |-------|     |  |
       | \-------->| flink |-->| flink |-->| flink |-----/  |
       \-----------| blink |<--| blink |<--| blink |<-------/
                   | val=? |   |  val  |   |  val  |
                   |-------|   |---|---|   |---|---|
                                   |           |
                                    \           \
                                     --> "1"     --> "2"
    
    Thus, if we call printf("%s\n", d->flink->val), it would print "1". Similarly, printf("%s\n", d->blink->val) will print "2". Now, if we call
        dl_insert_a(d->flink, "Jim"); 
    
    then a new node with the string "Jim" as a value will be created and inserted between the "1" and "2" nodes:
       /----------------------------------------------------------------\
       | /-----------------------------------------------------------\  |
       | |   d --> |-------|   |-------|   |-------|   |-------|     |  |
       | \-------->| flink |-->| flink |-->| flink |-->| flink |-----/  | 
       \-----------| blink |<--| blink |<--| blink |<--| blink |<-------/ 
                   | val=? |   |  val  |   |  val  |   |  val  |
                   |-------|   |---|---|   |---|---|   |---|---|
                                   |           |           |
                                    \           \           \
                                     --> "1"     --> "Jim"   --> "2"
    
    So now:
       printf("%s\n", d->flink->val) will print "1"
       printf("%s\n", d->flink->flink->val) will print "Jim"
       printf("%s\n", d->flink->flink->flink->val) will print "2"
       printf("%s\n", d->blink->val) will print "2"
       printf("%s\n", d->blink->blink->val) will print "Jim"
       printf("%s\n", d->blink->blink->blink->val) will print "1"
    
    To print out a generic list of strings, you can use the following for loop:
       /* d and tmp are both defined to be Dlists */
    
       for (tmp = d->flink; tmp != d; tmp = tmp->flink) {
         printf("%s\n", tmp->val);
       }
    
    Note that reaching the sentinel ends the loop. In dlist.h, the following macros are defined:
    #define nil(l) (l)
    #define first(l) (l->flink)
    #define last(l) (l->blink)
    #define next(n) (n->flink)
    #define prev(n) (n->blink)
    
    You can use these if you feel like it -- they help a little for program readability. More importantly, there is a macro dl_traverse:
    #define dl_traverse(ptr, list) \
      for (ptr = first(list); ptr != nil(list); ptr = next(ptr))
    
    This lets you put list traversal into your code simply and clearly. For example, the above list printing loop becomes:
      dl_traverse(tmp, d) printf("%s\n", tmp->val);
    
    If you want the body of the loop to have several statements, you can use curly braces with dl_traverse:
      i = 0;
      dl_traverse(tmp, d) {
        i++;
        printf("List element %d: %s\n", i, tmp->val);
      }
    
    To delete nodes from a list, use dl_delete_node(Dlist n). For example, to delete the node that contains the string "Jim" from the above list, you can do:
      dl_delete_node(d->flink->flink);
    
    Now the list will look like:
       /----------------------------------------------------\
       | /-----------------------------------------------\  |
       | |   d --> |-------|   |-------|   |-------|     |  |
       | \-------->| flink |-->| flink |-->| flink |-----/  |
       \-----------| blink |<--| blink |<--| blink |<-------/
                   | val=? |   |  val  |   |  val  |
                   |-------|   |---|---|   |---|---|
                                   |           |
                                    \           \
                                     --> "1"     --> "2"
    
    To delete a list entirely, use dl_delete_list(d). For example,
      dl_delete_list(d);
    
    will delete the above list. Note that the dlist library calls malloc to allocate nodes for the lists. dl_delete_node() and dl_delete_list() both call free() to free up deleted nodes for reuse. Thus, suppose we have the following list again:
       /----------------------------------------------------------------\
       | /-----------------------------------------------------------\  |
       | |   d --> |-------|   |-------|   |-------|   |-------|     |  |
       | \-------->| flink |-->| flink |-->| flink |-->| flink |-----/  |
       \-----------| blink |<--| blink |<--| blink |<--| blink |<-------/
                   | val=? |   |  val  |   |  val  |   |  val  |
                   |-------|   |---|---|   |---|---|   |---|---|
                                   |           |           |
                                    \           \           \
                                     --> "1"     --> "Jim"   --> "2"
    
    And suppose I have a Dlist tmp. Then if I do:
       tmp = d->flink->flink;
       dl_delete_node(d->flink->flink);
    
    Then I cannot use the value of tmp any longer. Why? Because it points to storage that has been freed. However, since the linked list's structure is maintained, d->flink->flink is now the node with the value of "2".

    An example program: dlistex.c

    The code in dlistex.c has all the above examples in it. Copy all the .c files and the makefile to a directory of your own and type make. Then run dlistex, and make sure you understand why the output is the way it is.

    Also, look over the file /blugreen/homes/plank/cs360/include/dlist.h for complete descriptions of each procedure defined.


    The val field

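    Since val is a (void *), it can hold any pointer, not just a (char *) as above. You can also have it hold an int if you want, as long as an int fits in a pointer. These notes assume a machine where int's and pointers are both 4 bytes. On most 64-bit systems a pointer is 8 bytes and an int is 4, so an int still fits, but the conversion should go through explicit casts.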

    Look at the file "hosts". This file contains IP addresses and host names for 571 different machines around campus. Suppose we want to read in this file and print it out backwards in a more formatted way. Then we can read the file into a dlist, where the val field is a pointer to a structure with two fields -- one for the host name, and one for the IP address. The program is in hostread.c. Make sure you understand this program before you undertake the lab. If you don't understand any part of the program (for example, the malloc statement, or the scanf statement, or the strdup statement), first read the man page, then ask a TA.

    Try the program out on the file hosts:

    hostread < hosts > output
    
    vi the output file -- is it as you expected?

    Memory bugs

    Finally, look at bugread.c. It is exactly the same as hostread.c except the strdup()'s are omitted. Compile this program and run it on the file hosts:
    bugread < hosts > bugoutput
    
    vi the bugoutput file -- is it as you expected?

    What has happened? Since strdup() isn't called, every structure in the dlist points to places in is->text2. Thus, when the dlist is printed out, every line is a variation of the last IP address and host read in: "128.169.224.8" and "oper-8-utk.edu".

    Make sure you look at this example -- it is a good example of how a simple omission can give you what looks like gibberish for output.

    These kinds of bugs happen often -- if you want to save a string that may be overwritten, you must make a copy of the string, not a copy of the pointer.