The details of singly linked lists are not very pretty, but they are slightly clever, so you should know them. But first the API.
Look in /home/cs140/spring-2004/include/sllist.h. This header file defines the programming interface to singly linked lists. There is a single struct that defines a data type called a Sllist:
typedef struct sllist {
  struct sllist *link;
  Jval val;
} *Sllist;

You've seen this before. Now, the following procedures create and manipulate Sllist's:
This structure is very common in linked list code for two reasons. First, it's nice to have a sentinel node so that empty lists have a clean implementation -- they are simply the sentinel node whose link field points to itself. Second, the sentinel removes the need for special code to deal with the ends of the list. You'll see how when you see the code.
So, for a few examples. When you call list = new_sllist(), you get a pointer to a Sllist struct whose link field points to itself:
list ----+->|---------|
         |  | link ------\
         |  | val = ? |  |
         |  |---------|  |
         |               |
         \---------------/

Note, the val field is uninitialized, and should not be used.
If we next call tmp = sll_prepend(list, new_jval_i(3)), then a node will be added to the list. It will look as follows:
tmp----------------------\
                         |
list ----+->|---------|  +----->|---------|
         |  | link ------/      | link --------\
         |  | val = ? |         | val.i=3 |    |
         |  |---------|         |---------|    |
         |                                     |
         \-------------------------------------/

Now, suppose you insert a new node after tmp with the call tmp2 = sll_insert_after(tmp, new_jval_i(9)). You then get:
tmp2-----------------------------------------\
tmp----------------------\                   |
                         |                   |
list ----+->|---------|  +----->|---------|  +--->|---------|
         |  | link ------/      | link ------/    | link --------\
         |  | val = ? |         | val.i=3 |       | val.i=9 |    |
         |  |---------|         |---------|       |---------|    |
         |                                                       |
         \-------------------------------------------------------/

Finally, if you call tmp = sll_prepend(list, new_jval_i(1)), you'll get a new node at the front of the list.
tmp2-------------------------------------------------------\
tmp----------------------\                                 |
                         |                                 |
list ----+->|---------|  +--->|---------|  /->|---------|  +->|---------|
         |  | link ------/    | link ------/  | link ------/  | link --------\
         |  | val = ? |       | val.i=1 |     | val.i=3 |     | val.i=9 |    |
         |  |---------|       |---------|     |---------|     |---------|    |
         |                                                                   |
         \-------------------------------------------------------------------/
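The sequence of calls in the pictures above can be replayed in a few lines of C. This is only a sketch: the Jval union is a cut-down stand-in for the course's real Jval, the list routines are re-created locally rather than taken from sllist.h, and build_example() is a hypothetical helper invented for the illustration.

```c
#include <stdlib.h>

/* Simplified stand-in for the course's Jval (assumption: two fields only). */
typedef union { int i; void *v; } Jval;

typedef struct sllist {
    struct sllist *link;
    Jval val;
} *Sllist;

static Jval new_jval_i(int i) { Jval j; j.i = i; return j; }

/* An empty list is a sentinel whose link points to itself. */
static Sllist new_sllist(void) {
    Sllist l = malloc(sizeof(struct sllist));
    if (l == NULL) exit(EXIT_FAILURE);
    l->link = l;
    return l;
}

static Sllist sll_insert_after(Sllist node, Jval val) {
    Sllist tmp = malloc(sizeof(struct sllist));
    if (tmp == NULL) exit(EXIT_FAILURE);
    tmp->val = val;
    tmp->link = node->link;
    node->link = tmp;
    return tmp;
}

static Sllist sll_prepend(Sllist l, Jval val) {
    return sll_insert_after(l, val);
}

/* Hypothetical helper: replays the calls from the diagrams, copies the
   values into out[] in list order, frees the list, returns the count. */
int build_example(int *out, int max) {
    Sllist list = new_sllist();
    Sllist tmp = sll_prepend(list, new_jval_i(3));   /* list: 3       */
    sll_insert_after(tmp, new_jval_i(9));            /* list: 3, 9    */
    sll_prepend(list, new_jval_i(1));                /* list: 1, 3, 9 */

    int n = 0;
    for (Sllist p = list->link; p != list && n < max; p = p->link)
        out[n++] = p->val.i;

    while (list->link != list) {      /* free every node but the sentinel */
        Sllist first = list->link;
        list->link = first->link;
        free(first);
    }
    free(list);                       /* then free the sentinel */
    return n;
}
```

Calling build_example(out, 8) should fill out[] with the values in the order shown in the last picture: 1, 3, 9.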
There is an important thing to note about this API (oh, API is an acronym for ``application programming interface.'' This is a buzzword that is bandied about quite a bit these days, so I thought I'd join in with the trend). Namely, some primitives are missing that we would probably like, such as sll_last() to return the last node in the list, sll_append() to put an item at the end of the list, and sll_delete_node() to delete an item from the list. As it turns out, you really can't implement these cleanly and efficiently, so they are best left out. The book does implement some of these, but its implementations are not efficient (or, in the case of deleting, not clean). The bottom line is that if you need these things, you would do best to use a different data structure (doubly linked lists).
first-name last-name U/G score

The U/G says whether the person is a graduate or undergraduate. Now, suppose you want to write a program that takes a grade file on standard input, and prints out the average for graduates, and then a listing of the graduate students plus their grades and their distance to the average, and then the average for undergraduates and a similar listing of students.
This is something that can be done fairly well with singly linked lists. We'll take each person, make a struct for that person that holds the person's name and grade, and then put that struct onto either a list for graduate students or a list for undergraduates.
The code is in grader.c. What it does is create two Sllists called grad and ugrad. It appends students to these lists. To append to the list, you must maintain a pointer to the last node on the list. This is gtmp for the graduate student list, and ugtmp for the undergraduates.
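The "keep a pointer to the last node" idea might look roughly like the sketch below. It is not the code from grader.c: the Person layout, the cut-down Jval, and the helper append_all() are all assumptions made for illustration, and the list routines are re-created locally so the fragment stands alone.

```c
#include <stdlib.h>

/* Simplified stand-in for the course's Jval (assumption). */
typedef union { int i; void *v; } Jval;

typedef struct sllist {
    struct sllist *link;
    Jval val;
} *Sllist;

/* Hypothetical record layout; the real Person in grader.c may differ. */
typedef struct {
    const char *fname;
    const char *lname;
    double score;
} Person;

static Jval new_jval_v(void *v) { Jval j; j.v = v; return j; }

static Sllist new_sllist(void) {
    Sllist l = malloc(sizeof(struct sllist));
    if (l == NULL) exit(EXIT_FAILURE);
    l->link = l;
    return l;
}

static Sllist sll_insert_after(Sllist node, Jval val) {
    Sllist tmp = malloc(sizeof(struct sllist));
    if (tmp == NULL) exit(EXIT_FAILURE);
    tmp->val = val;
    tmp->link = node->link;
    node->link = tmp;
    return tmp;
}

/* Hypothetical helper: appends people[0..n-1] in order by always
   inserting after the node that is currently last. */
Sllist append_all(Person *people, int n) {
    Sllist list = new_sllist();
    Sllist last = list;   /* in an empty list the sentinel acts as "last" */
    for (int i = 0; i < n; i++)
        last = sll_insert_after(last, new_jval_v(&people[i]));
    return list;
}
```

Because sll_insert_after() returns the new node, reassigning last on every call keeps the append at constant cost without ever walking the list.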
Each student is put into a struct, which is then entered into the list as a (void *). This is a legal thing to do, since C allows a pointer to any object (in this case a (Person *)) to be converted to a (void *) and back again without losing information.
Once the students are all read in, we need to calculate the averages and print out the students. Since we're doing this twice, it's best to do these things in procedures. return_avg() calculates the average of a list of students, and print_list() prints out the students.
Note that these make use of the macro sll_traverse():
#define sll_traverse(tmpnode, list) for (tmpnode = sll_first(list); \
                                         tmpnode != sll_nil(list); \
                                         tmpnode = sll_next(tmpnode))

What this does is substitute the for loop wherever it sees sll_traverse(). This may be a little confusing, but what it means is that the C preprocessor turns:

sll_traverse(tmp, d) {

into

for (tmp = sll_first(d); tmp != sll_nil(d); tmp = sll_next(tmp)) {

It's a nice way of traversing the list and having the code say what you're doing.
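As an illustration of sll_traverse() in use, here is one way an averaging routine could be written. The real return_avg() in grader.c is not shown in these notes, so average_score() below is a guess at its shape; the Person layout, the cut-down Jval, and the list_of() helper are likewise assumptions.

```c
#include <stdlib.h>

/* Simplified stand-in for the course's Jval (assumption). */
typedef union { int i; void *v; } Jval;

typedef struct sllist {
    struct sllist *link;
    Jval val;
} *Sllist;

/* Hypothetical record layout, reduced to what the example needs. */
typedef struct {
    const char *lname;
    double score;
} Person;

static Jval new_jval_v(void *v) { Jval j; j.v = v; return j; }
static Sllist sll_first(Sllist l) { return l->link; }
static Sllist sll_next(Sllist n)  { return n->link; }
static Sllist sll_nil(Sllist l)   { return l; }

#define sll_traverse(tmpnode, list) for (tmpnode = sll_first(list); \
                                         tmpnode != sll_nil(list); \
                                         tmpnode = sll_next(tmpnode))

static Sllist new_sllist(void) {
    Sllist l = malloc(sizeof(struct sllist));
    if (l == NULL) exit(EXIT_FAILURE);
    l->link = l;
    return l;
}

static Sllist sll_insert_after(Sllist node, Jval val) {
    Sllist tmp = malloc(sizeof(struct sllist));
    if (tmp == NULL) exit(EXIT_FAILURE);
    tmp->val = val;
    tmp->link = node->link;
    node->link = tmp;
    return tmp;
}

/* Hypothetical helper: builds a list holding people[0..n-1] in order. */
Sllist list_of(Person *people, int n) {
    Sllist list = new_sllist();
    Sllist last = list;
    for (int i = 0; i < n; i++)
        last = sll_insert_after(last, new_jval_v(&people[i]));
    return list;
}

/* Visits every node once; returns 0.0 for an empty list. */
double average_score(Sllist list) {
    Sllist tmp;
    double total = 0.0;
    int count = 0;

    sll_traverse(tmp, list) {
        Person *p = (Person *) tmp->val.v;   /* stored as a (void *) */
        total += p->score;
        count++;
    }
    return count == 0 ? 0.0 : total / count;
}
```

On an empty list sll_first() already equals sll_nil(), so the loop body never runs and no special case is needed for the traversal itself.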
Anyway, try out the code and see that it works. Note that the only reason that the output is sorted is that the input file is also sorted by grade. Grader does nothing to actually sort beyond separating the students into graduates and undergraduates.
UNIX> head gradefile
Betty Flintstone U 99.43
Pat Anderson U 98.56
Pat Fulmer U 96.77
Pat Ward G 96.01
Barney Fulmer U 94.80
Phil Rubble G 93.64
Wilma Rubble G 93.05
Bill Flintstone U 92.85
Dino Fulmer U 92.00
Fred Ward G 90.62
UNIX> grader < gradefile
Undergraduates: Average = 74.19
Betty Flintstone 99.43 25.24
Pat Anderson 98.56 24.37
Pat Fulmer 96.77 22.58
Barney Fulmer 94.80 20.61
Bill Flintstone 92.85 18.66
Dino Fulmer 92.00 17.81
...
Graduates: Average = 75.53
Pat Ward 96.01 20.48
Phil Rubble 93.64 18.11
Wilma Rubble 93.05 17.52
Fred Ward 90.62 15.09
...
UNIX>
Let's now place an additional requirement on the undergraduate and graduate lists that will require the flexibility of a list. Specifically let's say that the lists should be printed alphabetically by last name. To simplify matters, we will say that if two people have the same last name then it does not matter in which order they are printed out.
The easiest way to satisfy this requirement is to keep the lists in alphabetical order. This means that each time we add a person to the list we will need to insert that person in its proper place in alphabetical order. This in turn means that we will need to insert into the middle of the list. In order to insert a person we will traverse the list until we find a node whose last name is alphabetically greater than the person we are inserting. We will then insert the new person before this node (call it the greater node).
When we scan the list of operations provided by sllist.h we find that there is an insert_after operation but no insert_before operation. Hence, we will need to have a pointer to the node immediately preceding the greater node. In order to have this pointer available we will need to save a pointer to both the previous node in the list and the current node. Here is the code that accomplishes this task:
void insert_person(Person *p, Sllist student_list)
{
  Sllist prev_node = student_list;
  Sllist current_node;
  Person *current_person;

  sll_traverse(current_node, student_list) {
    current_person = (Person *)current_node->val.v;
    if (strcmp(p->lname, current_person->lname) < 0)
      break;
    else
      prev_node = current_node;
  }
  sll_insert_after(prev_node, new_jval_v((void *)p));
}
Notice that we start by setting prev_node to student_list. student_list points to the list's sentinel node. Making prev_node point to the sentinel node ensures that we will insert the new person in the proper place, even if the new person should be the first person in the list. If the new person should be first in the list, then the strcmp operation will return a result that is less than 0 when the new person is compared with the first node in the list. prev_node will point to the sentinel node in this case so the sll_insert_after command will place the new person after the sentinel node and hence make the new person the first node in the list.
In the most common case the new person will go somewhere in the middle of the list. In this case the current_node pointer will be incremented several times to point at the next node in the list. Before the pointer is incremented, prev_node is set to the current node so that we always have a pointer to the previous node. When the code finally finds a node whose last name is alphabetically greater than the new person, it will be able to use the previous node pointer to perform the sll_insert_after operation.
So what could go wrong with this code? Suppose that the new person's last name is alphabetically greater than any other node in the list. For example suppose we want to insert "Vander Zanden" into the following list:
list ----+->|---------|  +--->|-------------|  /->|-------------|
         |  | link ------/    | link ---------/  | link --------------\
         |  | val = ? |       | val="Brown" |    | val="Smith" |      |
         |  |---------|       |-------------|    |-------------|      |
         |                                                            |
         \------------------------------------------------------------/

The traversal of the list will come to an end after current_node visits "Smith" and the loop will exit without the strcmp ever returning a result less than 0. At this point prev_node will point to "Smith"'s node. That is exactly what we want, since "Vander Zanden" should be inserted after "Smith." In other words our code works even when the person should be inserted at the end of the list.
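The three cases discussed above (front, middle, and end of the list) can be checked with a small stand-alone sketch. insert_person() is the routine from these notes; everything around it is assumed for the illustration: the Person struct is reduced to a last name, Jval is a cut-down stand-in, and sorted_list() and last_names() are hypothetical helpers.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the course's Jval (assumption). */
typedef union { int i; void *v; } Jval;

typedef struct sllist {
    struct sllist *link;
    Jval val;
} *Sllist;

/* Hypothetical record: only the field the comparison needs. */
typedef struct { const char *lname; } Person;

static Jval new_jval_v(void *v) { Jval j; j.v = v; return j; }
static Sllist sll_first(Sllist l) { return l->link; }
static Sllist sll_next(Sllist n)  { return n->link; }
static Sllist sll_nil(Sllist l)   { return l; }

#define sll_traverse(tmpnode, list) for (tmpnode = sll_first(list); \
                                         tmpnode != sll_nil(list); \
                                         tmpnode = sll_next(tmpnode))

static Sllist new_sllist(void) {
    Sllist l = malloc(sizeof(struct sllist));
    if (l == NULL) exit(EXIT_FAILURE);
    l->link = l;
    return l;
}

static Sllist sll_insert_after(Sllist node, Jval val) {
    Sllist tmp = malloc(sizeof(struct sllist));
    if (tmp == NULL) exit(EXIT_FAILURE);
    tmp->val = val;
    tmp->link = node->link;
    node->link = tmp;
    return tmp;
}

/* Same logic as the insert_person() shown in the notes. */
void insert_person(Person *p, Sllist student_list) {
    Sllist prev_node = student_list;   /* start at the sentinel */
    Sllist current_node;

    sll_traverse(current_node, student_list) {
        Person *current_person = (Person *) current_node->val.v;
        if (strcmp(p->lname, current_person->lname) < 0) break;
        prev_node = current_node;
    }
    sll_insert_after(prev_node, new_jval_v((void *) p));
}

/* Hypothetical helper: inserts people[0..n-1] one at a time. */
Sllist sorted_list(Person *people, int n) {
    Sllist list = new_sllist();
    for (int i = 0; i < n; i++) insert_person(&people[i], list);
    return list;
}

/* Hypothetical helper: copies last names, in list order, into out[]. */
int last_names(Sllist list, const char **out, int max) {
    Sllist tmp;
    int n = 0;
    sll_traverse(tmp, list) {
        if (n == max) break;
        out[n++] = ((Person *) tmp->val.v)->lname;
    }
    return n;
}
```

Inserting "Smith", "Brown", "Vander Zanden", and "Adams" in that order exercises the empty-list case, a front insertion, an end insertion, and another front insertion; the resulting order should be Adams, Brown, Smith, Vander Zanden.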
The code for our sort program can be found in sorter.c. Here is the result of running it on gradefile:
UNIX> sorter < gradefile
Undergraduates: Average = 74.19
Pat Anderson 98.56 24.37
Fred Anderson 89.14 14.95
Barney Anderson 87.59 13.40
Phil Anderson 75.63 1.44
Wilma Anderson 63.53 -10.66
Bill Anderson 63.48 -10.71
John Anderson 62.49 -11.70
Dino Anderson 52.05 -22.14
Betty Anderson 51.65 -22.54
Betty Flintstone 99.43 25.24
Bill Flintstone 92.85 18.66
Phil Flintstone 87.43 13.24
...
Graduates: Average = 75.53
Phil Rubble 93.64 18.11
Wilma Rubble 93.05 17.52
Betty Rubble 80.89 5.36
Bill Rubble 73.04 -2.49
John Rubble 71.77 -3.76
Dino Rubble 68.64 -6.89
Barney Rubble 67.14 -8.39
Fred Rubble 61.32 -14.21
Pat Rubble 51.04 -24.49
Bill Summitt 89.32 13.79
Fred Summitt 88.23 12.70
Betty Summitt 83.71 8.18
Dino Summitt 82.24 6.71
...
First, new_sllist() merely creates and returns an empty list:
list ----+->|---------|
         |  | link ------\
         |  | val = ? |  |
         |  |---------|  |
         |               |
         \---------------/

Here's the code:
Sllist new_sllist()
{
  Sllist l;

  l = (Sllist) malloc(sizeof(struct sllist));
  l->link = l;
  return l;
}

Similarly, a list is empty only if l->link points to l. Therefore, sll_empty() is one line:
int sll_empty(Sllist l)
{
  return (l->link == l);
}

To insert a node after a given node, the code is the same regardless of whether that node is the sentinel, the first node, a middle node, or the last node. You simply create a new node, set the new node's link to the given node's link, and then set the given node's link to point to the new node. Note, it must be done in that order, or the rest of the list after the node will get lost!
Sllist sll_insert_after(Sllist l, Jval val)
{
  Sllist tmp;

  tmp = (Sllist) malloc(sizeof(struct sllist));
  tmp->val = val;
  tmp->link = l->link;
  l->link = tmp;
  return tmp;
}

Note, this works on the empty list when l is the sentinel. Work out the pointers for yourself if that is not clear to you.
Now, all sll_prepend(l, val) does is create a new node whose val field is val and insert it after the sentinel. Thus, sll_prepend() is equivalent to sll_insert_after() called on the sentinel. Again, the sentinel has made our life very simple:
Sllist sll_prepend(Sllist l, Jval val)
{
  return sll_insert_after(l, val);
}

The first node on the list is the one after the sentinel:
Sllist sll_first(Sllist l)
{
  return l->link;
}

and the node following a given node is the one pointed to by its link field:
Sllist sll_next(Sllist l)
{
  return l->link;
}

All that remains are sll_nil() and free_sllist(). sll_nil() returns the sentinel node. This is because the link field of the last node on the list points to the sentinel node. Moreover, when the list is empty, sll_first() also returns the sentinel node. The code is:
Sllist sll_nil(Sllist l)
{
  return l;
}

Finally, free_sllist(l) needs to free every node in l. It is done as follows:
void free_sllist(Sllist l)
{
  Sllist tmp;

  while (!sll_empty(l)) {
    tmp = sll_first(l);
    l->link = tmp->link;
    free(tmp);
  }
  free(l);
}

The code does the following. While there are still nodes on the list besides the sentinel node, you remove the first node from the list and free it. You continue until the list is empty (i.e. only the sentinel node remains). Then you free the sentinel.
Deletion fits in even less well. In order to delete a node, you need to have a pointer to the previous node on the list. Why? Because that is the only node on the list that has a pointer to the node to be deleted, and in order to delete it, you must change that pointer to point to the next node in the list. Thus, you must either
The bottom line is this. If you want to do list appending, you should do it as in grader.c. Even that looks inelegant to me. I would go as far as to say that if you need list appending or deletion of arbitrary nodes, then you should use a different data structure: a doubly linked list.
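To make the cost of deletion concrete, here is a sketch of what it would take. Neither routine below is part of sllist.h: sll_delete_after() and sll_find_prev() are hypothetical names invented for this illustration, as are the cut-down Jval and the list_of_ints() helper.

```c
#include <stdlib.h>

/* Simplified stand-in for the course's Jval (assumption). */
typedef union { int i; void *v; } Jval;

typedef struct sllist {
    struct sllist *link;
    Jval val;
} *Sllist;

static Jval new_jval_i(int i) { Jval j; j.i = i; return j; }

static Sllist new_sllist(void) {
    Sllist l = malloc(sizeof(struct sllist));
    if (l == NULL) exit(EXIT_FAILURE);
    l->link = l;
    return l;
}

static Sllist sll_insert_after(Sllist node, Jval val) {
    Sllist tmp = malloc(sizeof(struct sllist));
    if (tmp == NULL) exit(EXIT_FAILURE);
    tmp->val = val;
    tmp->link = node->link;
    node->link = tmp;
    return tmp;
}

/* Hypothetical helper: builds a list holding vals[0..n-1] in order. */
Sllist list_of_ints(const int *vals, int n) {
    Sllist list = new_sllist();
    Sllist last = list;
    for (int i = 0; i < n; i++)
        last = sll_insert_after(last, new_jval_i(vals[i]));
    return list;
}

/* Hypothetical: unlinks and frees the node that follows prev.
   Constant time, but the caller must already hold prev, and the
   node after prev must not be the sentinel. */
void sll_delete_after(Sllist prev) {
    Sllist victim = prev->link;
    prev->link = victim->link;
    free(victim);
}

/* Hypothetical: walks from the sentinel to find the node whose link
   points at node. Linear in the length of the list; returns NULL if
   node is not on the list. */
Sllist sll_find_prev(Sllist list, Sllist node) {
    Sllist prev = list;
    while (prev->link != node) {
        if (prev->link == list) return NULL;   /* wrapped around */
        prev = prev->link;
    }
    return prev;
}
```

Deleting a node when only a pointer to that node is in hand means paying for the linear walk in sll_find_prev() first, which is the inefficiency that a doubly linked list avoids.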