Tue Dec 1 10:46:03 EST 1998
Splay Tree Implementation
The splay tree API is in
/home/cs140/spring-2004/include/splay.h, and the
implementation is in
/home/cs140/spring-2004/src/splay.c.
Splay trees are nice because they are a self-adjusting binary search tree
structure (operations take O(log n) amortized time) that also gives you
flink and blink pointers, so that you can traverse the nodes of the tree
sorted by key.
You can take a look at some code that uses splay trees in
splay_test.c.
Other illustrative applications: These all do some form of sorting
on standard input using splay trees.
- strsort.c: Uses splay trees to sort
standard input lexicographically.
- strrsort1.c: Uses splay trees to sort
standard input lexicographically in reverse order. It does this by
traversing the tree in reverse order.
- strrsort2.c: Uses splay trees to sort
standard input lexicographically in reverse order. It does this by
creating a new comparison function revcomp, which simply returns
-strcmp(). Now the tree sorts in reverse order, so it is
traversed in the forward direction.
- strusort.c: Uses splay trees to sort
standard input lexicographically, and it removes duplicate lines. It
does this by checking for a line before inserting it into the tree.
- strisort.c: Uses splay trees to sort
standard input lexicographically, ignoring upper and lower case.
It does this by
creating a new comparison function ucomp, which duplicates
strcmp()'s functionality but ignores case.
- nsort.c: Uses splay trees to sort
like sort -n -- i.e. it treats each line as an integer, and
sorts it that way. If the lines are not integers, or there are duplicate
lines, anything goes.
- nsort2.c: Uses splay trees to sort
like sort -n, except that if two lines have the same atoi()
value, they are sorted lexicographically. This uses
splay_insert_gen().
- nsort3.c: Same as nsort2, but
it uses a two-level splay tree instead. See below for an explanation.
Try these on input_s and
input_n.
They are all very simple programs.
_str, _int, _dbl, _gen
The splay tree routines in splay.h/splay.c implement four
types of insertion/searching routines. The insertion routines are:
- Splay *splay_insert_str(Splay *s, char *key, Jval val):
insert into the tree using a standard character string as the
key. strcmp() is used as the comparison function. See
strsort.c for a simple example of
sorting standard input lexicographically with splay_insert_str().
Note that it returns a pointer to the new splay tree node. Also
note that if the key is already in the tree, then it still creates
a new node and puts it into the tree. No guarantees are made concerning
the relative ordering of duplicate keys.
Even though the key is a string, it will be converted into a Jval
in the splay tree node. Thus, if you want to get at the key of node
s, you should either use jval_s(s->key) or s->key.s.
- Splay *splay_insert_int(Splay *s, int key, Jval val):
insert into the tree using an integer as the key.
See nsort.c for an example of this.
- Splay *splay_insert_dbl(Splay *s, double key, Jval val):
insert into the tree using a double as the key.
- Splay *splay_insert_gen(Splay *s, Jval key, Jval val,
int (*func)(Jval, Jval)):
Now your key is a Jval. You provide a comparison function
func(), which takes two Jvals as arguments, and returns:
- a negative integer if the first key is less than the second.
- a positive integer if the first key is greater than the second.
- zero if the keys are equal.
This lets you do more sophisticated things than simply sorting with
integers and strings. For example,
strisort.c sorts strings but ignores case.
strrsort2.c sorts strings in reverse order.
Read these over.
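As a rough illustration of the kind of comparison function that splay_insert_gen() expects, here is a minimal sketch of a reverse-order and a case-insensitive comparison. It is not the code from strrsort2.c or strisort.c. The Jval union below is a cut-down stand-in declared only so the sketch compiles on its own (the real type comes from the course headers), and the function bodies are assumptions based on the descriptions above.

```c
#include <ctype.h>
#include <string.h>

/* Stand-in for the library's Jval union, reduced to a few fields.
   This is an assumption for illustration; real code would use the
   definition from the course headers instead. */
typedef union {
  int i;
  double d;
  char *s;
} Jval;

/* Reverse lexicographic order: negate strcmp()'s result. */
int revcomp(Jval a, Jval b)
{
  return -strcmp(a.s, b.s);
}

/* Case-insensitive order: walk both strings, folding each character
   to lower case, and stop at the first difference or at the end of a. */
int ucomp(Jval a, Jval b)
{
  const unsigned char *p = (const unsigned char *) a.s;
  const unsigned char *q = (const unsigned char *) b.s;

  while (*p != '\0' && tolower(*p) == tolower(*q)) {
    p++;
    q++;
  }
  return tolower(*p) - tolower(*q);
}
```

Both functions follow the contract listed above: negative if the first key sorts first, positive if it sorts last, and zero if the keys compare equal.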
You can't mix and match comparison functions within the same tree. In other
words, you shouldn't insert some keys with splay_insert_str() and
some with splay_insert_int(). To do so will be begging for a core dump.
To find keys, you use one of splay_find_str(),
splay_find_int(),
splay_find_dbl() or
splay_find_gen(). Obviously, if you inserted keys with
splay_insert_str(), then you should use splay_find_str()
to find them. If the key that you're looking for is not in the tree, then
splay_find_xxx() returns NULL.
Finally, there are also:
splay_find_gte_str(),
splay_find_gte_int(),
splay_find_gte_dbl() and
splay_find_gte_gen(). These return the splay tree node whose key is
either equal to the specified key, or whose key is the smallest one greater
than the specified key. If the specified key is greater than any in the tree,
the routine returns a pointer to the sentinel node. Each routine also takes an
argument found, which is set to tell you whether the key was found or not.
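To make the "greater than or equal" semantics concrete, here is a small sketch of the same lookup rule on a sorted array of ints instead of a splay tree. The function array_find_gte() is hypothetical and is not part of splay.h. The index n (one past the last element) stands in for the sentinel node, and found plays the same role as the argument described above.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the splay library.
   Return the index of the first element >= key in the sorted array
   a[0..n-1], or n if every element is smaller than key.
   *found is set to 1 on an exact match and to 0 otherwise. */
size_t array_find_gte(const int *a, size_t n, int key, int *found)
{
  size_t lo = 0, hi = n;

  /* Binary search for the leftmost position whose element is >= key. */
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (a[mid] < key) lo = mid + 1;
    else hi = mid;
  }
  *found = (lo < n && a[lo] == key);
  return lo;
}
```

With the array {10, 20, 30}, searching for 20 gives index 1 with found set, searching for 25 gives index 2 with found cleared, and searching for 35 gives index 3, the "sentinel" position.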
A two-level tree example
Suppose we want to sort lines of text by their atoi() value, but when
two strings have the same atoi() value, to sort them lexicographically.
One way to do this is to use a beefed-up comparison function and
then insert lines with splay_insert_gen(), as in
nsort2.c. Try it on input_n2.
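A beefed-up comparison of this kind might look like the sketch below. It is not the code from nsort2.c. The names num_then_str() and num_then_str_qsort() are made up for illustration, and the sketch is exercised through the standard qsort() so that it can be compiled without the splay library. A version for splay_insert_gen() would take two Jvals and pass their string fields to the same logic.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical comparison: order by atoi() value first, and break
   ties between equal values with strcmp(). */
int num_then_str(const char *a, const char *b)
{
  int ia = atoi(a);
  int ib = atoi(b);

  if (ia < ib) return -1;
  if (ia > ib) return 1;
  return strcmp(a, b);
}

/* Adapter so the comparison can be handed to qsort() on an array
   of char pointers. */
int num_then_str_qsort(const void *pa, const void *pb)
{
  const char *a = *(const char * const *) pa;
  const char *b = *(const char * const *) pb;

  return num_then_str(a, b);
}
```

For an array lines of n strings, the call would be qsort(lines, n, sizeof(lines[0]), num_then_str_qsort).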
A second way to do this is to have a two-level tree. The first tree
has integers as keys and is based on the atoi() value of each line.
The val field of each node, however, is another splay tree.
This splay tree contains each line whose atoi() value is equal
to the key of the node, sorted lexicographically. Thus, when you read
a line, you first see if its atoi() value is in the tree. If
so, you get a pointer to the val field of that node. If not, you insert
a new node into the tree whose key is the atoi(), and whose
val field is a new, empty splay tree. Now, you have a pointer to
the splay tree in the val field of the node whose key is the
atoi() value of the string. What you do now is insert the string
into this second splay tree using splay_insert_str(). When you're
done, you have a big two-level splay tree. You traverse it by traversing
the top level tree, and for each node in that tree, you traverse the tree
in its val field and print out the strings. See the code. It is
in nsort3.c.
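The shape of the two-level structure can be sketched without the library by swapping each splay tree for a sorted singly linked list. This is not the code in nsort3.c, and all of the type and function names below are invented for illustration. Only the find-or-create step on the outer level, the insert on the inner level, and the nested traversal mirror the description above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Inner level: lines sharing one atoi() value, kept in strcmp() order. */
typedef struct line_node {
  const char *line;            /* not copied; the caller keeps it alive */
  struct line_node *next;
} LineNode;

/* Outer level: one node per distinct atoi() value, in numeric order. */
typedef struct num_node {
  int key;
  LineNode *lines;             /* plays the role of the val field */
  struct num_node *next;
} NumNode;

/* Find the outer node for key, or create it with an empty inner list. */
NumNode *find_or_add_num(NumNode **head, int key)
{
  NumNode **pp = head;
  NumNode *n;

  while (*pp != NULL && (*pp)->key < key) pp = &(*pp)->next;
  if (*pp != NULL && (*pp)->key == key) return *pp;

  n = malloc(sizeof(NumNode));
  if (n == NULL) { perror("malloc"); exit(1); }
  n->key = key;
  n->lines = NULL;
  n->next = *pp;
  *pp = n;
  return n;
}

/* Insert a line into one inner list, keeping it sorted by strcmp(). */
void add_line(NumNode *n, const char *line)
{
  LineNode **pp = &n->lines;
  LineNode *l;

  while (*pp != NULL && strcmp((*pp)->line, line) < 0) pp = &(*pp)->next;
  l = malloc(sizeof(LineNode));
  if (l == NULL) { perror("malloc"); exit(1); }
  l->line = line;
  l->next = *pp;
  *pp = l;
}

/* The two-step insert: locate the outer node, then insert the line. */
void two_level_insert(NumNode **head, const char *line)
{
  add_line(find_or_add_num(head, atoi(line)), line);
}

/* Traverse the outer level; for each node, traverse its inner level. */
void two_level_print(const NumNode *head, FILE *out)
{
  const NumNode *n;
  const LineNode *l;

  for (n = head; n != NULL; n = n->next)
    for (l = n->lines; l != NULL; l = l->next)
      fprintf(out, "%s\n", l->line);
}
```

Linked lists make each insert linear in the list length, which is exactly the cost the splay trees avoid. The sketch is only meant to show how the two levels fit together.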