<center>
<h2> CS140 -- Lab 10 </h2>
</center>
<p>
<hr>
<p>
<h3> Overview </h3>
<p>
This lab is a two part lab. The first part is an experiment that will 
explore how successful binary search trees, AVL trees, and splay trees 
are in creating balanced trees.
The second part will give you an opportunity to revisit the broker portion of
lab 5 and redo it using a balanced tree scheme. You will use a new type
of balanced binary tree, a red-black tree, that is even more efficient than 
AVL or splay trees.
You will be using Dr. Plank's red-black tree library, which you will
see much more of in CS302 and CS360. The implementation of a red-black tree
is quite complicated and we will not explore its implementation
in this class. 
<p>
<hr>
<p>
<h3> Binary Tree Experiment </h3>
<p>
In <a href="tree.h">tree.h</a> you will find the definition of a binary tree
node and routines for creating binary search trees, AVL trees, and splay
trees. The source file, <b>tree.c</b>, can be found in 
~cs140/www-home/spring-2005/src and the object file <b>tree.o</b> can be found 
in ~cs140/www-home/spring-2005/objs. The object file is linked automatically
to your executable by the makefile provided for this lab. You do not need to
read or understand the source code in <b>tree.c</b> in order to complete this
lab. The comments you find in <tt>tree.h</tt> should be sufficient for you
to complete the lab. One important thing to note about the <tt>make_tree</tt>
functions is that they create and return a sentinel node. Hence your program
will have pointers to sentinel nodes rather than to the roots of the trees.
If you want to get a pointer to the root of a tree you should call the
function <tt>tree_root</tt> and pass it a pointer to the sentinel node. The
insert and find functions expect pointers to the sentinel nodes as well.
<p>
Your task is to read a file that is given on the command line and
insert each word in the file into a binary search tree, an AVL tree, and a
splay tree. You should only insert a word into the trees if it has not yet
occurred in the file. You can check whether it has already been seen by
calling the <tt>splay_tree_find</tt> function on the splay tree. It will
return 0 if the word is not in the tree and otherwise will return a pointer
to the node containing the word. You should also save each unique word on
a dllist so that you can do a series of finds on the splay tree once all
the words in the file have been read. You should append words to the dllist.
<p>
Once all the words in the file have been read and inserted into the
respective trees, you should traverse the dllist of words that you created
and for each word you should call the <tt>splay_tree_find</tt> function on
your splay tree. Before each such call you should compute the height of
the current tree and the depth of the word for which you will be searching.
You will want to keep a cumulative total of the height and the depth so that
you can compute an average depth and height when you have completed all the
finds. The depth of the word for which you are searching can be found by
writing a simple find function for a binary search tree and computing the
depth as you search for the node. Note that the node is always guaranteed
to be in the tree so you do not have to worry about reaching NULL nodes at
the bottom of the tree. You can use this fact to simplify your find function.
<p>
The reason for computing the depth and height of the splay tree before each find
is that the find command will dynamically adjust the tree and hence the
tree's height and the depth of the nodes will change after each find. By
calculating the depth of a word before a find is performed on it you can
accurately count the number of nodes that must be traversed to find it. If
you calculate the depth of a word after a find is performed the depth will
always be 0 because the find command rotates the word to the root.
<p>
When you have completed traversing the dllist of words you can compute the
average depth and height of the splay tree by dividing your cumulative 
depth and height by the number of words on the list. At this point you can
also compute the height and the average depth of the binary search tree and
the AVL tree. The average depth is obtained by summing the depth of each node
and then dividing this sum
by the total number of nodes in the tree. Assuming that
all finds are done on random words, the average depth represents the expected
number of nodes that must be examined to retrieve a word's node. Since the
find function does not change either a binary search tree or an AVL tree, it
is ok to compute the average depth and height without doing individual finds
on these two trees.
<p>
You should be
able to obtain both the height of these two trees and compute their 
cumulative depth,
which is the sum of the depths of all nodes, using a
recursive tree traversal. You can then divide the cumulative depth by the
number of nodes in the tree to get the average depth. At the end of your
program you should print the number of words that were read by your program,
which constitutes N, and log<sub>2</sub>N. C does not provide a function
for computing log<sub>2</sub>N but you can compute log<sub>2</sub>N using
the following formula:
<pre>
log<sub>2</sub>N = log(N) / log(2)
</pre>
You will have to include <tt>math.h</tt> in your program to use C's log
function. The log function returns the natural log, <tt>ln</tt>, of a number.
When you use the <tt>math.h</tt> library you must remember to link C's
math library using the -lm flag. The makefile for this lab does this task
for you. 
<p>
Once your program has printed the above information it should print the
1) height, 2) average depth, and 3) ratio of the average depth to 
log<sub>2</sub>N for each tree. For example, your output might look as
follows:
<pre>
%UNIX btree_experiment myfile
Number of words = 233  log 233 = 7.86

		height	depth	depth/log n
Binary tree     22.00	15.85   2.02
AVL tree 	10.00	9.56	1.22
Splay tree  	15.21	11.23	1.43
</pre>
You should name your program <b>btree_experiment</b>.
If you have any questions about what how your output should appear, consult
the executable, <tt>btree_experiment</tt>, which can be found in the lab10
directory. You can correctly position the "height", "depth", and
"depth/log n" labels by using the tab character, \t, in your printf statement.
The "height" label should appear two tab positions from the beginning of
the line and the "depth" and "depth/log n" labels should each be one tab
position over:
<pre>
printf("\t\theight\tdepth\tdepth/log n\n\n");
</pre>
<p>
<hr>
<h3> Broker </h3>
<p>
Redo the broker program from lab 5 so that you use Dr. Plank's red-black
tree library rather than a dllist to first sort customers by name and
later to sort customers by average transaction
cost. 
You will still use
a dllist to store a customer's transactions and you will still name your
executable <b>broker</b>. 
<p>
You will need to include <a href="../../include/jrb.h">jrb.h</a> in your program.
The library's basic data type is a pointer to a tree node, which it calls
<tt>JRB</tt> (for Jim's Red-Black tree). The only two fields you need concern
yourself with are the <tt>key</tt> and <tt>val</tt> fields. The other fields
are work fields that are used by the library's functions to balance the tree
and link the nodes.
The libfdr.a library
contains the red-black tree functions and the makefile automatically links
this library with your executable. The
functions that you will need to use from Dr. Plank's library are:
<ol>
<li> JRB make_jrb(): creates a new red-black tree.
<p>
<li> JRB jrb_insert_str(JRB tree, char *key, Jval val):  insert a new node into the tree using a standard character string as the  key. Strcmp() is used as the comparison function. <tt>jrb_insert_str</tt> returns a pointer to the 
new jrb tree node. 
Even though the key is a string, it will be converted into a Jval 
in the new node. Thus, if you want to get at the key of node s, 
you should either use jval_s(s->key) or s->key.s.
<p>
<li> JRB jrb_insert_dbl(JRB tree, double dkey, Jval val): insert a new node
into the jrb tree based on its <tt>dkey</tt> value. Like a string key, the
<tt>dkey</tt> will be converted to a Jval so if you want to get at the key
of a node <tt>s</tt>, you should either use <tt>jval_d(s->key)</tt> or
<tt>s->key.d</tt>.
<p>
<li> JRB jrb_find_str(JRB root, char *key): returns a JRB node whose value
is equal to <tt>key</tt> or NULL if there is no such node in the tree. 
<tt>root</tt> should be a pointer to the root of the red-black tree (i.e, the
node returned by <tt>make_jrb</tt>).
<p>
<li> jrb_rtraverse(ptr, jrb_tree): Traverses a jrb tree in reverse lexicographic
	order and assigns <tt>ptr</tt> to each node in turn. Both <tt>ptr</tt>
	and <tt>jrb_tree</tt> should be declared as JRB nodes.
<p>
<li> Jval jrb_val(JRB node): returns the value stored at a node.
</ol>