CS140 -- Lab 8
This lab is designed to:
- give you additional practice with function pointers, and
- give you practice using and implementing binary search trees for range
queries. Range queries can take two forms:
- A request for all data in a certain range.
Sample queries might be
all students with grades between 80 and 89, or all political
figures whose photo appeared in a major news publication between October 10
and November 1.
- A request for the first or last n items in a data set. Sample
queries might be the 10 runners with the fastest times in a race or the
20 politicians whose photos appeared most frequently in major news
publications on November 1.
Binary search trees are ideally suited to handle range queries because they
keep data in sorted order. However, the way that binary search trees are
often implemented, it is difficult to get from the current data element to
the next data element without doing an inorder traversal. Inorder traversals
are fine when we want to process all the elements in a tree but they are
inefficient when we only want to access a limited number of elements in the
tree.
To help facilitate the handling of range queries, you are going to extend
the binary search tree library you wrote in the previous lab so that
you can traverse a binary search tree as though it were a linked list.
You are then going to write a program that reads race results and performs
range queries on these results.
Lab Materials
- Executables for the test files are in the directory
/home/bvz/cs140/labs/lab8. As usual, if you have
questions about how these programs should work, try these.
- race1 is an error-free test file and race2 is an error-filled
test file. You should develop your own test files
as well, because race2 does not contain every possible
error type.
- bst_find.c, bst_print.c, and bst_traverse.c provide several
simple test programs for your binary search tree library. wordfile
is a simple test file for bst_traverse.
Linked-List Style Traversal of Binary Search Trees
In order to extend your binary search tree to handle linked-list style
traversals, you are going to have to make a number of modifications to
your existing code:
- add next and prev pointers to each node in a tree.
The next pointer should point to the next element in ascending
order in the tree and the prev pointer should point to the
next element in descending order in the tree. For example, if we have
the data set {3, 8, 13, 23, 28} then the next pointer
for 13 should be to the node for 23 and the prev
pointer for 13 should be to the node for 8.
- add a sentinel node pointer to your administrative tree struct.
This sentinel node
will point to the minimum and maximum elements of the tree, via its
next and prev pointers. In the above data set, the
sentinel node's next pointer would point to 3 and its
prev pointer would point to 28. Additionally,
3's prev pointer and 28's next pointer will both
point to the sentinel node.
- when you insert a new value into the tree, you will need to make the
new value point to the appropriate successor and predecessor values
in the tree, as well as adjust the
prev and next pointers of these values so that they
point to the new value. If the child is added as a left child, then
its successor will always be its parent. If the child is added as a right
child, then its predecessor will always be its parent. From these two
relationships you should be able to figure out how to insert a child into
the linked list. It is unacceptable to traverse the linked list from
the start to the back in order to find where to insert the new node,
because that would require an O(n) search and would destroy the O(log n)
performance of the insert.
- you are going to need to modify your find procedure so that
it returns a pointer to a node containing the target key, rather than
the value associated with that key. You are then going to add a
get_value function that returns the value of the node,
a get_key function that returns the key field of the node,
a next
function that returns the node with the next highest value in the tree,
and a prev
function that returns the node with the next lowest value in the tree.
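The list-maintenance rule described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the required implementation: the names Node, Tree, sketch_create, sketch_insert, sketch_collect, and link_before are hypothetical, and the keys are plain ints to keep the example short (the lab itself uses void * keys and a user-supplied comparison function). The point is that a new node is spliced into the list in O(1) time using only its parent.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node layout: tree links plus sorted-list links. */
typedef struct node {
    int key;                      /* simplified: the lab uses void *key */
    struct node *left, *right;    /* binary search tree children */
    struct node *next, *prev;     /* successor / predecessor in sorted order */
} Node;

/* Hypothetical administrative struct holding the root and the sentinel. */
typedef struct {
    Node *root;
    Node *sentinel;               /* next = minimum, prev = maximum */
} Tree;

/* Sketch only: asserts stand in for real allocation-failure handling. */
Tree *sketch_create(void) {
    Tree *t = malloc(sizeof(Tree));
    assert(t != NULL);
    t->root = NULL;
    t->sentinel = malloc(sizeof(Node));
    assert(t->sentinel != NULL);
    /* An empty list is a sentinel that points to itself in both directions. */
    t->sentinel->next = t->sentinel->prev = t->sentinel;
    t->sentinel->left = t->sentinel->right = NULL;
    t->sentinel->key = 0;
    return t;
}

/* Splice n into the list immediately before succ in O(1) time. */
static void link_before(Node *n, Node *succ) {
    n->next = succ;
    n->prev = succ->prev;
    succ->prev->next = n;
    succ->prev = n;
}

/* Insert key into the tree and the list; assumes no duplicate keys. */
void sketch_insert(Tree *t, int key) {
    Node *n = malloc(sizeof(Node));
    assert(n != NULL);
    n->key = key;
    n->left = n->right = NULL;

    Node *parent = NULL, *cur = t->root;
    while (cur != NULL) {         /* ordinary binary search tree descent */
        parent = cur;
        cur = (key < cur->key) ? cur->left : cur->right;
    }
    if (parent == NULL) {         /* first node: both neighbors are the sentinel */
        t->root = n;
        link_before(n, t->sentinel);
    } else if (key < parent->key) {
        parent->left = n;         /* left child: the parent is the successor */
        link_before(n, parent);
    } else {
        parent->right = n;        /* right child: the parent is the predecessor */
        link_before(n, parent->next);
    }
}

/* Walk the list in ascending order, copying up to max keys into out. */
int sketch_collect(Tree *t, int *out, int max) {
    int count = 0;
    for (Node *n = t->sentinel->next; n != t->sentinel && count < max; n = n->next)
        out[count++] = n->key;
    return count;
}
```

Inserting 13, 3, 28, 8, and 23 in that order and then calling sketch_collect visits the keys as 3, 8, 13, 23, 28 without any inorder traversal.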
More Generic Binary Tree Library
In the previous lab I allowed you to write a binary tree library in which
the key was a char *. Since your library knew the type of the key,
it made key comparisons simple--you simply used strcmp. However,
forcing the key to be a char * also limits the flexibility of the
library. In this lab you are going to make the key be a void *, just
like the value. Now your library will not know how to compare keys unless
it gets some assistance from the library's user. In particular, the user
will need to pass to create_tree a pointer to a function that
compares two keys and returns an indication of their relative order.
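As an illustration, a user of the library might write comparison functions like the two below. The names compare_strings and compare_ints are hypothetical; the parameter and return types follow the comparison function that create_tree takes in this lab's interface (negative, 0, or positive).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical comparison function for keys that are C strings. */
int compare_strings(void *key1, void *key2) {
    assert(key1 != NULL && key2 != NULL);
    return strcmp((char *)key1, (char *)key2);
}

/* Hypothetical comparison function for keys that point to ints. */
int compare_ints(void *key1, void *key2) {
    assert(key1 != NULL && key2 != NULL);
    int a = *(int *)key1;
    int b = *(int *)key2;
    /* (a > b) - (a < b) yields -1, 0, or 1 and cannot overflow the way a - b can. */
    return (a > b) - (a < b);
}
```

Either function could be handed to the library as a function pointer; the library never needs to know the real type of the key.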
New Binary Tree Library Interface
You should create a binary tree library with the following interface. You
are not allowed to change any part of the interface:
- void tree_insert(void *key, void *value, void *binary_tree):
Insert the (key,value) pair into the tree in sorted order. Hint:
You may need to modify your insertion routines so that they can determine
whether or not a child was just added to the parent. If a child was
just added to the parent, then the child and parent's linked list pointers
must be updated.
- void *tree_find(void *key, void *binary_tree, bool *found):
Find the node associated with key in the binary tree and return either
1) a pointer to that node or 2) a pointer to the first node whose key
is greater than the target key. If the key is found, then set the found
parameter to true; otherwise set it to false.
Note that in the previous lab you returned
0 if the key was not found. However, when
you are dealing with range queries you do not want to force the user
to ensure that the first key in the range is in the tree. Instead you
want to get a starting spot. For example,
suppose I want all students whose grades are between 80 and 89. If the
grades of these students are 82, 85, and 88, I would like my find function
to return a pointer to 82, not return 0. This is what your find function
will now be doing. If the tree is empty or the search key is greater than
any key in the tree, then return a pointer to the sentinel node.
The found flag is a convenience for the user. The user could determine
whether or not the key was found by retrieving the key from the returned
node and determining whether or not it is equal to the search key.
However, it is much easier if the user can simply check the flag. Passing
a parameter as a pointer and then setting it is a common way of
allowing a function to return more than one value.
- void *tree_get_value(void *node): returns the value field of a node.
- void *tree_get_key(void *node): returns the key field of a node.
- void *tree_next(void *node): returns the node associated with the next
higher key in the tree.
- void *tree_prev(void *node): returns the node associated with the
next lower key in the tree.
- bool tree_end(void *node, void *tree): returns true if the node is the
sentinel node of the tree
and false otherwise. Use tree_end to decide when to stop iterating:
the sentinel node follows the last node when you traverse the tree in
ascending order and precedes the first node when you traverse the tree
in descending order.
- void *tree_min(void *tree): returns the node associated with the minimum
key in the tree.
- void *tree_max(void *tree): returns the node associated with the maximum
key in the tree.
- void *create_tree(int (*compare)(void *key1, void *key2)): create a record for a binary search tree and return it as
a void *.
The compare function will
take pointers to two keys
and return a negative number, 0, or a positive number depending on whether key1 is
less than, equal to, or greater than key2.
Your library code should store a pointer to the function in the
record for the binary search tree. It will need to use this function when
it tries to find or insert values in the tree.
- void print_tree(void *tree, void (*print_fct)(void *key, void *value)): print the tree in
sorted order based on the values of the keys. print_tree should
perform an inorder traversal. print_fct should cast the
key and value
arguments to the appropriate types and then do whatever is necessary to
print them.
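The grade example from the tree_find description can be sketched as follows. This is one possible shape for tree_find and the small accessor functions, under assumptions: the Node and Tree layouts and the helpers compare_ints, count_in_range, and demo_grades are hypothetical, and demo_grades wires up a three-node tree by hand (grades 82, 85, 88) instead of calling tree_insert so that the example stays self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct node {             /* hypothetical node layout */
    void *key, *value;
    struct node *left, *right, *next, *prev;
} Node;

typedef struct {                  /* hypothetical administrative struct */
    Node *root, *sentinel;
    int (*compare)(void *key1, void *key2);
} Tree;

/* Return the node holding key, or else the first node with a greater key,
   or else the sentinel. *found reports whether the key itself was present. */
void *tree_find(void *key, void *binary_tree, bool *found) {
    Tree *t = binary_tree;
    assert(t != NULL && found != NULL);
    Node *candidate = t->sentinel;   /* smallest node seen so far with key > target */
    Node *cur = t->root;
    *found = false;
    while (cur != NULL) {
        int c = t->compare(key, cur->key);
        if (c == 0) { *found = true; return cur; }
        if (c < 0) { candidate = cur; cur = cur->left; }
        else cur = cur->right;
    }
    return candidate;
}

void *tree_next(void *node) { return ((Node *)node)->next; }
void *tree_get_key(void *node) { return ((Node *)node)->key; }
bool tree_end(void *node, void *tree) { return node == ((Tree *)tree)->sentinel; }

static int compare_ints(void *key1, void *key2) {
    int a = *(int *)key1, b = *(int *)key2;
    return (a > b) - (a < b);
}

/* Range query: count the int keys that lie in [lo, hi]. */
int count_in_range(void *tree, int lo, int hi) {
    bool found;
    int count = 0;
    for (void *n = tree_find(&lo, tree, &found);
         !tree_end(n, tree) && *(int *)tree_get_key(n) <= hi;
         n = tree_next(n))
        count++;
    return count;
}

/* Build the tree for grades 82, 85, 88 by hand (85 at the root) and query it. */
int demo_grades(int lo, int hi) {
    int g[3] = {82, 85, 88};
    Node s, a, b, c;
    Node *order[5] = {&s, &a, &b, &c, &s};   /* sentinel, 82, 85, 88, sentinel */
    s.key = NULL; a.key = &g[0]; b.key = &g[1]; c.key = &g[2];
    for (int i = 0; i < 4; i++) {            /* wire up the next/prev links */
        order[i]->next = order[i + 1];
        order[i + 1]->prev = order[i];
        order[i]->left = order[i]->right = NULL;
        order[i]->value = NULL;
    }
    b.left = &a;
    b.right = &c;
    Tree t = {&b, &s, compare_ints};
    return count_in_range(&t, lo, hi);
}
```

With these grades, a query for 80 through 89 starts at the node for 82 even though 80 is not in the tree, and a query whose lower bound exceeds every key starts at the sentinel, so the loop body never runs.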
Race Results
In this part of the lab you are going to read the results of a race and then
allow the user to perform several types of range queries.
Input
The input to your program will consist of lines of the form:
FirstName LastName mm:ss
where FirstName and LastName are single word fields and
mm:ss is the runner's time in minutes and seconds. A sample file
might be:
Nels VanderZanden 18:03
Mickey Mouse 20:05
Minnie Mouse 17:50
Brad VanderZanden 16:57
Daffy Duck 17:08
Joe Tortoise 29:08
Sally Hare 16:59
Naturally I won :)
Queries
Once you have read the input the user will be able to enter queries on stdin
that request information. Your program should support the following queries:
Program Output
Print the runners who meet the query criteria one per line with single
spacing between each field. Note that you cannot assume that the fields in
the input will be separated by single spaces and hence it will be necessary
to store the fields individually in a runner's struct.
Program Design
You will need to read your input into a binary search tree using the time
as the key and the name as the value. You can either store the time in a
struct that has two fields--minutes and seconds--or you can convert the time
to seconds using the equation 60*minutes + seconds. If you store the
time as a struct then your comparison function will need to compare both the
minutes and the seconds. You can use strchr to find the
: delimiter and
split the time into separate minutes and seconds fields.
Once you have read the input into your binary search tree you will need to
enter a loop that reads stdin and performs the indicated query.
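The strchr approach and the 60*minutes + seconds conversion described above might look like the sketch below. The function name parse_time is hypothetical. Note that this version checks every character with isdigit, so it is stricter than the lab requires: it also rejects strings such as 2a:40. It does not guard against overflow for absurdly long minute fields.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Convert "mm:ss" to total seconds. Returns false on malformed input. */
bool parse_time(const char *text, int *total_seconds) {
    assert(text != NULL && total_seconds != NULL);
    const char *colon = strchr(text, ':');
    if (colon == NULL || colon == text)       /* no delimiter, or no minutes */
        return false;

    int minutes = 0;
    for (const char *p = text; p < colon; p++) {
        if (!isdigit((unsigned char)*p))      /* minutes must be all digits */
            return false;
        minutes = 10 * minutes + (*p - '0');
    }

    const char *s = colon + 1;
    /* Exactly two digits for the seconds, then the end of the string. */
    if (!isdigit((unsigned char)s[0]) || !isdigit((unsigned char)s[1]) || s[2] != '\0')
        return false;

    *total_seconds = 60 * minutes + 10 * (s[0] - '0') + (s[1] - '0');
    return true;
}
```

For example, 18:03 converts to 60*18 + 3 = 1083 seconds, which can then serve as an integer key.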
Error Checking
You need to perform the following error checks:
- Check that the number of command line arguments is correct.
- Check that the input file can be opened.
- Check that each line of input has exactly three fields.
- Check that the format of the time is correct and that the minutes and
seconds are both numeric. A time must have one
or more digits for the minutes and exactly two digits for the seconds.
- Check that a query is correctly formatted and print an appropriate
error message if it is not correctly formatted. Your error messages
do not need to precisely imitate mine but they should be easily
understood by a user.
You may assume that there are no duplicate times in the input (i.e., you
do not have to error check this condition) and you do not have to catch
time errors of the form 2a:40 or 23:4a, since sscanf and atoi can both
convert the strings to numbers. In contrast, you must catch errors of
the form a2:40 and 23:a4 because sscanf and atoi cannot convert these
strings to numbers. Note that in this lab you cannot use atoi or else you
will miss some errors. Can you see why?
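As one possible alternative to atoi, the standard strtol function reports where conversion stopped, which makes it possible to tell a string with no leading digits apart from a genuine zero. The helper name leading_number is hypothetical, and this is only a sketch: strtol also accepts leading whitespace and a sign, which a complete check of the time format would need to rule out separately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Convert the leading number in text to an int.
   Returns false when text does not start with a number at all. */
bool leading_number(const char *text, int *out) {
    assert(text != NULL && out != NULL);
    char *end;
    long v = strtol(text, &end, 10);
    if (end == text)              /* nothing was converted, e.g. "a2" */
        return false;
    *out = (int)v;
    return true;
}
```

Here leading_number accepts "2a" (yielding 2) and "0" (yielding 0) but rejects "a2", whereas atoi returns 0 for both "a2" and "0".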
Design Document
To help you think through how you might want to design your program for
this lab, you should answer the following questions and hand them in when
told to do so by the TA:
- Show the binary search tree that would result from processing the
sample race results file shown earlier. When you draw the tree, also
draw the next and prev links between the nodes. To make your drawing
simpler, it is okay to:
- For each node that gets inserted, show which nodes immediately precede
and succeed it in the current tree. For example:
Nels VanderZanden 18:03: SentinelNode SentinelNode
Mickey Mouse 20:05: 18:03 SentinelNode
Minnie Mouse 17:50: SentinelNode 18:03
Now answer the following questions:
- Based on the above insertion pattern, if a node is inserted as
a left child, what is its successor node (i.e., what is the
relationship of the successor node to the left child--parent,
grandparent, left sibling, right sibling, left child, right child)?
- Based on the above insertion pattern, if a node is inserted as
a right child, what is its predecessor node?
- Suppose you decide to represent the key as a minute/second pair. Show the
struct you would declare.
- Now show the comparison function you would write to compare two keys
in your minute/second pair representation.
- Suppose you decide to represent the key as an integer. Show the
comparison function you would write to compare two keys.
- Show the call you would make to insert a key with value 1078 into
a tree named my_tree.
Assume the value field is a null pointer for this problem.
- Show the struct you plan to use to store the name fields.
- Suppose you have created a tree named my_tree and inserted
integer keys into it. Complete the following problems:
- Write a code fragment that prints the value of the minimum key in the
tree.
- Write a code fragment that prints the first value that is greater than
or equal to 100 in the tree.
- Write a code fragment that traverses the tree's linked list in
ascending order and
prints the values of all the keys in the tree.
- Given the sample race results file presented earlier, write down the
output that should be produced by your program for each of the following
queries:
- first 3
- last 2
- range 14:00 17:30
- range * 18:00
- range 19:00 *
What To Hand In
You should submit your design document when the TA asks for it during the lab.
You will submit the following files to the TAs via the submit script:
- bintree.h
- bintree.c
- runner.c