CS140 -- Lab 8


This lab is designed to:

  1. give you additional practice with function pointers, and
  2. give you practice with using and implementing binary search trees for range queries. Range queries can take two forms:

    1. A request for all data in a certain range. Sample queries might be all students with grades between 80 and 89, or all political figures whose photo appeared in a major news publication between October 10 and November 1.

    2. A request for the first or last n items in a data set. Sample queries might be the 10 runners with the fastest times in a race or the 20 politicians whose photos appeared most frequently in major news publications on November 1.

    Binary search trees are ideally suited to handle range queries because they keep data in sorted order. However, given the way binary search trees are often implemented, it is difficult to get from the current data element to the next data element without doing an inorder traversal. Inorder traversals are fine when we want to process all the elements in a tree but they are inefficient when we only want to access a limited number of elements in the tree.

    To help facilitate the handling of range queries, you are going to extend the binary search tree library you wrote in the previous lab so that you can traverse a binary search tree as though it were a linked list.

You are then going to write a program that reads race results and performs range queries on these results.


Lab Materials


Linked-List Style Traversal of Binary Search Trees

In order to extend your binary search tree to handle linked-list style traversals, you are going to have to make a number of modifications to your existing code:

  1. add next and prev pointers to each node in a tree. The next pointer should point to the next element in ascending order in the tree and the prev pointer should point to the next element in descending order in the tree. For example, if we have the data set {3, 8, 13, 23, 28} then the next pointer for 13 should be to the node for 23 and the prev pointer for 13 should be to the node for 8.

  2. add a sentinel node pointer to your administrative tree struct. This sentinel node will point to the minimum and maximum elements of the tree, via its next and prev pointers. In the above data set, the sentinel node's next pointer would point to 3 and its prev pointer would point to 28. Additionally, 3's prev pointer and 28's next pointer will both point to the sentinel node.

  3. when you insert a new value into the tree, you will need to make the new value point to the appropriate successor and predecessor values in the tree, as well as adjust the prev and next pointers of these values so that they point to the new value. If the child is added as a left child, then its successor will always be its parent. If the child is added as a right child, then its predecessor will always be its parent. From these two relationships you should be able to figure out how to insert a child into the linked list. It is unacceptable to traverse the linked list from the start to the back in order to find where to insert the new node, because that would require an O(n) search and would destroy the O(log n) performance of the insert.

  4. you are going to need to modify your find procedure so that it returns a pointer to a node containing the target key, rather than the value associated with that key. You are then going to add a get_value function that returns the value of the node, a get_key function that returns the key field of the node, a next function that returns the node with the next higher key in the tree, and a prev function that returns the node with the next lower key in the tree.


More Generic Binary Tree Library

In the previous lab I allowed you to write a binary tree library in which the key was a char *. Since your library knew the type of the key, it made key comparisons simple--you simply used strcmp. However, forcing the key to be a char * also limits the flexibility of the library. In this lab you are going to make the key be a void *, just like the value. Now your library will not know how to compare keys unless it gets some assistance from the library's user. In particular, the user will need to pass to create_tree a pointer to a function that compares two keys and returns an indication of their relative order.


New Binary Tree Library Interface

You should create a binary tree library with the following interface. You are not allowed to change any part of the interface:

  1. void tree_insert(void *key, void *value, void *binary_tree): Insert the (key,value) pair into the tree in sorted order. Hint: You may need to modify your insertion routines so that they can determine whether or not a child was just added to the parent. If a child was just added to the parent, then the child and parent's linked list pointers must be updated.
  2. void *tree_find(void *key, void *binary_tree, bool *found): Find the node associated with key in the binary tree and return either 1) a pointer to that node or 2) a pointer to the first node whose key is greater than the target key. If the key is found, then set the found parameter to true; otherwise set it to false. Note that in the previous lab you returned 0 if the key was not found. However, when you are dealing with range queries you do not want to force the user to ensure that the first key in the range is in the tree. Instead you want to get a starting spot. For example, suppose I want all students whose grades are between 80 and 89. If the grades of these students are 82, 85, and 88, I would like my find function to return a pointer to 82, not return 0. This is what your find function will now be doing. If the tree is empty or the search key is greater than any key in the tree, then return a pointer to the sentinel node.

    The found flag is a convenience for the user. The user could determine whether or not the key was found by retrieving the key from the returned node and determining whether or not it is equal to the search key. However, it is much easier if the user can simply check the flag. Passing a parameter as a pointer and then setting it is a common way of allowing a function to return more than one value.

  3. void *tree_get_value(void *node): returns the value field of a node.
  4. void *tree_get_key(void *node): returns the key field of a node.
  5. void *tree_next(void *node): returns the node associated with the next higher key in the tree.
  6. void *tree_prev(void *node): returns the node associated with the next lower key in the tree.
  7. bool tree_end(void *node, void *tree): returns true if the node is the sentinel node of the tree and false otherwise. You can use tree_end for iterating through the tree and making sure that you stop when you reach the last node in the tree if traversing the tree in ascending order or the first node if traversing the tree in descending order.
  8. void *tree_min(void *tree): returns the node associated with the minimum key in the tree.
  9. void *tree_max(void *tree): returns the node associated with the maximum key in the tree.
  10. void *create_tree(int (*compare)(void *key1, void *key2)): create a record for a binary search tree and return it as a void *. The compare function will take pointers to two keys and return a negative number, 0, or a positive number depending on whether key1 is less than, equal to, or greater than key2. Your library code should store a pointer to the function in the record for the binary search tree. It will need to use this function when it tries to find or insert values in the tree.
  11. void print_tree(void *tree, void (*print_fct)(void *key, void *value)): print the tree in sorted order based on the values of the keys. print_tree should perform an inorder traversal. print_fct should cast the key and value arguments to the appropriate types and then do whatever is necessary to print them.


Race Results

In this part of the lab you are going to read the results of a race and then allow the user to perform several types of range queries.

Input

The input to your program will consist of lines of the form:

FirstName LastName mm:ss
where FirstName and LastName are single word fields and mm:ss is the runner's time in minutes and seconds. A sample file might be:
Nels VanderZanden 18:03
Mickey Mouse 20:05
Minnie Mouse 17:50
Brad VanderZanden 16:57
Daffy Duck 17:08
Joe Tortoise 29:08
Sally Hare 16:59
Naturally I won :)

Queries

Once you have read the input the user will be able to enter queries on stdin that request information. Your program should support the following queries:

Program Output

Print the runners who meet the query criteria one per line with single spacing between each field. Note that you cannot assume that the fields in the input will be separated by single spaces and hence it will be necessary to store the fields individually in a runner's struct.

Program Design

You will need to read your input into a binary search tree using the time as the key and the name as the value. You can either store the time in a struct that has two fields--minutes and seconds--or you can convert the time to seconds using the equation 60*minutes + seconds. If you store the time as a struct then your comparison function will need to compare both the minutes and the seconds. You can use strchr to find the : delimiter and separate the time into separate minute and seconds fields.

Once you have read the input into your binary search tree you will need to enter a loop that reads stdin and performs the indicated query.

Error Checking

You need to perform the following error checks:

  1. Check that the number of command line arguments is correct.
  2. Check that the input file can be opened.
  3. Check that each line of input has exactly three fields.
  4. Check that the format of the time is correct and that the minutes and seconds are both numeric. A time must have one or more digits for the minutes and exactly two digits for the seconds.
  5. Check that a query is correctly formatted and print an appropriate error message if it is not correctly formatted. Your error messages do not need to precisely imitate mine but they should be easily understood by a user.

You may assume that there are no duplicate times in the input (i.e., you do not have to error check this condition) and you do not have to catch time errors of the form 2a:40 or 23:4a, since sscanf and atoi can both convert the strings to numbers. In contrast, you must catch errors of the form a2:40 and 23:a4 because sscanf and atoi cannot convert these strings to numbers. Note that in this lab you cannot use atoi or else you will miss some errors. Can you see why?


Design Document

To help you think through how you might want to design your program for this lab, you should answer the following questions and hand them in when told to do so by the TA:

  1. Show the binary search tree that would result from processing the sample race results file shown earlier. When you draw the tree, also draw the next and prev links between the nodes. To make your drawing simpler, it is okay to:

  2. For each node that gets inserted, show which nodes immediately precede and succeed it in the current tree. For example:
    Nels VanderZanden 18:03: SentinelNode SentinelNode
    Mickey Mouse 20:05: 18:03 SentinelNode
    Minnie Mouse 17:50: SentinelNode 18:03
         
    Now answer the following questions:
    1. Based on the above insertion pattern, if a node is inserted as a left child, what is its successor node (i.e., what is the relationship of the successor node to the left child--parent, grandparent, left sibling, right sibling, left child, right child)?
    2. Based on the above insertion pattern, if a node is inserted as a right child, what is its predecessor node?

  3. Suppose you decide to represent the key as a minute/second pair. Show the struct you would declare.

  4. Now show the comparison function you would write to compare two keys in your minute/second pair representation.

  5. Suppose you decide to represent the key as an integer. Show the comparison function you would write to compare two keys.

  6. Show the call you would make to insert a key with value 1078 into a tree named my_tree. Assume the value field is a null pointer for this problem.

  7. Show the struct you plan to use to store the name fields.

  8. Suppose you have created a tree named my_tree and inserted integer keys into it. Complete the following problems:

    1. Write a code fragment that prints the value of the minimum key in the tree.
    2. Write a code fragment that prints the first value that is greater than or equal to 100 in the tree.
    3. Write a code fragment that traverses the tree's linked list in ascending order and prints the values of all the keys in the tree.

  9. Given the sample race results file presented earlier, write down the output that should be produced by your program for each of the following queries:

    1. first 3
    2. last 2
    3. range 14:00 17:30
    4. range * 18:00
    5. range 19:00 *

What To Hand In

You should submit your design document when the TA asks for it during the lab. You will submit the following files to the TAs via the submit script:

  1. bintree.h
  2. bintree.c
  3. runner.c