## Homework 5 Solutions

1. Hashing is the best strategy because:
   1. ordered traversals and find-min/find-max operations are not required,
   2. the fact that the data is almost ordered should have no impact on the performance of hashing, since a good hash function scatters the keys so that they are effectively randomized, and
   3. hashing's O(1) average time complexity for insert, delete, and find operations is better than both the O(log n) time complexity of balanced trees and the O(N) per-operation cost (O(N^2) in total to build the tree) that unbalanced tree schemes incur on almost ordered data.
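The third reason can be checked with a small experiment. The sketch below is not part of the assignment: it assumes a hypothetical `bst_node` type and uses fully sorted keys (the extreme case of "almost ordered") to count the key comparisons a plain, unbalanced binary search tree makes while it is built.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type for this sketch (not from the assignment). */
typedef struct bst_node {
    int key;
    struct bst_node *left, *right;
} bst_node;

/* Insert key into a plain (unbalanced) BST without recursion and
   return the number of key comparisons made on the way down. */
static long bst_insert_counting(bst_node **root, int key) {
    long comparisons = 0;
    bst_node **link = root;
    while (*link != NULL) {
        comparisons++;
        link = (key < (*link)->key) ? &(*link)->left : &(*link)->right;
    }
    *link = malloc(sizeof **link);
    assert(*link != NULL);   /* sketch only: abort if allocation fails */
    (*link)->key = key;
    (*link)->left = (*link)->right = NULL;
    return comparisons;
}

/* Insert the keys 1..n in sorted order and return the total number
   of comparisons performed across all n insertions. */
long sorted_insert_cost(int n) {
    bst_node *root = NULL;
    long total = 0;
    for (int key = 1; key <= n; key++)
        total += bst_insert_counting(&root, key);
    /* Sorted input produces a right-leaning chain, so free it iteratively. */
    while (root != NULL) {
        bst_node *next = root->right;
        free(root);
        root = next;
    }
    return total;
}
```

With sorted input the k-th insertion walks past all k-1 earlier keys, so `sorted_insert_cost(n)` returns n(n-1)/2. That is quadratic total work, or O(N) per operation, whereas a hash table with a reasonable hash function would be expected to do constant work per insertion on average.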

1. the prefix expression is obtained from a pre-order traversal: - * * a b + c d e
2. the infix expression is obtained from an in-order traversal: a * b * c + d - e
3. the postfix expression is obtained from a post-order traversal: a b * c d + * e -
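The three traversals can be sketched in C as follows. The tree shape used here is an assumption reconstructed from the prefix expression above (the original figure is not reproduced in these solutions), and `expr_node`, `make_node`, `emit`, and `build_example` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical expression-tree node: operators have two children,
   operands are leaves. */
typedef struct expr_node {
    char symbol;
    struct expr_node *left, *right;
} expr_node;

static expr_node *make_node(char symbol, expr_node *left, expr_node *right) {
    expr_node *n = malloc(sizeof *n);
    assert(n != NULL);   /* sketch only: abort if allocation fails */
    n->symbol = symbol;
    n->left = left;
    n->right = right;
    return n;
}

/* Append one symbol to out, separated from earlier symbols by a space.
   The caller must supply a buffer that is large enough. */
static void emit(char *out, char symbol) {
    size_t len = strlen(out);
    if (len > 0)
        out[len++] = ' ';
    out[len++] = symbol;
    out[len] = '\0';
}

void pre_order(const expr_node *n, char *out) {
    if (n == NULL) return;
    emit(out, n->symbol);          /* node first */
    pre_order(n->left, out);
    pre_order(n->right, out);
}

void in_order(const expr_node *n, char *out) {
    if (n == NULL) return;
    in_order(n->left, out);
    emit(out, n->symbol);          /* node between its subtrees */
    in_order(n->right, out);
}

void post_order(const expr_node *n, char *out) {
    if (n == NULL) return;
    post_order(n->left, out);
    post_order(n->right, out);
    emit(out, n->symbol);          /* node last */
}

/* Assumed tree: ((a * b) * (c + d)) - e. Nodes are not freed in this sketch. */
expr_node *build_example(void) {
    return make_node('-',
        make_node('*',
            make_node('*', make_node('a', NULL, NULL), make_node('b', NULL, NULL)),
            make_node('+', make_node('c', NULL, NULL), make_node('d', NULL, NULL))),
        make_node('e', NULL, NULL));
}
```

Calling each traversal on `build_example()` with an empty 64-byte buffer reproduces the three expressions listed above. Note that the plain in-order output carries no parentheses, so it does not by itself show that `c + d` is grouped before the multiplication.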

2. Result of inserting 3, 1, 4, 6, 9, 2, 5, 7 into an initially empty binary search tree:

      3
     / \
    1   4
     \   \
      2   6
         / \
        5   9
           /
          7
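The tree above can be reproduced with a standard recursive insertion routine. This is a minimal sketch under assumptions: the `node` layout (with `left_child` and `right_child` fields, matching the height function later in these solutions) and the names `insert` and `build_example` are illustrative, and duplicate keys are simply ignored.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed node layout for this sketch. */
typedef struct node {
    int key;
    struct node *left_child, *right_child;
} node;

/* Insert key into the subtree rooted at root and return the
   (possibly new) subtree root. Duplicates are ignored. */
node *insert(node *root, int key) {
    if (root == NULL) {
        node *n = malloc(sizeof *n);
        assert(n != NULL);   /* sketch only: abort if allocation fails */
        n->key = key;
        n->left_child = n->right_child = NULL;
        return n;
    }
    if (key < root->key)
        root->left_child = insert(root->left_child, key);
    else if (key > root->key)
        root->right_child = insert(root->right_child, key);
    return root;
}

/* Insert the keys from the problem in the given order. */
node *build_example(void) {
    const int keys[] = {3, 1, 4, 6, 9, 2, 5, 7};
    node *root = NULL;
    for (size_t i = 0; i < sizeof keys / sizeof keys[0]; i++)
        root = insert(root, keys[i]);
    return root;
}
```

Each key travels left when it is smaller than the current node and right when it is larger, which is why 2 ends up as the right child of 1 and 7 as the left child of 9.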

3. Consider the following tree:

                 --------50---------
                /                   \
          ----25----             ----75----
         /          \           /          \
       10            40       60            90
      /             /  \     /             /
     2            35    45 55            85
                             \
                              57

1. Draw the tree that results from deleting 2

                 --------50---------
                /                   \
          ----25----             ----75----
         /          \           /          \
       10            40       60            90
                    /  \     /             /
                  35    45 55            85
                             \
                              57

When a node is a leaf, like 2, you simply remove it from its parent.

2. Draw the tree that results from deleting 25

                 --------50---------
                /                   \
          ----35----             ----75----
         /          \           /          \
       10            40       60            90
      /                \     /             /
     2                  45 55            85
                             \
                              57

When a node with two children is deleted, you must perform the following actions:

• Find the smallest node in the deleted node's right subtree and promote it to the deleted node's position. In this case, 35 is the smallest node in 25's right subtree.
• Recursively delete the promoted node from its original position in the right subtree. In this case, 35 is a leaf, so it is simply removed from the tree.

3. Draw the tree that results from deleting 50

                 --------55---------
                /                   \
          ----25----             ----75----
         /          \           /          \
       10            40       60            90
      /             /  \     /             /
     2            35    45 57            85

When a node with two children is deleted, you must perform the following actions:

• Find the smallest node in the deleted node's right subtree and promote it to the deleted node's position. In this case, 55 is the smallest node in 50's right subtree.
• Recursively delete the promoted node from its original position in the right subtree. In this case, 55 has a single child, so deleting it causes 57 to be promoted to 55's old location.
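The three deletion cases worked through above (a leaf, a node with two children whose replacement is a leaf, and a node with two children whose replacement has a child of its own) can be sketched in one routine. The `node` layout and the names `insert`, `delete_key`, and `build_problem_tree` are assumptions made for illustration. This version promotes the smallest key of the right subtree by copying its key rather than relinking nodes, which is one common way to implement the rule.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed node layout for this sketch. */
typedef struct node {
    int key;
    struct node *left_child, *right_child;
} node;

/* Plain BST insertion, used here only to build the example tree. */
node *insert(node *root, int key) {
    if (root == NULL) {
        node *n = malloc(sizeof *n);
        assert(n != NULL);   /* sketch only: abort if allocation fails */
        n->key = key;
        n->left_child = n->right_child = NULL;
        return n;
    }
    if (key < root->key)
        root->left_child = insert(root->left_child, key);
    else if (key > root->key)
        root->right_child = insert(root->right_child, key);
    return root;
}

/* Remove key from the subtree rooted at root; return the new subtree root. */
node *delete_key(node *root, int key) {
    if (root == NULL)
        return NULL;                                  /* key not present */
    if (key < root->key) {
        root->left_child = delete_key(root->left_child, key);
    } else if (key > root->key) {
        root->right_child = delete_key(root->right_child, key);
    } else if (root->left_child != NULL && root->right_child != NULL) {
        /* Two children: find the smallest key in the right subtree,
           copy it into this node, then delete it from the right subtree. */
        node *smallest = root->right_child;
        while (smallest->left_child != NULL)
            smallest = smallest->left_child;
        root->key = smallest->key;
        root->right_child = delete_key(root->right_child, root->key);
    } else {
        /* Leaf or single child: splice the node out of the tree. */
        node *child = root->left_child != NULL ? root->left_child
                                               : root->right_child;
        free(root);
        return child;
    }
    return root;
}

/* Insertion order chosen so that the result matches the tree in the problem. */
node *build_problem_tree(void) {
    const int keys[] = {50, 25, 75, 10, 40, 60, 90, 2, 35, 45, 55, 85, 57};
    node *root = NULL;
    for (size_t i = 0; i < sizeof keys / sizeof keys[0]; i++)
        root = insert(root, keys[i]);
    return root;
}
```

For example, `delete_key(build_problem_tree(), 50)` leaves 55 at the root and 57 as the left child of 60, matching the third drawing.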

1. The height of the original tree from the deletion problem above (the one rooted at 50) is 4: the longest root-to-leaf path, 50-75-60-55-57, contains 4 edges.

2. The height of a tree is equal to 1 plus the maximum of the heights of the root's two subtrees:
    tree_height = 1 + max(height(root->left_child), height(root->right_child))

This definition is recursive in that the heights of the left and right subtrees can be computed in the same fashion. By convention the height of an empty tree is -1. These facts lead to the following recursive function for computing a tree's height:

    int height (node *root) {
        if (root == 0)      /* empty tree */
            return -1;
        else
            return 1 + max(height(root->left_child),
                           height(root->right_child));
    }

Note that standard C does not define an integer max function (math.h provides only fmax and its variants for floating-point values), so the version above assumes that a max macro or helper function is available. If you did not use the max function, and I would not expect you to, then the following function would work:

    int height (node *root) {
        int left_height;
        int right_height;
        if (root == 0)      /* empty tree */
            return -1;
        else {
            left_height = height(root->left_child);
            right_height = height(root->right_child);
            if (left_height > right_height)
                return 1 + left_height;
            else
                return 1 + right_height;
        }
    }

4. It uses a post-order traversal because the height of a node is computed only after its left and right children are processed (i.e., the heights of its left and right children are computed before the node's own height is computed).
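One way to see the post-order behavior is to log each node at the moment its height becomes known. The sketch below is illustrative only: `height_logged` and `build_small_tree` are hypothetical names, and the tree is the binary search tree built earlier from the keys 3, 1, 4, 6, 9, 2, 5, 7, hard-coded here with static nodes to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed node layout for this sketch. */
typedef struct node {
    int key;
    struct node *left_child, *right_child;
} node;

/* Same recursion as height, but each node's key is appended to order
   at the point where that node's own height is computed. The caller
   must supply an array with room for every node in the tree. */
int height_logged(const node *root, int *order, int *count) {
    assert(order != NULL && count != NULL);
    if (root == NULL)
        return -1;
    int left_height = height_logged(root->left_child, order, count);
    int right_height = height_logged(root->right_child, order, count);
    order[(*count)++] = root->key;   /* both children are already done */
    return 1 + (left_height > right_height ? left_height : right_height);
}

/* The BST obtained by inserting 3, 1, 4, 6, 9, 2, 5, 7 in that order. */
const node *build_small_tree(void) {
    static node n2 = {2, NULL, NULL};
    static node n1 = {1, NULL, &n2};
    static node n5 = {5, NULL, NULL};
    static node n7 = {7, NULL, NULL};
    static node n9 = {9, &n7, NULL};
    static node n6 = {6, &n5, &n9};
    static node n4 = {4, NULL, &n6};
    static node n3 = {3, &n1, &n4};
    return &n3;
}
```

On this tree the logged order is 2, 1, 5, 7, 9, 6, 4, 3, which is exactly the post-order sequence, and the returned height is 4 (the path 3-4-6-9-7).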

5. The problem with the function is that there is no base case to stop the recursion, and hence it will recurse infinitely or, in practice, until the program runs out of stack space. The fix comes from recognizing that the function is computing n! and that 0! = 1. Hence the function can be rewritten properly as:

    int fact(int n) {
        if (n == 0)         /* base case: 0! = 1 */
            return 1;
        else
            return n * fact(n-1);
    }

If you wanted to be really careful, you could also ensure that the initial value of n is non-negative. It would be inefficient for every call to fact to check whether n is non-negative, so one would write a helper function that does the recursion and have fact itself perform the check:

    #include <stdio.h>   /* fprintf, stderr */
    #include <stdlib.h>  /* exit */

    /* Performs the recursion; assumes n >= 0. */
    int fact_helper(int n) {
        if (n == 0)
            return 1;
        else
            return n * fact_helper(n-1);
    }

    /* Validates the argument once, then delegates to fact_helper. */
    int fact(int n) {
        if (n < 0) {
            fprintf(stderr, "fact(%d): The argument must be non-negative\n", n);
            exit(1);
        }
        return fact_helper(n);
    }

6. Suppose that you have a data set with 1,000,000 randomly distributed elements. Compute the average number of searches required to a) find a key that exists and b) determine that a key does not exist in the following data structures. Use big-O notation to do your calculation (e.g., if the average time were O(n^2) for a successful find, you would answer 10^12. Do not write your own answers using exponents; this number was simply too big to write out):

| Data Structure | Key exists | Key does not exist |
|---|---|---|