CS302 --- Binary Search Trees

Brad Vander Zanden

Overview

Binary search trees are a simple and generally efficient internal search structure that can be used when:

you want to maintain an ordered set of keys,
keys may be dynamically added to and deleted from the set of keys, and
you want to be able to efficiently find keys

Fundamental Property of Binary Search Trees

At each interior node,

all keys in the left subtree are strictly less than the interior node's key, and
all keys in the right subtree are greater than or equal to the interior node's key
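
The property above can be checked mechanically. The following is a minimal sketch, not code from the course: the node type bst_node and the functions in_range and is_bst are hypothetical names, and the sketch assumes missing children are marked with NULL rather than with the dummy node used later in these notes. Each subtree is checked against a half-open range [lo, hi) that narrows on the way down.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical node type for this sketch: NULL marks a missing child. */
struct bst_node {
    int key;
    struct bst_node *left, *right;
};

/* Returns 1 if every key in the subtree lies in [lo, hi) and the
   ordering property holds at every node below, otherwise 0. */
static int in_range(const struct bst_node *n, long long lo, long long hi) {
    if (n == NULL)
        return 1;                           /* an empty subtree is trivially ordered */
    if (n->key < lo || n->key >= hi)
        return 0;                           /* key escapes the range set by an ancestor */
    return in_range(n->left, lo, n->key)    /* left keys: strictly less */
        && in_range(n->right, n->key, hi);  /* right keys: greater than or equal */
}

/* Returns 1 if the tree rooted at root satisfies the property. */
int is_bst(const struct bst_node *root) {
    /* long long bounds let INT_MAX itself be a legal key */
    return in_range(root, (long long)INT_MIN, (long long)INT_MAX + 1);
}
```

Passing the bounds down matters: comparing each node only with its immediate children would accept trees in which a key deep in the left subtree is larger than the root.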

Search for a Key

Strategy

compare the search key with the key at the current interior node; if they are equal, the search is done
if the search key is less than the interior node's key, go to the left subtree; otherwise go to the right subtree

Code

	#include <stdlib.h>	/* malloc and free, used by the insert and delete code */

	struct node {
	    int key;
	    int info;
	    struct node *left, *right;
	};
	struct node *head, *dummy_node;

	int treesearch(int v) {
	    struct node *current_node = head->right;
	    dummy_node->key = v;
	    while (v != current_node->key)
		current_node = (v < current_node->key) ? current_node->left
						       : current_node->right;
	    return current_node->info;
	}

Code Notes

The dummy node is pointed at by all links that do not have children. In other words, all nodes that have keys are interior nodes and every leaf is the dummy node. As a result, every interior node has exactly two children, either of which may be the dummy node.
The search begins by initializing the dummy node's key to v. This initialization guarantees that the search will terminate if v is not contained in the tree.
It is assumed that the initialization procedure that creates the tree sets the dummy node's info field to -1. That way, if the search fails, a value of -1 will be returned.
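
The initialization procedure is not shown in these notes, so the following is only a sketch of what it might look like. The name treeinitialize is hypothetical, and two choices are assumptions: the head node's key is set to INT_MIN so that it is smaller than any real key, and the dummy node's links point back to itself. treesearch is repeated from above so that the example is self-contained.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

struct node {
    int key;
    int info;
    struct node *left, *right;
};
struct node *head, *dummy_node;

/* Hypothetical initializer: creates an empty tree. */
void treeinitialize(void) {
    dummy_node = malloc(sizeof *dummy_node);
    head = malloc(sizeof *head);
    assert(dummy_node != NULL && head != NULL);

    dummy_node->key = 0;                /* overwritten at the start of every search */
    dummy_node->info = -1;              /* value returned by a failed search */
    dummy_node->left = dummy_node;      /* assumption: the dummy node points to itself */
    dummy_node->right = dummy_node;

    head->key = INT_MIN;                /* assumption: smaller than any real key */
    head->info = -1;
    head->left = dummy_node;
    head->right = dummy_node;           /* empty tree: the root link is the dummy node */
}

/* Search as presented above. */
int treesearch(int v) {
    struct node *current_node = head->right;
    dummy_node->key = v;                /* guarantees that the loop terminates */
    while (v != current_node->key)
        current_node = (v < current_node->key) ? current_node->left
                                               : current_node->right;
    return current_node->info;
}
```

With this setup, a search in an empty tree stops immediately at the dummy node and returns -1, and the loop never has to test for a missing child.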

Inserting a Key

Strategy

Find the appropriate spot in the tree for the key.
Link the key into the tree by making its left and right links point to the dummy node and making its parent point to it via the appropriate link.

Code

	void treeinsert(int v, int info) {
	    struct node *parent, *current_node;

	    /* keep track of parent so that the parent can be linked
		to the search key when the appropriate insertion point
		is found */
	    parent = head; 
	    current_node = head->right;
	    while (current_node != dummy_node) {
		parent = current_node;
		current_node = (v < current_node->key) ? current_node->left
						       : current_node->right;
	    }
	    /* allocate a node for the search key, store the key and
		its information, and make the dummy node be the left
		and right child of the new node */
	    current_node = (struct node *) malloc(sizeof(struct node));
	    current_node->key = v;
	    current_node->info = info;
	    current_node->left = dummy_node;
	    current_node->right = dummy_node;

	    /* make the parent point to the new node via the appropriate
		link */
	    if (v < parent->key)
		parent->left = current_node;
	    else
		parent->right = current_node;
	}

Code Notes

The reason for using a head node at the top of the binary tree with a value smaller than any possible key is that it makes insertion of a root node easy if the tree is empty. The parent pointer starts out pointing to the head node, and because no key is less than the head node's key, the test v < parent->key fails and the first key inserted into the tree is linked in as head->right, which is the root node.

Deleting a Key

Strategy

The node has no children: lop it off by making the appropriate link in its parent point to the dummy node (the equivalent of a null link in this implementation).
The node has one child: replace the node with the child.
The node has two children but one of the children has no children of its own: replace the node with the child that has no children. That child takes over the node's other subtree.
The node has two children and both children have children: replace the node with the node that has the next highest key, which is the leftmost node in the node's right subtree.
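
Finding the node with the next highest key can be sketched on its own. The function name next_highest and its sentinel parameter are hypothetical; the sketch assumes the dummy-node layout used in these notes and passes the dummy node in explicitly instead of relying on a global.

```c
#include <stddef.h>

struct node {
    int key;
    int info;
    struct node *left, *right;
};

/* Returns the node holding the next highest key after t's key within
   t's subtree: the leftmost node of t's right subtree. Returns the
   sentinel if t has no right child. */
struct node *next_highest(struct node *t, struct node *sentinel) {
    struct node *low = t->right;

    if (low == sentinel)
        return sentinel;                /* no right subtree to search */
    while (low->left != sentinel)       /* keep going left while a smaller key exists */
        low = low->left;
    return low;
}
```

This node is a safe replacement because its key is at least as large as every key in the deleted node's left subtree and no larger than any other key in the right subtree, so the fundamental property still holds after the swap. It also has no left child, which is why unlinking it only requires promoting its right subtree.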

Code

	void treedelete(int v)
	  {
	    struct node *parent, *current_node;
	    struct node *second_parent, *low_node;
	    struct node *t;

	    /* make the dummy node's key v so that even if v is not in the
		tree the search for v's node will terminate */
	    dummy_node->key = v;

	    /* find the node containing v */
	    parent = head; 
	    current_node = head->right;
	    while (v != current_node->key) {
	        parent = current_node; 
		current_node = (v < current_node->key) ? current_node->left 
						       : current_node->right; 
	    }
	    /* find the node that will replace v's node */
	    t = current_node;
	    /* if v does not have a right child, the left child will
		replace v. This test and the ones below form a single
		if/else chain so that exactly one case is applied. */
	    if (t->right == dummy_node)
		current_node = t->left;

	    /* if v does not have a left child, the right child will
	       replace v */
	    else if (t->left == dummy_node)
		current_node = t->right;

	    /* if v has a right child with no left child, replace v with
		the right child. Additionally, v's left subtree becomes
		the right child's left subtree. */
	    else if (t->right->left == dummy_node) { 
		current_node = current_node->right; 
		current_node->left = t->left; 
	    }
	    /* otherwise find the node with the next highest key and 
		replace v's node with this new "low node" */
	    else {
		second_parent = current_node->right; 
		low_node = second_parent->left;
		/* the low_node will be the leftmost child in the right
		   subtree */
		while (low_node->left != dummy_node) {
		    second_parent = low_node;
		    low_node = low_node->left;
		}
		/* the low_node replaces v's node so we make low_node
			point to v's left and right subtrees. Before we
			do this, we must promote low_node's right subtree
			so that it becomes the left subtree of low_node's
			parent (we know that low_node does not have a left
			subtree, for if it did, low_node would not be the
			lowest node) */
		current_node = low_node; 
		second_parent->left = low_node->right;
		current_node->left = t->left; 
		current_node->right = t->right;
	      }
	    /* free v and reset its parent link only if t is not the
		dummy node (i.e., there was a node corresponding to v
		in the tree) */
	    if (t != dummy_node) {
	        free(t);
	        if (v < parent->key) 
		    parent->left = current_node; 
		else 
		    parent->right = current_node;
	    }
	  }

Code Notes

For simplicity, not all the cases were covered. Additionally, the code always deletes by looking to the right. Over a long sequence of insertions and deletions, this policy can leave the tree slightly unbalanced (average height proportional to sqrt(N) rather than lg N), but in practice this imbalance usually does not hurt performance.

Performance

Average Case

Insert, Search, and Delete all take O(lg N) time when the tree is built from keys that arrive in random order

Worst Case

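Insert, Search, and Delete all take O(N) time, for example when the keys are inserted in sorted order and the tree degenerates into a linked list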
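
The gap between the two cases can be seen by measuring tree height, which bounds the number of comparisons a search makes. This is a small sketch rather than the course code: it uses a simplified node type (hnode) with NULL leaves and a recursive insert, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct hnode {
    int key;
    struct hnode *left, *right;
};

/* Simplified recursive insert with NULL leaves (no head or dummy node). */
static struct hnode *insert(struct hnode *root, int v) {
    if (root == NULL) {
        struct hnode *n = malloc(sizeof *n);
        assert(n != NULL);
        n->key = v;
        n->left = n->right = NULL;
        return n;
    }
    if (v < root->key)
        root->left = insert(root->left, v);
    else
        root->right = insert(root->right, v);
    return root;
}

/* Number of nodes on the longest path from the root down to a leaf. */
static int height(const struct hnode *root) {
    int l, r;
    if (root == NULL)
        return 0;
    l = height(root->left);
    r = height(root->right);
    return 1 + (l > r ? l : r);
}

static void free_tree(struct hnode *root) {
    if (root == NULL)
        return;
    free_tree(root->left);
    free_tree(root->right);
    free(root);
}

/* Inserts keys[0..n-1] in that order and returns the resulting height. */
int height_after_inserts(const int *keys, int n) {
    struct hnode *root = NULL;
    int i, h;
    for (i = 0; i < n; i++)
        root = insert(root, keys[i]);
    h = height(root);
    free_tree(root);
    return h;
}
```

Inserting the keys 1 through 7 in sorted order gives a height of 7, one node per level. Inserting the same keys in the order 4, 2, 6, 1, 3, 5, 7 gives a height of 3.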

Similarity to QuickSort

An inorder traversal of a binary search tree produces a sorted sequence. The root acts much like the pivot in QuickSort, with all values smaller than the pivot appearing in the left subtree and all values greater than or equal to the pivot appearing in the right subtree. Notice, however, that a binary search tree requires more space than QuickSort, since every node stores two links in addition to its key.
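
The sorting behavior can be sketched as follows. This is an illustration under simplifying assumptions, not the course code: the node type tnode uses NULL leaves, and the names insert, inorder, and tree_sort are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct tnode {
    int key;
    struct tnode *left, *right;
};

/* Simplified recursive insert with NULL leaves; equal keys go right. */
static struct tnode *insert(struct tnode *root, int v) {
    if (root == NULL) {
        struct tnode *n = malloc(sizeof *n);
        assert(n != NULL);
        n->key = v;
        n->left = n->right = NULL;
        return n;
    }
    if (v < root->key)
        root->left = insert(root->left, v);
    else
        root->right = insert(root->right, v);
    return root;
}

/* Inorder traversal: left subtree, then the node, then the right subtree.
   Writes keys into out starting at index pos; returns the next free index. */
static int inorder(const struct tnode *root, int *out, int pos) {
    if (root == NULL)
        return pos;
    pos = inorder(root->left, out, pos);
    out[pos++] = root->key;
    return inorder(root->right, out, pos);
}

static void free_tree(struct tnode *root) {
    if (root == NULL)
        return;
    free_tree(root->left);
    free_tree(root->right);
    free(root);
}

/* Sorts a[0..n-1] into ascending order by building a tree from the
   values and then reading it back with an inorder traversal. */
void tree_sort(int *a, int n) {
    struct tnode *root = NULL;
    int i;
    for (i = 0; i < n; i++)
        root = insert(root, a[i]);
    inorder(root, a, 0);
    free_tree(root);
}
```

The extra space mentioned above is visible here: tree_sort allocates one node, with two links, for every element, whereas QuickSort rearranges the array in place.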

Check it Out

/sunshine/homes/bvz/courses/302/src/bintree.c contains the source code that's just been presented. /sunshine/homes/bvz/courses/302/bin/bintree contains a compiled version of this code. bintree will not prompt you for input but it will keep reading lines until you type ctrl-D. The three action codes you can enter are:

d int: delete the search key with value int where int is an integer. Nothing will be printed.
i int1 int2: insert the search key with value int1 into the binary search tree and set its info field to int2. Both values should be integers.
s int: find the search key with value int and print its info field. If the key is not in the tree, an info value of -1 is printed.

For example, the input:

i 10 100
i 2  200
i -3 400
s 2
s 10
d -3
s -3

will produce the output:

key = 2          info = 200
key = 10         info = 100
key = -3         info = -1