UNIX> wc txt/input.txt       # This is in the Trees lecture note directory
   50000  200000 2397902 txt/input.txt
UNIX> head txt/input.txt
INSERT Brooke-Footwork 443-90-4990 898-934-4865
INSERT Sophia-Allison-Bromfield 510-30-7699 873-553-7759
INSERT Grace-Barnabas 948-49-5092 562-672-8825
INSERT Cole-Illogic 225-22-1798 976-177-7104
INSERT Elizabeth-Green 451-59-3245 106-637-5581
INSERT Dylan-Bambi 183-22-7881 033-896-1807
INSERT Anna-Hitch 284-19-1258 072-144-3834
INSERT Michael-Bilayer 741-13-7226 327-981-7902
INSERT Gavin-Harriman 831-80-7194 488-419-0189
INSERT Charlie-Iii 998-93-7448 930-447-4165
UNIX> time cat txt/input.txt | bin/bstree_tester -
real 0m0.435s
user 0m0.424s
sys 0m0.007s
UNIX> time sort txt/input.txt | head -n 1000 | bin/bstree_tester -
real 0m0.512s
user 0m0.506s
sys 0m0.010s
UNIX> time sort txt/input.txt | head -n 2000 | bin/bstree_tester -
real 0m0.633s
user 0m0.629s
sys 0m0.010s
UNIX> time sort txt/input.txt | head -n 4000 | bin/bstree_tester -
real 0m1.273s
user 0m1.269s
sys 0m0.010s
UNIX> time sort txt/input.txt | head -n 8000 | bin/bstree_tester -
real 0m3.747s
user 0m3.741s
sys 0m0.015s
UNIX> time sort txt/input.txt | bin/bstree_tester -
real 2m7.399s
user 2m7.181s
sys 0m0.140s
UNIX>

That's a big problem with binary search trees: when the input is sorted, the tree degenerates into a long chain, and the running time blows up. All 50,000 lines take over two minutes when sorted, versus 0.435 seconds when they are not. AVL trees (and other balanced trees like Splay trees, Red-Black trees, B-trees, 2-3 trees, etc.) make sure that their trees are balanced so that the various operations are much faster. For example, the program avltree_tester is my solution to the AVL Tree lab (which some semesters will not have the pleasure of implementing):
UNIX> time sort txt/input.txt > /dev/null
real 0m0.436s
user 0m0.428s
sys 0m0.007s
UNIX> time cat txt/input.txt | bin/avltree_tester -
real 0m0.296s
user 0m0.291s
sys 0m0.007s
UNIX> time sort txt/input.txt | bin/avltree_tester -
real 0m0.698s
user 0m0.721s
sys 0m0.012s
UNIX>

As you can see, sorting by itself takes about 0.43 seconds, so of the 0.70 seconds in the last run, only about 0.26 seconds goes to the AVL tree. In other words, performing insertions with the AVL tree takes about the same time when the input is sorted as when it is not sorted.
A central operation with AVL Trees is a rotation. It is a way of changing a binary search tree so that it remains a binary search tree, but changes how it is balanced. The concept is illustrated below:
B and D are nodes in a binary search tree. They can occur anywhere in a tree, but we don't worry about what's above them -- just what's below them. A, C and E are subtrees that are rooted as the children of B and D. They may be empty. If they are not empty, then since the tree is a binary search tree, we know that every key in A is less than B, every key in C is between B and D, and every key in E is greater than D.
When we perform a rotation, we perform it about a node. For example, the rotation pictured above rotates about node D to turn the tree on the left into the tree on the right. It also shows that you can turn the tree on the right back into the tree on the left by rotating about node B.
When you rotate about a node, you change the tree so that the node's parent becomes the node's child. The middle subtree (subtree C) changes from being one node's child to being the other node's child. The rotation does not violate any of the properties of binary search trees. However, it changes the shape of the tree, and there are multiple types of trees, such as AVL, Splay and Red-Black trees, that employ rotations to keep themselves balanced.
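Here is a minimal sketch of a rotation in C++, assuming each node stores left, right and parent pointers. The `Node` struct and the name `rotate` are hypothetical and only for illustration; the lab's actual classes may look different.

```cpp
#include <cassert>
#include <string>

// Hypothetical node layout, assumed for this sketch only.
struct Node {
  std::string key;
  Node *left = nullptr;
  Node *right = nullptr;
  Node *parent = nullptr;
  explicit Node(const std::string &k) : key(k) {}
};

// Rotate about n: n's parent becomes n's child, and the middle subtree
// changes from being n's child to being the old parent's child.
// Returns n, which is now the root of this subtree.
Node *rotate(Node *n) {
  Node *p = n->parent;
  assert(p != nullptr);        // You cannot rotate about the root.
  Node *g = p->parent;         // Grandparent; nullptr if p was the root.

  if (p->left == n) {          // n is a left child, so p becomes n's right child.
    Node *middle = n->right;
    p->left = middle;
    if (middle != nullptr) middle->parent = p;
    n->right = p;
  } else {                     // n is a right child, so p becomes n's left child.
    Node *middle = n->left;
    p->right = middle;
    if (middle != nullptr) middle->parent = p;
    n->left = p;
  }
  p->parent = n;

  n->parent = g;               // Hook n in where p used to be.
  if (g != nullptr) {
    if (g->left == p) g->left = n; else g->right = n;
  }
  return n;
}
```

If B is the parent of D, then `rotate` called on D makes D the parent of B and hands subtree C to B. Calling `rotate` on B afterward restores the original shape. The sorted order of the keys (A, B, C, D, E) is the same before and after.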
Below are some example rotations. Make sure you understand all of them:
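The definition of an AVL tree is as follows: it is a binary search tree in which, for every node, the heights of the node's left and right subtrees differ by at most one. An empty tree has a height of zero, and a tree with a single node has a height of one.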
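To make the balance condition concrete, here is a small sketch that checks it. It assumes the usual definition: an empty tree has height 0, a single node has height 1, and at every node the heights of the two subtrees differ by at most one. The `Node` struct and function names are hypothetical. The check only looks at shape, not at key ordering, and it recomputes heights from scratch, which is fine for a check but too slow for a real implementation (that's why the examples later annotate each node with its height).

```cpp
#include <algorithm>
#include <cassert>   // Only needed if you assert on the results.
#include <cstdlib>

// Hypothetical node layout, assumed for this sketch only.
struct Node {
  Node *left = nullptr;
  Node *right = nullptr;
};

// Height of the tree rooted at n. An empty tree has height 0.
int height(const Node *n) {
  if (n == nullptr) return 0;
  return 1 + std::max(height(n->left), height(n->right));
}

// True if every node's two subtrees differ in height by at most one.
bool is_balanced(const Node *n) {
  if (n == nullptr) return true;
  if (std::abs(height(n->left) - height(n->right)) > 1) return false;
  return is_balanced(n->left) && is_balanced(n->right);
}
```

A node whose left subtree is empty (height 0) and whose right subtree has height 2 fails the check, while a node with two leaf children passes.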
Below are some AVL trees:
And below are two trees that are binary search trees, but are not AVL trees.
Binky violates the definition
Fred violates the definition
Let's try some examples. Suppose I have the following AVL tree -- I now annotate the nodes with their heights:
If I insert Ahmad, take a look at the resulting tree:
The new node Ahmad has a height of one, and when I travel the path up to the root, I change Baby Daisy's height to two. However, her node is not imbalanced, since the heights of her subtrees are 1 and 0. Moving on, Binky's height is unchanged, so we can stop -- the resulting tree is indeed an AVL tree.
However, suppose I now try to insert Waluigi. I get the following tree:
Traveling from the new node to the root, I see that Fred violates the balance condition. Its left child is an empty tree, and as such has a height of 0. Its right child has a height of 2. I have to rebalance the tree.
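The walk from the new node toward the root can be sketched as follows. This assumes each node stores its own height and a parent pointer; the `Node` struct and the function name are hypothetical. The function fixes heights as it goes and returns the first imbalanced node it meets, or `nullptr` if it can stop early because some node's height did not change.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical node layout, assumed for this sketch only.
struct Node {
  Node *left = nullptr, *right = nullptr, *parent = nullptr;
  int height = 1;              // A single node has height 1.
};

// Stored height, with an empty tree counting as height 0.
int h(const Node *n) { return n == nullptr ? 0 : n->height; }

// Call this with the newly inserted leaf. Walks toward the root,
// updating stored heights. Returns the first imbalanced node on the
// path, or nullptr if the tree is still an AVL tree.
Node *first_imbalance_after_insert(Node *leaf) {
  assert(leaf != nullptr && leaf->left == nullptr && leaf->right == nullptr);
  leaf->height = 1;
  for (Node *n = leaf->parent; n != nullptr; n = n->parent) {
    int lh = h(n->left), rh = h(n->right);
    if (lh - rh > 1 || rh - lh > 1) return n;     // Must rebalance here.
    int new_height = 1 + std::max(lh, rh);
    if (new_height == n->height) return nullptr;  // Ancestors are unaffected.
    n->height = new_height;
  }
  return nullptr;                                 // Reached the root.
}
```

With a node whose left subtree is empty and whose right child just grew to height 2, the function returns that node, which is the Fred situation above. In the Ahmad situation it updates one height and then stops at the first ancestor whose height is unchanged.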
Up to two of the three subtrees A, C and E may be empty in this picture, but all three won't be empty. You'll note that the two pictures are mirror copies of one another.
Now, each of these may be further broken up into two cases, which we call "Zig-Zig" and "Zig-Zag". Let's concentrate on Zig-Zig, because it is simpler. Here is what it looks like in its two mirror images:
You'll note that the defining feature is that the direction of the imbalance either goes from the right child of the root to its right child (in the left picture), or from the left child of the root to its left child (in the right picture). That's why it's called "Zig-Zig".
To fix the Zig-Zig imbalance, you rotate about the child. That "fixes" the imbalance in each case, and it also decreases the height of the tree. In the pictures below, make sure that you double-check all of the nodes to make sure they meet their balance conditions:
This is the left tree above, rotated about node D.
This is the right tree above, rotated about node B.
It's now a good time to do some examples where we insert a node into an AVL tree, it becomes imbalanced due to a Zig-Zig imbalance, and we then fix it with a rotation. What I'm going to do in each picture below is show the tree and state what node we're inserting. Then I'll draw the resulting tree, which is imbalanced, and I'll shade the A, C and E subtrees. I'll then show the balanced tree that results when you perform the rotation:
We insert "Ralph".
"Khloe" is imbalanced.
It's an AVL Tree again!
We insert "Becca".
"Eunice" is imbalanced.
It's an AVL Tree again!
We insert "Zelda".
"Henry" is imbalanced.
It's an AVL Tree again!
The "Zig-Zag" imbalance happens when the imbalance goes right, then left, or left, then right. Here's what it looks like in its two mirror images:
To explain how to fix this, we need to blow up the C tree in the picture above, relabeling the nodes and subtrees so that they make sense:
To rebalance the Zig-Zag case, we need to rotate twice about the grandchild. In each of these pictures, the grandchild is D. In the pictures below, we rotate once about D, but the tree is not balanced yet:
One rotation about D:
One rotation about D:
We perform one more rotation about D, and now the tree is balanced. In fact, in both cases, the resulting trees are identical!
After the second rotation about D:
After the second rotation about D:
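The two fixes described above (one rotation about the child for Zig-Zig, two rotations about the grandchild for Zig-Zag) can be sketched together in one function. This is an illustration under assumptions, not the lab solution: the `Node` struct, the stored `height` field and the names `rotate` and `rebalance` are all hypothetical. The sketch only repairs the subtree rooted at the imbalanced node, so heights above it are left to the caller.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical node layout, assumed for this sketch only.
struct Node {
  int key;
  Node *left = nullptr, *right = nullptr, *parent = nullptr;
  int height = 1;
  explicit Node(int k) : key(k) {}
};

int h(const Node *n) { return n ? n->height : 0; }
void fix_height(Node *n) { n->height = 1 + std::max(h(n->left), h(n->right)); }

// Rotate about n (n's parent becomes n's child), then refresh both heights.
void rotate(Node *n) {
  Node *p = n->parent;
  assert(p != nullptr);
  Node *g = p->parent;
  if (p->left == n) {
    p->left = n->right;
    if (p->left) p->left->parent = p;
    n->right = p;
  } else {
    p->right = n->left;
    if (p->right) p->right->parent = p;
    n->left = p;
  }
  p->parent = n;
  n->parent = g;
  if (g) { if (g->left == p) g->left = n; else g->right = n; }
  fix_height(p);               // p is now below n, so fix it first.
  fix_height(n);
}

// n is imbalanced (its subtrees' heights differ by two).
// Returns the new root of the rebalanced subtree.
Node *rebalance(Node *n) {
  int diff = h(n->left) - h(n->right);
  assert(diff == 2 || diff == -2);
  Node *child = diff > 0 ? n->left : n->right;           // The taller child.
  Node *inner = diff > 0 ? child->right : child->left;   // Toward the middle.
  Node *outer = diff > 0 ? child->left : child->right;   // Same direction again.
  if (h(inner) > h(outer)) {   // Zig-Zag: rotate twice about the grandchild.
    rotate(inner);
    rotate(inner);
    return inner;
  }
  rotate(child);               // Zig-Zig: rotate once about the child.
  return child;
}
```

For example, inserting 1, 2, 3 in order produces a Zig-Zig at 1, and `rebalance` makes 2 the root with one rotation. Inserting 1, 3, 2 produces a Zig-Zag at 1, and `rebalance` also makes 2 the root, this time with two rotations.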
Let's do some examples:
We insert "Don".
"Eve" is imbalanced.
It's an AVL Tree again!
We insert "Eve".
"Kim" is imbalanced.
It's an AVL Tree again!
We insert "Ginger".
"Brad" is imbalanced.
It's an AVL Tree again!
Zig-Zig -- same as above.
This case only occurs with deletion. We treat it as a Zig-Zig.
Zig-Zag -- same as above.
Unlike insertion, you can't stop after rebalancing. Instead, you need to keep traveling toward the root. You only stop when you reach the root, or when you reach a node whose height doesn't change (because then the heights of its ancestor nodes won't change either).
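Here is a sketch of that upward walk after a deletion. As before, the `Node` struct and the names are hypothetical. It assumes the caller has already unlinked the deleted node and passes in the node where the removal happened, typically the removed node's parent, with stored heights not yet updated. When the two grandchild subtrees tie in height, which is the case that only occurs with deletion, the sketch treats it as a Zig-Zig.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical node layout, assumed for this sketch only.
struct Node {
  int key;
  Node *left = nullptr, *right = nullptr, *parent = nullptr;
  int height = 1;
  explicit Node(int k) : key(k) {}
};

int h(const Node *n) { return n ? n->height : 0; }
void fix_height(Node *n) { n->height = 1 + std::max(h(n->left), h(n->right)); }

// Rotate about n (n's parent becomes n's child), then refresh both heights.
void rotate(Node *n) {
  Node *p = n->parent;
  assert(p != nullptr);
  Node *g = p->parent;
  if (p->left == n) {
    p->left = n->right;
    if (p->left) p->left->parent = p;
    n->right = p;
  } else {
    p->right = n->left;
    if (p->right) p->right->parent = p;
    n->left = p;
  }
  p->parent = n;
  n->parent = g;
  if (g) { if (g->left == p) g->left = n; else g->right = n; }
  fix_height(p);
  fix_height(n);
}

// start is the lowest node whose subtree lost a node.
// Returns the (possibly new) root of the whole tree.
Node *fix_after_delete(Node *start) {
  assert(start != nullptr);
  Node *n = start;
  while (true) {
    int old_height = n->height;          // Subtree height before the deletion.
    int diff = h(n->left) - h(n->right);
    if (diff > 1 || diff < -1) {
      Node *child = diff > 0 ? n->left : n->right;
      Node *inner = diff > 0 ? child->right : child->left;
      Node *outer = diff > 0 ? child->left : child->right;
      if (h(inner) > h(outer)) {         // Zig-Zag: two rotations.
        rotate(inner); rotate(inner); n = inner;
      } else {                           // Zig-Zig, including the tie case.
        rotate(child); n = child;
      }
    } else {
      fix_height(n);
    }
    // Stop at the root, or when this subtree's height did not change.
    if (n->parent == nullptr || n->height == old_height) break;
    n = n->parent;
  }
  while (n->parent != nullptr) n = n->parent;   // Climb to report the root.
  return n;
}
```

The comparison against `old_height` is what implements the stopping rule: if the subtree is as tall after the fix as it was before the deletion, nothing above it can have changed.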
Let's look at some examples. As with insertion, I'll show an original tree before deletion, the tree after deletion, but before rebalancing, and the tree after rebalancing.
We delete "Hal".
We check Hal's parent, Ian, and it's balanced and its height is unchanged. We're done.
We delete "Ian".
As we move up to the root, we see that "Cal" is imbalanced. It's a Zig-Zig, so we rotate about "Bob".
It's an AVL Tree again. You'll note we still had to travel up from Bob to the root, and change Kim's height from 5 to 4.
We delete "Nell". To do that, we find the largest node in Nell's left subtree -- Anne. We delete Anne and replace Nell with Anne.
Anne is imbalanced. It's a Zig-Zag, so we rotate twice about "Omar".
It's an AVL Tree again.
We delete "Naomi". To do that, we find the largest node in Naomi's left subtree -- Jet. We delete Jet and replace Naomi with Jet.
Jet is imbalanced. It's a Zig-Zig, so we rotate about "Samson".
That subtree is balanced, but we have to continue going to the root. We find that "Henry" is also imbalanced, and that it's a Zig-Zag. We perform two rotations about "Dub".
Now, we're done and the tree is balanced.