CS140 Final Answers

Fall 2018


  1. (10 points) Show the binary search tree that results when the keys are presented in the following order:
    	300 150 40 80 450 600 550 200 800 20
    
                       ----- 300 -----
                      /               \
                 ---150---            450---
                /         \                 \
              40          200               600
             /  \                          /   \
           20   80                       550    800
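
    The tree above can be checked with a minimal sketch of standard BST insertion. The Node type and the function names are assumptions made for this illustration and are not the course's BSTree class. A preorder listing (node, left, right) is used because it determines the shape of a binary search tree.

```cpp
#include <cassert>
#include <vector>

// Hypothetical node type for illustration; not the course's BSTNode.
struct Node {
    int key;
    Node *left = nullptr;
    Node *right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Standard unbalanced BST insertion: smaller keys go left, larger keys go right.
Node *insert(Node *root, int key) {
    if (root == nullptr) return new Node(key);
    if (key < root->key) root->left = insert(root->left, key);
    else if (key > root->key) root->right = insert(root->right, key);
    return root;  // duplicate keys are ignored
}

// Preorder traversal: visit the node, then its left subtree, then its right subtree.
void preorder(const Node *n, std::vector<int> &out) {
    if (n == nullptr) return;
    out.push_back(n->key);
    preorder(n->left, out);
    preorder(n->right, out);
}

// Inserts the keys from question 1 in the given order and returns the preorder listing.
std::vector<int> buildQuestion1() {
    const int keys[] = {300, 150, 40, 80, 450, 600, 550, 200, 800, 20};
    Node *root = nullptr;
    for (int k : keys) root = insert(root, k);
    assert(root != nullptr);
    std::vector<int> out;
    preorder(root, out);
    return out;  // nodes are deliberately not freed in this short sketch
}
```

    Calling buildQuestion1() yields 300 150 40 20 80 200 450 600 550 800, which matches the diagram read in preorder.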
    
    
  2. (10 points) Show the binary search tree that results if susan is deleted from the tree below:
                    ------nancy----------------
                   /                           \
                -fred-                     ---susan-------
                /     \                   /               \
            bonnie  george             peter            zachary
                   /      \                \             /
                 charles  nick              sarah     yifan
                                           /          /
                                      rebecca    xavier
                                     /
                                ralph
    
    To delete "susan" from this tree we must find the largest key in her left subtree, which is "sarah", and replace susan with sarah. sarah's original node must then be deleted. Because sarah has a single child, rebecca, we delete sarah's node and put rebecca in its place, producing the tree below (the affected nodes are sarah and rebecca):
                    ------nancy----------------
                   /                           \
                -fred-                     ---sarah-------
                /     \                   /               \
            bonnie  george             peter            zachary
                   /      \                \             /
                 charles  nick            rebecca     yifan
                                           /          /
                                      ralph      xavier
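
    The deletion rule used here (replace a two-child node with the largest key in its left subtree) can be sketched as follows. NameNode, insertName, and eraseName are hypothetical names for this illustration. The sketch copies the predecessor's key upward rather than relinking nodes, which is one of several reasonable ways to implement the same rule.

```cpp
#include <cassert>
#include <string>

// Hypothetical node type for illustration; the exam does not define one.
struct NameNode {
    std::string key;
    NameNode *left = nullptr;
    NameNode *right = nullptr;
    explicit NameNode(const std::string &k) : key(k) {}
};

// Plain BST insertion, used only to build a tree to delete from.
NameNode *insertName(NameNode *root, const std::string &key) {
    if (root == nullptr) return new NameNode(key);
    if (key < root->key) root->left = insertName(root->left, key);
    else if (key > root->key) root->right = insertName(root->right, key);
    return root;
}

// Removes `key` from the subtree rooted at `root` and returns the new subtree root.
NameNode *eraseName(NameNode *root, const std::string &key) {
    if (root == nullptr) return nullptr;  // key not present
    if (key < root->key) {
        root->left = eraseName(root->left, key);
    } else if (key > root->key) {
        root->right = eraseName(root->right, key);
    } else if (root->left != nullptr && root->right != nullptr) {
        // Two children: find the largest key in the left subtree...
        NameNode *pred = root->left;
        while (pred->right != nullptr) pred = pred->right;
        assert(pred->key < root->key);
        // ...copy it up, then delete the predecessor's old node.
        root->key = pred->key;
        root->left = eraseName(root->left, root->key);
    } else {
        // Zero or one child: splice the child (possibly null) into this position.
        NameNode *child = (root->left != nullptr) ? root->left : root->right;
        delete root;
        return child;
    }
    return root;
}
```

    With the right-hand side of the tree above built by insertName, eraseName(root, "susan") leaves sarah in susan's position and rebecca as peter's right child.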
    
    

  3. (10 points) Show the result of doing a single left rotation about the node 500. Do not worry if the rotation increases the height of the tree. All I care about is whether you know how to perform a rotation.
                                -300-
                               /     \
                             175     500
                            /       /   \
                          100     400   600
                         /       /   \
                       50      350   450
    
    The left rotation will cause 300 to become a left child of 500. In so doing the left subtree of 500, which is rooted at 400, will become an orphan because 300 is taking its place as 500's left child. Therefore 300 adopts 400 as its right child, while retaining the tree rooted at 175 as its left child. Note that it is ok for 300 to adopt 400 as its right subtree because all of the values in 400's tree are greater than 300. The final tree becomes:
                                    --500--
                                   /       \
                                -300-      600
                               /     \
                             175     400
                            /       /   \
                          100     350   450
                         /
                       50
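
    The pointer changes described above amount to two assignments. The sketch below assumes a hypothetical RotNode type with no parent pointers, so the caller is responsible for attaching the returned node where the old subtree root used to be.

```cpp
#include <cassert>

// Hypothetical node type for illustration (no parent pointer, no sentinel).
struct RotNode {
    int key;
    RotNode *left;
    RotNode *right;
};

// Left rotation about the right child of `parent`.
// Returns the new root of this subtree, which is the former right child.
RotNode *rotateLeft(RotNode *parent) {
    assert(parent != nullptr && parent->right != nullptr);
    RotNode *child = parent->right;
    parent->right = child->left;  // parent adopts the child's orphaned left subtree
    child->left = parent;         // parent becomes the child's left child
    return child;
}
```

    For the tree in this question, calling rotateLeft on the node 300 returns the node 500, with 300 as its left child and 400 as 300's right child.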
    

  4. (12 points) If we delete 60 from the following AVL tree
                         --100---
                        /        \
                       40        200
                         \      /   \
                         60   150   400
                             /  \
                           125  175
      
    we end up with the following binary search tree that is not a valid AVL tree and hence needs to be re-balanced:
                         --100---
                        /        \
                       40        200
                                /   \
                              150   400
                             /  \
                           125  175
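
    Why the tree above is not a valid AVL tree can be checked mechanically. The following is a minimal sketch that assumes a hypothetical AvlNode type and measures height in nodes (an empty tree has height 0). It only tests the balance condition and does not perform the re-balancing.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical node type for illustration.
struct AvlNode {
    int key;
    AvlNode *left;
    AvlNode *right;
};

// Height in nodes: an empty tree has height 0, a single node has height 1.
int height(const AvlNode *n) {
    if (n == nullptr) return 0;
    return 1 + std::max(height(n->left), height(n->right));
}

// Balance factor = height(left) - height(right); AVL requires -1, 0, or 1.
int balance(const AvlNode *n) {
    assert(n != nullptr);
    return height(n->left) - height(n->right);
}

// True when every node in the tree satisfies the AVL balance condition.
bool isAvl(const AvlNode *n) {
    if (n == nullptr) return true;
    int b = balance(n);
    return b >= -1 && b <= 1 && isAvl(n->left) && isAvl(n->right);
}
```

    After 60 is deleted, the left subtree of 100 has height 1 and the right subtree has height 3, so balance returns -2 at the root and isAvl returns false.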
      

  5. (10 points) Behold the following recursive function that computes and returns the sum of an array of n numbers:
    int sum(int numbers[], int start, int end) {
    1)   int middle = (start+end)/2;
    2)   return sum(numbers, start, middle) + sum(numbers, middle + 1, end);
    }
      
    Answer the following questions about this function:

    1. Why is this function incorrect? You must answer in 3 sentences or fewer.

      The function is incorrect because it is missing a base case and hence will recurse infinitely.

    2. Write the C++ code fragment that you must add to the above function to make it compute the sum correctly and indicate the line that it should be placed before or after (e.g., you might write after line x where x is the line number). You may not modify any code from the above function.

      The base case occurs when there is only one entry in the array to be added. In this case we simply return the value associated with that entry. We know there is only one entry in the array to be added when the start and end indices are the same. Hence the code fragment for the base case is:

      if (start == end)
        return numbers[start];
      	  
      This base case must be added before the recursive case, which means that it should be added after line 1 and before line 2.
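
      Putting the pieces together, the corrected function might look like the sketch below. The line-number labels are dropped so that it compiles, and the assert is an addition for illustration only: it documents the assumption that the range holds at least one entry.

```cpp
#include <cassert>

// Sums numbers[start..end] (inclusive) by splitting the range in half.
int sum(int numbers[], int start, int end) {
    assert(start <= end);            // assumption: the range is non-empty
    int middle = (start + end) / 2;
    if (start == end)                // base case: exactly one entry left
        return numbers[start];
    return sum(numbers, start, middle) + sum(numbers, middle + 1, end);
}
```

      Each recursive call works on a strictly smaller, non-empty range, so the recursion always reaches the base case.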

  6. (15 points) The greatest common divisor (gcd) of two non-zero numbers is the largest positive integer that divides the numbers with a remainder of 0. For example, the gcd of 48 and 20 is 4, the gcd of 48 and 12 is 12, and the gcd of 12 and 12 is 12.

    Euclid's algorithm is a recursive algorithm for finding the gcd of two integers a and b that can be written as the following C++ function:

    int gcd(int a, int b) {
    1)    if (a == b) return a;
    2)    else if (a < b) return gcd(a, b-a);
    3)    else return gcd(a-b, b);
    }	
    Suppose I have the following main function:
    int main() {
    1)    int x = 48;
    2)    int y = 20;
    3)    cout << gcd(x, y) << endl;
    }
    
    In the stack diagram shown below, complete the stack frames that exist when the series of recursive calls finally arrives at the base case for gcd.

    
    	  |----------------------------------------|
    	  | main:                                  |
    	  |    line: 3                             |
    	  |    x: 48                               |
    	  |    y: 20                               |
    	  |----------------------------------------|
    	  | gcd:                                   |
    	  |    line: 3                             |
    	  |    a: 48                               |
    	  |    b: 20                               |
    	  |----------------------------------------|
    	  | gcd:                                   |
    	  |    line: 3                             |
    	  |    a: 28                               |
    	  |    b: 20                               |
    	  |----------------------------------------|
    	  | gcd:                                   |
    	  |    line: 2                             |
    	  |    a: 8                                |
    	  |    b: 20                               |
    	  |----------------------------------------|
    	  | gcd:                                   |
    	  |    line: 2                             |
    	  |    a: 8                                |
    	  |    b: 12                               |
    	  |----------------------------------------|
    	  | gcd:                                   |
    	  |    line: 3                             |
    	  |    a: 8                                |
    	  |    b: 4                                |
    	  |----------------------------------------|
    	  | gcd:                                   |
    	  |    line: 1                             |
    	  |    a: 4                                |
    	  |    b: 4                                |
    	  |    return value: 4                     |
    	  |----------------------------------------|
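
    The sequence of frames in the diagram can be reproduced with a traced variant of the same function. gcdTraced and the calls vector are hypothetical additions for this illustration. The assert states an assumption not checked by the exam's version: both arguments must be positive, since the subtraction form does not terminate otherwise.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Same subtraction-based gcd as above, but records the (a, b) pair of every call.
int gcdTraced(int a, int b, std::vector<std::pair<int, int>> &calls) {
    assert(a > 0 && b > 0);  // assumption: positive inputs only
    calls.push_back({a, b});
    if (a == b) return a;
    else if (a < b) return gcdTraced(a, b - a, calls);
    else return gcdTraced(a - b, b, calls);
}
```

    For gcdTraced(48, 20, calls) the recorded pairs are (48, 20), (28, 20), (8, 20), (8, 12), (8, 4), (4, 4), one per gcd frame in the diagram.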
    	
  7. (12 points) Behold the following 4 fragments of code:
    (a)
    int i, j;
    int sum = 0;
    for (i = 0; i < n*n; i++) 
        sum += i;
    for (j = 0; j < n/2; j++)
        sum *= j;
         
    (b)
    for (year = 0; year < 2000; year++) {
        for (day = 1; day <= 365; day++) {
            if (n == year % day) {
                printf("year = %d and day = %d\n", year, day);
            }
        }
    }
    			
    (c)
    int mystery(vector<int> &row, vector<int> &col) {
      int i;
      int result = 0;
    
      for (i = 1; i < row.size(); i *= 2) {
        result += row[i] * col[i];
      }
      return result;
    }
    
    (d)
    In the following code, assume that the function f is O(n^2):
    int i, j;
    int sum_f = 0, sum_loops = 0;
    for (i = 0; i < n; i++) {
        sum_f += f(i);
    }
    for (i = 0; i < n; i++) {
      for (j = i; j < n; j++) {
         sum_loops += i * j;
      }
    }
    if (sum_f > sum_loops) 
        cout << sum_f << endl;		  
    else
        cout << sum_loops << endl;		     
    			 

    For each fragment of code, please circle its Big-O running time:

    1. O(n^2): The first loop executes n^2 times and the loop body has a constant number of instructions so the running time of the first loop is n^2. The second loop executes n/2 times and the loop body has a constant number of instructions so the running time of the second loop is n. The loops run sequentially so the running time of the code is T(n) = n^2 + n or O(n^2).

    2. O(1): The running time is independent of n. Even though there are two nested loops, each loop is executed a constant number of times, regardless of the size of n. The loop bodies also execute a constant number of instructions and hence the overall running time of the code is constant or O(1).

    3. O(log n): n is the size of the row vector (i.e., it is the number of elements in the row vector) and i is doubled each time, which makes i get to the end of the row vector after log n iterations of the loop. The loop body executes a constant number of instructions so the running time of the code fragment is O(log n).

    4. O(n^3): The first loop executes n times and the loop body requires O(n^2) time to execute f(i), so the running time of the first loop is (n * n^2) or n^3. The second part of the code fragment is two nested loops. The inner loop executes n times when i is 0, (n-1) times when i is 1, (n-2) times when i is 2, and so on as follows:
      i     number of times inner loop executes
      0     n
      1     n-1
      2     n-2
      ...
      n-2   2
      n-1   1
      		  
      The loop body of the inner loop runs in constant time and so the running time of the inner loop is the number of iterations it performs. The running time of the outer loop is the sum of the running times of the inner loop. The sum of the running times of the inner loop is "n + (n-1) + (n-2) + ... + 2 + 1" which as shown in class is n(n+1)/2. Thus the running time of the nested loops is O(n^2).

      Finally the running time of the conditional at the end of the code fragment is constant time. Thus the running time of the code fragment is n^3 + n^2 + 1 which is O(n^3).
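
    The iteration counts used in the answers for fragments (c) and (d) can be confirmed by counting loop iterations directly. The two helper functions below are hypothetical and exist only for this check. They assume n is small enough that doubling i does not overflow an int.

```cpp
#include <cassert>

// Counts how many times the inner loop body of the nested loops in fragment (d) runs.
long long triangularIterations(int n) {
    assert(n >= 0);
    long long count = 0;
    for (int i = 0; i < n; i++)
        for (int j = i; j < n; j++)
            count++;
    return count;
}

// Counts iterations of the doubling loop in fragment (c) for a vector of size n.
int doublingIterations(int n) {
    assert(n >= 0);
    int count = 0;
    for (int i = 1; i < n; i *= 2)
        count++;
    return count;
}
```

    For example, triangularIterations(100) returns 5050, which equals 100 * 101 / 2, and doublingIterations(1024) returns 10, which equals log2(1024).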

  8. (12 points)
     
         a. array
         b. vector
         c. stack
         d. deque
         e. hash table
         f. list
         g. binary search tree
         h. AVL tree
    

    For each of the following questions choose the best answer from the above list. Assume that the size of an array is fixed once it is created, and that its size cannot be changed thereafter. Sometimes it may seem as though two or more choices would be equally good. In those cases think about the operations that the data structures support and choose the data structure whose operations are best suited for the problem. You may have to use the same answer for more than one question:

    1. (g) binary search tree: The keys must be kept in sorted order so we will be using a tree. The names of the runners will be inserted in random order so a binary search tree should work fine.

      The data structure you should use if you want to implement a map that records the results of a 10K race and the keys are the last names of the runners. The keys must be kept in alphabetical order and the runners' names will be entered in the order that the runners finish the race (hence if "Joe" finishes before "Barry" then "Joe" will be inserted first and then "Barry").

    2. (h) AVL tree: The keys must be kept in sorted order so we will be using a tree. The times will be inserted in sorted order, which is the worst case for a binary search tree. Hence we must use a balanced AVL tree.

      The data structure you should use if you want to implement a map that records the results of a 10K race and the keys are the race times of the runners. The keys must be kept in sorted order and the race times will be entered in the order that the runners finish the race (hence a runner with the time 15:21 will be inserted before a runner with the time 15:36).

    3. (d) deque: When you insert and remove from the front of a queue, a deque is the fastest data structure available, faster than either a list or a vector (you should never use a vector for inserting/deleting at the front of a queue because each insert/delete takes O(n) time). Remember that a deque is optimized for insertions/deletions at the front or end of a queue. It is also faster to iterate through a deque than it is through a list.

      The data structure you should use if you want to reverse a file by reading its lines, adding each line to the front of the data structure, and then traversing the data structure from front to back and printing the lines.

    4. (h) AVL tree: You need to insert/delete/find emails and they need to be kept in sorted order by time so you will need to use a tree. Further the emails arrive in sorted order by time, which is the worst case for a binary search tree. Hence we need to use a balanced AVL tree.

      The data structure you should use to store a collection of emails where the emails are ordered by the time that they arrived and they are inserted at the time that they arrived. You want to be able to print the emails in sorted order by time of arrival, insert/delete emails, and find emails.

    5. (a) array: Hash tables are implemented using "tables" and tables can be implemented using either arrays or vectors. Since we know the size of the data in advance, we can allocate a fixed-size array that is at least 2 times the size of the data (thus leading to load factors less than 0.5) and know that the table will never become more than half full.

      The data structure that is used to implement hashing with linear probing when the size of the data is known in advance (the answer is not a hash table--I want to know the data structure used to implement the hash table).

    6. (b) vector: Buckets are kept as a list and a vector is the most efficient data structure to use for implementing a list when you only add values at the end of the list.

      The best data structure to use to implement the buckets in separate chaining. Each bucket holds the key/value pairs that hash to that bucket and new key/value pairs are typically added to the end of the bucket.
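
    As an illustration of the deque answer (question 3 in this list), the file-reversal task can be sketched with std::deque. reverseLines is a hypothetical helper that reads from any input stream and returns the reversed text, which keeps the example independent of a particular file name.

```cpp
#include <cassert>
#include <deque>
#include <istream>
#include <sstream>
#include <string>

// Reads lines from `in`, adds each one to the front of a deque,
// then walks the deque from front to back to produce the reversed text.
std::string reverseLines(std::istream &in) {
    std::deque<std::string> lines;
    std::string line;
    while (std::getline(in, line))
        lines.push_front(line);      // constant-time insertion at the front
    std::string out;
    for (const std::string &l : lines) {
        out += l;
        out += '\n';
    }
    return out;
}
```

    Passing a std::ifstream (or std::cin) as the argument reverses a real file. A std::istringstream works the same way for quick experiments.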


Coding Questions

  1. Reverse String
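    // Swaps the characters at start and end, then recurses on the interior.
    void reverseStringHelper(string &s, int start, int end) {
       if (start >= end) return;   // zero or one character left: nothing to swap
       char saveChar = s[start];
       s[start] = s[end];
       s[end] = saveChar;
       reverseStringHelper(s, start+1, end-1);
    }
    
    void reverseString(string &s) {
       // Cast before subtracting so an empty string gives end == -1
       // instead of relying on unsigned wraparound.
       reverseStringHelper(s, 0, static_cast<int>(s.length()) - 1);
    }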
         
  2. Leaf Counting
    int BSTree::recursive_leafcount(BSTNode *n) {
      if (n == sentinel)          // empty subtree: no leaves
        return 0;
      else {
        int leftSize = recursive_leafcount(n->left);
        int rightSize = recursive_leafcount(n->right);
        int numLeaves = leftSize + rightSize;
        if (numLeaves == 0)       // both subtrees are empty, so n itself is a leaf
          return 1;
        else
          return numLeaves;
      }
    }