CS140 Midterm 2 Solutions

Fall 2018

(12 points) Your boss has asked you to choose between two algorithms. The two algorithms have the following running times:
```
     Algorithm 1: T(n) = n + 5n² + 80
     Algorithm 2: T(n) = 10000 log n + 10000
```
1. What is the Big-O running time of Algorithm 1? O(n²)
2. What is the Big-O running time of Algorithm 2? O(log n)
3. If the size of the input will typically be on the order of n < 50, which of the two algorithms should you choose and why?
  You should choose algorithm 1 because for such small values of n, the constants in Algorithm 2 outweigh the better Big-O running time of Algorithm 2. For example, for n = 50, T(50) for Algorithm 1 is 12630 while T(50) for Algorithm 2 is roughly 66438.
4. If the size of the input will be typically on the order of n > 10000, which of the two algorithms should you choose and why?
  You should choose Algorithm 2 because for large n, the Big-O running time dominates and the constants become insignificant. Since the Big-O running time of Algorithm 2 is less than the Big-O running time of Algorithm 1, you would prefer Algorithm 2.
(14 points) For each of the following questions circle the best answer from the list below. Sometimes it may seem as though two or more choices would be equally good. In those cases think about the operations that the data structures support and what I said about them in class. Then choose the data structure whose operations are best suited for the problem. You may have to use the same answer for more than one question.
```
 
     a. vector
     b. list
     c. deque
     d. map
     e. set
     f. multimap
     g. multiset
     h. hash table
```
- a You want to get the distribution of exam scores by counting the number of exams in the range 0-9, 10-19, 20-29, ..., 90-100. You are guaranteed that the exam scores range from 0-100.
- h You have a test bank of questions and you want to ensure that the test-taker is never presented the same question twice. You decide you want to store the questions that the test-taker has seen. You do not need to keep the questions in sorted order. The two operations you need to perform are insert to insert a question that has been presented to the test-taker and find to determine if a question is in the data structure. You may assume that each question is identified by a unique label.
- c You are implementing a router for the internet which receives packets from various locations and transmits them to other locations. As packets arrive they should be placed at the end of the "line" and packets should be removed from and transmitted from the front of the "line".
- f You are implementing a diary program that stores dates and locations that the diarist visited on that date. The values are kept as (date, location) pairs. It is important to keep the dates in chronological order and a person may visit multiple locations on the same dates (hence there could be multiple pairs that start with the same date).
- a You want to reverse the lines in a file by storing the lines in the same order they are stored in the file (i.e., add each line to the back of the data structure) and then traversing the data structure from back to front once you finish reading all the lines from the file.
- b You want to implement a to-do list where the user can both add and delete items in arbitrary places in the list. The to-do list is not kept in any type of order.
- d You are keeping track of the number of times that each IP (internet) address has accessed your web-site. Each time an IP address accesses your web-site, you access that address in your data structure and increment its count by 1. You decide that it is important to keep the IP addresses in sorted order so that you can perform range queries on them. Each IP address is unique.

(10 points) Suppose I have the following declarations and code:

     string a, b;
     string *x, *y;

Also suppose that the above variables are assigned the following memory addresses:

     a: 0x1000
     b: 0x1100
     x: 0x1200
     y: 0x1204

After each of the following code segments executes, what are the values of a, b, x, and y? Assume that code segments 1 and 2 execute independently (i.e., code segment 2 does not execute after code segment 1).

       Code Segment 1
       x = &b;
       *x = "Smiley";
       y = x;
       *y = "Brad";
       b = *x;

       Code Segment 2
       x = new string("Michelle");    // address of new string = 0x2000
       a = *x;
       y = new string("Charles");     // address of new string = 0x2100
       x = y;
       b = *x;

       Code Segment 1          Code Segment 2

       a: ""                   a: "Michelle"
       b: "Brad"               b: "Charles"
       x: 0x1100               x: 0x2100
       y: 0x1100               y: 0x2100
                               *x: "Charles"
                               *y: "Charles"

(6 points) Suppose you are given the following code:
```
class Picture {
public:  
    ifstream *sourceFile;
    int rows;
    int cols;
};

Picture *p;
string *name1, *name2;

name1 = new string("Hank Aaron");
name2 = name1;
delete name1;
p = new Picture();
p->sourceFile = new ifstream();
p->rows = 30;
p->cols = 40;
*name2 = "Hammerin Hank Aaron";
p->sourceFile->open("mountain.jpg");
```
Answer the following questions about the above code:
1. This term is used to describe what name2 becomes after name1 is deleted: dangling pointer
2. What is likely to happen when p->sourceFile->open is executed and why is it likely to happen? Use no more than 3 sentences for your answer.
  The most likely result is a seg fault because there is a good chance that name2 points to the same memory as p and hence the string "Hammerin Hank Aaron" has overwritten the memory pointed to by p. In particular, the memory address in p->sourceFile has been overwritten with random characters and therefore is likely to no longer represent a valid memory address. When you try to access an invalid memory address, you get a seg fault.
  The other, less likely result is that the characters written over p->sourceFile happen to form a valid memory address, but one that is no longer the address of the ifstream that was allocated. Since p->sourceFile no longer points to the ifstream object, it is hard to predict what will happen, but at the very least the open function will not get called. There is also a good chance that a seg fault will happen because it is unlikely that the memory address to which p->sourceFile points is a valid function address.
(8 points) Suppose I want to insert n telephone numbers into the following data structures and then perform n finds, one for each of the telephone numbers. For each of the following data structures give the total Big-O running time for the find and insert operations (i.e., I do not want the running time for a single insert operation but for all n insert operations). Make the following assumptions:
- For the list and vector, assume that new telephone numbers are always added to the beginning of the data structure.
- For the hash table give me the average case running time.

| Data Structure | Insert     | Find       |
|----------------|------------|------------|
| Map            | O(n log n) | O(n log n) |
| Hash Table     | O(n)       | O(n)       |
| List           | O(n)       | O(n²)      |
| Vector         | O(n²)      | O(n²)      |

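Here is the rationale for each of the answers:
- Map: each insert and each find takes O(log n) time, so n of them take O(n log n) time.
- Hash Table: each insert and each find takes O(1) time on average, so n of them take O(n) time.
- List: each insert at the beginning takes O(1) time, for O(n) total, but each find is a linear search that takes O(n) time, for O(n²) total.
- Vector: each insert at the beginning must shift all of the existing elements, which takes O(n) time, for O(n²) total. Each find is a linear search that takes O(n) time, for O(n²) total.
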
(10 points) The following code is reading the first and last names of individuals and is keeping track of all of the first names that are associated with a last name. For example, if the input is:
```
      Brad VanderZanden
      Smiley VanderZanden
      Minnie Mouse
      Mickey Mouse
      Ebby VanderZanden
    
```
then the set {Brad, Ebby, Smiley} will be associated with "VanderZanden" and the set {Mickey, Minnie} will be associated with "Mouse".
```
  typedef set <string> fnset;

  map <string, fnset> lnames;
  string fn, ln;
  fnset firstnames;

  while (cin >> fn >> ln) {
     firstnames = lnames[ln];
     firstnames.insert(fn);
     lnames[ln] = firstnames;
  }
    
```
1. Explain why the code above is inefficient (it computes the correct result but is inefficient). Use no more than 3 sentences to describe the problem.
  The line firstnames = lnames[ln] makes a copy of the set associated with ln. The next line inserts into this copy and the final line copies this copy back into the original set. Making copies is expensive and should be avoided if possible.
2. How could you rewrite the code to fix the inefficiency? Only modify that portion of the code that needs to be changed.
  You can fix the inefficiency by avoiding the copy and modifying the original set instead. This can be done in two ways.
(12 points) Assume you have the following declaration:
```
class ListNode {
   public:
      string name;
      ListNode *next;
      ListNode *prev;

      ListNode(string n) : name(n) {}
};
```
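Further suppose that a series of inserts has created the following list: the nodes contain "Ben", "Nancy", and "Sarah", in that order, Ben's prev field and Sarah's next field are NULL, and currentNode points to Nancy's node.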

The code below is supposed to move the node containing "Ben" so that it is between "Nancy" and "Sarah". However, the code eventually seg faults and there are some other issues with the code as well.
```
1)      ListNode *nextNode = currentNode->next;
2)      currentNode->prev = currentNode->prev->prev;
3)      currentNode->next = currentNode->prev;
4)      nextNode->prev = currentNode->prev;
5)      currentNode->prev->prev = currentNode;
6)      currentNode->next->next = nextNode;
      
```
1. Draw the diagram that results when the code above is executed up until the statement that seg faults (include nextNode in the diagram). Do not include the seg faulted statement or the effects of executing any statement after the seg faulted statement.
  Statement 2 messes everything up because it prematurely sets the prev field of Nancy's node (which is the node to which currentNode points) to NULL, since Ben's prev field is NULL. That means that statement 3 sets currentNode->next (i.e., Nancy's next field) to NULL and statement 4 sets nextNode->prev (i.e., Sarah's prev field) to NULL.
```
       ----------------       -----------------     -----------------
       | name: "Ben"  |       | name: "Nancy" |     | name: "Sarah" |
       | next: -------|------>| next: NULL    |     | next: NULL    |
       | prev: NULL   |       | prev: NULL    |     | prev: NULL    |
       ----------------       -----------------     -----------------
                                      ^                     ^
                                      |                     |
                                 currentNode            nextNode

     
```
2. Which statement seg faults? 5
3. What does the seg faulting statement appear to be trying to accomplish? Please say something like "it was trying to make Sarah's next field point to Ben's node".
  It appears to be trying to make Ben's prev field point to Nancy's node. This is consistent with Ben being moved after Nancy in the list.
4. One reason the above code is buggy is that it tries to avoid using temporary variables by traversing multiple pointers in a single expression (e.g., currentNode->next->next). These types of multiple traversals often result in buggy code. Rewrite the above code so that 1) it successfully places the node containing "Ben" between the nodes containing "Nancy" and "Sarah", and 2) it is easier to read by introducing an additional temporary variable and eliminating multiple pointer traversals (i.e., in any pointer expression, there should be only one ->).
```
      ListNode *nextNode = currentNode->next;
      ListNode *prevNode = currentNode->prev;

      currentNode->next = prevNode;
      currentNode->prev = prevNode->prev;
      prevNode->next = nextNode;
      prevNode->prev = currentNode;
      nextNode->prev = prevNode;
      
```
(10 points) Assume you have the following code:
```
class Person {
public:
  string name;
  Person *bestFriend;

  Person(string n) { name = n; }
};

string names[] = { "mickey", "bugs", "daffy", "winnie" };
list<Person *> friends;
Person *newPerson;
Person *lastPerson;
int i;

lastPerson = NULL;

for (i = 0; i < 4; i++) {
    newPerson = new Person(names[i]);
    newPerson->bestFriend = lastPerson;
    friends.push_back(newPerson);
    lastPerson = newPerson;
}
```
Draw a diagram that shows the list, the Person objects that get created, and the objects to which each pointer points when the code has finished execution. Make sure that:
- You draw the sentinel nodes for the list
- You show the objects to which newPerson and lastPerson point.
- You show the objects to which the bestFriend pointers point.
- You show the objects to which each list node points



Coding Solutions

Codes

void deleteCodes(list<string> &usedCodes, list<string> &codes) {
   list<string>::iterator usedIter, codesIter;

   codesIter = codes.begin();
   usedIter = usedCodes.begin();

   // The used codes list will be exhausted before the outstanding codes
   // list so we stop when we have exhausted the used codes list
   while (usedIter != usedCodes.end()) {
     if (*codesIter == *usedIter) {
       codesIter = codes.erase(codesIter);
       usedIter++;
     }
     else {
       codesIter++;
     }
   }
}

Clubs

Hash Insert Function

void HashTable::Insert(string clubName, Person *student) {
  unsigned int bucketIndex = Hash(clubName)%table.size();
  size_t i;  // unsigned, to match the type returned by size()
  Club *c;

  for (i = 0; i < table[bucketIndex].size(); i++) {
    if (table[bucketIndex][i]->name == clubName) {
      table[bucketIndex][i]->members.push_back(student);
      return;
    }
  }
  // if we get here then the club did not exist, so create a new club
  // and then add the student to it
  c = new Club(clubName);
  c->members.push_back(student);
  table[bucketIndex].push_back(c);
}

AddStudentsAndClubs

void AddStudentsAndClubs(HashTable *clubs, map<int, Person*> &students) {
  string command;
  while (cin >> command) {
    if (command == "AddStudent") {
      string studentName;
      int studentId;
      cin >> studentId >> studentName;
      students[studentId] = new Person(studentId, studentName);
    }
    else if (command == "JoinClub") {
      string clubName;
      int studentId;
      Person *student;

      cin >> studentId >> clubName;
      student = students[studentId];

      if (student == NULL) {
        continue;  // student does not exist
      }
      else {
        clubs->Insert(clubName, student);
      }
    }
  }
}