HashDisplay Debugger for CS 302

What you need...

YOUR HEADER (.h) FILE: Your .h file should include at least the following .h files:

#include <stdlib.h>
#include <amulet.h>
#include "interface.h"
#include "hashDebug.h"

YOUR .cc FILE: Your .cc file should include at least the following .h files...

#include OPAL_ADVANCED__H

You will also need to include the header file that you write. You may also have to include other header files from the CS302 library, such as the Dlist header file.

INITIALIZATION: You need to initialize the following static variable:

Bucket::numRecordsPerBucket; // number of records per bucket
			// if you don't define this, the default 
			// value is 4.

For example:

int Bucket::numRecordsPerBucket = 4;

TROUBLESHOOTING--READ THIS SECTION BEFORE CODING!

If you declare a global variable of any of the classes described in this document, do NOT initialize the variable when you declare it. Because of the way the code is compiled, global initializers run before the debugger has set up its environment, so the initialization will fail. There is no real reason to make these objects global, but if you do, you can only initialize them inside a function. For example, the following declarations will fail:
```
HashTable *blocks = new HashTable(2);  // hash table with two entries
HashTable blocks(2);
```
The following declaration will succeed, but you will then need to initialize blocks in a function:
```
HashTable *blocks;
```
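A minimal sketch of the safe pattern follows. It uses a stand-in HashTable class so that it compiles without the course library (the real class comes from hashDebug.h), and the function name initTables is hypothetical:

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the library's HashTable, only so this sketch compiles alone.
// With the course library, include "hashDebug.h" instead.
class HashTable {
  public:
   HashTable (int sz) : size(sz) {}
   int getSize () { return size; }
  private:
   int size;
};

HashTable *blocks;   // declared globally, deliberately NOT initialized here
                     // (a global pointer starts out null)

// Hypothetical helper: call it from a function that runs after the
// debugger has set up its environment (e.g., from extendHashTable).
void initTables ()
{
   blocks = new HashTable(2);   // hash table with two entries
}
```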
BEFORE you try to insert any Records into a Bucket, first add the Bucket to a HashTableEntry. A Record needs to know the number of bits to hash on, and it gets that number through its Bucket's connection to a HashTableEntry. If the Bucket is not yet attached to an entry, inserting a Record will cause a seg fault.
When you double your hash table as it expands, be sure to delete the old HashTable object. Otherwise, you could have lots of hash tables popping up on the screen at once!
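The required order of operations can be sketched as follows. The classes here are simplified stand-ins so the example compiles on its own; the helper name attachThenInsert and the constant SLOTS are hypothetical, and SLOTS assumes numRecordsPerBucket is 4:

```cpp
#include <cassert>
#include <cstddef>

const int SLOTS = 4;   // assumed value of Bucket::numRecordsPerBucket

// Stand-ins for the library classes, only so this sketch compiles alone.
class Record {};

class Bucket {
  public:
   Bucket () {}   // like the real class, does not set up its slots
   Record * getRecord (int index) { return slots[index]; }
   void setRecord (int index, Record *r) { slots[index] = r; }
  private:
   Record *slots[SLOTS];
};

class HashTableEntry {
  public:
   HashTableEntry () : bucket(NULL) {}
   void setBucket (Bucket *buc) { bucket = buc; }
   Bucket * getBucket () { return bucket; }
  private:
   Bucket *bucket;
};

// Hypothetical helper showing the order: attach, clear, then insert.
void attachThenInsert (HashTableEntry &entry, Record *r)
{
   Bucket *b = new Bucket();
   entry.setBucket(b);              // 1. connect the bucket to an entry first
   for (int i = 0; i < SLOTS; i++)
      b->setRecord(i, NULL);        // 2. mark every slot as empty
   b->setRecord(0, r);              // 3. only now store a real record
}
```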

HashTable Class

Your hash table directory should be a variable of type HashTable. When you declare a variable to be of type HashTable, you need to give it the size of the hash table. For example:

 blocks = new HashTable(2);  // hash table with two entries

Each entry in a HashTable is of type HashTableEntry, which is described below. A HashTable contains actual HashTableEntry objects rather than pointers to them, so you do not have to allocate memory for a HashTableEntry.

The HashTable class is defined as follows (these are at least the methods you will need):

class HashTable {
  public:
   HashTable (int sz);
   ~HashTable ();
   HashTableEntry& get_value (int i);
   int getNumTableBits ();   // number of bits used by the table--this number
                             // is equal to log2(number of entries) and is
                             // computed for you automatically (i.e., you
                             // never need to set the number of bits in the
                             // table)
                             
   int getSize ();   // number of entries in the table
};

The HashTable object will contain an array of entries of type HashTableEntry. To retrieve an individual entry, you use the get_value function. For example:

HashTableEntry &entry = blocks->get_value(0);  // retrieve entry 0 (by reference, so changes affect the table)
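As a sketch of how get_value and getSize fit together, the loop below visits every entry in a table. The two classes are simplified stand-ins so the example compiles without the course library, and the helper name setAllBits is hypothetical:

```cpp
#include <cassert>
#include <vector>

// Stand-ins for the library classes, only so this sketch compiles alone.
class HashTableEntry {
  public:
   HashTableEntry () : bitsUsed(0) {}
   int getBitsUsed () { return bitsUsed; }
   void setBitsUsed (int n) { bitsUsed = n; }
  private:
   int bitsUsed;
};

class HashTable {
  public:
   HashTable (int sz) : entries(sz) {}
   HashTableEntry& get_value (int i) { return entries[i]; }
   int getSize () { return (int) entries.size(); }
  private:
   std::vector<HashTableEntry> entries;
};

// Hypothetical helper: set the same bit count on every entry in the table.
void setAllBits (HashTable *table, int bits)
{
   for (int i = 0; i < table->getSize(); i++) {
      HashTableEntry &entry = table->get_value(i);  // a reference, not a copy
      entry.setBitsUsed(bits);
   }
}
```

Binding the result of get_value to a reference matters here: a plain HashTableEntry variable would be a copy, and changes made to the copy would not show up in the table.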

HashTableEntry Class

As mentioned in the previous section, a HashTable is composed of HashTableEntry's. The HashTableEntry class definitions you will need are:

class HashTableEntry {
  public:
   HashTableEntry& operator=(const HashTableEntry& hsh);
   void setBucket (Bucket *buc);
   Bucket * getBucket ();
   int getBitsUsed ();
   void setBitsUsed (int n);
};

The = operator allows you to assign one hash table entry to another hash table entry. This operator will be useful when you double the size of a table and want to copy the entries in the old table to the new table.

The setBucket and getBucket methods allow you to associate a bucket with an entry and retrieve the entry's bucket.

The getBitsUsed and setBitsUsed methods allow you to retrieve and set the number of bits used by that entry.
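A minimal sketch of doubling a table with the = operator follows. The classes are simplified stand-ins so the example compiles on its own, and the helper name doubleTable is hypothetical. The index mapping (new entry j copies old entry j / 2) is an assumption: it holds when keys are the leading bits of the hash value, so that one extra bit splits entry i into entries 2i and 2i+1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-ins for the library classes, only so this sketch compiles alone.
class Bucket {};

class HashTableEntry {
  public:
   HashTableEntry () : bucket(NULL), bitsUsed(0) {}
   HashTableEntry& operator=(const HashTableEntry& hsh)
      { bucket = hsh.bucket; bitsUsed = hsh.bitsUsed; return *this; }
   void setBucket (Bucket *buc) { bucket = buc; }
   Bucket * getBucket () { return bucket; }
   int getBitsUsed () { return bitsUsed; }
   void setBitsUsed (int n) { bitsUsed = n; }
  private:
   Bucket *bucket;
   int bitsUsed;
};

class HashTable {
  public:
   HashTable (int sz) : entries(sz) {}
   HashTableEntry& get_value (int i) { return entries[i]; }
   int getSize () { return (int) entries.size(); }
  private:
   std::vector<HashTableEntry> entries;
};

// Hypothetical helper: build a table twice as large, copy the entries,
// delete the old table, and return the new one.
HashTable * doubleTable (HashTable *oldTable)
{
   int newSize = 2 * oldTable->getSize();
   HashTable *newTable = new HashTable(newSize);
   for (int j = 0; j < newSize; j++)
      newTable->get_value(j) = oldTable->get_value(j / 2);
   delete oldTable;   // otherwise the old table stays on the screen
   return newTable;
}
```

Each pair of new entries starts out sharing the old entry's bucket and bit count; the bucket that overflowed is then split separately.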

Bucket Class

Each HashTableEntry points to a bucket. Each bucket contains up to the number of records specified in the numRecordsPerBucket variable (actually it contains pointers to these records). In order to be able to insert and retrieve records from a bucket, you can use the following methods:

class Bucket {
  public:
   Bucket();
   Record * getRecord (int index);
   void setRecord (int index, Record *r);
};

To get or insert a Record object, you must supply the index of the record. If there were 4 spaces for Records in the bucket object, you would have to insert from indices 0-3. In this sense, the bucket is like an array.

You should not assume that the Bucket constructor initializes its entries to point to records. You need to initialize these entries yourself (e.g., to 0).
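One way to handle this is sketched below, with a stand-in Bucket so the example compiles on its own. The helper names clearBucket and firstFreeSlot are hypothetical, and the sketch assumes that Bucket::numRecordsPerBucket is publicly readable:

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for the library classes, only so this sketch compiles alone.
class Record {};

class Bucket {
  public:
   static int numRecordsPerBucket;
   Bucket () : slots(new Record*[numRecordsPerBucket]) {}  // slots left unset
   ~Bucket () { delete [] slots; }
   Record * getRecord (int index) { return slots[index]; }
   void setRecord (int index, Record *r) { slots[index] = r; }
  private:
   Record **slots;
};

int Bucket::numRecordsPerBucket = 4;

// Hypothetical helper: mark every slot of a new bucket as empty.
void clearBucket (Bucket *b)
{
   for (int i = 0; i < Bucket::numRecordsPerBucket; i++)
      b->setRecord(i, 0);
}

// Hypothetical helper: index of the first empty slot, or -1 if the
// bucket is full (which is the signal that a split is needed).
int firstFreeSlot (Bucket *b)
{
   for (int i = 0; i < Bucket::numRecordsPerBucket; i++)
      if (b->getRecord(i) == 0)
         return i;
   return -1;
}
```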

Record Class

So far we have seen that each HashTable is an array of HashTableEntries which point to Buckets which are arrays of Records. Kinda messy, but that's what an extendible hash table looks like! So how about the Record class? Well, here's what you need to know:

class Record {
  protected:
  public:
   Record (string lname, string fname, string ht,
            string pos, string yr, string player_team, string home);
   string getLastName ();
   string getFirstName ();
   string getHeight();
   string getPosition();
   string getYear();
   string getTeam();
   string getHometown();
   unsigned getKey(int numBits);
};

This class is just like the RosterRecord class from lab 4. The getKey method returns a hash key with the specified number of bits. You can read your data into the Record class in the same manner that you did in lab 4 with the RosterRecord class. Essentially, Record has the same interface.

Designing the ExtendibleHashTable Class

In this lab you are expected to design an ExtendibleHashTable class that will insert records into an extendible hash table and that will take a last name and return a doubly linked list of records with that last name. Here's roughly how you will design your class:

Your constructor needs to initialize the hash table by creating a hash table with an initial number of entries, initializing each HashTableEntry with the correct number of bits, and initializing the HashTableEntries to point to buckets. Initially your table should require only one bucket.
Your insert method should take a pointer to a Record (Record *) and should insert this record into the hash table. You can use the Record's getKey method to determine which entry it should be inserted in. insert will also need to handle the splitting of buckets and the resizing of the hash table. You will probably want to add another internal (i.e., protected) method to your class to handle splitting buckets when they get full. This method will also have to handle doubling the HashTable if splitting the bucket requires that this operation be performed. If you wish to double the array, you can simply use the HashTableEntry's = operator to copy hash table entries from the old hash table to the new hash table. For example: newHashTable->get_value(newIndex) = oldHashTable->get_value(oldIndex);
Your find method should take a last name and return a dlist of pointers to all records that contain this last name. You will have to write a separate hash function to return the hash key for the last name. Here's the function you should use (it is identical to the one in Record except that it takes an additional name parameter):
```
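     unsigned hash(string name, int numBits) // hashing function
     {
      int i, h;

      // build a value in the range 0..510 (9 bits) from the characters
      for (i = 0, h = 0; i < (int) name.length(); i++)
	  {
	     h = (64*h + name[i]) % 511;
	  }
      return h >> (9 - numBits);   // keep only the leading numBits bits
     }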
     
```
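As a worked check, the sketch below repeats the hashing function under the hypothetical name nameHash (chosen only to avoid a clash with std::hash if your file says `using namespace std`) and adds a hypothetical helper, keysNest, that checks how keys of different lengths relate to each other:

```cpp
#include <cassert>
#include <string>

// Same computation as the hashing function above, under a different name.
unsigned nameHash (std::string name, int numBits)
{
   int i, h;

   for (i = 0, h = 0; i < (int) name.length(); i++)
      h = (64*h + name[i]) % 511;      // h stays in the range 0..510
   return h >> (9 - numBits);          // leading numBits bits of 9
}

// Hypothetical helper: true if, for every bit count from 1 to 9, the key
// with one fewer bit is the longer key with its last bit dropped.
bool keysNest (std::string name)
{
   for (int n = 1; n <= 9; n++)
      if (nameHash(name, n - 1) != (nameHash(name, n) >> 1))
         return false;
   return true;
}
```

For "Smith" the loop ends with h = 503 (binary 111110111), so nameHash("Smith", 1) is 1, nameHash("Smith", 2) is 3, and nameHash("Smith", 3) is 7. Each extra bit extends the previous key on the right, so a record that hashed to entry i with n bits will hash to entry 2i or 2i+1 with n+1 bits.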

The last piece of code you must write is the function called extendHashTable. This function will essentially be your main() function. You should not write your own main function, since we have done that for you in our driver. extendHashTable should:

read in the roster records and call the insert method in your ExtendibleHashTable class to insert them into the extendible hash table,
query the user for last names, call the find method in your ExtendibleHashTable class to find the list of records containing that last name, and then print the list of records.

extendHashTable should be nearly identical to the main procedure you wrote in lab 4, except that 1) you will be inserting records into an extendible hash table, and 2) you do not need to delete records from the hash table.

How To Use The HashDisplay Debugger

After compiling your code, you will run your code through the hashDisplay debugger environment. You should run your program by simply typing extendHash. For example:

prompt>  extendHash

A window will then pop up with a couple of scroll bars and 4 buttons at the bottom of the screen. The start button will initiate your program. If there is a seg fault, the start button will turn red and say Fault. At this point you can look at the HashTable to see what the various buckets contain and possibly ascertain where an error in your algorithm is messing up an insertion or deletion. In my experience a seg fault is usually caused by accessing a record object and calling one of its methods when the record pointer is null or the record is empty.

If you click on the Expand All button, all buckets will expand and display their information. The button will then say 'Shrink All' and if you click it again, all records will shrink (what a surprise!).

If you click the Binary button, all visible records that display their hash key will switch to showing it in binary form, and the button will then display 'Decimal'.

Each entry in the hash table also has two 'buttons'. The dark button to the left with the 'R' is the Record button. When this is pressed, if there is a bucket associated with that entry, the bucket will expand and show the first and last name associated with that record. Another click on the button and the record will shrink.

The last buttons are the arrow buttons, which are to the left of each entry in the table. (It's technically a triangle, not an arrow.) Anyway, this arrow will manually expand an individual bucket. When it is clicked the first time, it will flip directions and the box around it will turn from grey to green. A second click will shrink the record, turn the arrow back, and the box will go grey.

Pause Function

You should also notice that to the right of the start button is the word 'Message' when you start up the program. Here you can actually pipe messages to the application from your program. This is accomplished through the pause function. Pause is called with a char string like so:

 pause ("Inserting");

This will do two things for you. It will pipe the 'Inserting' message to the hashDisplay window and it will also pause your program's run at that point. The start button will change to 'Continue' and will stay that way until the program is finished. Each click on continue from that point will take you to the next place in your program where you have called pause.

Between clicks on continue you can expand and shrink any records you want and generally just examine the hash table to see where and when things might be getting moved around. This is especially helpful when you have to do a bucket split so that you can see where your Records are getting moved to.
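For instance, a bucket split could be bracketed with two pause calls so that you can inspect the table before and after. This is only a sketch: the pause function below is a stand-in that records its messages (the real one comes from the course library, and its exact parameter type is an assumption here), and splitBucket is a hypothetical method name:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the library's pause function so this sketch compiles alone.
// The real pause shows the message in the window and halts the program.
std::vector<std::string> messages;
void pause (const char *msg) { messages.push_back(msg); }

// Hypothetical split routine instrumented with pause calls.
void splitBucket ()
{
   pause("Splitting bucket: before");
   // ... create the new bucket and move records into it here ...
   pause("Splitting bucket: after");
}
```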

If you're getting tired of clicking continue and just want your program to finish its run with no further pauses, you can click on the 'Skip All Pauses' button. The button will turn red and say 'Pauses Disabled'. Then your program will run to the end.

Finally, when your program finishes, the continue button will turn red and say 'Done'. A click on the button will terminate the program. This way if you want to see how the hash table looks after it has finished its run, you can. Also, you can be assured that your program has finished without any drastic errors.

Display Function

This function is similar to the pause function in that it pauses the program and prints a message to the screen. However, it also takes a Record pointer as an argument and displays that record on the debugger's screen, in the box in the lower left hand corner which says "Currently inserting Record". You can call the display function as follows:

 display (r, "Inserting");   // r is a Record *, the second argument is a char * message

The display function should be called in your insert method. I would recommend putting it right at the beginning so you can see what record has been read in and see its hash key. A later call to pause erases this information, so do not be surprised if the displayed record disappears before the insert method has finished. To keep the record on screen during the insertion, use display instead of pause.
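A minimal sketch of that placement follows. The display function and the Record class here are stand-ins so the example compiles without the course library, and the body of insert is reduced to a hypothetical counter in place of the real insertion logic:

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for the library pieces, only so this sketch compiles alone.
class Record {};

Record *lastDisplayed = NULL;   // what the stand-in has "shown" most recently
void display (Record *r, const char *) { lastDisplayed = r; }

class ExtendibleHashTable {
  public:
   ExtendibleHashTable () : count(0) {}
   void insert (Record *r)
   {
      display(r, "Inserting");   // first statement: show the incoming record
      // ... find the entry with r->getKey(...), split if needed, store r ...
      count++;                   // placeholder for the real insertion work
   }
   int getCount () { return count; }
  private:
   int count;
};
```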