CS140 -- Lab 9

Lab Objective

This lab has you lay out a family tree so that it can be displayed by a graphical viewer. It is designed to give you practice with:

using and searching dllists, and
implementing and traversing general trees

Lab Materials

Executables for the programs can be found in the directory /home/bvz/cs140/labs/lab9. As usual, if you have questions about how the programs should work, try the executables.
family1, family2, and family3 are test files for family_tree. You should develop your own test files as well.

Printing and Drawing A Family Tree

You are going to write two programs. The first, print_family_tree, reads an input file that contains a family tree and pretty prints the family tree to the console. The second, family_tree, reads an input file that contains a family tree, calculates the graphical coordinates for each node in the tree, and outputs the graphical coordinates to a user-specified file. You may then pass the output file as a command line argument to /home/bvz/cs140/labs/lab9/display_tree, which will read the coordinates that you produce and graphically display the family tree. You will write print_family_tree first, and then modify it to create family_tree. Sample invocations of your programs might be:

print_family_tree family1
family_tree family1 family1_output

Format of the Input

The input to your program will consist of lines of the form:

parent-name child₁-name child₂-name ... child_k-name

The first line of the input file represents the root of the family tree. For example:

Mary Jane Jill Emily Howard
Jill Joe William Eddie
Tom Hank Nancy
Howard Tom James Ellen Katie
Jane Tommy Jennifer Susan

Mary is the root of this tree. To simplify the problem, you may assume that the children's names are unique. Note that a child may appear as a parent later in the input. You should not assume that children always appear after their parents; they may appear before their parents as well. For example, Tom's parent line appears on line 3, but you do not find out that Tom's parent is Howard until line 4. Children that do not have a parent line are assumed to be leaf nodes in the family tree.

Program Design

Start by writing print_family_tree. It should use the fields library to read the input lines and construct a family tree. You cannot make any assumptions in advance about how many children a person might have, although once you have read a person's parent line, you can use the inputstruct's NF member to determine how many children that person has. Once you have constructed the family tree, you should print it out so that the person at the root of the tree is printed first, and then each of that person's children's subtrees is recursively printed, with each child indented 4 spaces relative to its parent. For example, given the above file, your output should look like:

Mary
    Jane
        Tommy
        Jennifer
        Susan
    Jill
        Joe
        William
        Eddie
    Emily
    Howard
        Tom
            Hank
            Nancy
        James
        Ellen
        Katie
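
The recursive printing described above can be sketched as follows. This is only an illustration under assumed names: the `Person` struct, its fields, and `print_tree` are hypothetical and not prescribed by the lab, and the children are assumed to be stored in an array.

```c
#include <stdio.h>

/* Hypothetical node type; the field names are illustrative only. */
typedef struct person {
    const char *name;
    struct person **children;   /* array of nchildren pointers, NULL for a leaf */
    int nchildren;
} Person;

/* Print p indented 4 spaces per level of depth, then print each
   child's subtree one level deeper. */
void print_tree(FILE *out, const Person *p, int depth)
{
    int i;

    /* "%*s" with an empty string emits depth * 4 spaces. */
    fprintf(out, "%*s%s\n", depth * 4, "", p->name);
    for (i = 0; i < p->nchildren; i++)
        print_tree(out, p->children[i], depth + 1);
}
```

A call such as `print_tree(stdout, root, 0)` would start the printing at the root with no indentation.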

Once print_family_tree.c works correctly, you should copy it to family_tree.c and modify it so that it computes a graphical layout for the family tree using the following two-step procedure:

Calculate the space required by each subtree: The space for a subtree should be calculated as follows:
- Space(Leaf Node) = The number of characters in the person's name
- Space(Interior Node) = max(number of characters in the person's name, sum of the children's space + (k-1)*2), where k is the number of children. The multiplication by 2 effectively puts a 2 character spacing between children. The max function takes care of unusual cases, such as a parent who has a single child and a longer name than that child.
Calculate the position of each node. The y coordinate will be the depth of the node. The x coordinate will represent the center of the node and can be calculated as follows (the divisions are integer divisions, which is what the sample output reflects):
- Position(Root) = 0
- Position(Child₁) = Position(Parent) - Space(Parent) / 2 + Space(Child₁) / 2
- Position(Child_i) = Position(Child_i-1) + Space(Child_i-1) / 2 + 2 + Space(Child_i) / 2
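
The two formulas can be sketched in C as shown below. This is a minimal illustration, not the required implementation: the `Person` struct and the function names `compute_space` and `compute_position` are hypothetical, and children are assumed to be stored in an array. The sketch relies on C's integer division, which is what reproduces coordinates such as Jill at -9 in the sample output.

```c
#include <string.h>

/* Hypothetical node type; the field names are illustrative only. */
typedef struct person {
    const char *name;
    struct person **children;  /* array of nchildren pointers, NULL for a leaf */
    int nchildren;
    int space;                 /* filled in by compute_space */
    int x, y;                  /* filled in by compute_position */
} Person;

/* Step 1: set the space field of every node in p's subtree; return p's space. */
int compute_space(Person *p)
{
    int i, total = 0;
    int name_len = (int) strlen(p->name);

    for (i = 0; i < p->nchildren; i++)
        total += compute_space(p->children[i]);
    if (p->nchildren > 0)
        total += (p->nchildren - 1) * 2;   /* 2-character gap between siblings */

    /* For a leaf, total is 0, so the name length wins. */
    p->space = (name_len > total) ? name_len : total;
    return p->space;
}

/* Step 2: record p's own coordinates, then place each child relative to
   the parent (first child) or to the previous sibling (later children).
   Requires compute_space to have been run first. */
void compute_position(Person *p, int x, int depth)
{
    int i;
    Person *prev = NULL;

    p->x = x;
    p->y = depth;
    for (i = 0; i < p->nchildren; i++) {
        Person *c = p->children[i];
        int cx;

        if (prev == NULL)
            cx = p->x - p->space / 2 + c->space / 2;
        else
            cx = prev->x + prev->space / 2 + 2 + c->space / 2;
        compute_position(c, cx, depth + 1);
        prev = c;
    }
}
```

For the whole tree, the calls would be `compute_space(root)` followed by `compute_position(root, 0, 0)`.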

Format of Family_Tree's Output

For each person in your family tree you should output the person's name, the person's x and y coordinates, and the names of the person's children. Each person should have their own separate line of output. You should use a pre-order traversal to print your tree. A sample output for the above input file would be:

Mary 0 0 Jane Jill Emily Howard
Jane -31 1 Tommy Jennifer Susan
Tommy -40 2
Jennifer -32 2
Susan -24 2
Jill -9 1 Joe William Eddie
Joe -17 2
William -11 2
Eddie -4 2
Emily 4 1
Howard 24 1 Tom James Ellen Katie
Tom 13 2 Hank Nancy
Hank 10 3
Nancy 16 3
James 22 2
Ellen 28 2
Katie 34 2
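
One possible way to produce lines in this format is sketched below, assuming the coordinates have already been computed and stored in each node. The `Person` struct and `print_layout` are hypothetical names. The handout does not specify the exact whitespace that display_tree accepts, so the single-space separators here are an assumption; compare against the provided executables.

```c
#include <stdio.h>

/* Hypothetical node type; x and y are assumed to be already computed. */
typedef struct person {
    const char *name;
    struct person **children;   /* array of nchildren pointers, NULL for a leaf */
    int nchildren;
    int x, y;
} Person;

/* Pre-order output: write p's own line first, then each child's subtree. */
void print_layout(FILE *out, const Person *p)
{
    int i;

    fprintf(out, "%s %d %d", p->name, p->x, p->y);
    for (i = 0; i < p->nchildren; i++)
        fprintf(out, " %s", p->children[i]->name);
    fprintf(out, "\n");

    for (i = 0; i < p->nchildren; i++)
        print_layout(out, p->children[i]);
}
```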

Error Checking

You should perform the following error checks:

Check that the number of command line arguments is correct
Check that the input and output files can be successfully opened (only required for family_tree).
Every parent must have at least one child, so ensure that each line in the input has at least two fields: one for the parent and at least one for a child.
Check that the input file is not empty (i.e., there must be at least one line of input)
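
The argument-count and field-count checks might look like the sketch below. All three helper names are hypothetical, and `count_fields` is only a standard-library stand-in for the NF value that the fields library reports. The handout does not say how blank lines should be treated; this sketch simply rejects any line with fewer than two fields.

```c
#include <ctype.h>

/* Count whitespace-separated words on a line
   (a stand-in for the fields library's NF). */
int count_fields(const char *line)
{
    int n = 0, in_word = 0;

    for (; *line != '\0'; line++) {
        if (isspace((unsigned char) *line)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            n++;
        }
    }
    return n;
}

/* A valid line names a parent and at least one child. */
int line_ok(const char *line)
{
    return count_fields(line) >= 2;
}

/* argc counts the program name, so print_family_tree expects 2
   (input file) and family_tree expects 3 (input and output files). */
int args_ok(int argc, int needs_output)
{
    return argc == (needs_output ? 3 : 2);
}
```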

You may assume that names are unique in the sense that no name appears more than once in a family tree (note that a name may appear as both a parent and a child but a name will not appear more than once as a child).

Make sure that you check for special situations, such as a parent having only a single child or a parent's name requiring more space than the space required by its children.

PreLab Homework

The following homework will be discussed at the beginning of lab and should help acquaint you with the problem that you will be asked to solve:

Draw the family tree that would result from the sample input given earlier. Do not worry about space or layout coordinates. Just draw the tree.
For each person in the sample input, calculate the space required by that person's node in the family tree.
For each person in the sample input, calculate the position of that person's node in the family tree.

You should write down your answers on a piece of paper and hand it in to the TA when he/she requests it at the beginning of lab.

Program Implementation

Roughly speaking, here are the steps that you will need to carry out in order to implement these programs:

You will need to design a struct that you will use to represent a node in the family tree. This struct will need to include the name of the individual, a way to store the individual's children, and fields to represent the space required by the node and its x and y coordinates. For representing the children, you are free to use either the child/next_sibling implementation or the list/array implementation discussed in class. Although you do not know the number of children associated with a person in advance, you can use the NF field of the fields library's inputstruct to determine how many children an individual has once that individual's parent line has been read, and then allocate a dynamic array of that size.
You should maintain a dllist that keeps track of all the nodes you have created thus far in the family tree. You will have one list element for each individual that has been mentioned thus far, either as a parent or as a child. For example, the first two lines of the above file read:
```
Mary Jane Jill Emily Howard
Jill Joe William Eddie
```
So after you have read the first two lines, you will have list elements for Mary, Jane, Jill, Emily, Howard, Joe, William, and Eddie. Each of these list elements will point to a tree struct that you should have created for the corresponding individual. When you create a tree struct for an individual, you may not be able to completely fill in its information. For example, when you create the tree struct for Mary, you will be able to completely fill in her children information. However, when you create the tree struct for Jane, you will not be able to fill in her children information. You will have to wait until later in the program when you read her parent line (or you may never read a parent line for Jane if she does not have children).
When you read an input line, you should try to determine whether or not a tree struct already exists for the first person on the line, who is the parent. You can do this by traversing your dllist and checking to see whether or not your dllist contains a list element with the parent's name. If a tree struct already exists, then you should retrieve the tree struct. If a tree struct does not yet exist, then you should create a tree struct for the parent and add a list element for that person to your dllist. For example, when you read the line for Mary, no tree struct yet exists for Mary, so you should create a tree struct for Mary, and also add a list element for Mary to your dllist. However, when you read the parent line for Jill, there is already a tree struct for Jill, so you should simply retrieve the tree struct for Jill from the dllist.
For each of the parent's children you should also determine whether or not a tree struct already exists for the child. If a tree struct already exists, then you should retrieve the tree struct and make the parent point to this child. For example, in the above input, when the parent line for Howard is read (line 4), a tree struct already exists for Tom, who is one of Howard's sons (Tom is first seen on line 3). Hence you would retrieve the tree struct for Tom from the dllist and add a child link from Howard to Tom. If a tree struct does not yet exist, then you should create a tree struct for the child and then make the parent point to this child. In the above input, none of Mary's children yet exists when Mary's line is read, and thus you would create a tree struct for each of Mary's children, add them to your dllist, and then add links from Mary to each of them.
For print_family_tree you will need to traverse the tree in order to print it.
For family_tree, you will need to do two recursive traversals of the tree in order to calculate the layout of each person in the family tree. You will need to do one traversal to calculate the space required by each individual. You will then need to do a second traversal to calculate each individual's position. You cannot combine these two traversals. You should be able to print out each individual's position at the same time as you calculate that individual's position, so you should not require a third traversal to print each individual's position.
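
The find-or-create logic in the steps above can be sketched as follows. To keep the example self-contained it does not use the dllist or fields libraries: a `next` pointer inside each node stands in for the dllist of all people seen so far, and the `fields` and `nf` parameters are modeled on the inputstruct's field array and NF member. The struct and the names `find_or_create` and `add_line` are hypothetical, and the space and coordinate fields are omitted for brevity.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type; "next" stands in for a dllist element. */
typedef struct person {
    char *name;
    struct person **children;   /* NULL until this person's parent line is read */
    int nchildren;
    struct person *next;        /* link in the list of everyone seen so far */
} Person;

/* Return the node with this name, creating it and adding it to the
   front of the list if it has not been seen before. NULL on malloc failure. */
Person *find_or_create(Person **all, const char *name)
{
    Person *p;

    for (p = *all; p != NULL; p = p->next)
        if (strcmp(p->name, name) == 0)
            return p;

    p = malloc(sizeof(Person));
    if (p == NULL) return NULL;
    p->name = malloc(strlen(name) + 1);
    if (p->name == NULL) { free(p); return NULL; }
    strcpy(p->name, name);
    p->children = NULL;
    p->nchildren = 0;
    p->next = *all;
    *all = p;
    return p;
}

/* Process one input line: fields[0] is the parent and fields[1..nf-1]
   are the children. Assumes each person has at most one parent line.
   Returns the parent's node, or NULL on error. */
Person *add_line(Person **all, char **fields, int nf)
{
    int i;
    Person *parent, *child;

    if (nf < 2) return NULL;            /* a parent needs at least one child */
    parent = find_or_create(all, fields[0]);
    if (parent == NULL) return NULL;

    parent->children = malloc((nf - 1) * sizeof(Person *));
    if (parent->children == NULL) return NULL;
    parent->nchildren = 0;
    for (i = 1; i < nf; i++) {
        child = find_or_create(all, fields[i]);
        if (child == NULL) return NULL;
        parent->children[parent->nchildren++] = child;
    }
    return parent;
}
```

Because the first input line describes the root, the caller would keep the node returned by the first `add_line` call as the root of the tree. Children that appear before their own parent line, such as Tom, are found in the list rather than created a second time.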

Questions to Answer after Completing Your Lab

When you complete your lab, answer the following three questions and place your answers in a file called answers.txt. Please just use a text editor like vi or notepad to answer these questions.

What type of traversal (pre-order, in-order, or post-order) did you use to print each node of the tree in print_family_tree? Justify your answer. For example, if it was in-order, then you would say that you first printed the children in the left subtree, then you printed the parent, and finally you printed the children in the right subtree.
What type of traversal (pre-order, in-order, or post-order) did you use to calculate the space required by each node in the family tree? Justify your answer in the same manner as you did in the last question.
What type of traversal (pre-order, in-order, or post-order) did you use to calculate the position of each node in the family tree? Justify your answer in the same manner as you did in the last question.

What To Hand In

You should submit your prelab homework when the TA asks for it at the beginning of lab. You will submit the following files to the TAs via the submit script:

print_family_tree.c
family_tree.c
answers.txt