CS302 -- Lab 1

CS302 -- Fundamental Algorithms
Fall, 1999
Brad Vander Zanden
Due: Wednesday, September 8 at 2:30PM for the Wednesday lab and Friday, September 10 at 11:15AM for the Friday lab.
Sample Makefile: /ruby/homes/ftp/pub/bvz/classes/cs302/labs/lab1/makefile

Where to Find Things

The only files that you should copy to your own directory are the makefile and the input1, input2, and input3 files. You should not copy any of the .h files to your directory. The makefile ensures that they will be found during the compilation process. Copying them to your directory will use up disk space. Every semester the CS department seems to run into a space crunch on its disks so we would appreciate it if you would heed this request.
The declarations for the classes described in this lab can be found in /ruby/homes/ftp/pub/bvz/classes/cs302/include. The names of the .h files you will need from the include directory are mystring.h, StringArray.h, and Fields.h. For this lab there is no need to actually look at the include files. All the information you need is contained in this lab handout. You simply need to make sure that you include the .h files in your own .cc files.
A description of Dr. Plank's Fields library can be found here.

Overview

This lab is designed to give you practice using C++ objects, writing object-oriented code, and seeing how object-oriented code differs from imperative, C-style code. Specifically, you will use a string object, an array object, and an object-oriented version of Dr. Plank's Fields library to rewrite a couple of programs you wrote in CS140. In doing so, you will be introduced to some of the basic class features of C++.

String Class

C and C++ both support strings via the char * data type. Unfortunately, as you've probably already discovered, it is easy to run into pointer problems with char * strings and it is also easy to forget to copy them at the right times. The Weiss book provides a nice implementation of a string class that allows you to use strings more naturally. For this lab, you may assume that the string class supports the following operations:

String creation: You can create a string simply by declaring it:
```
string a;
```
You can also provide a default initial string:
```
string a = "Hello World";
```
For this assignment you will not need to use pointers to strings so do not worry about declaring string *'s. In fact, one of the big advantages of string objects is you do not have to manipulate them using pointers.
String assignment: You can assign one string to another using the assignment operator:
```
string a = "Hello World";
string b;

b = a;
```
After the assignment is complete, a and b point to their own copies of "Hello World". In other words, the assignment operator automatically performs a strdup to copy the string from a to b.
You can also assign simple strings to a variable. For example:
```
string b;

b = "See ya!";
```
Obtaining a char *: For this lab you will still be using the printf command to print out strings. printf expects a char * and not a string. To obtain a char * from a string, use a string's c_str method:
```
printf("b = %s\n", b.c_str());
```
Character manipulation: You can look at individual characters in a string using array notation. For example, if b stores the string "See ya!", then you can access the individual characters via b[0], b[1], ..., b[6]. For example:
```
     char a;
     
     a = b[0];  // a = 'S'
     
```

For those of you who are curious, you can find the full declaration for the string class in /ruby/homes/ftp/pub/bvz/classes/cs302/include/mystring.h. In order to use the string object, you will have to include mystring.h at the top of your file.
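
The operations listed above can be tried out without the course headers. The following is only a sketch: it uses the standard library's std::string as a stand-in for the string class in mystring.h (an assumption, since that class is not reproduced in this handout), and the helper names copy_then_modify and print_and_first are made up for illustration.

```cpp
#include <cstdio>
#include <string>

// ASSUMPTION: std::string stands in for the lab's string class.
// In your lab code you would include mystring.h instead of <string>.
using std::string;

// Hypothetical helper: assign a to b, overwrite a, and return b.
// Because assignment copies the characters, b keeps the old value.
string copy_then_modify(string a)
{
    string b;

    b = a;              // b now holds its own copy of the characters
    a = "See ya!";      // changing a afterwards does not affect b
    return b;
}

// Hypothetical helper: print a string with printf and return its
// first character.
char print_and_first(const string &s)
{
    printf("s = %s\n", s.c_str());   // c_str() supplies the char * that printf expects
    return s[0];                     // array notation reads a single character
}
```

Calling copy_then_modify("Hello World") should hand back "Hello World" even though the parameter is overwritten inside the function, which is the copy-on-assignment behavior described above.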

Array Object

A deficiency of C-style arrays is that they cannot dynamically expand themselves when they become full. The following array object remedies this drawback. It stores strings (i.e., the string objects described above), and, if you attempt to store a string at an index beyond the current length of the array object, the array object will automatically increase its size so that it is at least as big as the requested index. The declaration for the StringArray class looks as follows:

	class StringArray {
	public:
	    // operations supported by StringArray 
	    StringArray( int size = 12 );
	    // return the value at location index
	    string get_value(int index);
	    // set the value at location index
	    void set_value(int index, string value);
	};

The StringArray class supports three operations:

Array Creation: You can declare a variable to be of type StringArray and you can give the array a default initial size. For example:
```
     StringArray myArray(20);
     
```
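Alternatively you do not have to provide an initial size, in which case the initial size will default to 12. Note that there are no parentheses after the variable name. C++ would treat StringArray myArray(); as the declaration of a function named myArray, not as an object:
```
     StringArray myArray;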
     
```
Retrieving an Element: You can retrieve an element via the get_value method. For example:
```
     x = myArray.get_value(5);
     
```
Storing an Element: You can store an element in the array via the set_value method. For example:
```
     myArray.set_value(4, "Hi there");
     
```
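For those who are curious how an array object can grow on demand, here is one possible sketch. It is not the code in StringArray.h: the class name SketchStringArray, the get_length method, the doubling policy, and the decision to return an empty string for an out-of-range get_value are all assumptions made for illustration, and std::string stands in for the lab's string class.

```cpp
#include <string>

using std::string;   // ASSUMPTION: stand-in for the lab's string class

// SketchStringArray is a hypothetical class written for illustration.
// It is NOT the StringArray declared in StringArray.h.
class SketchStringArray {
  public:
    SketchStringArray(int size = 12) : length(size), data(new string[size]) {}
    ~SketchStringArray() { delete [] data; }

    // Return the value at location index. Returning an empty string for an
    // out-of-range index is a choice made for this sketch only.
    string get_value(int index) {
        if (index < 0 || index >= length) return string();
        return data[index];
    }

    // Set the value at location index, growing the array first if needed.
    // Assumes index >= 0.
    void set_value(int index, string value) {
        if (index >= length) grow(index);
        data[index] = value;
    }

    int get_length() { return length; }   // not part of the lab's interface

  private:
    int length;      // number of slots currently allocated
    string *data;    // the slots themselves

    // Allocate a larger array, copy the old strings over, free the old array.
    void grow(int index) {
        int new_length = length * 2;                    // doubling is an assumption
        if (new_length <= index) new_length = index + 1;
        string *bigger = new string[new_length];
        for (int i = 0; i < length; i++) bigger[i] = data[i];
        delete [] data;
        data = bigger;
        length = new_length;
    }

    // Copying is disabled to keep the sketch short.
    SketchStringArray(const SketchStringArray &);
    SketchStringArray &operator=(const SketchStringArray &);
};
```

The grow method performs the same three steps you will write by hand in the C version of the second program: allocate a bigger array, copy the old contents, and release the old array.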

Fields Library

You may recall from CS140 the Fields library that was used to read lines of data from a file and write lines of data to a file. To refresh your memory, here is the fields.h file:

#define MAXLEN 1001
#define MAXFIELDS 1000

typedef struct inputstruct {
  char *name;               /* File name */
  FILE *f;                  /* File descriptor */
  int line;                 /* Line number */
  char text1[MAXLEN];       /* The line */
  char text2[MAXLEN];       /* Working -- contains fields */
  int NF;                   /* Number of fields */
  char *fields[MAXFIELDS];  /* Pointers to fields */
  int file;                 /* 1 for file, 0 for popen */
} *IS;

extern IS new_inputstruct(/* FILENAME -- NULL for stdin */);
extern IS pipe_inputstruct(/* COMMAND -- NULL for stdin */);
extern int get_line(/* IS */); /* returns NF, or -1 on EOF.  Does not
                                  close the file */
extern void jettison_inputstruct(/* IS */);  /* frees the IS and fcloses 
                                                the file */

You may recall that the only fields of inputstruct that a program was supposed to access were the line, text1, NF, and fields variables. The remaining variables were supposed to be working variables that were used by the fields package but were off-limits to a program. However, in C there is no way to prevent a program from accessing or modifying these forbidden variables. As a result a program can modify a working variable and cause the fields package to crash.

C++ provides a way to remedy this difficulty by allowing the programmer to declare the working variables protected. As you will soon discover, it is generally important to protect all of a class's variables from public access, not just the working variables. However, we still want to provide the programmer access to the line, text1, NF, and fields variables. We can do this by providing accessor methods that return this data.

Finally, note that the fields package provides four operations. In C these operations must be defined independently of the fields data structure. In C++, these operations are defined as part of the fields class.

Here then is the public class declaration for the fields class in C++:

#include 
class Fields {
  public:
    Fields(string filename);  // implements new_inputstruct
                              // pass the string "stdin" if you want
                              // to open stdin
    ~Fields();                // implements jettison_inputstruct
    int get_line();           // implements get_line
    
    int get_line_number();    // return the current line number
    int get_NF();             // return the number of fields in the 
                              // current line
    string get_current_line();   // returns the current line
    string get_field(int i); // return the ith field in the current line
};

This class certainly looks different from the C-style fields.h file. First, new_inputstruct and jettison_inputstruct are nowhere to be found (neither is pipe_inputstruct, but we won't worry about pipe_inputstruct for this lab). Actually, they are to be found: new_inputstruct is now incorporated in the constructor and jettison_inputstruct is now incorporated in the destructor. Why has this been done? Well, new_inputstruct is meant to initialize a fields data structure and jettison_inputstruct is meant to destroy a fields data structure. In C++ these functions are performed by a constructor and a destructor, respectively.

Second, jettison_inputstruct and get_line require an inputstruct as a parameter in the C implementation of the fields package but not in the C++ package. Why is this? Well, in the C implementation jettison_inputstruct and get_line are defined independently of an inputstruct. Hence, in order to know which inputstruct they are working on, the inputstruct has to be passed as a parameter. However, in the C++ implementation, jettison_inputstruct (actually ~Fields()) and get_line are bound to the fields class. Hence they know which fields object they are working on. Specifically, they can access the object using the implicit this pointer.
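
The difference between passing the data structure explicitly and relying on the implicit this pointer can be seen in a toy example. Everything in this sketch (CCounter, ccounter_bump, Counter, bump, bump_both_twice) is a hypothetical name chosen for illustration and has nothing to do with the fields package itself.

```cpp
// C style: the data structure and its operation are defined separately,
// so the operation must be told which structure to work on.
struct CCounter {
    int count;
};

int ccounter_bump(struct CCounter *c)
{
    c->count++;
    return c->count;
}

// C++ style: the operation is a member of the class, so it is invoked on
// an object and reaches that object through the implicit "this" pointer.
class Counter {
  public:
    Counter() : count(0) {}
    int bump() {
        this->count++;    // "this" points at the object bump() was called on
        return count;     // same member; writing "this->" is optional
    }
  private:
    int count;            // cannot be touched from outside the class
};

// Hypothetical driver: bump each counter twice and add the results.
int bump_both_twice()
{
    CCounter c = { 0 };
    Counter d;

    ccounter_bump(&c);    // the structure is passed as a parameter
    d.bump();             // the object is named first, then the operation
    return ccounter_bump(&c) + d.bump();
}
```

Note that nothing stops a C program from writing c.count = 99 directly, whereas the compiler rejects d.count = 99 because count is private.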

Finally, there are now accessor functions for retrieving the line number, the number of fields, the current line, and a field in the current line. These accessors are relatively straightforward. Notice that instead of exporting the array that holds the fields, the program provides an index and the fields class returns the appropriate field.
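
To make the idea of protected working variables plus public accessors concrete, here is a small sketch of a Fields-like class. It is an assumption-laden illustration, not the code in Fields.h: the class is named SketchFields, it reads from a stream passed to the constructor instead of opening a file by name, it splits on white space using the standard library, and it drops the trailing newline from the current line.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

using std::string;   // ASSUMPTION: stand-in for the lab's string class

// SketchFields is hypothetical and only mimics the shape of the lab's
// Fields class. It is NOT the class declared in Fields.h.
class SketchFields {
  public:
    SketchFields(std::istream &in) : input(in), line_number(0) {}

    // Read the next line and split it into white-space separated fields.
    // Returns the number of fields, or -1 at end of input.
    int get_line() {
        fields.clear();
        if (!std::getline(input, current_line)) return -1;
        line_number++;
        std::istringstream words(current_line);
        string word;
        while (words >> word) fields.push_back(word);
        return (int) fields.size();
    }

    // Accessors: each one returns a copy of the data, never a pointer to it.
    int get_line_number() { return line_number; }
    int get_NF() { return (int) fields.size(); }
    string get_current_line() { return current_line; }
    string get_field(int i) { return fields[i]; }   // assumes 0 <= i < get_NF()

  protected:
    // Working variables: a program using the class cannot reach these.
    std::istream &input;
    int line_number;
    string current_line;
    std::vector<string> fields;
};
```

A program would typically loop with while (f.get_line() >= 0) and call f.get_NF() and f.get_field(i) inside the loop, much as a C program loops on get_line(is) and reads is->NF and is->fields[i].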

Adding Safety to the Fields Class

Remember that the C-style version of the fields package returns a line or a field by returning a pointer to a memory location in an inputstruct. The problem that often arises is that we set a variable to point to this memory location and we are mystified when the value later changes. The reason this happens is that the memory location in the inputstruct gets reset to a new value when a new line is read in. The C++ version of the fields package remedies this problem using the string class. When the fields package returns a line or a field, it actually returns a string. Remember that when a string is assigned to another string, a copy of the string is made.
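
The pitfall described above is easy to reproduce. In this sketch a local character buffer plays the role of the storage inside an inputstruct, and std::string stands in for the lab's string class; both are assumptions made for illustration, and the function name is hypothetical.

```cpp
#include <cstring>
#include <string>

// Hypothetical demo. Returns true when the saved pointer sees the new text
// while the saved string copy still holds the old text.
bool pointer_changes_but_copy_does_not()
{
    char buffer[32];                          // stands in for the inputstruct's storage

    std::strcpy(buffer, "Boston");
    const char *saved_pointer = buffer;       // C style: remember where the text lives
    std::string saved_copy = buffer;          // C++ style: copy the characters

    std::strcpy(buffer, "Toronto");           // "reading the next line" overwrites the storage

    bool pointer_changed = (std::strcmp(saved_pointer, "Toronto") == 0);
    bool copy_kept = (saved_copy == "Boston");
    return pointer_changed && copy_kept;
}
```

In the C version of your programs you avoid the problem by calling strdup on any line or field you want to keep. In the C++ version the copy happens as part of the assignment.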

Maxminname

Your first program in this lab will use the fields class to reimplement the maxminname program that you wrote in lab2 for CS140. To refresh your memory, maxminname takes an input file composed of names and scores, and prints out the lines corresponding to the maximum and minimum numbers in the input file. The specific format of the input file is as follows. Each line is of the form:

name score

The name may contain any number of words with any amount of white space between them. No word in a name may begin with the characters '0'-'9', '-', '+', or '.'. The score is a floating point number (use a double). Example input files are input1, input2, and input3.

Maxminname should take an input file on standard input and print out the maximum and minimum scores. If standard input is not in the proper form, maxminname can do anything.

UNIX> maxminname < input1
Max: 0.714000
Min: 0.377000
UNIX> maxminname < input2
Max: 264.000000
Min: 221.200000
UNIX> maxminname < input3
Max: 74.580000
Min: 69.210000
UNIX>

Hint: Test the first character of each field to see if a word is a score. Then use sscanf() to convert it to a double.
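
One way to act on the hint is sketched here. The helper names looks_like_score and parse_score are hypothetical, and std::string again stands in for the lab's string class, so treat this as an illustration of the test-then-convert idea, not as required structure for your program.

```cpp
#include <cstdio>
#include <string>

using std::string;   // ASSUMPTION: stand-in for the lab's string class

// Hypothetical helper: a word is treated as a score if its first
// character is a digit, '-', '+', or '.'.
bool looks_like_score(const string &word)
{
    if (word.length() == 0) return false;
    char c = word[0];
    return (c >= '0' && c <= '9') || c == '-' || c == '+' || c == '.';
}

// Hypothetical helper: convert word to a double with sscanf.
// Returns true and fills in *score on success, false otherwise.
bool parse_score(const string &word, double *score)
{
    if (!looks_like_score(word)) return false;
    return sscanf(word.c_str(), "%lf", score) == 1;   // %lf reads a double
}
```

Remember that sscanf needs %lf, not %f, when the destination is a double.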

You should hand in both a C and a C++ version of the maxminname program. Name the C version c_maxminname and the C++ version cpp_maxminname (cpp stands for c-plus-plus). If you have your C version of the maxminname program from CS140, it is OK to hand in that version. Otherwise you should write your own. If you did not write the maxminname program in CS140, it should not take long to write your own C version and it will be good practice. The C version should use Dr. Plank's fields library and char * strings. The C++ version should use the Fields and string classes.

While we do not require a write-up of the differences between your C and C++ program, you should examine the two programs to see how they differ. Note in particular how the C version requires you to pass data structures to functions whereas the C++ version requires you to first name the data structure (i.e., the object) and then specify the operation you wish to be performed on the data structure.

Reversing a File with a Dynamic Array

The second program you will write for this lab requires you to use the dynamic array class and the fields class to print out lines of text from stdin in reverse order. You can do this by reading lines of text from stdin, storing them in a dynamic array, and then writing them back out in reverse order. For example, if stdin consists of the lines:

New York Yankees .714
Boston .574
Toronto .538
Baltimore .500
Tampa Bay .387

your output should consist of the lines:

Tampa Bay .387
Baltimore .500
Toronto .538
Boston .574
New York Yankees .714

You should write a version of this program in both C and in C++. When you create the array, you should specify that the initial size of the array is 5. The reason for this low number is we want to make sure that your code that expands the array actually gets executed.

Name your programs c_reverse_file and cpp_reverse_file. In the C program you will have to write your own version of a dynamic array. This will require you to use a struct that keeps both an array of data and the current length of the array so that a larger array can be allocated if the program tries to set an index that is beyond the end of the current array. If you allocate a new array, you will need to copy strings from the old array to the new array and free the old array. In the C program you should use char * strings.

In the C++ version you should use the Fields, string, and StringArray classes.

As with the first program, we will not require you to write up the differences between the two programs, but you should examine them to see how they are different.

What to Hand In

You should mail to your lab TA a file created by /sunshine/homes/bvz/courses/302/bin/302submit. The file should be named username.lab where username is your username. For example, if I were submitting the lab, the file would be named bvz.lab. You will be sending your TA a makefile and your .h, .c and .cc files. The makefile should create the executables c_maxminname, cpp_maxminname, c_reverse_file, and cpp_reverse_file. All of the programs should read their input from stdin.

Grading

Your programs will be graded on the basis of 1) their correctness (the most important criterion), 2) their efficiency, and 3) their coding style.

Commenting Your Code (By Jim Plank)

You will be graded on commenting. Something like 15%. You should comment your code by blocks of comments at the beginning of your files, and before subroutines. Variables should be commented inline. You may also want to comment large blocks of code. You should not comment with a "running commentary" to the right of your code, because that is often useless, hard to maintain, and disconcerting. I have seen comments like the following:

  if (i == 0) {               /* If i equals zero */
    return;                   /* then return */
  } else {                    /* otherwise */
    exit(1);                  /* exit with a value of one */
  }

The above is an extreme example, but don't let it happen to you.

Here's an example of what I would consider a well documented program:

 
#include <stdio.h>
#include <stdlib.h>

/* sumnsquared.c
   Jim Plank
   August 23, 1998

   This program prompts for and reads a value n from standard
   input, and then calculates and prints the sum of squares
   from one to n.  It uses a for loop instead of using the 
   closed form expression.
   */


int main()
{
  int n;                   /* The value */
  int sum;                 /* The running sum of squares */
  int i;                   /* An induction variable */

  /* Prompt for and read in a value of n */

  printf("Enter a value n: ");
  fflush(stdout);
  if (scanf("%d", &n) != 1) {
    exit(1);
  }
  
  /* Calculate sum for each value of i from one to n */

  sum = 0;
  for (i = 1; i <= n; i++) sum += (i*i);

  /* Print the final summation */

  printf("%d\n", sum);
}