CS140 -- Lab 4


This lab will give you practice with a number of C concepts, including:

  1. Dr. Plank's Fields library
  2. string manipulation using several old and several new string functions,
  3. pointers, and
  4. memory allocation


Files that you will need


Blog

A blog is an on-line journal where you or a group of people can post entries about their daily activities, hobbies, personal experiences, etc. In many ways a blog is like an on-line diary. You are going to write a program named blog that reads blog entries and outputs them as formatted html.

A blog entry will have a person's name, date of creation, and time of creation on one line, and then one or more lines with the person's comments. Each blog entry will be separated by a blank line. You can see a sample input file here. As you can see, dates are presented in mm-dd-yy format. The name, date, and time will each be a single field.

Your program will format each blog entry into a row in an html table. Do not worry if you do not know how to create an html table. I will explain how to do so shortly. Each row will have four columns for the person's name, posting date, posting time, and comment. You will not need to do any processing with either names or times. However, you will need to convert dates from their "mm-dd-yy" format to a format that looks like "month dd, 20yr" where "month" is a three letter abbreviation for the month. For example, "03-20-08" will get translated to "Mar 20, 2008" (I have the list of abbreviations that you should use printed below). A formatted html page for the sample input can be found here. If you find the "page source" option in your browser, then you can see what the actual html looks like. This is the html that your program will produce.

Data Structures

You will need to use the following data structures for your program:

  1. You will use an inputstruct to read lines from the blog file.
  2. You will use a FILE * pointer to reference the output html file and will use fprintf to write lines to your html file.
  3. You will need to use a typedef and a struct for storing each blog entry. You should create an anonymous struct using the typedef and assign an appropriate name to the type you have created. The struct should store the name of the person who wrote the entry, the date they posted it, the time they posted it, and the comment they wrote. It is mandatory that you place the declaration for this struct in a header file named blog.h and that you include this header file in your program.
  4. You should use a dynamic array to store the blog entries. I know that strictly speaking you could output a blog entry as soon as you read it, but that would defeat the purpose of this lab, which is to give you practice with reading a file and storing all the entries in the file so that they can later be manipulated (for example, I might input the blog entries in a random order and ask you to sort them by date and time, or I might ask you to print the entries in reverse order).

     You do not know in advance how large your array will need to be, so you will need to dynamically allocate an array with a default number of entries, say 10. If your blog entries exceed the current capacity of your array, you will need to double the size of your array and copy your old array to your new array.

     Fortunately you can do all this using a single C function called realloc. realloc takes two arguments: a pointer to your old array and the size in bytes of your new array. It allocates the requested memory for your new array, copies the contents of your old array to your new array, frees the memory used by your old array, and returns a pointer to your new array. You obviously need to store the pointer to your new array. For example, suppose you have an array of integers and that you want to expand the size of the array to 20 integers. Then your code might look like:
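         int *int_array;
         /* ... code that does some processing and forces int_array to be re-sized */
         int_array = (int *)realloc(int_array, sizeof(int)*20);
    
    Note that I cast the return value of realloc to the type of the array. C converts the void * that realloc returns automatically, so the cast is optional in C, but it is required if your code is ever compiled as C++. Also note that realloc returns NULL if it cannot allocate the requested memory, in which case your old array is left untouched.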
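
The doubling strategy described above can be sketched as follows. This is only a minimal sketch under stated assumptions, not the required solution: the type name BlogEntry, its field names, and the helper ensure_capacity are all hypothetical names chosen for illustration. The sketch stores the result of realloc in a temporary pointer so that the old array is not lost if realloc fails.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical entry type: the name BlogEntry and the field names are
   assumptions made for this sketch. */
typedef struct {
    char *name;
    char *date;
    char *time;
    char *comment;
} BlogEntry;

/* Make sure the array has room for one more entry. When the array is
   full, its capacity is doubled (or set to 10 if it is still empty).
   Returns 0 on success and -1 if realloc fails, in which case the old
   array is left intact. */
int ensure_capacity(BlogEntry **entries, int count, int *capacity)
{
    BlogEntry *bigger;
    int new_capacity;

    if (count < *capacity) return 0;   /* still room, nothing to do */

    new_capacity = (*capacity == 0) ? 10 : *capacity * 2;
    bigger = (BlogEntry *) realloc(*entries, sizeof(BlogEntry) * new_capacity);
    if (bigger == NULL) return -1;

    *entries = bigger;
    *capacity = new_capacity;
    return 0;
}
```

You would call ensure_capacity just before storing entry number count, starting with a NULL array pointer and a capacity of 0; realloc behaves like malloc when it is handed a NULL pointer.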

Program Design

Roughly speaking, your program will be designed as follows:

  1. You will need to read two names from the command line: the name of the input file containing the blog entries and the name of the output file to which you will write your html code.
  2. You will need to write a while loop that reads each line from the blog file using the fields library:

    1. When you encounter a blank line you will start a new blog entry.
    2. When you read the first line of a blog entry, you should check to ensure that it contains exactly three fields. If it does not contain exactly three fields, you should print an error message that lists the line number and that tells the user what the problem is. You can mimic the error message I use in my executable. Your program should then exit without creating an output file.

      If the line contains exactly three fields, then you will malloc a blog entry struct and save the person's name and the time at which the entry was posted. You will need to convert the date string to an appropriately formatted string using sprintf. sprintf is just like fprintf and printf, except that it writes into a string, rather than a file or to stdout. You need to provide a string that is big enough to contain the formatted string that you wish to create. Remember to include an extra space for the null (\0) character. For example, the following sprintf statement creates a string with the format "lastname, firstname":

             char *lastname;
             char *firstname;
             char *formatted_name;
             int name_length;
             // ... code to initialize lastname and firstname
             // calculate the length of the formatted name. The extra 3 characters
             // are for the comma, the space between lastname and firstname, and
             // the null character
             name_length = strlen(lastname) + strlen(firstname) + 3;
             // allocate the string before sprintf writes into it
             formatted_name = (char *) malloc(name_length);
             sprintf(formatted_name, "%s, %s", lastname, firstname);
      
      You should use the following abbreviations for the months:

      1    2    3    4    5    6      7    8    9    10   11   12
      Jan  Feb  Mar  Apr  May  Jun    Jul  Aug  Sep  Oct  Nov  Dec

    3. When you read the remaining lines of a blog entry, you should simply concatenate them together into one long string. You do not know in advance how long your eventual string will be or how many lines you will have to read. Therefore I suggest that as you read each line, you 1) figure out how long your newly concatenated line will be, 2) realloc your comment string so that it is long enough to accommodate your existing comment string concatenated with the current line, and then 3) concatenate the current line to the end of the newly realloced comment string.

  3. Once you have finished reading your blog entries, you should write them to your output file. To start your table, output the following two lines, exactly as I have written them, replacing output_file with the name of your output file pointer:

         fprintf(output_file, "<table borders=\"1\" rules=\"all\" frame=\"none\" cellpadding=\"5\">\n");
         fprintf(output_file, "<tr><th>Name</th><th>Date</th><th>Time</th><th width=\"50%%\">Comment</th></tr>\n");

    The first line creates a table that has a 1 pixel wide border, no frame, lines between all the table cell entries, and 5 pixels of padding around each table cell. The second line creates a set of column headings and indicates that the comment column should take up 50% of the browser window. Note that I used \" so that fprintf would know that I want to print a " character rather than have it terminate the format string, and I used %% to tell fprintf that I want it to output a % sign, rather than interpret the % sign as the start of a conversion specifier.

    You will then output each row of the table. Each row should start with the tag <tr> and should end with the tag </tr>. Each column should start with the tag <td> and should end with the tag </td>. For example, to get the following row:

    Brad    Aug 8, 2009    12:20pm    Climbing Mt. Whitney

    you would need to output the following lines of html:

    <tr> <td>Brad</td> <td>Aug 8, 2009</td> <td>12:20pm</td> <td>Climbing Mt. Whitney</td> </tr>
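
The comment-building approach suggested in step 2.3 above (compute the new length, realloc, then concatenate) can be sketched as follows. This is a minimal sketch, not the required solution. The helper name append_line is hypothetical, and two choices in it are assumptions that the lab does not specify: it puts a single space between the lines it joins, and it expects the line it is given to have no trailing newline.

```c
#include <stdlib.h>
#include <string.h>

/* Append `line` to the heap-allocated string `comment`. `comment` may be
   NULL, which is treated as an empty comment. Pieces are separated by one
   space (an assumption of this sketch). Returns the possibly moved string,
   or NULL if realloc fails, in which case `comment` is left untouched. */
char *append_line(char *comment, const char *line)
{
    size_t old_len = (comment == NULL) ? 0 : strlen(comment);
    size_t add_len = strlen(line);
    /* 1) length of the concatenated string, including the separating
          space when there is already some text */
    size_t new_len = old_len + add_len + (old_len > 0 ? 1 : 0);
    /* 2) grow the comment; the +1 is for the null character */
    char *bigger = (char *) realloc(comment, new_len + 1);

    if (bigger == NULL) return NULL;

    /* 3) concatenate the current line onto the end */
    if (old_len > 0) bigger[old_len++] = ' ';
    memcpy(bigger + old_len, line, add_len + 1);   /* copies the \0 too */
    return bigger;
}
```

Starting from a NULL comment pointer and calling append_line once per comment line builds up the whole comment, which you can then print inside the <td> and </td> tags of the last column.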

Testing Your Code

You need to start to get in the habit of thinking about the various types of errors and boundary conditions that can arise and then testing for them. Several errors and boundary conditions that I expect you to test for in this lab include:

  1. error conditions
    1. wrong number of command line arguments
    2. invalid blog file name
    3. inability to open the output file. One way you can test whether or not your code works is to deny yourself write permission for the output file. You can do so by first creating a file and then using chmod to deny yourself write privileges. For example:
           chmod u-w brad.html
      	 
      The string "u-w" says to subtract write permission from the user, which in this case is yourself. You can re-establish write privileges to the file by using "chmod u+w brad.html".
    4. improper number of fields on the first line of a blog entry
  2. boundary conditions
    1. empty blog file: just output the headers for the table (see my executable)
    2. one or more empty lines at the end of the blog file
    3. a very large blog file with a large number of entries
    4. a very large blog entry (i.e., a blog entry whose comment runs over a great many lines)
  3. normal conditions that your program should handle
    1. more than one line between blog entries
    2. dates whose month or day is a single digit
In this lab you do not have to worry about the date being misformatted or the month or day being too large. You may assume that a date always correctly appears as "mm-dd-yy", although the month and day may be single digits. The year will always be two digits.
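
Given those assumptions about the date format, the conversion to "month dd, 20yy" can be sketched as follows. This is only a sketch, not the required solution. The function name format_date is hypothetical, it uses sscanf to pull the three numbers apart (one of several ways to do so), and it uses snprintf, a size-limited variant of sprintf from C99, rather than sprintf itself. It also assumes that a leading zero on the day is dropped, as in "Aug 8, 2009".

```c
#include <stdio.h>

/* Convert "mm-dd-yy" (month and day may be one digit) into a string
   such as "Mar 20, 2008". `out` needs room for at least 13 characters:
   12 for the longest formatted date plus 1 for the null character.
   Returns 0 on success and -1 if the date cannot be parsed. */
int format_date(const char *in, char *out, size_t out_size)
{
    static const char *months[] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };
    int month, day, year;

    /* %d reads decimal numbers, so "08" is read as 8 */
    if (sscanf(in, "%d-%d-%d", &month, &day, &year) != 3) return -1;
    if (month < 1 || month > 12) return -1;

    /* %02d keeps the year at two digits, so 8 becomes "2008" */
    snprintf(out, out_size, "%s %d, 20%02d", months[month - 1], day, year);
    return 0;
}
```

For example, passing "03-20-08" and a 16-character buffer fills the buffer with "Mar 20, 2008".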

Note that blog_data.txt and empty_blog.txt test some of these boundary conditions, but not all of them.


What to Submit

Submit a source file named blog.c and a header file named blog.h.