CS360 -- Fields Lecture


  • Jim Plank
  • CS360 -- Systems Programming

    Using the fields library -- if you have an account at the UT CS Dept

    In order to use the fields procedures, you should include the file fields.h which can be found in /mahogany/homes/plank/cs360/include. Instead of including the full path name in your C file, just do:
    #include "fields.h"
    
    and then compile the program with:
    gcc -I/mahogany/homes/plank/cs360/include
    
    Also when you link your object files to make an executable, you need to include /mahogany/homes/plank/cs360/objs/fields.o. This assumes you're running on a SUN4 machine. If not, see below for making a different fields.o.

    Using the fields library -- if you are elsewhere

    Copy over fields.tar, untar it and type make. You will have a fields.h and fields.o file, plus the example programs in this lecture.

    Also, the fields library is part of the libfdr library, which also includes doubly linked lists and red-black trees. I recommend you port that and use all three.


    Fields.o

    Fields.o defines and implements a data structure that simplifies input processing in C. The data structure consists of a type definition and four procedure calls. All are defined in fields.h:
    #define MAXLEN 1001
    #define MAXFIELDS 1000
    
    typedef struct inputstruct {
      char *name;               /* File name */
      FILE *f;                  /* File descriptor */
      int line;                 /* Line number */
      char text1[MAXLEN];       /* The line */
      char text2[MAXLEN];       /* Working -- contains fields */
      int NF;                   /* Number of fields */
      char *fields[MAXFIELDS];  /* Pointers to fields */
      int file;                 /* 1 for file, 0 for popen */
    } *IS;
    
    extern IS new_inputstruct(/* FILENAME -- NULL for stdin */);
    extern IS pipe_inputstruct(/* COMMAND -- NULL for stdin */);
    extern int get_line(/* IS */); /* returns NF, or -1 on EOF.  Does not
                                      close the file */
    extern void jettison_inputstruct(/* IS */);  /* frees the IS and fcloses 
                                                    the file */
    
    To use fields.o, you must include fields.h in your C program, and compile it with fields.o. To read a file with fields.o you call new_inputstruct() on that file. New_inputstruct() takes a file name as its argument (NULL for standard input), and returns an IS as a result. If it cannot open the file, it will return NULL.

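    Once you have an IS, you call get_line() on it to read a line. Get_line() changes the state of the IS to reflect the reading of the line. Specifically, going by the comments in fields.h: text1 holds the line as read, text2 is a working copy that holds the fields, NF is the number of fields on the line, the first NF entries of the fields array point to those fields, and line is the line number. Get_line() returns NF, or -1 at end of file.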

    Jettison_inputstruct() closes the file associated with the IS and deallocates the IS. Do not worry about pipe_inputstruct() for now.
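
    To make the relationship between a line, its working copy, and the fields array concrete, here is a minimal sketch of whitespace splitting. It is not the source of fields.o: split_fields() and its parameters are hypothetical names, and splitting on whitespace is an assumption based on the printwords output in this lecture.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in, NOT the fields.o source.  Copies `line` into the
   caller's working buffer `work`, splits the copy on whitespace, and stores
   a pointer to the start of each word in `fields`.  Returns the number of
   words found (at most maxfields).  No memory is allocated. */
int split_fields(const char *line, char *work, size_t worklen,
                 char **fields, int maxfields)
{
  int nf = 0;
  char *p;

  if (worklen == 0) return 0;
  strncpy(work, line, worklen - 1);
  work[worklen - 1] = '\0';                 /* always null-terminate the copy */

  p = work;
  while (*p != '\0' && nf < maxfields) {
    while (isspace((unsigned char) *p)) p++;          /* skip separators */
    if (*p == '\0') break;
    fields[nf++] = p;                                 /* a word starts here */
    while (*p != '\0' && !isspace((unsigned char) *p)) p++;
    if (*p != '\0') *p++ = '\0';                      /* terminate the word */
  }
  return nf;
}
```

    Every pointer stored in fields points into work, so calling split_fields() again with the same work buffer overwrites the words that earlier pointers refer to. A caller that needs a word to outlive the next call has to copy it first.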

    These procedures are very convenient for processing input files. For example, the following program (in printwords.c) prints out every word of an input file prepended with its line number.

    #include <stdio.h>
    #include <stdlib.h>
    #include "fields.h"
    
    int main(int argc, char **argv)
    {
      IS is;
      int i;
    
      if (argc != 2) {
        fprintf(stderr, "usage: printwords filename\n");
        exit(1);
      }
     
      is = new_inputstruct(argv[1]);
      if (is == NULL) {
        perror(argv[1]);
        exit(1);
      }
    
      while(get_line(is) >= 0) {
        for (i = 0; i < is->NF; i++) {
          printf("%d: %s\n", is->line, is->fields[i]);
        }
      }
    
      jettison_inputstruct(is);
      exit(0);
    }
    
    So, for example, if the file rex.in contains the following three lines:
    June: Hi ... I missed you!
    Rex:  Same here!  You're all I could think about!
    June: I was?
    
    Then running printwords on rex.in results in the following output:
    UNIX> printwords rex.in
    1: June:
    1: Hi
    1: ...
    1: I
    1: missed
    1: you!
    2: Rex:
    2: Same
    2: here!
    2: You're
    2: all
    2: I
    2: could
    2: think
    2: about!
    3: June:
    3: I
    3: was?
    UNIX>
    
    One important thing to note about fields.o is that only new_inputstruct() calls malloc(). Get_line() simply fills in the fields of the IS structure --- it does not perform memory allocation. Therefore, suppose you wanted to print out the first word on the second-to-last line. The following program (badword.c) would not work:
    #include <stdio.h>
    #include <stdlib.h>
    #include "fields.h"
    
    int main(int argc, char **argv)
    {
      IS is;
      int i;
      char *penultimate_word;
      char *last_word;
    
      if (argc != 2) {
        fprintf(stderr, "usage: badword filename\n");
        exit(1);
      }
     
      is = new_inputstruct(argv[1]);
      if (is == NULL) {
        perror(argv[1]);
        exit(1);
      }
    
      penultimate_word = NULL;
      last_word = NULL;
    
      while(get_line(is) >= 0) {
        penultimate_word = last_word;
        if (is->NF > 0) {
          last_word = is->fields[0];
        } else {
          last_word = NULL;
        }
      }
    
      if (penultimate_word != NULL) printf("%s\n", penultimate_word);
      jettison_inputstruct(is);
      exit(0);
    }
    
    Why? Look at what happens when you execute it on rex.in:
    UNIX> badword rex.in
    June:
    UNIX>
    
    It prints "June:" instead of "Rex:" because get_line() does not allocate any new memory. Each call refills the same buffers inside the IS, so a pointer saved from is->fields on an earlier line ends up pointing at the text of the most recent line. Both penultimate_word and last_word therefore point to the same thing. Make sure you understand this example, because you can get yourself into a mess of trouble otherwise. The correct version of the program is in goodword.c. (Note that it is a very inefficient program because of all the strdup() and free() calls. You could do better if you wanted to.)
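
    The fix is to copy each word before the next line is read. The sketch below illustrates that copy-and-free pattern without the fields library, so it is not the contents of goodword.c: penultimate_first_word() and copy_string() are hypothetical names, and fgets() with strtok() stands in for get_line(). It assumes, as the library does, that lines fit in the buffer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Same job as strdup(): return a malloc()ed copy of s, or NULL on failure. */
static char *copy_string(const char *s)
{
  char *copy = malloc(strlen(s) + 1);
  if (copy != NULL) strcpy(copy, s);
  return copy;
}

/* Hypothetical helper, NOT goodword.c.  Returns a malloc()ed copy of the
   first word on the second-to-last line of f, or NULL if there is no such
   word.  The caller must free() the result. */
char *penultimate_first_word(FILE *f)
{
  char buf[1001];              /* reused for every line, like the IS buffers */
  char *penultimate = NULL;
  char *last = NULL;
  char *word;

  while (fgets(buf, sizeof(buf), f) != NULL) {
    free(penultimate);         /* the word from two lines back is obsolete */
    penultimate = last;
    word = strtok(buf, " \t\n");                 /* points into buf */
    last = (word != NULL) ? copy_string(word) : NULL;   /* so copy it */
  }
  free(last);
  return penultimate;
}
```

    Because last always holds its own copy, overwriting buf on the next fgets() call no longer changes the saved words. The cost is one allocation and one free() per line, which is the inefficiency mentioned above.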

    Fields.o assumes that all input lines are fewer than 1000 characters long.