CS360 Lecture notes -- Cat and its variants. Buffering.

  • Jian Huang, referencing Dr. Plank's notes
  • Directory: ~huangj/cs360/notes/Cat
  • Lecture notes: http://www.cs.utk.edu/~huangj/cs360/360/notes/Cat/lecture.html
    This lecture gives more detail on writing "cat" with unix system calls and with the C standard I/O library. It also motivates buffering for performance.

    Simpcat

    Here are three equivalent ways of writing a simple cat, which just reads from standard input, and writes to standard output.
    simpcat1.c             | simpcat2.c            | simpcat3.c
    
    #include <stdio.h>     |#include <unistd.h>    |#include <stdio.h>
                           |                       |                       
    main()                 |main()                 |main()                 
    {                      |{                      |{                      
      char c;              |  char c;              |  char c;              
                           |  int i;               |  int i;               
                           |                       |                       
      c = getchar();       |  i = read(0, &c, 1);  |  i = fread(&c, 1, 1, stdin);
      while(c != EOF) {    |  while(i > 0) {       |  while(i > 0) {    
        putchar(c);        |    write(1, &c, 1);   |    fwrite(&c, 1, 1, stdout);
        c = getchar();     |    i = read(0, &c, 1);|    i = fread(&c, 1, 1, stdin);
      }                    |  }                    |  }                    
    }                      |}                      |}                     
    
    (Links to simpcat1.c, simpcat2.c, and simpcat3.c).

    Let's look at these a little closer. Copy *.c and makefile to one of your directories, and type "make". Now do the following:

    UNIX> sh
    $ time simpcat1 < large > /dev/null
    
    real        3.7
    user        3.2
    sys         0.2
    $ time simpcat2 < large > /dev/null
    
    real      307.5
    user       21.8
    sys       283.2
    $ time simpcat3 < large > /dev/null
    
    real       13.8
    user       12.9
    sys         0.5
    $ exit
    UNIX>
    
    Depending on what machine you're on, you are likely to get different times from the ones above, perhaps faster or slower by a factor of 10 or more, but the ratios between simpcat1, simpcat2 and simpcat3 should be roughly the same.

    So, what's going on? /dev/null is a special file in Unix that you can write to, but it never stores anything on disk. We're using it so that you don't create 7.5-megabyte files in your home directory, which would waste disk space. The file "large" is 7,500,000 bytes long. This means that in simpcat1.c, getchar() and putchar() are each called 7.5 million times, as are read() and write() in simpcat2.c, and fread() and fwrite() in simpcat3.c. The culprit in simpcat2.c is that the program makes system calls instead of library calls. Remember that a system call is a request made to the operating system. At each read() or write() call, the operating system has to take over the CPU (which means saving the state of the simpcat2 program), process the request, and return (which means restoring the state of the simpcat2 program). The sys times above show the cost: 283.2 seconds for simpcat2, versus 0.2 for simpcat1 and 0.5 for simpcat3. Now, look at simpcat4.c and simpcat5.c:

    simpcat4.c:                         | simpcat5.c:
                                        |
    #include <stdio.h>                  | #include <stdio.h>
    #include <stdlib.h>                 | #include <stdlib.h>
    #include <unistd.h>                 |
    int main(int argc, char **argv)     | int main(int argc, char **argv)
    {                                   | {                                   
      int bufsize;                      |   int bufsize;                      
      char *c;                          |   char *c;                          
      int i;                            |   int i;                            
                                        |                                     
      bufsize = atoi(argv[1]);          |   bufsize = atoi(argv[1]);          
      c = malloc(bufsize*sizeof(char)); |   c = malloc(bufsize*sizeof(char)); 
      i = 1;                            |   i = 1;                            
      while (i > 0) {                   |   while (i > 0) {                   
        i = read(0, c, bufsize);        |     i = fread(c, 1, bufsize, stdin); 
        if (i > 0) write(1, c, i);      |     if (i > 0) fwrite(c, 1, i, stdout);
      }                                 |   }                                 
    }                                   | }                                   
    
    (the real simpcat4.c and simpcat5.c have error checking in them too).

    These let us read in more than one byte at a time. This is called buffering: you allocate a region of memory in which to store things, so that you can make fewer system or procedure calls. Note that fread() and fwrite() are just like read() and write(), except that they go to the standard I/O library instead of the operating system. The tables below show how fast simpcat4 and simpcat5 run on the file "large" for differing values of bufsize. For these runs stdout was redirected to an actual file instead of to /dev/null; the programs take less time if you redirect stdout to /dev/null.

    simpcat4.c (the first column is bufsize in bytes; times are in seconds):
    1	516.1 real      25.4 user       481.0 sys  
    2	261.1 real      12.3 user       241.9 sys  
    4	135.9 real       6.2 user       123.8 sys  
    8	70.1 real        3.3 user        61.5 sys  
    16	39.2 real        1.5 user        32.5 sys  
    32	22.9 real        0.8 user        16.6 sys  
    64	13.6 real        0.5 user         8.5 sys  
    128	10.0 real        0.2 user         4.7 sys  
    129	8.6 real         0.1 user         4.9 sys  
    256	7.8 real         0.1 user         2.8 sys  
    267	9.3 real         0.1 user         3.0 sys  
    512	5.6 real         0.0 user         1.9 sys  
    513	5.6 real         0.0 user         2.0 sys  
    1024	5.5 real         0.0 user         1.3 sys  
    1025	5.5 real         0.0 user         1.5 sys  
    2048	5.8 real         0.0 user         0.9 sys  
    2049	5.6 real         0.0 user         1.3 sys  
    4096	5.2 real         0.0 user         0.9 sys  
    4097	5.4 real         0.0 user         1.2 sys  
    8192	1.5 real         0.0 user         0.6 sys  
    8193	5.4 real         0.0 user         1.1 sys  
    10000	3.2 real         0.0 user         0.7 sys  
    20000	4.1 real         0.0 user         0.7 sys  
    50000	2.1 real         0.0 user         0.6 sys  
    100000	2.0 real         0.0 user         0.6 sys  
    200000	1.9 real         0.0 user         0.6 sys  
    500000	1.9 real         0.0 user         0.6 sys  
    1000000	1.9 real         0.0 user         0.7 sys  
    2000000	2.0 real         0.0 user         0.8 sys  
    
    simpcat5.c (the first column is bufsize in bytes; times are in seconds):
    1	14.3 real       13.2 user         0.8 sys  
    2	7.8 real         6.9 user         0.7 sys  
    4	4.6 real         3.8 user         0.7 sys  
    8	3.5 real         2.3 user         0.7 sys  
    16	1.9 real         1.0 user         0.6 sys  
    32	1.6 real         0.6 user         0.6 sys  
    64	1.5 real         0.5 user         0.5 sys  
    128	1.5 real         0.3 user         0.6 sys  
    129	1.5 real         0.4 user         0.7 sys  
    256	1.5 real         0.2 user         0.7 sys  
    267	1.5 real         0.3 user         0.6 sys  
    512	1.5 real         0.2 user         0.6 sys  
    513	1.5 real         0.3 user         0.7 sys  
    1024	5.2 real         0.2 user         1.1 sys  
    1025	5.4 real         0.3 user         1.1 sys  
    2048	6.9 real         0.1 user         0.9 sys  
    2049	5.4 real         0.1 user         1.1 sys  
    4096	5.1 real         0.1 user         1.0 sys  
    4097	5.5 real         0.1 user         1.0 sys  
    8192	1.5 real         0.1 user         0.6 sys  
    8193	5.4 real         0.2 user         1.0 sys  
    10000	4.8 real         0.1 user         0.7 sys  
    20000	2.7 real         0.1 user         0.8 sys  
    50000	2.1 real         0.2 user         0.6 sys  
    100000	2.0 real         0.1 user         0.6 sys  
    200000	1.9 real         0.1 user         0.6 sys  
    500000	2.0 real         0.1 user         0.6 sys  
    1000000	2.1 real         0.1 user         0.7 sys  
    2000000	2.1 real         0.1 user         0.7 sys  
    
    Note first how the user and system time both decrease drastically when you increase the buffer size. In simpcat4.c this is a direct result of making fewer system calls. In simpcat5.c it is a result of making fewer procedure calls. Note that once the buffer size gets large enough, the two programs exhibit roughly the same behavior.
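    To make the "fewer system calls" point concrete, here is a minimal sketch of a read()/write() copy loop that counts how many times it calls read(). The name copy_fd and the returned count are assumptions made for illustration; this is not the real simpcat4.c, although it includes the kind of error checking the real program is said to have. On a 7,500,000-byte input, a 1-byte buffer means roughly 7.5 million read() calls, while an 8192-byte buffer needs fewer than a thousand.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* copy_fd is a hypothetical helper, not the real simpcat4.c.
   It copies descriptor in to descriptor out through a buffer of bufsize
   bytes and returns the number of read() calls it made, or -1 on error. */
long copy_fd(int in, int out, size_t bufsize)
{
    char *buf;
    long nreads = 0;

    if (bufsize == 0) return -1;
    buf = malloc(bufsize);
    if (buf == NULL) return -1;

    for (;;) {
        ssize_t n = read(in, buf, bufsize);
        nreads++;
        if (n == 0) break;                     /* end of file */
        if (n < 0) {
            if (errno == EINTR) continue;      /* interrupted: try again */
            free(buf);
            return -1;
        }
        /* write() may accept fewer bytes than asked for, so loop. */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR) continue;
                free(buf);
                return -1;
            }
            off += w;
        }
    }
    free(buf);
    return nreads;
}
```

    Calling copy_fd(0, 1, atoi(argv[1])) from main() would give a program that behaves much like simpcat4. The count includes the final read() that returns 0 to signal end of file.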

    Here are graphs of the two programs. Both graphs plot the same data; the right-most graph simply enlarges the bottom region of the left-most graph. Also, note that the x-axis is on a log scale.
    There is a fair amount of noise in this data, but we can still use it to draw some conclusions.

    First, what can we infer now about the standard I/O library? It uses buffering! In other words, when you first call getchar() or fread(), it performs a read() of a large number of bytes into a buffer. Subsequent getchar() or fread() calls are then fast, because they are served from that buffer without a system call. When you fread() large segments of memory, simpcat4 and simpcat5 exhibit the same behavior, because fread() doesn't need to buffer: your program is already doing the buffering for it.
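    The idea can be sketched with a tiny buffered reader layered on read(). Everything here is an assumption made for illustration: struct breader, breader_init(), breader_getc() and the 4096-byte buffer size are hypothetical, and the real standard I/O library uses its own FILE structure and chooses its own buffer size. The nreads field exists only so that you can see how few system calls are made.

```c
#include <stddef.h>
#include <unistd.h>

#define SKETCH_BUFSIZE 4096   /* assumed size; real libraries pick their own */

/* Hypothetical buffered reader; this is not the real FILE struct. */
struct breader {
    int fd;                              /* descriptor to read from        */
    unsigned char buf[SKETCH_BUFSIZE];   /* bytes fetched by the last read */
    unsigned char *p;                    /* next byte to hand out          */
    ssize_t n;                           /* bytes still unread in buf      */
    long nreads;                         /* number of read() calls so far  */
};

void breader_init(struct breader *r, int fd)
{
    r->fd = fd;
    r->p = r->buf;
    r->n = 0;
    r->nreads = 0;
}

/* Return the next byte as a value in 0..255, or -1 at end of file or on error. */
int breader_getc(struct breader *r)
{
    if (r->n <= 0) {
        /* Buffer is empty: one system call refills all of it. */
        r->n = read(r->fd, r->buf, sizeof r->buf);
        r->nreads++;
        r->p = r->buf;
        if (r->n <= 0) return -1;
    }
    r->n--;
    return *r->p++;
}
```

    Reading a 10,000-byte file one byte at a time through this reader costs 10,000 cheap procedure calls but only four read() calls: three that return data and one that reports end of file.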

    Why then is getchar() faster than fread(c, 1, 1, stdin)? Because getchar() is optimized for reading one character, and fread() is not.


    What's the lesson behind this?

    The same is true for writes, even though we didn't go through them in detail in class.

    Standard I/O vs System calls.

    Each system call has analogous procedure calls from the standard I/O library:
    System Call			Standard I/O call
    -----------			-----------------
    open				fopen
    close				fclose
    read/write			getchar/putchar
    				getc/putc
    				fgetc/fputc
    				fread/fwrite
    				gets/puts
    				fgets/fputs
    				scanf/printf
    				fscanf/fprintf
    lseek				fseek
    
    System calls work with integer file descriptors. Standard I/O calls define a structure called a FILE, and work with pointers to these structs.

    To illustrate, the following are versions of the program cat that must be called with a filename as their argument. cat1.c uses system calls, and cat2.c uses the standard I/O library. Read the man pages for open ("man 2v open") and fopen ("man 3s fopen") to understand their arguments.
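    The contrast between the two interfaces can be sketched as a pair of functions. These are not the actual cat1.c and cat2.c, which may differ; the names cat_syscalls() and cat_stdio(), the 8192-byte buffer, and the byte-count return value are assumptions made for illustration. The first works with an integer file descriptor, the second with a FILE pointer.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* System-call version: open() returns an integer file descriptor.
   Returns the number of bytes copied to outfd, or -1 on error. */
long cat_syscalls(const char *path, int outfd)
{
    char buf[8192];
    long total = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0) return -1;
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {                 /* write() may be partial */
            ssize_t w = write(outfd, buf + off, (size_t)(n - off));
            if (w < 0) { close(fd); return -1; }
            off += w;
        }
        total += n;
    }
    close(fd);
    return n < 0 ? -1 : total;
}

/* Standard I/O version: fopen() returns a pointer to a FILE struct.
   Returns the number of bytes copied to out, or -1 on error. */
long cat_stdio(const char *path, FILE *out)
{
    long total = 0;
    int c;                                /* int, so EOF stays distinct */
    FILE *f = fopen(path, "r");

    if (f == NULL) return -1;
    while ((c = getc(f)) != EOF) {
        if (putc(c, out) == EOF) { fclose(f); return -1; }
        total++;
    }
    fclose(f);
    return total;
}
```

    A main() that calls cat_syscalls(argv[1], 1) or cat_stdio(argv[1], stdout) would behave like a one-file cat. Note that the standard I/O version can copy one character at a time without a performance disaster, because the library buffers underneath.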

    Try:

    UNIX> sh
    $ time cat1 large > /dev/null
            0.9 real         0.0 user         0.3 sys  
    $ time cat2 large > /dev/null
            1.2 real         0.1 user         0.4 sys  
    $ exit
    UNIX>
    
    How do these compare to the first numbers?

    Finally, fullcat.c contains a version of cat which works much like the real version -- if you omit a filename, then it prints standard input to standard output. Otherwise, it prints out each file specified in the command line arguments. Note how it is similar to both simpcat1.c and cat2.c.

    Type 'make clean' when you're done to save disk space, and remove any temporary files. You can erase all the files created from this lecture, since you can re-copy them from my directory.


    Chars vs ints

    You'll note that getchar() is defined to return an int and not a char. Relatedly, look at simpcat1a.c:
    #include <stdio.h>
                           
    main()                 
    {                      
      int c;              
                           
      c = getchar();       
      while(c != EOF) {    
        putchar(c);        
        c = getchar();     
      }                    
    }                      
    
    
    The only difference between simpcat1a.c and simpcat1.c is that c is an int instead of a char. Now, why would that matter? Look at the following:
    UNIX> ls -l simpcat1.c simpcat1
    -rwxr-xr-x   1 plank       10864 Sep  8 14:03 simpcat1
    -rw-r--r--   1 plank         526 Sep 13  1996 simpcat1.c
    UNIX> simpcat1 < simpcat1 > tmp1
    UNIX> simpcat1 < simpcat1.c > tmp2
    UNIX> ls -l tmp1 tmp2
    -rw-r--r--   1 plank        1746 Sep  8 14:10 tmp1
    -rw-r--r--   1 plank         526 Sep  8 14:10 tmp2
    UNIX>
    
    Notice anything weird? Now:
    UNIX> simpcat1a < simpcat1 > tmp3
    UNIX> ls -l tmp3 
    -rw-r--r--   1 plank       10864 Sep  8 14:12 tmp3
    UNIX>
    
    This has to do with what happens when getchar() reads the character 255. We'll talk about it in class. See if you can figure it out.
    Here is a simplified version of getchar() that has a bug (the bug is not a result of the simplification). See if you can find it.
    int getchar (void)
    {
        static char buf[BUFSIZE];
        static char *bufp;
        static int  n = 0;
      
        if (n == 0) {
           n = read(0, buf, BUFSIZE);
           bufp = buf;
        }
      
        if (n > 0) {
            n -= 1;
            return *bufp++;
        } else
            return EOF;
    }
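
    If you want to check your answer, the following sketch isolates the char-versus-int issue. It assumes 8-bit chars and that EOF is -1, which is the usual case but is not guaranteed by the C standard. The function names are hypothetical. A signed char is used explicitly because whether plain char is signed depends on the machine; on machines where plain char is unsigned, simpcat1.c fails differently, since a char can then never compare equal to EOF.

```c
#include <stdio.h>

/* Mimics "char c = getchar(); if (c == EOF)" on a machine where char is
   signed: the value is squeezed into 8 bits before the comparison. */
int eof_test_with_signed_char(int value_from_getchar)
{
    signed char c = (signed char)value_from_getchar;
    return c == EOF;
}

/* Mimics "int c = getchar(); if (c == EOF)": nothing is lost. */
int eof_test_with_int(int value_from_getchar)
{
    int c = value_from_getchar;
    return c == EOF;
}

/* How a getchar()-like routine can hand back a byte that sits in a char
   buffer: convert through unsigned char so the result is 0..255 and
   cannot collide with a negative EOF. */
int byte_to_int(char stored)
{
    return (unsigned char)stored;
}
```

    With these assumptions, eof_test_with_signed_char(255) reports end of file even though 255 is a legitimate data byte, while eof_test_with_int(255) does not. That is consistent with tmp1 above coming out shorter than the simpcat1 binary, and with simpcat1a copying all 10864 bytes.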