CS140 Lecture notes -- Review

  • Jim Plank
  • Directory: ~plank/cs140/notes/Review
  • Lecture notes: http://www.cs.utk.edu/~plank/plank/classes/cs140/Notes/Review
  • Original lecture notes: January, 2011
  • Last modification date: Mon Jan 14 10:49:32 EST 2013

    We walk before we run

    Let's review C++ from cs102. All of this is going to be from the perspective of a Unix account. C++ programs are simply text files created with a text editor like vi, emacs or notepad. Let's look at a simple "hello world" program, in the file hw1.cpp:

    #include <iostream>
    using namespace std;
    
    int main()
    {
      cout << "Hello world!" << endl;
    
      return 0;
    }
    

    Every line of this program except the cout line contains stuff that you need in every program. I won't go into much detail, except to say that your programs need to contain these things.

    When your program is executed by the operating system, control starts in the main() procedure, which you must define. You should declare it as returning an int, although you'll see people (especially old ones like me) forget to do this because they've acquired bad habits (the Mac clang compiler will now raise an error if you forget to declare main as returning an int). It is also a good idea to return 0 at the end of your main() routine. This is because the operating system will interpret this value when your program ends, and having it return 0 signifies to the operating system that all has gone ok.

    The cout statement prints "Hello world!" and a newline. To run this program, you must compile it with g++. If this works correctly, an executable file named a.out will be created. You can run that program, and it will print out "Hello world!" and a newline:

    UNIX> g++ hw1.cpp
    UNIX> ./a.out
    Hello world!
    UNIX> 
    
    We can tell the compiler to create the executable with a different name by using the "-o" command line option:
    UNIX> g++ -o hw1 hw1.cpp
    UNIX> ./hw1
    Hello world!
    UNIX> 
    
    We don't have to use endl to print the newline character -- using "\n" in a string accomplishes the same purpose, as shown in hw2.cpp:

    #include <iostream>
    using namespace std;
    
    int main()
    {
      cout << "Hello world!\n" ;
    
      return 0;
    }
    

    Using endl vs. "\n" is personal preference. However, there is one difference between the two that might sometimes be noticeable, such as in Code Assessor. endl prints a newline and then flushes the output, so it appears on the screen immediately, whereas output ending in "\n" might sit in a buffer and be printed on a delayed basis. This might mean that when you are debugging, or when your program terminates prematurely, some output printed with "\n" is still being held in a buffer and has not yet been displayed. This in turn can mislead you into thinking that the output has not been "printed". Buffering of output is more efficient, so there is a tradeoff: endl is more predictable in its output and "\n" is more efficient.

    cout is convenient because it recognizes the types of what you want to print automatically. When you give it an integer, it recognizes it as an integer and prints it accordingly. This is as opposed to printf (which we'll go over later), which requires you to specify the types of what you're printing.

    So, for example, printemall.cpp has cout print strings, string variables and integer variables, all in one statement:

    #include <iostream>
    #include <string>
    using namespace std;
    
    int main()
    {
      int i;
      string s;
    
      i = 2;
      s = " times";
    
      cout << "Love me " << i << s << ", baby\n";
    
      return 0;
    }
    

    UNIX> g++ -o printemall printemall.cpp
    UNIX> ./printemall
    Love me 2 times, baby
    UNIX> 
    

    return() versus exit()

    You may see C programs use exit rather than return to exit a program. This is fine in C but could be problematic in C++, because exit() does not call the destructors for stack-allocated objects. Ordinarily this is not a problem, but it becomes one if the destructors free resources that would otherwise be held past the termination of the program, or if they perform important wrap-up processing. It is usually best to throw exceptions up to main if your program detects an error condition and let main deal with them. main can then use return to actually exit the program. However, since you may not encounter exception handling until you take CS365, it will be okay to use exit to abnormally terminate a program in this course. If you are terminating a program normally in main, please develop a good habit and use return instead of exit.


    Forgive me father, for I have cin'd

    We get input from the terminal with cin, which works a lot like cout. We call input from the terminal "standard input." For example, readone.cpp reads one integer from standard input and prints it out:

    #include <iostream>
    using namespace std;
    
    int main()
    {
      int i;
    
      cin >> i;
      cout << "I is " << i << endl;
    
      return 0;
    }
    

    Running it and typing 50 gives us the expected output:

    UNIX> g++ -o readone readone.cpp
    UNIX> ./readone
    50
    I is 50
    UNIX> 
    
    cin does not care about whitespace or lines. For example, if we make two consecutive cin statements, we can put the two integers on the same line or on different lines, with spaces in front, leading zeroes, etc. This is readtwo.cpp:

    #include <iostream>
    using namespace std;
    
    int main()
    {
      int i, j;
    
      cin >> i;
      cout << "I is " << i << endl;
    
      cin >> j;
      cout << "J is " << j << endl;
    
      return 0;
    }
    

    Here are several examples of entering two integers:

    UNIX> g++ -o readtwo readtwo.cpp
    UNIX> ./readtwo
    1 2
    I is 1
    J is 2
    UNIX> ./readtwo
    1
    I is 1
    2
    J is 2
    UNIX> ./readtwo
    
    
    
    
                      0000001 
    I is 1
    
    
                -2
    J is -2
    UNIX> 
    
    You can "redirect standard input from a file." This means that instead of reading from the terminal, cin will read from a file. You do this by putting "< filename" in the command that runs the executable (cat prints the file to the terminal):
    UNIX> cat input-1.txt
    11
    22
    UNIX> ./readtwo < input-1.txt
    I is 11
    J is 22
    UNIX> 
    
    Now, what happens when you don't enter an integer? For example, the file input-2.txt is empty, and input-3.txt does not have two integers:
    UNIX> cat input-2.txt
    UNIX> ./readtwo < input-2.txt
    I is 4096
    J is -1881139893
    UNIX> cat input-3.txt
    Fred
    44
    UNIX> ./readtwo < input-3.txt
    I is 4096
    J is -1881139893
    UNIX> 
    
    In both cases, i becomes 4096 and j becomes -1881139893. This is because in both cases, the first cin call failed, and i and j are left uninitialized. The values are random, and vary from machine to machine. You can't count on integers being initialized to zero, so be careful with uninitialized variables!

    With input-2.txt, it's pretty clear why both cin calls failed -- there was no input. However, with input-3.txt, you might think that since the second value is a number, the second cin call should succeed and j should become 44. This doesn't happen, because once cin fails, you have to "clear" it by calling cin.clear() to get it to work properly. But first, how do you determine that cin failed? You do that by putting the cin call into an if statement -- each cin statement can be used as a boolean value that is TRUE if the cin statement succeeded and FALSE if it failed.

    We do that in rt2.cpp:

    #include <cstdio>
    #include <iostream>
    using namespace std;
    
    int main()
    {
      int i, j;
    
      if (cin >> i) {
        cout << "I is " << i << endl;
      } else {
        printf("Bad cin call reading i -- calling cin.clear()\n");
        cin.clear();
      }
    
      if (cin >> j) {
        cout << "J is " << j << endl;
      } else {
        printf("Bad cin call reading j -- calling cin.clear()\n");
        cin.clear();
      }
    
      return 0;
    }
    

    Of course, when we run it, it doesn't really work as expected:

    UNIX> g++ -o rt2 rt2.cpp
    UNIX> ./rt2 < input-3.txt
    Bad cin call reading i -- calling cin.clear()
    Bad cin call reading j -- calling cin.clear()
    UNIX> 
    
    The second cin call failed. This is because when you call cin.clear() it resets cin, but it is still trying to read "Fred". You have to go ahead and read the erroneous integer as a string before moving on. This is comparable to a printer where you must both hit a "reset" button and clear the paper jam. The order in which you do things is important. First you must call .clear() and then you must clear the jam by reading the erroneous input as a string. The reason that this order is important is that once you jam cin, you cannot successfully perform another read until you have called .clear().

    The better version is in rt3.cpp:

    #include <cstdio>
    #include <iostream>
    #include <string>
    using namespace std;
    
    int main()
    {
      int i, j;
      string s;
    
      if (cin >> i) {
        cout << "I is " << i << endl;
      } else {
        printf("Bad cin call reading i -- calling cin.clear()\n");
        cin.clear();
        cin >> s;
      }
    
      if (cin >> j) {
        cout << "J is " << j << endl;
      } else {
        printf("Bad cin call reading j -- calling cin.clear()\n");
        cin.clear();
        cin >> s;
      }
    
      return 0;
    }
    

    Now, the second cin call reads j successfully:

    UNIX> g++ -o rt3 rt3.cpp
    UNIX> ./rt3 < input-3.txt
    Bad cin call reading i -- calling cin.clear()
    J is 44
    UNIX> 
    
    Of course, when we call it on the empty file, both cin calls fail. We can detect whether the failures are due to reaching the end of the file by using cin.eof() after the failure, as in rt4.cpp:

    #include <cstdio>
    #include <cstdlib>
    #include <iostream>
    #include <string>
    using namespace std;
    
    int main()
    {
      int i, j;
      string s;
    
      if (cin >> i) {
        cout << "I is " << i << endl;
      } else {
        if (cin.eof()) exit(0);
        printf("Bad cin call reading i -- calling cin.clear()\n");
        cin.clear();
        cin >> s;
      }
    
      if (cin >> j) {
        cout << "J is " << j << endl;
      } else {
        if (cin.eof()) exit(0);
        printf("Bad cin call reading j -- calling cin.clear()\n");
        cin.clear();
        cin >> s;
      }
    
      return 0;
    }
    


    A warning to Mac users about using cin with doubles

    Mac OS X introduced an unusual interpretation for numbers when it tries to read doubles. Specifically, if a Mac C++ program attempts to read a double from a word that begins with the letters a-f, i, n, p, or x (or upper case versions of these letters), then cin will fail but it will consume these characters. It appears that the clang compiler interprets A-F as hexadecimal, I as INF (infinity), N as NAN (Not A Number) and P for C99 floating point hexadecimal constants. I believe X stands for hexadecimal. You can fix it on a Mac by compiling with the --stdlib=libstdc++ flag.

    Here's a sample program that causes the problem:

    #include <iostream>
    #include <string>
    using namespace std;
    
    int main()
    {
     string s;
     double i;
    
     if (cin >> i)
       cout << "i = " << i << endl;
     else {
       cin.clear();
       cin >> s;
       cout << "s = " << s << endl;
     }
     return 0;
    }
    

    If you give it the input "bard" on a Mac, it produces the output:

    s = rd
    
    having pinched off the "ba".


    Reading single characters using cin

    If you use cin to read variables that are of type char, it reads in single characters. For example, the program ncnl.cpp uses cin to read each character of standard input. It counts the total number of characters and the total number of L's.

    #include <iostream>
    using namespace std;
    
    int main()
    {
      int nc, nl;
      char c;
    
      nc = 0;
      nl = 0;
    
      while (cin >> c) {
        nc++;
        if (c == 'L') nl++;
      }
    
      cout << "# of characters: " << nc << endl;
      cout << "# of L's: " << nl << endl;
      return 0;
    }
    

    When we run it on input-mixed.txt and on ncnl.cpp, we see that each has exactly two L's:

    UNIX> cat input-mixed.txt
    Love me 2 times baby.  Love me twice 2 day.
    UNIX> ncnl < input-mixed.txt
    # of characters: 33
    # of L's: 2
    UNIX> ncnl < ncnl.cpp
    # of characters: 174
    # of L's: 2
    UNIX> 
    
    The Unix program wc counts lines, words and characters in a file. When we run it on the two input files, we see that the number of characters differs from ncnl:
    UNIX> wc input-mixed.txt
     1 10 44 input-mixed.txt
    UNIX> wc ncnl.cpp
     20  54 258 ncnl.cpp
    UNIX> 
    
    wc reports that input-mixed.txt has 44 characters, yet ncnl only read 33. Why? The reason is that when cin reads characters, it skips "whitespace" -- spaces, tabs and newlines. If you count the number of non-whitespace characters in input-mixed.txt, you'll see that it has 33.

    Four common cin errors: #1 -- "!cin"

    The first common error is exemplified by the program in readten-bad.cpp:

    #include <iostream>
    using namespace std;
    
    int main()
    {
      int i, n1;
    
      for (i = 0; i < 10; i++) {
        if (!cin >> n1) {
          cout << "Done\n";
          return 0;
        }
        cout << "Number " << i << " equals " << n1 << endl;
      }
    
      return 0;
    }
    

    The intent of this program is to read ten integers and print them out. If one of the cin calls fails, then the program should exit prematurely, printing "Done." However, let's run it on an input file that has exactly 10 integers:

    UNIX> cat input-ten.txt
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    UNIX> g++ -o readten-bad readten-bad.cpp
    UNIX> ./readten-bad < input-ten.txt
    Number 0 equals 4096
    Number 1 equals 4096
    Number 2 equals 4096
    Number 3 equals 4096
    Number 4 equals 4096
    Number 5 equals 4096
    Number 6 equals 4096
    Number 7 equals 4096
    Number 8 equals 4096
    Number 9 equals 4096
    UNIX> 
    
    Hmmm. The bad line is

        if (!cin >> n1) {
    

    The boolean "not" operator (!) is being applied only to cin, and not to the entire expression. This is an "order of operations" thing -- (!) has higher precedence than (>>). The bad part about this mistake is that the program compiles legally -- evidently you can negate cin. Perhaps I should look up what it does, but I'm not going to -- it seems like a bad idea regardless of its meaning.

    To fix this, you must parenthesize as in readten-good.cpp:

    #include <iostream>
    using namespace std;
    
    int main()
    {
      int i, n1;
    
      for (i = 0; i < 10; i++) {
        if (!(cin >> n1)) { 
          cout << "Done\n";
          return 0;
        }
        cout << "Number " << i << " equals " << n1 << endl;
      }
    
      return 0;
    }
    

    This one works as it should:

    UNIX> g++ -o readten-good readten-good.cpp
    UNIX> ./readten-good < input-ten.txt
    Number 0 equals 10
    Number 1 equals 11
    Number 2 equals 12
    Number 3 equals 13
    Number 4 equals 14
    Number 5 equals 15
    Number 6 equals 16
    Number 7 equals 17
    Number 8 equals 18
    Number 9 equals 19
    UNIX> ./readten-good < input-1.txt
    Number 0 equals 11
    Number 1 equals 22
    Done
    UNIX> 
    

    Four common cin errors: #2 -- cin.eof() is not proactive!

    I've seen this one on tests so many times that I have to address it here. cin.eof() does not return TRUE just because the next read would hit the end of the file. It returns TRUE only after you have tried to read something and that read has run into the end of the file, which usually means the read failed. That's a subtle distinction, but very important.

    Take a look at badeof.cpp:

    #include <iostream>
    #include <string>
    using namespace std;
    
    int main()
    {
      string s;
      int i;
    
      i = 0;
      while (!cin.eof()) {
        i++;
        cin >> s;
        cout << "String " << i << " is " << s << endl;
      }
      return 0;
    }
    

    This is the type of erroneous code I see on tests. The intent of this program is to number the words on standard input and print them out. However, it has a bug regarding cin.eof():

    UNIX> g++ -o badeof badeof.cpp
    UNIX> cat input-1.txt
    11
    22
    UNIX> ./badeof < input-1.txt
    String 1 is 11
    String 2 is 22
    String 3 is 22
    UNIX> cat input-2.txt
    UNIX> ./badeof < input-2.txt
    String 1 is 
    UNIX> 
    
    Both times, the program prints an extra line. That's because after the second word has been read, the "while (!cin.eof())" test is still true: you haven't yet tried to read another word and failed. So, on the next iteration, "cin >> s" fails, and s remains the same, which is why that last line says "String 3 is 22". Since the cin statement failed at the end of the file, cin.eof() now returns TRUE, and the program exits.

    I like to say that cin.eof() and cin.fail() are not proactive, but reactive. They are only true when a cin statement failed, and they are telling you why.

    Below are two ways to write the program correctly. The first uses cin.eof() correctly, and the second doesn't bother using cin.eof() at all. Personally, I like the second better because it's less convoluted.

    good-eof-1.cpp:
    #include <iostream>
    #include <string>
    using namespace std;
    
    int main()
    {
      string s;
      int i;
    
      i = 0;
      cin >> s;
      while (!cin.eof()) {
        i++;
        cout << "String " << i << " is " << s << endl;
        cin >> s;
      }
      return 0;
    }
    
    good-eof-2.cpp:
    #include <iostream>
    #include <string>
    using namespace std;
    
    int main()
    {
      string s;
      int i;
    
      i = 0;
      while (cin >> s) {
        i++;
        cout << "String " << i << " is " << s << endl;
      }
      return 0;
    }
    


    Four common cin errors: #3 -- cin reads words and not lines

    Mistake #3 is when you think that cin works on lines, forgetting that it works on a word-by-word basis:
    UNIX> cat input-twenty.txt
    10 110
    11 109
    12 108
    13 107
    14 106
    15 105
    16 104
    17 103
    18 102
    19 101
    UNIX> ./readten-good < input-twenty.txt
    Number 0 equals 10
    Number 1 equals 110
    Number 2 equals 11
    Number 3 equals 109
    Number 4 equals 12
    Number 5 equals 108
    Number 6 equals 13
    Number 7 equals 107
    Number 8 equals 14
    Number 9 equals 106
    UNIX> 
    
    The program is working fine -- cin reads words, not lines.


    Four common cin errors: #4 -- Clearing cin when it fails

    The last error is forgetting to clear cin and re-read bad input. The program forget-clear.cpp attempts to read ten numbers and flag when a number is not read correctly:

    #include <iostream>
    using namespace std;
    
    int main()
    {
      int i, n1;
    
      for (i = 0; i < 10; i++) {
        if (!(cin >> n1)) {
          cout << "Number " << i << " entered incorrectly\n";
        } else {
          cout << "Number " << i << " equals " << n1 << endl;
        }
      }
    
      return 0;
    }
    

    When we run it on input-mixed.txt, numbers 2 and 8 should be correct, while the rest are not. However, since we don't clear cin, each cin statement returns that it read incorrectly:

    UNIX> g++ -o forget-clear forget-clear.cpp
    UNIX> cat input-mixed.txt
    Love me 2 times baby.  Love me twice 2 day.
    UNIX> ./forget-clear < input-mixed.txt
    Number 0 entered incorrectly
    Number 1 entered incorrectly
    Number 2 entered incorrectly
    Number 3 entered incorrectly
    Number 4 entered incorrectly
    Number 5 entered incorrectly
    Number 6 entered incorrectly
    Number 7 entered incorrectly
    Number 8 entered incorrectly
    Number 9 entered incorrectly
    UNIX> 
    
    When you try to fix this mistake, you need to remember to both clear cin, and then read the offending word. The program forget-read.cpp remembers to do cin.clear(), but then each successive cin statement tries to read the same word ("Love"), and returns that it read incorrectly:

    #include <iostream>
    using namespace std;
    
    int main()
    {
      int i, n1;
    
      for (i = 0; i < 10; i++) {
        if (!(cin >> n1)) {
          cout << "Number " << i << " entered incorrectly\n";
          cin.clear();
        } else {
          cout << "Number " << i << " equals " << n1 << endl;
        }
      }
    
      return 0;
    }
    

    UNIX> g++ -o forget-read forget-read.cpp
    UNIX> ./forget-read < input-mixed.txt
    Number 0 entered incorrectly
    Number 1 entered incorrectly
    Number 2 entered incorrectly
    Number 3 entered incorrectly
    Number 4 entered incorrectly
    Number 5 entered incorrectly
    Number 6 entered incorrectly
    Number 7 entered incorrectly
    Number 8 entered incorrectly
    Number 9 entered incorrectly
    UNIX> 
    
    Finally, the program forget-nothing.cpp reads the offending word after clearing cin, and also detects when input has ended, because that's when reading the string fails:

    #include <iostream>
    #include <string>
    using namespace std;
    
    int main()
    {
      int i, n1;
      string s;
    
      for (i = 0; i < 10; i++) {
        if (!(cin >> n1)) {
          cin.clear();
          if (!(cin >> s)) return 0;
          cout << "Number " << i << " entered incorrectly\n";
        } else {
          cout << "Number " << i << " equals " << n1 << endl;
        }
      }
    
      return 0;
    }
    

    When we run it, it correctly identifies numbers 2 and 8 as numbers. The second run identifies that there are only two words -- one incorrect and one correct.

    UNIX> g++ -o forget-nothing forget-nothing.cpp
    UNIX> ./forget-nothing < input-mixed.txt
    Number 0 entered incorrectly
    Number 1 entered incorrectly
    Number 2 equals 2
    Number 3 entered incorrectly
    Number 4 entered incorrectly
    Number 5 entered incorrectly
    Number 6 entered incorrectly
    Number 7 entered incorrectly
    Number 8 equals 2
    Number 9 entered incorrectly
    UNIX> cat input-3.txt
    Fred
    44
    UNIX> ./forget-nothing < input-3.txt
    Number 0 entered incorrectly
    Number 1 equals 44
    UNIX>