CS140 Lecture notes -- Argc/Argv and Stringstreams

  • Jim Plank
  • Directory: ~plank/cs140/Notes/Argv
  • Lecture notes: http://web.eecs.utk.edu/~jplank/plank/classes/cs140/Notes/Argv
  • Last modification date: Mon Jan 21 10:55:15 EST 2013

    Argc and argv

    You may give main() two parameters, which are conventionally named argc and argv. They are usually declared as follows:

    int main(int argc, char **argv)
    

    Argc is an integer. It stores the number of arguments on the command line when the program was called. This includes the name of the program itself.

    Argv looks weird, but it is simply an array of C-style strings. Sometimes you see it declared as "char *argv[]", which is equivalent to the above. Element argv[i] is a C-style string containing the i-th command line argument. These are defined from 0 through argc-1.

    For example, the program argc.cpp prints out argc, and then each element of argv:

    #include <iostream>
    #include <cstdio>
    using namespace std;
    
    int main(int argc, char **argv)
    {
      int i;
    
      printf("Argc is %d\n", argc);
      printf("\n");
      for (i = 0; i < argc; i++) {
        printf("argv[%d] = %s\n", i, argv[i]);
      }
    }
    

    A few examples are straightforward:

    UNIX> g++ argc.cpp -o argc
    UNIX> argc
    Argc is 1
    
    argv[0] = argc
    UNIX> argc 5
    Argc is 2
    
    argv[0] = argc
    argv[1] = 5
    UNIX> argc They call me Mellow Yellow - quite right, Slick
    Argc is 10
    
    argv[0] = argc
    argv[1] = They
    argv[2] = call
    argv[3] = me
    argv[4] = Mellow
    argv[5] = Yellow
    argv[6] = -
    argv[7] = quite
    argv[8] = right,
    argv[9] = Slick
    UNIX> 
    
    And now a few non-straightforward examples that mostly have to do with the Unix shell. If you perform redirection of standard input or standard output, those specifications are stripped out by the shell and are not included in argc/argv:
    UNIX> argc < argc.cpp
    Argc is 1
    
    argv[0] = argc
    UNIX> 
    
    You can even put those specifications at the beginning of the command -- it matters not:
    UNIX> > output.txt argc < argc.cpp Shaft
    UNIX> cat output.txt
    Argc is 2
    
    argv[0] = argc
    argv[1] = Shaft
    UNIX> < output.txt argc 1 2 3 > output2.txt 4 5 6
    UNIX> cat output2.txt
    Argc is 7
    
    argv[0] = argc
    argv[1] = 1
    argv[2] = 2
    argv[3] = 3
    argv[4] = 4
    argv[5] = 5
    argv[6] = 6
    UNIX> 
    
    If you use single or double quotes you can put spaces into single arguments. You can use single quotes to put double quotes into arguments and vice-versa:
    UNIX> argc "Jim Plank" 'Jim Plank'
    Argc is 3
    
    argv[0] = argc
    argv[1] = Jim Plank
    argv[2] = Jim Plank
    UNIX> argc "They call him 'Thor'"
    Argc is 2
    
    argv[0] = argc
    argv[1] = They call him 'Thor'
    UNIX> argc 'He said, "Quoting, it'"'s quite confusing."'"'
    Argc is 2
    
    argv[0] = argc
    argv[1] = He said, "Quoting, it's quite confusing."
    UNIX> 
    
    Here's how the shell derives the above string:
    1. The string 'He said...' gets terminated by the ' after the word "it", thus creating the string 'He said, "Quoting, it'.
    2. Because there is no white space between the ' and " in it'"'s, the " is considered to start a new string that should be concatenated to the string from step 1. This new string is "'s quite confusing." Concatenating it to the string from step 1 yields 'He said, "Quoting, it's quite confusing.'
    3. Again there is no space between the " and the ' after the word confusing, and so the string '"' gets concatenated to the string from step 2, thus obtaining 'He said, "Quoting, it's quite confusing."'
    Finally, the shell treats an asterisk (*) on the command line as a pattern to match against the file names in the current directory. For example, "*" matches all files (by default, except those whose names begin with a dot). "*.cpp" matches all files that end with ".cpp". The matching names are then put onto the command line and into argv:
    UNIX> argc *
    Argc is 19
    
    argv[0] = argc
    argv[1] = argc
    argv[2] = argc.cpp
    argv[3] = argv1int
    argv[4] = argv1int.cpp
    argv[5] = argv2int
    argv[6] = argv2int.cpp
    argv[7] = argvallint
    argv[8] = argvallint.cpp
    argv[9] = clearing-output
    argv[10] = clearing-output.cpp
    argv[11] = identify_words
    argv[12] = identify_words.cpp
    argv[13] = index.html
    argv[14] = input.txt
    argv[15] = onetoten
    argv[16] = onetoten.cpp
    argv[17] = output2.txt
    argv[18] = output.txt
    UNIX> argc *.cpp
    Argc is 8
    
    argv[0] = argc
    argv[1] = argc.cpp
    argv[2] = argv1int.cpp
    argv[3] = argv2int.cpp
    argv[4] = argvallint.cpp
    argv[5] = clearing-output.cpp
    argv[6] = identify_words.cpp
    argv[7] = onetoten.cpp
    UNIX> 
    

    Stringstreams for parsing command line arguments

    Stringstreams are C++ library classes which allow you to treat strings like cin and cout. If you want to treat a string s like cin, then you declare an istringstream and initialize it with the str() method. Here's an example that puts argv[1] into an istringstream and then uses that to convert argv[1] into an integer. It is in argv1int.cpp:

    #include <iostream>
    #include <cstdio>
    #include <sstream>
    using namespace std;
    
    int main(int argc, char **argv)
    {
      istringstream ss;
      int i;
    
      if (argc != 2) { cerr << "usage: argv1int argument\n"; return 1; }
    
      ss.str(argv[1]);
      if (!(ss >> i)) {
        fprintf(stderr, "The argument %s is not an integer.\n", argv[1]);
        return 1;
      } 
    
      printf("Integer: %d\n", i);
    }
    

    First, if you want to use printf() to print to standard error, you simply call fprintf() and make the first parameter stderr.

    Second, you include sstream to use stringstreams. You treat ss above just like cin to determine whether argv[1] is an integer.

    UNIX> g++ argv1int.cpp -o argv1int
    UNIX> argv1int 
    usage: argv1int argument
    UNIX> argv1int 8000
    Integer: 8000
    UNIX> argv1int Fred
    The argument Fred is not an integer.
    UNIX> 
    
    An unfortunate part about a stringstream is that you cannot simply call the str() method to set it to another string. For example, argv2int.cpp is a logical extension to argv1int.cpp to read argv[1] into i and argv[2] into j:

    #include <iostream>
    #include <cstdio>
    #include <sstream>
    using namespace std;
    
    int main(int argc, char **argv)
    {
      istringstream ss;
      int i, j;
    
      if (argc != 3) { cerr << "usage: argv2int argument argument\n"; return 1; }
    
      ss.str(argv[1]);
      if (!(ss >> i)) {
        fprintf(stderr, "Argument i -- %s is not an integer.\n", argv[1]);
        return 1;
      } 
      printf("Argument i: %d\n", i);
    
      ss.str(argv[2]);
      if (!(ss >> j)) {
        fprintf(stderr, "Argument j -- %s is not an integer.\n", argv[2]);
        return 1;
      } 
      printf("Argument j: %d\n", j);
    }
    

    When we run it, it fails on j, even though argv[2] does represent an integer:

    UNIX> g++ argv2int.cpp -o argv2int
    UNIX> argv2int 10 20
    Argument i: 10
    Argument j -- 20 is not an integer.
    UNIX> 
    
    This is because reading i consumed the whole string, which put the stringstream into the end-of-file state, and str() does not reset that state. You have to call clear() on the stringstream before calling ss.str(argv[2]). That's a drag -- try to remember it so that you don't get burnt by it when programming. To hammer this home, argvallint.cpp tests each command line argument to see if it is an integer:

    #include <iostream>
    #include <cstdio>
    #include <sstream>
    using namespace std;
    
    int main(int argc, char **argv)
    {
      istringstream ss;
      int i, j;
    
      for (i = 1; i < argc; i++) {
        ss.clear();
        ss.str(argv[i]);
        if (!(ss >> j)) {
          printf("Argument %d -- %s is not an integer.\n", i, argv[i]);
        } else {
          printf("Argument %d -- %d\n", i, j);
        }
      }
    }
    

    UNIX> g++ argvallint.cpp -o argvallint
    UNIX> argvallint 1 2 buckle my shoe. 3 4 get out the door.
    Argument 1 -- 1
    Argument 2 -- 2
    Argument 3 -- buckle is not an integer.
    Argument 4 -- my is not an integer.
    Argument 5 -- shoe. is not an integer.
    Argument 6 -- 3
    Argument 7 -- 4
    Argument 8 -- get is not an integer.
    Argument 9 -- out is not an integer.
    Argument 10 -- the is not an integer.
    Argument 11 -- door. is not an integer.
    UNIX> 
    

    Stringstreams to create strings

    You can use the ostringstream type to create a string using functionality like cout. In this case, you simply call str() to convert the stringstream to a string. Here's an example in onetoten.cpp:

    #include <iostream>
    #include <cstdio>
    #include <sstream>
    using namespace std;
    
    int main(int argc, char **argv)
    {
      ostringstream ss;
      string s;
      int i;
    
      for (i = 1; i < 11; i++) ss << i << " ";
      s = ss.str();
      cout << s << endl;
    }
    

    It creates a string containing the numbers 1 through 10 and then prints it out:

    UNIX> g++ onetoten.cpp -o onetoten
    UNIX> onetoten
    1 2 3 4 5 6 7 8 9 10 
    UNIX> 
    
    When you want to "reuse" an ostringstream, you need to clear it, and to call its str() method with an empty string as its argument. The clear() call only resets the stream's error flags; it is the str("") call that resets the contents to be empty. If you don't do it and you're not ready for it, you'll get some surprising output. For example, clearing-output.cpp attempts to print out four random integers by first putting them into an ostringstream and then printing out the string:

    #include <iostream>
    #include <cstdio>
    #include <cstdlib>
    #include <sstream>
    using namespace std;
    
    int main(int argc, char **argv)
    {
      ostringstream ss, ss2;
      int i, j;
    
      srand48(0);
      printf("Not calling ss.str(\"\"):\n\n");
      for (i = 0; i < 4; i++) {
        ss.clear();
        ss << "#" << i << ": " << lrand48()%1000 << "\n";
        cout << ss.str();
      }
      cout << endl;
    
      srand48(0);
      printf("Calling ss2.str(\"\"):\n\n");
     
      for (i = 0; i < 4; i++) {
        ss2.clear();
        ss2.str("");
        ss2 << "#" << i << ": " << lrand48()%1000 << "\n";
        cout << ss2.str();
      }
    }
    

    In the first loop, it does not call str("") at the beginning of each iteration, and as a result, the same numbers get printed out multiple times. The second loop prints each number exactly once:

    UNIX> clearing-output
    Not calling ss.str(""):
    
    #0: 414
    #0: 414
    #1: 240
    #0: 414
    #1: 240
    #2: 554
    #0: 414
    #1: 240
    #2: 554
    #3: 841
    
    Calling ss2.str(""):
    
    #0: 414
    #1: 240
    #2: 554
    #3: 841
    UNIX> 
    

    Getline and stringstreams

    As mentioned in the lecture notes on string and vector basics, the procedure getline(cin, s) reads a line of input from standard input and puts it into the string s. Spaces are preserved. You can combine getline and stringstreams to identify and get access to individual words on each line, as in identify_words.cpp, which prints out each word on standard input, preceded by its line number and word number.

    #include <iostream>
    #include <cstdio>
    #include <sstream>
    using namespace std;
    
    int main(int argc, char **argv)
    {
      istringstream ss;
      string s;
      int l, w;
    
      l = 0;
      while (getline(cin, s)) {
        l++;
        ss.clear();
        ss.str(s);
        w = 0;
        while (ss >> s) {
          w++;
          printf("Line %3d, word %3d: %s\n", l, w, s.c_str());
        }
        printf("\n");
      }
    }
    

    UNIX> g++ identify_words.cpp -o identify_words
    UNIX> cat input.txt
    There's a port on a western bay
    And it serves a hundred ships a day.
    Lonely sailors pass the time away
    And talk about their homes.
    UNIX> identify_words < input.txt
    Line   1, word   1: There's
    Line   1, word   2: a
    Line   1, word   3: port
    Line   1, word   4: on
    Line   1, word   5: a
    Line   1, word   6: western
    Line   1, word   7: bay
    
    Line   2, word   1: And
    Line   2, word   2: it
    Line   2, word   3: serves
    Line   2, word   4: a
    Line   2, word   5: hundred
    Line   2, word   6: ships
    Line   2, word   7: a
    Line   2, word   8: day.
    
    Line   3, word   1: Lonely
    Line   3, word   2: sailors
    Line   3, word   3: pass
    Line   3, word   4: the
    Line   3, word   5: time
    Line   3, word   6: away
    
    Line   4, word   1: And
    Line   4, word   2: talk
    Line   4, word   3: about
    Line   4, word   4: their
    Line   4, word   5: homes.
    
    UNIX>