CS202 Lecture notes -- Argc/Argv and Stringstreams

  • James S. Plank
  • Directory: ~jplank/cs202/Notes/Argv
  • Lecture notes: http://web.eecs.utk.edu/~jplank/plank/classes/cs202/Notes/Argv
  • Original Notes -- 2011-ish
  • Last modification date: Tue Aug 24 17:11:32 EDT 2021

    Argc and argv

    You may give main() two parameters, which are conventionally named argc and argv. They are usually declared as follows:

    int main(int argc, char **argv)
    

    Argc is an integer. It stores the number of arguments on the command line when the program was called. This includes the name of the program itself.

    Argv looks weird, but it is simply an array of C-style strings. Sometimes you see it declared as "char *argv[]," which is equivalent to the above. Element argv[i] is a C-style string containing the i-th command line argument. These are defined from 0 through argc-1.

    For example, the program src/argc.cpp prints out argc, and then each element of argv, first using printf() with the C-style strings, and then by copying them to C++ strings and using cout:

    /* This program introduces you to argc and argv.  They are parameters to main(), and
       tell you the number of words on the command line, and what those words are.  Note
       that argc is always at least one, and argv[0] is usually the name of the program. */
    
    #include <iostream>
    #include <cstdio>
    using namespace std;
    
    int main(int argc, char **argv)
    {
      int i;
      string s;
    
      /* Print argc. */
    
      printf("Argc is %d\n", argc);
      printf("\n");
    
      /* Print argv using printf() and c-style strings. */
    
      for (i = 0; i < argc; i++) {
        printf("argv[%d] = %s\n", i, argv[i]);
      }
      printf("\n");
    
      /* Print argv by copying each argument to a C++ string and printing it with cout. */
    
      for (i = 0; i < argc; i++) {
        s = argv[i];
        cout << "argv[" << i << "] = " << s << endl;
      }
      return 0;
    }
    

    A few examples are straightforward:

    UNIX> bin/argc 
    Argc is 1
    
    argv[0] = bin/argc
    
    argv[0] = bin/argc
    UNIX> bin/argc 5
    Argc is 2
    
    argv[0] = bin/argc
    argv[1] = 5
    
    argv[0] = bin/argc
    argv[1] = 5
    UNIX> bin/argc - Fred . +
    Argc is 5
    
    argv[0] = bin/argc
    argv[1] = -
    argv[2] = Fred
    argv[3] = .
    argv[4] = +
    
    argv[0] = bin/argc
    argv[1] = -
    argv[2] = Fred
    argv[3] = .
    argv[4] = +
    UNIX> 
    
    And now a few non-straightforward examples that mostly have to do with the Unix shell. If you perform redirection of standard input or standard output, those specifications are stripped out by the shell and are not included in argc/argv:
    UNIX> bin/argc < src/argc.cpp 
    Argc is 1
    
    argv[0] = bin/argc
    
    argv[0] = bin/argc
    UNIX> 
    
    You can even put those specifications at the beginning of the command -- it matters not:
    UNIX> > output.txt bin/argc < src/argc.cpp Shaft 
    UNIX> cat output.txt
    Argc is 2
    
    argv[0] = bin/argc
    argv[1] = Shaft
    
    argv[0] = bin/argc
    argv[1] = Shaft
    UNIX> 
    
    If you use single or double quotes, you can put spaces into single arguments. You can use single quotes to put double quotes into arguments and vice-versa:
    UNIX> bin/argc "Jim Plank" 'Jim Plank'
    Argc is 3
    
    argv[0] = bin/argc
    argv[1] = Jim Plank
    argv[2] = Jim Plank
    
    argv[0] = bin/argc
    argv[1] = Jim Plank
    argv[2] = Jim Plank
    UNIX> bin/argc "They call him 'Thor'"
    Argc is 2
    
    argv[0] = bin/argc
    argv[1] = They call him 'Thor'
    
    argv[0] = bin/argc
    argv[1] = They call him 'Thor'
    UNIX> bin/argc 'He said, "Quoting, it'"'s quite confusing."'"'
    Argc is 2
    
    argv[0] = bin/argc
    argv[1] = He said, "Quoting, it's quite confusing."
    
    argv[0] = bin/argc
    argv[1] = He said, "Quoting, it's quite confusing."
    UNIX> 
    
    Finally, the shell treats an asterisk (*) on the command line as a pattern to match against file names. For example, "*" matches all files in the current directory, and "src/*.cpp" matches all files in src that end with ".cpp". The shell replaces the pattern with the matching names, which are then put onto the command line and into argv:
    UNIX> bin/argc src/*.cpp | head -n 10
    Argc is 8
    
    argv[0] = bin/argc
    argv[1] = src/argc.cpp
    argv[2] = src/argv1int.cpp
    argv[3] = src/argv2int.cpp
    argv[4] = src/argvallint.cpp
    argv[5] = src/clearing-output.cpp
    argv[6] = src/identify_words.cpp
    argv[7] = src/onetoten.cpp
    UNIX> 
    

    When using argv, you should convert the strings to C++ strings

    Whenever you use argv, unless you are just printing out the strings, you should convert them to C++ strings. To illustrate, here's src/argv-beware.cpp. Please read the inline comments.

    /* This program takes two strings on the command line and compares them to see if they
       are identical or equal to the string "Fred".  The point of this is to show you that when you use 
       the elements of argv as a C-style string, you'll get unexpected results.  The takeaway is
       that you should always convert the strings in argv to C++-style strings. */
    
    #include <iostream>
    #include <cstdio>
    using namespace std;
    
    int main(int argc, char **argv)
    {
      string a1, a2;
    
      /* Error check and print the two words. */
    
      if (argc != 3) {
        cerr << "usage: bin/argc-beware word1 word2\n";
        return 1;
      }
    
      printf("Word 1: %s\n", argv[1]);
      printf("Word 2: %s\n", argv[2]);
    
      /* This automatically converts them to C++ strings. */
    
      a1 = argv[1];
      a2 = argv[2];
    
      /* Now do various comparisons with the C and C++ versions */
    
      printf("Comparing them as C-style strings:    %d\n", (argv[1] == argv[2]));
      printf("Comparing them as C++ strings:        %d\n", (a1 == a2));
      printf("Equal to the C string \"Fred?\"?        %d\n", (argv[1] == "Fred"));
      printf("Equal to the C++ string \"Fred?\"?      %d\n", (a1 == "Fred"));
      
      printf("Doing the comparison with a typecast: %d\n", ((string) argv[1] == (string) argv[2]));
    
      return 0;
    }
    

    When we run this with two different strings, we get the expected output: They don't equal each other, and they don't equal the string "Fred":

    UNIX> bin/argv-beware 
    usage: bin/argc-beware word1 word2
    UNIX> bin/argv-beware a b
    Word 1: a
    Word 2: b
    Comparing them as C-style strings:    0
    Comparing them as C++ strings:        0
    Equal to the C string "Fred?"?        0
    Equal to the C++ string "Fred?"?      0
    Doing the comparison with a typecast: 0
    UNIX>
    
    When we specify two identical strings, you'll note that as C-style strings, they don't equal each other:
    UNIX> bin/argv-beware a a 
    Word 1: a
    Word 2: a
    Comparing them as C-style strings:    0          # Here's where they don't equal each other.
    Comparing them as C++ strings:        1
    Equal to the C string "Fred?"?        0
    Equal to the C++ string "Fred?"?      0
    Doing the comparison with a typecast: 1
    UNIX>
    
    You don't need to worry about the reason until you take CS360, but in case you're curious, it's because the C-style strings are pointers to the characters. The == operator compares the pointers rather than the characters, and argv[1] and argv[2] point to different memory locations.

    If I have the arguments be "Fred", you'll see that again, the C-style strings don't equal each other or "Fred":

    UNIX> bin/argv-beware Fred Fred
    Word 1: Fred
    Word 2: Fred
    Comparing them as C-style strings:    0        # Again, they don't equal each other.
    Comparing them as C++ strings:        1
    Equal to the C string "Fred?"?        0        # Or Fred, as C-style strings
    Equal to the C++ string "Fred?"?      1
    Doing the comparison with a typecast: 1
    UNIX> 
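
    The pointer comparison described above can be sketched in a few lines. This is only an illustration: the three function names are made up for this example and are not part of the lecture's src directory. It also uses strcmp() from <cstring>, which these notes do not otherwise cover, as the C way of comparing the characters.

```cpp
/* A sketch of why == on C-style strings compares addresses, not characters.
   The function names are hypothetical. */

#include <cstring>
#include <string>
using namespace std;

/* Compares the two pointers: true only if both point to the same memory. */

bool same_address(const char *a, const char *b)
{
  return a == b;
}

/* Compares the characters with strcmp(), which returns 0 when
   the two strings hold the same characters. */

bool same_chars_c(const char *a, const char *b)
{
  return strcmp(a, b) == 0;
}

/* Compares the characters by converting both to C++ strings first. */

bool same_chars_cpp(const char *a, const char *b)
{
  return (string) a == (string) b;
}
```

    If x and y are two separate character arrays that both hold "Fred", then same_address(x, y) is false, while same_chars_c(x, y) and same_chars_cpp(x, y) are both true.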
    
    A final note -- if you put "(string)" before argv[i] in an expression, then it will convert argv[i] to a C++ string. That's in the code above, and it is often useful for when you want to compare the string to a literal.

    And a post-final note: when you compile this with -Wall and -Wextra, you get a very valid warning:

    UNIX> g++ -Wall -Wextra --std=c++98 -o bin/argv-beware src/argv-beware.cpp
    src/argv-beware.cpp: In function 'int main(int, char**)':
    src/argv-beware.cpp:33:70: warning: comparison with string literal results in unspecified behavior [-Waddress]
       printf("Equal to the C string \"Fred?\"?        %d\n", (argv[1] == "Fred"));
                                                                          ^~~~~~
    UNIX>
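
    One way to follow the advice in this section is to copy all of argv into a vector of C++ strings at the top of main(), and to use only the vector after that. Here is a minimal sketch. The name args_to_strings is hypothetical and does not come from the lecture's code.

```cpp
/* Copy the command line arguments into a vector of C++ strings, so that
   later comparisons with == compare characters rather than pointers.
   args_to_strings is a hypothetical helper for this example. */

#include <string>
#include <vector>
using namespace std;

vector <string> args_to_strings(int argc, char **argv)
{
  vector <string> args;
  int i;

  /* push_back() converts each C-style string to a C++ string. */

  for (i = 0; i < argc; i++) args.push_back(argv[i]);
  return args;
}
```

    In main(), you would call it as "args = args_to_strings(argc, argv);" and then write comparisons like args[1] == args[2] or args[1] == "Fred", both of which compare the characters.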
    

    Stringstreams for parsing command line arguments

    Stringstreams are C++ library classes that let you treat strings like cin and cout. If you want to treat a string s like cin, you declare an istringstream and set its contents with the str() method. Here's an example that puts argv[1] into an istringstream and then uses it to convert argv[1] into an integer. It is in src/argv1int.cpp:

    /* This program uses a stringstream to convert the first command line
       argument to an integer.  It prints an error if it is not an integer. */
    
    #include <iostream>
    #include <cstdio>
    #include <sstream>
    using namespace std;
    
    int main(int argc, char **argv)
    {
      istringstream ss;
      int i;
    
      /* Make sure there is exactly one command line argument. */
    
      if (argc != 2) { 
        cerr << "usage: argv1int argument\n"; 
        return 1; 
      }
    
      /* Use the stringstream to convert the argument to an integer, or print
         an error on standard error. */
    
      ss.str(argv[1]);
      if (ss >> i) {
        printf("Integer: %d\n", i);
      } else {
        fprintf(stderr, "The argument %s is not an integer.\n", argv[1]);
        return 1;
      } 
    
      return 0;
    }
    

    First, if you want to use printf() to print to standard error, you simply call fprintf() and make the first parameter stderr.

    Second, you include sstream to use stringstreams. You treat ss above just like cin to determine whether argv[1] is an integer.

    UNIX> bin/argv1int 
    usage: argv1int argument
    UNIX> bin/argv1int 8000
    Integer: 8000
    UNIX> bin/argv1int Fred
    The argument Fred is not an integer.
    UNIX> 
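
    One thing to be aware of: reading an integer with >> stops at the first character that cannot be part of the integer, so an argument like "8000abc" is read as 8000 by the test above. If you want to reject such arguments, a stricter check is sketched below. The helper name string_to_int_strict is hypothetical, and this is just one way to do it.

```cpp
/* Convert a string to an integer, and fail if anything other than
   whitespace follows the integer.  string_to_int_strict is a
   hypothetical helper for this example. */

#include <sstream>
#include <string>
using namespace std;

bool string_to_int_strict(const string &s, int &i)
{
  istringstream ss;
  int value;
  char leftover;

  ss.str(s);
  if (!(ss >> value)) return false;    // Not an integer at all.
  if (ss >> leftover) return false;    // Extra characters after the integer.
  i = value;
  return true;
}
```

    The function returns true and sets i for "8000", and returns false, leaving i alone, for "Fred" and for "8000abc".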
    
    An unfortunate part about a stringstream is that you cannot simply call the str() method to set it to another string. For example, src/argv2int.cpp is a logical extension to src/argv1int.cpp to read argv[1] into i and argv[2] into j:

    /* Here we try to read argument 1 into i and argument 2 into j, but the second
       one fails, because we haven't cleared the stringstream.  In other words, if
       you put two integers on the command line, it will fail. */
    
    #include <iostream>
    #include <cstdio>
    #include <sstream>
    using namespace std;
    
    int main(int argc, char **argv)
    {
      istringstream ss;
      int i, j;
    
      /* Error check to make sure we have exactly two command line arguments
         (argc is 3 because it counts the program name). */
    
      if (argc != 3) { 
        cerr << "usage: argv2int argument argument\n";
        return 1; 
      }
    
      /* Read the first argument into i. */
    
      ss.str(argv[1]);
      if (ss >> i) {
        printf("Argument i: %d\n", i);
      } else {
        fprintf(stderr, "Argument i -- %s is not an integer.\n", argv[1]);
        return 1;
      } 
    
      /* Read the second argument into j. */
    
      ss.str(argv[2]);
      if (ss >> j) {
        printf("Argument j: %d\n", j);
      } else {
        fprintf(stderr, "Argument j -- %s is not an integer.\n", argv[2]);
        return 1;
      } 
    
      return 0;
    }
    

    When we run it, it fails on j, even though argv[2] does represent an integer:

    UNIX> bin/argv2int 1 2
    Argument i: 1
    Argument j -- 2 is not an integer.
    UNIX> 
    
    This is because reading i consumed all of argv[1], which left the stringstream in its end-of-file state, and str() does not reset that state. You have to call clear() on the stringstream before calling ss.str(argv[2]). That's a drag -- try to remember it so that you don't get burnt by it when programming. To hammer this home, src/argvallint.cpp tests each command line argument to see if it is an integer:

    /* This reads all of the arguments on the command line to determine 
       whether each is an integer or not.  You have to call clear() on 
       the stringstream each time you set it to a new string.  */
    
    #include <iostream>
    #include <cstdio>
    #include <sstream>
    using namespace std;
    
    int main(int argc, char **argv)
    {
      istringstream ss;
      int i, j;
    
      for (i = 1; i < argc; i++) {
        ss.clear();                      // Here is the clear command.
        ss.str(argv[i]);
        if (ss >> j) {
          printf("Argument %d -- %d\n", i, j);
        } else {
          printf("Argument %d -- %s is not an integer.\n", i, argv[i]);
        }
      }
      return 0;
    }
    

    UNIX> bin/argvallint 1 2 buckle my shoe. 3 4 get out the door.
    Argument 1 -- 1
    Argument 2 -- 2
    Argument 3 -- buckle is not an integer.
    Argument 4 -- my is not an integer.
    Argument 5 -- shoe. is not an integer.
    Argument 6 -- 3
    Argument 7 -- 4
    Argument 8 -- get is not an integer.
    Argument 9 -- out is not an integer.
    Argument 10 -- the is not an integer.
    Argument 11 -- door. is not an integer.
    UNIX> 
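
    To isolate the effect of clear(), here is a small sketch that reads two strings with one istringstream, with and without the call. The function read_two and its call_clear parameter are made up for this illustration. The first read runs to the end of the string, which sets the stream's end-of-file flag, and str() replaces the characters but leaves that flag set.

```cpp
/* Read a and then b as integers using a single istringstream.
   If call_clear is true, clear() is called before the second str().
   Returns true only if both reads succeed.  read_two is a
   hypothetical function for this example. */

#include <sstream>
#include <string>
using namespace std;

bool read_two(const string &a, const string &b, bool call_clear)
{
  istringstream ss;
  int i, j;

  ss.str(a);
  if (!(ss >> i)) return false;

  /* If the read above used up all of a, then the end-of-file flag is now set.
     Without clear(), the next read fails regardless of what b holds. */

  if (call_clear) ss.clear();
  ss.str(b);
  if (!(ss >> j)) return false;
  return true;
}
```

    With these definitions, read_two("1", "2", false) returns false, which matches the behavior of bin/argv2int, and read_two("1", "2", true) returns true.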
    

    Stringstreams to create strings

    You can use the ostringstream type to create a string using functionality like cout. In this case, you simply call str() to convert the stringstream to a string. Here's an example in src/onetoten.cpp:

    /* Use an ostringstream to create a string that contains the numbers from 1 to 10. */
    
    #include <iostream>
    #include <cstdio>
    #include <sstream>
    using namespace std;
    
    int main()
    {
      ostringstream ss;
      string s;
      int i;
    
      for (i = 1; i < 11; i++) ss << i << " ";
      s = ss.str();
      cout << s << endl;
      return 0;
    }
    

    It creates a string containing the numbers 1 through 10 and then prints it out:

    UNIX> bin/onetoten
    1 2 3 4 5 6 7 8 9 10 
    UNIX> 
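
    The same idea works for building a string from any collection of values. As a sketch, the function below joins a vector of integers with single spaces and, unlike onetoten, does not put a space after the last number. The name join_ints is hypothetical.

```cpp
/* Build a string from a vector of integers using an ostringstream.
   The numbers are separated by single spaces, with no trailing space.
   join_ints is a hypothetical helper for this example. */

#include <sstream>
#include <string>
#include <vector>
using namespace std;

string join_ints(const vector <int> &v)
{
  ostringstream ss;
  size_t i;

  for (i = 0; i < v.size(); i++) {
    if (i > 0) ss << " ";
    ss << v[i];
  }
  return ss.str();
}
```

    For a vector holding 1, 2 and 3, join_ints() returns the string "1 2 3". For an empty vector it returns the empty string.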
    
    When you want to "reuse" an ostringstream, you need to call its str() method with an empty string as its argument. That resets it to be empty. Calling clear() alone is not enough: it resets the stream's error state, not its contents. If you don't reset the contents and you're not ready for it, you'll get some surprising output. For example, src/clearing-output.cpp uses a stringstream to create four strings with four numbers each. It does this twice. The first time, it doesn't call ss.str(""), and so what you see is that the stringstream keeps getting appended to, instead of getting cleared with each line. The second time, it does call ss.str(""), and you see that the stringstream is cleared properly:

    /* This program's intent is to use a single ostringstream to create four strings, 
       each of which has four numbers in it.  We do it twice -- once incorrectly, because
       we don't call str("") on the stringstream to reset it.  The second time, we do it
       correctly. */
    
    #include <iostream>
    #include <cstdio>
    #include <cstdlib>
    #include <sstream>
    using namespace std;
    
    int main()
    {
      ostringstream ss;
      int i;
      string s;
    
      /* Here we create four strings, each of which is supposed to hold four numbers,
         with an ostringstream, but we do it incorrectly. */
    
      printf("Using the stringstream incorrectly:\n\n");
    
      for (i = 0; i < 4; i++) {
        ss.clear();
        ss << 10*i << " " << 10*i+1 << " " << 10*i+2 << " " << 10*i+3;
        s = ss.str();
        cout << s << endl;
      }
      cout << endl;
    
      /* Now we do the same thing, but correctly, by calling ss.str("") before
         putting numbers into the stringstream. */
    
      printf("Using the stringstream correctly:\n\n");
    
      for (i = 0; i < 4; i++) {
        ss.clear();
        ss.str("");                        // This is the only change from the code above.
        ss << 10*i << " " << 10*i+1 << " " << 10*i+2 << " " << 10*i+3;
        s = ss.str();
        cout << s << endl;
      }
      cout << endl;
    
      return 0;
    }
    

    Here's the output -- as you can see, the first set of strings looks really odd, because the stringstream does not get reset:

    UNIX> bin/clearing-output
    Using the stringstream incorrectly:
    
    0 1 2 3
    0 1 2 310 11 12 13
    0 1 2 310 11 12 1320 21 22 23
    0 1 2 310 11 12 1320 21 22 2330 31 32 33
    
    Using the stringstream correctly:
    
    0 1 2 3
    10 11 12 13
    20 21 22 23
    30 31 32 33
    
    UNIX> 
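
    An alternative to resetting the stringstream is to declare it inside the loop, so that each iteration starts with a new, empty one. This is a sketch of that approach rather than the method used in src/clearing-output.cpp, and the function name four_lines is hypothetical.

```cpp
/* Create the same four strings as clearing-output, but with a new
   ostringstream on each iteration, so nothing needs to be reset.
   four_lines is a hypothetical function for this example. */

#include <sstream>
#include <string>
#include <vector>
using namespace std;

vector <string> four_lines()
{
  vector <string> lines;
  int i;

  for (i = 0; i < 4; i++) {
    ostringstream ss;               // A new, empty stringstream each time.
    ss << 10*i << " " << 10*i+1 << " " << 10*i+2 << " " << 10*i+3;
    lines.push_back(ss.str());
  }
  return lines;
}
```

    The tradeoff is that a stringstream is constructed and destroyed on every iteration, which is simple to get right but does a little more work than reusing one.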
    

    Getline and stringstreams - a very useful program:

    As mentioned in the lecture notes on string and vector basics, the procedure getline(cin, s) reads a line of input from standard input and puts it into the string s. Spaces are preserved. You can combine getline and stringstreams to identify and get access to individual words on each line, as in src/identify_words.cpp. This program contains some code that I use all the time to process standard input. It reads a line, and then using a stringstream, it creates a vector of the words on the line. It then prints info and the words:

    /* This program contains code that I use all the time, to read a line on 
       standard input, and then to create a vector of all of the words on the line. */
    
    #include <iostream>
    #include <cstdio>
    #include <sstream>
    #include <vector>
    using namespace std;
    
    int main()
    {
      string line;              // The line
      vector <string> sv;       // This holds the words on the current line
      int ln;                   // Line number
      size_t w;                 // Word number
      string s;                 // Helper
      istringstream ss;         // Helper
    
      ln = 0;
    
      /* Read the current line and update the line number. */
    
      while (getline(cin, line)) {
        ln++;
    
        /* Using a stringstream, create the vector of words on the line. */
    
        sv.clear();
        ss.clear();
        ss.str(line);
        while (ss >> s) sv.push_back(s);
    
        /* Print the line number, number of words, and the words. */
    
        printf("Line %d.  # Words: %lu:", ln, (unsigned long) sv.size());
        for (w = 0; w < sv.size(); w++) printf(" %s", sv[w].c_str());
        printf("\n");
      }
    
      return 0;
    }
    

    UNIX> cat data/input.txt
    There's a port on a western bay
    And it serves a hundred ships a day.
    Lonely sailors pass the time away
    And talk about their homes.
    UNIX> bin/identify_words < data/input.txt
    Line 1.  # Words: 7: There's a port on a western bay
    Line 2.  # Words: 8: And it serves a hundred ships a day.
    Line 3.  # Words: 6: Lonely sailors pass the time away
    Line 4.  # Words: 5: And talk about their homes.
    UNIX>
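
    If you find yourself using this pattern in several programs, the inner part can be packaged as a function. The sketch below assumes a hypothetical name, split_words. Because the istringstream is local to the function, it starts out fresh on every call and no clear() is needed.

```cpp
/* Return a vector of the whitespace-separated words in a line.
   split_words is a hypothetical helper for this example. */

#include <sstream>
#include <string>
#include <vector>
using namespace std;

vector <string> split_words(const string &line)
{
  istringstream ss;
  vector <string> words;
  string s;

  ss.str(line);
  while (ss >> s) words.push_back(s);
  return words;
}
```

    Inside the getline() loop, "sv = split_words(line);" would then replace the four lines that clear sv and ss and fill sv.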