I. Ways to view input in C/C++
    A. as a stream of characters
    B. as a stream of words: When treated this way, C views the input as an unbroken stream of words, separated by white space
        1. Newline characters are treated like any other white space, so line boundaries are ignored
        2. The white space characters are called 'delimiters' because they mark the beginning and end of each word.
        3. The words are often called 'tokens'. There's no good reason for this name; it's just historical
    C. as a set of lines: This is the way that you normally view input

II. Example: Suppose the user inputs the following lines:

        The quick brown
        fox jumped
        over the fence

    A. When viewed as a stream, C sees your input as:

        The quick brown fox jumped over the fence

        1. Notice that C ignored the newline characters in the input.
    B. When viewed as a set of lines, C views the input just as you entered it above

III. Formatted Input
    A. scanf: removes words sequentially from the front of the stream, tries to convert them to the indicated type, and stores them in the indicated location
    B. You can specify the maximum number of characters to read for each field. For example:

        scanf("%10s %3d", x, &y);

        The input "brad 12345678" will assign "brad" to x and 123 to y

        The input "bradvanderzanden 12" will assign "bradvander" to x and nothing to y: "zanden 12" is left in the stream, and scanf will try and fail to convert "zanden" to an integer. y keeps whatever value it had before the call (0 only if it was initialized to 0), and scanf returns 1, the number of fields it converted
    C. Useful for quick and dirty programs, but for production programs it is greatly limited by:
        1) its inability to detect newlines: Input is frequently grouped into lines, and one of the error checks you want to perform is to ensure that each line has the correct number of inputs
        2) its use of whitespace to delimit fields: the use of whitespace makes it impossible to read multi-word fields

IV. Line Input
    A. char *gets(char *buffer): reads the next line of input and stores it in buffer
        1. buffer is the return value if the read is successful, and NULL otherwise
        2. gets is dangerous because it will read until it reaches the newline character, no matter how many characters that requires. Hence your buffer may overflow, no matter how big you make it. For this reason gets was removed from the C standard in C11.
        3. the newline character is discarded and the line is terminated with the null character
    B. char *fgets(char *buffer, int size, FILE *input_stream):
        1. characters are read from the input_stream into buffer until
            a. a newline is seen,
            b. the end of file is reached, or
            c. size-1 characters are read
        2. the input is terminated with a null character; if a newline is read, then the newline character is placed directly before the null character
        3. if the read is successful, then fgets returns buffer; otherwise it returns NULL
        4. when fgets returns NULL, use feof to test whether or not EOF has actually been reached: a read can fail for a variety of other reasons unrelated to EOF
    C. ssize_t getline(char **buffer, size_t *size, FILE *input_stream):
        1. The getline function is the preferred method for reading lines of text from a stream, including standard input. The other standard functions, including gets, fgets, and scanf, are too unreliable. Note that getline comes from POSIX (originally a GNU extension) rather than from the C standard itself.
        2. The getline function reads an entire line from a stream, up to and including the next newline character. It takes three parameters:
            a) The address of a pointer to a block allocated with malloc or calloc. The pointer may also be NULL, in which case getline allocates the block itself. The block will contain the line read by getline when it returns.
            b) A pointer to a variable that specifies the size in bytes of the block of memory pointed to by the first parameter.
            c) The stream from which to read the line.
        3. The block of memory you pass to getline is merely a suggestion. The getline function will automatically enlarge the block of memory as needed, via the realloc function, so there is never a shortage of space. This is one reason why getline is so safe.
        4. getline will also tell you the new size of the block through the second parameter. Because realloc may move the block, always use the pointer that getline stores through the first parameter after the call.
        5. If an error occurs, such as end of file being reached without reading any bytes, getline returns -1. Otherwise, the first parameter will contain a pointer to the string containing the line that was read, and getline returns the number of characters read (up to and including the newline, but not the final null character).

V. Fields
    A. Problem with conventional C I/O functions: They do not break a line into its component parts
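
Returning to the field-width example in III.B, the sketch below may help make the two inputs concrete. It uses sscanf rather than scanf so that the input can be supplied as a string; the conversion rules are the same. The helper name parse_name_and_number is hypothetical and is not part of the notes or of any library.

```c
#include <stdio.h>

/* Hypothetical helper (not from the notes): applies the format string
   "%10s %3d" to one input string and returns the number of fields that
   sscanf managed to convert. */
int parse_name_and_number(const char *input, char name[11], int *number)
{
    /* %10s stores at most 10 characters plus the terminating '\0',
       so name must have room for at least 11 bytes.
       %3d consumes at most 3 characters of the integer. */
    return sscanf(input, "%10s %3d", name, number);
}
```

Called with "brad 12345678", this should return 2, with name holding "brad" and *number holding 123. Called with "bradvanderzanden 12", it should return 1: name holds "bradvander" and *number is left untouched, which is why checking the return value is the reliable way to detect the failed conversion.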
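
The fgets rules in IV.B can be sketched as a small wrapper. This is only an illustration under stated assumptions: the name read_line and the choice to strip the trailing newline are mine, not something the notes prescribe.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the notes): reads one line with fgets and
   removes the trailing newline if one was stored.
   Returns 1 on success and 0 when fgets returns NULL. */
int read_line(char *buffer, int size, FILE *input_stream)
{
    if (fgets(buffer, size, input_stream) == NULL)
        return 0;   /* EOF or a read error; feof/ferror can tell which */

    /* strcspn gives the index of the first '\n', or the length of the
       string if there is none, so this overwrites the newline when it is
       present and rewrites the existing '\0' otherwise. */
    buffer[strcspn(buffer, "\n")] = '\0';
    return 1;
}
```

Note that a line longer than size-1 characters is not lost: fgets stops early, and the rest of the line is returned by the following calls. With an 8-byte buffer, the line "this line is long" would come back in three pieces ("this li", "ne is l", "ong").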
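
The getline calling convention in IV.C is easier to follow in a loop. The sketch below assumes a POSIX system (getline is not in the C standard); the function name count_lines and the idea of tracking the longest line are hypothetical, chosen only to give the loop something to do.

```c
#define _POSIX_C_SOURCE 200809L   /* ask the headers to declare getline */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper (not from the notes): counts the lines in a stream
   and stores the length of the longest one, excluding its newline. */
long count_lines(FILE *input_stream, size_t *longest)
{
    char *buffer = NULL;   /* NULL with size 0 lets getline allocate */
    size_t size = 0;       /* getline updates this as the block grows */
    ssize_t length;
    long lines = 0;

    *longest = 0;
    while ((length = getline(&buffer, &size, input_stream)) != -1) {
        /* length includes the newline when the line has one */
        if (length > 0 && buffer[length - 1] == '\n')
            length--;
        if ((size_t)length > *longest)
            *longest = (size_t)length;
        lines++;
    }

    free(buffer);          /* the caller owns the block getline allocated */
    return lines;
}
```

The same buffer is reused on every iteration, so getline only reallocates when it meets a line longer than any seen so far, and a single free after the loop releases it.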