CS360 Lecture notes -- Strings in C

James S. Plank

Directory: /home/plank/cs360/notes/Strings-In-C

Lecture notes: http://web.eecs.utk.edu/~jplank/plank/classes/cs360/360/notes/Strings-In-C/index.html

Bitbucket: https://bitbucket.org/jimplank/cs360-lecture-notes.

Original lecture notes ("PointMalloc"): Fri Aug 31 10:39:16 EDT 2007.

Last modified: Wed Jan 17 16:45:00 EST 2018

Topcoder problems that can give you practice if you do them in C

DigitStringDiv2 -- this is an easy one to give you a little string practice.
EqualSubstrings2 -- practice with strncmp(), or perhaps with strstr().

In C, we lose the ease of C++ strings, which is a pity. There are a lot of routines to help you create and manipulate strings in C. I go over many of them here. One important and inconvenient thing about C strings is that you have to manage your own memory, and that can lead to many pitfalls. One goal of this lecture is to help you avoid those pitfalls.

strcpy()

char *strcpy(char *s1, const char *s2);

Strcpy() assumes that s2 is a null-terminated string, and that s1 is a (char *) with enough characters to hold s2, including the null character at the end. Strcpy() then copies s2 to s1. It also returns s1. Why would you return your first argument? The answer is historical -- I'll talk about it with strdup().

Here's a simple program that uses strcpy() to initialize three strings and print them out (this is in src/strcpy.c):

For those unfamiliar with "Give Him Six!", please see this, this or this.

/* Initialize three strings using strcpy() and print them. */

#include <stdio.h>
#include <string.h>

int main()
{
  char give[5];
  char him[5];
  char six[5];

  strcpy(give, "Give");
  strcpy(him, "Him");
  strcpy(six, "Six!");

  printf("%s %s %s\n", give, him, six);
  return 0;
}

It runs fine:

UNIX> bin/strcpy
Give Him Six!
UNIX>

Suppose I try to copy a string that's too big. For example, look at src/strcpy2.c:

/* What happens when you call strcpy and didn't allocate enough memory? */

#include <stdio.h>
#include <string.h>

typedef unsigned long UL;

int main()
{
  char give[5];
  char him[5];
  char six[5];

  /* Print the addresses of the three arrays. */

  printf("give: 0x%lx  him: 0x%lx  six: 0x%lx\n", (UL) give, (UL) him, (UL) six);

  /* This is the same as before -- nice strcpy() statements, and then print. */

  strcpy(give, "Give");
  strcpy(him, "Him");
  strcpy(six, "Six!");
  printf("%s %s %s\n", give, him, six);

  /* Now, this strcpy() is copying a string that is too big. */

  strcpy(him, "T.J. Houshmandzadeh");
  printf("%s %s %s\n", give, him, six);

  return 0;
}

Clearly there's a problem with this -- the string "T.J. Houshmandzadeh" is much larger than five characters. Some compilers, like the one on my new Macintosh, will compile this, but others, like the one on my old Macintosh, will take issue with it:

UNIX> gcc -o bin/strcpy2 src/strcpy2.c
src/strcpy2.c: In function 'main':
src/strcpy2.c:21: warning: call to __builtin___strcpy_chk will always overflow destination buffer
UNIX>

That's a wise compiler. However, compilers are not all-seeing and all-knowing. Let's fool it by writing our own wrapper around strcpy() -- now it can't figure out the problem. The code is in src/strcpy3.c.

/* This is the same as strcpy2.c, but I write a procedure to call strcpy(), so that
   even a smart compiler won't figure out that I have a problem. */

#include <stdio.h>
#include <string.h>

typedef unsigned long UL;

void my_strcpy(char *s1, char *s2)
{
  strcpy(s1, s2);
}

int main()
{
  char give[5];
  char him[5];
  char six[5];

  printf("give: 0x%lx  him: 0x%lx  six: 0x%lx\n", (UL) give, (UL) him, (UL) six);

  strcpy(give, "Give");
  strcpy(him, "Him");
  strcpy(six, "Six!");

  printf("%s %s %s\n", give, him, six);

  my_strcpy(him, "T.J. Houshmandzadeh");

  printf("%s %s %s\n", give, him, six);
  return 0;
}

Now run it. Your memory addresses may differ, and your output may differ, but the interrelationship will be the same. I ran this in 32-bit mode on my old Mac:

UNIX> bin/strcpy3
give: 0xbfffe060  him: 0xbfffe050  six: 0xbfffe040
Give Him Six!
deh T.J. Houshmandzadeh Six!
UNIX>

Take a minute and try to figure out what's going on. Look at the following picture of memory -- I'm drawing this in big-endian, because it makes the character strings easier to parse. When we start, space has been allocated for give, him and six:

                    |----4 bytes----|           
                    |               |
                    | 0 | 1 | 2 | 3 | (I'm drawing this in big endian)
               
                    |               |           
     six----------> |               | 0xbfffe040
                    |               | 0xbfffe044
                    |               | 0xbfffe048
                    |               | 0xbfffe04c
     him----------> |               | 0xbfffe050
                    |               | 0xbfffe054
                    |               | 0xbfffe058
                    |               | 0xbfffe05c
     give---------> |               | 0xbfffe060
                    |               | 0xbfffe064
                    |               | 0xbfffe068
                    |               | 0xbfffe06c

Now, we make the first three strcpy() calls. At the point of the first printf() statement, memory looks like:

                    |----4 bytes----|           
                    |               |
                    | 0 | 1 | 2 | 3 | (I'm drawing this in big endian)
               
     six----------> |'S'|'i'|'x'|'!'| 0xbfffe040
                    | 0 |   |   |   | 0xbfffe044
                    |   |   |   |   | 0xbfffe048
                    |   |   |   |   | 0xbfffe04c
     him----------> |'H'|'i'|'m'| 0 | 0xbfffe050
                    |   |   |   |   | 0xbfffe054
                    |   |   |   |   | 0xbfffe058
                    |   |   |   |   | 0xbfffe05c
     give---------> |'G'|'i'|'v'|'e'| 0xbfffe060
                    | 0 |   |   |   | 0xbfffe064
                    |               | 0xbfffe068
                    |               | 0xbfffe06c

Now, we make the call strcpy(him, "T.J. Houshmandzadeh"). What happens is that the entire string is copied to him, and this overruns the memory allocated for give:

                    |----4 bytes----|           
                    |               |
                    | 0 | 1 | 2 | 3 | (I'm drawing this in big endian)
               
     six----------> |'S'|'i'|'x'|'!'| 0xbfffe040
                    | 0 |   |   |   | 0xbfffe044
                    |   |   |   |   | 0xbfffe048
                    |   |   |   |   | 0xbfffe04c
     him----------> |'T'|'.'|'J'|'.'| 0xbfffe050
                    |' '|'H'|'o'|'u'| 0xbfffe054
                    |'s'|'h'|'m'|'a'| 0xbfffe058
                    |'n'|'d'|'z'|'a'| 0xbfffe05c
     give---------> |'d'|'e'|'h'| 0 | 0xbfffe060
                    | 0 |   |   |   | 0xbfffe064
                    |               | 0xbfffe068
                    |               | 0xbfffe06c

So this means that him is indeed "T.J. Houshmandzadeh", but give has been modified as well, to be "deh". This accounts for the printout of:

deh T.J. Houshmandzadeh Six!

The bottom line is that when you modify memory that you have not allocated (as I did when I called strcpy(him, "T.J. Houshmandzadeh");), then strange things will happen. They have explanations, but until you figure it out, it will be confusing. If you're lucky, you get a segmentation violation or a bus error. If you're unlucky, you get weird, inexplicable output. A corollary of this is that when you get a segmentation violation, a bus error, or weird, inexplicable output, then chances are you have modified memory that you didn't allocate.

Here's the output on my Mac in 2021 -- I may well make this a clicker question, but see if you can figure out the output here.

UNIX> bin/strcpy3
give: 0x7ffeeea63197  him: 0x7ffeeea63192  six: 0x7ffeeea6318d
Give Him Six!
Houshmandzadeh T.J. Houshmandzadeh Six!
UNIX>

strcat()

char *strcat(char *s1, const char *s2);

Strcat() assumes that s1 and s2 are both null-terminated strings. Strcat() then concatenates s2 to the end of s1. Like strcpy(), it returns s1. Strcat() assumes that there is enough space in s1 to hold these extra characters. Otherwise, you'll start stomping over memory that you didn't allocate. Here is a simple example (this is in src/strcat.c):

/* Using strcpy() and strcat() to create the string "Give Him Six!" incrementally. */

#include <stdio.h>
#include <string.h>

int main()
{
  char givehimsix[15];

  strcpy(givehimsix, "Give");
  printf("%s\n", givehimsix);
  strcat(givehimsix, " Him");
  printf("%s\n", givehimsix);
  strcat(givehimsix, " Six!");
  printf("%s\n", givehimsix);
  return 0;
}

The output is predictable:

UNIX> bin/strcat
Give
Give Him
Give Him Six!
UNIX>

Look at src/strcat2.c. Can you explain why the output is the way that it is? Try filling in memory as we did in the strcpy2 example above.

UNIX> bin/strcat2
give: 0xbfffe060  him: 0xbfffe050  six: 0xbfffe040
Give Him Six!
deh T.J. Houshmandzadeh Six!
deh Help! T.J. Houshmandzadeh Help! Six!
UNIX>

C-style strings are a little more difficult to handle than C++ style strings. For example, suppose you wanted to create a string with a given number of j's. In C++, you might write the following (src/makej.cpp):

/* Create a string with a given number of j's by using string concatenation. */

#include <iostream>
#include <cstdio>
#include <cstdlib>
using namespace std;

int main(int argc, char **argv)
{
  int i, n;
  string s;

  if (argc != 2) { fprintf(stderr, "usage: makej number\n"); exit(1); }
  n = atoi(argv[1]);

  for (i = 0; i < n; i++) s += "j";    // Here is the string concatenation.
  cout << s << endl;
  return 0;
}

Suppose you want to write the equivalent in C. It's a little more difficult, as you need to call malloc() first, to allocate the string. However, here it is (src/strcat3.c):

/* Trying to use strcat() like C++ string concatenation. */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
  char *s;
  int i;
  int n;

  if (argc != 2) { fprintf(stderr, "usage: strcat3 number\n"); exit(1); }

  n = atoi(argv[1]);
  s = (char *) malloc(sizeof(char)*(n+1));
  strcpy(s, "");

  for (i = 0; i < n; i++) strcat(s, "j");  /* Here's the strcat() call, which is really inefficient. */
  
  printf("%s\n", s);
  return 0;
}

When you run them on small numbers, they appear equivalent:

UNIX> bin/makej 50
jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj
UNIX> bin/strcat3 50
jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj
UNIX>

However, try them on a really big number. Here, I'm going to redirect standard output to /dev/null, which throws it away, and I'm going to time it with time:

UNIX> time sh -c "bin/makej 1000 > /dev/null"
0.002u 0.004s 0:00.01 0.0%	0+0k 0+0io 0pf+0w                          # Blink of an eye.
UNIX> time sh -c "bin/makej 10000 > /dev/null"
0.002u 0.004s 0:00.00 0.0%	0+0k 0+0io 0pf+0w                          # Blink of an eye.
UNIX> time sh -c "bin/makej 100000 > /dev/null"
0.004u 0.004s 0:00.01 0.0%	0+0k 0+0io 0pf+0w                          # Blink of an eye.
UNIX> time sh -c "bin/strcat3 1000 > /dev/null"
0.002u 0.004s 0:00.00 0.0%	0+0k 0+0io 0pf+0w                          # Blink of an eye.
UNIX> time sh -c "bin/strcat3 10000 > /dev/null"
0.039u 0.004s 0:00.04 75.0%	0+0k 0+0io 0pf+0w                          # A little slower
UNIX> time sh -c "bin/strcat3 100000 > /dev/null"
3.468u 0.005s 0:03.47 99.7%	0+0k 0+0io 0pf+0w                          # Nearly 100 times slower!
UNIX>

See the problem? The C++ string maintains the string's length, so concatenation is fast. In contrast, strcat() has to find the end of the string at each call, which makes the program O(n²). We can fix it, since we know where the end of the string is. This is in strcat4.c:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
  char *s;
  int i;
  int n;

  if (argc != 2) { fprintf(stderr, "usage: strcat4 number\n"); exit(1); }

  n = atoi(argv[1]);
  s = (char *) malloc(sizeof(char)*(n+1));
  strcpy(s, "");

  for (i = 0; i < n; i++) strcat(s+i, "j");  /* The only changed line */
  
  printf("%s\n", s);
  return 0;
}

UNIX> bin/strcat4 50
jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj
UNIX> time sh -c "bin/strcat4 100000 > /dev/null"
0.003u 0.004s 0:00.01 0.0%	0+0k 0+0io 0pf+0w                         # Back to the blink of an eye
UNIX>

Such is life in C.

strlen()

size_t strlen(const char *s);

Strlen() assumes that s is a null-terminated string. It returns the number of characters before the null character. Strlen() is pretty obvious: (this is in src/strlen.c):

#include <stdio.h>
#include <string.h>

int main()
{
  char give[5];
  char him[5];
  char six[5];

  strcpy(give, "Give");
  strcpy(him, "Him");
  strcpy(six, "Six!");

  printf("%s %s %s\n", give, him, six);
  printf("%zu %zu %zu\n", strlen(give), strlen(him), strlen(six));
  return 0;
}

Output:

UNIX> bin/strlen
Give Him Six!
4 3 4

strcmp() and strncmp()

int strcmp(const char *s1, const char *s2);                     # We use ints as bools in C.
int strncmp(const char *s1, const char *s2, size_t n);

Strcmp() performs a lexicographic comparison of two strings. It returns 0 if they are equal, a negative number if s1 is less than s2, and a positive number otherwise. You will use strcmp() quite a bit in this class, because it's the easiest way to compare two strings.

Strncmp() stops comparing after n characters, if the null character has not been reached yet. It's a good exercise for you to do the D2 250-point problem from Topcoder SRM 683 as a standalone program in C, using strncmp() and strlen() rather than the C++ string library. I'll probably do it in class.

strchr()

char *strchr(const char *s, int c);

Strchr() is how you perform "find" for single characters in C strings. It assumes that s is a null-terminated string. The argument c is an integer, but it is treated as a character. Strchr() returns a pointer to the first occurrence of the character equal to c in s. If s does not contain c, then it returns NULL.

Here is a simple program that prints out whether each line of standard input contains a space (this is in src/strchr.c):

/* Use strchr() to determine if each line of standard input has a space. */

#include <stdio.h>
#include <string.h>

int main()
{
  char line[100];
  char *ptr;

  while (fgets(line, 100, stdin) != NULL) {
    ptr = strchr(line, ' ');
    if (ptr == NULL) {
      printf("No spaces\n");
    } else {
      printf("Space at character %ld\n", ptr-line);
    }
  }
  return 0;
}

Since you haven't seen fgets() before, go ahead and read the man page. The arguments are a buffer of chars, the size of the buffer, and a "stream" from which to read. stdin is a global variable, defined in stdio.h, that specifies to read from standard input. fgets() reads a line of text from the stream, stopping early if the line has more characters than fit in the buffer (it reads at most one fewer than the size, to leave room for the null character). It will include the newline at the end of the line, which is often a pain. Not so here, though.

I'm doing a little pointer arithmetic here -- ptr-line returns the number of characters between line and ptr. Here's an example of this running:

UNIX> bin/strchr
Jim
No spaces
Jim Plank
Space at character 3
James Plank
Space at character 5
 HI!
Space at character 0
     HI!!
Space at character 0
<CNTL-D>
UNIX>

We can modify this to print out where all the spaces are. Check out strchr2.c:

UNIX> bin/strchr2
Jim
No spaces
Jim Plank
Space at character 3
Jim  Plank
Space at character 3
Space at character 4
  Give   Him   Six!!!
Space at character 0
Space at character 1
Space at character 6
Space at character 7
Space at character 8
Space at character 12
Space at character 13
Space at character 14
<CNTL-D>
UNIX>

Go over the code -- why do I say

        ptr = strchr(ptr+1, ' ');

instead of

        ptr = strchr(ptr, ' ');

If you don't know, copy the code, modify it, and see for yourself!

If you want to find substrings rather than single characters, use strstr() (read the man page).

Scanf()

Scanf() is like printf() in that it takes a format string and some parameters. However, instead of writing the parameters to the terminal, it reads from the terminal (or whatever is standard input). Where scanf() confuses people is that there are no reference variables in C, so you have to use pointers. If you put "%d" in the format string, then scanf() will read an integer. The parameter that you have to pass is a pointer to the integer that you want read. The storage for the integer has to exist. Scanf() will read the integer from standard input, and will fill in the four bytes of the integer.

Here's a simple example in src/scanf1.c:

/* Read a single integer from standard input using scanf. */

#include <stdio.h>
#include <stdlib.h>

int main()
{
  int i;
  
  if (scanf("%d", &i) == 1) {
    printf("Just read i: %d (0x%x)\n", i, i);
  } else {
    printf("Scanf() failed for some reason.\n");
  }
  exit(0);
}

I have one integer, i. That's four bytes. They are located at i's pointer: &i. When I call scanf(), I say to read an integer from standard input, and fill in those four bytes with that integer. Scanf() returns the number of successful reads that it did. If our read is successful, the program prints i in decimal and in hexadecimal.

UNIX> bin/scanf1
10
Just read i: 10 (0xa)
UNIX> bin/scanf1
Fred
Scanf() failed for some reason.
UNIX> bin/scanf1
15.999999999999
Just read i: 15 (0xf)
UNIX> bin/scanf1
-15.99999999999999
Just read i: -15 (0xfffffff1)
UNIX> bin/scanf1
<CNTL-D>
Scanf() failed for some reason.
UNIX> echo "" | bin/scanf1
Scanf() failed for some reason.
UNIX> echo 15fred | bin/scanf1
Just read i: 15 (0xf)
UNIX>

Let's go over these examples.

The first successfully reads 10.
In the second, I didn't enter a number, so the scanf() call made no matches.
In the third, scanf() stops reading when it decides that the input is no longer an integer. In this case, that's at the decimal point, so it successfully reads 15.
The same thing happens in the fourth case -- scanf() is not rounding off -- it's simply reading text until it decides that it's no longer reading a number.
In the fifth case, I type <CNTL-D>, which ends standard input. scanf() in this case returns EOF (defined in stdio.h). It is a negative number, so our program prints that scanf() failed.
In the sixth case, I use the program echo to provide standard input. In this case, echo produces a blank line, so scanf() returns EOF again.
Finally, the last case uses echo again to show that scanf() will successfully read the 15.

(I usually skip this program in class, but enjoy reading about it here if you're interested.)

The program scanf2.c is buggy.

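/* This program is buggy: i is an uninitialized pointer, so scanf() writes
   to whatever address i happens to hold. */

#include <stdio.h>
#include <stdlib.h>

int main()
{
  int *i;

  printf("i = 0x%lx\n", (unsigned long) i);
  if (scanf("%d", i) == 1) {
    printf("Just read i: %d (0x%x)\n", *i, *i);
  } else {
    printf("Scanf() failed for some reason.\n");
  }
  exit(0);
}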

It will compile (although some nosy compilers will figure out it's buggy and yell at you). Whether the bug manifests or not is a matter of luck. Here's the program on my Mac in 2015:

UNIX> echo 10 | bin/scanf2
i = 0x7fff5fc01052
Bus error
UNIX>

What happened? The answer is that i is an uninitialized variable. It randomly started with a value of 0x7fff5fc01052. When scanf() tried to stuff the value 10 into those four bytes, a hardware error was generated -- that's the bus error. If you're lucky, when your program has uninitialized variables, they lead to segmentation violations and bus errors. If you're unlucky, they won't, and you don't discover your bug until (potentially much) later.

Just to test on some other machines, here it is on my Raspberry Pi in 2018:

pi@raspberrypi:~/CS360/cs360-lecture-notes/CStuff$ echo 10 | bin/scanf2
i = 0x0
Segmentation fault
pi@raspberrypi:~/CS360/cs360-lecture-notes/CStuff$

The fact that i was zero is good here -- the segmentation violation clues us in to the fact that there is a bug.

In 2018, my Mac gave me the disaster output:

UNIX> echo 10 | bin/scanf2
i = 0x7fff57c662a0
Just read i: 10 (0xa)
UNIX>

The variable i just happens to be a legal and aligned address. The value 10 has been stuffed into bytes 0x7fff57c662a0 to 0x7fff57c662a3. Who knows what that is in my program. The fact that my program simply exits means that this bug is benign, but if I were to have lots more going on in my program, this bug would be extremely difficult to figure out. The reason is that when the error manifests, it will be much later in the program, when some other part of the program is using addresses 0x7fff57c662a0 to 0x7fff57c662a3. This is why it pays to be careful when you are programming.

Strings and scanf

As we know, a string in C is an array of chars. Recall, a char is a one-byte integer, which on most machines means that it has values between -128 and 127. The values from 0 to 127 are the ASCII character codes, many of which are printable characters, with zero equaling the "null" character. A string is an array of chars that ends with the null character. The following program (src/scanf3.c) uses scanf() to read a string from standard input, and then to print the individual characters:

/* This program uses scanf and %s to read a string and print out the characters.
   You should *only* use scanf and %s if you are guaranteed that the string you are
   reading will not be bigger than the memory allocated to it.  Otherwise, you expose
   yourself to a buffer overflow attack. */

#include <stdio.h>
#include <stdlib.h>

int main()
{
  char s[10];
  int i;
  
  if (scanf("%s", s) != 1) exit(0);

  for (i = 0; s[i] != '\0'; i++) {
    printf("Character: %d: %3d %c\n", i, s[i], s[i]);
  }
  exit(0);
}

Since an array variable like s is equivalent to a pointer to the first element, we do not have to pass &s to scanf() -- we simply pass s.

This program allows us to see the ASCII character codes for the characters in the string "Jim-Plank":

UNIX> echo "Jim-Plank" | bin/scanf3
Character: 0:  74 J
Character: 1: 105 i
Character: 2: 109 m
Character: 3:  45 -
Character: 4:  80 P
Character: 5: 108 l
Character: 6:  97 a
Character: 7: 110 n
Character: 8: 107 k
UNIX>

Scanf() with strings is problematic. In particular, think about what happens when you enter a string with 10 or more characters. Memory will get stomped on, just like the strcpy() and strcat() examples above with "T.J. Houshmandzadeh". For example, let's send a string with 80,000 'j' characters to bin/scanf3:

UNIX> bin/makej 80000 | bin/scanf3
Segmentation fault: 11
UNIX>

We were lucky to get a segmentation violation -- allowing your input to stomp on your memory is the heart of what's called a "buffer overflow attack". Using scanf() with strings is a very good way to expose yourself to a buffer overflow attack, unless you can guarantee that your input actually behaves. Using fgets() and subsequently calling sscanf() is a safer way to go.

(I often skip this, too.)

Sscanf()

Sscanf() is just like scanf(), except it takes an additional string as its first parameter, and it "reads" from that string instead of from standard input. It returns the number of correct matches that it made. Thus, it is quite convenient for converting strings to integers and doubles. It is far superior to atoi() and atof() because it lets you know when it fails, which is quite important.

Here's an example program that reads lines of text from standard input, and attempts to convert them to ints and doubles. It is in src/sscanf1.c:

#include <stdio.h>

int main()
{
  char buf[1000];
  int i, h;
  double d;

  while (fgets(buf, 1000, stdin) != NULL) {
    if (sscanf(buf, "%d", &i) == 1) {
      printf("When treated as an integer, the value is %d\n", i);
    } 
    if (sscanf(buf, "%x", &h) == 1) {
      printf("When treated as hex, the value is 0x%x (%d)\n", h, h);
    } 
    if (sscanf(buf, "%lf", &d) == 1) {
      printf("When treated as a double, the value is %lf\n", d);
    }
    if (sscanf(buf, "0x%x", &h) == 1) {
      printf("When treated as a hex with 0x%%x formatting, the value is 0x%x (%d)\n", h, h);
    }
    printf("\n");
  }
}

Here is an example of it running.

UNIX> bin/sscanf1
10
When treated as an integer, the value is 10
When treated as hex, the value is 0x10 (16)
When treated as a double, the value is 10.000000

55.9
When treated as an integer, the value is 55
When treated as hex, the value is 0x55 (85)
When treated as a double, the value is 55.900000

.5679
When treated as a double, the value is 0.567900

a 
When treated as hex, the value is 0xa (10)

0x10
When treated as an integer, the value is 0
When treated as hex, the value is 0x10 (16)
When treated as a double, the value is 16.000000
When treated as a hex with 0x%x formatting, the value is 0x10 (16)

UNIX>

The first four inputs should be straightforward. The last one is a little confusing, and the man page on sscanf() is not much help. What's happening is that %x accepts the same text that strtoul() accepts with a base of 16, and that includes an optional "0x" prefix. Similarly, %lf accepts what strtod() accepts, which since C99 includes hexadecimal numbers like "0x10". In contrast, %d reads decimal only, so it reads the "0" and stops at the 'x'. Older, pre-C99 libraries may not handle the "0x" with %lf, so I wouldn't rely on that one.

Strdup()

You'll be seeing more of strdup() in the Fields lecture, but I'll mention it now. The prototype of strdup() is:

char *strdup(const char *s);

It is basically implemented as follows:

char *strdup(const char *s)
{
  return strcpy(malloc(strlen(s)+1), s);
}

In other words, it makes a copy of the string, allocating memory for the copy. Since it calls malloc(), if you are finished with the copy, you should call free() on it, to avoid memory leaks. See how it uses the return value of strcpy() that we all ignore? That's the only time you'll see that return value used. Again, we'll see more of strdup() in the Fields lecture.

Other useful procedures

I don't go over these, but you'll use them from time to time. It's good to be aware of them. Read their man pages.

strrchr() -- finds the last occurrence of a character.
strstr() -- finds a substring.
strcasestr() -- finds a substring but ignores case.
strsep() -- helps you break up strings with delimiters.
strncpy() -- does a restricted strcpy.
memcpy() -- copies one region of memory into another.
memcmp() -- does a byte-by-byte comparison of two regions of memory.
bzero() -- sets a region of memory so that every byte is zero.