CS360 Lecture notes -- Fork


Getpid, getppid

A process is an executing instance of a program. In Unix, you can make a listing of all the processes that you are currently running with the ps command. Here's an example on my MacBook Pro in 2012:
UNIX> ps x
  PID   TT  STAT      TIME COMMAND
  107   ??  Ss     1:01.26 /sbin/launchd
  111   ??  U      7:43.22 /System/Library/CoreServices/Dock.app/Contents/MacOS/Dock
  112   ??  S      5:26.09 /System/Library/CoreServices/SystemUIServer.app/Contents/MacOS/SystemUIServer
  117   ??  S      0:00.06 /usr/sbin/pboard
  119   ??  S      1:43.76 /System/Library/Frameworks/ApplicationServices.framework/Frameworks/ATS.framework/Support/fontd
  124   ??  S      0:23.31 /usr/libexec/UserEventAgent -l Aqua
  131   ??  S      0:13.61 /System/Library/CoreServices/AirPort Base Station Agent.app/Contents/MacOS/AirPort Base Station Agent -l
  283   ??  S      0:48.32 /System/Library/Services/AppleSpell.service/Contents/MacOS/AppleSpell -psn_0_167977
  791   ??  S      0:06.79 /Applications/iTunes.app/Contents/MacOS/iTunesHelper.app/Contents/MacOS/iTunesHelper -psn_0_286790
 9500   ??  S      0:05.05 /usr/bin/ssh-agent -l
10131   ??  R      2:47.86 /Applications/Utilities/Terminal.app/Contents/MacOS/Terminal -psn_0_25864361
26080   ??  S      0:00.01 /usr/libexec/ApplicationFirewall/Firewall
54614   ??  S      2:16.93 /Applications/Mail.app/Contents/MacOS/Mail -psn_0_26052823
55508   ??  S      0:00.87 /System/Library/Image Capture/Support/Image Capture Extension.app/Contents/MacOS/Image Capture Extension
55511   ??  S      6:01.32 /Applications/iTunes.app/Contents/MacOS/iTunes -psn_0_26093793
55554   ??  S      0:11.89 /Applications/Preview.app/Contents/MacOS/Preview -psn_0_26114278
57032   ??  S      1:37.95 /Applications/Safari.app/Contents/MacOS/Safari -psn_0_26339613
59738   ??  S      2:14.87 /Applications/OpenOffice.org.app/Contents/MacOS/soffice -psn_0_26700149
59908   ??  SNs    0:00.60 /System/Library/Frameworks/CoreServices.framework/Frameworks/Metadata.framework/Versions/A/Support/mdwor
60122   ??  S      0:02.33 /System/Library/PrivateFrameworks/WebKit2.framework/WebProcess.app/Contents/MacOS/WebProcess /System/Lib
60323   ??  S      0:00.29 /System/Library/CoreServices/Apple80211Agent.app/Contents/MacOS/Apple80211Agent -psn_0_26777992
84097   ??  Ss     0:06.57 /System/Library/PrivateFrameworks/DiskImages.framework/Resources/diskimages-helper -uuid 316BFEA2-505A-4
85810   ??  S     11:08.25 /System/Library/CoreServices/Finder.app/Contents/MacOS/Finder
93979   ??  S     18:17.94 /Applications/Adobe Photoshop CS3/Adobe Photoshop CS3.app/Contents/MacOS/Adobe Photoshop CS3 -psn_0_2489
93983   ??  Ss     0:08.07 /Library/Application Support/FLEXnet Publisher/Service/11.03.005/FNPLicensingService -r
60317 s000  Ss     0:00.02 login -pfq plank /bin/tcsh
60318 s000  S      0:00.04 -tcsh
60335 s000  S+     0:00.12 vi lecture.html
60348 s001  Ss     0:00.01 login -pfq plank /bin/tcsh
60349 s001  S      0:00.03 -tcsh
60358 s001  R+     0:00.00 ps x
UNIX>
If you try "ps x", you should see something like the above. The first column of numbers are the "Process ID" numbers, which are called "pid"s. Every process has a non-negative pid, which is unique to that process on that machine while the process is executing. There are two special processes with pids 0 and 1: the scheduler and the init program (on my Mac, init is called launchd). You can see them using ps aux: (on some machines, you'll see the init process as "/etc/init", and the scheduler as "sched". Also, some machines use different options for ps -- if "ps aux" doesn't work, try "ps -ef").
UNIX> ps aux
USER       PID  %CPU %MEM      VSZ    RSS   TT  STAT STARTED      TIME COMMAND
...
plank    93979   0.7  2.3   925224 192684   ??  S    Mon02PM  18:19.27 /Applications/Adobe Photoshop CS3/Adobe Photoshop CS3.app/Co
root        40   0.5  0.0  2460588   1120   ??  Ss    3Feb12  27:40.23 /usr/libexec/hidd
plank    10131   0.4  0.3  2803060  23696   ??  R    Tue11AM   2:49.99 /Applications/Utilities/Terminal.app/Contents/MacOS/Terminal
root         1   0.1  0.0  2475280   1004   ??  Ss    3Feb12  44:36.79 /sbin/launchd
plank    60318   0.0  0.0  2439404   1220 s000  S     4:27PM   0:00.04 -tcsh
root     60317   0.0  0.0  2436216   1592 s000  Ss    4:27PM   0:00.02 login -pfq plank /bin/tcsh
...
UNIX>
These processes are always running while the machine is running.

Every process has a "parent" process. This is the process that created it. You can get your pid and your parent's pid by using the getpid() and getppid() commands. The program src/showpid.c is a simple program that prints out its pid, and its parent's pid:

/* This program prints its process' pid, and its parent process' pid. */

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
  printf("My pid = %d.  My parent's pid = %d\n", getpid(), getppid());
  return 0;
}

Each time you run it, it will show a different pid, but the same parent pid:

UNIX> bin/showpid
My pid = 854.  My parent's pid = 381
UNIX> bin/showpid
My pid = 855.  My parent's pid = 381
UNIX> bin/showpid
My pid = 856.  My parent's pid = 381
UNIX> bin/showpid
My pid = 857.  My parent's pid = 381
UNIX> ps x
...
  381 p2 S     0:01 -sh (csh)
...
UNIX>
As you can see, the csh is the parent of each of these. This is because you typed the commands into the csh, which then created the showpid processes.

Fork

In Unix, all processes are created with the system call fork(). This is without exception. What fork() does is the following:

It creates a new process which is a copy of the calling process. That means that it copies the caller's memory (code, globals, heap and stack), registers, and open files. The calling process returns from fork() with the pid of the newly created process (which is called the "child" process. The calling process is called the "parent" process). The newly created process, as it is a duplicate of the parent, also returns from the fork() call (this is because it is a duplicate -- it has the same memory and registers, and thus the same stack pointer, frame pointer, and program counter, and thus has to return from the fork() call). It returns with a value of zero. This is how you know what process you're in when fork() returns.

Look at src/simpfork.c:

/* A simple program to show how fork() duplicates the calling process. */

#include <stdio.h>
#include <unistd.h>

int main()
{
  int i;

  printf("simpfork: pid = %d\n", getpid());
  i = fork();
  printf("Did a fork.  It returned %d.  getpid = %d.  getppid = %d\n",
    i, getpid(), getppid());
  return 0;
}

When it is executed, the following happens:

UNIX> bin/simpfork
simpfork: pid = 914
Did a fork.  It returned 915.  getpid = 914.  getppid = 381
Did a fork.  It returned 0.  getpid = 915.  getppid = 914
UNIX>

So, what is going on? When simpfork is executed, it has a pid of 914. Next it calls fork() creating a duplicate process with a pid of 915. The parent gains control of the CPU, and returns from fork() with a return value of the 915 -- this is the child's pid. It prints out this return value, its own pid, and the pid of csh, which is still 381. Then it exits. Next, the child gets the CPU and returns from fork() with a value of 0. It prints out that value, its pid, and the pid of the parent.

Note, there is no guarantee which process gains control of the CPU first after a fork(). It could be the parent, and it could be the child. When I executed simpfork a second time, the child got control first:

UNIX> bin/simpfork 
simpfork: pid = 928
Did a fork.  It returned 0.  getpid = 929.  getppid = 928
Did a fork.  It returned 929.  getpid = 928.  getppid = 381
UNIX>
(On some machines, it does appear that the child always gets control first, but you should not rely on such a fact when writing code).
Now, look at src/simpfork2.c:

/* In this program, the child sleeps longer than the parent, before exiting, 
   so that we can see what happens when the parent exits first. */

#include <stdio.h>
#include <unistd.h>

int main()
{
  int i;

  i = fork();
  if (i == 0) {
    printf("Child.  getpid() = %d, getppid() = %d\n", getpid(), getppid());
    sleep(5);
    printf("After sleeping.  getpid() = %d, getppid() = %d\n", getpid(), getppid());
  } else {
    sleep(1);
    printf("Parent exiting now\n");
  }
   return 0;   
}

After calling fork(), the parent exits after 1 second, and the child exits after 5 seconds. Before exiting, the child prints out its pid and its parent's pid. What is subtle about this? Well, by the time the child wakes up from sleeping, the parent has exited, and its process is no longer there. Thus, getppid() can't return the old value of the parent's pid, as that pid is no longer valid. What Unix does in these cases is transfer the parentage of the child's program. Specifically, when the parent of a program exits, the init program (pid 1) becomes its parent. Thus, when the child prints out its parent after sleeping, it prints out pid 1:

UNIX> bin/simpfork2
Child.  getpid() = 1301, getppid() = 1300
Parent exiting now
UNIX> After sleeping.  getpid() = 1301, getppid() = 1
Note that the "UNIX>" prompt returns once the parent returns, even though the child is still running. This is because csh waits only for the parent to complete, not for any other processes.

The name for a child process whose parent has exited is an orphan. Sometimes you'll hear that "orphans are inherited by init." That means that when a process' parent exits, init becomes its new parent.


src/simpfork3.c is a another simple program, to show that the parent's address space is indeed copied to the child during the fork:

/* This program has the child change the values of two variables -- local variable j
   and global variable K, to show that the child's changes do not affect the parent. */

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int K;

int main()
{
  int i;
  int j;

  j = 200;
  K = 300;

  printf("Before forking: j (Address 0x%lx) = %d.   K (Address 0x%lx) = %d\n", 
         (long unsigned int) &j, j, (long unsigned int) &K, K);

  i = fork();

  if (i > 0) {  /* Delay the parent */
    sleep(1);
    printf("Parent after:   j (Address 0x%lx) = %d.   K (Address 0x%lx) = %d\n", 
         (long unsigned int) &j, j, (long unsigned int) &K, K);

  } else {
    j++;
    K++;
    printf("Child after:    j (Address 0x%lx) = %d.   K (Address 0x%lx) = %d\n", 
         (long unsigned int) &j, j, (long unsigned int) &K, K);
  }
  return 0;
} 

After the fork, each process has memory locations for j (a local variable) and K (a global variable). Thus, when the child changes j to 201 and K to 301, it only affects its values of j and K, and not the parent's:

UNIX> bin/simpfork3
Before forking: j (Address 0x7ffc767e88b8) = 200.   K (Address 0x601048) = 300
Child after:    j (Address 0x7ffc767e88b8) = 201.   K (Address 0x601048) = 301
Parent after:   j (Address 0x7ffc767e88b8) = 200.   K (Address 0x601048) = 300
UNIX> 
You'll note that j and K have the same addresses in the parent and the child. But these are different memory locations on the machine, because the operating system and hardware cooperate so that addresses that are relative to a process are translated at runtime to other addresses on the machine. That is why the two processes have the same addresses, but different memory locations and different values.

Interestingly, if we redirect the output of simpfork3 to a file, we see the following odd behavior:

UNIX> bin/simpfork3 > output.txt
UNIX> cat output.txt
Before forking: j (Address 0x7ffdb989bd88) = 200.   K (Address 0x601048) = 300
Child after:    j (Address 0x7ffdb989bd88) = 201.   K (Address 0x601048) = 301
Before forking: j (Address 0x7ffdb989bd88) = 200.   K (Address 0x601048) = 300
Parent after:   j (Address 0x7ffdb989bd88) = 200.   K (Address 0x601048) = 300
UNIX> 
When redirecting output to a terminal, stdout is buffered line by line -- that is, once you do a putchar('\n') or equivalent, the buffer is written to standard output with "write(1, ...)". However, when stdout is redirected to a file (or a pipe), the stdio library buffers on a coarser scale -- not writing until some large buffer (4K or 8K characters) is full. Thus, at the time of the fork() call, the "Before forking:" string has not been written to fd=1. Instead, it has been buffered in the standard I/O library. That buffer is part of simpfork3's address space, and is thus copied to the child process when fork() is called. Thus, when the bytes are finally flushed from the buffer, the "Before forking: ..." string is written to the file twice -- once in the child and once in the parent. This is an important thing to realize. It looks strange, but has a logical explanation. Remember this, because if you try to debug and are getting weird duplicate output, the reason may be because the standard I/O buffer is not getting flushed.

The same thing happens when you redirect through a pipe:

UNIX> bin/simpfork3 | cat
Before forking: j (Address 0x7ffc19ab1738) = 200.   K (Address 0x601048) = 300
Child after:    j (Address 0x7ffc19ab1738) = 201.   K (Address 0x601048) = 301
Before forking: j (Address 0x7ffc19ab1738) = 200.   K (Address 0x601048) = 300
Parent after:   j (Address 0x7ffc19ab1738) = 200.   K (Address 0x601048) = 300
UNIX> 

To fix this, call fflush(stdout) before calling fork(). That makes the standard I/O library do the appropriate write() call, flushing the buffer, and now an empty buffer gets copied to the child process.


src/simpfork4.c shows how open files are shared across fork() calls.

int main()
{
  int i;
  int seekp;
  int fd;
  char *s1;
  char s2[1000];

  /* Open a file for writing, and then call fork(). */

  fd = open("tmpfile.txt", O_WRONLY | O_TRUNC | O_CREAT, 0666);

  s1 = "Before forking\n";
  write(fd, s1, strlen(s1));

  i = fork();

  /* Delay the parent by a second, so that the child runs first. */

  if (i > 0) {
    sleep(1);  
    s1 = "Parent";
  } else {
    s1 = "Child";
  }

  /* Print the file's seek pointer. 
     Do a write, and print the seek pointer again. */

  seekp = lseek(fd, 0, SEEK_CUR);
  printf("%s: After forking, before writing: Seek pointer = %d\n", s1, seekp);

  sprintf(s2, "%s: After forking.\n", s1);
  write(fd, s2, strlen(s2));

  seekp = lseek(fd, 0, SEEK_CUR);
  printf("%s: After forking, after  writing: Seek pointer = %d\n", s1, seekp);

  return 0;
}

Specifically, all open file descriptors are duplicated. Thus, the file opened for writing in src/simpfork4.c is open in both the parent and child processes. Note that the open file is the same -- when one writes to the end of the file, the seek pointer is changed in both the parent and the child. This is because both processes point to the same open file structure in the operating system. If the two did not share the same open file pointer, then the parent's write would overwrite the child's write.

UNIX> bin/simpfork4
Child: After forking, before writing: Seek pointer = 15
Child: After forking, after  writing: Seek pointer = 37
Parent: After forking, before writing: Seek pointer = 37
Parent: After forking, after  writing: Seek pointer = 60
UNIX> cat tmpfile.txt
Before forking
Child: After forking.
Parent: After forking.
UNIX> 
Note also that since we call write(), there is no buffering, so there's no need to flush a buffer before the fork() call.

I will go over more on sharing file descriptors in the dup lecture.


Fork Bombs

A fork bomb is a loop of fork() calls that has gone out of control. It is hard to deal with, because when you make too many fork() calls, the operating system will make subsequent fork() calls fail, and then your shell cannot call fork() to launch a program to kill the rogue processes.

To give you a little practice, try these two programs. First, look at src/not_a_forkbomb.c:

/* This program has a parent fork 10 times. */

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

int main()
{
  int i;
  int fv;

  for (i = 0; i < 10; i++) {
    fv = fork();
    printf("Process: %8d -- i = %d\n", getpid(), i);
    if (fv == 0) exit(0);
  }
  return 0;
} 

This isn't a fork-bomb -- it just shows one process creating 10 processes. If I asked you on a test how many lines of output this program produces, your answer should be 20 -- 10 for the process calling fork(), and one for each of the ten child processes:

UNIX> bin/not_a_forkbomb
Process:    60731 -- i = 0
Process:    60731 -- i = 1
Process:    60731 -- i = 2
Process:    60733 -- i = 1
Process:    60731 -- i = 3
Process:    60732 -- i = 0
Process:    60731 -- i = 4
Process:    60731 -- i = 5
Process:    60731 -- i = 6
Process:    60734 -- i = 2
Process:    60731 -- i = 7
Process:    60735 -- i = 3
Process:    60731 -- i = 8
Process:    60731 -- i = 9
Process:    60737 -- i = 5
Process:    60736 -- i = 4
Process:    60738 -- i = 6
Process:    60740 -- i = 8
Process:    60739 -- i = 7
Process:    60741 -- i = 9
UNIX> 
There are some interesting features of the output -- the operating system schedules processes, so sometimes the main process executes for a few iterations before the children get to (e.g. the first three lines). Moreover, once multiple children have been created, they can be executed in any order. For example, process 60740 (i=8) executes before process 60739 (i=7), even though process 60739 was created first.

The true fork bomb is in src/forkbomb.c:

/* A fork bomb. */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main()
{
  int i;
  int fv;

  for (i = 0; i < 4; i++) {
    fv = fork();
    printf("Process: %8d -- i = %d\n", getpid(), i);
    fflush(stdout);
  }
  return 0;
} 

On the first iteration, the parent calls fork(), so there are two processes printing out i=0. Now each of those two processes iterates and call fork(), so there are four processes with i=1. Similarly, there are 8 processes with i=2 and 16 with i=3. Exponential blow-up. These kinds of programs can happen (see the next lecture), so be aware of them. If you iterate too many times, the operating system will start denying the fork() calls, and you are in trouble......

UNIX> bin/forkbomb
Process:    61802 -- i = 0
Process:    61802 -- i = 1
Process:    61802 -- i = 2
Process:    61803 -- i = 0
Process:    61802 -- i = 3
Process:    61803 -- i = 1
Process:    61803 -- i = 2
Process:    61803 -- i = 3
UNIX> Process:    61809 -- i = 3     (The parent is done here)    
Process:    61804 -- i = 1
Process:    61804 -- i = 2
Process:    61804 -- i = 3
Process:    61810 -- i = 2
Process:    61810 -- i = 3
Process:    61811 -- i = 3
Process:    61812 -- i = 3
Process:    61805 -- i = 2
Process:    61805 -- i = 3
Process:    61813 -- i = 3
Process:    61806 -- i = 3
Process:    61807 -- i = 1
Process:    61808 -- i = 2
Process:    61807 -- i = 2
Process:    61808 -- i = 3
Process:    61814 -- i = 2
Process:    61815 -- i = 3
Process:    61807 -- i = 3
Process:    61814 -- i = 3
Process:    61816 -- i = 3
Process:    61817 -- i = 3