#include <sys/wait.h>

pid_t wait(int *stat_loc);
When you call it, if you don't have any children, it returns -1. Otherwise, it will wait for one of your children to exit. When they do, the wait() call will return with the process id of the child that exited. It will also fill in the integer pointed to by stat_loc with information on how the child exited. There are macros in the sys/wait.h include file that can help you parse this integer.
Here are examples. src/forkwait0.c calls wait() with no children:
/* This shows what happens when you call wait() and have no children.
   It will return with a value of -1. */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main()
{
  int rv, stat_loc;

  stat_loc = 0xabcdef;
  rv = wait(&stat_loc);
  printf("RV: %d. Stat_loc = 0x%x\n", rv, (unsigned int) stat_loc);
  return 0;
}
It returns instantly with a return value of -1, and stat_loc remains unchanged:
UNIX> bin/forkwait0
RV: -1. Stat_loc = 0xabcdef
UNIX>

src/forkwait1.c forks off one child, which exits instantly with a return code of zero. The parent calls wait() and prints out a bunch of stuff about the return status:
/* Fork off one child that exits with a value of 0.
   The parent uses macros to examine the status variable of wait. */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main()
{
  int i, j, status;

  i = fork();
  if (i > 0) {
    j = wait(&status);
    printf("Parent: Child done.\n");
    printf(" Return value: %d\n", j);
    printf(" Status: %d\n", status);
    printf(" WIFSTOPPED: %d\n", WIFSTOPPED(status));
    printf(" WIFSIGNALED: %d\n", WIFSIGNALED(status));
    printf(" WIFEXITED: %d\n", WIFEXITED(status));
    printf(" WEXITSTATUS: %d\n", WEXITSTATUS(status));
    printf(" WTERMSIG: %d\n", WTERMSIG(status));
    printf(" WSTOPSIG: %d\n", WSTOPSIG(status));
  } else {
    printf("Child (%d) calling exit(0)\n", getpid());
    exit(0);    // BTW, "return 0" will do the same thing.
  }
  return 0;
}
You can see with "WEXITSTATUS" that the child exited with a return code of zero.
UNIX> bin/forkwait1
Child (8575) calling exit(0)
Parent: Child done.
 Return value: 8575
 Status: 0
 WIFSTOPPED: 0
 WIFSIGNALED: 0
 WIFEXITED: 1
 WEXITSTATUS: 0
 WTERMSIG: 0
 WSTOPSIG: 0
UNIX>

src/forkwait2.c is the exact same as src/forkwait1.c, except that the child calls exit(1) instead of exit(0):
UNIX> bin/forkwait2
Child (8747) calling exit(1)
Parent: Child done.
 Return value: 8747
 Status: 256
 WIFSIGNALED: 0
 WIFEXITED: 1
 WEXITSTATUS: 1
 WTERMSIG: 0
UNIX>

With src/forkwait3.c, the child goes into an infinite loop, which means that the parent's wait() call does not return:
UNIX> forkwait3
Child (8912) doing nothing until you kill it
Kill the child with 'kill -9 8912' or just 'kill 8912'

In another window, go ahead and kill the child process:
UNIX> kill -9 8912

Now, the parent's wait() call returns, and shows you that the child process was terminated with "signal 9." That is how you killed the child process with the "kill" command:
Parent: Child done.
 Return value: 8912
 Status: 9
 WIFSTOPPED: 0
 WIFSIGNALED: 1
 WIFEXITED: 0
 WEXITSTATUS: 0
 WTERMSIG: 9
 WSTOPSIG: 0
UNIX>

Finally, src/forkwait4.c has the child generate a segmentation violation. That is reported to the parent as terminating with "signal 11."
UNIX> bin/forkwait4
Child (9255) generating a seg fault
Parent: Child done.
 Return value: 9255
 Status: 11
 WIFSTOPPED: 0
 WIFSIGNALED: 1
 WIFEXITED: 0
 WEXITSTATUS: 0
 WTERMSIG: 11
 WSTOPSIG: 0
UNIX>

We'll go over signals in detail in another lecture.
Ok. Now, look at src/forkwait5.c. It does the following:
When a child exits, its process becomes a "zombie" until its parent process either dies or calls wait() for it. By a "zombie", we mean that it takes up no resources, and doesn't run, but it is just being maintained by the operating system so that when the parent calls wait(), it will get the proper information. Look at the output of forkwait5:
UNIX> bin/forkwait5
Child (6698) calling exit(2)
root 5286 0.0 0.1 168264 5504 ? Ss 10:55 0:00 sshd: plank [priv]
plank 5315 0.0 0.0 168264 2428 ? S 10:55 0:00 sshd: plank@pts/0
plank 5316 0.0 0.0 127964 2160 pts/0 Ss 10:55 0:00 -csh
plank 5617 0.0 0.1 158956 4960 pts/0 S+ 10:58 0:00 vim lecture.html
root 6406 0.0 0.1 168264 5508 ? Ss 11:04 0:00 sshd: plank [priv]
plank 6413 0.0 0.0 168264 2432 ? S 11:04 0:00 sshd: plank@pts/1
plank 6414 0.0 0.0 127964 2160 pts/1 Ss 11:04 0:00 -csh
plank 6697 0.0 0.0 4172 360 pts/1 S+ 11:06 0:00 bin/forkwait5
plank 6698 0.0 0.0 0 0 pts/1 Z+ 11:06 0:00 [forkwait5] <defunct>
plank 6700 0.0 0.0 113140 1428 pts/1 S+ 11:06 0:00 sh -c ps aux | grep plank
plank 6701 0.0 0.0 161372 1840 pts/1 R+ 11:06 0:00 ps aux
plank 6702 0.0 0.0 112676 964 pts/1 S+ 11:06 0:00 grep plank
Parent: Child done.
 Return value: 6698
 Status: 512
 WIFSTOPPED: 0
 WIFSIGNALED: 0
 WIFEXITED: 1
 WEXITSTATUS: 2
 WTERMSIG: 0
 WSTOPSIG: 2
UNIX> ps aux | grep plank
root 5286 0.0 0.1 168264 5504 ? Ss 10:55 0:00 sshd: plank [priv]
plank 5315 0.0 0.0 168264 2428 ? S 10:55 0:00 sshd: plank@pts/0
plank 5316 0.0 0.0 127964 2160 pts/0 Ss 10:55 0:00 -csh
plank 5617 0.0 0.1 158956 5012 pts/0 S+ 10:58 0:00 vim lecture.html
root 6406 0.0 0.1 168264 5508 ? Ss 11:04 0:00 sshd: plank [priv]
plank 6413 0.0 0.0 168264 2432 ? S 11:04 0:00 sshd: plank@pts/1
plank 6414 0.0 0.0 127964 2160 pts/1 Ss 11:04 0:00 -csh
plank 6821 0.0 0.0 161372 1840 pts/1 R+ 11:08 0:00 ps aux
plank 6822 0.0 0.0 112676 964 pts/1 S+ 11:08 0:00 grep plank
UNIX>

Process 6698 is the zombie process, denoted in the "ps aux" output with a capital Z (and the "<defunct>" tag). When forkwait5 (process 6697) calls wait(), the zombie process goes away, which is why it no longer appears in the second "ps aux" listing.
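What happens if the parent exits without calling wait()? Then the child is adopted by the init process (typically process 1), which calls wait() on its behalf. The child's zombie goes away if the child has already exited, or as soon as it does exit.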
wait() returns whenever a child exits. If a process has more than one child, then you can't force wait() to wait for a specific child. You simply get whichever child exits first. For example, see src/multichild.c. This program forks off four child processes and then calls wait() four times. The children sleep for a random period of time, and then exit. As you see, the first wait() call returns the first child to exit:
UNIX> bin/multichild
Fork 0 returned 14160
Fork 1 returned 14161
Fork 2 returned 14162
Fork 3 returned 14163
Child 1 (14161) exiting
Wait returned 14161
Child 3 (14163) exiting
Wait returned 14163
Child 0 (14160) exiting
Wait returned 14160
Child 2 (14162) exiting
Wait returned 14162
UNIX>

Now, you can use waitpid() to wait for a specific process, and you can even have it return immediately when the specified process has not exited yet. I personally think using waitpid() is usually bad form, and most certainly using the version that returns instantly is really bad form.
You will not be allowed to use any wait() variant besides wait() in your jsh lab. I will instruct the TAs to be ruthless if you call waitpid() with WNOHANG set (or one of the waitx() equivalents).
Execve() is simple in concept:
int execve(const char *path, char *const argv[], char *const envp[]);
Execve() assumes that path is the name of an executable file. Argv is an array of null-terminated strings, such that the last element is NULL, and envp is another null-terminated array of null-terminated strings. (I'm not going to go over envp -- it holds your environment variables, which you can get and set with getenv()/setenv(). You can get envp as a third argument to main(), which is what I'm going to do here).
Execve() overwrites the current process so that it executes the file in path with the arguments in argv, and the environment variables in envp. Execve() does not return unless it encounters an error, such as the file in path not existing, or not being an executable file.
This may seem confusing. Why does execve() not return? Well, look at the example in src/exec2.c:
/* A simple example of using execve() to run the program cat. */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv, char **envp)
{
  char *newargv[3];
  int i;

  newargv[0] = "cat";
  newargv[1] = "src/exec2.c";
  newargv[2] = NULL;

  i = execve("/bin/cat", newargv, envp);
  perror("exec2: execve failed");
  exit(1);
}
Suppose we compile this to the program bin/exec2. Then we execute it with no arguments. When we get to the execve() call, the state of memory is the following:
|-----------------|
|                 |
| code for exec2  |
|                 |
|-----------------|
|                 |
| globals         |
| for exec2       |
|                 |
|-----------------|
|                 |
| heap for exec2  |
|                 |
|-----------------|
       ....
| stack for exec2 |
|-----------------|

Now, execve() is called. This is a system call that says "overwrite my process's memory so that it is running main() in the program in /bin/cat" with the arguments "cat" and "src/exec2.c". When execve() is done, the state of memory has been changed so that we are in the main() routine of cat, with argc and argv set properly:
|-----------------|
|                 |
| code for cat    |
|                 |
|-----------------|
|                 |
| globals         |
| for cat         |
|                 |
|-----------------|
|                 |
| heap for cat    |
|                 |
|-----------------|
       ....
| stack for cat   |
|-----------------|

You'll notice that everything concerning exec2.c is gone. This is because the state of memory has been overwritten to run cat. There is no trace of exec2 left. This is why execve() cannot return if it is successful -- the state to which it might have returned has been overwritten. It is gone. When cat exits, the operating system simply destroys the process.
Here it is running -- looks just like "cat exec2.c":
UNIX> bin/exec2
/* A simple example of using execve() to run the program cat. */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv, char **envp)
{
  char *newargv[3];
  int i;

  newargv[0] = "cat";
  newargv[1] = "src/exec2.c";
  newargv[2] = NULL;

  i = execve("/bin/cat", newargv, envp);
  perror("exec2: execve failed");
  exit(1);
}
UNIX>
So how come when you execute cat in the shell it looks like it returns to the shell? This is because the shell calls fork(). It then has the child call execve(), and the parent calls wait().
There are six variants of execve() -- see the man page for execve(). I'll summarize them below:
If execve() is unsuccessful (for example, there is no file with the name "path", or that file does not have the executable bit set), it will return with a value of -1. For example, look at src/exec1.c.
This program tries to execute "cat", which does not exist. Thus, the execve() call fails, and the perror statement is executed.
UNIX> bin/exec1
exec1: execve failed: No such file or directory
UNIX>
This leads to:
/* This program runs the "cat" program four times by calling fork() in a for loop.
   Inside the loop, the child calls execvp("cat"), and the parent calls wait().
   Although this program runs fine, it will turn into a fork bomb if there's a bug. */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(int argc, char **argv)
{
  char *newargv[3];
  int status, j;

  newargv[0] = "cat";
  newargv[1] = "f1.txt";
  newargv[2] = NULL;

  for (j = 0; j < 4; j++) {
    if (fork() == 0) {
      (void) execvp("cat", newargv);
    } else {
      wait(&status);
    }
  }
  return 0;
}
This program forks off four processes that all exec "cat f1.txt". Note the use of execvp(), which does not take an envp argument (the new program inherits the caller's environment), and which searches the directories in the PATH environment variable to find "cat".
UNIX> bin/execcat1
This is file f1.txt
This is file f1.txt
This is file f1.txt
This is file f1.txt
UNIX>

Now, execcat2.c simply substitutes execv() for execvp(). Now, the path is not searched, and the cat executable will not be found. When you run it, ostensibly nothing happens:
UNIX> bin/execcat2
UNIX>

What's really going on? Well, the execv() call fails. This means that the execv() call returns -1, and then the child process continues. It too will go through the for loop and call fork(). In other words, the number of processes blows up exponentially -- it's a fork bomb.
To help illustrate what goes on, look at execcat3.c, which prints out the value of j, and the process' process id at the top of each for loop, and when the process exits:
UNIX> bin/execcat3
Process 81647 - Top of the for loop. j = 0
Process 81648 - Top of the for loop. j = 1
Process 81649 - Top of the for loop. j = 2
Process 81650 - Top of the for loop. j = 3
Process 81651 exiting.
Process 81650 exiting.
Process 81649 - Top of the for loop. j = 3
Process 81652 exiting.
Process 81649 exiting.
Process 81648 - Top of the for loop. j = 2
Process 81653 - Top of the for loop. j = 3
Process 81654 exiting.
Process 81653 exiting.
Process 81648 - Top of the for loop. j = 3
Process 81655 exiting.
Process 81648 exiting.
Process 81647 - Top of the for loop. j = 1
Process 81656 - Top of the for loop. j = 2
Process 81657 - Top of the for loop. j = 3
Process 81658 exiting.
Process 81657 exiting.
Process 81656 - Top of the for loop. j = 3
Process 81659 exiting.
Process 81656 exiting.
Process 81647 - Top of the for loop. j = 2
Process 81660 - Top of the for loop. j = 3
Process 81661 exiting.
Process 81660 exiting.
Process 81647 - Top of the for loop. j = 3
Process 81662 exiting.
Process 81647 exiting.
UNIX>

As you can see, fork() is called 15 times, not four, because the processes that failed the execv() call continue in the for loop. This isn't bad when the for loop stops when j = 4, but were that a 10 rather than a 4, then fork() would be called 1023 times. (i.e. fork() gets called 2^n - 1 times if the 4 were an n). This can be devastating. Fix the error by calling perror() and exit() if execv() returns, as in src/execcat4.c:
    if (fork() == 0) {
      (void) execv("cat", newargv);
      perror("execcat4's execv call");   /* Here are the only changes to the code -- */
      exit(1);                           /* no longer committing the Cardinal Sin. */
    } else {
      wait(&status);
    }
UNIX> bin/execcat4
Process 81733 - Top of the for loop. j = 0
execcat4's execv call: No such file or directory
Process 81733 - Top of the for loop. j = 1
execcat4's execv call: No such file or directory
Process 81733 - Top of the for loop. j = 2
execcat4's execv call: No such file or directory
Process 81733 - Top of the for loop. j = 3
execcat4's execv call: No such file or directory
Process 81733 exiting.
UNIX>
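For the curious, the fork counts for the buggy loop (15 when the loop bound is 4, 1023 when it is 10) can be checked with a short recurrence. This is a sketch of the reasoning, not something from the lecture code. The symbol F(n) is introduced here, and it assumes that every execv() call fails, so that every child keeps looping.

```latex
% F(n): total number of fork() calls made by a process, and by all of its
% descendants, when that process still has n loop iterations left.
\begin{align*}
F(0) &= 0 \\
F(n) &= 1 + F(n-1) + F(n-1) = 2F(n-1) + 1
  && \text{(one fork; child and parent each continue with } n-1 \text{ iterations)} \\
F(n) &= 2^{n} - 1
  && \text{(by induction: } 2(2^{n-1} - 1) + 1 = 2^{n} - 1\text{)} \\
F(4) &= 15, \qquad F(10) = 1023
\end{align*}
```

In the fixed version, the child exits when execv() fails, so its term drops out and the count goes back to one fork() per iteration.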
The professor was a big, bearded dude named Stan Eisenstat. He seemed 8000 times smarter than the rest of us, and I was scared of him. Toward the end of the semester, he gave us an assignment that you share -- write a shell. He warned us to be careful, because if we had the wrong type of bug, we could shut down the department's mainframe. This is 1986 -- departments like ours had one computer. One. Faculty and student offices had VT100 terminals that all connected to the one mainframe. Undergrads used a roomful of VT100's, again connected to the same mainframe. So you didn't want to be the undergrad who shut down the mainframe.
I shut down the mainframe. It was a Cardinal Sin bug. I tested a shell command of about 10 Unix commands connected by pipes, and I didn't test the return value of exec. Worse, whatever loop I had that was executing the commands didn't increment to the next command when exec failed, so I had a colossal fork-bomb that had no chance of exiting. Within seconds, I couldn't do a ps, because the operating system had run out of processes, and no one's fork() could succeed. Panic.
So I slinked up to Dr. Eisenstat's office, terrified.
Me: Knock knock
Him: "PLANK!"
Me: Opening the door -- "Yes. Sorry."
Him: "It's ok -- just make sure you fix the bug."
Me: "Yes. Thank you."
That was our only exchange ever, but it was certainly memorable for me, and is why I give you the Cardinal Sin.
Of course, now, when you have your fork bomb, you can just unplug your workstation or shut down your laptop, so it's not quite so devastating, but hopefully you can enjoy the story with me.
Stan Eisenstat passed away in 2020 in his 70's. They made a nice memorial page for him at https://seas.yale.edu/news-events/news/memoriam-stanley-eisenstat-professor-computer-science.