For example, look at the programs forkwait.c, forkwait2.c and forkwait3.c. They show examples of forking off a child, waiting for it to exit, and then examining statusp to see how it exited. They are all straightforward.
In forkwait3, the child must be killed with a signal, using the command "kill". For example:
UNIX> forkwait3 &
[1] 22326
UNIX> Child (22327) doing nothing until you kill it
Kill the child with 'kill -9 22327' or just 'kill 22327'

Now, you can kill the child manually with "kill -9 22327", which sends it the "sure kill" signal (signal number 9), or with "kill 22327", which sends it signal 15. Try both:
UNIX> forkwait3 &
[1] 22326
UNIX> Child (22327) doing nothing until you kill it
Kill the child with 'kill -9 22327' or just 'kill 22327'
(hit return a few times)
UNIX> kill -9 22327
UNIX> Parent: Child done.
  Return value: 22327
  Status: 9
  WIFSTOPPED: 0
  WIFSIGNALED: 1
  WIFEXITED: 0
  WEXITSTATUS: 0
  WTERMSIG: 9
  WSTOPSIG: 0

Try again:
UNIX> forkwait3 &
[1] 22328
UNIX> Child (22329) doing nothing until you kill it
Kill the child with 'kill -9 22329' or just 'kill 22329'
UNIX> kill 22329
UNIX> Parent: Child done.
  Return value: 22329
  Status: 15
  WIFSTOPPED: 0
  WIFSIGNALED: 1
  WIFEXITED: 0
  WEXITSTATUS: 0
  WTERMSIG: 15
  WSTOPSIG: 0

forkwait3a.c has the child generate a segmentation violation, and you'll see that the parent can recognize this as the child terminating with signal 11. We'll go over signals in detail in another lecture.
Ok. Now, look at forkwait4.c. What it does is have the child exit immediately, and have the parent wait 4 seconds, print out the output of the "ps x" command, and then have it call wait(). It should be clear that by the time the parent calls system("ps x"), the child has exited. Thus, we might expect there to be no listing in the "ps x" command for the child, and possibly that the wait() might wait forever, since the child is completed. However, this is not the case.
When a child exits, its process becomes a "zombie" until its parent process either dies or calls wait() for it. By a "zombie", we mean that it no longer runs and holds almost no resources (little more than its entry in the process table), but it is being maintained by the operating system so that when the parent calls wait(), it will get the proper information. Look at the output of forkwait4:
UNIX> forkwait4
Child (1624) calling exit(4)
  PID TT STAT  TIME COMMAND
  ...
  381 p2 S     0:02 -sh (csh)
 1623 p2 S     0:00 forkwait4
 1624 p2 Z     0:00
 1625 p2 S     0:00 sh -c ps x
 1626 p2 R     0:00 ps x
  ...
Parent: Child done.
  Return value: 1624
  Status: 1024
  WIFSTOPPED: 0
  WIFSIGNALED: 0
  WIFEXITED: 1
  WEXITSTATUS: 4
  WTERMSIG: 0
  WSTOPSIG: 4
UNIX> ps x
  ...
  381 p2 S     0:02 -sh (csh)
 1627 p2 R     0:00 ps x
  ...

The process 1624 is the zombie process, denoted in the "ps x" output with a capital Z. When forkwait4 (process 1623) calls wait(), the zombie process goes away, as the second "ps x" shows.
What happens if the parent exits without calling wait()? Then the child zombie process is inherited by init (/sbin/init), which calls wait() on its behalf. The result is that the zombie simply goes away.
wait() returns whenever a child exits. If a process has more than one child, then you can't force wait() to wait for a specific child. You simply get whichever child exits first. For example, see multichild.c. This program forks off four child processes and then calls wait() four times. The children sleep for a random period of time, and then exit. As you see, the first wait() call returns the first child to exit:
UNIX> multichild
Fork 0 returned 14160
Fork 1 returned 14161
Fork 2 returned 14162
Fork 3 returned 14163
Child 1 (14161) exiting
Wait returned 14161
Child 3 (14163) exiting
Wait returned 14163
Child 0 (14160) exiting
Wait returned 14160
Child 2 (14162) exiting
Wait returned 14162
UNIX>

Now, you can use waitpid() to wait for a specific process, and you can even have it return if the specified process has not exited. I personally think using waitpid() is usually bad form, and most certainly using the version that returns instantly is really bad form.
You will not be allowed to use any wait() variant besides wait() in your jsh lab. I will instruct the TAs to be ruthless if you call waitpid() with WNOHANG set.
Execve() is simple in concept:
int execve(char *path, char **argv, char **envp);
Execve() assumes that path is the name of an executable file. Argv is an array of null-terminated strings, such that the last element is NULL, and envp is another null-terminated array of null-terminated strings.
Execve() overwrites the current process so that it executes the file in path with the arguments in argv, and the environment variables in envp. Execve() does not return unless it encounters an error, such as the file in "path" not existing, or not being an executable file.
This may seem confusing. Why does execve() not return? Well, look at the example in exec2.c:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv, char **envp)
{
  char *newargv[3];
  int i;

  newargv[0] = "cat";
  newargv[1] = "exec2.c";
  newargv[2] = NULL;

  /* If execve() succeeds, the lines after it are never executed. */
  i = execve("/bin/cat", newargv, envp);
  perror("exec2: execve() failed");
  exit(1);
}

Suppose we compile this to the program exec2. Then we execute it with no arguments. When we get to the execve() call, the state of memory is the following:
|----------------|
|                |
| code for exec2 |
|                |
|                |
|----------------|
|                |
| globals        |
| for exec2      |
|                |
|----------------|
|                |
| heap for exec2 |
|                |
|----------------|
|                |
       ....
|                |
| stack          |
| for exec2      |
|                |
|----------------|

Now, execve() is called. This is a system call that says "execute the program in /bin/cat" with the arguments "cat" "exec2.c". When execve() is done, the state of memory has been changed so that we are in the main() routine of cat, with argc and argv set properly:
|----------------|
|                |
| code for cat   |
|                |
|                |
|----------------|
|                |
| globals        |
| for cat        |
|                |
|----------------|
|                |
| heap for cat   |
|                |
|----------------|
|                |
       ....
|                |
| stack          |
| for cat        |
|                |
|----------------|

You'll notice that everything concerning exec2.c is gone. This is because the state of memory has been overwritten to run cat. There is no trace of exec2 left. This is why execve() cannot return if it is successful -- the state to which it might have returned has been overwritten. It is gone. When cat exits, the operating system simply destroys the process.
So how come when you execute cat in the shell it looks like it returns to the shell? This is because the shell calls fork - exec - wait.
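There are six functions in the exec family -- execl(), execlp(), execle(), execv(), execvp() and execve() itself -- see the man page for execve(). In brief: the "l" versions take the arguments as a NULL-terminated list of parameters, the "v" versions take an argv array, the "p" versions search the PATH variable for the file, and the "e" versions take an explicit envp.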
If execve() is unsuccessful (for example, there is no file with the name "path", or that file does not have the executable bit set), it will return with a value of -1. For example, look at exec1.c.
This program tries to execute "./cat", which does not exist. Thus, the execve() call fails, and the perror statement is executed.
This leads to:
execcat1.c forks off three processes that all exec "cat f1". Note the use of execvp(), which does not take an envp argument (the new program inherits the current environment), and which searches the directories in the PATH variable to find "cat".
UNIX> execcat1
This is file f1
This is file f1
This is file f1
UNIX>

Now, execcat2.c substitutes execv() for execvp(). When you run it, ostensibly nothing happens:
UNIX> execcat2
UNIX>

What's going on? Well, the execv() call fails. This means that the execv call returns with i = -1, and then the child process continues. It too will go through the while loop and call fork(). To help illustrate what goes on, look at execcat3.c, which prints out the value of j and the pid of the process before every fork() call:
UNIX> execcat3
I am 4794. j = 1
I am 4795. j = 2
I am 4796. j = 3
I am 4795. j = 3
I am 4794. j = 2
I am 4799. j = 3
I am 4794. j = 3
UNIX>

As you can see, fork() is called 7 times, not three, because the processes that failed the execv call continue in the while loop. This isn't bad for j = 3, but were the 3 a 10, then fork() would be called 1023 times. (i.e. fork gets called 2^n-1 times if the 3 were an n). This can be devastating. Fix the error by checking the return value of execv() as in execcat4.c:
UNIX> execcat4
execcat4: No such file or directory
execcat4: No such file or directory
execcat4: No such file or directory
UNIX>