The pipe() system call gives parent-child processes a way to communicate with each other. It is called as follows:
int pipe(int fd[2]);
In other words, you pass it an array of two integers. It fills in that array with two file descriptors that can talk to each other. Anything that is written on fd[1] may be read from fd[0]. This is of no use in a single process. However, between processes, it gives a method of communication.
Nonetheless, my first program will just be in a single process. Look at src/pipe0.c:
/* pipe0.c - Create a pipe in the current process, write to it and read from it. */

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>

int main()
{
  int pipefd[2];
  int i;
  char s[1000];
  char *s2;

  /* Create the pipe. */

  if (pipe(pipefd) < 0) {
    perror("pipe");
    exit(1);
  }

  /* Write an 11-byte string to it.  This gets stored in the operating system. */

  s2 = "James Plank";
  write(pipefd[1], s2, strlen(s2));

  /* Now read the string from the pipe.  Even though we ask for 1000 bytes, it
     simply returns the 11 bytes that are in the pipe. */

  i = read(pipefd[0], s, 1000);
  s[i] = '\0';

  printf("Read %d bytes from the pipe: '%s'\n", i, s);
  return 0;
}
This first calls pipe() to set up two file descriptors, pipefd[0] and pipefd[1]. Anything written to pipefd[1] can be read from pipefd[0]. To put this another way, whenever you call write(fd, buf, size), your process sends size bytes starting at the address specified by buf to the operating system. The fd tells the operating system what to do with those bytes. Usually fd is a file descriptor returned by open() -- thus, your write() call tells the operating system to write those bytes to a file. However, there are other types of file descriptors. For example, when you say write(1, buf, size), you are saying to print those bytes to standard output, which often is not a disk file, but instead is a terminal. When fd is the writing end of a pipe, the write() tells the operating system to hold those bytes in a buffer until some process requests them by performing a read() on the read end of the pipe.
It's important to recognize that all interprocess communication must take place through the operating system. Pipes are a nice clean way for this to occur.
Back to src/pipe0.c. After the pipe() system call returns in pipe0, we can view the running process as having 5 open file descriptors: standard input (0), standard output (1), standard error (2), the read end of the pipe (pipefd[0]), and the write end of the pipe (pipefd[1]). Each of those file descriptors is a pointer to the operating system. We can visualize it as follows:
pipe0
|----------|       file
| code,    |    descriptors
| globals, |                   |---------|
| heap.    |       0 <-----    |         |
|          |       1 ----->    |operating|
|          |       2 ----->    | system  |
|          |  pipefd[0] <---   |         |
|          |  pipefd[1] --->   |---------|
|          |
| stack    |
|----------|

Now, we first call: "write(pipefd[1], s2, strlen(s2));"
This sends the string "James Plank" to the operating system, which holds it in a buffer:
pipe0
|----------|       file
| code,    |    descriptors
| globals, |                   |---------|
| heap.    |       0 <-----    |         |
|          |       1 ----->    |operating|
|          |       2 ----->    | system  |
|          |  pipefd[0] <---   |         |
|          |  pipefd[1] --->   |         |-> "James Plank"
|          |                   |---------|
| stack    |
|----------|

Next, we call "i = read(pipefd[0], s, 1000);", which attempts to read up to 1000 bytes from the pipe. This extracts the string "James Plank" from the OS and puts it into the variable s:
pipe0
|----------|       file
| code,    |    descriptors
| globals, |                   |---------|
| heap.    |       0 <-----    |         |
|          |       1 ----->    |operating|
|          |       2 ----->    | system  |
|          |  pipefd[0] <---   |         |
|          |  pipefd[1] --->   |         |
|          |                   |---------|
| stack   s|-> "James Plank"
|----------|

This is a very simple use of pipes, and is not really something that you would ever do. However, it shows the use of a pipe from within one process.
The next program, pipe1, uses a pipe between a parent and a child:

/* This program shows how a parent and child can communicate with a pipe. */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main()
{
  int pipefd[2];
  int pid;
  int i, line;
  char s[1000];

  if (pipe(pipefd) < 0) {
    perror("pipe");
    exit(1);
  }

  pid = fork();

  /* The parent reads lines of input from standard input, and writes them to the pipe. */

  if (pid > 0) {
    while(fgets(s, 1000, stdin) != NULL) {
      write(pipefd[1], s, strlen(s));
    }

  /* The child reads single characters from the pipe, and when it sees a newline,
     it writes the line to standard output, preceded by the line number. */

  } else {
    i = 0;
    line = 1;
    while(read(pipefd[0], s+i, 1) == 1) {
      if (s[i] == '\n') {
        s[i] = '\0';
        printf("%6d %s\n", line, s);
        line++;
        i = 0;
      } else {
        i++;
      }
    }
  }
  return 0;
}

/* I'll comment here that you shouldn't write code like the child that reads
   one byte at a time.  Why? */
Again, after pipe() is called, the system looks like:
pipe1
|----------|       file
| code,    |    descriptors
| globals, |                   |---------|
| heap.    |       0 <-----    |         |
|          |       1 ----->    |operating|
|          |       2 ----->    | system  |
|          |  pipefd[0] <---   |         |
|          |  pipefd[1] --->   |---------|
|          |
| stack    |
|----------|

Now, when fork() is called, a new process is created which is a duplicate of the original pipe1 process. The file descriptors are also duplicated so that they are the same pointers into the operating system. The state now looks like:
pipe1(parent)                                                pipe1(child)
|----------|       file                                      |----------|
| code,    |    descriptors                                  | code,    |
| globals, |                   |---------|                   | globals, |
| heap.    |       0 <-----    |         |    -----> 0       | heap.    |
|          |       1 ----->    |operating|    <----- 1       |          |
|          |       2 ----->    | system  |    <----- 2       |          |
|          |  pipefd[0] <---   |         |   ---> pipefd[0]  |          |
|          |  pipefd[1] --->   |---------|   <--- pipefd[1]  |          |
|          |                                                 |          |
| stack    |                                                 | stack    |
|----------|                                                 |----------|

The parent process now calls fgets() to read lines of text from standard input, and writes them to the pipe. The child reads from the pipe, and prints each line on standard output, preceded by its line number. This code should seem straightforward to you. You should see that each write to pipefd[1] goes to the operating system, which passes it to the child process when the child calls read() on pipefd[0].
Try running pipe1:
UNIX> bin/pipe1
How bout them Vols!
     1 How bout them Vols!
Give him six!
     2 Give him six!
Juice em, Big Dog, Juice em!
     3 Juice em, Big Dog, Juice em!
<CNTL-D>
UNIX>
Looks good. Now, try doing the same thing to an output file:
UNIX> bin/pipe1 > output
How bout them Vols!
Give him six!
Juice em, Big Dog, Juice em!
<CNTL-D>
UNIX> cat output
UNIX>

Hmmm. This appears to be a problem. Since fork() duplicates file descriptors, we'd assume that the child process writes to output, as that is where standard output has been redirected. This is correct. The problem can be seen by doing a ps x (or ps aux | grep $USER):
UNIX> ps aux | grep plank
...
plank     6277  0.1  0.3   760   576 pts/22   S 09:40:25  0:00 grep plank
plank     6241  0.0  0.2   684   436 pts/22   S 09:39:02  0:00 pipe1
plank     6244  0.0  0.2   700   452 pts/22   S 09:39:23  0:00 pipe1
...
UNIX>

What's going on? Well, it can best be explained by the picture below. When the parent process receives CNTL-D, it exits, which closes its copy of pipefd[1]. The state of the system now looks as follows:
pipe1(parent)                               pipe1(child)
                                            |----------|
   exited                                   | code,    |
              |---------|                   | globals, |
              |         |    -----> 0       | heap.    |
              |operating|    <----- 1       |          |
              | system  |    <----- 2       |          |
              |         |   ---> pipefd[0]  |          |
              |---------|   <--- pipefd[1]  |          |
                                            |          |
                                            | stack    |
                                            |----------|

Note that the child process still has pipefd[1] open. Thus, it is waiting to read from pipefd[0], and the operating system doesn't know that no process will be writing to pipefd[1]. So the child process just sits there doing nothing. There is nothing in the output file because the child is printing using printf(), which buffers the output. As the buffer isn't full, it hasn't performed the write(1, ...) yet, and thus we don't see anything in the output file.
As the ps command shows, there are two pipe1 processes -- one child from each of the two pipe1's that we called above (i.e. "bin/pipe1" and "bin/pipe1 > output").
Make sure you understand what has gone on here before you read further. The child process is hung reading from pipefd[0]. The read() call will not return because there is nothing to read, and since pipefd[1] is not closed, the OS cannot make the read() call return with a value of zero. Thus, the process is hung.
Kill the two hung processes:

UNIX> kill 6241 6244

Then look at src/pipe2.c. Pipe2.c has the parent close the file descriptors that it is not going to use, and it has the child close the file descriptors that it is not going to use. When the parent and child both enter their loops, the state of the system looks as follows, due to the closing of unused file descriptors:
pipe2(parent)                                                pipe2(child)
|----------|       file                                      |----------|
| code,    |    descriptors                                  | code,    |
| globals, |                   |---------|                   | globals, |
| heap.    |       0 <-----    |         |                   | heap.    |
|          |                   |operating|    <----- 1       |          |
|          |       2 ----->    | system  |    <----- 2       |          |
|          |                   |         |   ---> pipefd[0]  |          |
|          |  pipefd[1] --->   |---------|                   |          |
|          |                                                 |          |
| stack    |                                                 | stack    |
|----------|                                                 |----------|

Now, when you type <CNTL-D>, the parent exits, leaving the system in the following state:
pipe2(parent)                               pipe2(child)
                                            |----------|
   exited                                   | code,    |
              |---------|                   | globals, |
              |         |                   | heap.    |
              |operating|    <----- 1       |          |
              | system  |    <----- 2       |          |
              |         |   ---> pipefd[0]  |          |
              |---------|                   |          |
                                            |          |
                                            | stack    |
                                            |----------|

Note that the writing end of pipefd is gone completely. Thus, the operating system can have the child's read(pipefd[0], ...) return zero, and the child exits gracefully. So, when you call pipe2 as you did pipe1 before, the output file is correctly created, and there are no child processes left hanging around:
UNIX> bin/pipe2 > output
How bout them Vols!
Give him six!
Juice em, Big Dog, Juice em!
<CNTL-D>
UNIX> cat output
     1 How bout them Vols!
     2 Give him six!
     3 Juice em, Big Dog, Juice em!
UNIX>

If you do a "ps x", you should see no pipe2 processes.
UNIX> cat exec1.c | head -n 5 | tail -n 1

You'll notice that the first process to die will be the middle one (head), because it exits after reading the first five lines of standard input. When it exits, the other two will exit automatically -- tail will have read() return 0, and will exit, and cat will try to write to a pipe that no process has open for reading, and thus will receive SIGPIPE and exit.
UNIX> head -n 10 src/headsort.c | sort

If we don't get to it in class, go over the code yourself. You will find it very helpful for the Jsh lab.
Look at src/pipe3.c.
It does the same thing as the others, but catches SIGPIPE. (If signals are unknown to you, read Chapter 10 in the book, and the signal man page. We will have a lecture on signals later in the class.) To test it, run pipe3:
UNIX> bin/pipe3
Juice em, Big Dog, Juice em!
     1 Juice em, Big Dog, Juice em!

Then, in another window, kill the child process -- it will be the one with the higher pid:
UNIX> ps aux | grep plank
...
plank     7064  0.1  0.2   684   452 pts/22   S 09:44:24  0:00 pipe3
plank     7065  0.0  0.2   684   304 pts/22   S 09:44:24  0:00 pipe3
...
UNIX> kill 7065
UNIX>

You'll see nothing happen in the pipe3 window, but the child is gone. (Type "ps aux | grep $USER" again to make sure.) This means that there is no process that has pipefd[0] open. Thus, if you type into the bin/pipe3 process:
Give Him Six!
15454: caught a SIGPIPE
UNIX>

The write() to pipefd[1] generates SIGPIPE.