CS360 Lecture notes -- Signals

Jim Plank, modified by Jian Huang

Directory: ~huangj/cs360/notes/Signals

Lecture notes: http://www.cs.utk.edu/~huangj/cs360/360/notes/Signals/lecture.html

Signals

Chapter 10 in the book gives a very complete description of signals. It will make for better reading after you read these lecture notes, but you will profit from reading it nonetheless.

Signals are a complex flow-of-control operation. A signal is an interruption of the program of some sort. To fully and properly understand signals, one must ALWAYS remember signals are classic examples of asynchronous events. For example, when you hit CNTL-C, that sends the SIGINT signal to your program. When you hit CNTL-\, that sends the SIGQUIT signal to your program. When you generate a segmentation violation, that sends the SIGSEGV signal to your program. In any case, the process does not know a priori when and if the signal will occur.

Before discussing the technical details, let's familiarize ourselves with a few names that occur in nearly all documentation about signals: Unix Version 7 (abbreviated as Version 7), SVR4 (System V Release 4), BSD 4.x (Berkeley Software Distribution version 4.x), and POSIX.1 (Portable Operating System Interface, backed by IEEE). Version 7 was the version of Unix released in 1979, when Unix was still developed in an open source environment. During that period, BSD was already in existence and made great contributions to Unix development in general. (Bill Joy, a key member of the BSD group, cofounded Sun Microsystems in 1982.) In 1983, AT&T rushed to commercialize Unix and relabeled the system as System V. BSD remained but quickly began to diverge from System V. A complete standardization of Unix has never happened since the turning point of 1983. IEEE launched an effort to develop a portable operating system interface and ended up with a standard that pretty much covers the common ground among the major flavors of Unix.

What does all this have to do with signals? Well, first, different systems have a different number of signals, and very often a slightly different set of signals as well. For instance, Version 7 has only 15 different signals, but SVR4 has 31. ANSI C also defines a very small set of signals, which form a subset of the POSIX signals. However, the ANSI C standard has made signal handling so general that it is often regarded as useless. The bad news here is that C programs that use signals for even moderately complicated purposes are not likely to be highly portable between different systems. Second, as you will likely read about in manpages, there are two models of signals, an unreliable one and a reliable one. Version 7 used the unreliable one (with good intentions that led to problems). SVR4, BSD4.3+, POSIX and others have all adopted the reliable model. Knowing this difference will make reading the manpages a lot easier.

Your program has various ways of dealing with signals. All signals have names, each of which starts with the three letters SIG. By default, there are certain actions that take place. For example, when you hit CNTL-C, the program usually exits. That is the default action for SIGINT. When you hit CNTL-\ or get a segmentation violation, your program dumps core and then exits. That is the default action for SIGQUIT and SIGSEGV.

You can redefine what happens when you get these signals, which allows you to write very flexible programs. Internally, when a signal is generated, the operating system takes over from the currently running program. It saves the current state of the program on the stack. Then, it calls an "interrupt handler" for the specific signal. For example, the default interrupt handler for SIGINT causes the program to exit. The default interrupt handler for SIGSEGV and SIGQUIT causes the program to dump core and then exit. If the interrupt handler for a signal calls return, then what happens is that the operating system takes over again, and restores the program from the state that it has saved on the stack. The program resumes from where it left off (usually -- there are some times when it doesn't).

You can use the signal() function to define interrupt handlers for signals. As always, read the man page: man 3v signal.

For example, look at sh1.c:

#include <signal.h>
#include <stdio.h>

void cntl_c_handler(int dummy)
{
  printf("You just typed cntl-c\n");
  signal(SIGINT, cntl_c_handler);
}

int main()
{
  int i, j;

  signal(SIGINT, cntl_c_handler);

  for (j = 0; j < 40; j++) {
    for (i = 0; i < 1000000; i++);
  }
  return 0;
}

What this does is set up an interrupt handler for SIGINT. Now, when the user hits CNTL-C, the operating system will save the current execution state of the program, and then execute cntl_c_handler. When cntl_c_handler returns, the operating system resumes the program from where it was interrupted. Thus, when you run sh1, each time you type CNTL-C, it will print "You just typed cntl-c", and the program will continue. It will exit by itself in 10 seconds or so.

The signal handler should follow the prototype of cntl_c_handler. In other words, it should return void (i.e. nothing), and should accept an integer argument, even if it will not use the argument. Otherwise, gcc will complain to you.

Also, note that I make a signal() call in the signal handler. On some systems (e.g. Version 7), if you do not do this, then the system will reinstall the default signal handler for SIGINT once it has handled the signal. On some systems, you don't have to make the extra signal() call. Such is life in the land of multiple Unixes.

You can handle each different signal with a call to signal. For example, sh1a.c defines different signal handlers for CNTL-C (which is SIGINT), and CNTL-\ (which is SIGQUIT). They print out the values of i and j when the signal is generated. Note that i and j must be global variables for this to work. This is one example when you have to use global variables.

Try this out by compiling the program and then running it, and hitting CNTL-C and CNTL-\ a bunch of times:

UNIX> sh1a
^CYou just typed cntl-c.  j is 2 and i is 539943
^CYou just typed cntl-c.  j is 2 and i is 919180
^\You just typed cntl-\.  j is 4 and i is 413031
^CYou just typed cntl-c.  j is 5 and i is 20458
^\You just typed cntl-\.  j is 6 and i is 73316
^\You just typed cntl-\.  j is 6 and i is 683034
^CYou just typed cntl-c.  j is 7 and i is 292244
^CYou just typed cntl-c.  j is 13 and i is 738661
^\You just typed cntl-\.  j is 14 and i is 789583
^\You just typed cntl-\.  j is 16 and i is 42225
^\You just typed cntl-\.  j is 16 and i is 209458
^CYou just typed cntl-c.  j is 17 and i is 260584
^\You just typed cntl-\.  j is 19 and i is 982514
UNIX>

You can also catch the segmentation violation signal. One of those CS legends is that some grad student used to put the following into his code:

#include <signal.h>
#include <stdio.h>

void segv_handler(int dummy)
{
  fprintf(stderr, "nfs server not responding, still trying\n");
  while(1) ;
}

main()
...
  signal(SIGSEGV, segv_handler);

  rest of the code
}

This is so that if he was demo-ing his code, and a segmentation violation occurred (which always seems to happen when you're demo-ing code), it would look like the network had frozen. Very clever. (For example, look at and run sh1b.c. It should cause a segmentation violation, but instead it looks like the network is hanging.)

Something really cool about SIGHUP

SIGHUP is a signal that the control terminal (shell process) sends to all the processes that it has spawned and still owns as children. If you type a command like "ls", or even something more advanced like "ls &", then while these processes run, the shell process of the controlling terminal is still recorded by the OS as their parent process. Now, something special about the shell is that it is designed to spawn processes, and hence it is by design a responsible process. One thing in particular: when a shell process ends because the controlling terminal closes, it sends SIGHUP to all of its child processes. If the shell finds that there are suspended jobs in the background, for instance, it will echo a warning on stderr to remind you. If you insist on closing the shell, it will shut down and send those signals. With what you have been taught in this lecture, you can tell that this will by default kill all those child processes.

So this brings up something most of you have probably experienced. You remotely log onto a Unix machine and launch a job (say, an image processing lab assignment that takes a while to run), and while it is running, your terminal gets closed due to a temporary loss of wireless connection. When you log back in, your job is gone, and much computing time has been wasted. Now, as an expert in systems programming, you can say definitively that the culprit is SIGHUP.

Is there a way around that? Well, for one, you can have the child process ignore SIGHUP. But what if the program is not of your making? Then let's disconnect the child process from the shell process. You can do so with "disown" or "nohup". I will show you examples. Even better, how about not running the job directly through the shell at all? Try standard utilities such as "at" or "batch". Using those, you even get an email after the job is finished.

Some Details about signal

When calling signal, the two input arguments are the signal number and a pointer to the handler function. The second argument, the function pointer, can also be one of the constants SIG_IGN or SIG_DFL. If SIG_IGN is passed, then the process ignores the signal. Note that SIGKILL can be neither ignored nor caught, meaning that a programmer cannot set a different handler for it. (In case a program messes up its signal control, the administrator can still kill a runaway process with SIGKILL, No. 9.) If SIG_DFL is passed, the process switches back to the default signal handler. In either case, a successful call to signal returns the address of the previous handler for the signal.

Legitimate reasons for signal generation can be classified into the following categories:

  1. terminal generated from user key strokes (CNTL-C, for instance)
  2. hardware exception, e.g. divide by 0, bus error
  3. the kill function (a process sending a signal to another process or process group owned by the same user). Note, a process can send a signal to itself by a call to raise
  4. the kill command (merely a command interface to the kill function)
  5. software conditions like SIGPIPE and SIGALRM

To Block Signal Delivery

Using the call sigprocmask, a programmer can block a set of signals from being delivered to a process. One can think of the mask as a bit vector, in which each bit is an on/off switch for one signal. However, directly manipulating the bits of the mask is not recommended, for portability reasons. Instead, the first argument to sigprocmask says how the mask should change, using one of three standard constants: SIG_BLOCK (add the given signals to the mask), SIG_UNBLOCK (remove them from the mask) and SIG_SETMASK (replace the mask with the given set). Read "man -s 2 sigprocmask".

If a blocked signal is generated and its action is not SIG_IGN, the signal remains pending until either the process unblocks it or the action is changed to SIG_IGN. That is a transient period before a legitimate signal handler is called. sigpending can be used to find out which signals are blocked and pending.

To Stop Signal Generation

For some applications, we may not want some signals to be generated to begin with. In that case, we need to use a data type: the signal set (sigset_t). The major functions relating to signal sets are:

    sigemptyset, sigfillset, sigaddset, sigdelset and sigismember

All are in section 3C of the manpage.

Alarm()

Another use of signal handlers is the "alarm clock" that Unix provides. Read the man page for alarm(). What alarm(n) does is return, and then n seconds later, it will cause the SIGALRM signal to occur. If you have set a signal handler for it, then you can catch the signal, and do whatever it is that you wanted to do. For example, sh2.c is like sh1.c only it prints out a message after the program has executed 3 seconds. Note that alarm() is approximate -- it's not exactly 3 seconds, but we'll consider it close enough for the purposes of this class.

UNIX> sh2
Three seconds just passed: j = 26.  i = 638663
UNIX>

Finally, sh3.c shows how you can get Unix to send you SIGALRM every second. It's just a tweak to sh2.c where you have the alarm handler call alarm to make Unix generate SIGALRM one second after the current one.

UNIX> sh3
1 second just passed: j = 8.  i = 823534
2 seconds just passed: j = 17.  i = 715735
3 seconds just passed: j = 26.  i = 610604
4 seconds just passed: j = 35.  i = 513675
UNIX>

On some systems, when you are in a signal handler for one signal, you cannot process that same signal again until the handler returns. On other systems, you can handle that same signal again. For example, look at sh4.c.

Note that alarm_handler has an infinite loop in it, meaning that it never returns. The program runs for a second, and then SIGALRM is generated, and alarm_handler() is entered. It goes into an infinite loop, and one second later, SIGALRM is generated again. Depending on your version of Unix, different things may happen. In Solaris, the signal will be handled, and you'll enter alarm_handler() anew. In SunOS, the signal will be ignored until you return from alarm_handler(), which of course never happens.

So, here's the output on Solaris (try it on kenner):

UNIX> sh4
One second has passed: j = 7.  i = 697646
One second has passed: j = 7.  i = 697646
One second has passed: j = 7.  i = 697646
One second has passed: j = 7.  i = 697646
...

and here is the output on SunOS (try it on duncan):

UNIX> sh4
One second has passed: j = 7.  i = 584436

You can generate and handle other signals reliably whether in a signal handler or not. For example, when you hit CNTL-\ in sh4.c, it gets caught properly whether the program or the alarm_handler() is running -- give it a try.

Finally, you can send any signal to a program with the kill command. Read the man page. Signal number 9 (SIGKILL) cannot be caught by your program, meaning that you cannot write a signal handler for it. This is nice because if you mess up writing a signal handler, then "kill -9" is the only way to kill the program.

A few interesting questions to think about.

1. When exec is called, does the new program handle signals in the same way that the previous program did?

2. What happens (or should happen) when a signal arrives while its own signal handler is still running?