CS360 Lecture notes -- Condition Variables, Joining

Directory: /blugreen/homes/plank/cs360/notes/CondVar

Lecture notes: http://www.cs.utk.edu/~plank/plank/classes/cs360/360/notes/CondVar/lecture.html

This lecture will cover two things: Condition Variables, and a way around explicitly calling pthread_join.

Condition Variables

Condition Variables are a second kind of synchronization primitive (Mutexes being the first). They are useful when you have a thread that needs to wait for a certain condition to be true. In pthreads, there are three relevant procedures involving condition variables:

pthread_cond_init(pthread_cond_t *cv, const pthread_condattr_t *attr);
pthread_cond_wait(pthread_cond_t *cv, pthread_mutex_t *lock);
pthread_cond_signal(pthread_cond_t *cv);

The first of these simply initializes a condition variable. The other two are related. Pthread_cond_wait() is called by a thread when it wants to block and wait for a condition to be true. The calling thread must already have locked the mutex indicated by the second parameter. The call releases the mutex and blocks the thread until it is awakened by a pthread_cond_signal() call from another thread. When it is awakened, it waits until it can reacquire the mutex, and once it has done so, it returns from the pthread_cond_wait() call.

Pthread_cond_signal() checks to see if there are any threads waiting on the specified condition variable. If not, then it simply returns. If there are threads waiting, then one is awakened. It is not specified whether the thread that calls pthread_cond_signal() should own the locked mutex specified by the pthread_cond_wait() call of the thread that it is waking up. I recommend that it should.

Note, you should not assume anything about the order in which threads are awakened by pthread_cond_signal() calls. It is natural to assume that they will be awakened in the order in which they waited, but that may not be the case. Program accordingly.

A Simple Example

A simple example of using condition variables is in the program barrier.c. Here, we have 5 threads, and we want to make sure that they all synchronize at a particular point. Often this is called a "barrier", since all the threads stop at this barrier before proceeding. In barrier.c the number of threads waiting is held in the variable ndone, and if a thread reaches the barrier before ndone equals NTHREADS, it waits on the condition variable ts->cv. When the last thread reaches the barrier, it wakes all the others up using pthread_cond_signal. The output of barrier.c shows that they all block until the last thread reaches the barrier:

UNIX> barrier
Thread 0 -- waiting for barrier
Thread 1 -- waiting for barrier
Thread 2 -- waiting for barrier
Thread 3 -- waiting for barrier
Thread 4 -- waiting for barrier
Thread 4 -- after barrier
Thread 0 -- after barrier
Thread 1 -- after barrier
Thread 2 -- after barrier
Thread 3 -- after barrier
done
UNIX>

Calling pthread_join

(All of this is really irrelevant if you call pthread_detach after forking off a thread -- however, this is still an excellent exercise in using condition variables.)

If you're like me, it will seem to you that it's a pain to always have to call pthread_join() to clean up a thread. In fact, you don't really have to call it if you don't care about when a thread finishes. However, there's a resource allocation problem if you don't call pthread_join() and you create a lot of threads. For example, look at bigfork.c:

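#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Each thread does nothing and returns immediately. */
void *thread(void *arg)
{
  return NULL;
}

int main()
{
  pthread_t tid;
  int i;
  int j;

  j = 0;
  while(1) {
    printf("j = %d\n", j);
    j++;
    /* Create 1000 threads and never join them. */
    for (i = 0; i < 1000; i++) {
      if (pthread_create(&tid, NULL, thread, NULL) != 0) {
        perror("pthread_create");
        exit(1);
      }
    }
    sleep(2);
  }
}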

This program loops forever, creating 1000 threads and then sleeping for two seconds on each iteration. The threads themselves simply return. Thus at the end of each iteration, there should be just the main thread, since all 1000 threads should be able to complete in the two seconds that the main thread is sleeping (think about how you would test that). When you run the program, you get:

UNIX> bigfork
j = 0
j = 1
j = 2
_alloc_chunk(): _mmap failed: Not enough space
pthread_create: Not enough space
UNIX>

We've run out of memory after forking somewhere between 2000 and 3000 threads. This is because each thread allocates its own stack space, and even though most of the 2000+ threads have exited, they do not release their stack space until pthread_join() is called. For example, bigfork2.c joins each group of 1000 threads after the sleep statement, and as you see, it can run forever without running out of memory.

UNIX> bigfork2
j = 0
j = 1
j = 2
j = 3 
...
UNIX>

Now, look at jthread.h. Jthread.h defines three procedures:

jthread_system_init(): You must call this to use the other procedures.
jthread_create(void (*func)(void *), void *arg): This forks off a thread with the given function and argument. It returns zero on success and one on failure. You'll note it gives you no tid. You cannot call join on threads created with jthread_create(). They clean themselves up automatically. For that reason, func returns nothing, instead of a (void *).
jthread_exit(): This is what you call from the main thread or from a forked thread to exit (the forked threads can also simply return). When a thread exits, it is cleaned up automatically.

The definition of these procedures is in jthread.c. I won't explain it just yet. Instead, look at hello_world.c and bigfork3.c. Both use jthread.h and jthread.c to fork without joining. You'll see that they both work just fine. Isn't that convenient?

Now, a brief explanation of jthread.c. jthread_system_init() initializes a global mutex, condition variable, dlist, and counter of the number of existing threads. It then creates a garbage-collecting thread. What this thread does is call pthread_join() on all the threads in the dlist, and when the dlist is empty, it waits. jthread_create() increments the counter, and then forks off a thread calling the procedure jthread_starter(). This thread calls the desired function with its argument, and if the function returns, it calls jthread_exit(). Thus, whether the thread calls jthread_exit() directly, or whether it returns, it will call jthread_exit(). Now, jthread_exit() calls pthread_self() to get its tid (a pthread_t, which these notes treat as an int), appends it to the dlist, and wakes up the garbage-collecting thread by calling pthread_cond_signal() on the condition variable. Then it exits. In this way, all the threads release their resources, since the garbage-collecting thread always calls pthread_join() on a thread after it exits.

Why do I need a dlist? Think about what happens when two threads exit more or less simultaneously. The dlist makes sure that a pthread_join() call does not get missed. We could do this without a dlist by adding a second condition variable and using a pthread_t instead of the dlist. (This is done in jthread2.c.)