CS460 Lecture notes -- Pseudo-Threads Lecture #3

Directory: /mahogany/homes/plank/cs460/notes/PThreads3

Lecture notes -- html: file:/mahogany/homes/plank/cs460/notes/PThreads3/lecture.html

Implementation of the pt library

In this lecture, we will be going over the implementation of the pseudo-threads library as implemented in pt.c. This is a fairly subtle piece of code, but once you see how it all fits together, you should see that it is relatively straightforward.

The thread struct

Each thread is represented by the following struct:

#define RUNNING 0
#define READY 1
#define BLOCKED 2
#define SLEEPING 3
#define ZOMBIE 4
#define JOINING 5

typedef struct thread {
  void (*function)();
  void *arg;
  int state;
  struct thread *joiner;
} Thread;

If the thread is currently running, its state is RUNNING. Otherwise, it is one of the following:

READY: Ready to run, but not running. A thread that is just created by pt_fork() is in this state.
BLOCKED: Blocked on a semaphore due to a gsem_P() operation.
SLEEPING: Sleeping due to a pt_sleep() call.
ZOMBIE: The thread has exited, either by returning or by calling pt_exit(). If there is no thread that has called pt_join() on the thread, or that has called pt_joinall(), then the exited thread enters the zombie state, waiting for a joiner.
JOINING: This is a thread blocked because it is waiting for another thread to exit, or it has called pt_joinall() and there are still threads in the READY or SLEEPING state.

If a thread's state is READY, BLOCKED, SLEEPING or JOINING, then the fields function and arg contain the continuation that is to be invoked when the thread is to run.

The joiner field contains a pointer to a thread that has called pt_join() on this thread. If there is no such thread, then the joiner field is NULL.

Thread id's are merely pointers to the thread's Thread struct. There is a global variable pt_self that points to the currently executing thread. It should always be the case that pt_self->state is RUNNING.

Global variables and initialization

There are a few global variables maintained by the threads system. pt_self is one. Another is a Dlist called Readyq which is a list of threads that are ready to execute. All threads on the Readyq will have states of READY.

Other global variables are the Sleepq, Joinall, thebuf and first_time. These will be explained later. Here are all the type declarations for the global variables:

Thread *pt_self = NULL;

static Dlist Readyq = NULL;
static Rb_node Sleepq = NULL;
static Thread *Joinall = NULL;
static int first_time = 1;
static jmp_buf thebuf;

Note that the only non-static global variable is pt_self because it is the only one that we let users use.

Whenever a thread routine is invoked, it first tests to see if Readyq is NULL. If so, the threads system has not been initialized yet. At this point, pt_initialize() is called to initialize the state of the system. This initialization is straightforward: The Readyq and Sleepq are initialized to be empty, and a new Thread struct is created for the currently running thread. This is put into pt_self. Note that the function and arg fields of pt_self are not touched. This is because the thread is currently running -- no continuation is necessary.

static pt_initialize()
{
  if (Readyq != NULL) {
    fprintf(stderr, "PT: Called pt_initialize twice\n");
    exit(1);
  }
  Readyq = make_dl();
  Sleepq = make_rb();
  pt_self = (Thread *) malloc(sizeof(Thread));
  pt_self->state = RUNNING;
  pt_self->joiner = NULL;
}

pt_fork()

pt_fork() is a very simple procedure. It takes a continuation (a function and its argument) as its arguments, creates a new Thread struct and sets its state to READY. Next it puts the continuation into the struct, sets the joiner to NULL and puts the thread at the end of the Readyq. Then it returns the new thread's id. When it is done, as advertised, there is a new thread ready to execute on the ready queue.

void *pt_fork(function, arg)
void (*function)();
void *arg;
{
  Thread *p;

  if (Readyq == NULL) pt_initialize();

  p = (Thread *) malloc(sizeof(Thread));
  p->state = READY;
  p->function = function;
  p->arg = arg;
  p->joiner = NULL;

  dl_insert_b(Readyq, p);
  return (void *) p;
}
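
To make the ``fork just queues a continuation'' idea concrete, here is a minimal sketch that does not use the pt library at all. Everything in it (SkThread, sk_fork(), sk_run_all(), sk_record(), sk_log) is a hypothetical name that I made up for illustration, and a plain singly linked list stands in for the Dlist. It only assumes what the text above says: a forked thread is appended to the end of a FIFO ready queue, and nothing runs until the queue is processed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the Thread struct, plus a queue link. */
typedef struct sk_thread {
  void (*function)(void *);
  void *arg;
  int state;                  /* 0 == RUNNING, 1 == READY, as in the notes */
  struct sk_thread *next;
} SkThread;

static SkThread *sk_head = NULL, *sk_tail = NULL;

/* Append a new READY thread to the end of the queue; nothing runs yet. */
SkThread *sk_fork(void (*function)(void *), void *arg)
{
  SkThread *p;

  assert(function != NULL);
  p = malloc(sizeof *p);
  if (p == NULL) return NULL;
  p->function = function;
  p->arg = arg;
  p->state = 1;
  p->next = NULL;
  if (sk_tail == NULL) sk_head = p; else sk_tail->next = p;
  sk_tail = p;
  return p;
}

/* Run each queued continuation once, in FIFO order; return how many ran. */
int sk_run_all(void)
{
  int n = 0;

  while (sk_head != NULL) {
    SkThread *p = sk_head;
    sk_head = p->next;
    if (sk_head == NULL) sk_tail = NULL;
    p->state = 0;
    p->function(p->arg);
    free(p);
    n++;
  }
  return n;
}

/* Example continuation: appends the character that arg points to. */
char sk_log[16];
static int sk_len = 0;

void sk_record(void *arg)
{
  if (sk_len < 15) sk_log[sk_len++] = *(char *) arg;
  sk_log[sk_len] = '\0';
}
```

If you fork sk_record three times with pointers to 'a', 'b' and 'c', sk_log stays empty until sk_run_all() is called, and the continuations then run in the order in which they were forked.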

The subtle code -- block_myself()

Next, we go over block_myself(). This is the core of the pt library. It is called by a thread whenever the thread has to relinquish the CPU. Its function is to get a thread off of the ready queue and execute it. It does a little more than that, which we'll explain later.

It is assumed that the blocking thread has already set its state appropriately, and it has stored itself in the proper data structures. For example, if the thread has blocked on a semaphore, it is assumed that it has put itself into the semaphore's blocked threads queue and set its state to BLOCKED. Therefore block_myself() does not have to do any bookkeeping on the currently blocked thread.

A very simple strategy for block_myself() would be to take a thread off the ready queue, set its state to RUNNING, put it into pt_self and call the continuation. The problem with this is stack space. If we keep recursively calling continuations from block_myself(), our stack may grow without bounds, and one of the neat features of the pt library is its stacklessness.

The solution to this problem is to use setjmp()/longjmp(). Specifically, the first time that block_myself() is called, it calls setjmp(thebuf). Whenever it is called again, it calls longjmp(thebuf, 1), which returns control to that first setjmp() and discards every stack frame above the first call to block_myself(). This is exactly what we need. (It is legal because that first call to block_myself() never returns, so the frame saved in thebuf is always still live.) Thus, whenever a thread blocks, it calls longjmp() to pop all its frames off the stack. This is how we get ``stackless'' threads.

static block_myself()
{
  /* Variable declarations */

  if (Readyq == NULL) pt_initialize();

  if (first_time) {
    first_time = 0;
    setjmp(thebuf);
  } else {
    longjmp(thebuf, 1);
  }

  ...

Now, the code for taking a thread off the ready queue and running it is straightforward:

...
  if (!dl_empty(Readyq)) {
    d = Readyq->flink;
    p = (Thread *) d->val;
    dl_delete_node(d);
    pt_self = p;
    p->state = RUNNING;
    (*p->function)(p->arg);
...

Note that I don't show what happens when a thread returns or when there are no threads left in the ready queue. I'll get to that later.
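
If the setjmp()/longjmp() trick still seems mysterious, the following stand-alone sketch may help. It is not the pt code: dispatch_buf, reschedule(), count_down() and run_countdown() are hypothetical names, and the ``ready queue'' is just a single saved continuation. A step that wants to continue stores its continuation and longjmp()s back to the dispatcher rather than calling the continuation directly, so the stack depth stays constant no matter how many steps run.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

static jmp_buf dispatch_buf;       /* plays the role of thebuf */
static void (*next_fn)(int);       /* the one saved continuation */
static int next_arg;
static int steps_run;

/* Save a continuation, then pop every frame above the dispatcher. */
static void reschedule(void (*fn)(int), int arg)
{
  assert(fn != NULL);
  next_fn = fn;
  next_arg = arg;
  longjmp(dispatch_buf, 1);
}

/* One step of work; it never calls itself directly. */
static void count_down(int n)
{
  steps_run++;
  if (n > 0) reschedule(count_down, n - 1);
  /* Falling off the end means there is nothing left to do. */
}

/* The dispatcher: control lands just after setjmp() once per step. */
int run_countdown(int n)
{
  steps_run = 0;
  next_fn = count_down;
  next_arg = n;

  setjmp(dispatch_buf);

  if (next_fn != NULL) {
    void (*fn)(int) = next_fn;
    int arg = next_arg;
    next_fn = NULL;
    fn(arg);
  }
  return steps_run;
}
```

Calling run_countdown(100000) runs 100001 steps, yet there is never more than one count_down() frame on the stack. The longjmp() is safe here for the same reason as in block_myself(): run_countdown() has not returned when it happens.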

Semaphores

Semaphores are the main synchronization construct in the pt library. Their implementation is straightforward -- they have a value, and a dlist of blocked threads.

typedef struct gsem {
  int val;
  Dlist queue;
} *Gsem;

make_gsem() is straightforward. It allocates a Gsem struct, initializes its value from its argument, creates an empty dlist for queue, and returns the Gsem to the user as a (void *):

void *make_gsem(initval)
int initval;
{ 
  Gsem g;

  if (initval < 0) {
    fprintf(stderr, "make_gsem: initval < 0 (%d)\n", initval);
    exit(1);
  }
  g = (Gsem) malloc(sizeof(struct gsem));
  g->val = initval;
  g->queue = make_dl();
  return g;
}

gsem_P() is a potentially blocking call, so it cannot return. Instead, it sets up the system to call its continuation when it is done being blocked. Here's exactly how it works. First the value of the semaphore is decremented. If that value is less than zero, the thread must be blocked. Therefore, the continuation in pt_self is set to the arguments of gsem_P(), and pt_self is inserted into the queue. Then block_myself() is called, which will execute the first thread on the ready queue. This is the first example of a thread being blocked. It can only be unblocked by another thread calling gsem_V().

If the value of the semaphore is greater than or equal to zero, then the thread does not have to be blocked. However, gsem_P() still cannot return. Instead, its continuation must be called. Rather than calling it directly in gsem_P(), pt_self is put at the beginning of the ready queue and block_myself() is then called. This means that the continuation is indeed called, but not until the stack is reset in block_myself(). Make sure you understand how this works.

gsem_P(g, function, arg)
Gsem g;
void (*function)(); 
void *arg;
{     
  Thread *p;

  if (Readyq == NULL) pt_initialize();

  g->val--;

  p = pt_self;
  p->function = function;
  p->arg = arg;

  /* If blocking, put the continuation on the semaphore's queue, otherwise
     put the continuation on the front of the ready_queue, and call
     block_myself().  The reason for this is to pop off all the stack
     frames and start anew */

  if (g->val < 0) {
    dl_insert_b(g->queue, p);
    p->state = BLOCKED;
    if (debug) fprintf(stderr, "0x%x: blocking on semaphore 0x%x\n",
                       pt_self, g);
  } else {
    dl_insert_a(Readyq, p);
    p->state = READY; /* This is not really necessary, since it's going
                         on the head of the queue */
    if (debug) fprintf(stderr, "0x%x: P called but no blocking on 0x%x\n",
                       pt_self, g);
  }
  block_myself();
}

gsem_V() is more straightforward. It increments the semaphore's value, and if that is less than or equal to zero, then there is a thread on the queue that needs to be awakened. It does this by removing the first thread from the queue, and putting it at the end of the ready queue. It then returns to its caller.

gsem_V(g)
Gsem g;
{
  Thread *p;
  Dlist d;

  if (Readyq == NULL) pt_initialize();

  g->val++;

  /* If g->val <= 0, unblock a thread */

  d = g->queue;
  if (g->val <= 0) {
    d = d->flink;
    p = (Thread *) d->val;
    dl_delete_node(d);
    dl_insert_b(Readyq, p);
    p->state = READY;
    if (debug) fprintf(stderr, "0x%x: V called on  0x%x -- waking up 0x%x\n",
                       pt_self, g, p);
  } else {
    if (debug) fprintf(stderr, "0x%x: V called on  0x%x no one to wake\n",
                       pt_self, g);
  }
}
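
The counting rule behind gsem_P() and gsem_V() is easy to check in isolation. Below is a small model of it, with the threads system stripped away. SkSem, sk_sem_init(), sk_sem_P() and sk_sem_V() are hypothetical names, threads are plain integer ids, and a fixed-size circular buffer replaces the Dlist. The invariant to notice is that whenever val is negative, -val is the number of waiting threads.

```c
#include <assert.h>

#define SK_MAXWAIT 8

typedef struct {
  int val;                   /* may go negative: -val == number of waiters */
  int waiters[SK_MAXWAIT];   /* FIFO of hypothetical thread ids */
  int head, count;
} SkSem;

void sk_sem_init(SkSem *s, int initval)
{
  assert(initval >= 0);
  s->val = initval;
  s->head = 0;
  s->count = 0;
}

/* Returns 1 if the caller must block, 0 if it may continue. */
int sk_sem_P(SkSem *s, int tid)
{
  s->val--;
  if (s->val < 0) {
    assert(s->count < SK_MAXWAIT);
    s->waiters[(s->head + s->count) % SK_MAXWAIT] = tid;
    s->count++;
    return 1;
  }
  return 0;
}

/* Returns the id of the thread to wake, or -1 if nobody was waiting. */
int sk_sem_V(SkSem *s)
{
  int tid = -1;

  s->val++;
  if (s->val <= 0) {
    tid = s->waiters[s->head];
    s->head = (s->head + 1) % SK_MAXWAIT;
    s->count--;
  }
  return tid;
}
```

With an initial value of 1, the first P succeeds and the next two block. Three V calls then wake the two blocked threads in FIFO order, find no one to wake on the third call, and leave the value back at 1.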

Thread exiting and joining

There are two ways that a thread can perform a join operation. The first is with pt_join(), which blocks until a particular thread is done. The second is with pt_joinall(), which blocks until there are no more threads that can run. We'll start with pt_joinall(). All that it does is set the global variable Joinall to point to the calling thread, set its continuation, and then block by calling block_myself():

pt_joinall(function, arg)
void (*function)();
void *arg;
{
  if (Readyq == NULL) {
    pt_initialize();
  }

  pt_self->function = function;
  pt_self->arg = arg;
  pt_self->state = JOINING;
  Joinall = pt_self;
  block_myself();
}

pt_join() is a little trickier. There are two cases that it must worry about. The first is if the thread with which it wants to join (I'll call it the joinee) has not exited yet. In such a case, the current thread (the joiner) must block. Thus, it sets its continuation. It also needs to set itself up so that when the joinee exits, it can unblock the joiner. This is done by setting the joiner field in the joinee's thread struct.

The second case is if the joinee has already exited. In this case, the joinee's state will be set to ZOMBIE. If so, the joinee's thread struct is freed, and the joiner puts itself at the beginning of the ready queue (as in gsem_P() above).

In either case, pt_join() ends by calling block_myself().

pt_join(thread, function, arg)
Thread *thread;
void (*function)();
void *arg;
{
  int fnd;
  Rb_node r;

  if (Readyq == NULL) pt_initialize(); 

  if (thread->joiner != NULL) {
    fprintf(stderr, "Called pt_join on a thread twice\n");
    exit(1);
  }

  /* If the thread is a zombie -- free it and go directly to the
     continuation */

  pt_self->function = function;
  pt_self->arg = arg;

  if (thread->state == ZOMBIE) {
    free(thread);
    pt_self->state = READY; /* Unnecessary -- see P() */
    dl_insert_a(Readyq, pt_self);
 
  /* Otherwise, block the thread as joining */

  } else {
    thread->joiner = pt_self;
    pt_self->state = JOINING;
  }

  block_myself();
}

Finally, pt_exit() is called when a thread wants to exit. It is also called in block_myself() when a continuation returns because that means that the thread should exit. It performs one of three actions:

If it has a joiner defined, it wakes up the joiner by putting it at the end of the ready queue. Then the thread frees its thread struct.
If there is no joiner, but a Joinall thread exists, it simply frees itself.
Otherwise, it sets its state to ZOMBIE.

In all three cases, the last action performed is to call block_myself().

pt_exit()
{
  Thread *p;

  /* If there is a joiner, put it back on the ready queue and free yourself.
     Otherwise, become a zombie */

  if (pt_self->joiner != NULL) {
    p = pt_self->joiner;
    p->state = READY;
    dl_insert_b(Readyq, p);
    free(pt_self);
    block_myself();
  } else if (Joinall != NULL) {
    free(pt_self);
    block_myself();
  } else {
    pt_self->state = ZOMBIE;
    block_myself();
  }
}
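
The join/exit handshake has to work no matter which side arrives first, and that is easier to see with the scheduling removed. The sketch below models only the state changes. SkThread, sk_join(), sk_exit() and the SK_FREED marker are hypothetical; SK_FREED stands in for the call to free() so that the result can be inspected afterwards.

```c
#include <assert.h>
#include <stddef.h>

enum { SK_RUNNING = 0, SK_READY = 1, SK_ZOMBIE = 4, SK_JOINING = 5,
       SK_FREED = 99 };      /* SK_FREED models free(); it is not a pt state */

typedef struct sk_thread {
  int state;
  struct sk_thread *joiner;
} SkThread;

/* Model of pt_join(): returns 1 if self must wait for target to exit,
   0 if target was already a zombie and self can continue at once. */
int sk_join(SkThread *self, SkThread *target)
{
  assert(target->joiner == NULL);    /* a thread may only be joined once */
  if (target->state == SK_ZOMBIE) {
    target->state = SK_FREED;
    self->state = SK_READY;
    return 0;
  }
  target->joiner = self;
  self->state = SK_JOINING;
  return 1;
}

/* Model of pt_exit(): returns the joiner that was made READY, or NULL. */
SkThread *sk_exit(SkThread *self, int joinall_pending)
{
  SkThread *j = self->joiner;

  if (j != NULL) {                   /* case 1: wake the joiner */
    j->state = SK_READY;
    self->state = SK_FREED;
    return j;
  }
  /* case 2: a joinall is pending, so no one will ask for this thread;
     case 3: otherwise wait around as a zombie for a later pt_join() */
  self->state = joinall_pending ? SK_FREED : SK_ZOMBIE;
  return NULL;
}
```

If the join comes first, the joiner ends up JOINING until sk_exit() hands it back as READY. If the exit comes first, the exiting thread becomes a ZOMBIE, and the later sk_join() reclaims it without waiting.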

Sleeping

You can't just call sleep() to implement sleeping threads, because sleep() suspends the entire process, and thus other threads would not be able to execute. Instead, we maintain a red-black tree called the ``sleep queue''. This holds sleeping threads, and is indexed on the time_t value of when the thread should awaken. Thus, pt_sleep() simply initializes this value for the thread, puts it on the sleep queue, and calls block_myself(). We also defined pt_sleep() so that if it is called with a non-positive value, it works like pt_yield(). In such a case, the thread is simply put at the end of the ready queue:

pt_sleep(sec, function, arg)
int sec;
void (*function)();
void *arg;
{
  long t;
  Thread *p;

  if (Readyq == NULL) pt_initialize();

  p = pt_self;
  p->function = function;
  p->arg = arg;

  if (sec <= 0) {
    dl_insert_b(Readyq, p);
    p->state = READY;
  } else {
    t = time(0)+sec;
    rb_inserti(Sleepq, t, p);
    p->state = SLEEPING;
  }
  block_myself();
}

Now, sleeping threads are awakened in block_myself(). Before it processes the ready queue, it checks the current time against the sleep queue, and puts all threads that should be awakened into the ready queue. The code is below:

block_myself()
{
  ...
  if (!rb_empty(Sleepq)) {
    t = time(0);
    while(!rb_empty(Sleepq) && rb_first(Sleepq)->k.ikey <= t) {
      p = (Thread *) (rb_first(Sleepq)->v.val);
      p->state = READY;
      dl_insert_b(Readyq, p);
      rb_delete_node(rb_first(Sleepq));
    }
  }
  ...
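
Here is the same ``wake everyone whose time has come'' loop as a stand-alone sketch. It does not use the red-black tree library: a small array kept in ascending order plays the part of the Sleepq, and SkSleepq, sk_sleep_insert() and sk_wake_due() are hypothetical names. Times are passed in as plain longs rather than read from time(), so that the behavior is deterministic.

```c
#include <assert.h>

#define SK_MAXSLEEP 16

typedef struct {
  long wake[SK_MAXSLEEP];    /* wake-up times, kept in ascending order */
  int tid[SK_MAXSLEEP];      /* hypothetical thread ids, parallel to wake[] */
  int count;
} SkSleepq;

/* Insert in sorted position; equal times go after existing entries. */
void sk_sleep_insert(SkSleepq *q, long when, int id)
{
  int i;

  assert(q->count < SK_MAXSLEEP);
  i = q->count;
  while (i > 0 && q->wake[i - 1] > when) {
    q->wake[i] = q->wake[i - 1];
    q->tid[i] = q->tid[i - 1];
    i--;
  }
  q->wake[i] = when;
  q->tid[i] = id;
  q->count++;
}

/* Copy every thread whose wake time is <= now into ready[], remove
   those entries from the queue, and return how many there were. */
int sk_wake_due(SkSleepq *q, long now, int *ready)
{
  int n = 0, i;

  while (n < q->count && q->wake[n] <= now) {
    ready[n] = q->tid[n];
    n++;
  }
  for (i = n; i < q->count; i++) {
    q->wake[i - n] = q->wake[i];
    q->tid[i - n] = q->tid[i];
  }
  q->count -= n;
  return n;
}
```

Because the queue is ordered by wake-up time, the loop can stop at the first entry that is still in the future. The rb_first() test in the loop above relies on the same property.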

Tying it all together

Finally, below is the full code for block_myself. In addition to the things described above, it does the following:

Calls pt_exit() when a continuation returns.
If there are no threads on the ready queue, but there are threads on the sleep queue, it calls sleep() for the requisite amount of time so that the first thread on the sleep queue can be awakened.
If there are no threads on the ready or sleep queues, it unblocks the Joinall thread if there is one. Otherwise, it prints out ``No more threads to run'' and exits the program.

static block_myself()
{
  Dlist d;
  Thread *p;
  void (*function)();
  void *arg;
  long t;

  if (Readyq == NULL) pt_initialize();

  /* Always longjmp down to pop all thread frames off the stack */

  if (first_time) {
    first_time = 0;
    setjmp(thebuf);
  } else {
    longjmp(thebuf, 1);
  }

  /* If the sleep queue is not empty, check to see if any sleepq
     elements should come off of the queue */

  if (!rb_empty(Sleepq)) {
    t = time(0);
    while(!rb_empty(Sleepq) && rb_first(Sleepq)->k.ikey <= t) {
      p = (Thread *) (rb_first(Sleepq)->v.val);
      p->state = READY;
      dl_insert_b(Readyq, p);
      rb_delete_node(rb_first(Sleepq));
    }
  }

  /* Call the first thread on the ready queue */

  if (!dl_empty(Readyq)) {
    d = Readyq->flink;
    p = (Thread *) d->val;
    function = p->function;
    arg = p->arg;
    dl_delete_node(d);
    pt_self = p;
    p->state = RUNNING;
    (*function)(arg);

    /* If the function returns, the thread should exit */

    pt_exit();

  }

  /* Otherwise, if there are sleepers, sleep until one of them is ready */
  else if (!rb_empty(Sleepq)) {
    t = rb_first(Sleepq)->k.ikey-t;
    sleep(t);
    block_myself();
  }

  /* Otherwise, there are no more threads to run.  If there is
     a joinall continuation, call it.  Otherwise, exit */

  if (Joinall != NULL) {
    p = Joinall;
    p->state = READY;
    dl_insert_b(Readyq, p);
    Joinall = NULL;
    block_myself();
  }

  fprintf(stderr, "No more threads to run\n");
  exit(0);
}