CS560 Lecture notes -- KThreads Lecture #2

  • Jim Plank, Rich Wolski
  • CS560: Operating Systems
  • Directory: /home/plank/cs560/notes/KThreads2
  • Lecture notes -- html: http://web.eecs.utk.edu/~jplank/plank/classes/cs560/560/notes/KThreads2/lecture.html
    This lecture goes over the implementation of KThreads. It may be sketchy, but I'll try to beef it up over time.

    For a threads package, the structure of the KThreads library is pretty simple. The entire implementation is in the file /home/plank/cs560/560/src/kthreads/kt.c. Yes, it's 700+ lines, but that's not bad.

    The implementation is, unfortunately, not architecture-independent. This is because you have to have some machine-dependent code when you build a new stack. There's no way around it. This code works for Solaris and Linux. The machine-dependent code is bundled up in #ifdef statements.


    Thread Structure

    The structure of the system is as follows. Each thread is represented by a struct kt_str data structure. Its definition is as follows:
    struct kt_str
    {
            void (*func)();         /* function to call */
            void *arg;              /* arg to pass that function */
            int tid;                /* Unique thread id */
            int state;              /* queue state */
            JRB blocked_list;       /* list I'm blocked on */
            JRB blocked_list_ptr;   /* pointer to my node on the list */
            jmp_buf jmpbuf;         /* stack/PC state */
            jmp_buf exitbuf;        /* stack/PC state for immediate exit */
            char *stack;            /* stack pointer */
            int stack_size;         /* stack_size */
            unsigned int wake_time; /* if I'm sleeping, should I wake now? */
            int die_now;
            Ksem ks;                /* in case I get killed */
    };
    
    typedef struct kt_str *K_t;
    
    The first two variables should need no description. Tid is a unique thread id, which is a positive integer. State can take on one of six values: RUNNING, RUNNABLE, BLOCKED, SLEEPING, STARTING, or DEAD. The other variables will be described later.
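
    As a rough illustration, the state names that show up in the code in these notes could be declared like this. The enum, its numeric values, and the helper kt_state_name() are assumptions made for this sketch; kt.c may well define the states differently (with #define, for example).

```c
/* Hypothetical encoding of the thread states; the actual values
   used in kt.c are not shown in these notes. */
enum kt_state { RUNNING, RUNNABLE, BLOCKED, SLEEPING, STARTING, DEAD };

/* Return a printable name for a state (handy when debugging). */
const char *kt_state_name(int state)
{
        static const char *names[] = {
                "RUNNING", "RUNNABLE", "BLOCKED",
                "SLEEPING", "STARTING", "DEAD"
        };

        if (state < RUNNING || state > DEAD) return "UNKNOWN";
        return names[state];
}
```

    A debugging printout could then use kt_state_name(kt->state) instead of a raw integer.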


    Queues and Trees

    There are seven data structures that hold threads:
    K_t ktRunning;                  /* running thread */
    Dllist ktFree_me;               /* threads waiting to be freed */
    Dllist ktRunnable;               /* ready queue - threads ready to run */
    JRB ktBlocked;                  /* blocked on join or semaphores */
    JRB ktSleeping;                 /* sleeping threads */
    JRB ktActive;                   /* searchable list of active threads */
    K_t ktOriginal;                 /* the main program thread */
    
    Every active thread is held in the ktActive tree. The key is the thread's id, which we call its tid. The val is a pointer to the thread's struct. A thread becomes inactive and is taken off the tree when it exits.

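    Obviously, the currently RUNNING thread is held in ktRunning, and all the RUNNABLE threads are held in the list ktRunnable. Threads that are BLOCKED are all in the tree ktBlocked. The keys of the nodes in ktBlocked fall into three classes: zero, for the thread blocked in kt_joinall(); a positive integer, which is the tid of the thread that the blocked thread is joining with; and a negative integer, which is the id of the semaphore that the thread is blocked on.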
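
    The key ranges can be captured in a small helper, which is handy when you traverse ktBlocked while debugging. This is only a sketch: the enum and function names are hypothetical and do not appear in kt.c. The ranges follow the code later in these notes, where kt_joinall() blocks on key 0, kt_join() on a positive tid, and P_kt_sem() on a negative sid.

```c
/* Hypothetical classification of a key in the ktBlocked tree. */
enum blocked_kind { BLOCKED_JOINALL, BLOCKED_JOIN, BLOCKED_SEM };

enum blocked_kind classify_blocked_key(int key)
{
        if (key == 0) return BLOCKED_JOINALL;  /* the joinall thread */
        if (key > 0)  return BLOCKED_JOIN;     /* key is the tid being joined */
        return BLOCKED_SEM;                    /* key is a (negative) sid */
}
```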

    ktSleeping is a tree of sleeping threads. The key is the time (a time_t) when the thread should wake up. Therefore, the tree is sorted in the order in which the sleeping threads should wake up. Obviously, multiple threads may wake up at the same time.

    ktFree_me is a list of threads that need to be freed. They are put on a list for a subtle reason. We'll get back to that.

    Finally, ktOriginal is a pointer to the main program thread. This thread is special because we do not want to clean up its stack when it dies.


    Initializing the System

    There is a global variable KtInit_d that is originally set to zero. Every KThread call first calls InitKThreadSystem(), which checks this variable. If it is one, then the system has been initialized already. If not, then we initialize the system and set KtInit_d to one. There's little magic here:
    void InitKThreadSystem()
    {
            if(KtInit_d) return; 
    
            ktActive = make_jrb();
            ktRunnable = new_dllist();
            ktFree_me = new_dllist();
            ktBlocked = make_jrb();
            ktSleeping = make_jrb();
            ktThread_count = 0;
            ktTidCounter = 1;
            ktSidCounter = -1;
    
            ktOriginal = InitKThread(0,NULL,NULL);
            ktRunning = ktOriginal;
    
            KtInit_d = 1;
    
            return;
    }
    
    A couple of things -- ktThread_count keeps track of the number of active threads. It is only used for debugging. The two counters ktTidCounter and ktSidCounter are so that the system can assign unique thread id's and semaphore id's. Next, we create a K_t for the main thread, set ktRunning to it, and return.
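
    The same "initialize on first use" pattern, together with the two id counters, can be tried out in isolation. The following is a minimal sketch with made-up names (init_ids(), next_tid(), next_sid()); it only mirrors the idea that tids count up from 1 and sids count down from -1, and is not the code in kt.c.

```c
static int ids_initialized = 0;  /* plays the role of KtInit_d */
static int tid_counter;          /* next thread id to hand out */
static int sid_counter;          /* next semaphore id to hand out */

/* Safe to call any number of times; only the first call does work. */
void init_ids(void)
{
        if (ids_initialized) return;

        tid_counter = 1;     /* thread ids are positive */
        sid_counter = -1;    /* semaphore ids are negative */
        ids_initialized = 1;
}

int next_tid(void)
{
        init_ids();
        return tid_counter++;
}

int next_sid(void)
{
        init_ids();
        return sid_counter--;
}
```

    Because every entry point calls init_ids() first, the caller never has to remember to initialize anything, which is the same reason every KThread call starts with InitKThreadSystem().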

    The Scheduler -- Overview

    There is a routine called KtSched(). This routine is called when we want to give the CPU to another thread. Remember, this is a non-preemptive thread system, meaning that threads only give up control of the CPU voluntarily. Therefore, we only call the scheduler when a thread blocks. Specifically, in the code shown in these notes, this happens during kt_yield(), kt_sleep(), kt_join(), kt_joinall(), and P_kt_sem() (and when a thread exits). Each of these routines calls KtSched(), which switches control to another RUNNABLE thread. It is assumed that the routine calling KtSched() has already done all the necessary things for the thread to wake up when it is time (for example, if kt_sleep() has been called, the thread has already been put onto ktSleeping before KtSched() is called).

    When the thread is awakened and executed by the scheduler, it returns from KtSched(). The thread has unblocked and is now running again.

    We'll go over the implementation of the scheduler later.


    Easy Routines

    Ok -- now that you know the main data structures and how you call the scheduler, you can easily write some of the simple routines. For example, kt_self() simply returns the tid of the currently running thread:
    void *kt_self()
    {
            InitKThreadSystem();
    
            return((void *) (ktRunning->tid));
    }
    
    And kt_yield() simply puts the running thread onto the ready queue and calls the scheduler:
    void kt_yield()
    {
            InitKThreadSystem();
    
            ktRunning->state = RUNNABLE;
            dll_append(ktRunnable,new_jval_v(ktRunning));
            KtSched();
            return;
    }
    
    Kt_sleep() is pretty easy too. You simply calculate the time you want the thread to wake up, put the thread onto the sleep tree, and call the scheduler. There's a little more there too. We end up setting three extra fields in the thread's struct: blocked_list is set to be the ktSleeping tree, wake_time is set to be the time the thread should wake up, and blocked_list_ptr is set to be a pointer to the thread's node on the ktSleeping tree. This ends up making deletion easier. Here's the code:
    void kt_sleep(int secs)
    {
            int until = time(0)+secs;
    
            InitKThreadSystem();
    
            SleepKThread(ktRunning,until);
            KtSched();
            return;
    }
    
    void SleepKThread(K_t kt, int until)
    {
            kt->state = SLEEPING;
            kt->blocked_list = ktSleeping;
            kt->wake_time = until;
            kt->blocked_list_ptr = jrb_insert_int(ktSleeping,until,new_jval_v(kt));
            return;
    }
    
    Kt_join() is similar. It sets the thread's state to BLOCKED, puts the thread onto the ktBlocked tree, and calls the scheduler. Once again, there is a little more than that. If the tid does not exist in ktActive, then we assume that the thread to which we are joining has already exited and we return. Then we look up the tid in the ktBlocked tree. If it's there already, then there is another thread joining with that thread, so we flag an error. Otherwise, the code is similar to kt_sleep():
    void kt_join(void *i_join)
    {
            K_t me;
            JRB target;
            int tid;
    
            InitKThreadSystem();
    
            tid = (int) i_join;
    
            if (tid <= 0) {
              fprintf(stderr, "kt_join() -- bad argument\n");
              exit(1);
            }
    
            target = jrb_find_int(ktActive,tid);
            if(target == NULL) return;
    
            if (jrb_find_int(ktBlocked, tid) != NULL) {
              fprintf(stderr, "Called kt_join on a thread twice\n");
              exit(1);
            }
    
            BlockKThread(ktRunning,tid);
    
            KtSched();
    
            return;
    }
    
    void BlockKThread(K_t kt, int key)
    {
            kt->state = BLOCKED;
            kt->blocked_list = ktBlocked;
            kt->blocked_list_ptr = jrb_insert_int(ktBlocked,key,new_jval_v(kt));
            return;
    }
    
    And once you know kt_join(), kt_joinall() is easy too:
    void kt_joinall()
    {
            InitKThreadSystem();
    
            if(jrb_find_int(ktBlocked,0) != NULL) {
              fprintf(stderr, "Error: two joinall threads\n");
              exit(1);
            }
            BlockKThread(ktRunning,0);
            KtSched();
            return;
    }
    

    Semaphores

    Ok -- now is also a good time to talk about semaphores. Our semaphore struct has two fields: the value, and an id.
    struct kt_sem_str
    {
            int val;                /* The value */
            int sid;                /* Unique id */
    };
    typedef struct kt_sem_str *Ksem;
    
    Semaphore id's are negative integers -- this is so that threads blocked on semaphores and threads blocked on join calls can be in the same tree (ktBlocked).

    At this point, you might ask yourself ``Why have a tree at all? Wouldn't it be more efficient to have a global variable for the joinall thread, a pointer to a joiner thread in each thread's struct, and a list of blocked threads inside each semaphore?'' The answer is, yes, it would be more efficient, and it would work. However, this structure makes for easier debugging, since you can traverse a single data structure (ktBlocked) to take a look at all blocked threads. If we cared more about performance, I think we would ditch the current structure in favor of one without a global tree.
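
    To make the alternative concrete, here is a minimal sketch of a semaphore that keeps its own list of blocked threads, so no global tree is needed. Everything in it is hypothetical (struct fifo_sem, the fixed-size queue, the function names); it is not how kt.c works. It uses the usual counting rule, which is also the one P_kt_sem() and V_kt_sem() follow: P() blocks when val drops below zero, and V() wakes a thread when val is still at most zero after the increment. As a side effect, a per-semaphore queue makes it easy to wake threads in FIFO order.

```c
#include <assert.h>

#define MAX_WAITERS 16   /* arbitrary bound for this sketch */

/* Hypothetical semaphore with its own FIFO queue of blocked tids. */
struct fifo_sem {
        int val;                    /* same meaning as val in kt_sem_str */
        int waiters[MAX_WAITERS];   /* circular buffer of blocked tids */
        int head;                   /* index of the oldest waiter */
        int count;                  /* number of queued waiters */
};

void fifo_sem_init(struct fifo_sem *s, int initval)
{
        s->val = initval;
        s->head = 0;
        s->count = 0;
}

/* P(): returns 1 if the calling thread (tid) must block, 0 otherwise. */
int fifo_sem_p(struct fifo_sem *s, int tid)
{
        s->val--;
        if (s->val >= 0) return 0;

        assert(s->count < MAX_WAITERS);
        s->waiters[(s->head + s->count) % MAX_WAITERS] = tid;
        s->count++;
        return 1;
}

/* V(): returns the tid to wake (the oldest waiter), or 0 if none. */
int fifo_sem_v(struct fifo_sem *s)
{
        int tid;

        s->val++;
        if (s->val > 0) return 0;

        assert(s->count > 0);
        tid = s->waiters[s->head];
        s->head = (s->head + 1) % MAX_WAITERS;
        s->count--;
        return tid;
}
```

    Since tids are positive, 0 is a safe "nobody to wake" return value. The cost of this layout is the one mentioned above: there is no longer a single structure to walk when you want to see every blocked thread.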

    Once again, make_kt_sem() is straightforward -- malloc() and initialize:

    void *make_kt_sem(int initval)
    {
            Ksem ks;
    
            InitKThreadSystem();
    
            if(initval < 0) {  /* yell at the user */ ... }
    
            ks = (Ksem)malloc(sizeof(struct kt_sem_str));
            if(ks == NULL) { /* flag a malloc error */ ... }
    
            ks->val = initval;
            ks->sid = ktSidCounter--;
    
            return((void *)ks);
    }
    
    I won't show kill_kt_sem() or kt_getval() because they are too simple.

    Now, P_kt_sem() is also simple -- you decrement the val, and if it is less than zero, block yourself on the semaphore's id. We also set ktRunning->ks to be the semaphore. This is so that we can fix the semaphore in kt_kill() if necessary. When the P() call unblocks, we clear ktRunning->ks.

    void P_kt_sem(kt_sem iks)
    {
            Ksem ks = (Ksem)iks;
            K_t me = ktRunning;
    
            InitKThreadSystem();
    
            ks->val--;
    
            if(ks->val < 0)
            {
                    ktRunning->ks = ks;
                    BlockKThread(ktRunning,ks->sid);
                    KtSched();
                    ktRunning->ks = NULL;
                    return;
            }
    
            return;
    }
    
    V_kt_sem() increments val, and if it is less than or equal to zero, it must unblock a thread that is blocked on a P() call. It does this by looking up its sid in the ktBlocked tree, removing a thread from it, and putting that thread onto the ready queue (ktRunnable). Note that this means that unblocks are not done in FIFO order.

    You'll also note that WakeKThread() uses the blocked_list_ptr field of a thread to delete it from its tree. That is why we set it when we block or sleep.

    void V_kt_sem(kt_sem iks)
    {
            Ksem ks = (Ksem)iks;
            K_t wake_kt;
    
            InitKThreadSystem();
    
            ks->val++;
    
            if(ks->val <= 0) {
                    wake_kt = jval_v(jrb_val(jrb_find_int(ktBlocked,ks->sid)));
                    WakeKThread(wake_kt);
            }
    
            return;
    }
    
    void WakeKThread(K_t kt)
    {
            if (kt->state == RUNNING || kt->state == RUNNABLE
                                     || kt->state == DEAD) {
              fprintf(stderr, "WakeKThread -- Bad thread state\n");
              exit(1);
            }
    
            jrb_delete_node(kt->blocked_list_ptr);
            kt->state = RUNNABLE;
            kt->blocked_list = NULL;
            kt->blocked_list_ptr = NULL;
            dll_append(ktRunnable,new_jval_v(kt));
            return;
    }
    
    
    That's it for semaphores.

    More difficult routines: kt_fork() and kt_exit()

    Actually, kt_fork() is pretty simple, because it delays its hard part to the scheduler. Kt_fork() creates a new K_t struct, and fills in all its fields. This is pretty easy. It allocates a stack for the thread, but does not do anything with the stack. Instead, it simply sets the thread's state to STARTING and puts it on the ready queue. When the scheduler goes about running the thread, that is when the hard work is done. Kt_fork() returns the thread's id, cast to a (void *):
    void * kt_fork(void (*func)(), void *arg)
    {
            K_t kt;
    
            InitKThreadSystem();
    
            kt = InitKThread(KT_STACK_SIZE,func,arg);
            if(kt == NULL) { /* print an error */ ... }
    
            kt->state = STARTING;
            dll_append(ktRunnable,new_jval_v(kt));
            return((void *) (kt->tid));
    }
    
    K_t InitKThread(int stack_size, void (*func)(), void *arg)
    {
            K_t kt;
            void *stack = NULL;
    
            if(stack_size > 0)
            {
                    stack = (char *)malloc(stack_size);
                    /* check before touching the memory */
                    if(stack == NULL) { /* flag a malloc error */ ... }
                    memset(stack,0,stack_size);
            }
    
            kt = (K_t)malloc(sizeof(struct kt_str));
            if(kt == NULL) { /* flag a malloc error */ ... }
    
            kt->tid = ktTidCounter++;
            kt->stack = stack;
            kt->stack_size = stack_size;
            kt->func = func;
            kt->arg = arg;
            kt->state = STARTING;
            kt->die_now = 0;
    
            ktThread_count++;
            jrb_insert_int(ktActive,kt->tid,new_jval_v(kt));
    
            return(kt);
    }
    
    Kt_exit() is also relatively straightforward, although you have to know a bit about the scheduler to understand it. When the scheduler initializes a new thread, it makes a setjmp() call to initialize the thread's exitbuf field. The purpose of this is so that when the thread calls kt_exit(), you can longjmp() to exitbuf, and then whatever state the stack was in when kt_exit() was called is cropped down to the initial stack frame (this is the frame in which setjmp() was called). Moreover, you can employ the same code to handle exiting with kt_exit() and exiting by returning from the thread's main procedure call.
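
    If the setjmp()/longjmp() trick is new to you, here is a tiny standalone sketch of the same idea with no threads involved. All of the names are made up for illustration: exit_point plays the role of exitbuf, early_exit() plays the role of kt_exit(), and run_body() plays the role of the scheduler code that starts a thread.

```c
#include <setjmp.h>

static jmp_buf exit_point;   /* plays the role of a thread's exitbuf */

/* Plays the role of kt_exit(): unwind straight back to exit_point. */
void early_exit(void)
{
        longjmp(exit_point, 1);
}

/* A body that exits from inside a nested call. */
static void nested(void) { early_exit(); }
void body_that_exits(void) { nested(); }

/* A body that simply returns. */
void body_that_returns(void) { }

/* Run body; return 0 if it returned normally, 1 if it called
   early_exit(). Either way control reaches the same spot after the
   if statement, so one piece of cleanup code serves both cases. */
int run_body(void (*body)(void))
{
        /* volatile is a conservative choice for locals used around setjmp() */
        volatile int exited_early = 0;

        if (setjmp(exit_point) == 0) {
                body();              /* normal path: body may just return */
        } else {
                exited_early = 1;    /* we got here through longjmp() */
        }

        /* shared cleanup code would go here */
        return exited_early;
}
```

    Calling run_body(body_that_returns) yields 0 and run_body(body_that_exits) yields 1, even though the second body bails out from two calls deep. The frames of nested() and body_that_exits() are simply abandoned, which is the "cropping" described above.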

    Ok -- here is kt_exit():

    void kt_exit()
    {
            JRB tmp;
    
            InitKThreadSystem();
       
            if (ktRunning == ktOriginal) {
              ktRunning->state = DEAD;
    
              tmp = jrb_find_int(ktActive, ktRunning->tid);
              jrb_delete_node(tmp);
    
              /* If there is a thread waiting on me, wake it up */
              tmp = jrb_find_int(ktBlocked, ktRunning->tid);
              if (tmp != NULL) {
                WakeKThread((K_t)tmp->val.v);
              }
    
              KtSched();
    
              /* This should never return, because this thread will never
                 be rescheduled */
    
              exit(1);
    
            } else {
              longjmp(ktRunning->exitbuf,1);
            }
    }
    
    Ok -- kt_exit() does one of two things. If the thread is the main program thread, then you don't want to clean up its stack. You just want to remove it from the system and run other threads. That is what the first part of the code does. Note, since the thread is gone from the system, the scheduler will never try to run it. For that reason, KtSched() never returns.

    If the thread is not the main thread, then we longjmp() to the exitbuf and let that code clean up the thread.


    And now... The Scheduler

    Ok -- now for the big piece of code -- the scheduler. We'll go through it in pieces. The first part is to call setjmp() and save the state of the calling thread. Remember, the calling thread is calling KtSched() to block itself and run another thread. The call is expected to return when the calling thread is unblocked. This first piece of code sets it up so that when this thread is made runnable again, it can start running by a call to longjmp(ktRunning->jmpbuf).
    void KtSched()
    {
            K_t kt;
            Jval j_kt;
            JRB jb;
            unsigned int sp;
            unsigned int now;
            Dllist dtmp;
            JRB tmp;
    
            if(setjmp(ktRunning->jmpbuf) != 0)
            {
                    FreeFinishedThreads();
                    if(ktRunning->die_now) kt_exit();
    
                    return;
            }
    
    Ok -- now when this thread is rescheduled (i.e. when setjmp() returns a non-zero value), it does two things before returning from the KtSched() call. First, it calls FreeFinishedThreads(), to deallocate the stacks of any threads on the ktFree_me list. Next, it checks to see if it was killed by a kt_kill() call. If so, it exits. Otherwise, it returns to the caller of KtSched().

    Next, we have one of the evils of C programming: a label for a goto statement. Hopefully, it has been drilled into your heads that you should never do this. I agree. However, in this case, it does make the code a bit easier, especially since you may not want to use procedures to ease your flow of control, since you are messing with the stack. So, we're allowing it in this one case.

    After the label, we wake up sleeping threads by checking the current time against their wake time (wake_time). The first check of ktSleeping is so that we do not burn the overhead of a system call (time(0)) when there are no sleeping threads.

    start:
    
            if (!jrb_empty(ktSleeping)) {
              now = time(0);
              while(!jrb_empty(ktSleeping))
              {
                    kt = (K_t) jval_v(jrb_val(jrb_first(ktSleeping)));
                    if(kt->wake_time > now) break;
                    WakeKThread(kt);
              }
            }
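
    The reason this loop can stop at the first thread that is not yet due is that ktSleeping is sorted by wake time. Here is the same logic as a minimal sketch on a sorted array instead of a red-black tree. The function name and the array representation are assumptions for illustration only.

```c
/* wake_times[] is sorted in ascending order, like an in-order walk
   of ktSleeping. Returns how many leading entries are due at time
   now, i.e. how many threads the scheduler loop would wake. */
int count_due(const unsigned int *wake_times, int n, unsigned int now)
{
        int i = 0;

        /* same test as the scheduler: stop once wake_time > now */
        while (i < n && wake_times[i] <= now) i++;
        return i;
}
```

    With wake times {5, 5, 7, 9}, nothing is due at time 4, two threads are due at time 5, and three are due at time 8.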
    
    Next, we check the ready queue. If it is empty, we have one of three options: if there are sleeping threads, sleep until the first one is ready to wake up. If there are no sleeping threads, but a joinall thread, then wake up the joinall thread. Finally, if there are no sleeping threads and no joinall thread, then exit. Here's that code. Note, if we don't exit, we go back to the start label to try again:
            if(dll_empty(ktRunnable)) {
    
                    /* first, check for sleepers and deal with them */
    
                    if(!jrb_empty(ktSleeping)) {
                            kt = jval_v(jrb_val(jrb_first(ktSleeping)));
                            sleep(kt->wake_time - now);
                            goto start;
                    }
    
                    /* next, see if there is a joinall thread waiting */
    
                    jb = jrb_find_int(ktBlocked,0);
                    if(jb != NULL) {
                            WakeKThread((K_t)jval_v(jrb_val(jb)));
                            goto start;
                    }
    
                    /* Otherwise, exit with a value depending on whether there
                       are blocked threads */
    
                    if(!jrb_empty(ktBlocked)) {
                            exit(1);
                    } else {
                            exit(0);
                    }
            }
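
    The order of those checks matters, so it may help to see the decision by itself. This is a sketch of just the decision, with hypothetical names (enum idle_action, choose_idle_action()); the real code acts directly on the queues rather than returning a value.

```c
/* What the scheduler does when the ready queue is empty. */
enum idle_action {
        IDLE_SLEEP,          /* sleep until the first sleeper is due */
        IDLE_WAKE_JOINALL,   /* wake the thread blocked in kt_joinall() */
        IDLE_EXIT_BLOCKED,   /* exit(1): threads are still blocked */
        IDLE_EXIT_OK         /* exit(0): nothing left at all */
};

enum idle_action choose_idle_action(int n_sleeping, int has_joinall,
                                    int n_blocked)
{
        if (n_sleeping > 0) return IDLE_SLEEP;
        if (has_joinall)    return IDLE_WAKE_JOINALL;
        if (n_blocked > 0)  return IDLE_EXIT_BLOCKED;
        return IDLE_EXIT_OK;
}
```

    Note that sleepers win over the joinall thread: a joinall thread is only awakened once there is nothing runnable and nothing sleeping.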
    
    If we've gotten this far in the code, there is a thread that is ready to run. Take it off the runnable queue, and if it is a RUNNABLE thread, run it. Note, this longjmp() call will return to the beginning of KtSched() on the new thread's stack.
            dtmp = dll_first(ktRunnable);
            kt = (K_t) dtmp->val.v;
            dll_delete_node(dtmp);
    
            if(kt->state == RUNNABLE) {
    
                    ktRunning = kt;
                    ktRunning->state = RUNNING;
                    longjmp(ktRunning->jmpbuf,1);
                    /* This doesn't return */
            }
    
    
    Now, here comes the grungy code. The only other kind of thread that can be on the ready queue is a thread whose state is STARTING. In this thread, all we have is a clean stack and its calling parameters (func, arg). Here's how we get it going. We first call setjmp():
            if(kt->state == STARTING)
            {
                    if(setjmp(kt->jmpbuf) == 0)
                    {
    
    This saves the state of the registers. Note, the sp and fp registers will point to the current stack, and not to the new stack. We need to change that so that they do. The next code does this.
                            /*
                             * get double word aligned SP -- stacks grow from high
                             * to low
                             */
                            sp = (unsigned int)&((kt->stack[kt->stack_size-1]));
                            while((sp % 8) != 0)
                                    sp--;
    #ifdef LINUX
                            /*
                             * keep double word aligned but put in enough
                             * space to handle local variables for KtSched
                             */
                            kt->jmpbuf->__jmpbuf[JB_BP] = (int)sp; 
                            kt->jmpbuf->__jmpbuf[JB_SP] = (int)sp-1024;
    #endif
    #ifdef SOLARIS
                            /*
                             * keep double word aligned but put in enough
                             * space to handle local variables for KtSched
                             */
                            kt->jmpbuf[JB_FP] = (int)sp;
                            kt->jmpbuf[JB_SP] = (int)sp-1024;
    #endif
    
    The first bit of code sets sp to point to the last double-aligned word on the new stack. Then we set the frame pointer in the jmpbuf to be sp, and the stack pointer to be sp-1024. This is architecture-specific code, since each machine has a different register layout. Hence the ifdefs.
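
    The alignment step can be tried on its own. The sketch below is not the code from kt.c: the function names are made up, and it uses uintptr_t from <stdint.h> rather than unsigned int, since casting a pointer to unsigned int loses bits on machines with 64-bit pointers. It also masks off the low three bits instead of decrementing in a loop, which gives the same result.

```c
#include <stddef.h>
#include <stdint.h>

/* Round an address down to a multiple of 8. */
uintptr_t align_down8(uintptr_t addr)
{
        return addr & ~(uintptr_t)7;
}

/* Highest 8-byte-aligned address inside stack[0..size-1]. Stacks
   grow from high to low, so this is where a new stack starts. */
uintptr_t aligned_stack_top(char *stack, size_t size)
{
        return align_down8((uintptr_t)&stack[size - 1]);
}
```

    For example, align_down8(23) is 16, and an address that is already a multiple of 8 is left alone.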

    Now, here's the key to this code -- when a thread calls longjmp(kt->jmpbuf,1), the process will return from that setjmp() statement, but on the new stack, with a 1024-byte stack frame. Remember what you know about stack frames -- the code will access local variables as negative offsets from the frame pointer. Thus, this works as long as KtSched() does not use more than 1024 bytes of local variables (which it obviously does not), and as long as we do not assume that any local variables have meaningful values after the setjmp() call returns with a non-zero value. For example, if setjmp() returns 1, we cannot assume that kt holds the current thread! Why? Because kt is a local variable, and will be at some address like frame-pointer-8. If we longjmp to the clean stack, then that value will be zero. For that reason, once setjmp() returns non-zero, we have to reload local variables if we want to use them. This is very important for you to understand.

    But now let's continue with the code when setjmp() returns zero. This is when we're still running on the old stack. What we do is set kt's state, set ktRunning so that we can find it after the longjmp() (since it's a global variable), and then make the longjmp().

                            kt->state = RUNNING;
                            ktRunning = kt;
                            longjmp(kt->jmpbuf,1);
    
    This has us return from setjmp(), but on the new clean stack. Since the return value of setjmp() is one, we go straight to the else clause. Again, remember, we cannot read any values from local variables until we write to them, because our stack is empty.

    The first thing we do is clean up dead thread stacks (for example, the previous thread may have just died, so we should clean up its stack). Then we set our exitbuf with setjmp(), and call the function:

                    }
                    else {
                            FreeFinishedThreads();
    
                            if(setjmp(ktRunning->exitbuf) == 0) {
                                    /* This is only relevant if kt_kill is called */
                                    if(ktRunning->die_now == 0)
                                    {
                                            ktRunning->func(ktRunning->arg);
                                    }
                            }
    
    
    Now the thread is running. If it exits normally, then ktRunning->func(ktRunning->arg) will return. If it calls kt_exit(), then it will return from the setjmp() call with a return value of one. In either case, when the thread is done, it will be at this point in the code. Therefore, we have the thread cleanup code here.

    First, we take ourselves off the Active list:

                            jb = jrb_find_int(ktActive,ktRunning->tid);
                            if(jb == NULL) { /* Yell and scream */ ... }
                            jrb_delete_node(jb);
    
    Next, we see if there is a thread joining with us -- if so, wake it up:
                            jb = jrb_find_int(ktBlocked,ktRunning->tid);
                            if(jb != NULL)
                            {
                                    WakeKThread((K_t)jval_v(jrb_val(jb)));
                            }
    
    Now, we're done. We set our state to dead, append ourselves to the Free_me list, and try to run the next thread by going back to start:
                            FreeFinishedThreads(); /* I don't think this needs to 
                                                      be called */
                            ktRunning->state = DEAD;
                            dll_append(ktFree_me,new_jval_v(ktRunning));
                            ktRunning = NULL;
                            goto start;
                    }
            }
    
    We should never reach the rest of KtSched(). The only way we would get there is if there were a thread on the ready queue whose state was not STARTING or RUNNABLE. Flag that as an error, and we're done:
            fprintf(stderr,
                    "Error: non-STARTING or RUNNABLE thread on the ready queue\n");
            exit(1);
    }
    
    That's it. A mouthful, but in my opinion, very well-structured code. Make sure you understand it. You might see it on an exam sometime.....