CS360 Lecture notes -- Thread #3

Directory: /blugreen/homes/plank/cs360/notes/Thread3

Lecture notes: http://www.cs.utk.edu/~plank/plank/classes/cs360/360/notes/Thread3/lecture.html

In this lecture, we go over race conditions in more detail, focusing on using mutexes, and the trade-off between safety and performance.

SSNSERVER

The lecture revolves around a piece of code that maintains a database of people/ages/social security numbers. The main code is in ssnserver.c. It maintains a red-black tree (t) keyed on a person's name (in the order last, first). The val field points to an Entry struct, which contains the person's name again, his/her age, and his/her social-security number, stored as a string.
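As a rough illustration of the data described above, here is a minimal sketch of an entry and of a key built in "last, first" order. The field names and the helper make_key() are assumptions made for this example; the real definitions are in ssnserver.c and may differ.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout; the real Entry struct in ssnserver.c may differ. */
typedef struct {
    char *fn;    /* first name */
    char *ln;    /* last name */
    int   age;
    char *ssn;   /* stored as a string, e.g. "987-65-4321" */
} Entry;

/* Build a key of the form "last first", so that keys compare by last
   name and then by first name. Returns NULL if malloc() fails.
   The caller frees the result. */
char *make_key(const char *fn, const char *ln)
{
    size_t len;
    char *key;

    assert(fn != NULL && ln != NULL);
    len = strlen(ln) + 1 + strlen(fn) + 1;
    key = malloc(len);
    if (key == NULL) return NULL;
    snprintf(key, len, "%s %s", ln, fn);
    return key;
}
```

With keys built this way, a plain strcmp() ordering puts "Fulmer Phil" before "Plank Jim", which matches the order of the PRINT output.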

Ssnserver.c creates the tree and then accepts four kinds of inputs from standard input:

ADD fn ln age ssn -- This adds an entry to the tree.
DELETE fn ln -- This deletes an entry from the tree.
PRINT -- This prints the tree.
DONE -- This causes the program to exit.
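
The four inputs above could be recognized with a small parser along these lines. This is only a sketch: the Command struct, the CmdType enum, the buffer sizes, and parse_command() are names invented for this example and are not taken from ssnserver.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum { CMD_ADD, CMD_DELETE, CMD_PRINT, CMD_DONE, CMD_BAD } CmdType;

/* Hypothetical holder for one parsed input line. */
typedef struct {
    CmdType type;
    char fn[64];
    char ln[64];
    int  age;
    char ssn[16];
} Command;

/* Parse one input line into *c and return its type.
   Lines that do not match one of the four forms yield CMD_BAD. */
CmdType parse_command(const char *line, Command *c)
{
    char word[16];

    assert(line != NULL && c != NULL);
    memset(c, 0, sizeof(*c));
    c->type = CMD_BAD;

    if (sscanf(line, "%15s", word) != 1) return c->type;

    if (strcmp(word, "ADD") == 0) {
        /* %*s skips the command word; four fields must follow it */
        if (sscanf(line, "%*s %63s %63s %d %15s",
                   c->fn, c->ln, &c->age, c->ssn) == 4)
            c->type = CMD_ADD;
    } else if (strcmp(word, "DELETE") == 0) {
        if (sscanf(line, "%*s %63s %63s", c->fn, c->ln) == 2)
            c->type = CMD_DELETE;
    } else if (strcmp(word, "PRINT") == 0) {
        c->type = CMD_PRINT;
    } else if (strcmp(word, "DONE") == 0) {
        c->type = CMD_DONE;
    }
    return c->type;
}
```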

Try it out:

UNIX> ssnserver
ADD Jim Plank 31 987-65-4321
PRINT
__________________________________________________
Plank, Jim                     -- 987-65-4321   31
--------------------------------------------------
ADD Phil Fulmer 45 123-45-6789
ADD Pat Summitt 42 111-11-1111
PRINT
__________________________________________________
Fulmer, Phil                   -- 123-45-6789   45
Plank, Jim                     -- 987-65-4321   31
Summitt, Pat                   -- 111-11-1111   42
--------------------------------------------------
DELETE Jim Plank
DELETE Steve Spurrier
Error: No Steve Spurrier
PRINT
__________________________________________________
Fulmer, Phil                   -- 123-45-6789   45
Summitt, Pat                   -- 111-11-1111   42
--------------------------------------------------
DONE
UNIX>

INPUTGEN

Ok, now look at inputgen.c. This is a program that I wrote to really beat on ssnserver. As input, it takes a number of events, a random number seed, and a file of last names. The file of last names that I've created is lns, which is simply /usr/dict/words copied into a local file. The program reads the last names into the array lns, and it has an array fns of 65 first names. Now, what it does is create nevents random input events for ssnserver.c. The first 50 events are random ADD events, and thereafter, it will create either ADD, DELETE or PRINT events (these in the ratio 5/5/1). It ends with a PRINT and a DONE event.

In order to create DELETE events that correspond to entries in the tree, inputgen uses an rb-tree of its own. This tree is keyed on a random number, and its val field is one of the names that it added previously. When it creates a DELETE event, it chooses the first name in the tree (which will be a random name, since the keys are random), deletes it from the tree, and then uses this name for the DELETE event.
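
The event mix described above (50 ADDs first, then ADD, DELETE and PRINT in the ratio 5/5/1) can be sketched as follows. The function choose_event() and the Event enum are hypothetical names for this example, and I am assuming the random draw r comes from whatever generator inputgen seeds with its second argument.

```c
#include <assert.h>

typedef enum { EV_ADD, EV_DELETE, EV_PRINT } Event;

/* Pick the kind of event number i (counting from 0), given a
   non-negative random draw r. The first 50 events are always ADD;
   after that, 11 equally likely slots give ADD:DELETE:PRINT = 5:5:1. */
Event choose_event(int i, long r)
{
    assert(i >= 0 && r >= 0);
    if (i < 50) return EV_ADD;
    r %= 11;
    if (r < 5)  return EV_ADD;      /* slots 0..4 */
    if (r < 10) return EV_DELETE;   /* slots 5..9 */
    return EV_PRINT;                /* slot 10 */
}
```

Since ADD and DELETE are equally likely after the first 50 events, the expected change in tree size per event is zero, which is why the tree stays near 50 elements.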

So, this is a little complex, but you should be able to understand it. Inputgen is set up so that the tree that it manages will average around 50 elements, regardless of the number of events that it generates. To prove this to yourself, try it:

UNIX> inputgen 5 1 lns
ADD Phil Normal 2 631-85-0230
ADD Peyton Negligible 7 339-29-9216
ADD Dave Relate 90 440-26-1032
ADD Carla Joseph 15 961-73-1275
ADD Jamal Lane 43 837-68-7746
PRINT
DONE
UNIX> inputgen 5 1 lns | ssnserver
__________________________________________________
Joseph, Carla                  -- 961-73-1275   15
Lane, Jamal                    -- 837-68-7746   43
Negligible, Peyton             -- 339-29-9216    7
Normal, Phil                   -- 631-85-0230    2
Relate, Dave                   -- 440-26-1032   90
--------------------------------------------------
UNIX> inputgen 6000 1 lns | ssnserver | tail -60
Storehouse, Jamal              -- 378-84-0504   54
Tar, Sergei                    -- 922-35-6408   65
Tennis, Jamie                  -- 699-90-2234   84
Triplicate, Catharine          -- 264-43-8097   17
Turing, LaShonda               -- 569-75-2160   42
Twx, Blanche                   -- 488-36-4112   19
Vale, Wendy                    -- 375-04-9327   49
Villainous, Elizabeth          -- 816-64-5753   58
Xerxes, Mary                   -- 489-82-7899   58
--------------------------------------------------
__________________________________________________
Accuse, Katie                  -- 270-74-3607   94
Anaglyph, Sandra               -- 611-70-3455   10
Antarctic, Peyton              -- 118-39-0627   32
Atrium, Bill                   -- 988-21-6157    7
Beau, Sergei                   -- 723-35-7731   98
Bedevil, Andrei                -- 685-03-6172   42
Biddable, Laura                -- 507-90-3170   67
Blather, Kim                   -- 889-98-2973   46
Boathouse, Jay                 -- 212-66-7283   59
Centum, Sandra                 -- 348-92-5649   91
Cockpit, Miles                 -- 712-40-8903   27
Cunning, Bill                  -- 059-56-2417    3
Deduct, Sergei                 -- 436-37-7921   83
Eat, Wendy                     -- 424-66-1180   86
Extraordinary, Dizzy           -- 502-24-9923   98
Finland, Raynoch               -- 665-67-0773   59
Geographer, Mary               -- 609-84-0078   29
Gravel, Mary                   -- 515-74-6403   66
Horton, Laura                  -- 884-01-5338   98
Hothouse, LaShonda             -- 098-38-0969   32
Impediment, Xavier             -- 859-72-2538   65
Inattention, Bruce             -- 198-48-1849   11
Litigate, Helen                -- 292-95-7974   67
Macdonald, Jana                -- 190-74-9144   38
Mcintosh, Bill                 -- 580-14-3161   22
Moot, Dizzy                    -- 085-92-1219   33
Nellie, Bruce                  -- 928-58-5623    0
Pantomimic, Leonard            -- 328-47-3183   71
Party, Jamal                   -- 295-15-7017   88
Peripheral, Emily              -- 357-03-3434   21
Portland, Elena                -- 315-91-3735   83
Punt, Dizzy                    -- 912-12-8252   20
Pyhrric, Emily                 -- 887-51-0852   52
Salesman, Semeka               -- 087-79-3275   14
Sledge, Heather                -- 872-39-4327   91
Soapstone, Catharine           -- 363-42-3221   92
Sony, Jane                     -- 960-59-0669   68
Spheric, Xavier                -- 915-24-7348   28
Storehouse, Jamal              -- 378-84-0504   54
Tar, Sergei                    -- 922-35-6408   65
Tennis, Jamie                  -- 699-90-2234   84
Tradesmen, Cindy               -- 860-63-2050   77
Triplicate, Catharine          -- 264-43-8097   17
Turing, LaShonda               -- 569-75-2160   42
Twx, Blanche                   -- 488-36-4112   19
Vale, Wendy                    -- 375-04-9327   49
Villainous, Elizabeth          -- 816-64-5753   58
Xerxes, Mary                   -- 489-82-7899   58
--------------------------------------------------
UNIX>

You'll note that the final tree above has 48 elements, which is close to the expected average of 50.

Turning ssnserver into a real server

Now, look at ssnserver1.c.

What this does is turn ssnserver into a real server. It serves a socket, and then calls accept_connection(), and creates a server_thread() thread to service the connection. The server_thread() thread works just like ssnserver.c, with the exception that the tree is a global variable.

Try it out with telnet. For example, in one window on cetus3a I do:

UNIX> ssnserver1 cetus3a 5000

while in another, I do:

UNIX> telnet cetus3a 5000
Trying 128.169.94.33...
Connected to cetus3a.cs.utk.edu.
Escape character is '^]'.
ADD Jim Plank 31 123-45-6789
ADD Phil Fulmer 45 987-65-4321
PRINT
__________________________________________________
Fulmer, Phil                   -- 987-65-4321   45
Plank, Jim                     -- 123-45-6789   31
--------------------------------------------------
DONE
Connection closed by foreign host.
UNIX>

It works just fine. I modified inputgen.c to work as a socket client -- the code is in inclient.c. It is straightforward and uses a second thread to read the socket output and print it to standard out. Try it out on the same server:

UNIX> inclient cetus3a 5000 5 1 lns
__________________________________________________
Joseph, Carla                  -- 961-73-1275   15
Lane, Jamal                    -- 837-68-7746   43
Negligible, Peyton             -- 339-29-9216    7
Normal, Phil                   -- 631-85-0230    2
Relate, Dave                   -- 440-26-1032   90
--------------------------------------------------
UNIX>

Now, look at ssnserver2.c. This works just like ssnserver1 except that it can service multiple connections simultaneously by forking off one server_thread() per connection. Note, however, that access to t is not protected by mutexes. This presents a problem because, for example, one thread may be adding an element to the tree while another is deleting a nearby element. If the first thread is interrupted before it finishes adding the element, then the rb-tree pointers may not be where they should be when the second thread tries to delete. This will result in an error, probably a core dump.

To help illustrate this, I wrote a shell script called kill_it.sh. This forks off a given number of inclient processes that all blast away at the given ssnserver2 server.

Try it out: On one machine, start a ssnserver2. For example, I did the following on cetus3a:

UNIX> ssnserver2 cetus3a 5002

Then, on cetus4a, I had 5 inclients send 1000 entries simultaneously to the server:

UNIX> sh kill_it.sh cetus3a 5002 1000 5 > & /dev/null

Within a few seconds, ssnserver2 dumped core. This doesn't always happen, but it usually does. The reason is that access to t is not protected.

Adding a mutex

Now look at ssnserver3.c. This adds a mutex that each thread holds for the entire time that it processes a connection. This solves the problem with accessing t, because no two threads may access t simultaneously. To see this, try out kill_it.sh again. On cetus3a:

UNIX> ssnserver3 cetus3a 5003

And on cetus4a:

UNIX> sh kill_it.sh cetus3a 5003 1000 5 > & /dev/null

No core dump!

So, this solves the mutual exclusion problem, but it is like stapling papers with a sledgehammer. By having each thread lock the mutex throughout its lifetime, we have serialized the server -- no two threads can do anything simultaneously, and this is a performance problem. Ssnserver4.c solves this problem in a very standard way. Instead of locking the mutex at all times, the thread only locks the mutex when it accesses the tree. That is, it locks the mutex only within the code for ADD, DELETE and PRINT.
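
To make the difference between the two locking styles concrete, here is a minimal sketch that uses a shared counter as a stand-in for the tree t. The names session_coarse(), session_fine(), tree_add() and run_sessions() are all assumptions for this example; the real servers operate on an rb-tree and read their commands from a socket.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long nentries = 0;     /* stand-in for the shared tree t */

/* Stand-in for one tree operation; call only with lock held. */
static void tree_add(void) { nentries++; }

/* ssnserver3 style: hold the mutex for the whole connection. */
void *session_coarse(void *arg)
{
    long i, nops = *(long *) arg;

    pthread_mutex_lock(&lock);
    for (i = 0; i < nops; i++) {
        /* Reading the next command would happen here, with every
           other thread locked out in the meantime. */
        tree_add();
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* ssnserver4 style: hold the mutex only around each tree access. */
void *session_fine(void *arg)
{
    long i, nops = *(long *) arg;

    for (i = 0; i < nops; i++) {
        /* Reading the next command happens here, without the lock. */
        pthread_mutex_lock(&lock);
        tree_add();
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Run nthreads sessions of the given kind, each doing nops
   operations, and return the final entry count. */
long run_sessions(void *(*session)(void *), int nthreads, long nops)
{
    pthread_t tids[16];
    int i, rc;

    assert(nthreads > 0 && nthreads <= 16);
    nentries = 0;
    for (i = 0; i < nthreads; i++) {
        rc = pthread_create(&tids[i], NULL, session, &nops);
        assert(rc == 0);
    }
    for (i = 0; i < nthreads; i++) {
        rc = pthread_join(tids[i], NULL);
        assert(rc == 0);
    }
    return nentries;
}
```

Both styles give the correct final count. The difference is in what the other threads can do while one thread waits for input: nothing in the first style, anything in the second.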

To show how this improves performance, I ran ssnserver3 on cetus3a, and simultaneously ran the following clients on cetus1a, cetus2a, cetus4a and cetus5a:

UNIX> time inclient cetus3a 5004 20000 0 lns > & /dev/null

The clients took 10, 37, 81 and 145 seconds respectively. This is because they were serviced serially. I then did the same test using ssnserver4, and the times were 71, 71, 79 and 80 seconds. Clearly, ssnserver4 is better at servicing the connections simultaneously, although the average client time is better with ssnserver3 (68.25 seconds) than with ssnserver4 (75.25 seconds).

ssnserver5

Is the fact that ssnserver3 has a better average client service time than ssnserver4 surprising to you? It actually shouldn't be. One reason is that in ssnserver4, the average tree size is going to be 200 elements for all clients. In ssnserver3 the average tree size is 125 (50 for the first client, which exits before the second client runs, then 100 for the second client, 150 for the third, and 200 for the fourth). Another reason is that the server holds the mutex while printing the tree to the client. This is a time-consuming operation, and it means that no other client operation may be serviced during that time.

Does the mutex really need to be locked while printing the tree? No, not really. You can do some buffering to help you. Instead of writing to the socket while holding the mutex, create the string that you'll be using to print the tree while holding the mutex. This will take some time, but not nearly as much as writing this string to the socket. Then you release the mutex and write the string. This is done in ssnserver5.c. Note that I keep the tree size in a global variable, and this helps me malloc() the buffer when needed.
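
The buffering idea can be sketched as follows. This is not the code from ssnserver5.c: the array entries[] stands in for the tree, the border lines of the PRINT output are omitted, and add_entry(), snapshot_tree() and print_tree() are names made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define MAX_ENTRIES 100
#define LINE_LEN 51   /* 50 visible characters plus a newline */

typedef struct { const char *name; const char *ssn; int age; } Entry;

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static Entry entries[MAX_ENTRIES];   /* stand-in for the shared tree */
static int tree_size = 0;            /* only changed with lock held */

/* Append an entry. The strings are not copied, so they must outlive
   the table. Returns 0 on success, -1 if the table is full. */
int add_entry(const char *name, const char *ssn, int age)
{
    int rc = -1;

    assert(name != NULL && ssn != NULL);
    pthread_mutex_lock(&lock);
    if (tree_size < MAX_ENTRIES) {
        entries[tree_size].name = name;
        entries[tree_size].ssn = ssn;
        entries[tree_size].age = age;
        tree_size++;
        rc = 0;
    }
    pthread_mutex_unlock(&lock);
    return rc;
}

/* Format every entry into a malloc()'d string while holding the mutex.
   Returns NULL if malloc() fails. The caller frees the result. */
char *snapshot_tree(void)
{
    char *buf, *p;
    size_t cap, left;
    int i, n;

    pthread_mutex_lock(&lock);
    cap = (size_t) tree_size * LINE_LEN + 1;   /* sized from tree_size */
    buf = malloc(cap);
    if (buf != NULL) {
        p = buf;
        *p = '\0';
        for (i = 0; i < tree_size; i++) {
            left = cap - (size_t) (p - buf);
            n = snprintf(p, left, "%-30s -- %s %4d\n",
                         entries[i].name, entries[i].ssn, entries[i].age);
            if (n < 0 || (size_t) n >= left) break;  /* did not fit */
            p += n;
        }
    }
    pthread_mutex_unlock(&lock);
    return buf;
}

/* Build the output under the lock, then do the slow write without it.
   Returns 0 on success, -1 on failure. */
int print_tree(int fd)
{
    char *buf = snapshot_tree();
    size_t len, off = 0;
    ssize_t n;

    if (buf == NULL) return -1;
    len = strlen(buf);
    while (off < len) {
        n = write(fd, buf + off, len - off);
        if (n <= 0) { free(buf); return -1; }
        off += (size_t) n;
    }
    free(buf);
    return 0;
}
```

One consequence of this design is that the client receives a snapshot: by the time the string has been written, other threads may already have changed the tree.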

Now, when I repeat the test of having 4 clients call

UNIX> time inclient cetus3a 5004 20000 0 lns > & /dev/null

I get times of 18, 50, 51 and 52 seconds. This is a big improvement.

The lesson to be learned

The lesson to be learned here is that you need to think carefully about your use of synchronization primitives. There are two issues: correctness and performance. You want to make sure that there are no mutual exclusion problems in your code, as there were in ssnserver2.c. These are also sometimes called race conditions. However, you want to eliminate these race conditions in a way that maximizes performance. This should be done by making sure you only hold a mutex for as long as you need it locked. If you are performing a very time-consuming operation (such as writing to a socket or file) while holding the mutex, then you should consider the use of buffering so that you can move the time-consuming operation out of the code that holds the mutex. This is what ssnserver5 does.