CS560 Final Exam: May 9, 2006 - Answers & Grading Guide

James S. Plank

Answer All Questions - Note the point differential between questions, and make sure to allocate your time accordingly.

Question 1: 5 points

Once again, Freddy has missed the boat a bit. Granted, he is testing for quite a few important things. However, the compiler can still have a Trojan Horse in it: It can stuff its own code into the executable files that it creates, and that code can do anything.

Grading

If you mentioned the Trojan Horse, you got all five points. I did give partial credit to some other answers:

Some of you said that although it only writes one file, it can read anything. Sure, that's true, but what is it going to do with that information if it can't communicate it to the evil source? I gave 2.5 points for that answer, since it was true, but did not identify the actual problem.
Some of you said that it doesn't test the user's code to see if it is malicious. That is equivalent to solving the halting problem and thus is impossible. We don't expect compilers to be able to do the impossible. 1 point for that answer.
Some of you said that the code could easily have a buffer overflow attack without using gets() -- just make read/fread/fgets calls with buffers that are too small for the requested size. That's kind of reaching, I'd say, but not out of the realm of possibility, I guess. 3 points for that answer.

Question 2: 16 points

Part 1: To calculate the maximum, assume that the code spans two pages (e.g. it starts at an address like 0x7f0), and the stack spans two pages. How about s? Well, if x = 2, it can span two pages. If x = 1026, it can span three. Extrapolating, that is floor((x-2)/1024) + 2. So the answer is:

Maximum = 6 + floor((x-2)/1024).
To calculate the minimum, assume that the code is all in one page, the stack is all in one page, and s starts on a page boundary.

Minimum = 3 + floor((x-1)/1024).
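
As a sanity check on the two formulas above, here is a small sketch that brute-forces every starting offset of an x-byte string within a page. The names pages_spanned, max_total and min_total are made up for illustration, and the 1024-byte page size is taken from the formulas.

```python
PAGE = 1024  # page size in bytes, as implied by the formulas above

def pages_spanned(start, length):
    """Number of pages touched by `length` bytes beginning at byte `start`."""
    return (start + length - 1) // PAGE - start // PAGE + 1

def max_total(x):
    # Code and stack each straddle a page boundary (2 + 2 pages);
    # s is placed at the worst possible offset.
    return 6 + (x - 2) // PAGE

def min_total(x):
    # Code and stack each fit in one page (1 + 1 pages);
    # s starts on a page boundary.
    return 3 + (x - 1) // PAGE

# Compare the closed forms against a brute-force search over all offsets.
for x in (1, 2, 1024, 1025, 1026, 5000):
    spans = [pages_spanned(offset, x) for offset in range(PAGE)]
    assert 4 + max(spans) == max_total(x)
    assert 2 + min(spans) == min_total(x)
```

Python's // rounds toward negative infinity, so the x = 1 case (floor(-1/1024) = -1) still comes out right.
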
Part 2: A working set with a window size of delta is the set of pages that are touched by any of the last delta instructions.
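
To make the definition concrete, here is a minimal sketch. The trace format (one entry per executed instruction, listing the page numbers it touched) is an assumption made for illustration only.

```python
def working_set(trace, delta):
    """Pages touched by any of the last `delta` instructions in `trace`.

    `trace` holds one entry per executed instruction; each entry is the
    collection of page numbers that instruction touched (hypothetical format).
    """
    pages = set()
    # Guard delta == 0, because trace[-0:] would return the whole list.
    recent = trace[-delta:] if delta > 0 else []
    for touched in recent:
        pages.update(touched)
    return pages

# Four instructions: each touches code page 0 or 1, and sometimes a data page.
example = [(0, 7), (0,), (0, 8), (1, 8)]
assert working_set(example, 2) == {0, 1, 8}
assert working_set(example, 4) == {0, 1, 7, 8}
```
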
Part 3: To calculate the maximum, suppose that s and e are on different pages. Moreover, suppose that L08 and L27 are on different pages. This means that four pages will be touched on every iteration of the loop. What about *s and *e? Well, first, assume that *s never equals *e. That means that on each iteration of the loop, 17 instructions are executed (L18-L20 are not executed). So, if delta equals 21K, you can perform 21K/17 = roughly 1.23K iterations of the loop. The exact value is not really important. What is important is that in 1.xK iterations, *s can touch three pages, as can *e. So the maximum is 4 + 3 + 3 = 10 pages.
To calculate the minimum, now suppose s and e are on the same page, as are all of the instructions, so only two pages are touched on every iteration. If you execute all 20 instructions in the loop, that will make 21K/20 = 1.05K iterations -- still greater than 1K. However, now you can assume that only two pages each are touched by *s and *e. Thus, the minimum is 2 + 2 + 2 = 6 pages.
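
The arithmetic for Part 3 can be sketched as follows. It assumes that "21K" means 21 * 1024 instructions and that pages are 1024 bytes; the helper names are illustrative.

```python
PAGE = 1024
DELTA = 21 * 1024  # assumption: "21K" means 21 * 1024 instructions

def max_pages(n):
    # Most pages that n consecutive bytes can touch (worst alignment).
    return (n - 2) // PAGE + 2

def min_pages(n):
    # Fewest pages that n consecutive bytes can touch (page-aligned start).
    return (n - 1) // PAGE + 1

iters_short = DELTA // 17  # 17 instructions per iteration: more iterations
iters_long = DELTA // 20   # 20 instructions per iteration: fewer iterations

# Maximum: 2 stack pages + 2 code pages + worst case for *s and for *e.
maximum = 2 + 2 + 2 * max_pages(iters_short)
# Minimum: 1 stack page + 1 code page + best case for *s and for *e.
minimum = 1 + 1 + 2 * min_pages(iters_long)

print(iters_short, iters_long, maximum, minimum)  # 1264 1075 10 6
```
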
Part 4: Ok -- s starts on a page boundary, and e will start 36 bytes past a page boundary. Palcount starts on a page boundary, so all of its instructions are in one page, and since the fp is in the middle of a page, all the stack variables will be on one page. So, what's going to happen -- well, at every iteration, the stack and code pages are touched, as are the pages containing the current *s and *e. On the 37th iteration, and every 1024 iterations after that, *e will cross a page boundary. On the 1024th iteration and every 1024th iteration after that, *s will cross a page boundary. On the last iterations, *s and *e will be on the same page.
Put another way, all of s spans 1025 pages. We touch the stack and code pages multiple times each iteration; however, we only touch *s at L15 and *e at L13. Thus, with LRU, the code & stack pages will never be kicked out; only old *s and *e pages will be. Thus, you will have 1025 page faults -- one for every page of s.
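
The LRU argument can be checked with a small simulation. This is only a sketch under stated assumptions: six physical frames, code and stack pages already resident, s starting at byte 0, e starting 36 bytes into page 1024, a loop that runs while s < e, and each iteration touching code, stack, *e and *s in that order. The function name lru_faults is made up.

```python
from collections import OrderedDict

PAGE = 1024

def lru_faults(s_start, e_start, frames=6):
    """Count page faults under LRU for the two-pointer scan (assumed model)."""
    resident = OrderedDict()   # least recently used page is first
    resident["code"] = None    # assumption: code page is preloaded
    resident["stack"] = None   # assumption: stack page is preloaded
    faults = 0
    s, e = s_start, e_start
    while s < e:
        for page in ("code", "stack", e // PAGE, s // PAGE):
            if page in resident:
                resident.move_to_end(page)        # mark as most recently used
            else:
                faults += 1
                if len(resident) >= frames:
                    resident.popitem(last=False)  # evict least recently used
                resident[page] = None
        s += 1
        e -= 1
    return faults

print(lru_faults(0, 1024 * PAGE + 36))  # 1025
```

The run makes about two million page touches, so it may take a second or two.
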
Part 5: With FIFO, here's what's going to happen:
- The code & stack are in memory.
- *e will fault in. Call it e_0.
- *s will fault in. Call it s_0.
- At iteration 37, a new e_1 will fault in.
- At iteration 1K, a new s_1 will fault in.
- At iteration 1K+37, a new e_2 will fault in, which will kick out a code or stack page.
- Then the code/stack page will fault in, kicking out the other code/stack page.
- Then the other code/stack page will fault in, kicking out e_0.
- At iteration 2K, s_2 will fault in, conveniently kicking out s_0.
- At iteration 2K+37, e_3 will fault in, conveniently kicking out e_1.
- At iteration 3K, s_3 will fault in, conveniently kicking out s_1.
- At iteration 3K+37, e_4 will fault in, kicking out a code/stack page.
- Then the code/stack page will fault in, kicking out the other code/stack page.
- Then the other code/stack page will fault in, kicking out e_2.
- And so on.
So, we will still have the 1025 page faults, one for every page of s. Additionally, we will have two page faults every time e_x is faulted in (x is even and greater than zero), because it will kick out a code/stack page. There are 1025 total pages. e_0 will be page #1024 (zero indexed). e_2 will be page #1022. The last e_x will be page #512. So, there are 256 even and positive values of x, resulting in 512 additional page faults. So the total number of page faults is 1025 + 512 = 1537.
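
The tally above can be reproduced with a few lines of arithmetic. This sketch takes the eviction pattern described in the walkthrough as given and only counts; it does not simulate FIFO. The variable names are illustrative.

```python
DATA_PAGES = 1025     # pages of s, numbered 0..1024
FIRST_E_PAGE = 1024   # e_0
LAST_E_PAGE = 512     # last page that e reaches

# e_x sits on page FIRST_E_PAGE - x, so x runs from 0 up to 512.
last_x = FIRST_E_PAGE - LAST_E_PAGE

# Per the walkthrough, e_x with x even and positive evicts a code/stack page.
evicting = [x for x in range(1, last_x + 1) if x % 2 == 0]

extra_faults = 2 * len(evicting)          # code page and stack page re-fault
total_faults = DATA_PAGES + extra_faults

print(len(evicting), extra_faults, total_faults)  # 256 512 1537
```
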
Part 6: This algorithm was mentioned in class and is in the book. Suppose we always keep a free frame pool of two pages, and when we free a frame, we maintain its identity. Then when a stack/code page gets evicted, it goes on the free frame list, but does not get overwritten. Then when it faults back in, it is sitting there on the free frame list, and does not cost any disk I/O. In this way you'll still have 1537 page faults, but only 1025 disk reads.

Grading

I tried to be lenient, and if you made an incorrect assumption that affected multiple parts, I tried not to let it harm you multiplicatively.

Part 1: 3 points. If you used division, I assumed that it was integer division.
Part 2: 2 points.
Part 3: 3 points.
Part 4: 2 points.
Part 5: 3 points.
Part 6: 3 points.

Question 3: 11 points

Part 1 - 1 point: 0x100 - 0x2ff.
Part 2 - 1 point: 0x400 - 0x6ff.
Part 3 - 1 point: 0xfffb00 - 0xfffeff.
Part 4 - 1 point: Ok -- the address space is 0x1000000 bytes and a page is 0x100 bytes. That makes 0x1000000 / 0x100 = 0x10000 = 64K pages, so a single level page table will have to have 64K entries. A page can hold 256/4 = 64 PTEs, so RedHat will have to have 64K/64 = 1024 pages of PTEs.
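
The Part 4 arithmetic, written out as a short sketch. The constant names are illustrative; the values come from the answer above.

```python
ADDRESS_SPACE = 0x1000000  # bytes in the address space
PAGE_SIZE = 0x100          # 256 bytes per page
PTE_SIZE = 4               # bytes per page table entry

num_pages = ADDRESS_SPACE // PAGE_SIZE        # entries in a single level table
ptes_per_page = PAGE_SIZE // PTE_SIZE         # PTEs that fit in one page
page_table_pages = num_pages // ptes_per_page

print(hex(num_pages), ptes_per_page, page_table_pages)  # 0x10000 64 1024
```
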
Part 5 - 3 points:
Part 6 - 1 point: 18 pages -- 2 inner and 16 outer.
Part 7 - 3 points:

Question 4: 6 points

Looks like I asked this in 2005 as well. Should have been really easy for you. To quote: "A file allocation table is a way of implementing a linked index scheme for files that has a number of advantages over storing a link at the end of each block of a file. Specifically, the blocks on a disk are partitioned into data blocks and link blocks. The link blocks are in the first blocks of the disk, and are composed solely of links for the data blocks. The directory entry for the file is simply a pointer to the first block of the file.

"For example, the first link in the first link block contains the link for the first data block on the list. Thus, if a file is composed of multiple blocks, the link for the file's first data block points to the second data block, and so on. Obviously, the last block's link will contain a NULL pointer.

"This scheme is preferable to storing links in the data blocks for two main reasons. First, the block's size can be a power of two, which is often very convenient. In other words, the block itself is not broken up into a data portion and a metadata portion. Second, the link blocks themselves may be cached in the operating system, and therefore finding the bytes in the middle of a file does not require reading all of the previous data blocks from disk. Instead, the cached links may be used without any extra reading of data from disk.

"The links also provide a nice way to identify free blocks -- instead of having a NULL pointer or a pointer to another block, the link can have a different sentinel value that flags it as a free block.

"There are two kinds of caching in this system. The first is caching the link blocks, as discussed above. The second is performing standard disk block caching -- either LRU caching for frequently used blocks, or lookahead caching to optimize the performance of serial file access."

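The scheme in the quoted answer can be sketched as a toy in-memory model. Everything here is an illustrative assumption rather than a real file system layout: the class name Fat, the sentinel values FREE and EOF, and a directory that maps a file name to its first block.

```python
FREE = -2  # sentinel: block is unallocated (value chosen for illustration)
EOF = -1   # sentinel: last block of a file (plays the role of the NULL pointer)

class Fat:
    def __init__(self, num_blocks):
        self.links = [FREE] * num_blocks  # link i belongs to data block i
        self.directory = {}               # file name -> first data block

    def create(self, name, num_blocks):
        """Allocate a chain of free blocks and record its head."""
        free = [i for i, link in enumerate(self.links) if link == FREE]
        if num_blocks < 1 or len(free) < num_blocks:
            raise ValueError("not enough free blocks")
        chain = free[:num_blocks]
        for cur, nxt in zip(chain, chain[1:]):
            self.links[cur] = nxt
        self.links[chain[-1]] = EOF
        self.directory[name] = chain[0]

    def delete(self, name):
        """Walk the chain and mark every block free again."""
        block = self.directory.pop(name)
        while block != EOF:
            nxt = self.links[block]
            self.links[block] = FREE
            block = nxt

    def block_for_offset(self, name, offset, block_size):
        """Find the data block holding byte `offset` using only the links."""
        block = self.directory[name]
        for _ in range(offset // block_size):
            block = self.links[block]
            if block == EOF:
                raise ValueError("offset past end of file")
        return block

fat = Fat(8)
fat.create("a", 3)   # blocks 0, 1, 2
fat.create("b", 2)   # blocks 3, 4
fat.delete("a")      # blocks 0, 1, 2 become free again
fat.create("c", 4)   # blocks 0, 1, 2, 5
print(fat.block_for_offset("c", 3 * 512 + 10, 512))  # 5
```

In block_for_offset, the lookup only reads the links list, which stands in for the cached link blocks; no data block is read until the target block is known.
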
Grading

Variant of linked allocation scheme: 1 point
Each file's "metadata" is a pointer to the first block: 1 point
All the links held in one set of blocks on disk; the rest have data: 1 point
Link for block i is in pointer i: 1 point
Can cache the index blocks to reduce disk overhead on random access: 1 point
Can have standard read caching for sequential access: 1 point