CS360 Midterm Exam #2. March 11, 2015. James S. Plank
Answers and Grading
Question 1: 12 Points
The only subtle part of this question was dealing with p and k.
The for loop sets k so that it is:
0x00 | 0x01 | 0x02 | 0x03 | 0x04 | 0x05 | 0x06 | 0x07
The memcpy() statement overwrites the middle four bytes of k.
But is k[2] the least significant byte of 0x90abcdef, or the most
significant? We don't know yet. We do know that when we're done, k
is going to be either:
0x00 | 0x01 | 0x90 | 0xab | 0xcd | 0xef | 0x06 | 0x07
or
0x00 | 0x01 | 0xef | 0xcd | 0xab | 0x90 | 0x06 | 0x07
Line zero shows us that k[2] is 0xef, so it is the second representation of
k that we need to go with. That also means that when we represent integers,
their least significant bytes go first.
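The byte-order reasoning above can be checked with a small sketch. This is not the exam's q1.c: the function names are hypothetical, and it assumes a 4-byte uint32_t copied into the middle of an 8-byte array, as described above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill k with 0..7, then overwrite k[2..5] with the
   bytes of value, in whatever order this machine stores them. */
void fill_and_overwrite(unsigned char k[8], uint32_t value)
{
  int i;

  for (i = 0; i < 8; i++) k[i] = (unsigned char) i;
  memcpy(k + 2, &value, 4);
}

/* Returns 1 if this machine stores the least significant byte first. */
int is_little_endian(void)
{
  uint32_t one = 1;
  unsigned char first;

  memcpy(&first, &one, 1);
  return first == 1;
}
```

On a little-endian machine, calling fill_and_overwrite(k, 0x90abcdef) leaves k[2] equal to 0xef, which is the second representation above. On a big-endian machine, k[2] would be 0x90.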
Now, we're ready to answer the question:
The program is in q1.c if you want to test it:
UNIX> gcc q1.c
UNIX> a.out
Line 0: 0xef
Line 1: 0x1234567
Line 2: 0x1efb38
Line 3: 0x7
Line 4: 0x90
Line 5: 0xcdef0100
Line 6: 0x70690ab
UNIX>
Grading
Two points per line.
Here's the partial credit rubric:
- Line 1, you got .75 points for 0x123456b8, 0.50 points for
0x123456 and 0x12345678, and 0.25 for 0x12345780.
- Line 2: you got 1.75 for 0xefb38, 1.50 for
0x1dfb38, 0x1ef738, 0x1efb28, 0x1efb3a, 0x1efb3c, 0x1efbe8, 0x1efc38 and 0x1efd38,
and one for 0xef338, 0xefab8, 0xefb18 and 0xfdb38.
- Line 3: I gave a point for 0x0111.
- Line 4: I gave half a point for 0xef, and 0.3 points for 0x5/0x05.
- Line 5: I gave 1.5 for 0x0001efcd, 0x1efcd, 0xcdef0201 and 0xefcd0100. I gave 0.8 for 0x0001ef.
- Line 6: 1.5 for 0xab900607, 1.2 for 0xab9067, 1 for 0xf6070 and 0xef67.
Question 2: 12 Points
I was hoping the two problems would jump out at you:
- Problem #1: There is a memory leak due to strdup(), because
strdup() allocates and creates a new copy of each word, and passes that to
strcpy(). All that strcpy() does is copy the bytes from its second
argument to the first, so when strcpy() is done, the pointer to the newly
allocated string is lost. That is a classic memory leak.
- Problem #2: While this kind of code works in C++, it is bad in C.
The reason is that C++ maintains the string's length in the string class.
C does not. This means that each time you call strcat(), it has to find the
end of rv by searching from the beginning. As rv grows, this becomes
wasteful -- in particular, if the average word size is n and the number of
words is m, then this loop becomes O(m^2 n). That's
a problem.
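The quadratic bound can be derived with a rough count. This is a sketch that ignores the separating spaces and the null terminator: when word i is appended, strcat() first scans the roughly (i-1)n characters already in rv, then copies about n more.

```latex
\begin{align*}
T(m, n) \approx \sum_{i=1}^{m} \bigl[(i-1)\,n + n\bigr]
        = n \sum_{i=1}^{m} i
        = \frac{n\,m\,(m+1)}{2}
        = O(m^2 n)
\end{align*}
```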
Fixing the first bug is easy -- don't call strdup(). To fix the second bug,
you need to keep track of the end of rv, and simply call strcpy() or
strcat() there. I call strcpy(), because that way I don't have to
keep the string null-terminated when I add the space.
In my code, I add a second int which I name eorv. Then, I replace
the second for() loop with:
  eorv = 0;
  for (i = 0; i < numwords; i++) {
    if (i != 0) {
      rv[eorv] = ' ';
      eorv++;
    }
    strcpy(rv+eorv, words[i]);
    eorv += strlen(words[i]);
  }
My loop is now O(mn). I have the old code in q2.cpp and the new code in
q2-good.cpp. They both read words from standard input,
build a words array, and then call build_string(). As you can see
from timing them on q2-input.txt, the second is much faster:
UNIX> wc q2-input.txt
11000 23298 174306 q2-input.txt
UNIX> g++ q2.cpp
UNIX> time a.out < q2-input.txt
1.571u 0.003s 0:01.57 100.0% 0+0k 0+0io 0pf+0w
UNIX> time sh -c "cat q2-input.txt q2-input.txt | a.out"
6.243u 0.016s 0:06.59 94.8% 0+0k 5+2io 0pf+0w
UNIX> g++ q2-good.cpp
UNIX> time a.out < q2-input.txt
0.032u 0.002s 0:00.03 100.0% 0+0k 0+0io 0pf+0w
UNIX> time sh -c "cat q2-input.txt q2-input.txt | a.out"
0.066u 0.011s 0:00.07 100.0% 0+0k 0+0io 0pf+0w
UNIX>
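For reference, here is a minimal self-contained sketch of a build_string() that tracks the end of rv. It is not the contents of q2-good.cpp: the signature, the sizing pass and the malloc() check are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical signature: joins numwords strings with single spaces.
   The caller frees the result. Returns NULL if malloc() fails. */
char *build_string(char **words, int numwords)
{
  size_t total, eorv;
  char *rv;
  int i;

  /* One pass to size the buffer: each word, plus a space or the final '\0'. */
  total = 1;
  for (i = 0; i < numwords; i++) total += strlen(words[i]) + 1;

  rv = (char *) malloc(total);
  if (rv == NULL) return NULL;
  rv[0] = '\0';

  /* eorv is always the index where the next character goes. */
  eorv = 0;
  for (i = 0; i < numwords; i++) {
    if (i != 0) {
      rv[eorv] = ' ';
      eorv++;
    }
    strcpy(rv + eorv, words[i]);   /* No leak: nothing is allocated here. */
    eorv += strlen(words[i]);
  }
  return rv;
}
```

Each word is touched a constant number of times, so the work is proportional to the total number of characters.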
Grading
Three points for spotting each problem. Three points for fixing each problem.
I gave a point for spotting problems that weren't really problems.
When you see "ML" in the grading, that means "Memory Leak".
Question 3: 8 Points
Here are the two lines of code that I was anticipating:
- Line 1: i = a() + b(): You need to store the return value of a()
while you call b(). To do so, you need to use
r2, r3 or r4, because r0 and r1 are not guaranteed
to retain their values across procedure calls. Suppose you use r2. Because
you use it, you must spill it, because whoever is calling you is relying on the same
guarantee.
- Line 2: i = k*j + m*n: Here, you need to use three registers, because
you need to use one to store the product k*j, and then one each to load
m and n. That means you have to use one of
r2, r3 or r4. Whichever one you use, you need to spill it at
the beginning of the procedure and unspill it at the end,
so that its value remains the same when the procedure is done.
Four points for a line that requires spilling, and four points for your explanation.
Question 4: 10 Points
Nuts and bolts assembler from the third Assembler lab.
a:
    push #8                  / Allocate i and b. i is [fp-4] and b is [fp]
    ld [fp+16] -> %r0        / i = *y
    ld [r0] -> %r0
    st %r0 -> [fp-4]
    ld [fp+12] -> %r0        / b = *x
    ld [r0] -> %r0
    st %r0 -> [fp]
    ld [fp-4] -> %r0         / return b[i]
    mov #4 -> %r1
    mul %r0, %r1 -> %r0
    ld [fp] -> %r1
    add %r0, %r1 -> %r0
    ld [r0] -> %r0
    ret
In q4.jas, I have a main() and b() that set up
the stack in jassem exactly as in the question.
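As a reading aid, the comments in the assembly correspond to C along the following lines. This is a reconstruction from those comments, not the exam's text: the parameter and element types are assumptions, and any 4-byte element type would produce the same multiply-by-four on the exam's machine.

```c
/* Reconstructed from the assembly comments; the types are assumed.
   x is the first parameter ([fp+12]) and y is the second ([fp+16]). */
int a(int **x, int *y)
{
  int i;      /* [fp-4] */
  int *b;     /* [fp]   */

  i = *y;
  b = *x;
  return b[i];   /* Address is b + 4*i when elements are 4 bytes. */
}
```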
Grading
- The initial "Push #8" - 2 points
- The first line of C code - 2 points
- The second line of C code - 2 points
- The last line of C code - 4 points
Question 5: 19 Points
Each part was worth a point.
- A: This is fp+12: 0xfff418.
- B: This is fp+16: 0xfff41c.
- C: This is fp-4: 0xfff408. (Half credit if you swapped your answers for C and D.)
- D: This is fp: 0xfff40c. (Half credit if you swapped your answers for C and D.)
- E: This is the value of x: 0xfff42c.
The grading of this part and subsequent parts was on consistency with your previous
answers. So, to get full credit for this, your answer had to match your answer to part A.
This means, for example, if you answered 0xfff410 for part A, then you only got credit for
0xfff41c for this part.
- F: This is what's in 0xfff42c: 0xfff440.
To get full credit, your answer here had to match your answer to part E.
- G: This is what's in 0xfff440: 0xfff434.
To get full credit, your answer here had to match your answer to part F.
- H: This is the same as the value of *x: 0xfff440.
To get full credit, your answer here had to equal your answer to part F.
- I: This is *y, or, what's in address 0xfff428: 0x2.
To get full credit, your answer here had to match your answer to part B.
- J: This adds 2*4 to b, giving 0xfff448, and dereferences
it: 0xfff430.
To get full credit, your answer here had to match your answers to parts
H and I.
- K: Now you dereference 0xfff430: 0xaa.
To get full credit, your answer here had to match your answer to part J.
- L: The caller of a()'s frame pointer is 0xfff41c.
There are four words in that stack frame. Going from bottom to top:
y in a(), x in a(),
the pc when a() returns and
the fp when a() returns. There's no room for any local
variables. The answer is zero. I gave half credit for eight.
- M: This is the word two below the frame pointer: 0x1054.
- N: This is the word above that: 0xfff41c.
- O: This is what happens when the pc and the fp are popped
off the stack: 0xfff414.
- P: Obviously, one is 0xfff40c. The next is a word below
that: 0xfff41c. And the next is the word below 0xfff41c: 0xfff448.
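The address arithmetic behind parts A through D and part J can be reproduced with a short sketch. The helper names are hypothetical. The constants are the ones quoted above: fp is 0xfff40c, b is 0xfff440, and words are 4 bytes.

```c
/* Hypothetical helpers for the stack-offset arithmetic used above. */

/* Address of a stack slot at a signed byte offset from the frame pointer. */
unsigned int fp_offset(unsigned int fp, int offset)
{
  return fp + (unsigned int) offset;
}

/* Address of element index in an array of 4-byte words starting at base. */
unsigned int element_addr(unsigned int base, unsigned int index)
{
  return base + 4u * index;
}
```

With fp equal to 0xfff40c, fp_offset(fp, 12), fp_offset(fp, 16) and fp_offset(fp, -4) give the addresses for parts A, B and C. element_addr(0xfff440, 2) gives the address dereferenced in part J.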
Question 6: 18 Points
As you can see, these two procedures are identical, except one reads from
ssd_buf, while the other writes to it. There are three egregious
problems with this code, which all involve making too many system calls:
- Both procedures call load_ssd_page() once for every byte
read or written. That's too many -- you can call it once, and then do
all of the reading and writing on the page, before you call it for the
next page.
- ssd_read() calls flush_ssd_page(), even though it never
modifies the page.
- ssd_write() calls flush_ssd_page() after every byte.
That's the same problem as with load_ssd_page() above.
A more subtle problem involving system calls is that if ssd_read()
needs to read from the page that's already loaded, there is no need to
call load_ssd_page() at all.
There are more minor problems, and perhaps your compiler can figure some
of them out, but why rely on that?
- You are doing divisions and mod operations on every byte, when you
don't need to.
- You are copying byte by byte, when most machines can copy at least
64 bytes at a time.
The code below solves all of these problems.
You can bet that memcpy() has been written to use the widest
word size possible to do its copying, so using it is much better
than trying to write it yourself.
This can be improved, and were
I writing these procedures for real, I would make the improvement. I'll
tell you how after you see the code (which is in
q6.c):
void ssd_read(char *buf, int size, int a)
{
  int page, bytes;

  page = a / 4096;
  a %= 4096;
  bytes = (a + size > 4096) ? (4096 - a) : size;
  if (ssd_bufid != page) load_ssd_page(page);
  memcpy(buf, ssd_buf+a, bytes);
  size -= bytes;
  buf += bytes;

  while (size > 0) {
    page++;
    load_ssd_page(page);
    bytes = (size > 4096) ? 4096 : size;
    memcpy(buf, ssd_buf, bytes);
    size -= bytes;
    buf += bytes;
  }
}
What's the improvement? Well, I only check to see if the first page
is loaded already. It could be that one of the pages in the middle of
the read is already in memory. I could read that page first, and then
read the remainders.
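The same structure carries over to writing. Below is a hedged sketch of an ssd_write() that loads and flushes once per page. It is not the code in q6.c. The in-memory device array, the counters, and the bodies and exact signatures of load_ssd_page() and flush_ssd_page() are stand-ins assumed for illustration. Only the names ssd_buf, ssd_bufid and the two procedure names come from the question.

```c
#include <string.h>

#define PAGE_SIZE 4096
#define NUM_PAGES 8                   /* Hypothetical size of the mock device. */

char device[NUM_PAGES][PAGE_SIZE];    /* Stand-in for the SSD itself. */
char ssd_buf[PAGE_SIZE];
int ssd_bufid = -1;                   /* No page loaded yet. */
int loads = 0, flushes = 0;           /* Count the simulated system calls. */

/* Mock: read one page from the device into ssd_buf. */
void load_ssd_page(int page)
{
  memcpy(ssd_buf, device[page], PAGE_SIZE);
  ssd_bufid = page;
  loads++;
}

/* Mock: write ssd_buf back to the page it was loaded from. */
void flush_ssd_page(void)
{
  memcpy(device[ssd_bufid], ssd_buf, PAGE_SIZE);
  flushes++;
}

/* Write size bytes from buf to device address a, one load/flush per page. */
void ssd_write(const char *buf, int size, int a)
{
  int page, bytes;

  page = a / PAGE_SIZE;
  a %= PAGE_SIZE;
  while (size > 0) {
    bytes = (a + size > PAGE_SIZE) ? (PAGE_SIZE - a) : size;
    if (ssd_bufid != page) load_ssd_page(page);  /* Keep the untouched bytes. */
    memcpy(ssd_buf + a, buf, bytes);
    flush_ssd_page();
    size -= bytes;
    buf += bytes;
    page++;
    a = 0;                                       /* Later pages start at 0. */
  }
}
```

The page is loaded before it is modified so that bytes outside the written range survive the flush. A write that covers a whole page could skip the load, which this sketch does not attempt.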
Grading
Spotting the problems was worth 10 points -- these sum to more than 10, but you
were capped at 10 points for this part:
- Too many system calls: 6 points.
- Copying done a character at a time: 3 points
- Flush is unnecessary in read: 2 points
- The calculations done on every byte are unnecessary.
Fixing the problem was worth 8 points. If your code structure was off, you started off with four
points, and if it was really off, you started with two points. I took off for things like calling flush_ssd_page(), not calling memcpy(), or not having a loop in your code.