- (18 points) For each of the following questions, choose the most appropriate answer
from the following list:
structural equivalence | content equivalence |
semantic equivalence | name equivalence |
hot spot compilation | dynamic compilation |
just in time compilation | virtual machine |
linker | loader | compiler | unrolling |
inlining | tail trimming | basic block | data block |
control block | dope vector | symbol table | relocation table |
dimensions table | jump table | hotspot table | re-ordering evaluation |
short circuit evaluation | lifting evaluation | hot read evaluation | runtime environment |
- basic block The name of a node in a control flow graph
that contains a maximal-length set of operations that should
execute sequentially at run time, with no branches in or out.
- dope vector The name of the data structure that is
maintained by the run-time environment to keep track of the lower
and upper bounds of each dimension of a dynamic array, and the size
of each dimension.
- virtual machine The name given to an interpreter of byte codes.
- structural equivalence Under this form of type equivalence, two
types are considered equivalent if they consist of the same components,
put together the same way.
- jump table The name for the code generated by the
compiler for a switch statement that allows control to transfer
in O(1) time to the appropriate branch of the switch statement.
- inlining The name of the optimization that occurs
when the compiler replaces a function call with the function body.
- just in time compilation The name given to a compiler that converts
byte codes to machine language immediately before the program is about to
execute.
- unrolling The name of the optimization that occurs
when the compiler replaces a loop that executes exactly 5 times with
5 consecutive copies of the loop body.
- short circuit evaluation The name given to evaluation of a conditional
when control transfers to the then or else branch as soon as the
condition is known to be true or false
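To make the dope vector answer concrete, here is a minimal C sketch, assuming a two-dimensional array of doubles stored in row-major order. The names `dim_info`, `dope_vector`, `dope_init`, and `dope_offset` are hypothetical and do not come from the course text; a real run-time environment may lay the record out differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-dimension record: bounds plus the number of elements
   skipped when this index increases by one. */
typedef struct {
    long lower;     /* lowest legal index */
    long upper;     /* highest legal index */
    size_t stride;  /* elements per step in this dimension */
} dim_info;

/* Hypothetical dope vector for a 2-D dynamic array of doubles. */
typedef struct {
    double *base;
    dim_info dim[2];
} dope_vector;

/* Record the bounds and row-major strides; the caller supplies storage. */
void dope_init(dope_vector *dv, double *storage,
               long lo0, long hi0, long lo1, long hi1) {
    dv->base = storage;
    dv->dim[0].lower = lo0; dv->dim[0].upper = hi0;
    dv->dim[1].lower = lo1; dv->dim[1].upper = hi1;
    dv->dim[1].stride = 1;
    dv->dim[0].stride = (size_t)(hi1 - lo1 + 1);
}

/* Element offset of a[i][j], computed from the dope vector at run time. */
size_t dope_offset(const dope_vector *dv, long i, long j) {
    assert(i >= dv->dim[0].lower && i <= dv->dim[0].upper);
    assert(j >= dv->dim[1].lower && j <= dv->dim[1].upper);
    return (size_t)(i - dv->dim[0].lower) * dv->dim[0].stride
         + (size_t)(j - dv->dim[1].lower) * dv->dim[1].stride;
}
```

For an array declared with bounds 1..3 by 0..3, `dope_offset(&dv, 2, 1)` is one full row of four elements plus one, which is the kind of address arithmetic the compiler cannot do at compile time when the bounds are only known at run time.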
- (10 points) Explain two ways that Java dynamic compilation (e.g., Sun's Hotspot compiler)
is able to achieve performance comparable to C code. Use fewer than three
sentences for each of the two ways.
- It dynamically compiles hotspots in the code and uses aggressive
in-lining of functions, including nested in-lining of functions from
libraries.
- Since it knows the actual underlying architecture, it is able to
take advantage of the known number of functional units to fully
exploit pipelining.
- When it has to make virtual method calls, it can use profiling to
determine which types of objects are usually the receivers of these calls
and insert a conditional test that calls those objects' methods directly,
rather than looking the calls up through a virtual method table. Once it
has a direct call, it can then apply its aggressive in-lining strategy.
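The third answer can be sketched in C, with a hand-written table of function pointers standing in for Java's virtual method table. This is only an illustration of the guarded direct call idea, not Hotspot's actual output; `shape`, `SQUARE_VT`, `area_virtual`, and `area_guarded` are hypothetical names.

```c
#include <assert.h>

/* Hand-rolled "virtual method table": one function pointer per method. */
typedef struct shape shape;
typedef struct {
    double (*area)(const shape *);
} shape_vtable;

struct shape {
    const shape_vtable *vt;
    double a, b;
};

static double square_area(const shape *s) { return s->a * s->a; }
static double rect_area(const shape *s)   { return s->a * s->b; }

const shape_vtable SQUARE_VT = { square_area };
const shape_vtable RECT_VT   = { rect_area };

/* Unoptimized call site: always dispatches through the table. */
double area_virtual(const shape *s) {
    return s->vt->area(s);
}

/* What a profile-guided compiler might emit if squares dominate:
   a cheap type test, then the inlined body of the common target. */
double area_guarded(const shape *s) {
    if (s->vt == &SQUARE_VT) {
        return s->a * s->a;   /* square_area inlined */
    }
    return s->vt->area(s);    /* fallback for other receiver types */
}

/* Both call paths must agree for every receiver type. */
void check_guarded_dispatch(void) {
    shape sq = { &SQUARE_VT, 3.0, 0.0 };
    shape rc = { &RECT_VT, 2.0, 5.0 };
    assert(area_guarded(&sq) == area_virtual(&sq));
    assert(area_guarded(&rc) == area_virtual(&rc));
}
```

The fallback branch is what keeps the transformation safe: the guard only speeds up the common case and never changes which method runs.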
- (10 points) In class we talked about how the compiler could improve instruction
scheduling if it is allowed to re-order the operations of an arithmetic
expression. In C, is it safe for the compiler to also re-order the
evaluation of the operands of a boolean expression in order to improve
instruction scheduling? For example, if you have an expression of the form
if (exp1 && exp2)
is it safe to evaluate exp2 before exp1? Why or why not? If you say it
is not safe, please give me a concrete C example that shows why it is unsafe.
It is not safe because C employs short-circuit evaluation, meaning that
if the first expression determines the value of the conditional, then
the second expression does not get executed. The programmer may rely
on this short-circuit evaluation to prevent the second expression from
executing. Here are three concrete C examples that illustrate different
scenarios where switching the order of evaluation could be harmful:
- the programmer may be trying to
protect the second piece of code from executing if the first piece
determines the value of the conditional. For example:
if ((x != NULL) && (x->value == key))
If you reverse these two expressions, then x->value is dereferenced before
the null check, and you get a segmentation fault if x is a null pointer.
- the second expression could have side-effects that alter the
first expression and change the outcome of the condition. For
example:
x = 9;
if ((x < 10) || (test_and_increment(x, 9)))
This is a somewhat silly example, but suppose that test_and_increment
takes x as a reference parameter (in real C, a pointer to x), increments it,
and then compares it with the passed-in argument. If x is less than the
argument it returns true, and otherwise it returns false. If the compiler
evaluates the expressions in the order given, then the conditional is true,
because x is less than 10. However, if it evaluates the second
expression first, then test_and_increment will increment x
to 10 and return false, and the first expression will then fail as well.
Thus the outcome of the conditional will be false.
- the second expression could have side-effects that the programmer
does not expect to be executed if the first expression determines
the value of the conditional. In the above example, the programmer
would not expect x to get incremented if it is less than 10.
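The second and third scenarios can be checked with a small C sketch. Because C has no reference parameters, this version of test_and_increment is assumed to take a pointer to x; `in_order` and `reordered` are hypothetical helpers, and `reordered` simulates by hand what a compiler would do if it evaluated the right operand first.

```c
#include <assert.h>
#include <stdbool.h>

/* Helper from the example, written with a pointer parameter.
   Increments *x, then tests it against limit. */
bool test_and_increment(int *x, int limit) {
    (*x)++;
    return *x < limit;
}

/* Evaluation in source order, as C requires for ||. */
bool in_order(int *x) {
    return (*x < 10) || test_and_increment(x, 9);
}

/* Simulates an (illegal) compiler that evaluates the right operand first. */
bool reordered(int *x) {
    bool right = test_and_increment(x, 9);
    bool left = (*x < 10);
    return left || right;
}

void check_short_circuit(void) {
    int x = 9;
    assert(in_order(&x) == true && x == 9);     /* right operand never runs */
    x = 9;
    assert(reordered(&x) == false && x == 10);  /* side effect flips the result */
}
```

Starting from x = 9, the in-order version leaves x untouched and yields true, while the re-ordered version leaves x at 10 and yields false, matching the walk-through above.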
- (8 points) Reference counting has a "flaw" that can prevent big chunks of memory
from being garbage collected, even if that memory is no longer accessible.
In three sentences or less, describe how this flaw can prevent the garbage
collection of memory.
The garbage collector will not be able to reclaim any circular structure
that has no outside reference, since every element in the cycle is still
referenced by another element of the cycle and so never reaches a
reference count of zero. However, since there is no outside
reference to this structure, the memory should be garbage collected, since
it is inaccessible to the rest of the program.
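A minimal C sketch of the problem, assuming a hand-rolled reference count (the names `node`, `node_new`, `node_link`, and `node_release` are hypothetical). Two nodes point at each other; after the program drops its own references, each count is still 1, so neither node is ever freed by the counting scheme.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int refcount;
    struct node *next;
} node;

node *node_new(void) {
    node *n = calloc(1, sizeof *n);
    if (n) n->refcount = 1;   /* the creator holds one reference */
    return n;
}

/* Drop one reference; free only when the count reaches zero. */
void node_release(node *n) {
    if (n && --n->refcount == 0) {
        node_release(n->next);
        free(n);
    }
}

/* Point `from` at `to`, which counts as a new reference to `to`. */
void node_link(node *from, node *to) {
    to->refcount++;
    from->next = to;
}

/* Builds a two-node cycle, drops the external references, and returns
   the count that is left on one of the nodes. */
int leaked_count_after_release(void) {
    node *a = node_new(), *b = node_new();
    if (!a || !b) abort();
    node_link(a, b);
    node_link(b, a);
    node_release(a);   /* a: 2 -> 1, not freed */
    node_release(b);   /* b: 2 -> 1, not freed */
    /* A real program could no longer reach a or b here; the demo peeks
       at them only to show the counts. */
    assert(a->refcount == 1 && b->refcount == 1);
    int remaining = a->refcount;
    free(b);           /* clean up by hand so the demo itself does not leak */
    free(a);
    return remaining;
}
```

The function returns 1 rather than 0, which is exactly the stuck state described in the answer.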
- (8 points) In three sentences or less describe how a compiler can optimize a tail recursive
function.
Since no computation must be done after the tail recursive function returns,
the compiler can re-use the activation record for the current recursive
call, rather than allocating a new activation record.
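As a sketch of what this optimization amounts to, the two C functions below compute the same sum. The second is the first with the tail call replaced by overwriting the parameters and jumping back to the top, which is roughly what re-using the activation record means. The function names are hypothetical, and whether a given C compiler performs this transformation depends on the compiler and its optimization settings.

```c
#include <assert.h>

/* Tail-recursive sum of 1..n: the recursive call is the last action. */
long long sum_to_recursive(long long n, long long acc) {
    if (n <= 0) return acc;
    return sum_to_recursive(n - 1, acc + n);
}

/* The same function after tail-call elimination, written by hand:
   update the parameters in place and jump back to the top. */
long long sum_to_loop(long long n, long long acc) {
top:
    if (n <= 0) return acc;
    acc = acc + n;
    n = n - 1;
    goto top;
}

void check_tail_call(void) {
    assert(sum_to_recursive(10, 0) == 55);
    assert(sum_to_loop(10, 0) == 55);
    /* The loop form uses constant stack space even for large n. */
    assert(sum_to_loop(1000000, 0) == 500000500000LL);
}
```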
- (8 points) In three sentences or less describe why it is important to know whether
a multi-dimensional array is organized in memory in row major or column
major order when you are writing a series of nested loops that will end
up touching each element in the array, as in:
for (i = 0; i < 10; i++)
    for (j = 0; j < 20; j++)
        for (k = 0; k < 30; k++)
            ... a[i][j][k]
As a programmer you want to organize your loops so that the innermost loop
accesses contiguous elements in memory, in order to take advantage of cache
lines (when an element is loaded from memory into the cache, the neighboring
elements on the same cache line come in with it), which in turn will speed
up your program.
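In C, which is row major, the loop nest from the question is already in the cache-friendly order because the last subscript varies fastest in memory. The sketch below checks that layout directly; `element_offset` and `sum_row_major` are hypothetical helper names.

```c
#include <assert.h>

enum { NI = 10, NJ = 20, NK = 30 };

/* C stores a[i][j][k] in row-major order: k varies fastest in memory. */
static double a[NI][NJ][NK];

/* Offset, in elements, of a[i][j][k] from the start of the array. */
long element_offset(int i, int j, int k) {
    return (long)(((char *)&a[i][j][k] - (char *)a) / sizeof(double));
}

/* Cache-friendly traversal: the innermost loop walks consecutive addresses. */
double sum_row_major(void) {
    double total = 0.0;
    for (int i = 0; i < NI; i++)
        for (int j = 0; j < NJ; j++)
            for (int k = 0; k < NK; k++)
                total += a[i][j][k];
    return total;
}

void check_layout(void) {
    /* Stepping k moves one element; stepping i jumps a whole NJ*NK plane. */
    assert(element_offset(0, 0, 1) - element_offset(0, 0, 0) == 1);
    assert(element_offset(0, 1, 0) - element_offset(0, 0, 0) == NK);
    assert(element_offset(1, 0, 0) - element_offset(0, 0, 0) == NJ * NK);
}
```

In a column-major language the same reasoning would put the i loop innermost instead, since there the first subscript is the one that varies fastest.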
- (10 points) Suppose you are given the following declarations:
typedef struct {
    double centimeters;
    double meters;
} measure;
typedef measure metric_measure;
struct unitA {
    double centimeters;
    double meters;
};
struct unitB {
    double meters;
    double centimeters;
};
Answer the following questions:
- Under structural type equivalence, which, if any, of the above types are
considered equivalent?
measure, metric_measure, unitA
- Under strict name type equivalence, which, if any, of the above types are
considered equivalent?
None are equivalent. Strict name equivalence requires that two types
have the same name in order to be equal, so even the alias metric_measure
is distinct from measure.
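For comparison, C itself follows neither of the two schemes in the question exactly: a typedef is only an alias for the type it names, while separately declared structs are distinct types even when their members match. The sketch below checks this with C11 `_Generic`; the `TYPE_TAG` macro is a hypothetical helper introduced only for the demonstration.

```c
#include <assert.h>

typedef struct {
    double centimeters;
    double meters;
} measure;
typedef measure metric_measure;
struct unitA { double centimeters; double meters; };
struct unitB { double meters; double centimeters; };

/* Maps an expression's type to a small integer tag (C11 _Generic). */
#define TYPE_TAG(e) _Generic((e), measure: 1, struct unitA: 2, struct unitB: 3)

void check_c_equivalence(void) {
    measure m = {0};
    metric_measure mm = {0};
    struct unitA ua = {0};
    struct unitB ub = {0};
    assert(TYPE_TAG(m) == TYPE_TAG(mm));   /* typedef is only an alias */
    assert(TYPE_TAG(m) != TYPE_TAG(ua));   /* same members, different type */
    assert(TYPE_TAG(ua) != TYPE_TAG(ub));
}
```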
- (18 points) Consider the following two pseudo-C code files:
A.c | B.c |
Imports: add, z | Imports: nothing |
Exports: x, y, main | Exports: add, z |
Relocatable Names: main, x, y, z | Relocatable Names: add, z |
Data: x, y | Data: z |
Code (A.c):
int x;
int y;
int main() {
    z = add(x,y);
    print x, y, z;
}
Code (B.c):
int z;
int add(int a, int b) {
    return a + b;
}
Fill in the remainder of the table as follows:
- Imports: Give a comma-separated list of all names imported into this file
- Exports: Give a comma-separated list of all names exported from this file. Assume that all
names in the global namespace of the file get exported.
- Relocatable names: Give a comma-separated list of all the names in this file that might have
to be assigned new addresses by the linker. Include in this list both
names that appear in the data section and those that appear in the
code section, but list a name at most once: All globally declared
variables, all function names, and all imported variables are relocatable.
Local variables declared within functions and parameters are not
relocatable since they do not have to be assigned new addresses by the
linker. The compiler assigns offsets within an activation record to each
local variable and parameter, and these offsets are dynamically added
at run-time to a frame pointer to obtain the actual memory address for
the variable or parameter.
- Data: Give a comma-separated list of all names in the file that would be allocated storage in
this section. Global variables go in the data section, but not local
variables or parameters. The latter two types of variables go on the
stack.
If there are no names for a particular section, just leave that section
blank.
- (20 points) Using the attached figure from the Scott text, show the assembly code that
would be generated for the following code fragment. To assist you, I have
also shown the abstract syntax tree for this code:
while (i < 10) {
    sum = sum + i * i;
    i = i + 1;
}
            while-------------------------------------
           /     \                                    \
          <       :=--------------------               null
         / \     /  \                   \
        i   10  sum  +                   :=-----------
                    / \                 /  \          \
                 sum   *               i    +          null
                      / \                  / \
                     i   i                i   1
    goto L1
L2: r0 = sum
    r1 = i
    r2 = i
    r1 = r1 * r2
    r0 = r0 + r1
    sum = r0
    r0 = i
    r0 = r0 + 1
    i = r0
L1: r0 = i
    r0 = r0 < 10
    if r0 goto L2
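The control flow of the generated code can be mirrored in C with goto, which makes it easy to check against the structured loop. This is only a sketch: the initial values of sum and i are assumed to be 0 for the demonstration, since the fragment does not give them, and the function names are hypothetical.

```c
#include <assert.h>

/* The loop written with the same shape as the generated code:
   jump to the test at the bottom, and branch back up while it holds. */
int sum_squares_lowered(void) {
    int sum = 0, i = 0;   /* initial values assumed for the demonstration */
    goto L1;
L2:
    sum = sum + i * i;
    i = i + 1;
L1:
    if (i < 10) goto L2;
    return sum;
}

/* The original structured form, for comparison. */
int sum_squares_while(void) {
    int sum = 0, i = 0;
    while (i < 10) {
        sum = sum + i * i;
        i = i + 1;
    }
    return sum;
}

void check_lowering(void) {
    assert(sum_squares_lowered() == sum_squares_while());
    assert(sum_squares_lowered() == 285);
}
```

Placing the test at the bottom means each iteration executes one conditional branch, at the cost of a single unconditional jump on entry.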