Write a program that declares an array A of 16000 integer elements and
initializes A so that each element has its index as its value. Then create
a real array B that will contain the running average of array A. That is,
B(I) = (A(I-1) + A(I) + A(I+1))/3.0
except at the end points. Your code should do the initialization of A and
the running average in parallel using 8 threads. Experiment with the
different scheduling types by timing the loop under each schedule.
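As a starting point, the two parallel loops might be sketched as follows. This is a minimal sketch in C rather than Fortran, so indices run from 0 to n-1. The function name `running_average` is hypothetical, and copying the end points unchanged is an assumption, since the exercise leaves their treatment open. The `schedule(runtime)` clause defers the choice of schedule to the `OMP_SCHEDULE` environment variable, so one binary can be timed under several schedules.

```c
#include <stdio.h>

/* Fill a[i] = i, then store the three-point running average in b.
   Assumption: the end points b[0] and b[n-1] are simply copied from a,
   because the exercise does not say how they should be handled. */
void running_average(int *a, double *b, int n)
{
    int i;

    if (n <= 0)
        return;

    /* schedule(runtime): the schedule is read from OMP_SCHEDULE at run time */
    #pragma omp parallel for num_threads(8) schedule(runtime)
    for (i = 0; i < n; i++)
        a[i] = i;

    /* Interior points only; each iteration reads a and writes one b[i],
       so the iterations are independent. */
    #pragma omp parallel for num_threads(8) schedule(runtime)
    for (i = 1; i < n - 1; i++)
        b[i] = (a[i - 1] + a[i] + a[i + 1]) / 3.0;

    b[0] = a[0];
    b[n - 1] = a[n - 1];
}
```

When built with an OpenMP-aware compiler (for example `gcc -fopenmp`), setting `OMP_SCHEDULE` to values such as `static`, `dynamic,100`, or `guided` before each run selects the schedule, and the call can be bracketed with `omp_get_wtime()` from `<omp.h>` to time it. Without OpenMP support the pragmas are ignored and the loops run serially.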

Write a program to multiply two large matrices together.
a) Compile for single-processor execution and time the program.
b) Compile for multithreaded execution (automatic insertion of
OpenMP directives) and time for 4, 8, 16, and 32 processors.
c) Compare with the multithreaded version of the vendor math library if
available.
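One possible kernel to time in parts a) and b) is sketched below, again in C. The function name `matmul` and the row-major storage of square n-by-n matrices are assumptions made for illustration. No OpenMP directives are written by hand, because part b) asks the compiler to insert them.

```c
#include <stddef.h>

/* Compute c = a * b for n x n matrices stored in row-major order.
   The k loop sits outside the inner j loop so that b and c are both
   walked along rows, which keeps the memory accesses unit-stride. */
void matmul(const double *a, const double *b, double *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++)
            c[i * n + j] = 0.0;

        for (size_t k = 0; k < n; k++) {
            double aik = a[i * n + k];
            for (size_t j = 0; j < n; j++)
                c[i * n + j] += aik * b[k * n + j];
        }
    }
}
```

For the timings, measure wall-clock time rather than CPU time: CPU time is summed over all threads and would hide any multithreaded speedup.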

Compile the program alias.f and run it on four threads. Can you see the
inefficiency in the program? Write a new version that is more efficient.