Henrik Johansson, Dan Wallin and Sverker Holmgren
Information Technology
Uppsala University
Sweden
The performance of PDE solvers depends on many properties of the computer architecture, one of which is the ability to use cache memory efficiently. Hardware counters can be used to analyze cache behavior, but they gather only rudimentary data such as the cache hit rate. Such experiments can neither identify the reason for a cache miss nor provide data on the communication associated with the cache memory system. By simulating a computer system in software, it is possible to derive, for example, the type of each cache miss and the amount of address traffic. A further advantage of this approach is that the simulated computer can be a fictitious system.
We show how to simulate a cache memory with arbitrary parameters, describing both the theory behind the simulation and the practical steps necessary to perform it. Finally, we study the cache memory behavior induced by a state-of-the-art PDE solver. The results show that the solver exhibits some counter-intuitive characteristics that might be exploited to decrease the execution time.
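
To make the idea of simulating a cache with arbitrary parameters concrete, the following is a minimal sketch and not the simulator used in this work. It assumes a set-associative cache with LRU replacement; the class name `CacheSim`, the helper `sweep`, and all parameter names are hypothetical and chosen only for illustration. It distinguishes cold (first-reference) misses from all other misses, which is the kind of information a hardware hit-rate counter cannot give.

```python
from collections import OrderedDict


class CacheSim:
    """Illustrative set-associative cache with LRU replacement (hypothetical)."""

    def __init__(self, size_bytes, line_bytes, ways):
        self.line_bytes = line_bytes
        self.ways = ways
        self.num_sets = size_bytes // (line_bytes * ways)
        # One ordered dict per set; the oldest entry is the LRU line.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.seen = set()  # every cache line referenced so far
        self.hits = 0
        self.cold_misses = 0
        self.other_misses = 0  # capacity and conflict misses, not separated

    def access(self, address):
        """Simulate one memory reference and return its classification."""
        line = address // self.line_bytes
        cache_set = self.sets[line % self.num_sets]
        if line in cache_set:
            cache_set.move_to_end(line)  # mark as most recently used
            self.hits += 1
            return "hit"
        if len(cache_set) >= self.ways:
            cache_set.popitem(last=False)  # evict the LRU line
        cache_set[line] = True
        if line in self.seen:
            self.other_misses += 1
            return "other miss"
        self.seen.add(line)
        self.cold_misses += 1
        return "cold miss"


def sweep(cache, num_elements, element_bytes=8, passes=1):
    """Feed sequential sweeps over a contiguous array to the cache."""
    for _ in range(passes):
        for i in range(num_elements):
            cache.access(i * element_bytes)
    return cache.hits, cache.cold_misses, cache.other_misses
```

For example, `sweep(CacheSim(1024, 64, 2), 512, passes=2)` models two passes over a 4 KB array through a 1 KB cache. Separating capacity misses from conflict misses would additionally require comparing against a fully associative cache of the same size, which this sketch omits.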