Computer Science Department, Stanford University
The cache behavior of a parallel program is a critical factor in its performance on a shared-memory multiprocessor. In particular, cache misses and contention between caches must be minimized for good performance, especially on large-scale systems. This will become more important as processor performance improves relative to the speed of memory and large-scale interconnects. In previous work on shared-memory multiprocessors, we hypothesized that cache-architecture-sensitive restructuring (CASPAR) could significantly improve the cache behavior of parallel programs, particularly on machines that use large cache blocks.
We present the results of an initial investigation in which we measured the impact of CASPAR on the caching behavior of MP3D, a parallel particle simulator for rarefied flow. By restructuring the program, we achieved an order-of-magnitude improvement in the cache miss ratio of shared data. Although a single effort is an inadequate basis for general conclusions, this favorable experience suggests that further efforts to generalize our strategies will be worthwhile.
* On leave from the Computer Science Department, University of the Witwatersrand.