Miss Penalty Reduction Using Bundled Capacity Prefetching in Multiprocessors

Dan Wallin and Erik Hagersten

In Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), Nice, France, April 2003.

Abstract

While prefetching has proven useful for reducing cache misses in multiprocessors, it often increases traffic because some of the prefetched data is never used. Prefetching in multiprocessors can also increase the cache miss rate, since the larger pieces of data retrieved cause false sharing. The capacity prefetching strategy proposed in this paper builds on the assumption that prefetching is most beneficial for reducing capacity and cold misses, but not communication misses. We propose a simple scheme for detecting the most frequent communication misses and suggest that prefetching be avoided for those. We also suggest a simple and effective strategy, called bundling, for reducing the address traffic when retrieving many sequential cache lines. To demonstrate the effectiveness of these approaches, we have evaluated both strategies for one of the simplest forms of prefetching, sequential prefetching. Applied to this bandwidth-hungry prefetch technique, the two new strategies result in a lower miss rate for all studied applications, while the average amount of address traffic is reduced compared with the same applications run with no prefetching. The proposed strategies could also be applied to more sophisticated prefetching techniques for better overall performance.
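
The two ideas in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how communication misses are detected or how a bundle is encoded. The sketch assumes that a miss on a line whose tag is still present but invalidated indicates a communication miss, and that a bundle is a single request carrying a base line address plus a bitmask of the following lines. All names (`Cache`, `classify_miss`, `build_request`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cache:
    # Maps line address -> valid flag. An entry with valid=False models a
    # line that was invalidated by another processor (assumption).
    lines: dict = field(default_factory=dict)

    def classify_miss(self, line_addr):
        """Return 'hit', 'communication', or 'capacity_or_cold'."""
        if line_addr in self.lines:
            return "hit" if self.lines[line_addr] else "communication"
        return "capacity_or_cold"


def build_request(cache, line_addr, degree):
    """Build one address request for an access to line_addr.

    Returns None on a hit, otherwise (base_line, prefetch_mask), where
    bit i-1 of prefetch_mask asks for line base_line + i.
    """
    kind = cache.classify_miss(line_addr)
    if kind == "hit":
        return None
    if kind == "communication":
        # Assumed policy: fetch only the demanded line, no prefetching.
        return (line_addr, 0)
    # Capacity or cold miss: bundle the following sequential lines that
    # are absent from the cache into the same request.
    mask = 0
    for i in range(1, degree + 1):
        if cache.classify_miss(line_addr + i) == "capacity_or_cold":
            mask |= 1 << (i - 1)
    return (line_addr, mask)
```

Under these assumptions, a capacity miss with prefetch degree 3 issues one address request instead of up to four, which is the kind of address-traffic saving that bundling targets.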

Available as PDF (667 kB)

BibTeX file entry: Wallin:2003:apr

