Analysis of application data cache behavior is important for program optimization and architectural design decisions. Current methods include hardware monitoring and simulation, but these suffer from either limited flexibility or run-time overhead so large that it prevents the use of realistic workloads. This paper describes a new fast and flexible tool based on StatCache. The tool relies on a probabilistic cache model instead of a functional cache simulator and uses sparsely sampled run-time information instead of complete traces or sampled contiguous subtraces. A post-run analyzer calculates miss ratios of fully associative caches of arbitrary size and cache line size from statistics gathered in a single run. It can also produce various data-locality metrics and give data-structure-centric data-locality figures.
The implementation uses simple hardware and operating-system support available in most operating systems and runs uninstrumented, optimized code. We evaluate the method on the SPEC benchmark suite with the largest (ref) input sets and show that the accuracy is high. We also show that the run-time overhead of this flexible "cache simulator" is less than 20% for long-running applications, making it much faster than current simulators.
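The abstract does not spell out the probabilistic model, so the following is only a rough illustration of how a miss ratio might be estimated from sparse samples rather than from a full trace. It assumes a fully associative cache with random replacement, in which a cached line survives each intervening miss with probability 1 - 1/L for a cache of L lines, and it assumes each sample records a reuse distance (the number of memory references between two accesses to the same cache line). The function name `estimate_miss_ratio` and all of its parameters are hypothetical and are not taken from the tool itself.

```python
def estimate_miss_ratio(reuse_distances, num_lines, cold_samples=0,
                        tol=1e-9, max_iters=10_000):
    """Estimate the miss ratio R of a random-replacement cache with
    `num_lines` lines from sampled reuse distances (illustrative sketch).

    Assumed model: a sample with reuse distance d sees about d * R
    intervening misses, so it misses with probability
    1 - (1 - 1/num_lines) ** (d * R). R is the value for which the expected
    number of misses over all samples equals R times the sample count.
    """
    if num_lines < 1:
        raise ValueError("num_lines must be at least 1")
    total = len(reuse_distances) + cold_samples
    if total == 0:
        raise ValueError("at least one sample is required")

    survive = 1.0 - 1.0 / num_lines  # chance a line survives one miss
    r = 1.0                          # start from the pessimistic guess
    for _ in range(max_iters):
        expected_misses = cold_samples + sum(
            1.0 - survive ** (d * r) for d in reuse_distances
        )
        r_next = expected_misses / total
        if abs(r_next - r) < tol:
            return r_next
        r = r_next
    return r


if __name__ == "__main__":
    # Hypothetical sample set: the same samples are reused for every cache size.
    samples = [10, 100, 1000, 10000] * 25
    for lines in (64, 1024, 4096):
        print(lines, round(estimate_miss_ratio(samples, lines), 4))
```

Because the cache size enters only through `num_lines`, one set of samples from a single run can be re-evaluated for any number of cache sizes after the fact, which is the kind of flexibility the abstract attributes to the post-run analyzer. Starting the iteration at 1.0 is a design choice in this sketch that avoids settling on the trivial solution of zero when a non-zero one exists.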