The growing gap between processor clock speed and DRAM access time puts new demands on software and development tools. Deep memory hierarchies and high cache miss penalties in present and emerging computer systems make execution time sensitive to data locality. Therefore, developers of performance-critical applications and optimizing compilers must be aware of data locality and maximize cache utilization to produce fast code. To aid the optimization process and help understanding data locality, we need methods to analyze programs and pinpoint poor cache utilization and possible optimization opportunities.
Current methods for run-time analysis of data locality and cache behavior include functional cache simulation, often combined with set sampling or time sampling, other regularity metrics based on strides and data streams, and hardware monitoring. However, they all share the trade-off between run-time overhead, accuracy and explanatory power.
This thesis presents methods to efficiently analyze data locality at run time based on cache modeling. It suggests source- interdependence profiling as a technique for examining the cache behavior of applications and locating source code statements and/or data structures that cause poor cache utilization. The thesis also introduces a novel statistical cache-modeling technique, StatCache. Rather than implementing a functional cache simulator, StatCache estimates the miss ratios of fully-associative caches using probability theory. A major advantage of the method is that the miss ratio estimates can be based on very sparse sampling. Further, a single run of an application is enough to estimate the miss ratio of caches of arbitrary sizes and line sizes and to study both spatial and temporal data locality.
Available as compressed Postscript (1.31 MB)
Download BibTeX entry.