This work focuses on the development of efficient techniques for capturing relevant performance information at runtime as well as the development of statistical modeling methods to predict and analyze application performance based on the information captured at runtime. Understanding an application behavior is necessary to fully exploit the available hardware. The memory hierarchy, latency hiding techniques, and many other advanced mechanisms in the CPU architecture dictate how fast applications execute. Simulation has been used to study the performance impact of these features, but is typically too slow to analyze large applications or datasets. Furthermore, fast modeling techniques are needed for providing guidance for schedulers, compilers and programmers.
This work seeks to develop new statistical and analytical models to predict and capture significant aspects of application performance and power consumption based on runtime information that can be cheaply captured. It also includes methods to cheaply collect such information.
Develop efficient statistical models of critical performance factors based on easily captured data. We will also develop the infrastructure to capture a wide variety of performance and execution information for real applications processing real data sets on real hardware. While the current modeling the memory hierarchy, future work will be expended to also cover other parts of computer systems.
Our work is based on three different core technologies, originally developed in this research group: Statistical reuse distance sampling, statistical modeling and on-line phase detection.
An application´s locality properties (e.g., its temporal and spatial locality) dictates how well it can utilize the cache hierarchy. Measuring an application´s stack-distance counts the number of UNOIQUE data pieces touch between two accesses to the same piece of data. This effectively captures the essential parts of an application´s locality properties and is used by many researchers to feed their analysis models. However, no proposal exists for how to efficiently capture stack distance information. We instead have proposed several new methods for capturing reuse distance information for an application, i.e., to count the number of other memory accesses between two accesses to the same piece of data. We have also implemented several efficient tools for capturing reuse distance information.
Ruse distance information alone cannot tell much about the properties of a system and will have to be refined to model system properties. We have developed several mathematical models for modeling caches with LRU replacement and random replacement based on reuse distance information. We have also shown that such caches can be effectively modeled even with very sparse reuse distance data.
The third core technology for efficient modeling is on-line phase detection. The behavior of an application tends to change over time and is often assembled from a handful of different behavior types, known as phases. Knowing the behavior for each such phace, as well as its fraction of the execution time, is sufficient for knowing the application´s overall behavior. Our approach involved techniques for detecting phase changes as well as detecting if the new phase has occurred before.
", In Proceedings of the 2010 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), White Plains, NY, USA, March 2010.
. Aug 2010 Springer.
.
.
2011. Senior: Erik Hagersten (Contact), David Black-Schaffer, Stefanos Kaxiras
Ph.D. students: Nikos Nikoleris, Muneeb Khan, David Eklöv, Andreas Sembrant