Synchronization is an indispensable part of parallel programs. Understanding the influence of synchronization is important for understanding the scalability of parallel programs.
For instance, locks are often necessary for protecting shared data, but they introduce lock contention, which causes performance issues. It would be desirable to have techniques that can assess how much a program would suffer from lock contention on multi-core systems. It would be even better if such an assessment could be based on information obtainable from profiling a run on a single core. If lock contention is shown to be a performance bottleneck, one way to mitigate it is to use another lock implementation. Similarly, it would be desirable to have techniques for estimating cache performance under different cache configurations, given by cache size, associativity and replacement policy.
We develop techniques to answer what-if questions about performance via performance modeling. In performance modeling, the behavior of the software on some platform is represented by a model, which describes key features of this behavior at some level of abstraction. Thereafter, the model is instantiated with parameters of the software, usually obtained via observation or measurement.
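As an illustration of instantiating a model with measured parameters, the sketch below treats a single lock as a closed queueing network and solves it with textbook mean value analysis. It is not necessarily the model developed in this project: the function names and the two parameters (`think_time`, the average time a thread spends outside the critical section, and `hold_time`, the average time it holds the lock) are assumptions made for this example, and the analysis assumes exponentially distributed times and FIFO lock hand-over.

```python
def lock_throughput(n_threads, think_time, hold_time):
    """Critical sections completed per time unit, by exact mean value analysis.

    Assumed model (hypothetical, for illustration): each thread alternates
    between non-critical work (a delay station with mean think_time) and a
    single FIFO lock (a queueing station with mean service time hold_time).
    """
    queue_len = 0.0   # mean number of threads at the lock, for n - 1 threads
    throughput = 0.0
    for n in range(1, n_threads + 1):
        # An arriving thread waits for the threads already at the lock,
        # then holds the lock itself.
        response = hold_time * (1.0 + queue_len)
        throughput = n / (think_time + response)
        queue_len = throughput * response
    return throughput


def predicted_speedup(n_threads, think_time, hold_time):
    """Throughput on n_threads cores relative to a single core."""
    base = lock_throughput(1, think_time, hold_time)
    return lock_throughput(n_threads, think_time, hold_time) / base
```

Both parameters could in principle be measured in a single-core profiling run. With `think_time=9.0` and `hold_time=1.0`, the predicted speedup can never exceed (9 + 1) / 1 = 10, however many cores are added, because the lock serializes one tenth of the work.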
In this project, we develop models that on the one hand can be analyzed to estimate performance for a range of platform configurations, and on the other hand can be constructed by low-cost software profiling on any available platform. We focus on two essential aspects of multi-core performance: cache performance and synchronization cost.
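For the cache side, a standard way to obtain estimates for many cache sizes from one profile is the reuse distance (stack distance) histogram. The sketch below is a minimal version of that general idea, not the project's modeling framework: the function names are hypothetical, the trace is a plain list of addresses, and the estimate covers only a fully associative LRU cache (associativity and other replacement policies would need additional modeling).

```python
from collections import Counter


def reuse_distance_histogram(trace):
    """Count accesses per reuse distance.

    The reuse distance of an access is the number of distinct other
    addresses touched since the previous access to the same address.
    First-time (cold) accesses are counted under the key None.
    """
    stack = []        # LRU stack, most recently used address last
    hist = Counter()
    for addr in trace:
        if addr in stack:
            idx = stack.index(addr)
            hist[len(stack) - 1 - idx] += 1
            stack.pop(idx)
        else:
            hist[None] += 1
        stack.append(addr)
    return hist


def lru_miss_ratio(hist, cache_lines):
    """Miss ratio of a fully associative LRU cache holding cache_lines lines."""
    total = sum(hist.values())
    misses = sum(count for dist, count in hist.items()
                 if dist is None or dist >= cache_lines)
    return misses / total


# One profile, several what-if cache sizes.
hist = reuse_distance_histogram(list("abcabcabc"))
for size in (2, 3):
    print(size, lru_miss_ratio(hist, size))
```

The histogram is built once from the trace, after which each cache size is evaluated without replaying the trace. This list-based version takes time proportional to the trace length times the number of distinct addresses, so it is only suitable for small examples.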
We also develop techniques for efficient synchronization in parallel programs running on multicores. We focus on parallel discrete event simulation (PDES), which is a particularly challenging application to parallelize. In collaboration with the Parallel Algorithms project in UPMARC, we apply these techniques to build an efficient simulator for spatial stochastic simulation.
- Xiaoyue Pan, Jonatan Lindén and Bengt Jonsson: Predicting the Cost of Lock Contention in Parallel Applications on Multicores using Analytic Modeling. Presented at Fifth Swedish Workshop on Multicore Computing, Nov. 22-23, 2012, KTH, Stockholm, Sweden.
- Jonatan Lindén and Bengt Jonsson: A Skiplist-based Concurrent Priority Queue with Minimal Memory Contention. In OPODIS 2013: 17th International Conference On Principles Of Distributed Systems, December 16-18, Nice, France. LNCS 8304, pp. 206-220, Springer, 2013. DOI 10.1007/978-3-319-03850-6_15. Extended version as Technical Report 2013-025. Code and SPIN model for the algorithm.
- Xiaoyue Pan and Bengt Jonsson: Modelling Coherence Cache Misses on Multi-core with Reuse Distance. In ISPASS 2014, IEEE International Symposium on Performance Analysis of Systems and Software, March 23-25, 2014, Monterey, CA.
- Xiaoyue Pan and Bengt Jonsson: A Modeling Framework for Reuse Distance-based Estimation of Cache Performance. In ISPASS 2015, IEEE International Symposium on Performance Analysis of Systems and Software, March 29-31, 2015, Philadelphia, PA.
- Xiaoyue Pan: Performance Modeling of Multi-core Systems: Caches and Locks, Ph.D. thesis, Uppsala University, Dept. IT, 2016.
- P. Bauer, J. Lindén, S. Engblom, B. Jonsson: Efficient Inter-Process Synchronization for Parallel Discrete Event Simulation on Multicores. In SIGSIM PADS 2015, ACM SIGSIM Conf. on Principles of Advanced Discrete Simulation, June 15-17, 2015, London, UK.
- J. Lindén, P. Bauer, S. Engblom, B. Jonsson: Exposing Inter-Process Information for Efficient Parallel Discrete Event Simulation of Spatial Stochastic Systems, in submission.
This project is funded by SSF (the Swedish Foundation for Strategic Research) as part of the project CoDeR-MP.