28 September 2012Abstract:
This thesis deals with how to develop scientific computing software that runs efficiently on multicore processors. The goal is to find building blocks and programming models that increase the productivity and reduce the probability of programming errors when developing parallel software.
In our search for new building blocks, we evaluate the use of hardware transactional memory for constructing atomic floating point operations. Using benchmark applications from scientific computing, we show in which situations this achieves better performance than other approaches.
Driven by the needs of scientific computing applications, we develop a programming model and implement it as a reusable library. The library provides a run-time system for executing tasks on multicore architectures, with efficient and user-friendly management of dependencies. Our results from scientific computing benchmarks show excellent scaling up to at least 64 cores. We also investigate how the execution time depend on the task granularity, and build a model for the performance of the task library.
Available as PDF (1.92 MB)
Download BibTeX entry.