Composing task-based codes on heterogeneous architectures
Andra Hugo, INRIA Bordeaux
Date and Time
Tuesday, April 7th, 2015 at 14:00.
Polacksbacken, room 1245
Enabling HPC applications to perform efficiently when invoking multiple parallel libraries
simultaneously is a great challenge. Even if a uniform runtime system is used
underneath, scheduling tasks or threads coming from different libraries over the same set of
hardware resources introduces many issues, such as resource oversubscription, undesirable
cache flushes or memory bus contention.
We extend StarPU, a runtime system specifically designed for heterogeneous architectures,
in order to allow multiple parallel codes to run concurrently with minimal interference.
Such parallel codes run within scheduling contexts that provide confined
execution environments which can be used to partition computing resources. Scheduling
contexts can be dynamically resized to optimize the allocation of computing resources
among concurrently running libraries. We rely on a hypervisor in order to automatically
expand or shrink contexts using feedback from the runtime system (e.g. resource utilization).
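
As a rough illustration of feedback-driven resizing, the sketch below moves one worker from the least busy context to the busiest one. It is a toy model written for this announcement, not the StarPU hypervisor or its API: the function name `rebalance`, the dictionaries, and the `min_workers` parameter are all hypothetical.

```python
def rebalance(workers, utilization, min_workers=1):
    """Move one worker from the least busy context to the busiest one.

    workers:     dict mapping a context name to the number of workers it owns
    utilization: dict mapping a context name to the fraction of time its
                 workers were busy (0.0 to 1.0), as reported by the runtime
    Returns a new dict; the inputs are left unchanged.
    """
    result = dict(workers)
    # Only contexts that can spare a worker are eligible donors.
    donors = [c for c in result if result[c] > min_workers]
    if not donors:
        return result
    donor = min(donors, key=lambda c: utilization[c])
    receiver = max(result, key=lambda c: utilization[c])
    # Resize only when the receiver is actually busier than the donor.
    if donor != receiver and utilization[receiver] > utilization[donor]:
        result[donor] -= 1      # shrink the under-used context
        result[receiver] += 1   # expand the saturated context
    return result
```

Calling this periodically with fresh utilization figures would gradually shift workers toward the library that needs them most; a real policy would also have to weigh the cost of moving a worker.
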
We demonstrate the relevance of this approach by extending an existing generic
sparse direct solver (qr_mumps) to use these mechanisms and by introducing a new
decomposition method based on proportional mapping that is used to build the
scheduling contexts. In order to cope with the very irregular behavior of the application, the
hypervisor dynamically manages the allocation of resources. By means of the scheduling contexts
and the hypervisor, we improve the locality and thus the overall performance of the solver.
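
The core step of proportional mapping is to divide the workers assigned to a tree node among its child subtrees in proportion to their workload. The sketch below shows that single step, assuming each subtree's workload is already known as a number; the function name `split_workers` and the largest-remainder rounding are illustrative choices, not taken from qr_mumps or StarPU.

```python
def split_workers(n_workers, weights):
    """Split n_workers among subtrees in proportion to their weights.

    weights: list of positive workload estimates, one per subtree
    Returns a list of integer worker counts that sums to n_workers.
    """
    total = sum(weights)
    # Ideal (fractional) share of each subtree.
    shares = [n_workers * w / total for w in weights]
    # Start from the integer part of each share.
    counts = [int(s) for s in shares]
    leftover = n_workers - sum(counts)
    # Give the remaining workers to the largest fractional parts.
    order = sorted(range(len(weights)),
                   key=lambda i: shares[i] - counts[i],
                   reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts
```

Applied recursively from the root of the tree downwards, each resulting group of workers could back one scheduling context. Note that a small subtree may receive zero workers under this rounding, so a real decomposition would need a rule for merging or sharing in that case.
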