The program

Monday through Wednesday Lunches (12:00)

Restaurang Feiroz, Blåsenhus, 750 02 Uppsala, Tel: 018 - 591918


Monday Dinner (19:45)

Åkanten, Eriks Torg, 753 10 Uppsala, Tel: 018 - 150150


Tuesday Dinner (19:30)

Peppar Peppar, Suttungs gränd 3, 753 19 Uppsala, Tel: 018 - 131360


 

Talks

Lecture: Multiprocessor Mixed-Criticality Scheduling
Slides (pdf)

Sanjoy Baruah

Department of Computer Science and Engineering
Washington University in St. Louis, USA


In mixed-criticality systems, functionalities of different degrees of importance are implemented upon a shared platform. Verifying the correctness of all functionalities in such a system at the level of assurance required by the highest criticality level tends to lead to implementations that use platform resources very inefficiently at run-time. Mixed-criticality scheduling theory was developed in order to obtain more resource-efficient implementations of mixed-criticality systems. During its first decade, the theory focused primarily on obtaining a better understanding of systems implemented upon uniprocessor platforms; we will survey some recent research that has attempted to extend the theory to multiprocessor platforms, and enumerate some challenges that must be overcome in order to develop a comprehensive theory of multiprocessor mixed-criticality scheduling.
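The resource-efficiency argument can be made concrete with a toy dual-criticality task set in the style of Vestal's model. The task names and numbers below are made up for illustration: each task has a period, a criticality level, and WCET estimates at increasing assurance levels (C_LO <= C_HI), with LO-criticality tasks characterized only at LO assurance.

```python
# Hypothetical dual-criticality task set (Vestal-style model).
tasks = [
    # (name, period, criticality, C_LO, C_HI)
    ("flight_ctrl", 10, "HI", 2, 6),
    ("telemetry",    5, "LO", 1, None),
    ("logging",     20, "LO", 4, None),
]

def util(selected, wcet_of):
    """Total utilization of the selected tasks under a given WCET choice."""
    return sum(wcet_of(t) / t[1] for t in selected)

# Naive approach: provision every task at the most pessimistic WCET,
# treating each LO task's only estimate as if it needed HI assurance too.
u_naive = util(tasks, lambda t: t[4] if t[4] is not None else t[3])

# Mixed-criticality view: in normal (LO) mode all tasks run within C_LO ...
u_lo_mode = util(tasks, lambda t: t[3])
# ... and if a HI task overruns its C_LO, LO tasks may be dropped, so only
# the HI tasks need to fit at their HI-assurance WCETs.
u_hi_mode = util([t for t in tasks if t[2] == "HI"], lambda t: t[4])

print(f"{u_naive:.2f} {u_lo_mode:.2f} {u_hi_mode:.2f}")  # 1.00 0.60 0.60
```

The naive provisioning saturates the processor (utilization 1.00), while in the mixed-criticality view each mode needs only 0.60 of the processor, leaving capacity to spare in both.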








Lecture: Selected Topics in Coherence and Consistency
(Slides)

Michel Dubois

Department of Electrical Engineering
University of Southern California, USA


In this presentation I will present the fundamentals of coherence, store atomicity, and memory consistency, and apply them to small- and large-scale multiprocessors with in-order and out-of-order CPUs. Time permitting, I will also cover advanced topics on these subjects.













Lecture: Application Programming on Parallel and Distributed Computing Platforms

Daniele Lezzi

Barcelona Supercomputing Center, Spain


The growing complexity of current computing systems spans hardware aspects (multicore processors, accelerators, GPUs, FPGAs, etc.) and system aspects (clusters and clouds). This forces application developers to deal with technological issues outside the scope of their work: they need to be aware of the concurrency of the system, the existing APIs for accessing the computing platform, and so on. This makes the code less readable and less portable, and in general decreases the productivity of the programmer. Several programming models that provide higher levels of abstraction have been proposed to cope with these complex systems. Task-based programming models have proven to be the right approach to exploit the large-scale parallelism of current systems, by enabling a data-flow execution model and avoiding global synchronization. COMPSs falls into this category: it is able to exploit the inherent concurrency of sequential applications and execute them on distributed platforms, including clusters and clouds, taking into account data locality and node heterogeneity.

This talk will describe the alternative programming models that exist for developing applications for parallel and distributed computing platforms, focusing mostly on COMPSs. Aspects to be described include: programming syntax, runtime features, interoperability between different clouds and computing platforms, access to heterogeneous devices (GPUs, FPGAs), access to/from mobile devices, and development on new IoT platforms.
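The data-flow execution model the abstract refers to can be sketched with nothing but the Python standard library. The workflow and function names below are hypothetical, and this is not COMPSs syntax (COMPSs infers the task graph from annotated sequential code and can run it on clusters and clouds); the sketch only shows the key idea that dependencies are per-task rather than global barriers.

```python
from concurrent.futures import ThreadPoolExecutor

def load(block):        # hypothetical I/O task: produce one data block
    return list(range(block * 4, block * 4 + 4))

def transform(data):    # depends only on its own input block
    return [x * x for x in data]

def reduce_sum(parts):  # runs once all transform results are available
    return sum(sum(p) for p in parts)

with ThreadPoolExecutor(max_workers=4) as pool:
    loads = [pool.submit(load, b) for b in range(4)]
    # Each transform starts as soon as *its* load finishes -- dependencies
    # are expressed per task (data flow), not as a global synchronization.
    transforms = [pool.submit(lambda f=f: transform(f.result())) for f in loads]
    total = reduce_sum([f.result() for f in transforms])

print(total)  # 1240 (sum of squares of 0..15)
```

A task-based runtime like COMPSs plays the role of the executor here, but additionally builds the dependency graph automatically and schedules tasks across distributed, possibly heterogeneous nodes.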








Lecture: P: Modular and Safe Asynchronous Programming

Shaz Qadeer

Microsoft Research, USA


We describe the design and implementation of P, an asynchronous event-driven programming language. P allows the programmer to specify the system as a collection of interacting state machines, which communicate with each other using events. P unifies modeling and programming into one activity for the programmer. Not only can a P program be compiled into executable code, but it can also be validated using systematic testing. P was first used to implement and validate the USB device driver stack that ships with Microsoft Windows 8 and Windows Phone. P is now also being used for the design and implementation of robotics and distributed systems inside Microsoft and in academia.
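The programming style P supports, interacting state machines communicating through events, can be roughly illustrated in plain Python. This is not P syntax, just a hypothetical ping/pong protocol using queues as event inboxes, with each machine reacting to one event at a time.

```python
import threading
from queue import Queue

def ping(pong_inbox, my_inbox, rounds):
    """Ping machine: alternates between sending PING and awaiting PONG."""
    for _ in range(rounds):
        pong_inbox.put(("PING", my_inbox))  # send an event with a reply target
        assert my_inbox.get() == "PONG"     # transition only on the expected event
    pong_inbox.put(("STOP", None))          # tell the other machine to halt
    return rounds

def pong(my_inbox):
    """Pong machine: replies PONG to every PING until STOP arrives."""
    while True:
        event, reply_to = my_inbox.get()
        if event == "STOP":
            return
        reply_to.put("PONG")

pong_inbox, ping_inbox = Queue(), Queue()
t = threading.Thread(target=pong, args=(pong_inbox,))
t.start()
done = ping(pong_inbox, ping_inbox, 3)
t.join()
print(f"{done} ping/pong rounds completed")
```

In P, machines, states, and event handlers are first-class language constructs, and the same program can be both compiled to executable code and explored exhaustively by a systematic tester.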








Lecture: Lock-free Concurrent Data Structures
Slides (part1, part2)

Philippas Tsigas

Department of Computer Science and Engineering
Chalmers University of Technology, Sweden


Concurrent data structures provide the means for multi-threaded applications to share data. Concurrent data structure designers strive to maintain consistency while keeping the use of mutual exclusion and expensive synchronization to a minimum, in order to prevent the data structure from becoming a sequential bottleneck. Maintaining consistency in the presence of many simultaneous updates is a complex task. Standard implementations of data structures are based on locks in order to avoid inconsistency of the shared data due to concurrent modifications. Locks, however, introduce a sequential component in Amdahl's law.

Lock-free implementations of data structures support concurrent access without mutual exclusion: all steps of the supported operations can be executed concurrently. They employ an optimistic conflict-control approach, allowing several processes to access the shared data object at the same time. They suffer delays only when there is an actual memory conflict between operations that causes some operation to retry, and usually only part of the operation needs to be repeated. This feature allows lock-free algorithms to perform much better as the number of threads increases.
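The optimistic retry loop can be seen in Treiber's classic lock-free stack, which needs only a single compare-and-swap (CAS) on the head pointer. Python exposes no hardware CAS instruction, so the `_cas` method below simulates one with a lock purely so the structure of the algorithm stays visible; a real implementation would use the hardware primitive directly (e.g. `std::atomic::compare_exchange_weak` in C++).

```python
import threading

class Node:
    __slots__ = ("value", "next")
    def __init__(self, value, next):
        self.value, self.next = value, next

class TreiberStack:
    """Treiber's lock-free stack, with CAS simulated for illustration."""
    def __init__(self):
        self.head = None
        self._sim = threading.Lock()  # simulation artifact only, not part of the algorithm

    def _cas(self, expected, new):
        """Atomically: if head is still `expected`, swing it to `new`."""
        with self._sim:
            if self.head is expected:
                self.head = new
                return True
            return False

    def push(self, value):
        while True:                       # optimistic retry loop
            old = self.head
            if self._cas(old, Node(value, old)):
                return                    # retries only on an actual conflict

    def pop(self):
        while True:
            old = self.head
            if old is None:
                return None               # empty stack
            if self._cas(old, old.next):
                return old.value

s = TreiberStack()
for v in (1, 2, 3):
    s.push(v)
print(s.pop(), s.pop(), s.pop())  # 3 2 1 (LIFO order)
```

Note that no operation ever holds the structure "closed" while it works: a push or pop that loses a CAS race simply re-reads the head and tries again, which is exactly the optimistic conflict-control approach described above.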

The course will discuss lock-free algorithms for concurrent data structures, with a focus on: i) algorithmic techniques that can be used for devising efficient lock-free implementations; and ii) models and analysis frameworks for capturing the performance behavior of lock-free concurrent data structures.








Tutorial: Why Memory Consistency Models Matter… and Tools for Analyzing and Verifying Them (Slides)

Yatin Manerkar and Caroline Trippel

Department of Computer Science
Princeton University, USA


Heterogeneous parallelism and specialization are widely-used architecture levers for achieving high performance and power efficiency, especially inside smartphones and mobile devices. However, with many systems-on-chip comprising 6-10 different instruction set architectures and complex storage hierarchies, heterogeneous parallelism unfortunately often comes with increased challenges for software reliability, interoperability, and performance portability.

At the same time, memory consistency models (MCMs), which originated with Lamport's articulation of sequential consistency in 1979, have risen in both importance and complexity over recent years. MCMs are central to systems design: they establish the rules that determine something as fundamental as the value a load instruction may legally return. Despite their importance, MCMs are often incompletely or informally specified, in ways that make them hard to design for, hard to verify, and hard to understand.
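The kind of question an MCM answers is captured by litmus tests such as the classic store-buffering (SB) test. As a small illustration (not one of the tutorial's tools), the sketch below enumerates every sequentially consistent execution of SB and shows that the outcome r0 == r1 == 0 never arises under SC, even though store-buffering hardware such as x86-TSO does permit it.

```python
from itertools import permutations

# Store-buffering (SB) litmus test, x and y initially 0:
#   thread 0: x = 1; r0 = y        thread 1: y = 1; r1 = x
T0 = [("st", "x"), ("ld", "y", "r0")]
T1 = [("st", "y"), ("ld", "x", "r1")]

def outcomes_under_sc():
    """All (r0, r1) outcomes reachable under sequential consistency."""
    seen = set()
    # Every SC execution is an interleaving that preserves program order.
    for order in set(permutations([0, 0, 1, 1])):
        mem, regs, idx = {"x": 0, "y": 0}, {}, [0, 0]
        for tid in order:
            op = (T0, T1)[tid][idx[tid]]
            idx[tid] += 1
            if op[0] == "st":
                mem[op[1]] = 1
            else:
                regs[op[2]] = mem[op[1]]
        seen.add((regs["r0"], regs["r1"]))
    return seen

print(sorted(outcomes_under_sc()))  # [(0, 1), (1, 0), (1, 1)] -- never (0, 0)
```

Observing (0, 0) on real hardware is therefore direct evidence of a weaker model than SC, which is exactly the sort of ordering behavior the Check tool suite analyzes across compiler, ISA, and microarchitecture levels.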

Over the past four years, our work has explored a set of issues for heterogeneously parallel systems, particularly related to specifying and verifying memory consistency models (MCMs), from high-level languages, down through compilers and operating systems and ISAs, to microarchitecture specifications, and eventually down to the level of processor RTL. The suite of MCM verification tools we have developed (http://check.cs.princeton.edu) offers comprehensive and fast analysis of memory ordering behavior across multiple system levels. These tools have been used to find bugs in existing and proposed processors and in commercial compilers. They have also been used to identify shortcomings in the specifications of high-level languages (C++11) and instruction set architectures (RISC-V).

Geared for parallel programmers and computer architects, this 3-hour tutorial is intended to be accessible and worthwhile for a broad audience. We will first offer introductory information on memory consistency model basics. We will then discuss the key MCM challenges that exist in today’s hardware and software designs. We will use real-life examples to motivate why MCMs matter, and how they can be more crisply specified and efficiently verified. Our examples will highlight the role of different system layers (high-level languages, compilers, operating systems, architecture, microarchitecture, and RTL). The tutorial will include hands-on segments using open-source tools (http://check.cs.princeton.edu/tutorial.html) and we invite attendees to BYO litmus tests or design scenarios to drive exploratory work with the tools at the end of the tutorial. Most of the tutorial has been given once before at ISCA 2017.