Licentiate thesis 2005-001

Efficient Memory Management for Message-Passing Concurrency - part I: Single-threaded execution

Jesper Wilhelmsson

May 2005


Manual memory management is error prone. Some of the errors it causes, in particular memory leaks and dangling pointers, are hard to find. Manual memory management becomes even harder when concurrency enters the picture. It therefore gets more and more important to overcome the problems of manual memory management in concurrent software as the interest in these applications increases with the development of new, multi-threaded, hardware.

To ease the implementation of concurrent software many programming languages these days come with automatic memory management and support for concurrency. This support, called the concurrency model of the language, comes in many flavors (shared data structures, message passing, etc.). The performance and scalability of applications implemented using such programming languages depends on the concurrency model, the memory architecture, and the memory manager used by the language. It is therefore important to investigate how different memory architectures and memory management schemes affect the implementation of concurrent software and what performance tradeoffs are involved.

This thesis investigates ways of efficiently implementing the memory architecture and memory manager in a concurrent programming language. The concurrency model which we explore is based on message passing with copying semantics. We start by presenting the implementation of three different memory architectures for concurrent software and give a detailed characterization of their pros and cons regarding message passing and efficiency in terms of memory management. The issue examined is whether to use private memory for all processes in a program or if there may be benefits in sharing all or parts of the memory. In one of the architectures looked at, called hybrid, a static analysis called message analysis is used to guide the allocation of message data.

Because the hybrid architecture is the enabling technology for a scalable multi-threaded implementation, we then focus on the hybrid architecture and investigate how to manage the memory using different garbage collection techniques. We present pros and cons of these techniques and discuss their characteristics and their performance in concurrent applications. Finally our experiences from turning the garbage collector incremental are presented. The effectiveness of the incremental collector is compared to the non-incremental version. On a wide range of benchmarks, the incremental collector we present is able to sustain high mutator utilization (about 80% during collection cycles) at a low cost.

This work is implemented in an industrial-strength implementation of the concurrent functional programming language Erlang. Our eventual goal is to use the hybrid architecture and the incremental garbage collector as the basis for an efficient multi-threaded implementation of Erlang. The work described in this thesis is a big step in that direction.

Available as Postscript (13.58 MB), compressed Postscript (1.11 MB), and PDF (929 kB)

Download BibTeX entry.