Department of Information Technology

DSZOOM Home Page

Low Latency Distributed Software-Based Shared Memory

Project Overview

Software implementations of shared memory still lag far behind hardware-based implementations in performance and are not viable for most fine-grain shared-memory applications. The main source of their inefficiency is the cost of interrupt-based asynchronous protocol processing, not the network latency itself. As the raw hardware latency of inter-node communication decreases, this asynchronous overhead accounts for a growing share of the communication cost. Elaborate schemes involving dedicated hardware and/or dedicated protocol processors have been suggested to cut the overhead.

This project demonstrates how the asynchronous overhead can be removed entirely by running the whole coherence protocol in the requesting processor. Doing so also puts to use a processor that would otherwise stall. The technique is applicable to both page-based and fine-grain software shared memory.
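The idea of a requester-driven protocol can be illustrated with a toy model. The sketch below is not taken from the DSZOOM code: it assumes an interconnect that lets a node read and write remote memory and take a remote lock without running any code on the remote processor, and the `Cluster` class, its three-state (I/S/M) protocol, and all method names are hypothetical. Every protocol step happens inside the `load` or `store` call issued by the requester.

```python
class Cluster:
    """Toy cluster: per-node memory and a directory reachable from any node."""

    def __init__(self, num_nodes):
        self.memory = [dict() for _ in range(num_nodes)]  # node -> {line: value}
        self.state = [dict() for _ in range(num_nodes)]   # node -> {line: "I"/"S"/"M"}
        self.directory = {}                               # line -> directory entry

    def _acquire(self, line):
        # Stand-in for a remote atomic test-and-set on the directory entry
        # (an assumed primitive). A real requester would spin on contention.
        entry = self.directory.setdefault(
            line, {"locked": False, "owner": 0, "sharers": set()})
        assert not entry["locked"], "single-threaded toy: lock is never contended"
        entry["locked"] = True
        return entry

    def load(self, requester, line):
        if self.state[requester].get(line, "I") != "I":
            return self.memory[requester][line]       # local hit, no protocol work
        entry = self._acquire(line)
        owner = entry["owner"]
        value = self.memory[owner].get(line, 0)       # remote read from the owner
        if self.state[owner].get(line) == "M":
            self.state[owner][line] = "S"             # remote write: downgrade owner
        self.memory[requester][line] = value
        self.state[requester][line] = "S"
        entry["sharers"].update({owner, requester})
        entry["locked"] = False                       # release the directory entry
        return value

    def store(self, requester, line, value):
        if self.state[requester].get(line, "I") != "M":
            entry = self._acquire(line)
            for node in entry["sharers"] - {requester}:
                self.state[node][line] = "I"          # remote write: invalidate copy
            entry["sharers"] = {requester}
            entry["owner"] = requester
            entry["locked"] = False
        self.memory[requester][line] = value
        self.state[requester][line] = "M"


cluster = Cluster(num_nodes=2)
cluster.store(0, "x", 42)       # node 0 becomes owner of line "x"
print(cluster.load(1, "x"))     # node 1 fetches the value from node 0's memory
```

Note that no method ever runs "on" the owner or the sharers: their state is changed by the requester through (simulated) remote writes, which is the property the paragraph above describes.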

The DSZOOM project is supported in part by Sun Microsystems, Inc., and the Parallel and Scientific Computing Institute (PSCI).

Project Contributors

Conference Publications and Presentations

This paper gives a complete overview of the basic DSZOOM system. It demonstrates how interrupt-based and poll-based asynchronous protocol processing can be removed entirely by running the whole coherence protocol in the requesting processor.

This paper presents a runtime system concept that enables unmodified POSIX P1003.1c (Pthreads) compliant binaries to run transparently on clustered hardware.

This paper introduces a new write permission cache (WPC) technique that exploits spatial store locality and batches coherence actions at runtime.
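As a rough illustration of how caching write permission can batch coherence actions, the sketch below models a single-entry cache that remembers the last line a thread obtained write permission for. It is only one plausible reading of the description above, not the paper's design: the `WritePermissionCache` class, the 64-byte line size, and the counters are all assumptions made for the example.

```python
class WritePermissionCache:
    """Single-entry cache of the line this thread may currently write."""

    def __init__(self, line_size=64):
        self.line_size = line_size   # assumed coherence unit, in bytes
        self.cached_line = None      # line for which write permission is held
        self.acquires = 0            # expensive permission acquisitions
        self.releases = 0            # expensive permission releases

    def store(self, address):
        line = address // self.line_size
        if line == self.cached_line:
            return "hit"             # permission already held, no coherence action
        self.flush()                 # give back permission for the previous line
        self.acquires += 1           # acquire permission for the new line
        self.cached_line = line
        return "miss"

    def flush(self):
        # Would also have to be called at synchronization points in a real system.
        if self.cached_line is not None:
            self.releases += 1
            self.cached_line = None


wpc = WritePermissionCache()
for address in range(0, 128, 8):     # 16 consecutive 8-byte stores
    wpc.store(address)
wpc.flush()
print(wpc.acquires, wpc.releases)
```

In this model, sixteen stores that touch two adjacent lines trigger two acquisitions instead of sixteen, which is the kind of saving spatial store locality makes possible.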

  • Flexibility Implies Performance by Håkan Zeffer, Zoran Radovic, and Erik Hagersten. Appears in Proceedings of the 20th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2006), Rhodes Island, Greece, April 2006.

Workshop Publications and Presentations

Technical Reports

  • TMA: A Trap-Based Memory Architecture by Håkan Zeffer, Zoran Radovic, Martin Karlsson, and Erik Hagersten. Technical report 2005-015, Department of Information Technology, Uppsala University, May 2005.
  • Flexibility Implies Performance by Håkan Zeffer, Zoran Radovic, and Erik Hagersten. Technical report 2005-013, Department of Information Technology, Uppsala University, April 2005.

Doctoral Thesis

Licentiate Thesis (a Swedish degree roughly halfway to a PhD)

Master's Thesis

Updated 2005-12-15 17:21:31 by Zoran Radovic.