Removing the Overhead from Software-Based Shared Memory
Zoran Radovic and Erik Hagersten
In Proceedings of Supercomputing 2001 (SC2001), Denver, Colorado, USA, November 2001.
The implementation presented in this paper, DSZOOM-WF, is a sequentially consistent, fine-grained distributed software-based shared memory. Its protocol-handling overhead for all the actions involved in a remote load operation is below one microsecond, compared with around ten microseconds for the fastest implementation to date. The all-software protocol assumes some basic low-level primitives in the cluster interconnect and operating-system bypass functionality, similar to the emerging InfiniBand standard. All interrupt- and/or poll-based asynchronous protocol processing is removed by running the entire coherence protocol in the requesting processor. This not only removes the asynchronous overhead but also makes use of a processor that would otherwise stall. The technique is applicable to both page-based and fine-grained software-based shared memory. DSZOOM-WF consistently demonstrates performance comparable to hardware-based distributed shared memory implementations.
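The idea of running the whole coherence protocol in the requesting processor can be illustrated with a toy model. The following is a minimal single-process sketch, not the DSZOOM-WF implementation: the `Cluster` class, its `atomic_test_and_set`, `remote_get`, and `release` methods, and the directory layout are all hypothetical stand-ins for the "basic low-level primitives" the abstract assumes. It shows only that a remote load can be completed by the requester alone, with no handler code running on the node that holds the data.

```python
from dataclasses import dataclass, field


@dataclass
class DirEntry:
    """Hypothetical per-block directory entry (layout is an assumption)."""
    locked: bool = False                       # changed only via the primitives below
    owner: int = 0                             # node holding the up-to-date copy
    sharers: set = field(default_factory=set)  # nodes holding a read copy


class Cluster:
    """Toy model of an interconnect with one-sided primitives (assumed API)."""

    def __init__(self, num_nodes, num_blocks):
        self.memory = [[0] * num_blocks for _ in range(num_nodes)]
        # Assumption for the sketch: block b is initially owned by node b % num_nodes.
        self.directory = [DirEntry(owner=b % num_nodes) for b in range(num_blocks)]

    def atomic_test_and_set(self, block):
        """Set the entry's lock flag and return its previous value."""
        entry = self.directory[block]
        was_locked = entry.locked
        entry.locked = True
        return was_locked

    def remote_get(self, node, block):
        """One-sided read of another node's memory; no code runs on `node`."""
        return self.memory[node][block]

    def release(self, block):
        """Clear the lock flag on the directory entry."""
        self.directory[block].locked = False


def remote_load(cluster, requester, block):
    """Perform every protocol action for a load on the requesting node."""
    # 1. Acquire the directory entry, retrying until the flag was previously clear.
    while cluster.atomic_test_and_set(block):
        pass
    entry = cluster.directory[block]
    # 2. Fetch the data straight from the current owner's memory.
    value = cluster.remote_get(entry.owner, block)
    # 3. Install a local copy and record the requester as a sharer.
    cluster.memory[requester][block] = value
    entry.sharers.add(requester)
    # 4. Release the entry so other requesters can proceed.
    cluster.release(block)
    return value
```

In this sketch the requester, which would otherwise stall waiting for the data, performs the directory lookup and update itself, so no interrupt or polling loop is needed on any other node.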
BibTeX file entry: Radovic:2001:nov