One essential component and a common bottleneck in current virtual memory systems is the translation lookaside buffer (TLB), a small, specialised cache that speeds up memory accesses by storing recently used address translations. A TLB can be viewed as a hash table that only has the capacity for holding a subset of the actively used address translations.
The traditional way to increase the performance of a TLB (other than making it larger) is to increase associativity, typically performing multiple comparisons in parallel to avoid slowing down lookups; however, this is expensive in terms of chip area and energy consumption. Skewed associativity, i.e. using several different hash functions for parallel lookups, has been demonstrated to yield good results with less parallelism and therefore at a lower cost.
In skewed-associative models, the sets of possible placements for two entries may only partially overlap. Thus, the current placement of entries will limit future replacement possibilities. This is an inherent inflexibility in traditional skewed-associative models, since we cannot predict which placements will enable the most desirable future replacement choices.
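The partial overlap can be illustrated with a minimal Python sketch. The two index functions, the number of ways (`WAYS`), and the number of entries per way (`SETS`) are hypothetical choices made for illustration only; they are not taken from the thesis.

```python
WAYS = 2   # assumed number of ways (one hash function per way)
SETS = 8   # assumed number of entries per way

def skew_hash(vpn: int, way: int) -> int:
    """Hypothetical per-way index function for a virtual page number."""
    low = vpn % SETS
    high = (vpn // SETS) % SETS
    if way == 0:
        return low ^ high
    return (low + 3 * high) % SETS

def candidates(vpn: int) -> list[tuple[int, int]]:
    """All (way, index) slots in which a translation for vpn may be placed."""
    return [(way, skew_hash(vpn, way)) for way in range(WAYS)]

# Pages 0 and 9 compete for the same slot in way 0 but not in way 1.
shared = set(candidates(0)) & set(candidates(9))
print(candidates(0), candidates(9), shared)
```

In a conventional set-associative design, two pages either share all of their candidate slots or none of them. Here the two pages share exactly one slot, so whichever slot the first page ends up in determines whether the second page later conflicts with it.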
This thesis demonstrates how the performance of skewed-associative TLB models can be enhanced further by reorganisation, that is, moving old entries around to allow for more efficient replacements. This makes more efficient use of TLB locations, increasing performance without further complicating lookups. The thesis also introduces and demonstrates a collision tendency metric that enables simple comparison of the conflict miss vulnerability of a multitude of associativity models and degrees of associativity over a large range of sizes.
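As a rough illustration of the idea of reorganisation, the following Python sketch relocates an existing entry to one of its own alternative slots before resorting to eviction. The single-step relocation policy, the fallback eviction from way 0, the index functions, and all names (`SkewedTLB`, `skew_hash`, `WAYS`, `SETS`) are assumptions made for this example; the abstract does not specify the policy actually used in the thesis.

```python
WAYS = 2   # assumed number of ways
SETS = 8   # assumed number of entries per way

def skew_hash(vpn, way):
    """Hypothetical per-way index function (not taken from the thesis)."""
    low = vpn % SETS
    high = (vpn // SETS) % SETS
    return low ^ high if way == 0 else (low + 3 * high) % SETS

class SkewedTLB:
    def __init__(self):
        # Each slot holds a (vpn, frame) pair or None.
        self.slots = [[None] * SETS for _ in range(WAYS)]

    def lookup(self, vpn):
        """Probe one slot per way; return the frame or None on a miss."""
        for way in range(WAYS):
            entry = self.slots[way][skew_hash(vpn, way)]
            if entry is not None and entry[0] == vpn:
                return entry[1]
        return None

    def insert(self, vpn, frame):
        """Insert after a miss. Returns the evicted vpn, or None."""
        # 1. Use a free candidate slot if there is one.
        for way in range(WAYS):
            idx = skew_hash(vpn, way)
            if self.slots[way][idx] is None:
                self.slots[way][idx] = (vpn, frame)
                return None
        # 2. Reorganise: move an occupant to a free alternative slot of its own.
        for way in range(WAYS):
            idx = skew_hash(vpn, way)
            old = self.slots[way][idx]
            for other in range(WAYS):
                if other == way:
                    continue
                alt = skew_hash(old[0], other)
                if self.slots[other][alt] is None:
                    self.slots[other][alt] = old
                    self.slots[way][idx] = (vpn, frame)
                    return None
        # 3. No move possible: evict the occupant in way 0 (placeholder policy).
        idx = skew_hash(vpn, 0)
        victim = self.slots[0][idx]
        self.slots[0][idx] = (vpn, frame)
        return victim[0]

tlb = SkewedTLB()
tlb.insert(0, 100)    # placed in way 0, index 0
tlb.insert(9, 109)    # way 0 is taken, placed in way 1, index 4
evicted = tlb.insert(27, 127)  # both candidates taken; page 0 is moved instead
```

With these assumed index functions, page 27 maps to the slots already held by pages 0 and 9. Moving page 0 to its free slot in way 1 keeps all three translations resident, whereas a scheme without reorganisation would have had to evict one of them. Lookups still probe exactly one slot per way.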
Simulations demonstrate that, with skewed-associative techniques and reorganisation, efficient TLBs can be implemented with far less parallelism in hardware, allowing for more compact and much less energy-consuming designs without sacrificing performance.
Additionally, this thesis discusses adapting the skewed-associative TLB with reorganisation to handle real-time requirements, notably in applications where tasks with different real-time needs are run concurrently.
Note: M.Sc. thesis