Department of Information Technology
Uppsala Architecture Research Team

Full Speed Ahead: Detailed Architectural Simulation at Near-Native Speed

Most microarchitecture simulators are several orders of magnitude slower than the systems they simulate. This leads to multiple problems:

  • Due to the slow simulation rate, simulation studies are usually limited to the first few billion instructions, which corresponds to less than 10% of the execution time of many standard benchmarks. Since such studies cover only a small fraction of each application's execution, they run the risk of reporting unrepresentative application behavior.
  • The high overhead of traditional simulators makes them unsuitable for hardware/software co-design studies where rapid turn-around is required.
  • Experimental setup is cumbersome since the high overhead makes interactive use painful.
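
The first problem can be illustrated with a toy model. The sketch below assumes a hypothetical workload whose first 10% runs at a lower IPC than the rest. The phase IPCs, workload length, and helper names are invented for illustration and do not come from any benchmark. It compares an estimate taken only from the start of the run with one taken from samples spread over the whole run.

```python
# Toy model (all values are illustrative assumptions, not measurements).
TOTAL = 1_000_000   # workload length in "instructions"
INTERVAL = 1_000    # length of one measured interval


def true_ipc(instr_index, total=TOTAL):
    """Hypothetical phase behavior: slow start-up phase, then a faster phase."""
    return 0.5 if instr_index < total // 10 else 1.5


def aggregate_ipc(ipc_values):
    """IPC over equal-length intervals = instructions / cycles (harmonic mean)."""
    cycles = sum(1.0 / ipc for ipc in ipc_values)
    return len(ipc_values) / cycles


def estimate(starts):
    """Estimate IPC from the intervals that begin at the given positions."""
    return aggregate_ipc([true_ipc(s) for s in starts])


# Reference: measure every interval of the run.
reference = estimate(range(0, TOTAL, INTERVAL))
# Prefix only: measure just the first 10% of the run.
prefix_only = estimate(range(0, TOTAL // 10, INTERVAL))
# Sampled: measure one interval out of every 50, across the whole run.
sampled = estimate(range(0, TOTAL, 50 * INTERVAL))

print(f"reference={reference:.2f} prefix={prefix_only:.2f} sampled={sampled:.2f}")
```

In this toy setup the prefix-only estimate sees only the start-up phase, while the spread samples match the whole-run value. The exact match is an artifact of the phases lining up with the sampling period, which real workloads will not guarantee.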

We propose two sampling methodologies to address these issues: FSA (Full Speed Ahead) and pFSA (Parallel Full Speed Ahead). Both combine detailed simulation (gem5) with off-the-shelf hardware virtualization support (kvm) to accelerate simulation. We implement hardware virtualization as a separate CPU module in the gem5 full-system simulator, which enables both efficient performance sampling (FSA, pFSA) and traditional fast-forwarding to generate checkpoints. Using an efficient state copying mechanism, pFSA simulates multiple samples in parallel and achieves almost linear speedup until it approaches native execution speed.
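
Why parallel sampling can scale almost linearly and then level off near the fast-forwarding rate can be seen in a simple analytic model. The sketch below is only a back-of-the-envelope model, not the gem5 implementation. The function names, the `copy_cost` parameter, and every number are hypothetical. It assumes that serial sampling interleaves fast-forwarding and detailed samples on one core, and that parallel sampling lets the fast-forwarding process continue while worker cores simulate the samples.

```python
def fsa_time(total_instr, ff_rate, n_samples, sample_cost):
    """Serial model: fast-forwarding and detailed samples share one core."""
    return total_instr / ff_rate + n_samples * sample_cost


def pfsa_time(total_instr, ff_rate, n_samples, sample_cost, workers,
              copy_cost=0.0):
    """Parallel model: the fast-forwarding process never waits for samples.

    It pays copy_cost seconds per sample to hand a state copy to a worker;
    the detailed work is spread evenly across the workers.
    """
    parent = total_instr / ff_rate + n_samples * copy_cost
    sample_work = n_samples * sample_cost / workers
    return max(parent, sample_work)


# Illustrative assumptions only (not results from this work):
TOTAL_INSTR = 1e12   # instructions in the workload
FF_RATE = 3e9        # instructions/second while fast-forwarding
N_SAMPLES = 1000     # number of detailed samples
SAMPLE_COST = 2.0    # seconds of detailed simulation per sample

serial = fsa_time(TOTAL_INSTR, FF_RATE, N_SAMPLES, SAMPLE_COST)
print(f"serial: {TOTAL_INSTR / serial / 1e9:.2f} GIPS")
for workers in (1, 2, 4, 8, 16):
    t = pfsa_time(TOTAL_INSTR, FF_RATE, N_SAMPLES, SAMPLE_COST, workers)
    print(f"{workers:2d} workers: {TOTAL_INSTR / t / 1e9:.2f} GIPS")
```

With these made-up parameters the modelled rate doubles with each doubling of workers until the detailed samples are no longer the bottleneck, after which it is capped by the fast-forwarding rate. A non-zero `copy_cost` lowers that cap, which is why the efficiency of the state copying mechanism matters.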

We have demonstrated that virtualization can fast-forward the gem5 full-system simulator at 90% of native execution speed on average across SPEC CPU2006. Using virtualized fast-forwarding, we have also demonstrated that pFSA can accurately estimate the IPC of standard workloads with an average error of 2.2% while still reaching an execution rate of 2.0 GIPS (63% of native) on average.
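
The reported figures can be related with a little arithmetic. The snippet below is a rough sketch: it treats the averages as if they described a single machine and benchmark, which is an assumption, so the derived values are only approximate. The variable names are chosen for illustration.

```python
# Figures quoted in the text above.
pfsa_rate_gips = 2.0             # pFSA execution rate (average)
pfsa_fraction_of_native = 0.63   # pFSA rate relative to native
ff_fraction_of_native = 0.90     # virtualized fast-forwarding relative to native

# Derived values (approximate: ratios of averages, not averages of ratios).
implied_native_gips = pfsa_rate_gips / pfsa_fraction_of_native
implied_ff_gips = ff_fraction_of_native * implied_native_gips

print(f"implied native rate:       about {implied_native_gips:.2f} GIPS")
print(f"implied fast-forward rate: about {implied_ff_gips:.2f} GIPS")
```

Under that assumption, native execution works out to roughly 3.2 GIPS and virtualized fast-forwarding to roughly 2.9 GIPS.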

Execution rates

[Figure: Comparison of Execution Rates]

IPC prediction accuracy



Updated  2014-04-14 18:34:19 by Andreas Sandberg.