The RAMpage hierarchy moves main memory up a level, replacing the lowest-level cache with an equivalent-sized SRAM main memory, with a TLB caching page translations for that main memory. This paper illustrates how more aggressive components higher in the hierarchy increase the fraction of total execution time spent waiting for DRAM. At an instruction issue rate of 1 GHz, the simulated standard hierarchy spent 10% of its time waiting for DRAM, rising to 40% at an issue rate of 8 GHz. With a larger L1 cache, the fraction of time spent waiting for DRAM was even higher. RAMpage with context switches on misses was able to hide almost all DRAM latency. A larger TLB was shown to increase the viable range of RAMpage SRAM page sizes.
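The trend in the DRAM figures above can be illustrated with a back-of-envelope model. This is a minimal sketch, not the paper's simulation: it assumes DRAM wait time stays fixed in absolute terms while all other execution time shrinks in proportion to the issue rate. The function name `dram_fraction` and its parameters are hypothetical, introduced only for this illustration.

```python
def dram_fraction(base_fraction: float, speedup: float) -> float:
    """Estimate the fraction of time spent waiting for DRAM after a speedup.

    Assumption (not from the paper): DRAM wait time is constant in absolute
    terms, and all remaining execution time scales as 1 / speedup.
    """
    if not 0.0 <= base_fraction <= 1.0:
        raise ValueError("base_fraction must lie in [0, 1]")
    if speedup <= 0.0:
        raise ValueError("speedup must be positive")
    dram_time = base_fraction                     # unchanged by the speedup
    other_time = (1.0 - base_fraction) / speedup  # shrinks with issue rate
    return dram_time / (dram_time + other_time)


if __name__ == "__main__":
    # Start from the reported 10% at 1 GHz and scale the issue rate.
    for rate_ghz in (1, 2, 4, 8):
        print(rate_ghz, "GHz:", round(dram_fraction(0.10, rate_ghz), 3))
```

Under these assumptions the model gives roughly 47% at 8 GHz, somewhat above the simulated 40%. One plausible reason is that not all non-DRAM time scales with the issue rate, so the sketch should be read only as an illustration of why the DRAM share grows as the rest of the hierarchy gets faster.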
© Springer-Verlag 2003; published as a volume of LNCS