Performance implications of context switches on misses to DRAM.

Date
2014-03-24
Authors
Meerdervoort, Lance Pompe v.
Abstract
Advances in microprocessor technology have resulted in a situation where CPU performance is improving faster than DRAM main memory performance. While current differences between CPU and DRAM speeds are not yet at the point where DRAM should be considered a slow peripheral, it is worthwhile to consider the possibility of current trends continuing long enough that DRAM does resemble a slow peripheral. Eventually, a cache miss to DRAM could resemble a main memory page fault to disk - an expensive delay experienced while the miss is handled. The logical approach may be to handle the miss as a page fault to disk is typically handled - by performing a context switch. A basic requirement for context switching on DRAM accesses is a fast context switch.

The RAMpage memory hierarchy, which manages the L2 cache as a paged memory and main memory as secondary storage, is the ideal platform for evaluating this idea, for two reasons. Firstly, the RAMpage model, in treating the secondary cache as main memory, allows context-switching code to be pinned in SRAM, meaning that this code cannot cause references to DRAM once in SRAM. This ensures a faster context switch on average and avoids the complications of recursion in the miss handler, in which switching on a miss causes a miss itself. Secondly, although the RAMpage hierarchy has been shown to substantially reduce the number of references to DRAM, this benefit is offset by an increase in the time needed to handle a DRAM access. There is thus good potential for improved CPU utilization, and therefore performance, through context switching. This is particularly true for large page sizes, where fewer but more expensive misses occur.

The objective of this study is to investigate the feasibility of context switching as an approach to dealing with the increasing gap between CPU and DRAM performance, and to evaluate the scalability of this approach as the speed gap widens.
This objective is achieved by using trace-driven simulation to model the performance of the RAMpage hierarchy with context switching on page faults to DRAM. Overall results show that context switching on page faults improves RAMpage performance over the standard hierarchy from a previous maximum of 17% to 25%. A secondary result shows that pinning the code used to perform context switches in SRAM under the RAMpage hierarchy substantially improves average context-switching time, particularly when large pages are used. Although a more detailed study would be needed to verify the merits of this approach fully, this work demonstrates that context switching on a DRAM access is an approach worthy of consideration, particularly in the context of the RAMpage model.
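The central idea - hiding DRAM miss latency behind a context switch rather than stalling the CPU - can be illustrated with a toy trace-driven model. The sketch below is not the simulator used in the study; the miss latency, switch cost, and miss rate are arbitrary assumed values chosen only to show why utilization improves when switch cost is much smaller than miss latency.

```python
# Toy trace-driven model (illustrative only, not the study's simulator):
# compares CPU utilization when the processor stalls on every DRAM miss
# versus context-switching to another ready thread while the miss is
# serviced. MISS_LATENCY and SWITCH_COST are assumed values.

MISS_LATENCY = 200  # cycles to service a DRAM miss (assumed)
SWITCH_COST = 20    # cycles for a switch with the handler pinned in SRAM (assumed)

def utilization(trace, context_switch):
    """trace: list of 'hit'/'miss' events, one per memory reference.
    Returns the fraction of cycles spent doing useful work."""
    busy = idle = 0
    for event in trace:
        busy += 1  # one cycle of useful work per reference
        if event == 'miss':
            if context_switch:
                # Pay the switch cost, then run another thread while the
                # miss is serviced, so most of the latency is hidden.
                idle += SWITCH_COST
                busy += MISS_LATENCY - SWITCH_COST
            else:
                idle += MISS_LATENCY  # CPU stalls for the full miss

    return busy / (busy + idle)

trace = ['hit'] * 95 + ['miss'] * 5  # 5% miss rate (assumed)
print(utilization(trace, context_switch=False))  # stall on every miss
print(utilization(trace, context_switch=True))   # switch on every miss
```

In this toy setting, switching raises utilization roughly tenfold, and the gain grows as the miss latency (the CPU-DRAM speed gap) widens relative to the switch cost - which is why a fast, SRAM-pinned context switch is the basic requirement identified in the abstract.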