Paper Information - Self-Optimizing Memory Controllers: A Reinforcement Learning Approach

Efficiently utilizing off-chip DRAM bandwidth is a critical issue in designing cost-effective, high-performance chip multiprocessors (CMPs). Conventional memory controllers deliver relatively low performance in part because they often employ fixed, rigid access scheduling policies designed for average-case application behavior. As a result, they cannot learn and optimize the long-term performance impact of their scheduling decisions, and cannot adapt their scheduling policies to dynamic workload behavior. We propose a new, self-optimizing memory controller design that operates using the principles of reinforcement learning (RL) to overcome these limitations. Our RL-based memory controller observes the system state and estimates the long-term performance impact of each action it can take. In this way, the controller learns to optimize its scheduling policy on the fly to maximize long-term performance. Our results show that an RL-based memory controller improves the performance of a set of parallel applications run on a 4-core CMP by 19% on average (up to 33%), and it improves DRAM bandwidth utilization by 22% compared to a state-of-the-art controller.
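The idea of observing the system state and estimating the long-term impact of each action can be illustrated with a generic tabular Q-learning loop. This is only a minimal sketch and not the authors' controller: the abstract does not specify the learning algorithm, state attributes, or reward, so the command set `ACTIONS`, the class `QScheduler`, the state encoding, and the reward of 1 for a command that moves data are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical DRAM command set; the abstract does not enumerate the actions.
ACTIONS = ["precharge", "activate", "read", "write", "noop"]


class QScheduler:
    """Tabular Q-learning agent that picks one command per scheduling step."""

    def __init__(self, alpha=0.1, gamma=0.95, epsilon=0.05, seed=0):
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor for future reward
        self.epsilon = epsilon  # probability of taking a random exploratory action
        self.rng = random.Random(seed)
        # Maps (state, action) to the estimated long-term reward; unseen pairs are 0.
        self.q = defaultdict(float)

    def choose(self, state, legal_actions):
        """Pick the legal action with the highest estimate, exploring occasionally."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(legal_actions)
        return max(legal_actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state, next_legal):
        """Move the estimate toward reward plus the discounted best next estimate."""
        best_next = max((self.q[(next_state, a)] for a in next_legal), default=0.0)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


# Toy usage: the state is an assumed (queued_reads, queued_writes) pair, and the
# reward is an assumed 1.0 whenever the chosen command transfers data.
agent = QScheduler(epsilon=0.0)
state, next_state = (2, 0), (1, 0)
agent.update(state, "read", 1.0, next_state, ACTIONS)
chosen = agent.choose(state, ["read", "noop"])
```

With exploration disabled, `chosen` is `"read"` after the single rewarded update, because its estimate is now above the default of zero. A real controller would need a compact function approximator in place of the dictionary, since the space of hardware states is far too large to tabulate.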
