Optimal data layout for block-level random accesses to scratchpad
[1] Pei-Yun Tsai, et al. A Generalized Conflict-Free Memory Addressing Scheme for Continuous-Flow Parallel-Processing FFT Processors With Rescheduling, 2011, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[2] Cynthia A. Phillips, et al. Two-Level Main Memory Co-Design: Multi-threaded Algorithmic Primitives, Analysis, and Simulation, 2015, 2015 IEEE International Parallel and Distributed Processing Symposium.
[3] Cynthia A. Phillips, et al. k-Means Clustering on Two-Level Memory Systems, 2015, MEMSYS.
[4] C. Nicopoulos, et al. Design and Management of 3D Chip Multiprocessors Using Network-in-Memory, 2006, ISCA 2006.
[5] Franz Franchetti, et al. A 3D-stacked logic-in-memory accelerator for application-specific data intensive computing, 2013, 2013 IEEE International 3D Systems Integration Conference (3DIC).
[6] Onur Mutlu, et al. Ramulator: A Fast and Extensible DRAM Simulator, 2016, IEEE Computer Architecture Letters.
[7] Viktor K. Prasanna, et al. On-chip memory efficient data layout for 2D FFT on 3D memory integrated FPGA, 2016, 2016 IEEE High Performance Extreme Computing Conference (HPEC).
[8] Yong Chen, et al. HMC-Sim: A Simulation Framework for Hybrid Memory Cube Devices, 2014, 2014 IEEE International Parallel & Distributed Processing Symposium Workshops.
[9] Zhao Zhang, et al. A permutation-based page interleaving scheme to reduce row-buffer conflicts and exploit data locality, 2000, MICRO 33.
[10] Michael J. Levenhagen, et al. Exploring Memory Management Strategies in Catamount, 2008.
[11] Onur Mutlu, et al. Simultaneous Multi-Layer Access, 2016, ACM Trans. Archit. Code Optim.
[12] Jan Reineke, et al. Ascertaining Uncertainty for Efficient Exact Cache Analysis, 2017, CAV.
[13] Christian Bienia, et al. PARSEC 2.0: A New Benchmark Suite for Chip-Multiprocessors, 2009.
[14] Jung Ho Ahn, et al. CACTI-3DD: Architecture-level modeling for 3D die-stacked DRAM main memory, 2012, 2012 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[15] Luca Benini, et al. Logic-Base Interconnect Design for Near Memory Computing in the Smart Memory Cube, 2017, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[16] Jing-ling Yang. Parallel Interleavers Through Optimized Memory Address Remapping, 2010, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[17] Franz Franchetti, et al. Data reorganization in memory using 3D-stacked DRAM, 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[18] Sally A. McKee, et al. Hitting the memory wall: implications of the obvious, 1995, CARN.
[19] Gabriel H. Loh, et al. 3D-Stacked Memory Architectures for Multi-core Processors, 2008, 2008 International Symposium on Computer Architecture.
[20] Cynthia A. Phillips, et al. Two-Level Main Memory Co-Design: Multi-threaded Algorithmic Primitives, Analysis, and Simulation, 2015, IPDPS.
[21] Hsien-Hsin S. Lee, et al. An optimized 3D-stacked memory architecture by exploiting excessive, high-density TSV bandwidth, 2010, HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture.