23.9 An 8-channel 4.5Gb 180GB/s 18ns-row-latency RAM for the last level cache

In recent years, the demand for memory performance has grown rapidly due to the increasing number of cores on a single CPU, along with the integration of graphics processing units and other accelerators. Caching has been a very effective way to relieve bandwidth demand and to reduce average memory latency. As shown by the cache feature table in Fig. 23.9.1, there is a big latency gap between SRAM caches in the CPU and the external DRAM main memory. As a key element for future computing systems, the last level cache (LLC) should have a high random access bandwidth, a low random access latency, a density of 1 to 8Gb, and all signal pads located on one side of the chip [1]. A logic-process-based solution was proposed [2], but it is not scalable, and has a high standby current due to its need for frequent refresh. HBM2 was also proposed [3], but its row latency is not better than conventional DRAM, and its random-access bandwidth is still limited by tFAW, as shown in Fig. 23.9.1. This paper describes the high-bandwidth low-latency (HBLL) RAM design: how it overcomes these challenges and meets requirements in a cost-effective way.

[1]  Sehat Sutardja,et al.  1.2 The future of IC design innovation , 2015, 2015 IEEE International Solid-State Circuits Conference - (ISSCC) Digest of Technical Papers.

[2]  Umut Arslan,et al.  13.1 A 1Gb 2GHz embedded DRAM in 22nm tri-gate CMOS technology , 2014, 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC).

[3]  Kyung Whan Kim,et al.  18.3 A 1.2V 64Gb 8-channel 256GB/s HBM DRAM with peripheral-base-die architecture and small-swing technique on heavy load interface , 2016, 2016 IEEE International Solid-State Circuits Conference (ISSCC).