Micron Technology revealed that it has commenced sampling the industry’s first 8-high 24GB HBM3 Gen2 memory, ushering in a new era of high-performance data center solutions.
The HBM3 Gen2 memory delivers bandwidth greater than 1.2TB/s and pin speeds exceeding 9.2Gb/s, a 50 percent improvement over currently shipping HBM3 solutions.
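The relationship between per-pin speed and per-stack bandwidth can be sketched with a quick calculation. This assumes the standard 1024-bit HBM interface width; the specific pin rates plugged in below are illustrative, not Micron-published figures.

```python
# Sketch: how per-pin speed relates to aggregate stack bandwidth for HBM.
# Assumes the standard 1024-bit (1024-pin) HBM data interface.

INTERFACE_WIDTH_BITS = 1024  # data pins per HBM stack (JEDEC HBM standard)

def stack_bandwidth_tbps(pin_rate_gbps: float) -> float:
    """Aggregate stack bandwidth in TB/s for a given per-pin rate in Gb/s."""
    return pin_rate_gbps * INTERFACE_WIDTH_BITS / 8 / 1000

print(stack_bandwidth_tbps(9.2))  # ~1.18 TB/s at exactly 9.2 Gb/s
print(stack_bandwidth_tbps(9.6))  # ~1.23 TB/s, above the 1.2 TB/s mark
```

This shows why the article pairs "pin speed over 9.2Gb/s" with "bandwidth greater than 1.2TB/s": pin rates somewhat above 9.2Gb/s are what push a 1024-bit stack past the 1.2TB/s threshold.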
Micron’s latest offering brings unprecedented performance, capacity, and power efficiency to artificial intelligence (AI) data centers. With a 2.5 times performance-per-watt improvement over previous generations, the HBM3 Gen2 is set to revolutionize the AI landscape. Shorter training times for large language models, such as GPT-4 and beyond, are expected to make AI research and development more efficient and cost-effective.
The foundation of Micron’s HBM solution is its industry-leading 1β (1-beta) DRAM process node, which allows eight 24Gb DRAM dies to be stacked into an 8-high cube within industry-standard package dimensions. Notably, Micron will also introduce a 12-high stack with 36GB capacity in the first quarter of calendar 2024, a 50 percent increase in capacity for a given stack height compared to existing competitive solutions.
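The capacity figures follow directly from the die density and stack height quoted in the announcement, as a quick calculation confirms:

```python
# Sketch: stack capacity from die density and stack height, using the
# figures in the announcement (24Gb dies; 8-high and 12-high stacks).

def stack_capacity_gbytes(die_gbits: int, stack_height: int) -> int:
    """Total stack capacity in GB: die density (Gb) x dies, converted to bytes."""
    return die_gbits * stack_height // 8

print(stack_capacity_gbytes(24, 8))   # 24 GB -- the sampled 8-high cube
print(stack_capacity_gbytes(24, 12))  # 36 GB -- the 12-high stack due in 2024
```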
The key to the HBM3 Gen2’s success lies in its ability to manage the intense power demands of modern AI data centers. Micron has achieved this through several advancements: doubling the through-silicon vias (TSVs) over competitive HBM3 offerings, reducing thermal impedance through a fivefold increase in metal density, and designing an energy-efficient data path.
Micron’s impressive technological strides have been recognized by TSMC, leading to a partnership in the 3DFabric Alliance. TSMC has received samples of Micron’s HBM3 Gen2 memory and is collaborating with the company to evaluate its potential for high-performance computing (HPC) applications.
The significance of Micron’s HBM3 Gen2 solution lies in its ability to meet the increasing demands of generative AI for multimodal, multitrillion-parameter AI models. With 24GB of capacity per cube and pin speeds exceeding 9.2Gb/s, the HBM3 Gen2 reduces training times for large language models by more than 30 percent, lowering total cost of ownership. The offering also enables a significant increase in queries per day, improving the efficiency of trained AI models.
For modern AI data centers, Micron’s HBM3 Gen2 memory translates into tangible cost savings. For instance, with an installation of 10 million GPUs, every five watts of power savings per HBM cube is estimated to save operational expenses of up to $550 million over a five-year period.
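A back-of-the-envelope check shows how the cited figure could arise. The electricity price and PUE (power usage effectiveness) below are illustrative assumptions, not figures from the announcement, and one HBM cube per GPU is assumed for simplicity:

```python
# Rough check of the operational-savings claim: 10M GPUs, 5W saved per
# HBM cube, over five years. Electricity price and PUE are hypothetical.

GPUS = 10_000_000          # installation size from the article
WATTS_SAVED_PER_CUBE = 5   # per HBM cube; one cube per GPU assumed here
YEARS = 5
HOURS_PER_YEAR = 8760

PRICE_PER_KWH = 0.15       # USD, hypothetical
PUE = 1.6                  # facility overhead factor (cooling etc.), hypothetical

power_saved_kw = GPUS * WATTS_SAVED_PER_CUBE / 1000          # 50,000 kW
energy_saved_kwh = power_saved_kw * HOURS_PER_YEAR * YEARS   # ~2.19 billion kWh
savings_usd = energy_saved_kwh * PRICE_PER_KWH * PUE

print(f"${savings_usd / 1e6:.0f}M saved over {YEARS} years")
```

Under these assumptions the savings land in the low hundreds of millions of dollars, the same order of magnitude as the $550 million the article cites.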
Micron’s Vice President and General Manager of the Compute Products Group, Praveen Vaidyanathan, expressed the company’s focus on unleashing superior AI and HPC solutions for customers and the industry. The programmable Memory Built-In Self Test (MBIST) capability of the HBM3 Gen2 product allows for seamless integration and improved testing, leading to a faster time to market.
NVIDIA’s Vice President of Hyperscale and HPC Computing, Ian Buck, expressed enthusiasm for working with Micron on the HBM3 Gen2, recognizing the importance of accelerated computing and high bandwidth with energy efficiency in driving AI innovation.
Micron previously introduced 1α (1-alpha) 24Gb monolithic DRAM die-based 96GB DDR5 modules for capacity-hungry server solutions. The company aims to make available its 1β 32Gb monolithic DRAM die-based 128GB DDR5 modules in the first half of calendar 2024, showcasing Micron’s efforts to push the boundaries of AI server technology.