
Supermicro's Liquid-Cooled SuperClusters for AI Data Centers, Powered by NVIDIA GB200 NVL72 and NVIDIA HGX B200 Systems, Deliver a New Paradigm of Energy-Efficient Exascale Computing

Supermicro's End-to-End Liquid-Cooled Solutions Propel Industry's Transition to Sustainable AI Data Centers with the NVIDIA Blackwell Platform

San Jose, Calif. – October 15, 2024 – Supermicro, Inc. (NASDAQ: SMCI), a Total IT Solution Provider for AI, Cloud, Storage, and 5G/Edge, is accelerating the industry's transition to liquid-cooled data centers with the NVIDIA Blackwell platform to deliver a new paradigm of energy efficiency for the rapidly growing energy demands of new AI infrastructure. Supermicro's industry-leading end-to-end liquid-cooling solutions are powered by the NVIDIA GB200 NVL72 platform for exascale computing in a single rack and have started sampling to select customers ahead of full-scale production in late Q4. In addition, the recently announced Supermicro X14 and H14 4U liquid-cooled systems and 10U air-cooled systems are production-ready for the NVIDIA HGX B200 8-GPU system.

"We're driving the future of sustainable AI computing, and our liquid-cooled AI solutions are rapidly being adopted by some of the most ambitious AI Infrastructure projects in the world with over 2000 liquid-cooled racks shipped since June 2024," said Charles Liang, president and CEO of Supermicro. "Supermicro's end-to-end liquid-cooling solution, with the NVIDIA Blackwell platform, unlocks the computational power, cost-effectiveness, and energy-efficiency of the next generation of GPUs, such as those that are part of the NVIDIA GB200 NVL72, an exascale computer contained in a single rack. Supermicro's extensive experience in deploying liquid-cooled AI infrastructure, along with comprehensive on-site services, management software, and global manufacturing capacity, provides customers a distinct advantage in transforming data centers with the most powerful and sustainable AI solutions."

https://www.supermicro.com/en/solutions/ai-supercluster

Supermicro's liquid-cooled SuperClusters for systems based on the NVIDIA GB200 NVL72 platform feature new advanced in-rack or in-row coolant distribution units (CDUs) and custom cold plates designed for the compute tray housing two NVIDIA GB200 Grace Blackwell Superchips in a 1U form factor. Supermicro's NVIDIA GB200 NVL72 delivers exascale AI computing capabilities in a single rack with Supermicro's end-to-end liquid-cooling solution. The rack solution incorporates 72 NVIDIA Blackwell GPUs and 36 NVIDIA Grace CPUs, interconnected by NVIDIA's fifth-generation NVLink network. The NVIDIA NVLink Switch system facilitates 130 terabytes per second (TB/s) of total GPU communication with extremely low latency, enhancing performance for AI and high-performance computing (HPC) workloads. In addition, Supermicro supports the recently announced NVIDIA GB200 NVL2 platform, a 2U air-cooled system featuring two tightly coupled NVIDIA Blackwell GPUs and two NVIDIA Grace CPUs, suited for easy deployment of diverse workloads such as large language model (LLM) inference, RAG, data processing, and HPC applications.
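
As a rough back-of-the-envelope check on the interconnect figures above (an illustrative calculation, not a specification), dividing the quoted 130 TB/s of total GPU communication evenly across the 72 Blackwell GPUs in the rack works out to roughly 1.8 TB/s per GPU:

```python
# Back-of-the-envelope check of the GB200 NVL72 interconnect figures quoted above.
# Assumes the 130 TB/s aggregate is shared evenly across all 72 GPUs; this is an
# illustrative calculation, not a specification sheet.

TOTAL_NVLINK_BW_TBPS = 130   # total GPU communication, TB/s (from the text)
GPUS_PER_RACK = 72           # NVIDIA Blackwell GPUs per GB200 NVL72 rack

per_gpu_bw = TOTAL_NVLINK_BW_TBPS / GPUS_PER_RACK
print(f"~{per_gpu_bw:.1f} TB/s of NVLink bandwidth per GPU")  # ~1.8 TB/s
```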

Supermicro's leading 4U liquid-cooled systems and the new 10U air-cooled systems now support the NVIDIA HGX B200 8-GPU system and are ready for production. The newly developed cold plates and the 250kW-capacity in-rack coolant distribution unit maximize the performance and efficiency of the 8-GPU systems, supporting 64x 1000W NVIDIA Blackwell GPUs and 16x 500W CPUs in a single 48U rack. Up to four of the new 10U air-cooled systems can be installed and fully integrated into a rack, the same density as the previous generation, while delivering up to 15x the inference performance and 3x the training performance.
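
A quick, illustrative tally of the headline power numbers above (GPU and CPU silicon only, deliberately ignoring memory, NICs, switches, fans, and power-conversion losses, so real rack draw is higher) shows how the stated silicon load compares with the 250kW in-rack CDU capacity:

```python
# Illustrative rack-level power tally for the liquid-cooled HGX B200 configuration
# described above. Counts only GPU and CPU TDPs; memory, NICs, switches, fans, and
# conversion losses are ignored, so actual rack power draw will be higher.

GPU_COUNT, GPU_TDP_W = 64, 1000   # 64x 1000W NVIDIA Blackwell GPUs per 48U rack
CPU_COUNT, CPU_TDP_W = 16, 500    # 16x 500W CPUs per 48U rack
CDU_CAPACITY_KW = 250             # in-rack coolant distribution unit capacity

silicon_load_kw = (GPU_COUNT * GPU_TDP_W + CPU_COUNT * CPU_TDP_W) / 1000
print(f"GPU + CPU load: {silicon_load_kw:.0f} kW "
      f"({silicon_load_kw / CDU_CAPACITY_KW:.0%} of the {CDU_CAPACITY_KW} kW CDU)")
# GPU + CPU load: 72 kW (29% of the 250 kW CDU)
```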

SuperCloud Composer software, Supermicro's comprehensive data center management platform, provides powerful tools to monitor vital information on liquid-cooled systems and racks, coolant distribution units, and cooling towers, including pressure, humidity, pump and valve conditions, and more. SuperCloud Composer's Liquid Cooling Consult Module (LCCM) optimizes the operational cost and manages the integrity of liquid-cooled data centers.
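
To illustrate the kind of telemetry described above, the minimal sketch below applies simple threshold checks to coolant-loop readings. The data structure, field names, and limits are hypothetical placeholders for illustration only; they are not SuperCloud Composer's actual API or data model.

```python
# Minimal sketch of threshold alerting over liquid-cooling telemetry of the kind
# described above (pressure, humidity, pump state). All names and limits here are
# hypothetical placeholders, not SuperCloud Composer's real interface.

from dataclasses import dataclass

@dataclass
class CoolantReading:
    rack_id: str
    supply_pressure_kpa: float
    humidity_pct: float
    pump_ok: bool

def check(reading: CoolantReading,
          max_pressure_kpa: float = 300.0,
          max_humidity_pct: float = 60.0) -> list[str]:
    """Return human-readable alerts for out-of-range coolant conditions."""
    alerts = []
    if reading.supply_pressure_kpa > max_pressure_kpa:
        alerts.append(f"{reading.rack_id}: supply pressure high "
                      f"({reading.supply_pressure_kpa:.0f} kPa)")
    if reading.humidity_pct > max_humidity_pct:
        alerts.append(f"{reading.rack_id}: humidity high ({reading.humidity_pct:.0f}%)")
    if not reading.pump_ok:
        alerts.append(f"{reading.rack_id}: CDU pump fault")
    return alerts

print(check(CoolantReading("rack-01", supply_pressure_kpa=310,
                           humidity_pct=45, pump_ok=True)))
```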


Scaling the infrastructure for multi-trillion-parameter AI models, Supermicro is at the forefront of adopting networking innovations for both InfiniBand and Ethernet, including NVIDIA BlueField®-3 SuperNICs and NVIDIA ConnectX®-7 at 400Gb/s, as well as NVIDIA ConnectX®-8, Spectrum™-4, and NVIDIA Quantum-3 to enable 800Gb/s networking for the NVIDIA Blackwell platform. NVIDIA Spectrum-X™ Ethernet with Supermicro's 4U liquid-cooled and 8U air-cooled NVIDIA HGX H100 and H200 system clusters now powers one of the largest AI deployments to date.

From proof-of-concept (PoC) to full-scale deployment, Supermicro is a one-stop shop, providing all necessary technologies: liquid cooling, networking solutions, and onsite installation services. Supermicro delivers a comprehensive, in-house-designed liquid-cooling ecosystem, encompassing custom-designed cold plates optimized for various GPUs, CPUs, and memory modules, along with multiple CDU form factors and capacities, manifolds, hoses, connectors, cooling towers, and monitoring and management software. This end-to-end solution seamlessly integrates into rack-level configurations, significantly boosting system efficiency, mitigating thermal throttling, and reducing both the Total Cost of Ownership (TCO) and the environmental impact of data center operations in the era of AI.

Supermicro at 2024 OCP Global Summit

  • New X14 4U liquid-cooled system with NVIDIA HGX B200 8-GPU system
  • Supermicro's SuperCluster with the NVIDIA GB200 NVL72 platform
  • H13 4U liquid-cooled system with NVIDIA HGX H200 8-GPU system
  • X14 JBOF system
  • X14 1U CloudDC with OCP DC-MHS design

Learn more at OCP Global Summit Booth #21, San Jose, California, October 15-17, 2024.