AI Infrastructure Server Solutions For Enterprise

Largest Liquid-Cooled AI Cluster in the World

xAI’s Colossus supercomputer cluster achieves massive scale using the NVIDIA Spectrum-X Ethernet networking platform to connect 100,000 NVIDIA Hopper Tensor Core GPUs.

The most powerful liquid-cooled AI SuperCluster is designed to take xAI’s Grok AI into a new era. Supermicro is accelerating the industry’s transition to liquid-cooled AI data centers, delivering a step change in energy efficiency to meet the rapidly rising power demands of today’s AI infrastructure. With extensive experience deploying large-scale direct-to-chip liquid-cooled (DLC) AI systems, Supermicro’s liquid cooling technology is powering the most ambitious AI infrastructure projects in the world.


“These are the most advanced AI servers on the market right now, for a few reasons. One is the degree of liquid cooling. The other is how serviceable they are.”
Patrick Kennedy, ServeTheHome


Large Scale AI Training

Large Language Models, Generative AI Training, Autonomous Driving, Robotics

Large-scale AI training demands cutting-edge technology that maximizes the parallel computing power of GPUs, handling models with billions or even trillions of parameters trained on massive, rapidly growing datasets. Built on NVIDIA’s HGX H100/H200 SXM 8-GPU and 4-GPU platforms, with NVLink® and NVSwitch® GPU-to-GPU interconnects delivering up to 900GB/s of bandwidth and 1:1 networking to each GPU for node clustering, these systems are optimized to train large language models from scratch in the shortest possible time. All-flash NVMe storage completes the stack with a faster AI data pipeline, and fully integrated racks with liquid cooling options ensure fast deployment and a smooth AI training experience.
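To illustrate why models at this scale have to be spread across many GPUs, here is a rough back-of-the-envelope sketch in Python. It only counts the memory needed to hold model weights. The per-GPU memory figures (80 GB for H100 SXM, 141 GB for H200) are assumptions taken from public NVIDIA specifications rather than from this page, and the function names are hypothetical. Real training also needs memory for gradients, optimizer state, and activations, so actual GPU counts are considerably higher.

```python
import math

# Bytes needed to store one parameter in common numeric formats.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}

# Assumed per-GPU memory in decimal GB (public NVIDIA specs, not from this page).
GPU_MEMORY_GB = {"H100 SXM": 80, "H200": 141}


def weight_memory_gb(num_params: float, dtype: str = "fp16") -> float:
    """Memory required for the model weights alone, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def min_gpus_for_weights(num_params: float, gpu_memory_gb: float,
                         dtype: str = "fp16") -> int:
    """Lower bound on GPU count: enough total memory to hold the weights."""
    return math.ceil(weight_memory_gb(num_params, dtype) / gpu_memory_gb)


if __name__ == "__main__":
    for params in (70e9, 1e12):  # 70 billion and 1 trillion parameters
        for gpu, mem in GPU_MEMORY_GB.items():
            n = min_gpus_for_weights(params, mem)
            print(f"{params:.0e} params on {gpu}: at least {n} GPUs for weights")
```

Even this lower bound shows that a trillion-parameter model cannot fit on a single 8-GPU node, which is why fast GPU-to-GPU interconnects and per-GPU networking matter for clustering nodes.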

Workload Sizes

Extra Large
Large
Medium
Storage

Extra Large workload size: Liquid Cooled AI Rack Solutions

Large workload size: 8U 8-GPU System

Medium workload size: 4U 4-GPU System

Resources

Server Rack setup for Large Scale AI Training

HPC/AI

Engineering Simulation, Scientific Research, Genomic Sequencing, Drug Discovery

To accelerate time to discovery for scientists, researchers, and engineers, more and more HPC workloads are being augmented with machine learning algorithms and GPU-accelerated parallel computing to achieve faster results. Many of the world’s fastest supercomputing clusters now take advantage of GPUs and the power of AI.

HPC workloads typically involve data-intensive simulations and analytics with massive datasets and strict precision requirements. GPUs such as NVIDIA’s H100/H200 deliver up to 60 teraflops of double-precision (FP64) Tensor Core performance per GPU, and Supermicro’s highly flexible HPC platforms support high GPU and CPU counts in a variety of dense form factors, with rack-scale integration and liquid cooling.
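As a quick worked example, the per-GPU figure above can be scaled to the 4-GPU and 8-GPU systems listed on this page. This is a minimal sketch, assuming the quoted 60 teraflops applies uniformly to every GPU. The result is a theoretical peak, and sustained application performance will be lower. The function name and the multi-node example are hypothetical.

```python
# Peak double-precision throughput per GPU, as quoted in the text (teraflops).
PEAK_FP64_TFLOPS_PER_GPU = 60.0


def aggregate_peak_tflops(gpus_per_node: int, nodes: int = 1,
                          per_gpu: float = PEAK_FP64_TFLOPS_PER_GPU) -> float:
    """Theoretical peak for a group of identical nodes (no efficiency losses)."""
    return gpus_per_node * nodes * per_gpu


if __name__ == "__main__":
    print("4-GPU system:", aggregate_peak_tflops(4), "TFLOPS peak")
    print("8-GPU system:", aggregate_peak_tflops(8), "TFLOPS peak")
    # Hypothetical rack of four 8-GPU nodes, for illustration only.
    print("4 x 8-GPU nodes:", aggregate_peak_tflops(8, nodes=4), "TFLOPS peak")
```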

NVIDIA® HGX H100/H200 GPU

NVIDIA® H100 NVL/H200 NVL GPU

NVIDIA® Grace Hopper Superchip

Workload Sizes

Large
Medium

Large workload size: 4U 4-GPU System or 8U 8-GPU System

Large workload size: 8U SuperBlade®

Medium workload size: 4U/5U 8-10 GPU PCIe

Medium workload size: 1U Grace Hopper System


Enterprise AI Inference & Training

Generative AI Inference, AI-enabled Services/Applications, Chatbots, Recommender System, Business Automation

Generative AI is widely seen as the next frontier across industries, from tech to banking and media. The race is on to adopt AI as a source of innovation and as a way to boost productivity, streamline operations, make data-driven decisions, and improve customer experience.

Whether the goal is AI-assisted applications and business models, human-like chatbots for customer service, or AI copilots for code generation and content creation, enterprises can take open frameworks, libraries, and pre-trained AI models and fine-tune them for their own use cases with their own datasets. As enterprises adopt AI infrastructure, Supermicro’s range of GPU-optimized systems provides an open modular architecture, vendor flexibility, and easy deployment and upgrade paths for rapidly evolving technologies.

Workload Sizes

Extra Large
Large
Medium

Extra Large workload size: 4U/5U 8-10 GPU PCIe

Medium workload size: 6U SuperBlade®

Medium workload size: 2U MGX System

Medium workload size: 2U Grace MGX System

Resources

Server Rack setup for Enterprise AI Inferencing & Training

Visualization & Design

Real-Time Collaboration, 3D Design, Game Development

The increased fidelity of 3D graphics and AI-enabled applications on modern GPUs is accelerating industrial digitalization, transforming product development and design, manufacturing, and content creation. True-to-reality 3D simulations enable higher quality, rapid iteration without opportunity cost, and faster time-to-market.

Build virtual production infrastructure at scale to accelerate industrial digitalization with Supermicro’s fully integrated solutions. These include the 4U/5U 8-10 GPU systems, an NVIDIA OVX™ reference architecture optimized for NVIDIA Omniverse Enterprise with Universal Scene Description (USD) connectors, as well as NVIDIA-certified rackmount servers and multi-GPU workstations.

Workload Sizes

Large
Medium

Large workload size: 4U/5U 8 GPU

Medium workload size: 2U Hyper

Medium workload size: AI Workstations

Medium workload size: Graphic Workstations

Resources

Server Rack setup for Visualization & Omniverse

Content Delivery & Virtualization

Content Delivery Networks (CDNs), Transcoding, Compression, Cloud Gaming/Streaming

Video delivery workloads continue to make up a significant portion of Internet traffic. As streaming providers increasingly offer content in 4K and even 8K, and cloud gaming moves to higher refresh rates, GPU acceleration with dedicated media engines has become essential. It multiplies the throughput of streaming pipelines, while newer technologies such as AV1 encoding and decoding reduce the amount of data required and improve visual fidelity.

Supermicro’s multi-node and multi-GPU systems, such as the 2U 4-Node BigTwin® system, meet the stringent requirements of modern video delivery. Each node supports the NVIDIA L4 GPU and offers ample PCIe Gen5 storage and networking bandwidth to drive the demanding data pipelines of content delivery networks.

Workload Sizes

Large
Medium
Small

Large workload size: 2U 4-Node BigTwin®

Medium workload size: 2U UP CloudDC

Small workload size: 2U DP Hyper-E

Resources

Server Rack setup for Content Delivery & Virtualization

Edge AI

Edge Video Transcoding, Edge Inference, Edge Training

Across industries, businesses whose employees and customers engage at edge locations – in cities, factories, retail stores, hospitals, and many more – are increasingly investing in deploying AI at the edge. By processing data and utilizing AI and ML algorithms at the edge, businesses overcome bandwidth and latency limitations, enabling real-time analytics for timely decision making, predictive care and personalized services, and streamlined business operations.

Purpose-built, environment-optimized Supermicro Edge AI servers come in a variety of compact form factors and deliver the performance needed for low-latency workloads. They offer an open architecture with pre-integrated components, compatibility with diverse hardware and software stacks, and the privacy and security feature sets that complex edge deployments require out of the box.

Workload Sizes

Extra Large
Large
Medium
Small

Extra Large workload size: Hyper-E

Large workload size: Compact box edge system

Medium workload size: Short-depth Multi-GPU Edge Server

Small workload size: Embedded (Fanless)


Featured Solutions

COMPUTEX 2024 CEO Keynote
