NVIDIA Blackwell Redefines AI Training Performance in Latest MLPerf Benchmarks

NVIDIA is partnering with companies around the world to build AI factories — next-generation data centers designed to accelerate the training and deployment of cutting-edge AI applications. At the core of these efforts is the new NVIDIA Blackwell architecture, purpose-built to meet the intense performance demands of today’s rapidly evolving AI workloads.

In the latest MLPerf Training v5.0 benchmark — the 12th round since its inception in 2018 — the NVIDIA AI platform demonstrated unmatched performance across all benchmarks. It was the only platform to submit results for every category, including the most challenging test: pretraining the Llama 3.1 405B large language model (LLM). NVIDIA delivered the top results at scale, showcasing both its hardware leadership and software ecosystem.

Unrivaled Performance Across AI Workloads

NVIDIA’s submissions used two powerful AI supercomputers built on the Blackwell architecture:

  • Tyche, leveraging NVIDIA GB200 NVL72 rack-scale systems
  • Nyx, based on NVIDIA DGX B200 systems

In collaboration with CoreWeave and IBM, NVIDIA also submitted GB200 NVL72 results using a total of 2,496 Blackwell GPUs and 1,248 NVIDIA Grace CPUs.

Highlights from the MLPerf results:

  • Llama 3.1 405B Pretraining: Blackwell achieved 2.2x greater performance than previous-generation architectures at the same scale.
  • Llama 2 70B LoRA Fine-Tuning: An NVIDIA DGX B200 system with eight Blackwell GPUs delivered 2.5x better performance than a prior-round submission using eight H100 GPUs.
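The LoRA (low-rank adaptation) technique named in the fine-tuning benchmark freezes the pretrained weights and trains only a small low-rank update. A minimal NumPy sketch of the idea (dimensions, rank, and scaling factor below are illustrative, not taken from the benchmark):

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 512, 8   # hidden size and LoRA rank (r << d); illustrative values
alpha = 16      # LoRA scaling hyperparameter (assumed for this sketch)

W = rng.standard_normal((d, d))          # frozen pretrained weight
A = rng.standard_normal((r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection, zero-initialized

def lora_forward(x):
    # Base path uses the frozen weight; only the low-rank update B @ A
    # is trained during fine-tuning.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d)

# With B initialized to zero, the adapted model starts out identical
# to the pretrained one.
assert np.allclose(lora_forward(x), W @ x)

# Far fewer trainable parameters than full fine-tuning: 2*d*r vs d*d.
print(f"trainable: {2 * d * r} vs full: {d * d}")
```

Because only A and B receive gradients, the optimizer state and gradient traffic shrink dramatically, which is why LoRA fine-tuning is a distinct, lighter-weight benchmark than full pretraining.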

These gains underscore major advances in the Blackwell architecture, including:

  • High-density, liquid-cooled racks
  • 13.4TB of coherent memory per rack
  • Fifth-gen NVIDIA NVLink and NVLink Switch for scale-up
  • NVIDIA Quantum-2 InfiniBand for scale-out performance
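The distinction between scale-up (NVLink within a rack) and scale-out (InfiniBand between racks) matters because gradient synchronization time scales with link bandwidth. A toy sketch using the standard ring all-reduce cost model (the link speeds below are illustrative assumptions, not measured figures from these systems):

```python
# Toy ring all-reduce cost model: each of n workers sends and receives
# roughly 2*(n-1)/n of the gradient bytes over its link.
def allreduce_seconds(nbytes, n_workers, link_gbps):
    bytes_on_wire = 2 * (n_workers - 1) / n_workers * nbytes
    return bytes_on_wire / (link_gbps * 1e9 / 8)  # Gbit/s -> bytes/s

grad_bytes = 405e9 * 2  # e.g. 405B parameters in 2-byte BF16

# Hypothetical link speeds: a rack-local scale-up fabric vs a
# between-rack scale-out link.
scale_up = allreduce_seconds(grad_bytes, 72, 7200)
scale_out = allreduce_seconds(grad_bytes, 72, 400)
print(f"scale-up: {scale_up:.2f}s  scale-out: {scale_out:.2f}s")
```

The model is a simplification (real training overlaps communication with compute and uses hierarchical collectives), but it shows why keeping the heaviest traffic inside the high-bandwidth NVLink domain is central to the rack-scale design.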

Powering the Future of Agentic AI

These innovations are not just about raw speed. NVIDIA is laying the foundation for agentic AI — applications capable of autonomous reasoning, interaction, and decision-making. These future systems will run in AI factories, generating tokens and intelligence that transform industries and research alike.

The NVIDIA NeMo Framework, part of a robust software stack that includes CUDA-X libraries, TensorRT-LLM, and NVIDIA Dynamo, helps drive these advances by simplifying multimodal LLM training and reducing time to deployment.

A Thriving Ecosystem of Partners

NVIDIA’s leadership is reinforced by its strong ecosystem. In addition to CoreWeave and IBM, companies such as ASUS, Cisco, Dell Technologies, Giga Computing, Google Cloud, Hewlett Packard Enterprise, Lambda, Lenovo, Nebius, Oracle Cloud Infrastructure, Quanta Cloud Technology, and Supermicro submitted competitive MLPerf results based on NVIDIA platforms.

With the Blackwell-powered platform, NVIDIA is not only raising the bar in AI performance — it’s reshaping how the world builds, trains, and deploys the intelligent systems of tomorrow.
