
The electrical grid is designed to support loads that are relatively steady, such as lighting, household appliances, and industrial machines that operate at constant power. But data centers today, especially those running AI workloads, have changed the equation.
Data centers consume a significant percentage of power plant and transformer capacity. Traditionally, the diverse activities within a data center averaged out its power consumption. Training large-scale AI models, however, causes sudden fluctuations in power demand and poses unique challenges for grid operators:
- If power demand suddenly ramps up, it can take anywhere from one minute to 90 minutes for generation resources to respond because of physical limits on their ramp rates.
- Repeated power transients could cause resonance and stress equipment.
- If the data center suddenly reduces its power consumption, the energy production systems find themselves with excess energy and no outlet.
- These sudden changes can be felt by other grid customers as spikes or sags in supplied voltage.
In this blog, we’ll detail how NVIDIA addresses this challenge through a new power supply unit (PSU) with energy storage in the GB300 NVL72. It can smooth power spikes from AI workloads and reduce peak grid demand by up to 30%. And it’s also coming to GB200 NVL72 systems.
We'll describe the different solutions for the start of a training workload, for running at full load, and for the end of the training run. Then we'll share measured results using this new power smoothing solution.
The impact of synchronized workloads
In AI training, thousands of GPUs operate in lockstep and perform the same computation on different data. This synchronization results in power fluctuations at the grid level. Unlike traditional data center workloads, where uncorrelated tasks “smooth out” the load, AI workloads cause abrupt transitions between idle and high-power states, as shown in Figure 1.
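A toy simulation makes the effect of synchronization concrete. In the sketch below, all power levels and fleet sizes are illustrative, not measured data center numbers: a fleet of servers that toggles between idle and peak power independently produces a far smaller aggregate swing than the same fleet switching in lockstep.

```python
import numpy as np

rng = np.random.default_rng(0)
n_servers = 1000
steps = 500
idle_kw, peak_kw = 0.2, 1.2  # illustrative per-server power levels

# Uncorrelated workloads: each server toggles between idle and peak
# independently, so individual transitions largely cancel in aggregate.
uncorrelated = rng.choice([idle_kw, peak_kw], size=(steps, n_servers))

# Synchronized AI training: every server follows the same idle/peak
# schedule, so transitions add up instead of averaging out.
schedule = np.where((np.arange(steps) // 50) % 2 == 0, idle_kw, peak_kw)
synchronized = np.tile(schedule[:, None], (1, n_servers))

for name, fleet in [("uncorrelated", uncorrelated), ("synchronized", synchronized)]:
    total = fleet.sum(axis=1)  # aggregate fleet power per time step, kW
    swing = np.abs(np.diff(total)).max()
    print(f"{name}: largest step-to-step swing = {swing:.0f} kW")
```

The synchronized fleet swings by the full fleet capacity in a single step, while the uncorrelated fleet's transitions mostly cancel; this is the smoothing effect that traditional mixed workloads provide and that AI training removes.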
Power smoothing in GB300 NVL72
To address these challenges, NVIDIA is introducing a comprehensive power smoothing solution in the GB300 platform. It comprises several mechanisms across different operational phases. Figure 3 (below) shows the power cap, energy storage, and GPU burn mechanisms that together smooth the power demand from the rack. We'll explore each mechanism in the figure from left to right.
The gray line again shows the example AI training GPU power consumption. The green line shows the desired power profile: a smooth ramp-up, a flat steady state, and a smooth ramp-down.
With the new power cap feature, GPU power draw at the start of a workload is capped by the power controller. New maximum power levels are sent to the GPUs and gradually increased, aligning with the ramp rates the grid can tolerate. A more complex strategy is used for ramp-down; if the workload ends abruptly, the GPU burn system continues to dissipate power by operating the GPU in a special power burner mode.
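As a rough illustration of what a grid-friendly ramp-up cap looks like, here is a minimal sketch using the NVML Python bindings (nvidia-ml-py). The actual GB300 power controller implements this in system software and hardware; the ramp rate below is a hypothetical value, and setting power limits typically requires administrator privileges.

```python
# Conceptual sketch of a gradual power-cap ramp via NVML (pip install
# nvidia-ml-py). The ramp rate is hypothetical, not a GB300 specification.
import time
import pynvml

RAMP_WATTS_PER_SEC = 50  # hypothetical grid-friendly ramp rate

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
min_mw, max_mw = pynvml.nvmlDeviceGetPowerManagementLimitConstraints(handle)

# Start capped near the minimum, then raise the cap in small steps so the
# GPU's power draw ramps up at a rate the grid can tolerate.
limit_mw = min_mw
while limit_mw < max_mw:
    pynvml.nvmlDeviceSetPowerManagementLimit(handle, limit_mw)  # needs root
    time.sleep(1.0)
    limit_mw = min(limit_mw + RAMP_WATTS_PER_SEC * 1000, max_mw)

pynvml.nvmlDeviceSetPowerManagementLimit(handle, max_mw)
pynvml.nvmlShutdown()
```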
The ramp-down solution pairs power burn hardware with a software algorithm that senses, from a drop in the running average power, that GPU power has fallen to idle levels. The software driver implementing the power smoothing algorithm then engages the hardware power burner. The burner draws constant power while it waits for the workload to resume; if the workload doesn't resume, the burner smoothly reduces its power consumption. If the GPU workload does resume, the burner disengages instantly. When a workload ends, the burner tapers off the power draw at a rate consistent with grid capabilities and then disengages.
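The behavior described above can be approximated with a simple control loop. The sketch below is purely illustrative: the function name, thresholds, hold time, and taper rate are all invented for this example, and the production logic lives in NVIDIA's driver and power burn hardware.

```python
# Hypothetical sketch of the ramp-down algorithm described above. All
# names and constants are invented for illustration.
from collections import deque

IDLE_W = 100.0           # hypothetical idle-detection threshold
HOLD_STEPS = 5           # how long the burner waits for the workload to resume
TAPER_W_PER_STEP = 50.0  # hypothetical grid-safe ramp-down rate
WINDOW = 4               # samples in the running-average filter

def smooth(samples):
    """Yield total power (workload + burner) for each workload power sample."""
    window = deque(maxlen=WINDOW)
    burner_w, hold, steady = 0.0, 0, 0.0
    for p in samples:
        if p > IDLE_W:                        # workload active (or resumed)
            burner_w, hold = 0.0, HOLD_STEPS  # burner disengages instantly
            window.append(p)
            steady = sum(window) / len(window)  # running average of active power
        elif hold > 0:                        # idle detected: hold total power flat
            burner_w, hold = steady - p, hold - 1
        else:                                 # workload gone: taper smoothly
            burner_w = max(burner_w - TAPER_W_PER_STEP, 0.0)
        yield p + burner_w

# Example: a 400 W training load that drops to 20 W idle at step 10.
load = [400.0] * 10 + [20.0] * 20
print([round(w) for w in smooth(load)])
```

Running this prints a flat 400 W through the hold period, then a smooth 50 W-per-step descent to idle, instead of the instantaneous 380 W drop the raw workload would present to the grid.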