Can we make AI less power-hungry? These researchers are working on it.




For the tests, his team used setups with Nvidia's A100 and H100 GPUs, the chips most commonly found in data centers today, and measured how much energy they consumed while running various large language models (LLMs), diffusion models that generate images or video from text prompts, and many other types of AI systems.
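As a rough illustration of what a per-request measurement involves, the sketch below reads an NVIDIA GPU's on-board energy counter before and after a single inference call. It assumes the pynvml bindings and a Volta-class or newer GPU, and the run_inference() function is a placeholder; the team's actual instrumentation is more involved than this.

```python
# Minimal sketch of per-request GPU energy measurement, assuming the
# pynvml bindings and a GPU that exposes the total-energy counter.
# run_inference() is a placeholder for serving one model request.
import pynvml

def energy_per_request(run_inference, device_index=0):
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
    # Cumulative energy since driver load, reported in millijoules.
    start_mj = pynvml.nvmlDeviceGetTotalEnergyConsumption(handle)
    run_inference()                      # one request against the model
    end_mj = pynvml.nvmlDeviceGetTotalEnergyConsumption(handle)
    pynvml.nvmlShutdown()
    return (end_mj - start_mj) / 1000.0  # joules used by this request
```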

The largest LLM included in the leaderboard was Meta's Llama 3.1 405B, an open-source chat-based AI with 405 billion parameters. Running on two H100 GPUs, it consumed 3,352.92 joules of energy per request, around 0.93 watt-hours, significantly less than the 2.9 watt-hours often quoted for ChatGPT queries. The measurements also confirmed how much more energy-efficient newer hardware has become. Mixtral 8x22B was the largest LLM the team managed to run on both the Ampere and Hopper platforms: it consumed 0.32 watt-hours per request on two Ampere GPUs, compared to just 0.15 watt-hours on a single Hopper GPU.
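For readers who want to check the arithmetic, the short script below reproduces the conversions from the figures quoted above; dividing joules per request by 3,600 gives watt-hours per request. All numbers come from the text, not from new measurements.

```python
# Worked conversion of the figures quoted in this section.
JOULES_PER_WATT_HOUR = 3600

llama_405b_j = 3352.92                        # Llama 3.1 405B on two H100s
llama_405b_wh = llama_405b_j / JOULES_PER_WATT_HOUR
print(f"Llama 3.1 405B: {llama_405b_wh:.2f} Wh/request")          # ~0.93 Wh

chatgpt_quoted_wh = 2.9                       # widely quoted ChatGPT estimate
print(f"Quoted ChatGPT figure is {chatgpt_quoted_wh / llama_405b_wh:.1f}x higher")

mixtral_ampere_wh = 0.32                      # Mixtral 8x22B, two Ampere GPUs
mixtral_hopper_wh = 0.15                      # Mixtral 8x22B, one Hopper GPU
print(f"Ampere -> Hopper: {mixtral_ampere_wh / mixtral_hopper_wh:.1f}x less energy")
```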

What remains unknown, however, is how proprietary models like GPT-4, Gemini, or Grok perform. The ML Energy Initiative team says it's very hard for the research community to start devising solutions to AI's energy-efficiency problems when we don't even know exactly what we're facing. Estimates are possible, but Chung insists they need to be accompanied by error-bound analysis, and nothing like that exists today.
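To make concrete what an error-bounded estimate might look like, the sketch below propagates lower and upper bounds on average power draw and request latency into a bound on energy per request. Every number here is hypothetical and chosen only for illustration; it is not an estimate for any particular proprietary model.

```python
# Minimal sketch of an error-bounded energy estimate: bounds on power
# draw and latency are propagated into bounds on watt-hours per request.
# All inputs below are hypothetical placeholders.
def energy_bounds(power_w_range, latency_s_range):
    """Return (low, high) bounds on watt-hours per request."""
    low = power_w_range[0] * latency_s_range[0] / 3600
    high = power_w_range[1] * latency_s_range[1] / 3600
    return low, high

low, high = energy_bounds(power_w_range=(400, 1400), latency_s_range=(1.0, 6.0))
print(f"Estimated energy per request: {low:.2f} to {high:.2f} Wh")
```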

The most pressing issue, according to Chung and Chowdhury, is the lack of transparency. “Companies like Google or OpenAI have no incentive to talk about power consumption. If anything, releasing actual numbers would harm them,” Chowdhury said. “But people should understand what is actually happening, so maybe we should somehow coax them into releasing some of those numbers.”

Where rubber meets the road

“Energy efficiency in data centers follows a trend similar to Moore's law, only it works at a very large scale instead of on a single chip,” Nvidia's Harris said. Power consumption per rack, the unit data centers are built around, with each rack typically housing between 10 and 14 Nvidia GPUs, is going up, he said, but performance per watt is getting better.
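The snippet below illustrates that point with made-up numbers: a newer rack can draw more total power than an older one while still delivering more work per watt. The rack specs here are hypothetical and are not Nvidia figures.

```python
# Hypothetical rack specs illustrating how absolute power can rise
# while performance per watt still improves across GPU generations.
racks = {
    "older rack": {"power_kw": 25, "requests_per_s": 100},
    "newer rack": {"power_kw": 40, "requests_per_s": 320},
}

for name, rack in racks.items():
    requests_per_kw = rack["requests_per_s"] / rack["power_kw"]
    print(f"{name}: {rack['power_kw']} kW total, "
          f"{requests_per_kw:.1f} requests/s per kW")
```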
