Neural Research

GPU availability defined the AI infrastructure narrative. The H100 allocation wars of 2023–2024 framed the constraint clearly: compute was the bottleneck. More GPUs meant more capability, more advantage, more justification for spend.

That constraint was real. It has now been solved.

H100 cloud prices dropped from $7–8/hour to $1.49–3.90/hour. AWS cut prices by 44% in a single move in June 2025. The H200, with 141 GB of HBM3e memory, is widely available. The Blackwell architecture promises order-of-magnitude inference gains.

Compute is no longer the bottleneck.

Power is.

Why Power Became the Bottleneck

(A) AI Workloads Are Structurally Energy-Intensive

AI workloads are GPU-dense
They run continuously (24/7)
Cooling alone consumes ~40% of a facility's total energy

This creates a fundamentally different load profile:

Not elastic cloud demand, but continuous industrial baseload consumption.
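To make the baseload point concrete, here is a rough back-of-the-envelope sketch in Python. The 100 MW facility size is a hypothetical figure chosen purely for illustration, and the function names are made up for this example; the 40% cooling share is the estimate cited above.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_energy_gwh(load_mw: float, utilization: float = 1.0) -> float:
    """Energy drawn over one year, in GWh, for a given average load in MW."""
    return load_mw * utilization * HOURS_PER_YEAR / 1000.0


def cooling_energy_gwh(total_gwh: float, cooling_fraction: float = 0.40) -> float:
    """Portion of total energy attributed to cooling (assumed ~40%)."""
    return total_gwh * cooling_fraction


# Hypothetical 100 MW facility running flat out, 24/7.
total = annual_energy_gwh(100.0)
cooling = cooling_energy_gwh(total)

print(f"Total:   {total:.1f} GWh/year")
print(f"Cooling: {cooling:.1f} GWh/year")
```

Under these assumptions the facility draws 876 GWh per year, of which roughly 350 GWh goes to cooling. There is no off-peak period in the calculation, which is what separates this profile from elastic cloud demand.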

(B) Power Systems Were Not Designed for This

Electric grids assume:

Gradual demand growth
Distributed consumption
Predictable peaks

AI breaks all three:

Hyperscale data centers create city-scale demand at a single node
GPU clusters generate millisecond-level load variability
Grid interconnection takes years, not months

👉 Result: Compute scales in months. Power scales in years.

By 2026, global data center electricity consumption is projected to approach 1,050 terawatt-hours per year, roughly equivalent to Japan’s total electricity consumption.

AI data centers consume 26% of Virginia’s electricity
In Ireland, data centers consume 21% of national electricity, projected to reach 32%
Wholesale electricity costs rose up to 267% near data center hubs

The Real Crisis: Deliverable Energy

The issue is not total energy supply. It is localized, reliable, always-on power.

Data centers still use only ~2% of global electricity
Yet projects are delayed due to local grid constraints
Regions show visible power stress from AI clustering

👉 This is a distribution and reliability crisis, not a generation crisis.

PJM Interconnection saw capacity prices jump from $28.92 to $269.92 per MW-day in one year, a roughly 9.3x increase.

In one event, 1,500 MW dropped from the Virginia grid, affecting 339,000 households.

The Escalation

Tech companies are restarting nuclear plants
Massive infrastructure investments (~$580B in 2025 alone)
Projected $3 trillion global spend by 2030

The energy and carbon footprint of training frontier models is scaling as well:

Grok 4: 72,000–140,000 tons of CO₂
GPT-4: ~5,184 tons of CO₂

This represents a 14–27x increase in emissions in just two years.
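The 14–27x figure follows directly from the two estimates above. A quick check, taking the cited tonnages at face value (the variable names are just for this example):

```python
# Training emissions estimates as cited in the text, in tons of CO2.
GPT4_TONS = 5_184
GROK4_TONS_LOW = 72_000
GROK4_TONS_HIGH = 140_000

# Ratio of each end of the Grok 4 range to the GPT-4 estimate.
low_ratio = GROK4_TONS_LOW / GPT4_TONS
high_ratio = GROK4_TONS_HIGH / GPT4_TONS

print(f"{low_ratio:.1f}x to {high_ratio:.1f}x")  # prints "13.9x to 27.0x"
```

Rounded to whole numbers, that gives the 14–27x range.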

The Specific Mechanism

The constraint is structural:

Compute is global — it can be shipped
Power is local — it cannot

Key friction points:

Grid upgrades take 5–10 years
AI infrastructure scales in 18–24 months
Data center hubs are hitting physical power limits

The mismatch between these timelines defines the constraint.

Water adds another layer:

Cooling requires massive water usage
Facilities face resistance in water-stressed regions

Carbon adds regulatory risk:

EU and US policies are moving toward mandatory energy disclosure
Environmental reporting will become unavoidable

The Industry Cost

The impact is already visible:

Rising consumer electricity costs in data center regions
Grid stress externalized as public cost
Increasing regulatory pressure

For AI companies, power is now a location constraint:

You can deploy compute anywhere
You can only deploy power-heavy compute where infrastructure allows

The scarcity has shifted — from GPUs to grid capacity.

What Needs to Exist

AI Energy Accounting + Power-Aware Infrastructure

Real-time energy tracking per model and workflow
Power-aware routing of workloads
Infrastructure marketplaces based on grid availability

This is the equivalent of FinOps — but for energy.
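As a minimal sketch of what the first two pieces might look like, the Python below keeps a per-model, per-workflow energy ledger and routes a job to the cheapest region with enough grid headroom. Everything here is hypothetical: the `Region` and `EnergyLedger` types, the `route` function, the region names, and all numbers are illustrative assumptions, not an existing standard or API.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Region:
    name: str
    headroom_mw: float    # assumed: spare grid capacity we could draw on
    price_per_mwh: float  # assumed: current energy price in USD


@dataclass
class EnergyLedger:
    """Accumulates energy use per (model, workflow) pair, in kWh."""
    kwh: Dict[Tuple[str, str], float] = field(
        default_factory=lambda: defaultdict(float)
    )

    def record(self, model: str, workflow: str,
               avg_power_kw: float, hours: float) -> None:
        # Energy (kWh) = average power (kW) x duration (h).
        self.kwh[(model, workflow)] += avg_power_kw * hours

    def total_for_model(self, model: str) -> float:
        return sum(v for (m, _), v in self.kwh.items() if m == model)


def route(job_mw: float, regions: List[Region]) -> Optional[Region]:
    """Pick the cheapest region with enough headroom, or None if none fits."""
    candidates = [r for r in regions if r.headroom_mw >= job_mw]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.price_per_mwh)


# Energy accounting: two workflows for one (made-up) model.
ledger = EnergyLedger()
ledger.record("model-a", "training", avg_power_kw=500.0, hours=2.0)
ledger.record("model-a", "inference", avg_power_kw=50.0, hours=4.0)
print(ledger.total_for_model("model-a"))  # prints 1200.0

# Power-aware routing: a 20 MW job across three (made-up) regions.
regions = [
    Region("region-x", headroom_mw=5.0, price_per_mwh=40.0),
    Region("region-y", headroom_mw=50.0, price_per_mwh=65.0),
    Region("region-z", headroom_mw=80.0, price_per_mwh=55.0),
]
chosen = route(20.0, regions)
print(chosen.name if chosen else "no region has capacity")  # prints region-z
```

The cheapest region overall (`region-x`) is skipped because it lacks headroom, which is the point of the exercise: power availability filters first, price second. A real system would also need live grid data and would have to decrement headroom as jobs are placed; this sketch does neither.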

The tools exist.

The standard does not.

The Real AI Infrastructure Crisis Is Power, Not Compute