GPU availability defined the AI infrastructure narrative. The H100 allocation wars of 2023–2024 framed the constraint clearly: compute was the bottleneck. More GPUs meant more capability, more advantage, more justification for spend.
That constraint was real. It has now largely eased.
H100 cloud prices dropped from $7–8/hour to $1.49–3.90/hour. AWS cut prices by 44% in a single move in June 2025. The H200, with 141 GB of HBM3e, is widely available. The Blackwell architecture promises order-of-magnitude inference gains.
Compute is no longer the bottleneck.
Power is.
Why Power Became the Bottleneck
(A) AI Workloads Are Structurally Energy-Intensive
- AI workloads are GPU-dense
- They run continuously (24/7)
- Cooling alone consumes ~40% of a facility's total energy
This creates a fundamentally different load profile: not elastic cloud demand, but continuous, industrial-scale baseload consumption.
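To make the baseload point concrete, here is a rough back-of-the-envelope sketch. It assumes a hypothetical 10,000-GPU cluster, 700 W per GPU (an illustrative figure in line with H100-class parts), and that cooling is the only overhead, so GPUs receive the remaining ~60% of facility energy. CPUs, networking, and storage are ignored, so treat the result as a floor, not a measurement.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_facility_energy_mwh(gpu_count, gpu_watts=700.0, cooling_share=0.40):
    """Estimate yearly facility energy (MWh) for GPUs running flat out, 24/7.

    Assumption: cooling is the only overhead, so the GPUs receive
    (1 - cooling_share) of everything the facility draws.
    """
    it_power_mw = gpu_count * gpu_watts / 1_000_000
    facility_power_mw = it_power_mw / (1.0 - cooling_share)
    return facility_power_mw * HOURS_PER_YEAR


# Hypothetical cluster size, chosen only for illustration.
energy = annual_facility_energy_mwh(gpu_count=10_000)
print(f"{energy:,.0f} MWh per year")
```

Because the load never idles, the annual figure is simply power times 8,760 hours. There is no off-peak period for the grid to recover in.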
(B) Power Systems Were Not Designed for This
Electric grids assume:
- Gradual demand growth
- Distributed consumption
- Predictable peaks
AI breaks all three:
- Hyperscale data centers create city-scale demand at a single node
- GPU clusters generate millisecond-level load variability
- Demand arrives in step changes, while grid interconnection takes years, not months
👉 Result: Compute scales in months. Power scales in years.
By 2026, global data center electricity consumption is projected to approach 1,050 terawatt-hours, roughly equivalent to Japan’s total electricity consumption. The concentration is already visible regionally:
- Virginia: data centers consume 26% of the state's electricity
- Ireland: 21% of national electricity, projected to reach 32%
- Wholesale electricity costs rose by up to 267% near data center hubs
The Real Crisis: Deliverable Energy
The issue is not total energy supply. It is localized, reliable, always-on power.
- Data centers still use only ~2% of global electricity
- Yet projects are delayed due to local grid constraints
- Regions show visible power stress from AI clustering
👉 This is a distribution and reliability crisis, not a generation crisis.
PJM Interconnection saw capacity prices jump from $28.92 to $269.92 per MW-day in one year (~9x increase).
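For a sense of scale, the PJM figures can be turned into an annual cost. This is a sketch under a simplifying assumption: each MW of load is treated as one MW of capacity obligation priced at the quoted $/MW-day rate. The 100 MW load is hypothetical, and real capacity charges depend on how an obligation is actually assessed.

```python
DAYS_PER_YEAR = 365


def annual_capacity_cost(price_per_mw_day, megawatts):
    """Yearly capacity cost in dollars for a fixed MW obligation."""
    return price_per_mw_day * megawatts * DAYS_PER_YEAR


old_price, new_price = 28.92, 269.92  # $/MW-day, from the figures above
load_mw = 100                         # hypothetical data center load

ratio = new_price / old_price
old_cost = annual_capacity_cost(old_price, load_mw)
new_cost = annual_capacity_cost(new_price, load_mw)

print(f"Price ratio: {ratio:.1f}x")
print(f"Annual cost for {load_mw} MW: ${old_cost:,.0f} -> ${new_cost:,.0f}")
```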
In one event, 1,500 MW dropped from the Virginia grid, affecting 339,000 households.
The Escalation
- Tech companies are restarting nuclear plants
- Infrastructure investment is running at ~$580B in 2025 alone
- Global spend is projected to reach $3 trillion by 2030
The emissions from training frontier models are scaling as well:
- Grok 4: 72,000–140,000 tons CO₂
- GPT-4: ~5,184 tons CO₂
That is a 14–27x increase in training emissions in about two years.
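The multiplier follows directly from the two estimates. A quick check, using only the figures quoted above (the variable names are illustrative):

```python
GPT4_TONS = 5_184
GROK4_TONS_LOW, GROK4_TONS_HIGH = 72_000, 140_000

# Ratio of the Grok 4 estimate range to the GPT-4 estimate.
low_multiple = GROK4_TONS_LOW / GPT4_TONS
high_multiple = GROK4_TONS_HIGH / GPT4_TONS

print(f"{low_multiple:.1f}x to {high_multiple:.1f}x")
```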
The Specific Mechanism
The constraint is structural:
- Compute is global — it can be shipped
- Power is local — it cannot
Key friction points:
- Grid upgrades take 5–10 years
- AI infrastructure scales in 18–24 months
- Data center hubs are hitting physical power limits
The mismatch between these timelines defines the constraint.
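One way to quantify the mismatch is to ask how many AI infrastructure build cycles fit inside a single grid upgrade. The sketch below uses only the ranges quoted above. The function name is illustrative.

```python
def build_cycles_per_grid_upgrade(grid_years, infra_months):
    """Number of AI infrastructure cycles that fit in one grid upgrade."""
    return grid_years * 12 / infra_months


# Best case for the grid: fastest upgrade (5 years) vs. slowest build (24 months).
best_case = build_cycles_per_grid_upgrade(grid_years=5, infra_months=24)
# Worst case: slowest upgrade (10 years) vs. fastest build (18 months).
worst_case = build_cycles_per_grid_upgrade(grid_years=10, infra_months=18)

print(f"{best_case:.1f} to {worst_case:.1f} build cycles per grid upgrade")
```

Even in the best case, compute demand turns over more than twice before new grid capacity arrives.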
Water adds another layer:
- Cooling requires massive water usage
- Facilities face resistance in water-stressed regions
Carbon adds regulatory risk:
- EU and US policies are moving toward mandatory energy disclosure
- Environmental reporting will become unavoidable
The Industry Cost
The impact is already visible:
- Rising consumer electricity costs in data center regions
- Grid stress externalized as public cost
- Increasing regulatory pressure
For AI companies, power is now a location constraint:
- In principle, compute can be deployed anywhere
- In practice, power-heavy compute can only go where grid infrastructure allows
The scarcity has shifted — from GPUs to grid capacity.
What Needs to Exist
AI Energy Accounting + Power-Aware Infrastructure
- Real-time energy tracking per model and workflow
- Power-aware routing of workloads
- Infrastructure marketplaces based on grid availability
This is the equivalent of FinOps — but for energy.
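As a minimal sketch of what the first two items could look like, the code below keeps a per-model, per-workflow energy ledger and routes jobs to whichever site has the most spare grid headroom. Every name here (`EnergyLedger`, `Site`, `route`, the site and model labels, the kW figures) is hypothetical. No existing standard or tool is implied, and a real system would read power from metered telemetry, not caller-supplied averages.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    grid_headroom_kw: float  # spare deliverable power at this location (illustrative)


class EnergyLedger:
    """Accumulates energy (kWh) per (model, workflow) pair, FinOps-style."""

    def __init__(self):
        self._kwh = defaultdict(float)

    def record(self, model, workflow, avg_power_kw, duration_hours):
        # Energy = average power x time; inputs are assumed to be measured upstream.
        self._kwh[(model, workflow)] += avg_power_kw * duration_hours

    def total_kwh(self, model):
        return sum(kwh for (m, _), kwh in self._kwh.items() if m == model)


def route(job_power_kw, sites):
    """Place a job at the site with the most headroom that can still fit it.

    Returns the chosen Site (with its headroom reduced), or None if no
    site has enough spare power.
    """
    candidates = [s for s in sites if s.grid_headroom_kw >= job_power_kw]
    if not candidates:
        return None
    chosen = max(candidates, key=lambda s: s.grid_headroom_kw)
    chosen.grid_headroom_kw -= job_power_kw
    return chosen


# Usage with made-up numbers.
ledger = EnergyLedger()
ledger.record("model-a", "training", avg_power_kw=500.0, duration_hours=2.0)
ledger.record("model-a", "inference", avg_power_kw=50.0, duration_hours=4.0)

sites = [Site("site-a", 300.0), Site("site-b", 800.0)]
first = route(500.0, sites)   # fits only at site-b
second = route(400.0, sites)  # no site has 400 kW left

print(ledger.total_kwh("model-a"), first.name, second)
```

The greedy "most headroom first" rule is just the simplest policy that respects a local power limit. A production scheduler would also weigh price, carbon intensity, and latency.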
The tools exist.
The standard does not.