GPU "Napkin Math"

How fast is energy consumption in LLMs growing?

kWh
1 kWh = 1 Block
1 Laptop (Daily Usage): ~0.2 kWh
Single GPU (H100, per hour): ~0.7 kWh
GPT-1 Training: ~100 kWh
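
Sanity check on the H100 figure: the GPU is rated at roughly 0.7 kW, so one hour at full power is ~0.7 kWh, and a full year (8,760 hours) is 0.7 × 8,760 ≈ 6,132 kWh, or about 6.1 MWh.
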
MWh
1 MWh = 1,000 kWh
GPT-2 Training: ~1 MWh
1 US Home (Annual): ~10 MWh
George Hotz's Tinybox Pro: ~70 MWh
Stanford Natural Language Computing (64 H100s, Annual): ~392 MWh
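
That 392 MWh entry is pure napkin math: 64 GPUs × 0.7 kW × 8,760 hours ≈ 392,000 kWh ≈ 392 MWh, assuming the GPUs run at full power all year.
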
GWh
1 GWh = 1,000 MWh
GPT-3 Training: 1.287 GWh
GPT-4 Training (Estimate): 10 GWh
Small City (Annual): ~20 GWh
TWh
1 TWh = 1,000 GWh
Meta Cluster (350k H100s, Annual): ~2.15 TWh
Nuclear Power Plant (Annual): ~7.88 TWh
OpenAI 2025 Cluster (Estimate): ~100 TWh
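
The Meta cluster figure is the same arithmetic at scale: 350,000 GPUs × 0.7 kW × 8,760 hours ≈ 2.15 billion kWh ≈ 2.15 TWh per year. The nuclear figure is consistent with one ~1 GW reactor running at a ~90% capacity factor: 1 GW × 8,760 h × 0.9 ≈ 7.9 TWh.
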
PWh
1 PWh = 1,000 TWh
US Power Grid (Annual): ~4 PWh
World Power Grid (Annual): ~30 PWh
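
Every GPU entry above reduces to one formula (GPUs × kilowatts × hours) plus unit hops by factors of 1,000. A minimal Python sketch, assuming the ~0.7 kW per H100 and the 8,760-hour year used throughout this chart:

    UNITS = ["kWh", "MWh", "GWh", "TWh", "PWh"]  # each step is a factor of 1,000

    def gpu_energy_kwh(num_gpus, hours, kw_per_gpu=0.7):
        """Energy in kWh for num_gpus GPUs drawing kw_per_gpu kilowatts for `hours` hours."""
        return num_gpus * kw_per_gpu * hours

    def pretty(kwh):
        """Render a kWh figure in the largest unit that keeps the value >= 1."""
        value, unit = kwh, UNITS[0]
        for next_unit in UNITS[1:]:
            if value < 1000:
                break
            value, unit = value / 1000, next_unit
        return f"{value:,.2f} {unit}"

    HOURS_PER_YEAR = 24 * 365  # 8,760

    print(pretty(gpu_energy_kwh(1, 1)))                    # 0.70 kWh   -- one H100, one hour
    print(pretty(gpu_energy_kwh(64, HOURS_PER_YEAR)))      # 392.45 MWh -- the Stanford entry
    print(pretty(gpu_energy_kwh(350_000, HOURS_PER_YEAR))) # 2.15 TWh   -- the Meta entry

For scale: the Meta cluster's ~2.15 TWh per year is still only about 0.05% of the ~4 PWh US grid.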

Inspired by github.com/chubin/late.nz [MIT License] and by "Jeff Dean's latency numbers".