I was attending a presentation by the vendor that provides administration services for my employer's health insurance policies ("benefits", in American business vernacular).
When the dental policy was explained, a colleague asked whether it covers the cost of dental guards used to reduce the damage caused by teeth grinding.
"It does not", answered the presenter, "but this isn't a common need as far as I know," emphasizing, they asked: "How many people here grind their teeth so hard they need such a device?"
Most of the software engineers in attendance raised their hands.
An often-quoted number is that each ChatGPT query uses about 3 watt-hours of electricity to answer, and I got nerd-sniped into finding out how that value was computed. But before sharing what I found, I'd like to answer a different question:
Should We Care?
Not today, I think.
Most of the incentives, and hence most of the effort, have been directed at making AI produce better answers, so less brainpower has gone into minimizing the energy consumed by running the models. As more resources are dedicated to the issue, we'll see significant gains in energy efficiency. We already got a glimpse of what is possible with DeepSeek-R1, where a strong incentive to reduce the compute required for building and running the model resulted in impressive gains that translate directly into reduced energy use.
Public attention is a scarce resource, and heated arguments about energy use distract from discussing things that can't be solved by throwing more engineering effort at them. What we need to care about today is the significant impact AI already has on our lives.
The implicit assumption that OpenAI's servers are always running at their peak power consumption is probably incorrect, as they'd melt if that were the case. But I am even more uncomfortable with the provenance of the fleet size and usage numbers: they come from an analyst report that publishes no sourcing for them, and from what I can tell, they were made up as reasonable assumptions for the model.
In other words, the estimate of 3 watt-hours per query isn't backed by real-world data or measurements.
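To show what I mean by "reasonable assumptions for the model", here is a minimal back-of-envelope sketch of how such a per-query figure gets assembled. The fleet size, per-server power draw, and query volume below are placeholders I made up for illustration; they are not the report's numbers, and picking different ones yields a different answer:

```python
# Back-of-envelope sketch of how a per-query energy figure can be assembled.
# Every number below is an illustrative placeholder, NOT a figure from the
# analyst report discussed above.

SERVERS = 3_500                 # hypothetical inference fleet size
PEAK_POWER_KW = 10.0            # hypothetical peak draw per server, in kilowatts
QUERIES_PER_DAY = 280_000_000   # hypothetical daily query volume

# The criticized assumption: every server runs at peak power around the clock.
daily_energy_wh = SERVERS * PEAK_POWER_KW * 1_000 * 24

per_query_wh = daily_energy_wh / QUERIES_PER_DAY
print(f"~{per_query_wh:.1f} Wh per query")  # ~3.0 Wh with these made-up inputs
```

Change any one of those inputs and the headline number moves with it, which is exactly why the provenance of the inputs matters.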
If you've been on a video call with me, you've seen the two prints on the walls behind me. Colleagues sometimes ask about them, so it's useful to have a ready reference to share as an answer:
Hanging to my left is a signed print by David Reeb of his 2021 painting titled Two Caesars (Hadrianus & Vespasian).
Reeb had the following to say about it (in Tohu Magazine, 2021; translation mine):
Take these portraits […] the "tyrant" emperors, who are another model of politician. To me, they and Putin represent something similar: they represent authority, a splendor that is inherently ridiculous.
When someone asks, I joke it is an ad for Kubernetes (iykyk).
Not visible on camera, the wall opposite me has bookshelves, various personal electronic and computing projects, family photos, and a framed poster for a Sunn 0))) show in San Francisco.
Years ago, at one of my first software jobs, my team and I were tasked with rewriting a math app to use a new software framework. That framework had the codename Glass, so we followed the theme and gave the app itself the codename Crystal, or Crystal Math in full.
As it was far from finished, the app couldn't run by itself, so we wrote a tool to run the lines of code that were ready, and named that tool Lines.
The team was quite productive, with everyone crackin' hard and running Lines of Crystal Math all day long. Or at least we were, until management found out and insisted we replace the existing codenames with less addictive ones.