TL;DR
Enterprise AI bills have risen by an estimated 320% despite a 98% drop in per-token prices, as agentic tools push per-developer consumption roughly 18.6 times higher. The Linux Foundation is launching the Tokenomics Foundation to bring cost discipline to AI spending.
Uber blew through its entire 2026 AI coding budget by April. Microsoft revoked its developers’ Claude Code licences six months after enabling them. One company reportedly ran up a $500 million Claude bill in a single month after forgetting to set usage limits. A Priceline employee told TechCrunch that a routine Cursor contract renewal came back four to five times more expensive.
The pattern is the same everywhere. Per-token prices have collapsed, but the push for autonomous AI agents has sent consumption through the roof. Companies that gorged themselves on all-you-can-eat subscriptions in early 2025 are now scrambling to understand where the money went, and whether any of it produced a return.
The paradox in numbers
GPT-4-equivalent performance now costs roughly $0.40 per million tokens, down from $20 per million in late 2022. That is a 98% reduction. Yet enterprise AI bills have risen by an estimated 320%, according to multiple industry analyses. The average enterprise AI budget has grown from $1.2 million per year in 2024 to $7 million in 2026.
The culprit is volume. Agentic tools built on models released since November 2025, including Anthropic’s Claude Opus 4.5, OpenAI’s GPT-5.1, and Google’s Gemini 3 Pro, have multiplied token consumption per task. A simple linear workflow in 2023 cost about $0.04 per interaction. An orchestrated agentic system in 2026 costs roughly $1.20, about 30 times as much. Individual engineers at Microsoft were reportedly spending between $500 and $2,000 a month on tokens before the licences were pulled.
Nicholas Arcolano, head of research at engineering management platform Jellyfish, told TechCrunch that per-developer consumption has risen roughly 18.6 times in nine months. Engineers who used the most tokens were about twice as productive as lighter users, but they spent 10 times the tokens to get there. “Whether extreme spend pays off comes down to the ultimate business value of shipped code, which most companies still can’t measure,” Arcolano said.
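The arithmetic behind these figures is easy to verify. The sketch below is a back-of-envelope check that uses only the numbers quoted in this section; the variable names are illustrative, and "output" stands in for whatever productivity measure Jellyfish used, which the article does not specify.

```python
# Back-of-envelope check of the figures quoted in this section.
# All inputs are the article's numbers; variable names are illustrative.

price_late_2022 = 20.00  # USD per million tokens, late 2022
price_now = 0.40         # USD per million tokens, GPT-4-equivalent today

# Fraction by which the per-token price has fallen.
price_drop = 1 - price_now / price_late_2022
print(f"Per-token price drop: {price_drop:.0%}")

cost_linear_2023 = 0.04   # USD per interaction, simple linear workflow
cost_agentic_2026 = 1.20  # USD per interaction, orchestrated agentic system

# How many times more an agentic interaction costs.
cost_multiple = cost_agentic_2026 / cost_linear_2023
print(f"Cost per interaction: {cost_multiple:.0f}x")

# Heaviest users: about 2x the output for about 10x the tokens.
output_multiple = 2.0
token_multiple = 10.0

# Output per token relative to lighter users (assumes "productive"
# can be treated as a simple output ratio, which is a simplification).
output_per_token = output_multiple / token_multiple
print(f"Output per token vs lighter users: {output_per_token:.1f}x")
```

On these numbers, the heaviest users get roughly one fifth as much output per token as lighter users, which is why the payoff depends on how much the extra shipped code is actually worth.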
From tokenmaxxing to guardrails
“Six months ago, I would have a conversation with a customer and it would be all about ‘What can it do? Is it good enough?’” Alexander Embiricos, OpenAI’s head of enterprise, told TechCrunch. “Now the conversations are about, ‘We’re spending so much. What visibility do you have? What token controls do you have?’”
J.R. Storment, executive director of the FinOps Foundation, described the shift bluntly. “In April and May, I started hearing from companies: ‘Oh my god, we are 3x over our entire 2026 token budget and it’s only April.’ The whole conversation shifted from tokenmaxxing and ‘go fast’ to ‘we need guardrails, how do we control this?’”
Priceline’s senior director of IT finance, Chris Reed, drew a comparison to the telecom billing era. “It’s like the crack-cocaine epidemic. They let you try it to get you hooked, and now you’re kind of beholden to it.” The company has begun placing token limits on certain groups. Reed said he is already seeing discrepancies between vendor-reported usage and Priceline’s internal data.
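Both of Reed's concerns, token limits for groups and gaps between vendor-reported and internal usage, lend themselves to a simple automated check. The following is a minimal sketch, not Priceline's process: the group names, token counts, limits, and 5% tolerance are all invented for illustration.

```python
# Hypothetical sketch: every group name, token count, limit, and the 5%
# tolerance below is invented for illustration. None of it is real data.

def reconcile(vendor, internal, tolerance=0.05):
    """Return groups whose vendor-reported tokens differ from internal
    counts by more than `tolerance`, as a fraction of the internal count."""
    flagged = {}
    for group in sorted(set(vendor) | set(internal)):
        v = vendor.get(group, 0)
        i = internal.get(group, 0)
        baseline = max(i, 1)  # avoid dividing by zero if internal logs show nothing
        gap = (v - i) / baseline
        if abs(gap) > tolerance:
            flagged[group] = gap
    return flagged

def over_limit(usage, limits):
    """Return groups whose usage exceeds their monthly token limit."""
    return [g for g, used in usage.items() if g in limits and used > limits[g]]

vendor_reported = {"search": 120_000_000, "payments": 48_000_000, "mobile": 30_500_000}
internal_logs = {"search": 100_000_000, "payments": 47_000_000, "mobile": 30_000_000}
monthly_limits = {"search": 110_000_000, "payments": 60_000_000, "mobile": 40_000_000}

discrepancies = reconcile(vendor_reported, internal_logs)
breaches = over_limit(vendor_reported, monthly_limits)

print(discrepancies)  # groups where the vendor's number is off by more than 5%
print(breaches)       # groups over their monthly limit
```

In this made-up data, only the "search" group is flagged on both counts. A real reconciliation would also need to agree on what counts as a token (cached, input, output) before the two sets of numbers can be compared at all.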
The Tokenomics Foundation
Against this backdrop, the Linux Foundation this week unveiled plans for the Tokenomics Foundation, a new standards body that aims to bring to AI tokens the cost discipline that FinOps brought to cloud spending.
The Foundation plans to build a canonical definition of “tokenomics,” open standards for AI token usage and billing, and new metrics including cost-per-intelligence and tokens-per-watt. A formal launch is planned for July. Nishant Gupta, chief availability officer at Salesforce, said in a statement that “token economics is fundamentally more abstract and opaque than anything we’ve managed at this scale before.”
The challenge is enormous. “Tracking cloud costs is a hundreds-of-millions-of-rows-a-month data problem,” Storment said. “Tracking token costs is a trillions-of-rows-a-month data problem.”
A market forms around the problem
Startups and established vendors are racing to fill the gap. Pay-i tracks and optimises AI spending. Paid lets developers bill based on actual value rather than subscription fees. Jellyfish, Waydev, and Faros AI provide agent monitoring to prove the ROI of developer tools. Ramp has moved into AI spend management. Datadog and New Relic have added token-level observability.
Model routing is emerging as the primary cost lever. Factory, an enterprise AI coding startup, launched a model router this week that automatically picks the cheapest adequate model for each task. Vitaly Gordon, CEO of Faros AI, said frontier labs are already doing this internally. “The financial report for how much you spend on Anthropic, even if you call the Opus model, some of the spend will be on Sonnet or Haiku, because they are smart enough to do it,” he said.
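The core idea of routing, picking the cheapest model that is good enough for the task, can be sketched in a few lines. This is an illustration only, not Factory's router or any lab's internal logic: the model tiers, capability scores, and prices are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical catalog: tier names, capability scores, and prices are
# invented for illustration and do not describe any real model.

@dataclass(frozen=True)
class Model:
    name: str
    capability: int         # hypothetical score from 1 (weakest) to 10
    usd_per_million: float  # hypothetical blended price per million tokens

CATALOG = [
    Model("small", 3, 0.50),
    Model("medium", 6, 3.00),
    Model("large", 9, 15.00),
]

def route(required_capability, catalog=CATALOG):
    """Return the cheapest model whose capability meets the requirement."""
    adequate = [m for m in catalog if m.capability >= required_capability]
    if not adequate:
        raise ValueError("no model in the catalog is adequate for this task")
    return min(adequate, key=lambda m: m.usd_per_million)

print(route(2).name)  # small
print(route(5).name)  # medium
print(route(8).name)  # large
```

The sketch takes the required capability as a given. In practice, estimating that number for each task is the hard part, and a router that guesses too low saves money only until the work has to be redone on a larger model.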
Goldman Sachs projects global token usage will multiply 24 times by 2030. The companies already over budget need solutions now, and the Tokenomics Foundation’s first deliverable is still months away. As Gordon put it: “Maybe we created a steam engine, but we still haven’t figured out the assembly line.”


