AI coding adoption is exploding. But most engineering leaders are still measuring usage instead of outcomes. That creates a costly blind spot. There is a question that nobody in the AI industry wants you to ask.
Not OpenAI. Not Anthropic. Not Google. Not the dozens of startups selling AI coding agents to your engineering team. The question is simple: how much of the code your AI agents generate actually reaches production?
Not how much was generated. Not how many prompts were run. Not how many seats are active. How much survived code review, passed CI, got merged, deployed, and reached a customer. Most engineering leaders cannot answer this. And the AI providers have no incentive to help them find out.
The spend is real. The visibility is not.
According to the Stanford AI Spend Index, the median company now spends $86 per developer per month on AI coding tools. That is across 140 companies and over 113,000 developers. The top quartile spends more than $195. Some companies spend over $28,000 per developer per month.
Anthropic just crossed $30 billion in annualized revenue. Up from $9 billion four months ago. According to SemiAnalysis, 4% of all public GitHub commits are now authored by Claude Code. That is projected to exceed 20% by year-end.
Linear’s CEO declared issue tracking dead in March. Coding agents are installed in more than 75% of Linear’s enterprise workspaces. The money is flowing. The code is flowing. But nobody is tracking how much of that code actually ships.
The incentive problem nobody talks about
AI providers bill by tokens. The more tokens your engineers consume, the more revenue the provider earns. The provider gets paid when a token is consumed. Not when the code it generated passes review. Not when it gets merged. Not when it deploys. Not when it works in production.
This creates a structural misalignment. A developer who prompts an AI agent ten times to generate a function that gets rewritten by a human reviewer costs you ten times as much in tokens as a developer who gets it right on the first prompt. The provider earns ten times more from the first developer. The second developer is worth ten times more to your organization.
Right now, most engineering leaders cannot tell the difference. They see a single line item on the AI bill. They have no idea which tokens produced production code and which produced waste.
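The arithmetic above can be made concrete with a back-of-the-envelope sketch. The token price and prompt sizes below are made up for illustration; the point is only that provider revenue scales with consumption, not with outcomes:

```python
# Hypothetical illustration of the incentive gap. Two developers end up
# with the same shipped function; one needs ten prompts, the other one.
# The blended token price and prompt size are invented numbers.

PRICE_PER_1K_TOKENS = 0.015  # assumed blended price, USD

def prompt_cost(prompts: int, tokens_per_prompt: int) -> float:
    """Provider revenue = tokens consumed, regardless of outcome."""
    return prompts * tokens_per_prompt * PRICE_PER_1K_TOKENS / 1000

dev_a = prompt_cost(prompts=10, tokens_per_prompt=4000)  # rewritten in review
dev_b = prompt_cost(prompts=1, tokens_per_prompt=4000)   # merged as-is

print(f"Dev A: ${dev_a:.2f}, Dev B: ${dev_b:.2f}")
print(f"Provider earns {dev_a / dev_b:.0f}x more from Dev A")
```

On an invoice that shows only total tokens, these two developers are indistinguishable.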
This is not a conspiracy. It is a structural incentive problem. And it is the VP of Engineering’s problem to solve because the provider has no reason to solve it for them.
We have seen this before
In the early days of cloud computing, companies moved to AWS and Azure and spent aggressively. The promise was efficiency. The reality was waste. It took years for the FinOps discipline to emerge. Companies eventually realized they were overspending by 30 to 40 percent on cloud infrastructure because nobody was measuring what was actually being used.
AI spend is following the exact same pattern. Except the growth rate is faster and the measurement gap is wider. Cloud providers eventually had to accept cost optimization tooling because customers demanded it.
The same thing is about to happen in AI. The engineering leaders who measure first will optimize faster, negotiate better, and know which tools to keep and which to cut. The ones who do not will keep writing checks and hoping the output is worth it.
The measurement that matters
The missing layer is not more dashboards showing adoption curves and seat utilization. Engineering leaders already have plenty of those.
What is missing is the ability to follow AI-generated code from the moment it is created to the moment it reaches production. Commit-level attribution that shows which agent wrote the code, what percentage of a commit was AI-generated versus human-edited, whether it passed review or got rewritten, and whether it deployed or died.
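A minimal sketch of what such commit-level attribution could look like. The record fields, identifiers, and numbers here are illustrative assumptions, not a real schema or API:

```python
# Hypothetical commit-level AI attribution record and a survival metric.
# Assumes you can tag each commit with the generating agent and line
# provenance; all field names and values are invented for illustration.
from dataclasses import dataclass

@dataclass
class CommitAttribution:
    sha: str
    agent: str               # e.g. "claude-code" or "human"
    ai_lines: int            # lines attributed to the agent
    human_edited_lines: int  # AI lines rewritten during review
    deployed: bool           # reached production

def survival_rate(commits):
    """Fraction of AI-generated lines that shipped unchanged."""
    generated = sum(c.ai_lines for c in commits)
    survived = sum(c.ai_lines - c.human_edited_lines
                   for c in commits if c.deployed)
    return survived / generated if generated else 0.0

commits = [
    CommitAttribution("a1b2c3d", "claude-code", 120, 30, True),
    CommitAttribution("e4f5a6b", "claude-code", 80, 80, False),
]
print(f"Survival rate: {survival_rate(commits):.0%}")  # 90 of 200 lines
```

The useful property is that the denominator is generated lines, not commits or prompts, so code that was rewritten or never deployed counts against the rate.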
When you connect AI spend to production outcomes you can finally answer the questions that matter. Which teams get real leverage from AI agents and which burn tokens with nothing to show for it. Which vendors produce code that ships clean and which create more work for reviewers. Whether your AI costs are going up because adoption is working or because it is failing expensively.
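One hedged way to express that connection is cost per production line by vendor. The vendor names and figures below are invented; the shape of the comparison is what matters:

```python
# Connecting AI spend to production outcomes: two vendors with the same
# monthly bill but very different shipped output. All figures invented.
monthly = {
    # vendor: (AI spend in USD, AI lines that reached production)
    "vendor_x": (12_000, 40_000),
    "vendor_y": (12_000, 8_000),
}

for vendor, (spend, shipped) in monthly.items():
    print(f"{vendor}: ${spend / shipped:.2f} per production line")
```

On a spend-only dashboard these two vendors look identical; normalized by shipped lines, one is five times more expensive.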
At Waydev, this is what we spent the last year building. We have been measuring engineering behavior at scale for nine years for companies like Dropbox, American Express, and PwC. AI changed the inputs. We extended the measurement layer to match.
The new platform tracks AI adoption, AI impact, and AI ROI across the full software development lifecycle, connecting what organizations spend on AI agents to what actually reaches production.
Adoption is not value
The AI industry is asking engineering leaders to trust that more usage equals more value. But usage and value are not the same thing.
Adoption is not value. Usage is not impact. Tokens consumed are not code shipped.
A team that generates 10,000 lines of AI code per week and ships 2,000 to production is not outperforming a team that generates 3,000 and ships 2,500. But on every adoption dashboard in the industry today, the first team looks better.
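The comparison above in numbers, using the article's hypothetical weekly line counts:

```python
# Survival rate flips the ranking that a raw-volume dashboard shows.
# Line counts are the hypothetical weekly figures from the text.
team_a = {"generated": 10_000, "shipped": 2_000}
team_b = {"generated": 3_000, "shipped": 2_500}

for name, t in (("Team A", team_a), ("Team B", team_b)):
    rate = t["shipped"] / t["generated"]
    print(f"{name}: {t['shipped']} shipped lines, {rate:.0%} survival")
```

Team A wins on volume; Team B ships more code with a quarter of the generation, and a fraction of the token bill.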
That is the blind spot. And it is getting more expensive every quarter. The era of unaudited AI spend is ending. The engineering leaders who build the measurement layer now will own the conversation about AI ROI for the next decade.
The ones who wait will spend the next decade explaining bills they never understood.