The Unbreakable Pots
Why building the best AI model doesn’t guarantee capturing the most value
There’s an old Indian folktale about a potter who prayed for the skill to make unbreakable pots. The gods granted his wish. Business boomed initially—everyone wanted these perfect pots. Then sales slowed. The pots never broke, so nobody needed replacements. He hadn’t eliminated his business—he still made a living selling to new customers. But while his sales plateaued, someone else was making fortunes.
In June 2024, Anthropic released what looked like their unbreakable pot.
When Claude 3.5 Sonnet launched, developers couldn’t stop talking about it. The model scored 64% on agentic coding tasks, nearly doubling Claude 3 Opus’s 38%. Cursor, a new IDE, added support almost immediately. Social media filled with people building web apps and games through Claude’s interface—many without traditional coding experience.
One developer captured the moment: “Finally offering what everyone was waiting for—competent and correct automated code production.”
I’d been using Claude since Claude 2 launched in July 2023, first through the web interface, then through tools like Cline. But Claude 3.5 Sonnet felt different. The model actually worked. It produced code that ran. It understood context in ways previous models couldn’t.
Anthropic’s strategy seemed obvious: build the best coding model, charge premium prices, dominate the market. And for a while, it worked. Claude Code hit $1 billion in revenue just six months after its May 2025 launch.
Then something started to shift. The gap that felt insurmountable began closing.
When “Best” Stops Mattering
By September 2025, Claude Sonnet 4.5 scored 77.2% on SWE-bench Verified—still the highest score of any model available. Anthropic kept topping the benchmarks. The model remained technically superior.
But GPT-5 scored 74.9% on the same benchmark. That 2.3 percentage point gap looked significant on paper. In actual use, I stopped noticing the difference. I started using OpenAI’s Codex more often. It was cheaper. And for most of what I was building, it worked just as well.
I wasn’t alone in noticing this compression. Simon Willison, a veteran developer who beta-tested Claude Opus 4.5, wrote something striking. His preview access expired mid-project, forcing him to switch back to Claude Sonnet 4.5. He kept working at the same pace. “I’m not saying the new model isn’t an improvement,” he wrote, “but I can’t say with confidence that the challenges I posed it were able to identify a meaningful difference in capabilities between the two.”
Then he added the line that captures something important: “This represents a growing problem for me. My favorite moments in AI are when a new model gives me the ability to do something that simply wasn’t possible before... but today it’s often very difficult to find concrete examples that differentiate the new generation of models from their predecessors.”
That’s the moment when premium pricing power dies. When the gap between “best” and “good enough” shrinks below what customers can perceive in daily use.
Claude’s premium pricing faces pressure from cheaper alternatives. When the quality gap narrows, customers optimize for cost. And most developers can’t reliably detect the difference in production work.
Like the potter watching his sales plateau, Anthropic is discovering that perfection doesn’t guarantee exponential growth. But there’s a deeper problem: even if Claude can replace a $120,000-a-year developer with a $240 annual subscription—or $2,400 for premium—where does that saved money actually go?
Not to Anthropic.
Where the Money Flows
The code still needs to run somewhere. Apps need hosting. Databases need infrastructure. Those costs don’t disappear when AI writes the code faster.
This is where the potter’s story reveals its lesson. The merchants who made fortunes weren’t selling pots. They built water distribution systems—the infrastructure everyone needed, every single day. The pot was just how the water got delivered.
In AI’s case, the model generates code once. Infrastructure hosts it forever. And that’s where the venture-scale returns live.
Take Bolt.new as an example. Bolt charges $20 per month for its Pro plan, which lets anyone build applications. But Bolt spends an estimated $22-45 in token costs to support each Pro user—losing money on model usage itself. The apps Bolt generates get deployed to infrastructure providers like Netlify. Netlify captures 80% margins on hosting and has grown to 8 million developers, with 64% of recent growth coming from AI agents rather than human developers.
Bolt eliminated the developer expense. Netlify captured the compounding value.
This pattern isn’t unique to Bolt. It’s the architecture of how AI-generated code works. The model produces the artifact once. The infrastructure hosts it forever.
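A back-of-the-envelope sketch makes the asymmetry concrete. The figures below come from the article (the $22-45 per-user token cost and the 80% hosting margin are the article's estimates); the per-app hosting price is a hypothetical number chosen only for illustration:

```python
# Rough unit-economics sketch: an AI app builder paying per-token model
# costs vs. the infrastructure host serving the apps it generates.

def monthly_margin(revenue_per_user: float, cost_per_user: float) -> float:
    """Gross margin per user (or per app) per month."""
    return revenue_per_user - cost_per_user

# App builder: $20/mo Pro plan, $22-45/mo in token costs per user
# (figures from the article). Both ends of the range are negative.
builder_best = monthly_margin(20.00, 22.00)
builder_worst = monthly_margin(20.00, 45.00)

# Infrastructure host: ~80% margin on hosting revenue (article's figure).
# The $19/mo hosting price per app is an assumed, illustrative number.
hosting_price = 19.00
host_margin = monthly_margin(hosting_price, hosting_price * 0.20)

print(f"Builder margin per user: ${builder_worst:.2f} to ${builder_best:.2f}")
print(f"Host margin per app:     ${host_margin:.2f}")
```

The builder loses $2 to $25 on every Pro user each month, while the host keeps roughly 80 cents of every hosting dollar—and collects it every month the app stays deployed.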
While Anthropic was optimizing for model quality, their competitors were optimizing for a different game entirely.
The Game Google Is Playing
Google understood what the potter didn’t: don’t sell pots. Build the water distribution system.
Google doesn’t need Gemini to be the best model. They need it to be good enough to keep you deploying on Google Cloud.
Gemini Pro costs $1.25-$2.50 per million input tokens—a fraction of Claude’s pricing. Google can afford this because they’re not just making money on the model. They’re making money on the infrastructure where your code runs, where your databases live, where your applications scale.
In June 2025, Google launched Gemini CLI, their open-source answer to Claude Code. It’s free for most developers—60 requests per minute, 1,000 requests per day at zero cost. Google doesn’t need to monetize the tool. They monetize what happens after the tool generates code.
Microsoft follows the same playbook. They’ve invested roughly $13 billion in OpenAI, and GPT models integrate tightly with Azure. But Microsoft doesn’t own OpenAI or its models. They’re capturing revenue the same way Google does—through Azure hosting, Azure databases, the entire infrastructure stack that runs after the model generates code.
And here’s the critical strategic insight: these cloud providers are model-agnostic. Google Cloud runs Gemini, Claude, GPT, and open-source models. AWS hosts Claude (after investing $8 billion in Anthropic), their own models, and competitors. Azure runs GPT alongside Claude and everything else. They don’t need model exclusivity. They just need your workloads running on their infrastructure.
This is Joel Spolsky’s “Strategy Letter V” playing out in real-time: commoditize your complements. I’ve written about this dynamic before—when you make money on infrastructure, you want AI models to be as cheap and accessible as possible. Every dollar customers save on model costs is a dollar they might spend on hosting. The long-term recurring revenue is in infrastructure, not in the model generating the code.
Google and Microsoft aren’t competing with Anthropic on model quality. They’re making model quality irrelevant to their business model.
OpenAI’s Response
But what about OpenAI? It’s in the same position as Anthropic: making incredible models while someone else collects the recurring revenue.
OpenAI saw this problem earlier and responded aggressively.
In January 2025, OpenAI announced the Stargate Project—a $500 billion infrastructure investment over four years, with $100 billion deployed immediately. Not a partnership with an existing cloud provider. Not a modest data center build-out. A complete infrastructure play with SoftBank providing funding and Oracle contributing cloud expertise.
By September 2025, Stargate had already secured over 8 gigawatts of planned capacity across multiple U.S. sites—Texas, New Mexico, Ohio, Michigan, Wisconsin. The project is ahead of schedule, approaching the full $500 billion commitment. OpenAI is building infrastructure at the scale of hyperscalers.
Anthropic’s Realization
Somewhere in 2025, Anthropic seems to have realized the game had changed—though later than OpenAI.
In December 2025, Anthropic acquired Bun—a JavaScript runtime that powers Claude Code. This was Anthropic’s first acquisition. The stated reason was performance and reliability. Claude Code was generating $1 billion in revenue. If Bun breaks, Claude Code breaks.
But there’s a deeper strategic logic. Bun isn’t just a runtime. It’s the beginning of controlling where code executes. It’s the first step toward owning infrastructure, not just generating code.
A month earlier, in November 2025, Anthropic announced $50 billion for their own data centers in Texas and New York. This wasn’t an expansion of training capacity. This was infrastructure for running customer workloads—the exact business Google and Microsoft already dominate.
They’re doing this despite a $30 billion deal with Amazon Web Services, despite $8 billion in Amazon investment, despite being deeply integrated with AWS for training and inference.
Anthropic is trying to change games mid-race—pivoting to own the infrastructure layer and capture the recurring revenue they’re currently giving away.
But that is a high bar when you’re competing against AWS, Google Cloud, and Azure—companies with decade-long head starts, millions of customers already locked in, and ecosystems built over tens of billions in infrastructure investment.
The Potter’s Dilemma
The potter didn’t starve. He made a living. New customers still bought his perfect pots. But while his business plateaued, the merchants who built water distribution systems captured exponential returns.
Anthropic and OpenAI are racing to build infrastructure—the Bun acquisition, the data centers, owning where code runs. These moves signal companies realizing where venture-scale returns actually live.
The question isn’t whether they can build viable businesses. It’s whether building the best models captures a meaningful fraction of the value they create—or whether infrastructure companies turn their innovation into compounding returns.
The best pots don’t always build the best business.

