Why Foundation Models Are Becoming Utilities

The Foundation Model AI Race Is Still Wide Open

We’re still so early in the foundation model AI race. Anthropic’s Claude Code is getting all the buzz right now: new releases, flashy features, endless capabilities. But under the hood, how different are these models really? When researchers studied curbing unwanted responses, semantic drift, and jailbreaking using “activation capping,” they found that foundation models all share very similar underlying mechanics.

The AI Hype Cycles I’ve Already Seen

I’ve been coding non-stop since 2020 and building with AI tools since November 2023. I’ve built recommendation engines, desktop apps, terminal apps, full-stack web applications, and thousands of scripts and helper functions. During that time, I’ve watched plenty of AI hype trains and tools come and go: custom GPTs, McDonald’s drive-thru AI, Apple AI news summaries… remember that one? That was fun. And one thing seems clear to me: we don’t yet know who the winners and losers of the AI race will be.

The Geometry Behind LLMs Might Be Universal

The geometry seems to be universal. You might think every LLM is unique, like a fingerprint, but interestingly, not quite. While the exact numerical weights of LLaMA, Qwen, and Gemma are different, they converge on a shared geometric structure for high-level concepts. So what does that suggest?

Shared “Persona Axes” Across Models

It suggests that many activation-space directions found within LLMs look similar across completely different models. LLaMA, Qwen, and Gemma all exhibit closely aligned persona axes; the models converge on the same underlying semantic groupings, and we humans simply attach labels to those groupings so we can understand them through a human lens.

For example, LLM personas related to helpfulness, anger, or humour all have similar mathematical groupings under the hood. The way they reproduce language shares similar directional properties in multidimensional space, despite different internal representations.
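As a rough illustration of what “similar directional properties in multidimensional space” means, here is a toy sketch. The vectors are made up (real persona axes are extracted from model activations and live in thousands of dimensions), but the measurement is the real one researchers use: cosine similarity between direction vectors.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two activation-space directions."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy stand-ins for a "helpfulness" direction extracted from two
# different models: both are noisy copies of one shared axis.
rng = np.random.default_rng(0)
shared_axis = rng.normal(size=64)
model_a_helpful = shared_axis + 0.1 * rng.normal(size=64)
model_b_helpful = shared_axis + 0.1 * rng.normal(size=64)
unrelated_direction = rng.normal(size=64)

# The two models' "helpfulness" axes point almost the same way (near 1.0),
# while a random direction is roughly orthogonal (near 0.0).
print(cosine_similarity(model_a_helpful, model_b_helpful))
print(cosine_similarity(model_a_helpful, unrelated_direction))
```

Two models can have completely different weights and still score near 1.0 on this measure for the same concept, which is exactly the alignment the persona-axis research describes.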

A Universal Compass for Human Language

This suggests that as models move toward a deeper understanding of human language, they may develop a “universal compass” for organising human language and some of the ideas it represents.

Or they’re all basically trained on the same data, and human language is relatively simple without enough complexity to show marked differences in LLM behaviour.

Or a bit of both.

When Foundation Models Become Utilities

This is why foundation models are becoming more like utilities than software protected by a moat: they lack any real differentiation from one another.

As a result, their value gets pushed down, creating a dangerous race to the bottom: subsidised prices from the companies with the deepest pockets, each hoping to become the winner that takes all so it can ramp up pricing later. Which won’t happen, because… China.

The Real Battleground: Wrappers

So what’s the second part of their plan?

The hyperscalers are going to try to produce the best wrappers for their LLMs to differentiate them from the herd. Because ultimately, people don’t buy a car just because of the engine; they buy it because of how comfortable the ride is, and people will pay more to increase that comfort.

The Talent Arms Race

This is why we’ve seen bidding wars and ridiculous acqui-hires for talent, like the $14.3 billion deal for Alexandr Wang (founder of Scale AI) to lead the new Meta Superintelligence Labs (MSL), and Peter Steinberger’s OpenClaw, which OpenAI recently won; it gives an LLM hands to actually be useful.

These moves aren’t just about hiring talent; they’re about giving hyperscalers an edge, because their models are all essentially the same. To differentiate these models and make them useful, you need structured wrappers around them using smaller ML models and traditional deterministic methods.
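Here is a minimal sketch of what such a wrapper can look like. The names (`extract_price`, `fake_llm`) and the stubbed-in model call are my own illustrations, not any real API; the point is the division of labour: the LLM does the fuzzy work, while deterministic code enforces the contract.

```python
import re
from typing import Callable, Optional

def extract_price(model_call: Callable[[str], str], text: str,
                  max_retries: int = 2) -> Optional[float]:
    """Wrap an LLM call with deterministic validation and retry.

    The model handles messy natural language; a plain regex decides
    whether the answer actually satisfies the contract.
    """
    prompt = f"Return only the price mentioned in this text, as a number: {text}"
    for _ in range(max_retries + 1):
        raw = model_call(prompt)
        match = re.search(r"\d+(?:\.\d+)?", raw)  # deterministic check
        if match:
            return float(match.group())
    return None  # the wrapper, not the model, decides the failure mode

# A stub standing in for any foundation model API:
fake_llm = lambda prompt: "The price is 19.99 dollars"
print(extract_price(fake_llm, "Socks cost $19.99"))  # 19.99
```

Swap `fake_llm` for LLaMA, Qwen, or Gemma and the wrapper behaves the same, which is the whole point: the differentiation lives in the structure around the model, not in the model.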

What Building With AI Actually Looks Like

This is something I’ve learned the hard way.

My journey into collaborating with AI began with code-related TODO comments within my files. I’d try to get the basics working myself, using Stack Overflow, and then hand that off to GPT-3 to refine what I’d done, give me suggestions, and potentially fix bugs.

Then, this year, Anthropic’s RL efforts made Claude very good at coding. So I updated my workflow to go back and forth between my codebase and Claude, refining my applications.

It became clear to me that as my projects grew, comments, technical documentation, technical language, understanding computer science and mathematical concepts, and clear organisation and structure became increasingly important for getting good results.

I had to become very disciplined around structure and repeatability.

Enter Claude Code

Then enter Claude Code.

To begin with, it was both a miracle drug and a teenager’s bedroom. Claude Code is really very good at getting to an end result, but it will consume your city’s entire electrical grid if you keep prompting it badly.

To get this working well and increase my productivity, I had to do a lot of groundwork.

Therefore, I went back to my organisational drawing board. I decided to create a tree-like structure and a reusable, AI-first framework with detailed Markdown files, filled with comments and coupled with unit tests to get everything working.
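To make that concrete, here is a sketch of the kind of layout and test-as-contract discipline I mean. All names are illustrative, not my actual project files:

```python
# A sketch of an AI-first layout (names are illustrative):
#
#   project/
#   +-- docs/              # detailed Markdown the agent reads first
#   |   +-- architecture.md
#   +-- src/
#   |   +-- pricing.py     # small, heavily commented modules
#   +-- tests/
#       +-- test_pricing.py
#
# The unit tests are the contract that keeps agent edits honest:

def apply_discount(price: float, percent: float) -> float:
    """Return price reduced by `percent` (0-100), rounded to pence."""
    return round(price * (1 - percent / 100), 2)

def test_apply_discount():
    assert apply_discount(100.0, 20.0) == 80.0
    assert apply_discount(19.99, 0.0) == 19.99
```

If a coding agent breaks `apply_discount`, the test fails and the change gets rejected; the structure, not the model, is what makes the results repeatable.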

Now, I feel like a 10x developer.

Don’t Believe the Hype

But why am I saying all of this?

Partly to give some insight into my experience so I don’t sound like a crypto bro selling a meme coin, and partly to say something about the present and make a prediction about the future of foundation models.

In the present, don’t believe the hype. Don’t get caught up in shiny objects and magic tricks.

Make no mistake about it: hyperscalers are moving into the utility space. They will become less profitable companies with much larger market share, trying to convince their customers that they have the best ride.

Developers Are Still the Differentiator

As a developer, I’ve faced endless loops, bugs, and the Will Smith eating spaghetti code.

For now, I think AI should complement developers, not replace us. With the right structure, developers are becoming more valuable, not less.

We still possess generalised logic baked into our foundations that LLMs are not able to parrot, like knowing how many “r”s are in “strawberry.”
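That kind of generalised logic is also exactly what a one-line deterministic wrapper provides for free (a trivial sketch; `count_letter` is my own illustrative name):

```python
def count_letter(word: str, letter: str) -> int:
    """Deterministic letter counting: no token-level guessing."""
    return word.lower().count(letter.lower())

print(count_letter("strawberry", "r"))  # 3
```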

And think: if every business has the same foundations and the same LLM tools, what’s your differentiator?

The Human Moat

Are we underestimating the human element in AI? Or is the moat just a comforting illusion for those with deep pockets?

Let’s talk about it. What’s your take?