Bring Your Own Agent
How WebMCP gives AI a seat at every website
“Mr. Watson, come here — I want to see you.”
Those were the first words ever spoken on a telephone. Alexander Graham Bell hadn’t meant it as a grand demonstration — he just needed his assistant in the room. But the irony is hard to miss: the most advanced communication device in history, and the first words it carried were a request for someone to be physically present. The telephone could transmit a voice across a wire. But it couldn’t point at anything.
Today, we’ve built ourselves the most capable assistants in history. They can reason through complex problems, recall details we’ve long forgotten, and increasingly, they’re starting to know us — our preferences, the way we like things done. And yet, every interaction follows the same script: we type into a little box, wait for text to appear, type again. We describe what we’re seeing. They describe what they’re thinking. Back and forth, through text.
Picture the absurdity of it. You’re staring at a website — a rich, visual interface full of images, layouts, interactive elements — and beside it, in a narrow chat panel, you’re trying to describe what you want to your AI assistant in words. It’s like calling someone on the phone to help you rearrange your living room. You can hear each other fine. But nobody can point.
For the past few years, the prevailing response has been to make the chat better. Chat gets smarter, richer, more capable — the telegraph becomes a fax. But what if the real leap isn’t a better chat? What if it’s getting in the same room?
The Death of the Homepage (and Other Predictions)
Last August, Fortune ran a piece with a blunt thesis: the homepage is dead. “The leading digital experiences of 2026 will be conversation systems with seamless visual supports, not websites in the traditional sense.” The numbers backed it up. Nearly 60% of Google searches now end without a single click — up from 26% just two years earlier. ChatGPT processes 2.5 billion prompts a day. Gartner predicts traditional search volume will drop 25% by the end of this year. Why visit a website when you can just ask?
The web took the hint. An entire industry sprang up around making sites readable by AI rather than visitable by humans. AIO — AI Optimization — became the new SEO, focused not on ranking in search results but on getting your content cited by chatbots. New protocols emerged to make sites machine-readable — structured data that AI agents could consume without ever rendering a page. The direction was clear: the web was becoming a data layer, and chat was becoming the interface.
And when text-only chat felt too limiting, the industry’s answer was to make chat richer. Anthropic introduced Artifacts — interactive mini-apps that render alongside your conversation. OpenAI followed with Canvas, a collaborative workspace for writing and coding that lives inside ChatGPT. Then came MCP-UI, which embedded full interactive components — product selectors, image galleries, booking forms — directly into chat conversations. Shopify’s engineering team explained the motivation plainly: a product “isn’t just a SKU and price — it’s images showing different angles, color swatches you can click, size selectors that update availability.” Text alone can’t carry that. So they built a way to put those elements inside the chat window. By January 2026, MCP Apps became an official extension of the protocol, letting any tool return rich UI that renders right in the conversation.
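Mechanically, the trick is modest: a tool result can carry an embedded resource instead of plain text, and the chat host renders it in a sandboxed frame. Here is a rough sketch of that shape, following the MCP content-block format and MCP-UI's ui:// URI convention; the component URI and product markup are invented for illustration.

```typescript
// Roughly the shape of a tool result that returns renderable UI rather
// than text. The embedded-resource content block comes from the MCP
// spec; the ui:// scheme is MCP-UI's convention for naming UI components.
// The URI and markup below are illustrative, not a real API.
const productPickerResult = {
  content: [
    {
      type: "resource",
      resource: {
        uri: "ui://store/product-picker",  // names the UI component
        mimeType: "text/html",             // host renders this in a sandboxed iframe
        text: `<div class="product">
                 <img src="/img/jacket-front.jpg" alt="Jacket, front view">
                 <select id="color"><option>Olive</option><option>Navy</option></select>
                 <select id="size"><option>M</option><option>L</option></select>
               </div>`,
      },
    },
  ],
};
```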
Each step made the chat better — richer components, more interactive widgets, fuller visual experiences squeezed into a conversation thread. But notice the assumption underneath all of it: the chat window remains the center of gravity. The website comes to the chat — as data, as embedded cards, as sandboxed iframes. Everything flows toward that narrow column of conversation.
And for simple tasks, it works. Ask a question, get an answer. Book a flight, confirm the details. But somewhere between “book me a flight” and “help me design a wedding invitation,” the chat window hits a wall. UX researchers have been flagging the problem: text-based AI interfaces risk “regressing to high-memory-load, low-visual-feedback systems,” undoing decades of progress in interface design. One observer put it more simply: “Half of our work now is copy-pasting text from one chat to another.”
The fax machine got very good. But you still can’t rearrange furniture with it.
Bring Your Own Agent
What if instead of bringing the website into the chat, the agent came to the website?
That’s the reversal at the heart of WebMCP, a new browser standard that quietly shipped in Chrome’s early preview this month. The idea is simple: a web page can register its features as “tools” — functions with plain-language descriptions that an AI agent can discover and call directly. A real estate site might expose “filter by neighborhood” and “compare these two listings,” while a design tool offers “change background color” or “show me templates like this.” The agent doesn’t screenshot the page and guess where to click. It reads the menu of what’s available and calls the function. Techstrong.ai had the right headline: Chrome just gave AI agents their own front door to the web.
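Concretely, registration looks something like this. The API surface is still settling while the standard is in preview, so treat this as a sketch of the shape in the early explainer (a navigator.modelContext object, a plain-language description, a JSON Schema for inputs); applyNeighborhoodFilter is a hypothetical stand-in for whatever the page already does when a human clicks the filter.

```typescript
// A page exposing one of its features as a WebMCP tool. Assumes the
// navigator.modelContext surface from the early explainer; exact names
// may change before the standard ships. applyNeighborhoodFilter is a
// hypothetical function standing in for the page's existing filter logic.
declare function applyNeighborhoodFilter(neighborhood: string): number;

(navigator as any).modelContext?.registerTool({
  name: "filter-by-neighborhood",
  description: "Filter the visible listings to a single neighborhood.",
  inputSchema: {
    type: "object",
    properties: {
      neighborhood: { type: "string", description: "Neighborhood name, e.g. 'Capitol Hill'" },
    },
    required: ["neighborhood"],
  },
  async execute({ neighborhood }: { neighborhood: string }) {
    const count = applyNeighborhoodFilter(neighborhood); // runs the page's own code
    return {
      content: [{ type: "text", text: `Now showing ${count} listings in ${neighborhood}.` }],
    };
  },
});
```

The point of the shape is that execute runs inside the page, through the same code path a click would take, so the human watching the screen sees the results change as the agent works.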
The timing isn’t accidental. In late January, Google moved Gemini into Chrome’s side panel — not as a floating window you summon, but as a multimodal presence that can see the page you’re on, understand images and text, and act on what it finds. Microsoft has been building the same thing with Copilot in Edge. As these agents gain voice capabilities, you won’t just type to them — you’ll talk to your agent while you’re both looking at the same page. The browser is becoming a two-seat vehicle.
And here’s what makes this different from every website bolting on its own chatbot over the past two years. Those chatbots were strangers. Every site you visited, you met a new one — a fresh AI that knew nothing about you, your preferences, your history. You had to start from scratch each time, explaining what you wanted to a stranger who worked for the site.
With WebMCP, you bring your agent. The Gemini that rides in Chrome’s side panel has been with you all day. It knows what you were researching this morning, what you bookmarked last week, what decisions you’ve been weighing. This is BYOA — bring your own agent — and it turns the first wave of AI on the web on its head. Wave one was every website hiring its own AI, a stranger at every door. Wave two is you walking in with yours.
Consider the difference. You land on a wedding venue site. The site’s chatbot greets you and asks: how many guests? What date? Indoor or outdoor? What’s your budget? You type it all in, one field at a time, teaching a stranger about your wedding.
Now imagine arriving with your agent instead. Your agent already knows you’re planning for 150 guests, that you’ve been looking at June dates, that your partner keeps pinning rustic outdoor spaces. The moment you land on the page, your agent reads the site’s tools and filters the results before you’ve typed a word. The page loads and it’s already showing you what you came for. You scroll through photos, click on two that catch your eye, and ask your agent to check availability. The page updates — calendars and pricing appear for just those two.
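In WebMCP terms, the venue site's half of that exchange is just a couple of registered tools; everything your agent "already knows" arrives as arguments it fills in from your context. A hypothetical sketch, with the same assumed navigator.modelContext surface as above (tool names, fields, and the searchVenues / loadAvailability helpers are all invented for illustration):

```typescript
// Hypothetical tools a venue site might register. The agent supplies the
// arguments from context it already has (guest count, dates, style);
// searchVenues and loadAvailability stand in for the site's existing code.
declare function searchVenues(q: { guests: number; months: string[]; style: string }): void;
declare function loadAvailability(venueIds: string[]): Promise<string>;

const tools = [
  {
    name: "filter-venues",
    description: "Filter venues by guest capacity, candidate months, and style.",
    inputSchema: {
      type: "object",
      properties: {
        guests: { type: "number" },
        months: { type: "array", items: { type: "string" } },
        style: { type: "string", description: "e.g. 'rustic outdoor'" },
      },
      required: ["guests"],
    },
    async execute(args: { guests: number; months?: string[]; style?: string }) {
      searchVenues({ guests: args.guests, months: args.months ?? [], style: args.style ?? "" });
      return { content: [{ type: "text", text: "Results filtered on the page." }] };
    },
  },
  {
    name: "check-availability",
    description: "Show calendars and pricing for specific venues.",
    inputSchema: {
      type: "object",
      properties: { venueIds: { type: "array", items: { type: "string" } } },
      required: ["venueIds"],
    },
    async execute(args: { venueIds: string[] }) {
      const summary = await loadAvailability(args.venueIds);
      return { content: [{ type: "text", text: summary }] };
    },
  },
];

for (const tool of tools) (navigator as any).modelContext?.registerTool(tool);
```

Nothing about your wedding lives on the site; it shows up only as the arguments to a call you watched happen.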
It’s the difference between walking into a department store alone and walking in with a personal shopper who knows your wardrobe, your size, and what you already own. The personal shopper makes sure the dressing room only has clothes that fit, in colors you actually wear. You still choose. But you skip the tedious work of explaining yourself to every store from scratch.
The website provides the tools. Your agent brings the context.
We’re early. WebMCP is still in preview, the agents are still learning what to do with the tools they’re given, and there’s a lot to figure out about trust, permissions, and what happens when your personal agent meets a website’s business model. But the direction is clear, and it’s not the one most people predicted. The visual web isn’t dying. It’s not being swallowed by chat. It’s being redesigned, for the first time, with a second seat.