Tim Neilen

The shell and the sidebar

Written with AI

Anthropic is shipping plugins for Word, Excel, and PowerPoint. Not Microsoft. Anthropic. A frontier lab is overtaking the platform incumbent on its own turf.

Claude for Word went into public beta on 10 April. PowerPoint in February. Excel in October 2025. Claude Code on the web the same month. Claude Code generally available in May 2025. Initial Claude Code preview back in February 2025.

And that list is still missing things. Opus 4.7 three days ago as I write this. Claude Design on 17 April. Managed Agents on 8 April. Project Glasswing on 7 April. Agent Skills. Four Opus revisions in fourteen months. Two Sonnet revisions. Haiku 4.5. A run of smaller API features I’ve lost track of. Somewhere around fifteen significant launches from one vendor in fourteen months, probably more. OpenAI’s cadence is similar. Google has Gemini wired into Workspace at increasing depth.

The pace is real. Part of my job is sorting signal from hype, and plenty of this list will get quietly shelved, repriced, or absorbed into something else. Even after that filter, what’s left is doing things a chat window can’t.

How many of these have actually landed in your team’s daily workflow? In most organisations, the honest answer is close to zero.

My instinct is to reach for a ring-fenced self-hosted deployment. A chat interface the organisation controls, pointed at model APIs. Open WebUI is the one I reach for most often, but LibreChat works, and there are others; the pattern is bigger than any single tool. Pay per token. Model choice. Data boundary. Familiar chat interface. It’s pragmatic. It fits the budget. It’s my default recommendation.

I’m starting to wonder if that’s actually the right advice.

The reflex

The economic argument for ring-fencing is clean. Claude Team is $35 per seat per month, minimum five seats. For 100 staff that’s $42,000 a year just for access. Before anyone asks the model a single question.

A CFO reads that and asks the right question: will all 100 of these people extract $420 of value a year? The honest answer is no. Some will. A lot won’t, at least not initially. So the pilot goes to twelve people, compliance asks questions, security asks more, and by the time the evaluation closes, something new has shipped and there’s fresh capability to evaluate.
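As a quick check on that arithmetic, here is a minimal sketch using only the figures quoted above: $35 per seat per month, 100 staff, and a twelve-person pilot. The function and variable names are illustrative, and it ignores anything like tax, discounts, or annual billing terms.

```python
# Figures taken from the text above; all names are illustrative.
SEAT_PRICE_PER_MONTH = 35   # USD per seat per month, as quoted in the text
STAFF = 100                 # full rollout
PILOT_SIZE = 12             # the pilot group mentioned above


def annual_seat_cost(seats: int, price_per_month: int = SEAT_PRICE_PER_MONTH) -> int:
    """Yearly licence cost for a number of seats, before any usage."""
    return seats * price_per_month * 12


full_rollout = annual_seat_cost(STAFF)    # 42000
per_person = annual_seat_cost(1)          # 420, the value each seat has to return
pilot = annual_seat_cost(PILOT_SIZE)      # 5040

print(full_rollout, per_person, pilot)
```

The per-person figure is the one the CFO's question turns on: every seat has to return $420 a year before the licence pays for itself.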

Ring-fenced deployments route around this. Usage-based pricing means a quiet month costs less. Model choice means cheap questions go to cheap models. Identity integrates. Observability is built in.
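A rough sketch of what "cheap questions go to cheap models" and "a quiet month costs less" can look like in code. Everything here is an assumption for illustration: the tier names are hypothetical, the per-token prices are placeholders rather than any vendor's quoted rates, and the routing rule is deliberately crude.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                      # hypothetical label, not a real model ID
    usd_per_million_tokens: float  # placeholder price, not a quoted rate


CHEAP = ModelTier("small-model", 1.0)
CAPABLE = ModelTier("large-model", 15.0)


def pick_tier(prompt: str, needs_reasoning: bool = False) -> ModelTier:
    """Crude routing rule: short, simple prompts go to the cheap tier."""
    if needs_reasoning or len(prompt) > 2000:
        return CAPABLE
    return CHEAP


def monthly_cost(requests: list[tuple[ModelTier, int]]) -> float:
    """Sum the cost over (tier, tokens_used) pairs for one month."""
    return sum(
        tier.usd_per_million_tokens * tokens / 1_000_000
        for tier, tokens in requests
    )


# A month with no requests costs nothing, unlike a per-seat licence.
quiet_month = monthly_cost([])
busy_month = monthly_cost([(CHEAP, 1_000_000), (CAPABLE, 2_000_000)])
```

A real deployment would route on something better than prompt length, but the shape of the bill is the point: cost follows usage, not headcount.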

That’s the argument I’ve been making for a while. It’s a good one. The maths works.

What the chat window gives up

Claude for Office apps share context. Work on a spreadsheet, add to the report, move to the presentation, and the model carries state between them. Claude Code isn’t a chat window pointed at a coding model. It’s an agent that operates inside the terminal, reads your repository, runs your tests, edits your files.

These aren’t features you rebuild in a chat shell, no matter how good the shell. If the capability is in the integration - and for these tools, it is - then pointing a chat window at the same underlying model and calling it equivalent is a category error.

Whichever ring-fenced interface you deploy won’t get Claude for Word’s document sidebar. It won’t get the shared context across Claude for Excel and PowerPoint. It won’t get Claude Code’s repository-level agency. It gets the model. It gives up the product.

Which is the real product

For the last few years, the frontier lab product was the model. Interfaces were almost incidental. People used the API, wired it into their own tooling, built their own shells. Self-hosted deployments made sense because you were paying for what actually mattered - token throughput - and not paying for wrapper features you didn’t use.

That has been quietly changing.

The frontier lab product is increasingly the integration. Claude for Excel is a product built on Sonnet, but the product is the Excel integration. Claude Code is a product built on the same underlying models, but the product is the repository-aware agent. The API just returns tokens.

If the product is the integration, then routing around it loses the product. The cost isn’t a less-rich interface. It’s the thing you were paying for in the first place.

The hedge the chat window keeps

If an organisation buys Claude Team, they’re not just buying the tool. They’re buying into Anthropic’s trajectory. The team learns the Claude for Word sidebar’s quirks. Workflows assume Claude’s output shape. Muscle memory forms around one vendor’s way of doing things.

Six months from now OpenAI ships something that outpaces Claude for Word. Or Google wires a Gemini-shaped competitor directly into Workspace. Or Anthropic prices the tier up. Or deprecates the feature the workflow was built around. Or shifts its API terms in a way that doesn’t suit.

Switching isn’t a commercial decision at that point. It’s a retraining program.

This is the Windows versus Mac versus Linux argument in a new suit. Pick an ecosystem, get its depth, lose the ability to move. The depth is real. So is the lock-in. Remember who’s shipping the Word plugins now. Not even Microsoft. Different people in the same organisation will suit different ecosystems, which the single-vendor rollout quietly ignores.

Ring-fenced deployments sit differently on this axis. The chat interface is yours. The identity integration is yours. The observability is yours. The model behind the API is swappable. What used to be Claude can become GPT or Gemini or an open-weight model running on your own infrastructure, and the interface staff use doesn’t change. They don’t re-learn. They don’t re-train. The wiring behind it just points somewhere else.
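One way to picture "the wiring behind it just points somewhere else" is a shell that depends on a small interface rather than on any one vendor. This is a sketch under stated assumptions, not how Open WebUI or LibreChat are actually built: the class names are hypothetical, and the backends return canned strings where a real one would call a provider's API.

```python
from typing import Protocol


class ChatBackend(Protocol):
    """Anything that can turn a prompt into a reply."""

    def complete(self, prompt: str) -> str: ...


class VendorABackend:
    """Hypothetical stand-in for a hosted model behind an API."""

    def complete(self, prompt: str) -> str:
        # A real implementation would call the vendor's API here.
        return f"[vendor-a] {prompt}"


class LocalModelBackend:
    """Hypothetical stand-in for an open-weight model on your own infrastructure."""

    def complete(self, prompt: str) -> str:
        return f"[local] {prompt}"


class ChatShell:
    """The interface staff use; it never names a vendor."""

    def __init__(self, backend: ChatBackend) -> None:
        self._backend = backend

    def swap_backend(self, backend: ChatBackend) -> None:
        self._backend = backend

    def ask(self, prompt: str) -> str:
        return self._backend.complete(prompt)


shell = ChatShell(VendorABackend())
before = shell.ask("Summarise this memo")
shell.swap_backend(LocalModelBackend())
after = shell.ask("Summarise this memo")
```

The call staff make, `shell.ask(...)`, is identical before and after the swap. That is the hedge, and also its limit: nothing in this interface knows about a Word document or a spreadsheet.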

That’s a real hedge when the frontier is moving monthly and nobody can credibly tell you where it lands in two years.

It isn’t a free hedge. You give up integration depth to get the optionality. But on a three-year view, in an industry moving this fast, optionality has a value that’s easy to underprice.

Build it yourself, almost

There’s a third option that’s starting to look viable. Build your own integrations.

Claude Code can scaffold a Word add-in, an Excel add-in, or whatever identity-aware shell you want to wrap around an API. The lift that used to need a dedicated team is now feasible for a two-person partnership. The labs are quietly enabling the alternative to their own products.

So why not build Word-sidebar-equivalent integrations per organisation, tailored to their data, brand, workflow and identity? Cheaper than the Anthropic licence over time. Lock-in-proof. Integrated where it matters.

The catch is everything that cuts across applications.

The vendor’s advantage isn’t that they can build a Word plugin. It’s that the Word plugin shares context with the Excel plugin, and all of it knows about the same conversation history. That cross-application state isn’t a plugin. It’s a platform that individual plugins plug into, and it’s an order of magnitude more work to build than any single add-in.

A bespoke Word add-in is a Word add-in. It isn’t a system. Build the system and you’re maintaining a mini-Anthropic, in a market where the actual Anthropic ships a release a month.

That’s a real option for some. For most, it’s a dilemma more than a plan.

The power-user projection

I’m a technologist. I enjoy these tools deeply. I build agents for fun. I chain model calls because it scratches an itch. When I think about who these tools are actually for, I have to keep catching a projection I make.

I imagine people wanting model choice and fine-grained control. Because that’s what I want. I imagine them being comfortable in a generic chat window. Because I am.

But the person who needs Claude for Word doesn’t want to open a separate chat tab, paste in their document, ask for a rewrite, copy the result back, paste it in. They want the sidebar that’s already aware of what they’re writing. The person modelling in Excel doesn’t want to copy rows into a prompt. They want the formula help to live where the data already is.

Training can move someone from “I don’t know how to use this” to “I get value from this.” Training cannot turn a chat window into a Word sidebar.

A ring-fenced chat deployed at a firm whose staff would genuinely benefit from Claude for Word isn’t a cheaper version of the same thing. It’s something that can’t do the thing they’d actually use.

Neither of these is something they want to build

Whichever pattern wins, the organisation doesn’t want to stand it up themselves. Not the chat interface. Not the identity integration. Not the model routing. Not the rollout plan. None of it.

They want someone to walk them through it. Someone who’s made these choices before, who’ll sit with them through the awkward first months of adoption. Especially when the people making the call aren’t technologists. They know AI matters. They don’t know whether to buy Claude Team or deploy a self-hosted interface, and they don’t want to be the person making a $42,000 procurement decision based on a vendor pitch.

The economic argument for ring-fencing assumes the organisation can build it. Most can’t - not because they lack capability, but because it isn’t their job to know which open-source interface to choose, which identity provider to wire in, or which model to route each request to.

Where does that leave me

No clean answer. The honest recommendation is messier than “deploy a self-hosted chat” or “buy Claude Team.”

Ring-fenced is right for generic chat work across a broad staff base where per-seat economics don’t work. Vendor tools are right for integrated work where depth delivers the value. Most organisations have both kinds of work, which makes the honest recommendation “both”. Uglier to quote. Harder to deploy. Probably closer to right.

The thing I keep having to catch in myself is simpler than which tool to recommend. It’s pattern-matching the recommendation to what I find easy to deliver - the ring-fenced pattern, the rehearsed pitch, the budget that fits - instead of sitting with what a given organisation’s work actually needs.

Last week I wrote about the gap between capability and context. This is the sibling problem: which product, and who brings the organisation through it. The tool is just the container. The recommendation isn’t the delivery.