Nate B Jones AI Politics

The Real Story Behind the Government GPT 5.6 Freeze.

Nate B Jones argues that OpenAI's government-restricted rollout of ChatGPT 5.6 isn't the real story on its own. It's one of five signals from the same week, alongside the new Siri, Anthropic's Claude Tag in Slack, Z.AI's GLM 5.2, and an OpenAI Codex adoption study, that all point to the same underlying fight: not which model is smartest, but which system actually knows what's going on. He walks through how Apple, Anthropic, and OpenAI are each racing to connect AI to context (personal, team, and file respectively) rather than just raising raw intelligence, then argues the government's cybersecurity freeze on frontier models is quietly handing open source models room to close the public capability gap even as the labs likely keep their private lead.

Published Jun 29, 2026 · 17:13 video · 21 min read · Added Jul 1, 2026

At a glance

OpenAI just shipped ChatGPT 5.6, and then didn't ship it, not really: access is frozen to a small group of government-approved partners while Washington runs a cybersecurity review. Nate B Jones argues that this freeze is not the story by itself. It is one of five things that happened the same week, alongside the new Siri, Claude Tag in Slack, Z.AI's GLM 5.2, and a Codex adoption study, and he says all five are the same underlying fight. It is not a fight over which model is smartest. It is a fight over which system actually knows what is going on: which file is current, what the customer meant, what the team decided, what can be shared, what counts as done. His claim is that when frontier intelligence slows down, even for a few weeks, the advantage stops being the newest model and becomes the context that makes any good model useful. The rest of the video walks that thesis through Siri, Claude Tag, and Codex before landing on what the government freeze actually does to the competitive map.

The frontier is slowing, and context is the new advantage

Jones opens with the plain fact. OpenAI released ChatGPT 5.6, but for now access is restricted to a small group of government approved partners while Washington reviews the cybersecurity risk. That is not a cancellation, he says, but it is a tremendous slowdown in frontier availability. And by the end of the video he wants the viewer to understand why that delay, the new Siri, Claude Tag, GLM 5.2, and Codex are all about the same underlying thing: a battle for the part of your brain that understands work. Not the sci-fi mind control sense of that phrase. The everyday part: which message matters, which file is current, what the customer actually meant, what the team decided, what can be shared, what can't be shared, what counts as done. Because if frontier intelligence slows down, even for a few weeks, the next advantage is not owning the newest model. It's having the context that makes any good model useful.

Look at the week through that lens and the news starts to rhyme. Apple is trying to fix Siri by giving it access to messages, photos, email, notes, screen, and apps. Anthropic has launched Claude Tag in Slack, where a team can give Claude access to selected channels, tools, data, and code bases. Z.AI's GLM 5.2 has made cheap, open, frontier-ish intelligence feel much closer to reality than it did a few weeks earlier. And OpenAI has a Codex paper showing that inside OpenAI, Codex has become the dominant surface for work-related AI output. Those sound like completely different stories (OpenAI told to slow the rollout, Apple trying to make Siri less embarrassing, Anthropic dropping Claude into Slack), but Jones says the problem underneath all of them is identical: the model can be smart and still not know what's going on.

He grounds that in the everyday experience of using AI. Open Codex, ChatGPT, Claude, or Gemini and the model is genuinely capable: it can write, reason, summarize, and help you think. But before it can do something useful, you have to carry the entire situation into the context window yourself, often by uploading files. Paste the email, paste the memo, explain who the client is, explain which version of the deck is current, explain that yesterday's Slack thread changed the decision. That is what prompting has become as we've asked these models to do more, and only after all of that friction does the AI become genuinely useful. Jones calls that friction an agent problem: a problem you want agents to solve by going after the context window itself. That, he says, is the promise everyone has been chasing with agents for months.

From there he sets up the rest of the video as a guided tour: three surfaces, Apple's Siri, Claude Tag, and Codex's execution inside OpenAI, plus two pressure points, GLM 5.2 on one side and the ChatGPT 5.6 delay on the other. Underneath all of it is why intelligence is getting cheaper even as the newest frontier models come out more slowly, and what that means for context. His summary line: the next useful AI product probably won't be the one that wins a benchmark. It will be the one that knows where the work is, what it's allowed to see, what it's allowed to do, and knows all of that seamlessly.

Figure 1. Jones's frame for the entire video. Four stories that look unrelated, Apple's Siri overhaul, Anthropic's Claude Tag, OpenAI's Codex study, and the ChatGPT 5.6 government freeze, are really one story about who controls the context layer, with GLM 5.2 adding outside pressure by narrowing the public capability gap while the freeze is in effect.

Apple connects Siri to the context in your life

Jones turns first to Siri, because it's the surface everyone already has an opinion about. Siri has been bad for so long it's a punchline: ask it something normal and half the time it either misunderstands or hands you a web search that makes you wonder why you bothered speaking out loud. The easy headline has been that Apple is finally trying to fix Siri, with plenty of skepticism about whether it will actually be good. Apple's own framing leans into that: Siri as a conversational AI assistant, more natural conversations, richer answers, a dedicated Siri app. Jones says that's part of the story, and having used it himself, it may well work. But he doesn't think "Siri becomes ChatGPT" is the real story. He thinks the real story is that Apple is trying to make Siri useful by connecting it to the context already in your life.

His example: a question like "when is my mom landing" requires context from the calendar, the flight number, the email confirmation, whether another family member said they might pick her up instead, and whether the flight is running late. Apple's actual challenge, in his read, is finding a way to privately and securely connect Siri to where that context already lives on the phone: photos, calendar, notes, email, app state, the screen itself. If Siri can do that, its raw intelligence doesn't have to be especially high for it to be incredibly useful. Jones's pointed line is that the question of Siri's capability may be the wrong question entirely. Apple's answer isn't a capability answer, it's a context answer, an answer about where intelligence lives, and Apple is pushing that intelligence as close to your own systems as it can, on-device processing wherever possible, private cloud only where it has to be.

The product shape that follows is, in his telling, genuinely clever. Apple is effectively saying your assistant gets better the closer it sits to you, and conveniently, sitting close to you also lets Apple build a privacy architecture where the context stays only yours. That's a consumer answer to the context problem: Siri doesn't have to be brilliant, it just has to have the run of your phone to be extremely useful. And that flips where Apple's advantage actually comes from. It stops being the app store ecosystem or the hardware alone, and becomes the fact that Apple already has the context that lives inside the iPhone, and Apple can reach it in ways nobody else can. Keep that shape in mind, Jones says, heading into the work side of the story.

Anthropic launches Claude Tag inside Slack

On the surface, Anthropic's announcement is plain: Claude Tag starts in Slack. A team can grant Claude access to selected channels, tools, data, and code bases, tag it in, and Claude works through tasks and stages, responding in the thread. It remembers relevant information from the channels it's in, and it operates inside defined permission scopes, spend limits, and logs. That sounds like a Slackbot, Jones says, but don't say that too fast, Slack has had bots for a long time. What's actually interesting is that Anthropic is trying to put the assistant inside your team's context, not your context. On your phone, context is private and messy because it's your life. Inside a company, context is shared, permissioned, political, stale, half written, and scattered across six different places.

Work happens in exactly those messy places, and for a long time AI has stayed mostly separate from it, with a few exceptions: Devin has been genuinely successful from a coding angle, but there haven't been many strong off-the-shelf examples of intelligent AI operating inside that kind of mess more broadly. So when Anthropic says Claude Tag can build context over time, Jones treats that as more than a feature claim, he treats it as the heart of where the company is headed. And he calls it a powerful and dangerous statement, because the more useful Claude becomes inside Slack, the more it needs access to exactly the messy stuff companies are bad at governing: engineering decisions, customer tickets, pricing debates, information about people. Anthropic clearly knows this, which is why the launch language leans so heavily on scopes, permissions, admin controls, and channel defined memories. They know they have to earn that trust, because an AI teammate that breaks a boundary in Slack has just created a context leak, and a context leak is a corporate liability. The pitch, in effect, is: trust us with this context because you stay in charge the whole time.
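To make the scopes, spend limits, and logs idea concrete, here is a minimal Python sketch of how a scoped assistant might gate and record each access request. Everything in it is an illustrative assumption: the `Scope` class, its fields, the channel names, and the dollar figures are hypothetical and are not Anthropic's or Slack's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class Scope:
    """Hypothetical permission scope for an assistant tagged into a workspace."""
    allowed_channels: set[str]          # channels an admin has explicitly granted
    spend_limit_usd: float              # hard budget cap for this scope
    spent_usd: float = 0.0
    audit_log: list[str] = field(default_factory=list)

    def request(self, channel: str, est_cost_usd: float) -> bool:
        """Allow access only if the channel is in scope and budget remains.

        Every decision, allowed or denied, is appended to the audit log.
        """
        in_scope = channel in self.allowed_channels
        within_budget = self.spent_usd + est_cost_usd <= self.spend_limit_usd
        allowed = in_scope and within_budget
        if allowed:
            self.spent_usd += est_cost_usd
        verdict = "ALLOW" if allowed else "DENY"
        self.audit_log.append(f"{verdict} {channel} cost={est_cost_usd:.2f}")
        return allowed


# Hypothetical usage: one granted channel and a one-dollar budget.
scope = Scope(allowed_channels={"#eng-decisions"}, spend_limit_usd=1.00)
print(scope.request("#eng-decisions", 0.40))    # granted channel, budget available
print(scope.request("#pricing-debates", 0.10))  # channel was never granted
print(scope.request("#eng-decisions", 0.70))    # would push spend past the cap
print(scope.audit_log)
```

The design point the sketch tries to capture is that a denial is logged just like an approval, so an admin can see what the assistant tried to reach as well as what it actually read.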

Jones calls Claude Tag a far better signal than most of the AI co-worker startups crowding this space, because of what Anthropic is doing very deliberately: you already fed us formal context through prompts, through co-work, through Claude Code, now trust us with your informal context too, and let us be a more useful co-worker as a result. No other company, he says, can make that pitch in quite the same way, because no one else has already built up that formal-context relationship with users at scale. This, he says, is Anthropic doing for work exactly what Apple is doing for your phone.

Codex and how OpenAI employees actually adopted it

Jones then turns to a Codex adoption study that he admits is easy to dismiss, it doesn't read like news. But he thinks it's genuinely useful here because software makes the assistant-context problem visible in its cleanest form. The study tracks how actual OpenAI employees chose, or didn't choose, to adopt Codex over time, and what they used it for, which is really a study of what context those employees trusted Codex with. What makes it fascinating to him is the assumption people bring to it: that inside OpenAI, of all places, using Codex must be mandatory. It wasn't. Codex had to earn trust the normal way, starting with engineers and only later spreading to other kinds of knowledge workers.

The point that matters most to him is that even inside one of the most AI-native companies on the planet, adoption of a specific AI tool with sensitive context is not a light switch you flip once. It's a slow accumulation with visible tipping points. One tipping point shows up clearly in the data: Codex got noticeably more useful in the months after 5.5 shipped, and adoption in non-technical circles inside OpenAI jumped sharply right after that release. With 5.5, Jones argues, Codex earned enough trust to start receiving legal work, recruiting work, and sales work: the same kind of unglamorous, sensitive context that engineers had already trusted it with for code.

He draws out the sharpest comparison of the whole video here: Codex is doing the opposite of Claude Tag. Claude Tag's pitch is "you already work in Slack, so tag Claude in." Codex's pitch is "your work is sensitive and important, point Codex at the local files that matter, and it will take care of the rest." Codex positions itself as your launchpad, your headquarters. Claude's frame is "let Claude come to where you already are, and hand it the mess." Jones concedes that this is a simplification, since both products touch files and both do chat, but the legacies are real: Claude has always treated context as fundamentally conversational, Codex has always treated it as fundamentally file-shaped, and you can still see that inheritance play out this week. He doesn't think this gap will last, because these labs tend to copy each other and a tagged Codex is plausible soon, but the current difference in product shape is instructive regardless. Claude Code and co-work were exciting because you could just type into a terminal or a chat and the system would take care of the rest: bring your wheelbarrow of work and get an output back. Now Anthropic is taking that same shape into Slack, in a sandbox, doing the work there and handing back a result. It has become far more wide-ranging as computer use has matured over the last couple of months, but the underlying legacy, files for Codex and conversation for Claude, still shapes both products.

|  | Siri | Claude Tag | Codex |
| --- | --- | --- | --- |
| Context source | photos, calendar, notes, email, screen, app state | Slack channels, tools, data, code bases you tag it into | the local files and repo you point it at |
| Trust earned by | on-device processing, private cloud fallback | scopes, spend limits, admin controls, logged access | team by team, engineers first, then legal, recruiting, sales |
| Product shape | ambient, seamless, close to you | chat-shaped: comes to where you are (messy) | file-shaped: you bring the work to it (structured) |
| Where intelligence sits | close to you, deliberately not maximally smart | inside the conversation, remembering as it goes | inside the repo, earning sensitive work after 5.5 |

The bet (all three): capability is not the constraint; knowing what's allowed and what's current is.

Figure 2. The three surfaces Jones tours, compared. Siri, Claude Tag, and Codex are different answers to the same question, where does the assistant get its context and how does it earn the right to touch it, and each answer implies a different product shape: ambient, conversational, or file first.

Why the US government is slowing frontier releases, and what it starts

Jones closes by connecting the ChatGPT 5.6 delay to everything before it. If a frontier model spends the next few weeks or months in a restricted preview, which he expects will happen to most of them, the world doesn't pause and wait. Companies still have Claude, they still have OpenAI's existing models, and now they also have GLM 5.2, plus whatever open model comes next, maybe a new DeepSeek. What they have from Anthropic and OpenAI at the frontier is the anticipation of future work, not the reality of it. Both labs are still developing and accumulating knowledge rapidly behind closed doors; they simply can't release it as fast. The government restriction, in other words, adds friction right at the frontier of intelligence, and that friction puts more pressure on Anthropic and OpenAI to ship context features like Claude Tag, because if you can't win with a smarter model this month, you win by increasing the utility of the intelligence you already have. If tagging Claude into a Slack thread takes thirty seconds instead of the ten minutes it takes to brief an AI from scratch, that saved time adds up, and it adds up even if the underlying model never got smarter.
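The thirty-seconds-versus-ten-minutes comparison can be turned into a quick back-of-the-envelope calculation. The two durations come from the text; the task volume (ten briefings a day over a five-day week) is purely a hypothetical assumption chosen for illustration, not a figure from the video.

```python
BRIEF_FROM_SCRATCH_S = 10 * 60  # ten minutes to brief an AI from scratch (from the text)
TAG_IN_THREAD_S = 30            # thirty seconds to tag the assistant into a thread (from the text)


def hours_saved(tasks_per_day: int, workdays: int) -> float:
    """Hours saved when every task is tagged in rather than briefed from scratch."""
    saved_per_task_s = BRIEF_FROM_SCRATCH_S - TAG_IN_THREAD_S  # 570 seconds per task
    return saved_per_task_s * tasks_per_day * workdays / 3600


# Hypothetical workload: 10 tasks a day for a 5-day week.
print(f"{hours_saved(tasks_per_day=10, workdays=5):.1f} hours saved per week")
```

Under those assumed numbers the saving is roughly a full working day per person per week, with no change in model capability, which is the shape of the argument Jones is making.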

That is what Jones means by a context war, and he says it's the right lens for reading the news over the coming weeks. Apple is battling for your personal context, which, because people bring their phones to work, becomes a work context story too. Anthropic and OpenAI are battling directly over work context, just with differently shaped products. And one of the more interesting side effects he flags is that the government slowdown is giving open source models room to catch up in public even while they're not catching up in private. Anthropic and OpenAI may well hold onto a six, seven, eight month lead over open source privately, but the public models people can actually reach may start to close that gap, precisely because the US government is slowing down frontier releases. That puts a tremendous amount of pressure on the context layer specifically: a real fight over how quickly and easily any given model can apply its intelligence to the context it's handed.

Apple: Connects Siri to your personal context instead of chasing raw capability: calendar, photos, notes, email, the screen itself.
Anthropic: Ships Claude Tag in Slack, asking teams to hand over the messy, informal context that formal prompting never captured.
OpenAI: Publishes the Codex adoption study, showing that even inside OpenAI, trust with sensitive context had to be earned team by team, and jumped only after 5.5 shipped.
Z.AI: Releases GLM 5.2, making cheap, open, frontier-ish intelligence feel closer to reality than it did just weeks earlier.
Washington: Restricts ChatGPT 5.6 to government-approved partners during a cybersecurity review, freezing the newest frontier model in preview.

Figure 3. The week Jones says rhymes. Five moves from five different actors, read together as one story about who gets to control context while the frontier itself is stalled.

Figure 4. Jones's read on the freeze's side effect. Labs keep advancing privately regardless of the review, but the public frontier line flattens the moment a model like ChatGPT 5.6 goes into restricted preview. Open source releases such as GLM 5.2 climb to close that public-facing gap, even while the private lead the labs hold over open source stays wide and effectively invisible to the public.

Jones's advice is to think concretely about your own position in that war: what context you're comfortable handing these companies, what you want to retain, and whether you're willing to spend time building pieces of a harness that let you decide where your context routes. He connects this to his own recurring "open brain and open engine" work, building harness pieces in public so people have more choices. He's careful to say he's not alone in that effort: it's a real and welcome movement, because nobody should feel locked into a single model provider to get meaningful work done.
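One way to picture a harness piece that decides where your context routes is a small policy table keyed by sensitivity. The sketch below is a hypothetical illustration only: the `Sensitivity` levels, the destination names, and the example items are assumptions made for this example and do not describe Jones's own tooling or any vendor's product.

```python
from enum import Enum


class Sensitivity(Enum):
    PUBLIC = 1       # fine to send to any hosted provider
    INTERNAL = 2     # send only through a scoped, logged integration
    RESTRICTED = 3   # keep on infrastructure you control


# Hypothetical routing policy; the destination names are placeholders.
ROUTES = {
    Sensitivity.PUBLIC: "hosted-frontier-model",
    Sensitivity.INTERNAL: "hosted-model-with-scoped-access",
    Sensitivity.RESTRICTED: "local-open-weight-model",
}


def route(items: list[tuple[str, Sensitivity]]) -> dict[str, list[str]]:
    """Group context items by the destination the policy assigns them."""
    plan: dict[str, list[str]] = {}
    for name, level in items:
        plan.setdefault(ROUTES[level], []).append(name)
    return plan


# Hypothetical context items a user might label before any prompt is sent.
plan = route([
    ("press-release.md", Sensitivity.PUBLIC),
    ("pricing-debate-thread", Sensitivity.INTERNAL),
    ("employee-reviews.csv", Sensitivity.RESTRICTED),
])
for destination, names in plan.items():
    print(destination, names)
```

Keeping the policy in a plain table, separate from any one provider's client code, is what would let you change where a class of context goes without rewriting the rest of the harness.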

His closing frame is that the intelligence wars are shifting into context wars. It will matter less exactly when 5.6 ships, or eventually when a future model ships, and more when the next real step in applying intelligence usefully arrives. Siri, in his telling, already proves the point: you don't need a benchmark maxing model for incredible utility, you need intelligence applied seamlessly across the context you actually have. Pay attention to the context layer, he says. It's going to matter a lot.

Key takeaways

OpenAI shipped ChatGPT 5.6 but restricted it to government approved partners during a cybersecurity review. Not a cancellation, but a real slowdown in frontier availability.
Jones's thesis: that delay, the new Siri, Claude Tag, GLM 5.2, and the Codex study are all the same underlying fight, over which system knows what's actually going on, not over which model is smartest.
The everyday friction he names: pasting emails, memos, and client context into every prompt before an AI becomes useful. That friction is exactly the "agent problem" agents are supposed to solve.
Apple's real move with Siri is not raising its intelligence. It's privately and securely connecting Siri to context already on the phone: calendar, photos, notes, email, screen. On-device first, private cloud as fallback.
Claude Tag puts Claude inside a team's Slack context, betting that trust, built through scopes, permissions, and admin controls, will let it handle the messy, informal context formal prompting never captured.
The OpenAI Codex adoption study shows that trust with sensitive context (legal, recruiting, sales) had to be earned team by team even inside OpenAI, and jumped sharply only after GPT-5.5 shipped.
Claude Tag and Codex are mirror images: Claude comes to where you already are (chat-shaped), Codex asks you to bring the work to it (file-shaped).
The government freeze creates a two-tier gap: labs likely keep a six to eight month lead over open source privately, but the public-facing gap may narrow because open source keeps shipping while the labs' newest models sit in review.
Jones's practical advice: decide what context you're comfortable giving these companies, what you want to retain, and whether to invest in your own harness so you're not locked into one model provider.
The intelligence war is shifting into a context war. The next useful AI product wins by knowing where the work is and what it's allowed to touch, not by topping a benchmark.

Chapters

0:00 The frontier is slowing and context is the new advantage
3:13 Apple connects Siri to the context in your life
5:44 Anthropic launches Claude Tag inside Slack
8:29 Codex and how OpenAI employees actually adopted it
12:51 Why the US government is slowing frontier releases

Notable quotes

The new Siri, Claude Tag, GLM 5.2, and Codex are all about the same underlying thing: a battle for the part of your brain that understands work. Nate B Jones, 0:23

If frontier intelligence slows down, even for a few weeks, the next advantage is not owning the newest model. It's having the context that makes any good model useful. Nate B Jones, 0:41

The model can be smart and still not know what's going on. Nate B Jones, 1:31

Ultimately, the question of Siri's capability may be the wrong one. Apple's answer is not a capability answer, it's a context answer. Nate B Jones, 4:39

Your assistant gets better when it's close to you. Nate B Jones, 5:04

If you put an AI teammate in Slack and it breaks boundaries, you've created a context leak. You've created a corporate liability. Nate B Jones, 7:40

Codex had to earn everyone's trust. Nate B Jones, 9:08

We are in the middle of a context war. Nate B Jones, 14:15

Pay attention to the context layer. It's going to matter a lot. Nate B Jones, 17:03

Resources mentioned

Nate B Jones, the channel (AI News & Strategy Daily), for the sober, non-hype read on AI product moves this video delivers.
OpenAI and ChatGPT 5.6, the model whose rollout was restricted to government-approved partners during a cybersecurity review.
Apple's Siri AI, the conversational assistant overhaul with a dedicated app, discussed as a context play rather than a capability play.
Anthropic and Claude Tag, the Slack-native AI teammate that can be granted access to channels, tools, data, and code bases.
Codex, OpenAI's coding agent, and its adoption study, "The Shift to Agentic AI: Evidence from Codex," which Jones treats as the cleanest evidence of the context-trust problem.
Z.AI's GLM 5.2, the open-weight model Jones cites as narrowing the public capability gap while frontier releases from the major labs slow down.
DeepSeek, named as an example of the next open source model likely to follow GLM 5.2.
Devin, cited as one of the few AI products that had already succeeded at working inside messy, informal context, in its case for coding.
Slack, the platform where Claude Tag launches and where the "messy, shared, political" work context Jones describes actually lives.

Where it stands

This is a strategy read on a live, fast moving story, not a settled account, and Jones is explicit that he is naming a pattern across five separate announcements rather than reporting new facts about any one of them. The strongest claim, that Siri, Claude Tag, and Codex are different answers to the identical question of who controls context, holds up well: all three product teams are visibly making the same trade, trading raw capability for scoped access to where the work already lives. The weaker claims are the ones that depend on things nobody outside these companies can verify yet: exactly how many months of private lead the labs hold over open source, and whether the ChatGPT 5.6 review genuinely hands Z.AI and the next open source release a meaningful public opening, or just a few weeks of noise. Those are informed bets dressed as observations, and Jones treats them that way, framing the freeze's real effect as a hypothesis to watch rather than a fact already proven. The most durable idea in the video is the plainest one: a model can be smart and still not know what's going on, and every product he discusses is, in one shape or another, an attempt to fix that.

Full transcript

OpenAI just released ChatGPT 5.6, but not in a normal way. For now, access is restricted to a small group of government approved partners while Washington reviews the cybersecurity risk. That's not a cancellation, but it's a tremendous slowdown in frontier availability. And by the end of this video, I want you to understand why that delay, the new Siri, Claude Tag, GLM 5.2, and Codex are all about the same underlying thing: a battle for the part of your brain that understands work. Not your brain in the sci-fi mind control sense. The everyday part, which message matters, which file is current, what the customer actually meant, what the team decided, what can be shared, what can't be shared, what counts as done. Because if frontier intelligence slows down, even for a few weeks, the next advantage is not owning the newest model. It's having the context that makes any good model useful. Look at the week through that lens, and the news starts to rhyme. Apple's trying to fix Siri by giving it access to your messages and photos and email and notes and screen and apps. Anthropic has launched Claude Tag in Slack, where a team can give Claude access to selected channels and tools and data and code bases. Z.AI's GLM 5.2 has made cheap, open, frontier-ish intelligence feel much closer to reality than it did just a few weeks ago. And OpenAI has a Codex paper showing that inside OpenAI, Codex has become the dominant surface for work related AI output. And those really do sound like completely different stories. I get it. OpenAI being told to slow the rollout, Apple trying to make Siri less embarrassing, Anthropic dropping Claude into Slack, how are these related? The problem is the same underneath all of them. The model can be smart and still not know what's going on. If you use AI every day, you already know the feeling. You can open Codex or ChatGPT or Claude or Gemini, and the model is very capable. It can write, it can reason, it can summarize, it can help you think. 
But before it can do something useful, you have to carry that entire situation into the context window, often through uploading files to the chat box. You paste the email, you paste the memo, you explain who the client is, you explain which version of the deck is current, you explain that the Slack thread from yesterday changed that decision. This is what prompting has become as we've asked these models to do more. And then, after all of that, when you put all of that in, the AI finally becomes extremely useful. That's a really big friction point. And that is what we've described as an agent problem, a problem that you want agents to fix by going after the context window. That's the promise we've all been trying to realize with agents for the last few months. So I'm going to walk you through three surfaces here. Apple's Siri, Claude Tag, and Codex in terms of execution inside OpenAI. And I'm going to walk you through the pressure points around them, GLM 5.2 on the one hand, the delay of ChatGPT 5.6 on the other. Throughout, we're going to uncover the story of why intelligence is getting cheaper, the newest frontier intelligence is coming out more slowly, and what that means for all of us as far as context goes. Fundamentally, the next useful AI product is probably not going to be the one that wins a benchmark. It's going to be the one that knows where the work is, what it's allowed to see, what it's allowed to do, and it's going to be something that knows that seamlessly. So let's start with Siri, because Siri is something that, for better or worse, we all understand. Siri has been bad for so long that it's become a punchline. You can ask it something normal and half the time it either misunderstands the question or gives you a web search that makes you wonder why you bothered speaking out loud. 
So the easy headline for a long time has been: Apple's finally trying to do something with Siri, we don't know if it's actually good or not, we have a little skepticism, but Apple's relaunching Siri effectively. And I get where that story exists. Apple itself is talking about Siri as a conversational AI assistant, promising more natural conversations, richer answers, a dedicated Siri app, and that's all part of the story, and it may well work. I've gotten my hands on it a little bit, I've played with it, but I don't think Siri becomes ChatGPT is the story here. I think the story here is that Apple is trying to make Siri useful by connecting it to the context in your life. Like, when is my mom landing on the plane requires context from calendar, flight number, email confirmation, whether another family member said they might go pick up mom instead, whether the plane is late or not. So the challenge for Apple is to find a way to privately and securely connect Siri to where the context lives on your phone: photos, calendar, notes, email, app state, screen, and so on. And if Siri can do that, Siri's intelligence level doesn't have to be super high for Siri to be incredibly useful. Ultimately, the question of Siri's capability may be the wrong one, and Apple's answer is not a capability answer. It's a context answer. It's an answer about where intelligence lives, and they're trying to push it as close to your systems as possible. So on-device processing is Apple's goal wherever possible, and then private cloud where it's not. One of the things that's really interesting from a product shape here for Apple's solution is that Apple is essentially saying your assistant gets better when it's close to you. And very conveniently, when it's close to you, they can construct a privacy architecture that means it's only yours. That's a consumer answer to this context problem, an answer where Siri doesn't have to be that smart to use your phone to be extremely useful. 
So suddenly, instead of Apple's advantage coming from the app store ecosystem or from the hardware, Apple's advantage comes from the fact that we have Apple products and we have context that lives inside the iPhone, and Apple can access that context in ways that are very useful to us. Keep that in mind as we walk over to the work side and talk about Claude Tag. Now, the Anthropic product announcement is pretty plain on the surface. Claude Tag starts in Slack. A team can grant Claude access to selected channels, to tools, to data, to code bases. It can tag it in, and then Claude just works through tasks and stages as it's tagged in, and it can respond in the thread. It can remember relevant information from channels it's in, and it can operate inside particular permission scopes, particular spend limits, particular logs it can touch. That sounds like a Slackbot, but don't say that too fast, because Slack has had bots for a long time. The interesting thing is that Anthropic is trying to put the assistant inside your team's context. On your phone, the context is private and messy because it's your life. In a company, the context is shared and permissioned and political and stale and half written and in six places. The thing that's interesting about all of this is that work is happening in those messy places, and for a long time AI has been kind of separate from that except in a few instances. Devin has been very successful with this from a coding perspective, but there's not a lot of great off-the-shelf instances for really intelligent AI coming into that kind of messy context. So when Anthropic says Claude Tag can build context over time, that is not only a real claim, but the heart of where the company is going to go. 
It's a very powerful and dangerous statement, because the more useful Claude becomes in Slack, the more it needs access to the messy stuff companies are bad at governing, like engineering decisions and customer tickets and pricing debates and people information. Anthropic knows this, which is why the launch language spends so much time on scopes and permissions, on admin controls, on channel-defined memories. They recognize that they're going to have to earn that trust, because if you put an AI teammate in Slack and it breaks boundaries, you've created a context leak. You've created a corporate liability. So what Anthropic is saying is you can trust us with this context because you're in charge the whole time. I think Claude Tag is a much better signal than most out there of this whole AI co-worker phenomenon, because we have a lot of startups in this space. One of the things Anthropic is doing very intentionally here is saying: you fed us formal context through prompts, through Cowork, through Claude Code for a while, now trust us with informal context and enable us to be a co-worker that's more useful as a result. No other company can say that in the same way. This is Anthropic doing for work what Apple is doing for your phone.
Now let's bring in Codex. A Codex study is easy to dismiss, it doesn't feel like news, but I think the Codex paper is really useful in this conversation because software is showing us the assistant context problem in its cleanest form. Codex is a piece of software, and the study that was released essentially looks at how actual employees at OpenAI chose or did not choose to adopt Codex over time, and what they used Codex for. In other words, what context did they trust Codex with? This is fascinating to me because you might think that at OpenAI it's a requirement and everyone's mandated to use Codex. That wasn't how it worked. Codex had to earn everyone's trust, first with engineers and then with other knowledge workers at OpenAI.
The thing that matters most to me when I read this study is that even at one of the most AI-native companies on the planet, you still have to think about where you trust a particular AI application with context, and it's not a zero-to-one light-switch flip. But it is true that you can see tipping points, and one of the tipping points that's evident in the data from OpenAI, and that I have seen personally, is that Codex got a lot more useful in the last couple of months after 5.5 was released. You can see that in the adoption data, which shows that adoption of Codex in non-technical circles at OpenAI skyrocketed after 5.5. The thing that stands out to me when you put that in the context conversation we've been having is that Codex, with 5.5, earned the trust to get legal stuff, recruiting stuff, sales stuff, all of that dirty context fed into it, the way it earned trust with engineers for code. There's a lot more in that study, I encourage you to read it, I can link it.
Codex is doing the opposite of Claude Tag. If Claude Tag is basically saying you work in Slack, so tag Claude in, Codex is saying your work is sensitive, your work is important, make sure you point Codex at the local files you care about for that work, and Codex can take care of the rest. So that's a frame that has Codex as your launchpad, Codex as your headquarters, whereas Claude's frame is more, let Claude come to where you already are and you can give it the messy context. In both cases there's some mess, but Claude is saying they can tackle the human conversation and the context and still do useful work, and Codex is saying give us the files, give us the jobs, and we can produce great outputs for you, whether you're in legal or sales or HR. I love that distinction. It's not that I think OpenAI won't release a tagged Codex soon, these labs tend to copy each other. It's that I think it shows the difference in product shape around context that these two labs have.
Claude has always been a "we come to you, we wrap our interface around you" kind of product. Claude Code was really exciting and Cowork was really exciting, partly because they basically said just type what you want into the terminal, type what you want into Cowork, and we will just take care of it for you. Now they're taking the next step into Slack. Codex, by contrast, works in a sandbox: it's just going to do the work there and then you'll get an output. It's almost like bring your wheelbarrow of work and let us do the work and then we'll give you an output. And it's gotten much more wide ranging as computer use has come in over the last couple of months, and that's made it much more useful, you can see that in the study, but it's still fundamentally a file-shaped tool, and Claude is kind of a chat-shaped tool. I realize that's a gross simplification, because both of them tackle files, both of them do chat. So I'm not saying it's one or the other, it's not a light bulb on-off conversation. Claude has for a long time thought of the problem of context as conversational in the way they've designed their product, and Codex for a long time has thought about the problem of context in terms of files, and it's been a file-shaped answer. You can still see the legacy of that in these moments this week.
Now this brings us to the next OpenAI story, which is around this ChatGPT 5.6 delay. If a frontier model spends the next few weeks or months in a restricted preview, which it looks like almost all of them will, the rest of the world does not pause and wait. Companies still have Claude, they have OpenAI models, and they now have GLM 5.2, and they'll have whatever new open source model is coming right after that, maybe a new DeepSeek, who knows. They will have the anticipation but not the reality of future frontier work from Anthropic and OpenAI, which, by the way, are still developing and still accumulating knowledge very rapidly internally, they're just not able to release it as fast.
So the government restriction is putting friction at the frontier of intelligence, and it means there is more pressure on Anthropic and OpenAI to release features like Claude Tag, because you have to increase the utility of the intelligence you already have by bringing it closer to context, so that you get more value for the customer. If you can spend two minutes tagging in Claude, or thirty seconds tagging in Claude, instead of ten minutes briefing the AI, you've saved yourself a lot of time. You can add that up. If it becomes a seamless part of your work, then you perceive a lot more utility from that, even if the model didn't get smarter.
What that means is that we are in the middle of a context war, and that is the way you should read the news for the next few weeks. Apple is battling for your personal context, which, because we bring our devices to work, becomes a work context conversation. Anthropic and OpenAI are definitely battling over work context, they just have different shapes for how they do that. One of the most interesting things here is that effectively the government slowdown is giving open source models time to catch up in public, even if they're not catching up in private. So Anthropic and OpenAI may maintain their six, seven, eight month lead over open source models privately, but the gap between the public models we have access to may start to close, because the US government is slowing down frontier model releases. That leads to a tremendous amount of pressure on utility in the context layer. There's going to be a huge war over how quickly and easily an AI model can apply intelligence to that context. So look at Apple and Anthropic and OpenAI as being in the same boat, even though we don't typically put them in that boat. Think about your context. Think about what context you're comfortable giving to these companies. Think about what context you want to retain.
And think about whether you are willing to put the time in to actually build elements of a harness that allow you to decide where to route your context. So when I've talked about open brain and open engine most recently, a lot of what I'm doing is basically building pieces of a harness in public so that you have more choices. I'm not the only one doing it, there are others doing it, it's good work, I'm glad it's widespread, there's a big movement around this. I think it's important that we have choice. We shouldn't have to feel like we're locked in to any given model provider. We should have the option to retain our context and use intelligence in order to get meaningful work done.
I think the more we look at the story going forward, the more it's a story of the intelligence wars shifting into the context wars. It's going to be less about when does 5.6 come out, and it will eventually be less about when does Fable come out, and more about when can we make the next step in applying intelligence so that it's useful. The story of Siri really shows us, pardon me, Apple, that you don't have to have an incredibly intelligent model to have incredible utility. Your model doesn't have to max out the benchmarks, that's not what Siri is going to do, but Siri applied across your context on your phone seamlessly can still be incredibly powerful. So that's the story under the story this week. Pay attention to the context layer, it's going to matter a lot. And if you want more stories under the story, I do them every week, subscribe for more.