Nate B Jones AI Business

GLM 5.2 Is Free And Beats Claude On Most Work. So Why Can't Companies Switch?

Nate B Jones tested GLM 5.2, the free open weight model from Z.ai, and found it genuinely excellent, often better than Claude, for the fat middle of everyday AI work: brochure sites, standard decks, first pass copy, and coding tasks with familiar shapes, what he calls center of distribution work. But he argues companies still are not switching en masse, even though GLM 5.2 runs about 98 percent cheaper, because a model is only a brain in a jar without a harness, and swapping models means rebuilding the harness, tool calls, memory, and prompts from scratch, as Lindy's Flo Crivello discovered leaving Claude for a DeepSeek architecture. He walks through why the tipping point has not arrived: employee pressure for known brands, nobody measuring their own task distribution, harness rebuild cost, and increasingly context lock in from team level harnesses like Anthropic's new Claude Tag, which reads a company's live Slack context automatically. His conclusion is that the real trillion dollar opportunity is not the model war but the last mile, the scarce, well paid work of building agent agnostic harnesses that route between cheap and frontier intelligence, and he frames the next few years as the moment to learn to build that last mile before companies end up renting their own company brain back from the frontier labs.

Published Jun 28, 2026 17:35 · video · 22 min read · Added Jul 1, 2026

At a glance

Nate B Jones tried GLM 5.2, the open weight model from Z.ai (formerly Zhipu AI), and it genuinely impressed him. For the fat middle of everyday AI work (the brochure site, the standard slide deck, first pass copy, coding tasks with familiar shapes), he finds GLM 5.2 often better than Claude and close to free to run. So why isn't every company routing its work to the cheapest model that can do the job? Jones spends the video answering that, and the answer is not about intelligence at all. It is about the harness, the whole system of files, tools, memory, and workflow that a model sits inside, and about who already owns the harness your company runs on. He walks through four frictions: employee pressure for familiar brands, the difficulty of measuring your own task mix, the cost of rebuilding a harness from scratch, and above all the context lock in of new team level products like Anthropic's Claude Tag. Together they keep companies paying frontier prices even when a 98 percent cheaper model can do most of the work. His conclusion: this is not a story about GLM 5.2 being bad. It is a story about the last mile of AI being a trillion dollar problem, and about who is going to get paid to build it.

Why GLM 5.2 blew my mind on everyday work

Jones opens by naming exactly what he is promising: by the end of the video, viewers should know where GLM 5.2 can be relied on, where it can safely replace an expensive model, and where switching models is a trap, because you are not replacing a model call, you are replacing a whole work system. That throughline is the whole video.

GLM 5.2, he says, did not fake impress him. It actually impressed him, because it is not just cheap, it is free if you run your own servers and very cheap on the cloud, and for a lot of normal work it is incredibly good, often better than Claude. By normal work he means the fat middle of everyday AI tasks: a brochure site for a client, a fairly standard PowerPoint outline, a first pass of copy, routine synthesis, and coding tasks tackling familiar problem types, the kind with lots of prior examples and outputs a human can check quickly. He calls this the center of the distribution of AI work, meaning the pattern has been tried with models millions of times before, the answer shape is normal, and the output is easy to inspect. How many brochure sites has anyone seen, he asks. In that world, GLM 5.2 is fast, cheap, easy, and extremely high quality, higher quality than Claude, and for a lot of those tasks it is not just good enough, it is honestly the best model in the world at center of distribution work, especially anything where front end taste matters.

To be clear, this is not a video about GLM 5.2 being bad. Jones calls the model incredible, yet it is still not his daily driver, and the rest of the video explains why. A lot of companies he knows are struggling with exactly this tension: they want to move toward a generic router that sends work to the cheapest model available, but that is not easy to do in practice.

Dimension	GLM 5.2	Claude (frontier)
Cost	free on your own servers, very cheap on the cloud, about 98% cheaper than Claude	expensive at scale; one engineer burned $80,000 in tokens in a week
Center of distribution work	brochure sites, standard decks, first pass copy, familiar coding: often the best model in the world here	solid, but you are paying frontier prices for common patterns
Edge of distribution work	not what it is tuned for; not the video's claim to make	this is where frontier pricing earns its keep
Own harness	shipped with a Codex clone harness, a first stab	Claude Code, and now Claude Tag, a team level Slack harness
Context lock in	none yet, still building the last mile	Claude Tag reads a team's live Slack context automatically, hard to rip out
Switching cost for a company	needs a rebuilt harness: new prompts, memory, tool calls, from scratch	zero, if your stack is already built around it

Figure 1. Jones's own reading of where GLM 5.2 stands next to Claude. It is not that GLM 5.2 is a weaker model, it is that price and everyday quality are not the only variables in a company's decision to switch.

Cheap AI is here, and frontier releases are slowing

Cheap AI, Jones says, is not a theory anymore, and it is going to keep arriving, in part because the US government is now slowing down frontier model releases. The next major frontier model, numbered 5.6, is the latest to be affected, apparently set to be released customer by customer, which he reads as code for nobody knowing when it is actually coming. For the first time, there is no defined cadence for future frontier releases, even though the labs are still doing phenomenal work training and reinforcement learning their models. That is going to push more of the conversation toward open source, and a lot of that conversation is frankly about moving down the cost curve, because frontier model costs get expensive fast. He cites a story making the rounds: one engineer spent $80,000 in token costs in a single week. When people are spending tens of thousands of dollars a week on tokens, there is tremendous incentive to make cheaper models work.

So why isn't there a tipping point away from the frontier labs? Why are Anthropic and OpenAI still growing revenue like crazy when incredibly good, incredibly cheap open models exist? Jones lists the reasons, drawn from conversations with engineers and leaders at companies actually living this.

Why companies still aren't switching to open models

The first reason is the ergonomics of work. If you already have a frontier model at home on your phone, you just want that access at work too. There is real employee pressure around Claude and OpenAI that simply does not exist for open source models, and it is not small: when people vocally say a tool will help their work, overburdened IT departments tend to listen.

Center of distribution vs edge of distribution tasks

The second reason is harder to see: it is genuinely difficult to correctly figure out whether your task load is weighted toward the center of the distribution or the edge of it. If your work is edge weighted, novel, high stakes, hard to verify, you actually do want frontier models. If it is center weighted, common patterns with lots of precedent, open source models are going to be really, really good. But people are not used to measuring their own work this way, individuals aren't, teams aren't, and almost no company has properly asked the question of what its own distribution of tasks actually looks like.

Figure 2. Jones's center of distribution versus edge of distribution frame. Most of the world's knowledge work, by definition, sits in the wide common center, which is exactly where a cheap, high quality model like GLM 5.2 excels. The tails, the genuinely novel or high stakes work, are where paying for a frontier model still earns its keep. Almost no company has actually measured which shape its own task load has.
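
One way to start answering the question of what a task load looks like is to tag a sample of recent tasks and count them. The following is a minimal sketch in Python, not something shown in the video: the `Task` fields, the `bucket` rule, and the sample log are all assumptions made for illustration, loosely based on the three traits Jones names (precedent, ease of verification, stakes).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    has_precedent: bool   # the pattern has been tried many times before
    easy_to_verify: bool  # a human can check the output quickly
    high_stakes: bool     # a mistake would be costly


def bucket(task: Task) -> str:
    """Label a task 'center' or 'edge' (hypothetical rule, not Jones's own)."""
    if task.has_precedent and task.easy_to_verify and not task.high_stakes:
        return "center"
    return "edge"


def distribution(tasks: list[Task]) -> dict[str, float]:
    """Return the share of tasks falling in each bucket."""
    if not tasks:
        return {"center": 0.0, "edge": 0.0}
    counts = Counter(bucket(t) for t in tasks)
    return {label: counts[label] / len(tasks) for label in ("center", "edge")}


# Invented sample log, for illustration only.
sample_log = [
    Task("brochure site", True, True, False),
    Task("standard deck outline", True, True, False),
    Task("first pass copy", True, True, False),
    Task("novel pricing model", False, False, True),
]

shares = distribution(sample_log)
print(shares)  # {'center': 0.75, 'edge': 0.25}
```

In practice the hard part is the tagging itself, which is the measurement Jones says almost no company has done; the counting is trivial once the labels exist.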

Lindy rebuilt its whole harness to leave Claude

The people who have gone furthest on this, Jones says, are folks like Flo Crivello, who leads the Lindy team and very publicly wrote up his journey moving to a DeepSeek architecture away from Claude. Crivello saved a lot of money doing it, but he was also honest that his team had to rewrite its harness essentially from scratch around DeepSeek. They could not just take their existing systems for working with Claude, all their prompts, all the ways they handled memory, all their tool calls, and automatically lift and shift them onto the new model. It does not work that way. These models need their own harnesses.

Crivello was incentivized to do that work because Lindy literally serves AI as a service: if he can deliver a cheaper and more effective service, the savings flow straight to his margin, and that matters enormously. For companies using AI internally, for coding or back office automation, the ROI is not nearly as clear, and the incentive to make the jump is weaker. Jones says he has seen this from entrepreneurs he knows personally: the ones actually making the switch, and dealing with different tool calling conventions and different memory architectures, are the ones with a clear, direct ROI on a specific AI product already in market. For everyone else, the incentive is not as strong, so there is no equivalent push to wade through the harness rebuild.

A model is a brain in a jar without a harness

This is the idea Jones wants viewers to take away above all: a model can be an incredible brain in a jar, and it just is not useful to you without a harness. That is why he pays close attention to harness innovations, and he names three that are top of mind right now.

First, GLM 5.2 shipped with its own Codex clone harness. That tells him open source model makers are realizing they need to deliver harnesses too, not just weights, and he expects more innovation in that direction. Second, Codex itself is starting to publicly call out that you can use the Codex harness without using any OpenAI model at all, which is notable because it gives OpenAI a different path to value, and stickiness, than just being the default model inside it. Third, Anthropic is not sitting still either: they launched Claude Tag this week, and Jones calls it an incredibly sticky product, a team level harness at exactly the moment the industry is trying to figure out how to turn individually productive AI use into team productive AI use.

GLM 5.2 ships with its own Codex clone harness, an open model maker's first attempt at owning the last mile, not just the weights.
Codex starts publicly telling users its harness works without any OpenAI model underneath it, a different, stickier path to value than being the default model.
This week Anthropic launches Claude Tag: tag Claude in a Slack channel and it works the job, a team level harness rather than an individual chatbot.
The same week, the US government's slowdown on frontier releases leaves the next major model, 5.6, shipping customer by customer with no public cadence.

Claude Tag and the rise of team level harnesses

Team level harnesses are where the energy is going, Jones argues, because so much of the value we have gotten from AI so far is individually productive, one person, one chat window, not team productive. Companies are trying to figure out how to turn individual productivity into something the whole team benefits from, and Claude Tag, literally just tagging Claude in a Slack channel, is one of the first examples of a sticky, viral, consumer facing team harness. An ordinary knowledge worker does not need to know the phrase "team harness." They just tag Claude and it works.

But look at it strategically from Anthropic's side, Jones says. They are no longer just capturing engineers who choose to use Claude Code. Now they are capturing every knowledge worker in Slack, reading all the messy context that lives there, the context nobody has ever managed to codify, and feeding it into Claude automatically. Within privacy policy limits, that is context Anthropic's systems get to learn from, long term, in the context of that specific company, and it starts to let Claude own the harness itself in a way no company can easily walk away from.

Why you can't rip out a model that owns your context

Here is the trap, stated plainly. Say GLM 5.2 really is roughly 98 percent cheaper than Claude and about as good on most tasks. It would be entirely rational to build a routing system that sends most work to GLM 5.2. Except: are you going to have Claude Tag on that stack? Is that convenience going to be there? Are you going to have to restart the job of giving your new AI the company context Claude already acquired automatically inside Slack?
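
The cost side of that trade-off is simple arithmetic. The sketch below is an illustration, not a figure from the video: it takes the roughly 98 percent discount Jones cites and the $80,000 weekly token bill from the story he mentions, and assumes, purely for the example, that 80 percent of the work can be routed to the cheaper model.

```python
def blended_cost(frontier_cost: float, cheap_discount: float, share_routed: float) -> float:
    """Cost after routing `share_routed` of the work to a model that is
    `cheap_discount` cheaper (0.98 means 98 percent cheaper)."""
    if not 0.0 <= share_routed <= 1.0:
        raise ValueError("share_routed must be between 0 and 1")
    cheap_cost = frontier_cost * (1.0 - cheap_discount)
    return share_routed * cheap_cost + (1.0 - share_routed) * frontier_cost


# $80,000 and 98 percent come from the video; the 80 percent share is an assumption.
weekly = blended_cost(frontier_cost=80_000, cheap_discount=0.98, share_routed=0.80)
print(f"${weekly:,.0f} per week")
```

Under those assumptions the bill drops from $80,000 to roughly $17,280 a week. The calculation leaves out exactly what Jones is pointing at: the cost of rebuilding the harness and of re-acquiring company context.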

Jones frames this with an old business truism: we have taught companies for decades that data is alpha, the edge you have if you are serious. If that is true, then handing all of that data over as context to a frontier model provider, even a scrupulously ethical one with a strong privacy policy, means you are effectively renting your own context back from that provider. Claude sits inside your Slack as a team level harness, incredibly close to everything your team does, and that proximity makes it nearly impossible to rip out, no matter how cheap a GLM 5.2 class model gets.

Jones thinks the GLM 5.2 team knows this, which is exactly why they shipped a harness, a Codex like interface, alongside the model. It is a first stab, but the industry has to go much further, because the companies that most need to build their own harness generally cannot afford the AI talent to do it. That talent is scarce enough right now that it can charge almost anything, and it usually ends up at a hyperscaler or another large company instead. So the only companies actually building their own last mile harnesses and auto routers today tend to be the ones that can afford that scarce AI talent in the first place.

Figure 3. The friction stack Jones describes for why the obvious, rational switch to a much cheaper model keeps not happening. Each layer alone is survivable; stacked together, plus a lab's team harness reading your live company context, they add up to a genuine trap.

The harness talent shortage is a builder's opening

Jones frames all of this as, at bottom, not a story about intelligence. It is a story about the last mile in AI, and about how scarce the talent to build that last mile really is. He thinks that should be a source of optimism for a lot of viewers: if talent that scarce is what stands between companies and being locked into frontier contracts, there is an enormous opportunity in knowing how to build a harness.

It is not easy work. Knowing how to handle a tool call correctly in GLM 5.2, and how that should differ from handling one in Claude, figuring out how memory should work for that system, figuring out how a system prompt needs to change for a center of distribution model, all of that is serious technical work. Anyone who can do it, or even parts of it, refactoring agentic pipelines so they work well with an open source model, is going to be in high demand, especially paired with the skill of routing: recognizing on the fly which tasks are frontier model tasks and which should go to a cheaper open source model instead. Jones calls this a huge investment theme for companies in 2026 and 2027.
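
A rough sketch of what that separation can look like in code: everything model specific (system prompt, tool call encoding) sits behind an adapter, and a router picks the adapter per task. This is an illustration under stated assumptions, not Jones's design or any vendor's API. The `Adapter` class, the model names, and both tool call encodings are invented placeholders.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Adapter:
    """Holds everything model specific so the rest of the harness stays generic."""
    name: str
    system_prompt: str
    encode_tool_call: Callable[[str, dict], dict]


# Both encodings are invented placeholders, not real vendor formats.
cheap = Adapter(
    name="cheap-open-model",
    system_prompt="Follow the standard pattern. Be concise.",
    encode_tool_call=lambda tool, args: {"function": tool, "arguments": args},
)
frontier = Adapter(
    name="frontier-model",
    system_prompt="Reason carefully about novel constraints.",
    encode_tool_call=lambda tool, args: {"tool_name": tool, "input": args},
)


def route(is_center_task: bool) -> Adapter:
    """Send center of distribution work to the cheap model, the rest to frontier."""
    return cheap if is_center_task else frontier


def prepare_request(task: str, is_center_task: bool, tool: str, args: dict) -> dict:
    """Build a request using the chosen adapter's prompt and tool call encoding."""
    adapter = route(is_center_task)
    return {
        "model": adapter.name,
        "system": adapter.system_prompt,
        "user": task,
        "tool_call": adapter.encode_tool_call(tool, args),
    }


request = prepare_request("Draft a brochure site", True, "write_file", {"path": "index.html"})
print(request["model"])
```

The `is_center_task` flag is passed in by hand here. Deciding it on the fly is the routing skill Jones describes, and memory handling, which he also names, is left out of the sketch entirely.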

Compare that to what the frontier labs are doing with their pricing power. Claude Tag is a great example of how incentives in closed source, high margin models produce genuinely great experiences fast: if you have pricing power, you are heavily incentivized to make your product as convenient and ergonomic as possible, and features like Claude Tag are going to keep appearing rapidly from Anthropic and from OpenAI, because both are incentivized to protect high prices and win that business. Open source model makers do not have the same margin or cash flow to deploy thousands of forward deployed engineers making a harness sing.

So one of the strange, simultaneous truths Jones lands on is that GLM 5.2 can be an incredible model, one that technically savvy entrepreneurs switch to the moment the ROI is clear, and at the same time not an easy model for an average company pulled out of the phone book to actually use. Any given company has to think hard about how to use GLM 5.2 usefully, and it takes a lot less thought to just sign a frontier model contract that slots into existing workflows. That last mile, Jones says, is a literal trillion dollar problem, and one of the biggest open questions right now is whether the talent to build it will scale fast enough for businesses to tackle it without paying so much they cannot afford it. He does not know the answer, but expects it to become clear in the next three to six months, especially with the US government's effective pause on frontier releases still in place. For anyone in an agency or consulting, he calls this a golden goose moment: you can promise real token savings as part of an ROI pitch, as long as you can actually deliver the harness refactor without breaking quality, which he is careful to say is not a trivial task.

Take the last mile seriously before you rent your brain

Jones closes by refusing to let anyone dunk on GLM 5.2 for being good at center of distribution work, because by definition most of the species' knowledge work is center of distribution work. A model that is genuinely excellent at that is worth taking seriously, and taking it seriously means taking the last mile, and the harness it requires, seriously. That is much of what he has been doing publicly: articulating what it takes to build a harness out of pieces he calls open skills, open brain, and open engine, ideas he has talked through elsewhere on his channel, so that the pieces can be assembled in a way that is agent agnostic and model agnostic. The goal is to install those pieces and easily take advantage of whatever intelligence is on tap, whether Claude, Codex, Hermes, or whatever system comes next.

He is direct about the stakes: this is a moment for builders, and there is a lot of custom work ahead for individual companies. But if companies do not start down that path, they are, in his words, going to end up renting their own company brain and company context back from the frontier model providers. Those providers will have that context, and they will use it to keep improving their systems, making them more convenient and more sticky, and companies will have no choice but to keep using them.

Jones calls this a genuinely pivotal moment: the firm has never before faced a situation where its own brain is effectively on rent, and that is exactly what tools like Claude Tag put on the table. The tool is genuinely useful, he stresses, which is precisely what makes it dangerous. His advice, whether you run a large company or a one person agency, is to think seriously about whether you want to rent your context and intelligence long term. Ask yourself if you actually know the distribution of your own tasks. Ask whether you have access to the technical talent to build the last mile. Ask which specific task sets would save you real money in tokens if you moved them. Most people, he says, never sit down with pencil and paper and actually answer those questions, and he has posted a more detailed version of that question set on his Substack for leaders working through it. This is a serious moment for open source. GLM 5.2 opened the door, and it is up to each of us to decide how we take advantage of it.

Key takeaways

GLM 5.2 is free to run on your own servers, very cheap on the cloud, and for center of distribution work, brochure sites, standard decks, first pass copy, familiar coding tasks, Jones calls it honestly the best model in the world, often better than Claude.
The reason companies are not stampeding to switch is not intelligence, it is the harness: everything that turns a raw model into usable work, prompts, memory, tool calls, system design, none of which lifts and shifts between models.
Flo Crivello's Lindy switched 100 percent of traffic from Claude to a DeepSeek architecture and saved millions, but only after rebuilding its harness from scratch, because AI as a service gave him a direct, clear ROI that most internal AI users do not have.
A model is a brain in a jar without a harness. GLM 5.2 shipped its own Codex clone harness, Codex is marketing itself as usable without any OpenAI model, and Anthropic launched Claude Tag, a team level Slack harness, in the same window.
Claude Tag turns individual AI use into team AI use by letting anyone tag Claude in Slack, which also means Anthropic's systems are learning from a company's live, messy Slack context automatically, within privacy limits.
Because data is alpha, handing that context to a frontier provider means renting your own context back from that provider; that proximity to your company's real work is what makes a model impossible to rip out, regardless of how much cheaper an alternative is.
The scarce resource is not intelligence, it is the talent to build a last mile harness: figuring out tool calling, memory, and system prompts for a center of distribution model like GLM 5.2 is specialized, valuable, and in high demand.
One engineer reportedly spent $80,000 in token costs in a single week, illustrating why the incentive to find cheaper models is real even though the switch is hard.
The next major frontier model, numbered 5.6, is affected by a US government slowdown on frontier releases and is shipping customer by customer with no public cadence.
Jones's closing warning: without building your own harness, companies risk renting their own company brain and context back from frontier labs indefinitely, a genuinely pivotal moment for how firms operate.

Chapters

0:00 Why GLM 5.2 blew my mind on everyday work
2:22 Cheap AI is here and frontier releases are slowing
3:41 Why companies still aren't switching to open models
4:11 Center of distribution vs edge of distribution tasks
4:53 Lindy rebuilt its whole harness to leave Claude
6:39 A model is a brain in a jar without a harness
7:23 Claude Tag and the rise of team-level harnesses
8:47 Why you can't rip out a model that owns your context
10:36 The harness talent shortage is a builder's opening
14:50 Take the last mile seriously before you rent your brain

Notable quotes

GLM 5.2 did not fake impress me. It actually impressed me because it's not just cheap, and it's very cheap to run on the cloud, it's free if you set up your own servers, and for a lot of normal work, it's incredibly good. Nate B Jones, 0:26

This is the best model in the world at those center of distribution kinds of tasks, especially ones where front end taste is important. Nate B Jones, 0:59

One engineer spending $80,000 in token costs in a week. That's a lot. Nate B Jones, 1:45

A model can be an incredible brain in a jar. And it just isn't useful to you without a harness. Nate B Jones, 6:39

We have taught companies for decades that data is alpha. If data is alpha, what do we think about giving all of that data to a frontier model provider as context? Nate B Jones, 8:47

It's actually not a story of intelligence. It's a story of the last mile in AI, and the fact that the talent to build the last mile in AI is incredibly scarce. Nate B Jones, 9:41

That last mile is literally a trillion dollar last mile in AI. Nate B Jones, 11:44

The firm has never faced a moment where the firm's brain has been on rent. And that is what we're on the verge of with tools like Claude Tag. Nate B Jones, 15:31

Resources mentioned

Nate B Jones, the channel, AI News and Strategy Daily, where this analysis was published.
Nate's Substack, where he has posted a more detailed question set for leaders working through their own harness and task distribution decisions.
GLM 5.2, the open weight model from Z.ai (formerly Zhipu AI) that is the subject of the video.
Claude and Anthropic, the frontier lab and model GLM 5.2 is compared against throughout.
Claude Tag, Anthropic's new team level Slack harness, launched the week of filming.
Codex, OpenAI's coding agent harness, cited for now marketing itself as usable without any OpenAI model underneath.
Lindy, the AI agent startup that switched all of its traffic from Claude to a DeepSeek architecture.
Flo Crivello, Lindy's founder and CEO, who wrote publicly about rebuilding Lindy's harness from scratch to leave Claude.
DeepSeek, the model architecture Lindy moved its traffic to.

Full transcript

I tried GLM 5.2 and it blew my mind. By the end of this video, you should know where GLM 5.2, an open-source model, can be relied on, where it can safely replace an expensive model, and where switching models is a bit of a trap because you're not replacing a model call. You're actually replacing a whole work system. And that's the thing I want to draw through in this video. So, let me start at the beginning here. GLM 5.2 did not fake impress me. It actually impressed me because it's not just cheap, and it's very cheap to run on the cloud, it's free if you set up your own servers, and for a lot of normal work, it's incredibly good. It's often better than Claude. And when I say normal work, I mean the fat middle of everyday AI tasks, right? So, if you're setting up a brochure site for a client, if you have a PowerPoint outline, it's a pretty standard deck, for a first pass copy, routine synthesis, for coding tasks that are tackling familiar problem types in coding, these are tasks with familiar shapes, with lots of examples, with outputs that a human can check quickly. The nerdier phrase for this is that this is the middle of the distribution work for AI. In other words, what you are getting is what someone has tried with models millions of times before, where the answer pattern is pretty normal, and the output is pretty easy to inspect. How many different brochure sites have you seen, right? In that world, GLM 5.2 is incredible. It's fast, it's cheap, it's easy, and it's extremely high quality. It's higher quality than Claude. And a lot of those tasks, I don't think it's honest to say it's just good enough. I think it's more accurate to say this is the best model in the world at those center of distribution kinds of tasks, especially ones where front-end taste is important. And so, this is not a video about GLM 5.2 being bad, even though it's not my daily driver, and I'm going to explain why. GLM 5.2 is incredible, but I'm still not using it every day. 
And in fact, a lot of companies I know are really struggling with the idea that they want to transition to more of a generic router where they can route to the cheapest model available, but it's not actually easy to do in practice. Why is that, right? We're going to talk about why that is, talk about where open source is going, talk about what the shape of work looks like in 2026, and we're going to tie it back into GLM 5.2 and the way we actually need to build to take advantage of models like this. Because cheap AI, it's not a theory anymore. Cheap incredible AI is here. In fact, it's going to be here more and more and more and more because the US government is now slowing down frontier model releases. 5.6 is the latest model to be affected. It's apparently going to be released customer by customer, which is code for we don't know when we're going to get it. For the first time, there is no defined expected cadence for future model releases that are frontier, even though the labs are still doing a phenomenal job training and reinforcement learning their models. And so we're going to have more and more of this open source conversation. And a lot of the open source conversation is frankly about moving down the cost curve, right? Because these frontier model costs are expensive. If you're running a company, they get really expensive. There are stories going around where the numbers are absolutely eye-popping. Like one engineer spending $80,000 in token costs in a week. That's a lot. So if you have that kind of pricing power, if people are spending tens of thousands of dollars a week on tokens, there's a tremendous amount of incentive to make these models work. So why is it so hard? Why are we not seeing a tremendous tipping point away? Why are we still seeing Anthropic growing their revenue like crazy, OpenAI growing their revenue like crazy when these incredibly good models exist? 
Well, there's a number of factors to that, and I want to list them for you so that you can actually understand the perspective. This is based on talking with engineers at companies as well as with leaders. The first one is the ergonomics of work. If you are just trying to get something you've heard about, seen about, you have a frontier model at home on your phone, you just want access to that. There's a lot of employee pressure around Claude and around OpenAI in a way that there just isn't for open source models. So, that's one piece. And it's not small. Like, when people are asking for it vocally saying this will help my work, overburdened IT departments tend to listen to that. Number two, it is actually very, very difficult to correctly figure out whether your task load is center of distribution or edge of distribution weighted. If it's edge distribution weighted, you actually do want the frontier models. If it's center of distribution, the open source models are going to be really, really good because they're common patterns. But people don't. They're not used to measuring their work that way. Individuals aren't, teams aren't. If you're a company trying to figure out what is your model strategy, you kind of got to tackle what is your distribution of tasks? And almost no one has asked that question properly yet. And people are trying to figure out how to measure that. The folks that have gone the farthest, actually, are folks like Flo Crivello, who is leading the Lindy team, and who very publicly wrote up his journey to a DeepSeek architecture away from Claude. And, you know, he saved a lot, etc., etc. But he was also very honest about the fact that the Lindy team had to essentially rewrite their harness from scratch around DeepSeek, and they could not just take all of their systems for working with Claude, all of their prompts, all of the way they handle memory, all of their tool calls, and just automatically lift and shift. It doesn't work that way. 
These models need their own harnesses. He was incentivized to do that because he is literally serving AI as a service, and if he can deliver a cheaper and more effective service that hits his margin, it's tremendously impactful. For folks who are using AI internally for coding or for back office automation, that ROI is not as clear, and the incentive to move is not as clear either. And so, what I have seen, and I have seen anecdotes from this, not just from Flo, but from other folks that I know personally, I know entrepreneurs who are wrestling with this today. The ones who are actually making the jump to open source and dealing with the different system, dealing with the different tool calls, dealing with a different memory architecture, etc., that is tuned around the fact that these are center of distribution models. Those guys or those gals are focused on ROI for a particular AI tool they have in market. Just like Lindy, they see value back in their pockets when they can cut their token costs. And for everyone else, because the incentive is not as strong, you don't have the same commitment to wade through the challenge of building a harness. And that is not a small thing. And one of the things I want you to take away from this video is that a model can be an incredible brain in a jar. And it just isn't useful to you without a harness. And so this is why I pay a ton of attention to harness innovations. And I want to name a couple that are top of mind as we look at GLM 5.2 in context. First, I notice that GLM 5.2 was released with its own Codex clone harness. That's one piece that I pay attention to. It looks like the open source model makers are realizing they need to deliver harnesses as well. And so I would expect more innovation in that direction. I notice that Codex is starting to call out publicly that you can use Codex the harness without using any OpenAI model. That's notable because there's a different path to value for OpenAI there. 
Maybe OpenAI's models are the default, but if they're calling out that they are actually the harness for all of work, it gives them a way to be stickier long term.

Third, the Anthropic team is not just sitting there as all of these developments happen. They launched Claude Tag this week, and Claude Tag is an incredibly sticky product. It is a team level harness, and team level harnesses are where the energy is going because so much of the work we've got is individually productive work in AI. It's not team productive work. And we're trying to figure out how to align our efforts that are individually productive into something that is team productive. Claude Tag, which is just tag Claude, anyone can tag Claude and get work done in Slack, is one of the first examples of a sticky viral consumer team harness. If you're an ordinary knowledge worker at a particular company, you can envision using that as a team harness. And you don't have to know the term team harness, it's just going to work. You tag Claude and it works.

But look at it strategically from Anthropic's perspective. Now they're not just getting the engineers. Now they're getting everybody who's a knowledge worker in Slack, and they're reading all of the messy context that lives in Slack that no one knows how to codify. That context is now getting fed into Claude automatically, and it can be something that the Anthropic team learns from, within privacy policies, long term for Claude in the context of that company, to start to own the harness itself in a way that no company can get away from.

It's an incredibly sticky experience. Think about it. Let's say you know that GLM 5.2 is a lot cheaper, which it is. It's like 98% cheaper or something like that. If it's that much cheaper than Claude and it's just about as good on most tasks, it is rational to build a routing system and assign most tasks to GLM 5.2. Except that, hey, are you going to have Claude Tag?
Are you going to go tag in Claude on that stuff? Is that convenience going to be there? Are you going to have to restart the job of giving this AI context from your company because Claude magically acquired it in Slack and you didn't have to think about it?

We have taught companies for decades that data is alpha. Data is something you have an edge with if you're serious. If data is alpha, what do we think about giving all of that data to a frontier model provider as context? Even if they don't release it into training data, even if the privacy policy is really good and they're behaving really ethically, which I have no reason to think they're not, you are still effectively renting your own context back to yourself, because Claude is going to be in your Slack as a team level harness and is going to be incredibly close to all the work your team does, and it's going to be impossible to rip out. No matter how cheap the GLM 5.2 class models are, how can you rip out the model that's that close to context?

And I think that the GLM 5.2 team knows this. That's why they released a harness, a Codex like interface with their AI. It's a first stab at it. But we've got to get much farther there in tech, because the companies that know they need harnesses generally cannot afford to hire the AI talent to build those harnesses unless they're extraordinary companies. That AI talent is so in demand right now that it can charge anything it wants, and it usually goes to one of the hyperscalers or another large company. And so we're in a dynamic where the only companies that can build their own last mile harnesses, their own auto routers, are companies that can afford the very scarce AI talent to do that.
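To make the routing idea above concrete, here is a minimal sketch in Python of what an auto router and its cost math might look like. Everything in it is an illustrative assumption rather than anything shown in the video: the task categories, the `Task` type, the `route` and `blended_cost` helpers, and the simplification that every task uses the same number of tokens. The only number taken from the talk is the rough "98% cheaper" figure, expressed here as a relative price of 0.02.

```python
from dataclasses import dataclass

# Assumption: prices are relative, with the frontier model at 1.0.
# The talk's rough "98% cheaper" figure gives the open weight model 0.02.
FRONTIER_PRICE = 1.0
OPEN_PRICE = 0.02

# Hypothetical labels a team might assign after auditing its own task mix.
CENTER_KINDS = {"brochure_site", "standard_deck", "first_pass_copy", "familiar_code"}


@dataclass(frozen=True)
class Task:
    kind: str         # hypothetical category from an internal audit
    description: str  # free text, unused by this toy router


def route(task: Task) -> str:
    """Send center of distribution work to the open model, the rest to frontier."""
    return "open" if task.kind in CENTER_KINDS else "frontier"


def blended_cost(tasks: list[Task]) -> float:
    """Spend relative to sending every task to the frontier model.

    Assumes each task consumes the same number of tokens, which real
    workloads will not; it also ignores the cost of rebuilding the harness.
    """
    if not tasks:
        raise ValueError("need at least one task")
    total = sum(OPEN_PRICE if route(t) == "open" else FRONTIER_PRICE for t in tasks)
    return total / (len(tasks) * FRONTIER_PRICE)


if __name__ == "__main__":
    # A made-up workload: 80% familiar work, 20% edge of distribution work.
    workload = (
        [Task("first_pass_copy", "draft landing page copy")] * 8
        + [Task("novel_research", "design an unfamiliar algorithm")] * 2
    )
    print(f"relative spend: {blended_cost(workload):.3f}")
```

With this made-up 80/20 mix, token spend comes out to roughly a fifth of the all-frontier bill. The sketch deliberately leaves out the hard parts the talk is about: knowing your real task mix, and the separate tool call, memory, and prompt work each model's harness needs.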
And so if you actually think through this dynamic with GLM 5.2, how it's possible that at the same time we can have an incredible open source model that we're excited about and also that Anthropic still has pricing power to charge a lot for their tokens even though their tokens are just marginally better, it's actually not a story of intelligence. It's a story of the last mile in AI and the fact that the talent to build the last mile in AI is incredibly scarce.

Which should, honestly, for a lot of you watching, be a source of optimism. If talent is that scarce, where people are ending up locked into contracts with a frontier model provider because they don't know how to build a harness for themselves, wow, is there a lot of opportunity in knowing how to build an AI harness. It's an incredible opportunity right now.

It is not easy to do this work. It's not easy to know how you handle a tool call in GLM 5.2 and how you should do it differently from Claude. Neither is figuring out how memory will work for that system. Neither is figuring out how the system prompt needs to change because it's a center of distribution model. It's a lot of technical work. And if you know how to do that work, or know how to do parts of that work to essentially refactor agentic pipelines so they work with an open source model, you are going to be incredibly in demand. Especially if you pair that with the ability to route tasks, where you can take a task and recognize on the fly that it's a frontier model task and it should go to a frontier model, versus everything else going to a cheaper open source model.

That is going to be a huge investment theme for companies in 2026, 2027, and they're going to keep innovating. Claude Tag is a fantastic example of how incentives in frontier closed source models are giving us incredible experiences. If you have pricing power, you are heavily incentivized to make sure that your experience is as convenient and ergonomic as possible.
And so features like Claude Tag are going to appear really, really fast and really completely from teams at Anthropic, and also from OpenAI, because they're incentivized to keep those prices high and to go after that business. With open source models, you don't have the same margin to work with, you don't have the same cash flow to work with, and you don't have the same incentive to dig in and deploy thousands of forward deployed engineers and really make these harnesses sing.

And so one of the really interesting facts that we come to after all of this is that GLM 5.2 can simultaneously be an incredible model, a model that a lot of entrepreneurs switch to when the ROI is clear and they're technically savvy enough to do it, and also not a model that is easy for a given company that you turn up in the phone book to actually use. Any given company is going to have to think hard about how to use GLM 5.2 usefully, and they're going to have to think a lot less to sign up for a frontier model contract that's going to fit right into their existing workflows.

That last mile is literally a trillion-dollar last mile in AI. And one of the biggest open questions right now is whether we will scale our talent fast enough to enable businesses to tackle that problem set without paying so much that they can't afford it. I don't know what the answer's going to be, but it's a question we will all collectively answer in the next 3 to 6 months. Especially as the US government has this effective pause in place on frontier model releases, and the open source systems are going to continue to be available, we're going to find out whether companies can adjust to the fact that intelligence is 98% cheaper and takes a last mile to build.

Can they actually build that last mile? Can they find teams to build that last mile? If you are in an agency or in a consulting space, this is a golden goose moment.
You have a chance here. You can really go to town and basically promise to save people a ton of money on tokens as part of your ROI proposition, as long as you can deliver that refactor in a way that maintains quality, which is not a trivial task. If it was easy, we wouldn't be having this video.

So, where does this leave us? GLM 5.2 is an incredible model. It is important not to shame a model or diss a model because it's good at center of distribution tasks, because by definition that is most of our work. Collectively as a species, most of our knowledge work is center of distribution, just by definition. And if that's the case, a model that's really good at that is worth taking really seriously.

And if we take it seriously, that means we have to take the last mile seriously. We have to take the idea that we need a harness for that last mile seriously. A lot of what I have been doing in public is starting to articulate what it takes to build a harness, whether it's open skills or open brain or open engine, all of which I've talked about on this channel. How do you start to take these pieces and put them together in a way that is agent agnostic, that is model agnostic, so you can start to install those pieces and actually take advantage of all the intelligence on tap? Whether it's Claude, whether it's Codex, whether it's Hermes, whether it's whatever system you want, whether it's your own iPhone, you should be able to easily build to that last mile. And I know that there's a lot of custom work for individual companies, and that's why I keep saying this is a time for builders.

But if we don't start down that path, we're essentially going to be renting our company brain and company context back from the frontier model providers. And they're going to have it. And they're going to be able to use it to continue to improve their systems, and make them more useful, and they'll be incredibly convenient, incredibly sticky products. And what are we going to do?
We're going to have to use them.

So, this is a very pivotal moment for corporations. The firm has never faced a moment where the firm's brain has been on rent. And that is what we're on the verge of with tools like Claude Tag, which are incredibly useful. I'm not saying they're not useful, they're very useful. That's exactly the dangerous thing.

So, I would encourage you, even if it's a tiny company, let's say you're building your own agency, you're an individual entrepreneur, think seriously, just as you would if you're a larger company leader, think seriously about whether you want to rent that context and intelligence or not. Think seriously about where you want to go with your context long term. Ask yourself, do you have an idea of the distribution of your tasks? Do you have access to technical talent that you can use to build out that last mile? What are the task sets that you would want to assign that would save you a ton in tokens?

A lot of people don't sit down and get pencil and paper and actually ask themselves those kinds of questions. And I have a whole sort of question set that's in more detail that I've been going over with leaders. I put that on the Substack. But this is a really serious thing. This is a moment for open source. GLM 5.2 opened that door for all of us, and it's going to be up to us to see how we take advantage of it. Good luck with that. Cheers. Bye.