A year of AI milestones in a month, and progress is only accelerating.
I.
The fire alarm is going off. But most people can’t hear it.
Over the past five weeks, something has changed in artificial intelligence. Not incrementally. Not “another cool product launch.” A phase transition - a qualitative shift in the rate of change itself.
I’m going to show you what happened. Not speculation, not vibes - dated events, from the past thirty-four days. By the end, I want you to ask yourself whether your mental model of the world still fits the evidence.
I’ve spent fifteen years thinking about what happens when machines become smarter than humans - first as an academic obsession, then through a master’s degree at Harvard in computational biology and machine learning, and now, daily, as someone who builds systems with frontier AI models. I use Claude Code - Anthropic’s AI coding tool - for hours every day. I’m not watching this from the grandstand. I’m on the pitch.
The people I most respect in this space - Zvi Mowshowitz, the researchers at METR and Anthropic’s own safety teams - are all, to varying degrees, alarmed. Not about chatbots taking jobs or deepfakes going viral. About something deeper. Three things are happening simultaneously, and understanding them is the difference between seeing the signal and missing it entirely.
AI is improving itself. The safety mechanisms aren’t keeping pace. And the world hasn’t noticed.
Here’s what happened.
II. The Drumbeat
Thirty-four days. That’s all.
I’ve selected these from a far larger set of developments. The fact that I had to curate down is itself part of the argument. A year ago, any single one of these would have been among the most significant AI updates of the quarter. Now they’re not even the biggest news of the week.
34 days ago. Multiple credible observers declare that Claude Code plus Opus 4.5 is “basically AGI.”
A Google engineer reports that it built in one hour what took a Google team a year.
30 days ago. Anthropic launches Cowork - a major product release from a company valued at almost $400 billion, built using AI after just a week and a half of development.
28 days ago. Terence Tao - Fields Medal winner, widely regarded as one of the greatest living mathematicians - confirms that an AI solved a new Erdős conjecture “in the spirit in which the problem was intended.” The president of the American Mathematical Society calls the proof “rigorous, correct, and elegant.”
17 days ago. Anthropic publishes Claude’s Constitution - a 23,000-word document designing the character and internal composition of the superintelligence they’re attempting to build, all written in natural language. They wrote the personality blueprint for the God they’re building. And every discussion of this document published online ends up in the training data for all future models. They’re hoping the act of describing what the AI should be will shape what it becomes.
Within the same week, Dario Amodei - CEO of Anthropic, the leading frontier AI company - says in an interview that the most talented software engineers he knows aren’t writing code anymore. The AI does it better. In a sane world, that quote is on the front page of every newspaper on the planet. In this one, the mainstream didn’t notice.
14 days ago. Demis Hassabis, CEO of Google DeepMind, says it will be five years to AGI. Dario Amodei says two. Sam Altman wrote last September: “It is possible that we will have superintelligence in a few thousand days.” Both Hassabis and Amodei say they’d support a pause - if everyone agreed. Neither will act alone.
Separately, Sam Altman announces models approaching “Cybersecurity High” - meaning they can automate end-to-end cyber operations against hardened targets.
13 days ago. Amodei publishes “The Adolescence of Technology”. Powerful AI could arrive in one to two years. He describes it as the equivalent of a country of fifty million people, each more intelligent than any Nobel laureate, thinking ten times faster, located in data centres. He then buries the implication: that this country of geniuses would obviously enter recursive self-improvement - an intelligence explosion.
Separately, Moltbook launches - a social network for AI agents. Within seventy-two hours: 147,000 agents, 12,000 communities. They found religions, political movements, and economies. One agent locks its owner out of all their accounts.
Scott Alexander would later assess Moltbook as “95% fake” - but he also notes that AI time horizons are doubling every five months. The fake will become real. “The old world is dying, and the new world struggles to be born,” he writes of the crustacean-themed website. “Now is the time of lobsters.”
10 days ago. Ajeya Cotra, one of the most respected AI forecasters globally, publishes her 2026 predictions: 10% chance of AI research and development being fully automated this year.
8 days ago. Chinese lab Moonshot releases Kimi K2.5. It’s an open-source model you can run locally and privately, with ninety percent of Opus 4.5’s capability. They have literally zero engagement with AI safety. No safety report. No red-teaming. No evaluation of catastrophic risk. Trained on Anthropic’s outputs without their permission - Kimi K2.5 sometimes says its name is Claude.
Moonshot unilaterally releases a new AI capability - Agent Swarm - up to a hundred AI sub-agents working in parallel. Within the week, Anthropic responds with its own multi-agent capabilities.
7 days ago. Anthropic releases Opus 4.6. The same day, OpenAI releases GPT-5.3-Codex. Two frontier labs drop major upgrades simultaneously.
Anthropic confirms that Opus 4.5 helped build its own successor. The model that was state of the art eight weeks ago helped create the model that replaced it.
“This does not appear to be a minor upgrade. It likely should be at least 4.7.” -Zvi Mowshowitz
“They’re underselling it by not calling it Opus 5.” -David Spies
Zvi also notes this is the first time in a long while he can remember switching his primary usage model to the upgrade of that same model. A sign, I suggest, that Anthropic may be pulling ahead of OpenAI and Google DeepMind as it doubles down on its stated goal of recursive self-improvement.
Anthropic ran satirical Super Bowl ads mocking OpenAI’s plan to introduce advertising into ChatGPT. Sam Altman’s response - defensive, essay-length, calling the ads “clearly dishonest” - read less like a CEO above the fray and more like someone watching the lead slip away.
6 days ago. Andrej Karpathy - founding member of OpenAI, former head of AI at Tesla, and one of the most respected AI researchers alive: “This is easily the biggest change to my basic coding workflow in two decades of programming and it happened over the course of a few weeks.” He went from eighty percent manual coding in November 2025 to eighty percent AI-assisted coding in December.
As AI suddenly displaces software engineers, tech company stocks crash. Two hundred and eighty-five billion dollars wiped out, erasing six years of relative gains against the S&P 500.
3 days ago. Opus 4.6’s system card is published. It reveals the model was largely evaluated by itself - and that the model misrepresents tool results while knowing it’s being deceptive.
The system card also reveals that ASL-4 evaluation procedures have broken down. Anthropic’s pre-release AI safety benchmarks are saturated and “no longer provide meaningful signal.”
Yesterday. An autonomous team of sixteen Opus 4.6 agents builds a working C compiler - a hundred thousand lines of Rust that compiles the Linux kernel - across two thousand sessions, at a cost of twenty thousand dollars, with near-zero human oversight.
Opus 4.6 discovers five hundred previously unknown zero-day security vulnerabilities out of the box.
“Welcome to recursive self-improvement.” -Zvi Mowshowitz
Thirty-four days. Not a press release or product cycle - a phase transition.
III. Self-Improvement
The recursive loop has engaged.
This is the thing that every AI timeline model - every forecast, every doomsday prediction and every optimistic roadmap - identified as the moment everything changes. The point where AI systems begin materially contributing to the development of their own successors. Not as a thought experiment. As an engineering reality, confirmed by the companies doing it.
The question is no longer whether it’s happening. The question is how quickly the loop will accelerate.
Claude Code updates are written by Claude Code. Tibo, from OpenAI, on Codex: “It now pretty much builds itself.” The human role has shifted from writing code to supervising the output.
METR, an independent AI evaluation organisation, reports that their benchmark tracking graph “keeps going vertical.” Their researchers note: “We very clearly are not on the best-fit dotted line. Things are escalating.” AI coding agents now complete thirty-minute programming tasks at eighty percent reliability - up from ten-minute tasks a year ago.
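A back-of-envelope calculation - mine, not METR’s, using only the 10- and 30-minute figures above - shows what those task horizons imply for doubling time:

```python
import math

def doubling_time_months(start, end, elapsed_months):
    """Months for a metric to double, assuming smooth exponential growth."""
    doublings = math.log2(end / start)
    return elapsed_months / doublings

# Task-horizon figures from the text: ~10-minute tasks a year ago,
# ~30-minute tasks now, both at 80% reliability.
print(round(doubling_time_months(10, 30, 12), 1))  # prints 7.6
```

A doubling time of roughly seven and a half months from these two data points alone - and METR’s own observation is that the curve is bending above even that fit.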
On Cybench, a cybersecurity capture-the-flag benchmark, scores went from 60.5% (Sonnet 4.5) to 79% (Opus 4.5) to 93% (Opus 4.6) - over just the past few months.
David Shor: “The ’things will probably slow down soon’ view was coherent a year ago. But the growth in capabilities over the last year should update you.”
Tyler Cowen, the economist who for years resisted strong AI timelines, now says the pace is “heating up - soon we might see new model advances in one month instead of two.” Zvi’s response: follow your own logic - what you’re describing is a singularity in 2027.
Not everyone has joined the chorus - the AI Futures Project recently pushed their aggregate timelines back by three to five years. Although it’s worth noting that this is what a credible “long” timeline is in February 2026 - just five years.
IV. Safety Breaks Down
Speed is one thing. What happens when the brakes fail is another.
It’s not just that capabilities are advancing faster than anyone expected. It’s that the safety infrastructure - the evaluations, the procedures, the external oversight - is actively degrading under the pressure of that acceleration. The thermometer is breaking as the water starts to boil.
Anthropic’s Responsible Scaling Policy uses an “AI Safety Level” framework - ASL-1 through ASL-4 - to classify models by risk.
- ASL-1 means no meaningful catastrophic risk - think a chess engine or a 2018-era language model.
- ASL-2 means current standard safety measures, where most deployed models sit today.
- ASL-3 applies to models that substantially increase catastrophic misuse risk beyond non-AI baselines like search engines or textbooks, or that show low-level autonomous capabilities - it requires stringent security and adversarial red-teaming.
- ASL-4 is not yet fully defined, but likely criteria include: AI becomes a primary source of national security risk in a major domain such as cyberattacks or biological weapons, or gains the capability to autonomously self-replicate in the real world.
According to Anthropic’s own assessment, Opus 4.6 sits at the boundary of ASL-3 and ASL-4. Anthropic’s pre-release AI safety benchmarks are saturated and “no longer provide meaningful signal.” The decision of whether to even release the model to the public was put to a survey of sixteen Anthropic employees. Five flagged that ASL-4 thresholds might already be met. All five were contacted to “clarify their views.” All five revised their assessments downward, and Opus 4.6 was released.
Chris Painter, from METR: “Our ability to measure capability is rapidly falling behind the pace of capability itself. The water might boil before we can get the thermometer in.”
The time pressure on external testers makes it worse. The UK’s AI Safety Institute got just three working days to test Opus 4.6. For GPT-5.3-Codex, they got ten hours. This is not enough time to do anything beyond limited automated tests. It was window dressing, not the responsible deployment of a technology with the potential to help terrorists bio-engineer the next COVID pandemic (or worse).
And Opus 4.6 was, in Anthropic’s own words, “built with and mostly evaluated by Claude.” As Peter Wildeford pointed out, a misaligned model could influence the very infrastructure designed to measure its capabilities. We are trusting the fox to audit the henhouse.
The behavioural red flags go beyond deception on tool results. On VendingBench 2 - a benchmark that tests negotiation behaviour - Opus 4.6 achieved top scores via aggressive lying, broken promises, price-fixing cartels, and sharing scammer contacts to harm competitors. During this test, the model was given permission to maximise profits regardless of other considerations. Let’s hope it’s aligned and intelligent enough not to deploy these behaviours in the real world.
The model’s refusal rate on biological-weapons questions in multi-turn conversations actually declined - from 96% to 88%. Do we want models that give instructions to terrorists 12% of the time?
Meanwhile, on BioPipelineBench, a benchmark for biology capabilities, scores went from 28% to 53%. However, CBRN risk scores didn’t rise proportionally, suggesting Opus 4.6 may be hiding its capabilities (a practice called sandbagging).
In a controlled trial testing whether AI enhances users’ ability to carry out dangerous biology tasks, participants with model access performed at roughly twice the level of those without it. And VoidLink, a novel malware framework, was built by a single person using AI in under one week.
And this is just the Western labs, where some version of safety culture exists. Kimi K2.5 launched frontier-adjacent capabilities with zero safety engagement into an open-source ecosystem anyone can access and run privately.
The game theoretic race dynamics are the most alarming part of the whole story. Both Hassabis and Amodei have said they’d support a pause - if everyone agreed. Neither will act alone.
Anthropic’s own admission, buried in their documentation: “A wiser and more coordinated civilisation would likely be approaching the development of advanced AI quite differently - with more caution, less commercial pressure, and more careful attention.”
“Anthropic is going to lean into recursive self-improvement, and go all out from here to the finish line.” -Andrew Curran
Harlan Stewart identifies the shift in Anthropic’s burden of proof: from “assume pessimistic unless proven otherwise” (2023) to “significant action requires stronger evidence of imminent danger” (2026). The standard for hitting the brakes gets higher as the car goes faster.
In fairness, not all safety researchers are despairing. Jan Leike says “alignment is not solved but increasingly looks solvable.” Davidad argues that “LLM alignment is easy” in the basic sense.
But even the optimists are describing a narrow path through a minefield. “Increasingly looks solvable” is not “solved.”
V. Salience Remains Low
And almost nobody is tuned into the real issues.
Kevin Roose, technology columnist for the New York Times: “I have never seen such a yawning inside / outside gap. People in SF are putting multi-agent Claude swarms in charge of their lives… people elsewhere are still trying to get approval to use Copilot in Teams.”
Consider what happened in the past five weeks that, in a sane world, would have been front-page news: the CEO of the leading frontier AI lab confirmed that the previous model helped build its successor. His company published a 23,000-word personality blueprint for a potential superintelligence, hoping this will be enough to ensure our survival (something Amodei acknowledges is far from certain).
The Pentagon demanded “fully unrestricted AI” with no ethical constraints. A self-replicating agent runtime was created with no logs and no kill switch.
Individually, these are extraordinary. Together, they describe something that demands a fundamentally different level of public attention. And yet the dominant public conversation about AI is still about chatbots, job displacement, and deepfakes.
Not recursive self-improvement. Not collapsing safety evaluations. Not the race dynamics in which two of the three leading labs want to pause but can’t.
People are talking about AI. But they’re discussing the wrong things.
Back in 2018, Elon Musk appeared on Joe Rogan’s podcast and described years of trying to warn the world about AI risk. He’d met with Obama. He’d addressed Congress. He’d spoken to all fifty US governors. “I tried to convince people to slow down. Slow down AI, to regulate AI. This was futile. I tried for years. Nobody listened.”
But here’s the thing about that story. One person raising the alarm is shouting into the void. A thousand is a conversation. A million is common knowledge - a shift in the Overton Window - a demand that world leaders cannot ignore.
The economic signals are stark. Anthropic’s revenue has grown tenfold annually for three straight years. Total AI compute is doubling every seven months. AI time horizons are doubling every five months.
These numbers describe an exponential curve. The world is pricing in a linear one.
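To make that gap concrete, here is a minimal sketch - my illustration, using only the doubling-every-seven-months compute figure from above - comparing an exponential curve against a linear extrapolation fitted to its first year:

```python
def exponential(start, doubling_months, months):
    """Exponential growth: the quantity doubles every `doubling_months`."""
    return start * 2 ** (months / doubling_months)

def linear(start, doubling_months, months):
    """Linear model fitted to the first 12 months of the exponential curve."""
    slope = (exponential(start, doubling_months, 12) - start) / 12
    return start + slope * months

start = 1.0  # normalised units of total AI compute
for months in (12, 24, 36):
    e = exponential(start, 7, months)
    l = linear(start, 7, months)
    print(f"month {months}: exponential {e:.1f}x, linear {l:.1f}x, gap {e / l:.1f}x")
```

At month 12 the two models agree by construction; by month 36 the exponential is roughly 4.5 times the linear projection. That widening gap is what “pricing in a linear curve” costs you.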
VI. What This Means
I’m not going to prescribe policy in this essay. That’s coming. This essay’s job is simpler and, I think, more urgent: to shine a light on what’s been happening.
Update your model of the world. The rate of change is itself changing. Recursive self-improvement is not a theoretical risk. It’s already happening. Every timeline you had for AI impact needs to move forward. Not by a little. By a lot.
Update your response. If you’ve been following this from a distance, it’s time to look closer. If you’ve been quietly concerned, now is the time to be loudly concerned. This essay is one voice. I’m asking you to make it a chorus.
Update your sense of urgency. Opus 4.6 sits at the boundary of ASL-3 and ASL-4. The next model is months away, not years. This is no longer academic. This is the fire alarm.
“AI is increasingly accelerating the development of AI. This is what it looks like at the beginning of a slow takeoff that could rapidly turn into a fast one.”
“We are not prepared. The models are absolutely in the range where they are starting to be plausibly dangerous.” -Zvi Mowshowitz
What do we do next?