<aside> ‼️
Disclaimer:
This piece is preparation for a wider-audience op-ed I hope to write and get published. I’m posting here to stress-test the arguments before distilling them for a non-technical audience. I’m most interested in feedback on the core open-source collapse argument, the hardware choke point logic, and the China section — particularly whether the case for feasible cooperation holds up. The resumption criteria are deliberately sketched — I know they’re underdeveloped.
</aside>
This post makes one central claim that I think is underappreciated: the entire current AI safety paradigm - alignment research, control, policy, red-teaming - is insufficient because it collapses against open source. Everything else follows from that. I derive a possible solution, sanity-check it against US-China race dynamics, and scrutinise the safety work being done at the major labs.
Open-source models consistently lag frontier models by somewhere between a few months and a year and a half, depending on how you measure. Epoch AI’s Capabilities Index puts the average at around three months; their earlier training compute analysis estimated roughly 15 months. The exact number matters less than the conclusion: whatever the frontier labs can do at any given time, open-source models will be able to do shortly after.
This means that even if we grant the most optimistic assumptions about safety work at frontier labs - perfect alignment techniques, robust control mechanisms, effective misuse prevention, airtight KYC, sound policy and regulation to ensure proper incentives - the entire paradigm collapses once an open-source model reaches the same capability level.
Here is why:
The current safety paradigm, at its absolute best, buys somewhere between a few months and a couple of years of lead time before open-source models reach the same capability level. That is the actual output of billions of dollars of safety investment. It is not enough.
Stopping the AI race in 2023 would have cost the global economy tens of billions of dollars - mostly VC money - with little systemic disruption. As of early 2026, stopping it would mean writing off tens of trillions in direct and indirect investments. The public is massively exposed through the stock market, 401(k)s, and pension funds. By 2028, economic growth and equity markets will likely be so AI-leveraged that halting development would trigger a global recession of a severity not seen in nearly a century.
The incentives to continue the race - economic, geopolitical, career - are growing, not shrinking. Stopping is close to political suicide for whoever pushes it and has to absorb the fallout. I would argue not stopping is also suicide - not just politically.
We should stop now because waiting will only make it harder, and at some point it becomes impossible.
Even setting the open-source problem aside, the current safety landscape is inadequate.
A significant portion of safety spending at major labs is oriented toward steering AI to be controllable and useful - goals that conveniently align with commercial interests, enabling safety-washing of R&D budgets. The remainder goes to red-teaming in simulated scenarios and to mechanistic interpretability - the latter roughly analogous to fMRI research on human brains: genuinely interesting, but nowhere near sufficient to make guarantees about the behaviour of systems we do not fundamentally understand.
The theoretical frameworks of AI safety lag far behind empirical progress. We are building systems whose capabilities outstrip our ability to reason about them. This gap is widening, not closing.