Physicists inherit, along with the formalism and the problem sets, a particular kind of guilt. The profession has been working through its relationship to weapons, state violence, and the gap between scientific capability and ethical readiness since August 1945. This post is about why I think the current moment in AI closely resembles that history, and why Anthropic’s decision to draw a line matters even if — especially if — you think the line is imperfect.


What Just Happened

The news this week involves Anthropic and the question of whether and how large language models should be available for military applications. Anthropic has stepped back from a path toward unrestricted military use and restated a position: there are things its models will not be used for, weapons development and autonomous lethal systems among them. The response from parts of the defence and national security community has been predictable — naïve, idealistic, unilateral disarmament, your adversaries will not make the same choice.

These are not stupid objections. I want to take them seriously. But I also want to explain why, as someone who spent years studying physics in the shadow of the Manhattan Project’s legacy, the framing of those objections sounds very familiar, and why that familiarity is not reassuring.


What the Physicists Thought They Were Doing

The scientists who built the atomic bomb were not, for the most part, indifferent to what they were building. Many of them were refugees from European fascism. They understood what a Nazi atomic weapon would mean. The urgency was real, the moral reasoning was coherent, and the conclusion — build it before the other side does — followed from the premises.

What the premises did not include was adequate weight for what happens after the technical problem is solved.

By the time the Trinity test succeeded in July 1945, Germany had already surrendered. The original justification — prevent the Nazis from getting there first — had evaporated. What remained was a weapon, an infrastructure for building more weapons, and a strategic and political logic that had largely moved beyond the scientists’ control. The Franck Report, written by a group of Manhattan Project scientists in June 1945, argued against using the bomb on a Japanese city without a prior demonstration. It was ignored. Oppenheimer, who served on the Interim Committee’s Scientific Panel, signed off on the recommendation for direct military use. He lived with that for the rest of his life.

The lesson most physics students absorb from this history is something like: the scientists were not the decision-makers, the decision was going to be made anyway, and the presence of principled scientists in the room was better than their absence. The system was going to do what it was going to do; all you could influence was the margin.

I believed this for a long time. I am less sure of it now.


The Analogy and Its Limits

The comparison between the atom bomb and artificial general intelligence — or even current large language models at the capability frontier — is made often enough that it has become a cliché, which is usually the point at which people stop thinking carefully about it. Let me try to be specific about where the analogy holds and where it breaks.

Where it holds:

The core structural similarity is this: a small number of researchers, working at the frontier of a capability that most people do not understand, are making decisions that will constrain or enable uses they cannot fully anticipate, in contexts they will not control. The physics community in 1942 had a clearer view of what fission could do than any political or military decision-maker. The AI research community in 2026 has a clearer view of what large language models can do — and of what more capable successors will do — than most of the people who will deploy them.

That epistemic position is not morally neutral. Knowing more than the decision-makers does not give you unlimited responsibility, but it does give you more responsibility than someone who does not know. The plea of ignorance about downstream applications is not available to you.

The second similarity: once the capability exists and is demonstrated, the normative landscape changes. Before Trinity, the question of whether to build nuclear weapons was still open. After Trinity, it was no longer open in the same way — the knowledge existed, the infrastructure existed, the geopolitical expectations had already been set. The arms race was not caused by the bomb, but the bomb’s existence changed what the arms race meant and how fast it moved. We are somewhere in the vicinity of that transition with frontier AI systems. The question of whether to build them is still formally open for any given company or research group, but the landscape is already different from what it was five years ago.

Where it breaks:

The atom bomb was a single-purpose physical object whose primary function was destroying things. Large language models are general-purpose cognitive tools with a very wide range of applications, the majority of which are not weapons-relevant. This matters because it changes the policy space. You could, in principle, have chosen not to build the atom bomb at all. You cannot decline to build language models while still having language models for medicine, education, scientific research, and the other applications that are clearly beneficial. The dual-use problem for AI is more severe, not less severe, than it was for physics.

The other important difference: the Manhattan Project was conducted in secret, under wartime conditions, with a relatively well-defined adversarial structure. The current AI landscape involves many organisations, many countries, public publication of research, and no clear equivalent of the Axis/Allied framing. The game theory of “if we don’t do it, they will” is more complicated when “they” is not a single identifiable adversary with symmetric interests.


What Anthropic’s Line Actually Says

Setting aside for a moment whether the line is in the right place, there is something worth examining in the act of drawing it at all.

The standard criticism — that a unilateral ethical commitment in a competitive field simply advantages less scrupulous actors — assumes that ethical commitments are pure costs with no countervailing benefits. This is the argument the weapons lobby has made about every arms control proposal in the history of arms control, and it has sometimes been right. Unilateral disarmament without reciprocal commitments can leave you worse off. This is not a trivial point.

But it smuggles in an assumption that deserves scrutiny: that the relevant competition is primarily between AI companies, and that the only variable that matters is relative capability. If you accept that framing, then any ethical constraint is a handicap and the only rational strategy is to develop as fast as possible with as few restrictions as possible.

That framing has a name in physics. It is called the arms race equilibrium, and the physics community spent thirty years understanding what it produces. It produces capability accumulation without a corresponding development of the normative frameworks, institutional safeguards, and mutual verification mechanisms that make the capability survivable. It produces Hiroshima, then the hydrogen bomb, then MIRV, then the point at which the accumulated arsenal is large enough to end complex life on Earth several times over, at which point you negotiate the first real arms limitation treaties — from a starting position of vastly more deployed capability than anyone needed and vastly less trust than anyone wanted.
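
For readers who want the structure of that claim written down, here is a toy sketch of the one-shot game behind it. The payoff numbers below are invented for this post, not drawn from any real analysis; the point is only that when each player’s best response is to race regardless of what the other does, mutual racing is the stable outcome even though mutual restraint would leave everyone better off.

```python
# A toy one-shot "race or restrain" game. Payoff numbers are illustrative only;
# higher is better for the player receiving them.
payoffs = {
    # (row_move, col_move): (row_payoff, col_payoff)
    ("restrain", "restrain"): (3, 3),
    ("restrain", "race"):     (1, 4),
    ("race",     "restrain"): (4, 1),
    ("race",     "race"):     (2, 2),
}
strategies = ["restrain", "race"]

def best_response(opponent_move, player):
    """Best move for `player` (0 = row, 1 = column), holding the opponent fixed."""
    if player == 0:
        return max(strategies, key=lambda s: payoffs[(s, opponent_move)][0])
    return max(strategies, key=lambda s: payoffs[(opponent_move, s)][1])

# Pure-strategy Nash equilibria: profiles where each move is a best response.
equilibria = [(r, c) for r in strategies for c in strategies
              if r == best_response(c, 0) and c == best_response(r, 1)]

print(equilibria)                          # [('race', 'race')]
print(payoffs[("race", "race")])           # (2, 2) -- the stable outcome
print(payoffs[("restrain", "restrain")])   # (3, 3) -- better for both, but not stable
```

The real situation is, of course, a repeated game with many players and imperfect information, which is exactly why commitments, verification, and norms can shift the equilibrium; the toy version only shows what you get when none of those exist.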

The question Anthropic is implicitly asking is whether there is a path that does not look like that. The answer is not obvious. But I think it is worth asking.


What the Physicists Should Have Done

Here is the counterfactual that haunts the Manhattan Project’s legacy: what if the scientific community had treated the ethics of the bomb as seriously as the physics, from the beginning?

Not naïvely. Not by refusing to work on it and ceding the possibility of influencing it. But by making the ethical analysis parallel to the technical analysis, by treating the question of use as a scientific question with as much rigour as the question of yield, and by using the epistemic authority that came from being the people who understood the capability to push, hard, for the normative frameworks that did not yet exist.

Some scientists did this. Szilard circulated a petition, signed by 70 Manhattan Project scientists, against the use of the bomb on Japanese cities without prior warning. It did not work. But the effort was real, and the record of the effort matters — both as evidence that the scientific community was not unanimous in its acquiescence and as a model for what engaged dissent looks like from inside a project that is going to proceed regardless.

What most scientists did not do, and what the profession largely did not do in the decades that followed, was treat the ethical work as primary. Physics built its identity around the technical capability — the extraordinary achievement of understanding nature at the deepest level — and treated the ethical consequences as someone else’s department. The bomb was the military’s problem. The Cold War was the politicians’ problem. The physicists kept doing physics.

This was comfortable and it was wrong.


What I Want From AI Researchers

I want AI researchers to do what the physicists did not, and to do it now, while the critical decisions are still open.

Anthropic drawing a line is one version of this. It is imperfect — the line is in a particular place, the enforcement mechanisms are limited, the competitive dynamics are real. But it is a claim that the people who built the capability have ongoing responsibility for how it is used, and that some uses are outside the bounds of what should happen regardless of what is technically possible.

That claim is not naïve. It is, in fact, the claim the Franck Report was making in 1945: that capability does not determine use, that scientists have a voice in the normative question, and that using that voice is part of the job rather than a distraction from it.

What I want beyond that is for the AI research community to treat the ethics as primary rather than as footnotes. Not ethics review boards that approve research post hoc. Not responsible AI teams that are consulted after the capability has been developed. A genuine integration of the normative analysis into the research process itself — asking, at each stage, what this capability makes possible and who benefits from that possibility and who pays the cost.

The physics community got to August 1945 before it had that conversation in earnest. The conversation has been going on ever since, and it has produced important institutional frameworks — the Bulletin of the Atomic Scientists, the arms control treaties, the export control regimes, the norms against first use. These things matter. But they were built in reaction to a capability that had already been deployed, and the shape of everything that followed was constrained by that starting point.

The AI community is not there yet. The starting point is still being established. That is what makes this moment consequential, and what makes Anthropic’s line — wherever exactly it is drawn — worth defending as an act of principle rather than dismissing as an act of commercial positioning.


A Note on the “Of Our Time” Framing

I am aware that comparisons to the atom bomb are sometimes used to generate unwarranted urgency, to short-circuit careful reasoning by invoking the most extreme case. I want to be clear about what I am and am not claiming.

I am not claiming that current large language models are as immediately dangerous as nuclear weapons. They are not.

I am claiming that the structural situation — researchers at the capability frontier, ahead of the policy frameworks, making decisions that will constrain future options, in a competitive environment with adversarial dynamics — is similar enough that the lessons of the Manhattan Project period are directly relevant. Not as prophecy. As a guide to the kind of mistakes that are available to make.

The physicists had plenty of warning. Szilard had been worried since 1933. Einstein wrote to Roosevelt in 1939. The Franck Report was written before Hiroshima. The warnings were on the record. What was not on the record was a scientific community that treated those warnings as actionable constraints on its own behaviour rather than as advisories for policymakers.

That is the thing I want to be different this time.


References

Franck, J. et al. (1945). Report of the Committee on Political and Social Problems (The Franck Report). National Archives, Record Group 77.

Oppenheimer, J. R. (1965). Interview in The Decision to Drop the Bomb (NBC documentary).

Rhodes, R. (1986). The Making of the Atomic Bomb. Simon & Schuster.

Russell, B., & Einstein, A. (1955). The Russell–Einstein Manifesto. Pugwash Conferences on Science and World Affairs.

Szilard, L. (1945). A Petition to the President of the United States. July 17, 1945. Available via the Atomic Heritage Foundation.

Bulletin of the Atomic Scientists (1945–present). Doomsday Clock statements. https://thebulletin.org/doomsday-clock/