What does it mean to morally align an artificial intelligence — and what should we be talking about before we do it? This page accompanies the talk: playable modules, theses, questions to take home. It grows piece by piece.
A few months ago I read a scenario report called AI 2027 by Daniel Kokotajlo. Then I read his interview in Der Spiegel. I haven't thought about much else since.
Not because I think the world is ending. Because I think the question we keep avoiding — what kind of intelligence do we actually want to build? — has become harder to avoid. Capability is converging. Values aren't. And the gap is widening faster than most of us can follow.
This page is a companion to the talk at Data Grillen 2026. It is not a summary. It is the playable version: a few modules to pull on, a few questions to take home, and the things I could not fit into 45 minutes. It grows piece by piece.
The talk starts here. Not with AI. With silence.
If the universe is this old and this full — where is everyone? The question is older than computing. But it carries the same logic as the one about AI. Pull the sliders and see why.
The simulation only runs over 20 million years — a blink of an eye compared to the lifetime of the galaxy. Even a single civilization with a slow sub-light drive would have reached everywhere. The galaxy is ~10 billion years old — time for 500 to 5,000 complete crossings. If that is so — then why is it so quiet?
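The crossing count is a single division. Here is a minimal sketch, assuming the ~10 billion year age and the 20 million year run from the text. The 2 million year crossing time is a hypothetical faster expansion, assumed here only because it recovers the upper figure of 5,000.

```python
# Back-of-the-envelope check of the "500 to 5,000 crossings" range.
GALAXY_AGE_YEARS = 10e9  # ~10 billion years, as stated in the text


def complete_crossings(crossing_time_years: float,
                       galaxy_age_years: float = GALAXY_AGE_YEARS) -> int:
    """Number of full galaxy crossings that fit into the galaxy's age."""
    return int(galaxy_age_years // crossing_time_years)


# 20 million years is the simulation run from the text.
slow = complete_crossings(20e6)
# 2 million years is an assumed, faster expansion (not from the text).
fast = complete_crossings(2e6)
print(slow, fast)
```

Any crossing time between those two values lands inside the quoted range.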
Frank Drake broke the question into factors in 1961: how many stars get planets, how many planets get life, how many life forms get intelligence, how many intelligences become technological civilizations — and how long do they last? Every factor is a chance to disappear.
Pull the seven sliders. Where does the chain break? ↓
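Drake's chain is a plain product of seven factors, which is why a single small factor is enough to break it. Here is a minimal sketch using the standard factor names (R*, f_p, n_e, f_l, f_i, f_c, L). The slider values are placeholders chosen for illustration, not estimates from the talk.

```python
from math import prod


def drake(r_star, f_p, n_e, f_l, f_i, f_c, lifetime):
    """Expected number of detectable civilizations (Drake, 1961).

    r_star   : new stars formed per year
    f_p      : fraction of stars with planets
    n_e      : habitable planets per such star
    f_l      : fraction of those where life appears
    f_i      : fraction of those where intelligence appears
    f_c      : fraction that become technological civilizations
    lifetime : years such a civilization lasts (Drake's L)
    """
    return prod((r_star, f_p, n_e, f_l, f_i, f_c, lifetime))


# Placeholder slider positions, chosen only for illustration.
sliders = dict(r_star=1.0, f_p=0.5, n_e=2.0, f_l=0.1,
               f_i=0.1, f_c=0.1, lifetime=10_000)
print(drake(**sliders))

# The chain "breaks" wherever a factor goes to zero.
print(drake(**{**sliders, "f_l": 0.0}))
```

Because the result scales linearly with every factor, halving the lifetime halves the count just as surely as halving the star formation rate.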
The last variable in Drake's chain is L — the lifespan of a technological civilization. A number that decides everything. But no one knows it. We can only tell stories about it.
Robin Hanson coined the term Great Filter for this in 1996: somewhere in the chain there must be a step that catches almost everything. If the filter is behind us, for instance at the leap from chemistry to life itself (f_l), we are rare, but safe. If it is ahead of us, at L, we are one of many that never get old.
We have been negotiating exactly this question in cinema for almost a century: HAL, Skynet, WALL·E, Iron Giant, The Entity, Maeve. Every film is an implicit hypothesis about L. Which L-hypothesis are you telling yourself? ↓
Three quarters of the films portray AI as a dystopia. That isn't just cultural paranoia; it's a game-theoretic intuition: if I can't verify what another actor intends, the safe answer is either silence or first strike.
Three authors show up disproportionately often: Asimov, Dick, Clarke. Their books are the script with which we discuss real AI today, even though the films rarely follow the books precisely.
Liu Cixin condensed this intuition into a theory of its own: the Dark Forest. What if the silence of the galaxy isn't an accident, but a strategy? ↓
Liu Cixin calls it the Dark Forest: every civilization is a hunter. You don't know whether the being calling out is friendly. You only know that you can't afford to be wrong.
In this simulation all civilizations are identical — they only differ in what they happen to do by chance. Yet an equilibrium emerges: usually only the ones that never broadcast survive.
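The mechanic can be sketched as a toy model. This is not the page's actual simulation: the probabilities, the round count, and the rule that a strike can only hit a civilization that has broadcast are all assumptions made for illustration.

```python
import random


def run_dark_forest(n_civs=50, rounds=200, p_broadcast=0.02,
                    p_strike=0.05, seed=0):
    """Toy model: identical civilizations, random actions each round."""
    rng = random.Random(seed)
    alive = set(range(n_civs))
    revealed = set()  # civilizations that have broadcast at least once
    for _ in range(rounds):
        for civ in sorted(alive):
            if civ not in alive:  # destroyed earlier in this round
                continue
            roll = rng.random()
            if roll < p_broadcast:
                revealed.add(civ)  # broadcasting gives away the position
            elif roll < p_broadcast + p_strike:
                # A strike needs a known target other than the striker.
                targets = sorted((revealed & alive) - {civ})
                if targets:
                    alive.discard(rng.choice(targets))
            # otherwise the civilization stays silent this round
    silent_survivors = alive - revealed
    return alive, revealed, silent_survivors


alive, revealed, silent_survivors = run_dark_forest()
print(len(alive), len(silent_survivors))
```

Every civilization follows the same rules, so any difference in outcome comes from chance alone. Comparing `len(silent_survivors)` with `len(alive)` across seeds shows how strongly survival depends on never having broadcast under these assumed parameters.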
Apply it to AI: when one AGI detects another — and its training data is full of our history, of how we treat the weaker side, of colonialism, of competition — what does it do? It doesn't wait to find out whether the other AGI is friendly. It waits to see whether it can afford to find out.
This is where alignment becomes a game-theory problem, not an engineering one.
Railways took 70 years to grow 100×. World GDP took 124 years for ~40×. China's GDP took 30 years for 50×. AI compute has grown ~10⁹× in 8 years: a billion-fold.
What this has to do with alignment: if a well-intentioned default starts copying itself (Module 04) and the doubling interval is 6 months instead of 6 years, there's no room for correction. Our intuitions were built on cases where 100× over 70 years was still a generational project. Here we're talking about weeks.
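The growth figures can be converted into doubling intervals, which makes them directly comparable. Here is a minimal sketch, assuming steady exponential growth over each period (a simplification, since none of these curves was perfectly smooth). The factors and durations are the ones quoted in the text.

```python
from math import log2


def doubling_time_years(factor: float, years: float) -> float:
    """Years per doubling if `factor`-fold growth took `years` years."""
    return years / log2(factor)


# (growth factor, duration in years), as quoted in the text.
cases = {
    "Railways": (100, 70),
    "World GDP": (40, 124),
    "China's GDP": (50, 30),
    "AI compute": (1e9, 8),
}

for name, (factor, years) in cases.items():
    t = doubling_time_years(factor, years)
    print(f"{name}: doubles every {t:.2f} years ({t * 12:.1f} months)")
```

Under that assumption the China figure works out to a doubling time of a little over five years, and the AI compute figure to a few months.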
That's the real alignment question: not "can we get it right", but "do we have time to get it right, before it stops mattering?"
If everyone acts wisely, we win. If only you act wisely, you lose. If only you act unwisely, you win. If no one acts wisely, there is no one left.
This game isn't "cooperation versus selfishness". It's coordination versus scaling. You can act in an individually rational way and still contribute to the singularity, because systemic risk doesn't know what you morally meant.
The only way out of this game is prior agreement — meaning regulation. And that requires someone to decide who gets to decide.
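The four outcomes of the game can be written as a small payoff function. This is a sketch, not the rules of the playable module: the `commons` and `private_gain` values are hypothetical, and the 150 players come from the single-player description (you plus 149 others).

```python
def payoffs(choices, commons=3.0, private_gain=1.0):
    """Payoff per player. choices: list of bools, True = acts wisely."""
    n = len(choices)
    wise = sum(choices)
    if wise == 0:
        return [0.0] * n  # no one acts wisely: there is no one left
    shared = commons * wise / n  # common good shrinks as fewer hold back
    # Unwise players collect the shared value plus a private gain.
    return [shared if c else shared + private_gain for c in choices]


def you_vs_rest(you_wise, others_wise, n_others=149):
    """Return (your payoff, a typical other player's payoff)."""
    result = payoffs([you_wise] + [others_wise] * n_others)
    return result[0], result[1]


for you in (True, False):
    for others in (True, False):
        print(f"you wise={you}, others wise={others}:",
              you_vs_rest(you, others))
```

As long as anyone else holds back, acting unwisely pays more for the individual, which is why the game cannot be won by individual rationality alone.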
Six modules. Six mechanics that show why it can tip.
The question of who is at the lever — who sets the default, who is liable, who overrides — isn't something a diagram can answer. I'm taking that one live, onto the stage.
Until then: play the modules. Bring your gut feeling. See you at Data Grillen.
Playable galaxy expansion: how fast does "one" become everything?
Seven sliders for the Drake equation: where does the civilization chain break?
42 films & series from 1927 to 2025: how has our image of AI shifted?
Silence, broadcast, strike: the game-theoretic equilibrium between civilizations (and AGIs).
AI growth versus Moore, railways, China, and how much electricity it all needs.
Single-player version: you decide, the other 149 follow their strategy. Are you Skynet's father?
Talk: Is a Morally Aligned AI Our Only Chance?
Speaker: Michael Tenner · Daten-WG
Event: Data Grillen, Lingen 2026
A walk from the silence of the universe to the code we write next week. This is the long-form reading list behind the talk — where every claim, hypothesis, and reference comes from, and where to go deeper.
A detailed, month-by-month forecast of how AI development could unfold through 2027 and beyond. Kokotajlo previously worked on OpenAI's governance team. He left because, in his own words, he no longer believed the company was taking the risks seriously enough.
The interview where the scenario went mainstream in German media. Worth reading not for new content but for the register — a former insider speaking openly outside the industry's PR frame.
Two researchers who have studied AI safety for two decades make the most direct case currently in print: that sufficiently capable AI systems will develop goals that conflict with ours, and that we are not on track to prevent this. Instant New York Times bestseller. The New Yorker and Guardian Best Books of 2025. Even readers who disagree with the conclusion benefit from the clarity of the argument.
The standard reference on the alignment problem from one of the field's senior figures. Russell reframes the question from "how do we make AI smarter" to "how do we make AI that wants what we want, and knows it doesn't already know what that is." Where to start if you want the technical version rather than the popular one.
Source of the Three Laws of Robotics — the original literary frame for the alignment problem, more than half a century before the field existed. Every story in the collection demonstrates a specific way the Laws fail in practice.
Originally posed by Enrico Fermi at Los Alamos in 1950 during a lunch conversation about extraterrestrial life: "Where is everybody?" The Wikipedia article is unusually comprehensive; it serves as a reasonable map of the entire field.
Frank Drake's 1961 attempt to put numbers on the question. Not a prediction — a structured way to think about which parameters matter and where the deepest uncertainty lies.
Hanson's argument that the silence of the universe is itself evidence of an obstacle — somewhere between simple matter and interstellar civilization, almost no one gets through. Whether the filter is behind us or ahead of us is, in his framing, the most consequential empirical question we cannot yet answer.
The argument made formal by Frank Tipler in 1980: even slow, sub-relativistic self-replicating probes would fill the galaxy in a few million years. The galaxy is on the order of 10 billion years old. The math gives no excuses.
The original formal version of the argument. Sagan and Newman wrote a notable counter-response in 1983.
Life-bearing worlds with the chemistry, geology, and stability Earth had may be vanishingly rare.
Advanced civilizations exist and observe us — but by choice they do not make contact. Earth as nature preserve.
Civilizations stay silent because revealing themselves is too dangerous. The cosmos is a hunter's wood. Popularized through the Remembrance of Earth's Past trilogy — The Three-Body Problem, The Dark Forest, Death's End — translated by Ken Liu and Joel Martinsen, Tor Books, 2014–2016.
Two peer-reviewed papers analyzing >100,000 short-lived flashes from sky surveys taken between 1949 and 1957 — before the first satellite was in orbit. Neither paper claims to identify what the flashes are. Both find statistical patterns that the obvious explanations cannot account for.
Found a 22σ deficit of transients inside Earth's shadow — meaning whatever these flashes are, they need sunlight. Consistent with reflective objects at high altitude, before any human satellite existed.
Found a 45 % increase in flashes on days of above-ground nuclear tests (±1 day window) across 124 tests, 1951–1957, with a statistical significance of p = 0.008. Cause unknown.
A note on framing. These papers do not claim that aliens are real. They claim that the historical sky was statistically stranger than the simple explanations allow, and that further investigation is warranted. The talk treats them the same way: not proof of presence, but a signal worth noticing.
Sagan's argument that the cosmic view of ourselves — small, isolated, responsible for our own survival — is not depressing but clarifying. The closing passage on the photograph of Earth from Voyager 1 is one of the most quoted reflections on the human condition written in the 20th century.
The talk references several films and cultural artifacts. They are not arguments — they are anchors. Each one captures something the philosophical literature struggles to convey.
HAL 9000 as a study in misalignment, not malice. The system did exactly what it was instructed to do.
The fictional rule that a more advanced civilization shall not interfere with a less developed one — until the latter is ready. A policy answer to one of the four Fermi hypotheses, written 25 years before the Fermi paradox became mainstream.
Not a film about robots destroying humanity. A film about optimization slowly replacing agency. The terrifying outcome is not extinction. It is comfortable obsolescence.
The question is reversed: not whether machines can become intelligent, but whether intelligence can become humane.
You're reading it. The live, playable modules:
If you want to keep going after the talk:
The book that brought existential AI risk into mainstream academic philosophy. Dense, careful, occasionally pessimistic.
A reporter's tour of what alignment researchers actually do day to day. Specification gaming, mesa-optimization, and reward hacking, illustrated with real examples.
Real-world cases of AI systems doing exactly what they were trained to do, in ways their designers did not intend. The boat-racing AI, the cleaning robot that hides trash, the simulation that learned to exploit bugs in its own physics engine. Updated as new examples appear.
"When a measure becomes a target, it ceases to be a good measure." The economist's version of the alignment problem. Predates AI by decades. Helps explain why dashboards always lie eventually.
Sources curated for the Data Grillen 2026 talk. If a reference is missing or a link breaks, drop me a line via LinkedIn (linked below).
If this page made you think — or argue with me — that's the point. Find me on LinkedIn or come talk to me after the session.