Daten Grillen · Talk

Morally Aligned AI

What does it mean to morally align an artificial intelligence — and what should we be talking about before we do it? This page accompanies the talk: playable modules, theses, questions to take home. It grows piece by piece.


A few months ago I read a scenario report called AI 2027 by Daniel Kokotajlo. Then I read his interview in Der Spiegel. I haven't thought about much else since.

Not because I think the world is ending. Because I think the question we keep avoiding — what kind of intelligence do we actually want to build? — has become harder to avoid. Capability is converging. Values aren't. And the gap is widening faster than most of us can follow.

This page is a companion to the talk at Daten Grillen 2026. It is not a summary. It is the playable version: a few modules to pull on, a few questions to take home, and the things I could not fit into 45 minutes. It grows piece by piece.

  1. The silence is the observation: If the universe is this old and this full, where is everyone? We start at the Fermi Paradox because it carries the same logic as our question about AI.
  2. Scales are unintuitive: What sounds small (one actor, slow, a short window) becomes very large very fast on cosmic or algorithmic scales.
  3. Alignment is not a switch: "Morally aligned" is not a checkbox in the setup wizard. It is a chain of decisions, defaults and reach.
  4. Play first, then talk: Every module is playable. Pull the sliders. What do you see? What surprises you? That is the bridge to the next chapter.

The talk starts here. Not with AI. With silence.

If the universe is this old and this full — where is everyone? The question is older than computing. But it carries the same logic as the one about AI. Pull the sliders and see why.

Module 01 · Playable

The Fermi Paradox

Pull the sliders.
See how fast something "small" fills everything.
[Interactive simulation: Milky Way, top-down. Default parameter values shown: 1, 1 % c, 100,000 yr/s. Readouts: elapsed time (M years), reach of the frontier (light-years), galaxy explored (%), possible crossings in 10 Bn yr.]

From scale to probability

The simulation only runs over 20 million years — a blink of an eye compared to the lifetime of the galaxy. Even a single civilization with a slow sub-light drive would have reached everywhere. The galaxy is ~10 billion years old — time for 500 to 5,000 complete crossings. If that is so — then why is it so quiet?
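The crossing arithmetic can be checked in a few lines. This is a minimal sketch, not the module's code: it assumes straight-line travel across a disk of 100,000 light-years (the diameter given in the Module 01 note) at a constant speed, and the function names are made up for illustration.

```python
GALAXY_DIAMETER_LY = 100_000   # disk diameter from the Module 01 note
GALAXY_AGE_YR = 10e9           # "~10 billion years" from the text


def crossing_time_years(speed_fraction_of_c: float,
                        diameter_ly: float = GALAXY_DIAMETER_LY) -> float:
    """Years needed to cross the disk once at a constant fraction of c."""
    if speed_fraction_of_c <= 0:
        raise ValueError("speed must be positive")
    # Light covers one light-year per year, so distance / fraction = years.
    return diameter_ly / speed_fraction_of_c


def possible_crossings(speed_fraction_of_c: float,
                       age_yr: float = GALAXY_AGE_YR) -> float:
    """How many complete crossings fit into the age of the galaxy."""
    return age_yr / crossing_time_years(speed_fraction_of_c)


for speed in (0.005, 0.01, 0.05):
    print(f"{speed:.1%} c: {crossing_time_years(speed):,.0f} yr per crossing, "
          f"{possible_crossings(speed):,.0f} crossings")
```

At 1 % c one crossing takes about 10 million years, which leaves room for about 1,000 crossings. The range of 500 to 5,000 corresponds to speeds between 0.5 % and 5 % of light speed under these assumptions.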

Frank Drake broke the question into factors in 1961: how many stars get planets, how many planets get life, how many life forms get intelligence, how many intelligences become technological civilizations — and how long do they last? Every factor is a chance to disappear.

Pull the seven sliders. Where does the chain break? ↓

Module 02 · Playable

Drake Equation · Great Filter & Rare Earth

Pull the 7 sliders.
Where does the civilization chain break?
Expected N: 6
Communicating technological civilizations existing right now in the Milky Way, under these assumptions.
N = R* · fp · ne · fl · fi · fc · L
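The equation is a plain product, which is why a single small factor breaks the whole chain. A minimal sketch follows. The parameter values are hypothetical placeholders picked for illustration, not the module's defaults, and the function name is my own.

```python
from math import prod


def drake_n(r_star: float, f_p: float, n_e: float, f_l: float,
            f_i: float, f_c: float, lifetime: float) -> float:
    """N = R* * fp * ne * fl * fi * fc * L, as in the equation above."""
    return prod((r_star, f_p, n_e, f_l, f_i, f_c, lifetime))


# Hypothetical inputs, chosen only to show the mechanics.
example = dict(
    r_star=2.0,       # new stars per year
    f_p=0.5,          # fraction of stars with planets
    n_e=1.0,          # habitable planets per planetary system
    f_l=0.5,          # fraction of those where life appears
    f_i=0.1,          # fraction of those where intelligence appears
    f_c=0.1,          # fraction of those that become detectable
    lifetime=1000.0,  # L, years a civilization stays detectable
)

print("N =", drake_n(**example))

# Every factor enters linearly: shrink any one of them a thousand-fold
# and N shrinks a thousand-fold, no matter how generous the others are.
print("N with rare life =", drake_n(**{**example, "f_l": 0.0005}))
```

Because every factor enters linearly, the question "where does the chain break?" comes down to which factor you believe is orders of magnitude smaller than the rest.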

When L becomes a story

The last variable in Drake's chain is L — the lifespan of a technological civilization. A number that decides everything. But no one knows it. We can only tell stories about it.

Robin Hanson coined the term Great Filter for this in 1996: somewhere in the chain there must be a step that catches almost everything. If the filter is behind us — for instance the leap from chemistry to life itself (fl) — we are rare, but safe. If it is ahead of us — at L — we are one of many that never get old.

We have been negotiating exactly this question in cinema for almost a century: HAL, Skynet, WALL·E, Iron Giant, The Entity, Maeve. Every film is an implicit hypothesis about L. Which L-hypothesis are you telling yourself? ↓

Module 03 · Playable

AI in Film · 1927–2025

Click a dot.
Filter on the left.
[Interactive timeline, 1927 to 2025. Tendency filter: Positive, Ambivalent, Dystopian. Types: Software AI, Robot, Hybrid (Cyborg, Upload). Click a dot for details, or filter by tendency. Second chart: decades compared, share of tendencies.]

What the films leave us with

Three quarters of all films portray AI as dystopian. That isn't just cultural paranoia; it's a game-theoretic intuition: if I can't verify what another actor intends, the safe answer is either silence or first strike.

Three authors show up disproportionately often: Asimov, Dick, Clarke. Their books are the script with which we discuss real AI today, even though the films rarely follow the books precisely.

Liu Cixin condensed this intuition into a theory of its own: Dark Forest. What if the silence of the galaxy isn't accident, but strategy? ↓

Module 04 · Playable

Dark Forest · silence as a strategy

Hit Start.
Who survives? Who stays silent? Who shoots first?
[Interactive simulation: Dark Forest. Civilization states: silent, broadcasting, wiped out. Default parameter values shown: 16, 2 %, 60 %, 35 %, 4 ticks/s. Readouts: tick, alive, broadcasting, wiped out.]

Silence is a strategy

Liu Cixin calls it the Dark Forest: every civilization is a hunter. You don't know whether the being calling out is friendly. You only know that you can't afford to be wrong.

In this simulation all civilizations are identical — they only differ in what they happen to do by chance. Yet an equilibrium emerges: usually only the ones that never broadcast survive.
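A stripped-down version of such a simulation fits in a few lines. This is a sketch under loud assumptions, not the module's code. It assumes that 16, 2 %, 60 % and 35 % mean number of civilizations, broadcast chance per tick, detection chance and strike chance, and that a civilization stays exposed once it has broadcast. All names are hypothetical.

```python
import random


def run_dark_forest(n_civs=16, p_broadcast=0.02, p_detect=0.6,
                    p_strike=0.35, ticks=50, seed=0):
    """Return (silent_survivors, exposed_survivors) after `ticks` rounds.

    All civilizations follow the same rules; only chance separates them.
    """
    rng = random.Random(seed)          # seeded, so runs are repeatable
    alive = [True] * n_civs
    has_broadcast = [False] * n_civs

    for _ in range(ticks):
        # Phase 1: each living civilization may reveal itself by chance.
        for i in range(n_civs):
            if alive[i] and rng.random() < p_broadcast:
                has_broadcast[i] = True

        # Phase 2: each exposed civilization can be detected and struck
        # by any other civilization that is still alive.
        for target in range(n_civs):
            if not (alive[target] and has_broadcast[target]):
                continue
            for hunter in range(n_civs):
                if hunter == target or not alive[hunter]:
                    continue
                if rng.random() < p_detect and rng.random() < p_strike:
                    alive[target] = False
                    break

    silent = sum(a and not b for a, b in zip(alive, has_broadcast))
    exposed = sum(a and b for a, b in zip(alive, has_broadcast))
    return silent, exposed


for seed in range(3):
    print(f"seed {seed}: (silent, exposed) survivors =", run_dark_forest(seed=seed))
```

No civilization here chooses silence. Selection does the work: whoever happened not to broadcast is simply never a target.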

Apply it to AI: when one AGI detects another — and its training data is full of our history, of how we treat the weaker side, of colonialism, of competition — what does it do? It doesn't wait to find out whether the other AGI is friendly. It waits to see whether it can afford to find out.

This is where alignment becomes a game-theory problem, not an engineering one.

Module 05 · Playable

Tempo · Growth Curves & Energy

Against railway, Moore, China.
What does "unprecedented" actually mean?
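[Interactive chart: growth as a log index, 1820 to 2050, with an AI forecast (default values shown: 12 mo., 2040, unlimited). Readouts: AI growth factor over 2025 for 2030, 2040 and 2050; energy in 2030 (TWh/yr).]

[Interactive chart: energy in TWh/yr (log). Base: ~100 TWh/yr. World: ~30,000 TWh/yr. Default value shown: 15 %/yr.]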
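The energy legend (base ~100 TWh/yr, world ~30,000 TWh/yr, 15 %/yr) reads like a compound-growth projection. A minimal sketch, assuming the 15 % is an annual growth rate applied to the base and that the base year is 2025. Both are my assumptions, and the function names are illustrative.

```python
import math

BASE_TWH = 100.0        # "Base: ~100 TWh/yr" from the chart legend
WORLD_TWH = 30_000.0    # "World: ~30,000 TWh/yr" from the chart legend


def projected_energy_twh(year: int, growth_rate: float = 0.15,
                         base_year: int = 2025,   # assumed base year
                         base_twh: float = BASE_TWH) -> float:
    """Compound growth: base * (1 + rate) ** years_elapsed."""
    return base_twh * (1.0 + growth_rate) ** (year - base_year)


def years_until(target_twh: float, growth_rate: float = 0.15,
                base_twh: float = BASE_TWH) -> float:
    """Years of compound growth needed to reach `target_twh`."""
    return math.log(target_twh / base_twh) / math.log(1.0 + growth_rate)


print(f"2030: {projected_energy_twh(2030):.0f} TWh/yr")
print(f"years to reach the world figure: {years_until(WORLD_TWH):.1f}")
```

Under these assumptions demand roughly doubles by 2030 and would need about four decades to reach the world figure. A higher rate shortens that quickly.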

Pace leaves no room for iteration

Railways took 70 years for 100×. World GDP took 124 years for ~40×. China's GDP took 30 years for 50×. AI compute has grown ~10⁹× in 8 years: a billion-fold.
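To compare these figures on one scale, each can be converted into a doubling time. The sketch below uses only the growth factors and time spans quoted above and assumes smooth exponential growth over each span, which is a simplification.

```python
import math

# (growth factor, years) as quoted in the text
CASES = {
    "Railway": (100, 70),
    "World GDP": (40, 124),
    "China GDP": (50, 30),
    "AI compute": (1e9, 8),
}


def doubling_time_years(factor: float, years: float) -> float:
    """Years per doubling if `factor` is reached in `years` exponentially."""
    return years * math.log(2) / math.log(factor)


def annual_growth_rate(factor: float, years: float) -> float:
    """Equivalent constant yearly growth rate (0.10 means +10 % per year)."""
    return factor ** (1.0 / years) - 1.0


for name, (factor, years) in CASES.items():
    dt = doubling_time_years(factor, years)
    rate = annual_growth_rate(factor, years)
    print(f"{name:<11} doubles every {dt:6.2f} yr  ({rate:8.1%} per year)")
```

On these numbers railways doubled roughly every ten and a half years, while AI compute doubled roughly every three months.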

What this has to do with alignment: if a well-intentioned default starts copying itself (Module 04) and the doubling interval is 6 months instead of 6 years, there's no room for correction. Our reference cases come from a world where 100× over 70 years was still a generational project. Here we're talking about weeks.

That's the real alignment question: not "can we get it right", but "do we have time to get it right before it stops mattering?"

Module 06 · Playable

AI Prisoner's Dilemma · are you Skynet's father?

You decide each round.
The other 149 follow their strategy.
Choose the strategy of the other 149 players

The dilemma in one sentence

If everyone acts wisely, we win. If only you act wisely, you lose. If only you act unwisely, you win. If no one acts wisely, there is no one left.
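The four sentences can be written as a toy payoff function. This is a hypothetical sketch, not the Python model behind the module: "unwise" is read as scaling without restraint, the risk curve and payoff numbers are invented, and only the ordering of the outcomes matters.

```python
def collapse_risk(n_scaling: int, n_players: int = 150) -> float:
    """Hypothetical systemic risk: grows with the square of the scaling share."""
    return (n_scaling / n_players) ** 2


def expected_payoff(you_scale: bool, others_scaling: int,
                    n_players: int = 150,
                    base: float = 1.0, scale_bonus: float = 2.0) -> float:
    """Expected payoff for one player in one round.

    Scaling earns a private bonus, but every scaler raises the risk
    that the whole game ends with nothing for anyone.
    """
    n_scaling = others_scaling + (1 if you_scale else 0)
    survive = 1.0 - collapse_risk(n_scaling, n_players)
    private_gain = base + (scale_bonus if you_scale else 0.0)
    return survive * private_gain


print("everyone wise:   ", expected_payoff(False, 0))
print("only you wise:   ", expected_payoff(False, 149))
print("only you unwise: ", expected_payoff(True, 0))
print("no one wise:     ", expected_payoff(True, 149))
```

The private bonus is paid to the individual while the risk is shared by all 150 players, so scaling stays tempting as long as most others hold back.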

This game isn't "cooperation versus selfishness". It's coordination versus scaling. You can act rationally as an individual and still contribute to the singularity, because systemic risk doesn't know what you morally meant.

The only way out of this game is prior agreement — meaning regulation. And that requires someone to decide who gets to decide.

Six modules. Six mechanics that show why it can tip.

The question of who is at the lever — who sets the default, who is liable, who overrides — isn't something a diagram can answer. I'm taking that one live, onto the stage.

Until then: play the modules. Bring your gut feeling. See you at Daten Grillen.

● Module 01 · Live

Fermi Paradox

Playable galaxy expansion: how fast does "one" become everything?

● Module 02 · Live

Drake · Great Filter

Seven sliders for the Drake equation — where does the civilization chain break?

● Module 03 · Live

AI in Film

42 films & series from 1927 to 2025 — how has our image of AI shifted?

● Module 04 · Live

Dark Forest

Silence, broadcast, strike — the game-theoretic equilibrium between civilizations (and AGIs).

● Module 05 · Live

Tempo & Energy

AI growth versus Moore, railway, China — and how much electricity it all needs.

● Module 06 · Live

AI Prisoner's Dilemma

Single-player version: you decide, the other 149 follow their strategy. Are you Skynet's father?

Sources & Further Reading

Talk: Is a Morally Aligned AI Our Only Chance?

Speaker: Michael Tenner · Daten-WG

Event: Daten Grillen, Lingen 2026

A walk from the silence of the universe to the code we write next week. This is the long-form reading list behind the talk — where every claim, hypothesis, and reference comes from, and where to go deeper.

01

AI & Alignment

The scenario that prompted this talk
AI 2027 — A scenario
Daniel Kokotajlo, Eli Lifland, Thomas Larsen, Romeo Dean · 2025

A detailed, month-by-month forecast of how AI development could unfold through 2027 and beyond. Kokotajlo previously worked on OpenAI's governance team. He left because, in his own words, he no longer believed the company was taking the risks seriously enough.

Kokotajlo in Der Spiegel
Interview · 2025

The interview where the scenario went mainstream in German media. Worth reading not for new content but for the register — a former insider speaking openly outside the industry's PR frame.

The book that crystallized the case
If Anyone Builds It, Everyone Dies — Why Superhuman AI Would Kill Us All
Eliezer Yudkowsky & Nate Soares · Little, Brown and Company · 16 September 2025 · ISBN 9780316595643

Two researchers who have studied AI safety for two decades make the most direct case currently in print: that sufficiently capable AI systems will develop goals that conflict with ours, and that we are not on track to prevent this. Instant New York Times bestseller. The New Yorker and Guardian Best Books of 2025. Even readers who disagree with the conclusion benefit from the clarity of the argument.

The textbook on the technical problem
Human Compatible — Artificial Intelligence and the Problem of Control
Stuart Russell · Viking · 2019 · ISBN 9780525558613

The standard reference on the alignment problem from one of the field's senior figures. Russell reframes the question from "how do we make AI smarter" to "how do we make AI that wants what we want, and knows it doesn't already know what that is." Where to start if you want the technical version rather than the popular one.

Foundational fiction
I, Robot
Isaac Asimov · 1950 · ISBN 9780553294385

Source of the Three Laws of Robotics — the original literary frame for the alignment problem, more than half a century before the field existed. Every story in the collection demonstrates a specific way the Laws fail in practice.

02

The Cosmic Question

The paradox
Fermi Paradox
Enrico Fermi · Los Alamos, 1950 — at lunch

Originally posed by Enrico Fermi at Los Alamos in 1950 during a lunch conversation about extraterrestrial life: "Where is everybody?" The Wikipedia article is unusually comprehensive; it serves as a reasonable map of the entire field.

Drake Equation
Frank Drake · 1961

Frank Drake's 1961 attempt to put numbers on the question. Not a prediction — a structured way to think about which parameters matter and where the deepest uncertainty lies.

The wall
The Great Filter
Robin Hanson · 1996

Hanson's argument that the silence of the universe is itself evidence of an obstacle — somewhere between simple matter and interstellar civilization, almost no one gets through. Whether the filter is behind us or ahead of us is, in his framing, the most consequential empirical question we cannot yet answer.

The mathematics behind the silence
Von Neumann probes (Self-replicating spacecraft)
Frank Tipler · 1980

The argument was made formal by Frank Tipler in 1980: even slow, sub-relativistic self-replicating probes would fill the galaxy in a few million years. The galaxy is roughly 10 billion years old. The math gives no excuses.

Tipler, F. J. (1980) · Extraterrestrial intelligent beings do not exist
Quarterly Journal of the Royal Astronomical Society, 21: 267–281

The original formal version of the argument. Sagan and Newman wrote a notable counter-response in 1983.

The four hypotheses
Rare Earth Hypothesis
Peter D. Ward & Donald Brownlee · 2000

Life-bearing worlds with the chemistry, geology, and stability Earth had may be vanishingly rare.

Zoo Hypothesis
John A. Ball · 1973

Advanced civilizations exist and observe us — but by choice they do not make contact. Earth as nature preserve.

Dark Forest Hypothesis
Liu Cixin · 2008

Civilizations stay silent because revealing themselves is too dangerous. The cosmos is a hunter's wood. Popularized through the Remembrance of Earth's Past trilogy — The Three-Body Problem, The Dark Forest, Death's End — translated by Ken Liu and Joel Martinsen, Tor Books, 2014–2016.

03

The 2025 Evidence

Two peer-reviewed papers analyzing >100,000 short-lived flashes from sky surveys taken between 1949 and 1957 — before the first satellite was in orbit. Neither paper claims to identify what the flashes are. Both find statistical patterns that the obvious explanations cannot account for.

Villarroel, B., Bruehl, M., et al. (2025) · Identification and analysis of transients in POSS-I plates inside Earth's shadow
Publications of the Astronomical Society of the Pacific, 137:104504

Found a 22σ deficit of transients inside Earth's shadow — meaning whatever these flashes are, they need sunlight. Consistent with reflective objects at high altitude, before any human satellite existed.

Bruehl, M., Villarroel, B. (2025) · Correlation of transient flashes on POSS-I plates with nuclear test days and UAP reports
Scientific Reports, 15:34125

Found a +45 % increase in flashes on days of above-ground nuclear tests (±1 day window) across 124 tests, 1951–1957. Statistical significance p = 0.008. Cause unknown.

A note on framing. These papers do not claim that aliens are real. They claim that the historical sky was statistically stranger than the simple explanations allow, and that further investigation is warranted. The talk treats them the same way: not proof of presence, but a signal worth noticing.

04

Cosmic Humanism

Pale Blue Dot — A Vision of the Human Future in Space
Carl Sagan · Random House · 1994 · ISBN 9780394893815

Sagan's argument that the cosmic view of ourselves — small, isolated, responsible for our own survival — is not depressing but clarifying. The closing passage on the photograph of Earth from Voyager 1 is one of the most quoted reflections on the human condition written in the 20th century.

05

Cultural Touchstones

The talk references several films and cultural artifacts. They are not arguments — they are anchors. Each one captures something the philosophical literature struggles to convey.

2001: A Space Odyssey
Stanley Kubrick / Arthur C. Clarke · 1968

HAL 9000 as a study in misalignment, not malice. The system did exactly what it was instructed to do.

Star Trek · The Prime Directive
created by Gene Roddenberry · since 1966

The fictional rule that a more advanced civilization shall not interfere with a less developed one — until the latter is ready. A policy answer to one of the four Fermi hypotheses, written 25 years before the Fermi paradox became mainstream.

WALL·E
Andrew Stanton · Pixar · 2008

Not a film about robots destroying humanity. A film about optimization slowly replacing agency. The terrifying outcome is not extinction. It is comfortable obsolescence.

Bicentennial Man
Chris Columbus, after Asimov · 1999

The question is reversed: not whether machines can become intelligent, but whether intelligence can become humane.

06

Talk Companion · you're here

You're reading it. The live, playable modules are Modules 01 to 06 above.

07

For Deeper Study

If you want to keep going after the talk:

Superintelligence — Paths, Dangers, Strategies
Nick Bostrom · Oxford University Press · 2014 · ISBN 9780199678112

The book that brought existential AI risk into mainstream academic philosophy. Dense, careful, occasionally pessimistic.

The Alignment Problem — Machine Learning and Human Values
Brian Christian · W. W. Norton · 2020 · ISBN 9780393635829

A reporter's tour of what alignment researchers actually do day to day. Specification gaming, mesa-optimization, reward hacking — illustrated with the real examples.

Specification Gaming Examples in AI
Victoria Krakovna (DeepMind) · regularly updated

Real-world cases of AI systems doing exactly what they were trained to do — in ways their designers did not intend. The boat-racing AI, the cleaning robot that hides trash, the simulation that learned bugs in its own physics engine. Updated as new examples appear.

Goodhart's Law
Charles Goodhart · 1975

"When a measure becomes a target, it ceases to be a good measure." The economist's version of the alignment problem. Predates the alignment debate by decades. Helps explain why dashboards always lie eventually.

Sources curated for the Daten Grillen 2026 talk. If a reference is missing or a link breaks, drop me a line via LinkedIn (linked below).

Module 01: Linear expansion on the galactic disk (Ø 100,000 ly), simplified.
Module 02: Drake equation (Drake 1961, Hanson 1996, Ward/Brownlee 2000) — parameters are estimates.
Module 03: 42 films & series, curated selection 1927–2025. The tendency classification is interpretive.
Module 04: Dark Forest after Liu Cixin; all civilizations identical & purely stochastic.
Module 05: Growth curves and energy projection — illustrative, not forecasting.
Module 06: AI Prisoner's Dilemma as a single-player game, 30 rounds. Risk dynamics & thresholds from the matching Python model (parameter sweep, ~150-player multiplayer). Player weight artificially boosted to 10 % so choices feel impactful. Score is log-scaled (raw 100 → 800, 10k → 1000).
More modules to follow.
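The log-scaled score in the Module 06 note can be sketched as follows. The note gives only two anchor points (raw 100 → 800, raw 10k → 1000). A base-10 logarithm through those two points is one mapping consistent with them, but the module's actual formula is not stated, so treat this as an assumption.

```python
import math


def display_score(raw: float) -> float:
    """Map a raw score to the displayed score on a log10 scale.

    Fitted through the two stated points: 100 -> 800 and 10,000 -> 1000,
    which gives 100 display points per factor of ten in the raw score.
    """
    if raw <= 0:
        raise ValueError("raw score must be positive for a log scale")
    return 600.0 + 100.0 * math.log10(raw)


for raw in (100, 1_000, 10_000):
    print(f"raw {raw:>6} -> displayed {display_score(raw):.0f}")
```

On such a scale a tenfold better raw result moves the displayed score by the same fixed step, which keeps very large raw values readable.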

If this page made you think — or argue with me — that's the point. Find me on LinkedIn or come talk to me after the session.

Michael Tenner · Full Stack Power BI Engineer LinkedIn →