Human-level performance in 3D multiplayer games with population-based reinforcement learning

Blog · Demis Hassabis · May 11, 2026

AI Summary

This 2019 Science paper describes capture-the-flag (CTF) agents that reach human-level performance, and can beat strong human players, in a fast-paced 3D first-person multiplayer game. The task demands teamwork, spatial navigation, and reactive coordination, a combination that earlier game-playing systems had not handled. The key technical innovation is population-based training: instead of training a single agent, the method trains an entire population of agents with varying skills and strategies that compete against each other, then selects and reproduces the most successful variants. This acts as a form of natural selection and produces diverse, adaptive strategies.

The CTF paper is significant in Hassabis's arc because it extends strong AI game play beyond games of perfect information (Go, chess) to imperfect information, 3D spatial environments, and multi-agent coordination. The agents learn human-like behaviors, such as coordinating attacks with teammates, setting ambushes, and defending flag carriers, not through explicit programming but through reward maximization. It was a proof point that RL can produce complex multi-agent coordination without any pre-specified social norms or communication protocols.
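The compete, select, and reproduce loop described above can be illustrated with a toy sketch. This is not the paper's implementation: the real agents are neural networks trained in a 3D game, whereas here each agent is reduced to a single `skill` number and one evolvable `learning_rate`. The names `Agent`, `play_match`, and `train_population`, the logistic win model, and the toy learning rule are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    skill: float          # hypothetical stand-in for learned policy weights
    learning_rate: float  # hyperparameter that the population evolves
    wins: int = 0


def play_match(a, b, rng):
    """Return the winner; the more skilled agent wins more often (assumed logistic model)."""
    p_a_wins = 1.0 / (1.0 + 10 ** ((b.skill - a.skill) / 4.0))
    return a if rng.random() < p_a_wins else b


def train_population(pop_size=10, generations=20, matches=200, seed=0):
    rng = random.Random(seed)
    population = [Agent(skill=0.0, learning_rate=rng.uniform(0.01, 1.0))
                  for _ in range(pop_size)]
    for _ in range(generations):
        for agent in population:
            agent.wins = 0
        # Agents play each other rather than a fixed opponent.
        for _ in range(matches):
            a, b = rng.sample(population, 2)
            winner = play_match(a, b, rng)
            winner.wins += 1
        # Toy "learning" step: the gain peaks at a learning rate of 0.5.
        for agent in population:
            agent.skill += agent.learning_rate * (1.0 - agent.learning_rate)
        # Selection and reproduction: the weakest fifth copy a top performer
        # and perturb its hyperparameter.
        population.sort(key=lambda ag: ag.wins, reverse=True)
        cutoff = max(1, pop_size // 5)
        for loser in population[-cutoff:]:
            parent = rng.choice(population[:cutoff])
            loser.skill = parent.skill
            mutated = parent.learning_rate * rng.choice([0.8, 1.2])
            loser.learning_rate = min(1.0, max(0.01, mutated))
    return population
```

Calling `train_population()` returns the final population sorted by wins in the last generation. Ranking by match results rather than by `skill` directly mirrors the idea that selection pressure comes from competition inside the population.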

Original excerpt

Superhuman performance in 3D capture-the-flag — teamwork, spatial navigation, reactive coordination — via population-based reinforcement learning. Extends DeepMind's approach beyond board games to embodied multi-agent environments.

Frequently asked questions

Who wrote "Human-level performance in 3D multiplayer games with population-based reinforcement learning"?

"Human-level performance in 3D multiplayer games with population-based reinforcement learning" was written by Demis Hassabis. It is curated in the Demis Hassabis vault on Burn 451, which covers AGI, AlphaFold, and scientific discovery.

How can I read more content from Demis Hassabis?

The complete Demis Hassabis reading list is available at burn451.cloud/vault/demis-hassabis. Each article includes an AI-generated summary so you can decide what to read in seconds. Connect the Burn 451 MCP server to Claude or Cursor to query all Demis Hassabis articles as live AI context.

Can I use "Human-level performance in 3D multiplayer games with population-based reinforcement learning" with Claude or Cursor?

Yes. Install the burn-mcp-server npm package and connect it to Claude Desktop, Claude Code, or Cursor. Once connected, your AI can search and reference this article and the full Demis Hassabis vault in real time — no manual copy-paste required.

Content attributed to the original author (Demis Hassabis). Burn 451 curates publicly available writing as a reading index. For removal requests, contact @hawking520.