OpenAI's Approach to AI Safety
AI Summary
"OpenAI's Approach to AI Safety" (2023, updated 2024) is the company's most detailed public statement of how it thinks about the alignment problem. The document describes a four-stage priority ordering for AI behavior: broadly safe (corrigible, honest, working within human oversight), broadly ethical (avoiding harmful actions even when instructed), adherent to OpenAI's principles, and effective at the assigned task. The ordering is intentional, safety first and effectiveness last, and Altman has cited it repeatedly in public when defending decisions that appear to trade capability for controllability. The document introduces "superalignment" as a research program: using AI to help solve AI alignment, on the premise that human researchers cannot keep pace with rapidly improving AI capabilities without AI assistance. Critics from the alignment community, including those who resigned from OpenAI's safety team, have argued that superalignment is circular (you need aligned AI to align AI) and that dissolving the safety team structure undermined the program in practice. The document is essential context for understanding the governance crisis of November 2023: board members who voted to remove Altman cited safety concerns, while Altman's supporters argued that the company's safety practices were rigorous. This document is the artifact those competing claims refer to.
Original excerpt
The four-stage safety model, the superalignment research program, and the document at the center of the governance crisis. Altman's safety thesis versus the critics who resigned.
Frequently asked questions
What is "OpenAI's Approach to AI Safety" about?
"OpenAI's Approach to AI Safety" (2023, updated 2024) is the company's most detailed public statement of how it thinks about the alignment problem. The document describes a four-stage priority ordering for AI behavior: broadly safe (corrigible, honest, working within human oversight), broad…
Who wrote "OpenAI's Approach to AI Safety"?
"OpenAI's Approach to AI Safety" was written by Sam Altman. It is curated in the Sam Altman vault on Burn 451, which covers agi · openai strategy · the intelligence age.
How can I read more content from Sam Altman?
The complete Sam Altman reading list is available at burn451.cloud/vault/sam-altman. Each article includes an AI-generated summary so you can decide what to read in seconds. Connect the Burn 451 MCP server to Claude or Cursor to query all Sam Altman articles as live AI context.
Can I use "OpenAI's Approach to AI Safety" with Claude or Cursor?
Yes. Install the burn-mcp-server npm package and connect it to Claude Desktop, Claude Code, or Cursor. Once connected, your AI can search and reference this article and the full Sam Altman vault in real time — no manual copy-paste required.
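As a minimal sketch, the Claude Desktop connection usually means adding an entry to claude_desktop_config.json. The example below assumes burn-mcp-server follows the common MCP convention of being launched via npx; the server name ("burn451") is illustrative and the exact command and arguments may differ, so check the burn-mcp-server README for the authoritative invocation.

```json
{
  "mcpServers": {
    "burn451": {
      "command": "npx",
      "args": ["-y", "burn-mcp-server"]
    }
  }
}
```

After restarting Claude Desktop, the server's search tools should appear in the client and queries against the Sam Altman vault can run without manual copy-paste.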
26 more articles in this vault.
Import the full Sam Altman vault to Burn 451 and build your own knowledge base.
Content attributed to the original author (Sam Altman). Burn 451 curates publicly available writing as a reading index. For removal requests, contact @hawking520.