
LLM Knowledge Base

A curated, machine-readable knowledge collection that LLMs can query directly — no RAG pipeline required.

Term popularized by Andrej Karpathy (@karpathy), early 2026.

What it is, why now

An LLM knowledge base is a structured collection of curated content that large language models can access and reason over directly. Unlike RAG (Retrieval-Augmented Generation), which requires vector databases, embedding pipelines, and chunking strategies, an LLM knowledge base works through structured data formats — Markdown, JSON, or MCP tools — that fit inside a model's context window or are queryable via tool use.
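The load-it-all approach can be sketched in a few lines: concatenate a directory of curated Markdown documents into one prompt context, with no embeddings, chunking, or retriever. The directory layout, size budget, and function name here are illustrative assumptions, not a prescribed implementation.

```python
from pathlib import Path

def build_context(kb_dir: str, max_chars: int = 400_000) -> str:
    """Concatenate every Markdown document in a curated knowledge base
    into a single prompt string, stopping at a rough size budget.

    A hypothetical sketch: real setups would count tokens, not characters,
    and might order documents by relevance rather than filename.
    """
    parts: list[str] = []
    total = 0
    for doc in sorted(Path(kb_dir).glob("*.md")):
        text = doc.read_text(encoding="utf-8")
        entry = f"## {doc.stem}\n\n{text}\n"
        if total + len(entry) > max_chars:
            break  # budget exhausted; remaining documents are skipped
        parts.append(entry)
        total += len(entry)
    return "\n".join(parts)
```

The resulting string is pasted (or piped via API) ahead of the user's question; at hundreds of documents this fits comfortably in a modern context window.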

The concept was crystallized by Andrej Karpathy in early 2026 when he described maintaining a personal 'LLM wiki' — a curated set of documents that he feeds to Claude or GPT when he needs deep answers on specific topics. His insight: the bottleneck isn't the model's intelligence, it's the quality of the knowledge you give it. Garbage in, garbage out. A carefully curated 50-article collection beats a 10,000-document RAG index.

For individual knowledge workers, this changes the economics of personal knowledge management. Instead of building elaborate Notion databases or Obsidian vaults that only you can search, you build a collection that both you and your AI agents can query. Burn 451's vault is exactly this: articles you've read and curated, with AI-generated metadata (summaries, key points, relevance scores), exposed through MCP so any AI agent can search your reading history.
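To make the idea concrete, here is a hedged sketch of what a vault entry and its query path might look like. The field names (`summary`, `key_points`, `relevance`) and the naive keyword scoring are illustrative assumptions, not Burn 451's actual schema or MCP interface.

```python
# Hypothetical vault: each curated article carries AI-generated metadata.
vault = [
    {
        "title": "Attention Is All You Need",
        "summary": "Introduces the Transformer architecture.",
        "key_points": ["self-attention", "no recurrence"],
        "relevance": 0.9,
    },
    {
        "title": "A History of Typewriters",
        "summary": "Traces typewriter design over a century.",
        "key_points": ["QWERTY layout"],
        "relevance": 0.5,
    },
]

def search(entries: list[dict], query: str) -> list[dict]:
    """Rank entries by keyword overlap against summary and key points,
    weighted by the stored relevance score. An MCP tool could expose
    exactly this function to any AI agent."""
    terms = query.lower().split()

    def score(entry: dict) -> float:
        haystack = " ".join([entry["summary"], *entry["key_points"]]).lower()
        hits = sum(t in haystack for t in terms)
        return hits * entry["relevance"]

    ranked = sorted(entries, key=score, reverse=True)
    return [e for e in ranked if score(e) > 0]
```

Because the metadata is plain structured text, the same collection stays human-browsable and agent-queryable without a separate index.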

The architectural advantage over RAG: no embedding drift, no chunking artifacts, no retrieval failures from semantic mismatches. The tradeoff: it works best at hundreds to low thousands of documents, not millions. For personal knowledge, that's the sweet spot.

How we got here

  1. 2023

    RAG becomes the default

    Retrieval-Augmented Generation dominates enterprise AI. Every knowledge base gets a vector database, embedding pipeline, and chunking strategy. Works at scale, but overkill for personal use.

  2. Late 2024

    Context windows expand dramatically

    Claude and GPT push context windows past 100K tokens. Suddenly, small curated collections can be loaded directly — no retrieval pipeline needed. The RAG-for-everything assumption starts cracking.

  3. Early 2025

    MCP creates a knowledge API standard

    Anthropic launches Model Context Protocol. Tools can expose structured data to any AI agent. This makes 'knowledge as a service' possible without custom integrations.

  4. Feb 2026

    Karpathy describes the LLM Wiki

    Andrej Karpathy shares his workflow of maintaining a curated document set specifically for LLM consumption. The post goes viral — 99K bookmarks on the original thread. The term 'LLM Wiki' enters common usage.

  5. Apr 2026

    Consumer tools adopt the pattern

    Burn 451's vault + MCP server implements the LLM Wiki pattern for non-technical users: read articles → curate to vault → AI agents can query your knowledge via MCP. No code required.

Want to read more like this?

Burn 451 is a reading tool that helps you actually finish articles instead of hoarding them. Import a Vault, set a timer, read what matters.

Concept page curated by @hawking520 · Burn 451 · Last updated 2026-04-18