Mr. Chatterbox is a (weak) Victorian-era ethically trained model you can run on your own computer

Blog · Apr 19, 2026

Highlights

▸'Ethically trained' (out-of-copyright text only) is becoming a real product category: not a slogan, but a legal position backed by a working model
▸Mr. Chatterbox is weak on purpose: the point isn't conversation quality, it's showing that training only on text published through 1899 yields a usable, if limited, model from a dataset that costs roughly $0
▸As lawsuits mount against frontier labs, expect more 'clean-corpus' models even at reduced capability, as a pragmatic response to legal risk

Original excerpt


Trip Venturella released Mr. Chatterbox, a language model trained entirely on out-of-copyright text from the British Library. Here’s how he describes it in the model card:

Mr. Chatterbox is a language model trained entirely from scratch on a corpus of over 28,000 Victorian-era British texts published between 1837 and 1899, drawn from a dataset made available by the British Library. The model has absolutely no training inputs from after 1899 — the vocabulary and ideas are formed exclusively from nineteenth-century literature. Mr. Chatterbox’s training corpus was…

Read full article on simonwillison.net

Frequently asked questions

What is "Mr. Chatterbox is a (weak) Victorian-era ethically trained model you can run on your own computer" about?

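This article by Simon Willison covers Mr. Chatterbox, a language model that Trip Venturella trained from scratch on out-of-copyright Victorian-era text from the British Library. It is part of the Simon Willison reading list on Burn 451, filed under LLM blogs.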

Who wrote "Mr. Chatterbox is a (weak) Victorian-era ethically trained model you can run on your own computer"?

Simon Willison wrote the article, which is part of the Simon Willison vault on Burn 451. The full text and attribution are at the source link on simonwillison.net.

How can I read more content from Simon Willison?

The complete Simon Willison reading list is available at burn451.cloud/vault/simon-willison. Each article includes an AI-generated summary so you can decide what to read in seconds. Connect the Burn 451 MCP server to Claude or Cursor to query all Simon Willison articles as live AI context.

Can I use "Mr. Chatterbox is a (weak) Victorian-era ethically trained model you can run on your own computer" with Claude or Cursor?

Yes. Install the burn-mcp-server npm package and connect it to Claude Desktop, Claude Code, or Cursor. Once connected, your AI can search and reference this article and the full Simon Willison vault in real time — no manual copy-paste required.



Content attributed to the original author. Burn 451 curates publicly available writing as a reading index. For removal requests, contact @hawking520.