Sleep-time Compute

Blog · Letta · Jun 14, 2025

AI Summary

In April 2025, Letta announced sleep-time compute: a second agent that runs during idle time and rewrites the primary agent's memory blocks based on recent interactions. The claim: test-time scaling is expensive because users are waiting, so push the thinking into moments when nobody is. The reported benchmarks show performance gains on math and reasoning tasks at lower per-query latency. Harrison Chase later picked up the same idea in his Sequoia interview. This is the clearest articulation of why stateful agents are an economic improvement, not just a UX one.
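The split between a fast primary agent and an idle-time rewriter can be pictured with a small sketch. This is not Letta's API. Every name here (MemoryBlock, AgentState, handle_query, sleep_time_step, naive_rewrite) is hypothetical, and the rewrite step is a plain string join standing in for what would be a model call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MemoryBlock:
    """A labeled piece of state the primary agent reads on every query (hypothetical)."""
    label: str
    value: str = ""


@dataclass
class AgentState:
    memory: MemoryBlock
    # Interactions seen by the primary agent but not yet folded into memory.
    pending: List[str] = field(default_factory=list)


def handle_query(state: AgentState, message: str) -> str:
    """Primary agent: replies immediately from existing memory and never rewrites it."""
    state.pending.append(message)
    context = state.memory.value or "empty"
    return f"[context: {context}] reply to: {message}"


def sleep_time_step(state: AgentState, rewrite: Callable[[str, List[str]], str]) -> bool:
    """Second agent: meant to run only while idle; folds pending interactions into memory."""
    if not state.pending:
        return False  # nothing new to consolidate
    state.memory.value = rewrite(state.memory.value, state.pending)
    state.pending.clear()
    return True


def naive_rewrite(old: str, recent: List[str]) -> str:
    """Assumption: stand-in for an LLM rewrite; it only concatenates text."""
    parts = ([old] if old else []) + list(recent)
    return "; ".join(parts)


state = AgentState(memory=MemoryBlock(label="human"))
first = handle_query(state, "I prefer metric units")
consolidated = sleep_time_step(state, naive_rewrite)  # called when no user is waiting
second = handle_query(state, "How tall is Everest?")
print(first)
print(second)
```

In this sketch the user-facing path (handle_query) does no memory work at all, which is the economic point of the summary above: the expensive step is moved to sleep_time_step, where latency costs nobody anything.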

Original excerpt

Letta Developer Platform: Use the Letta API to build agents that can actually remember and learn about your users over time. Open source, production ready, and fully model-agnostic.

Letta Code: Letta Code is a memory-first coding harness, built on top of the…

Content attributed to the original author (Letta). Burn 451 curates publicly available writing as a reading index. For removal requests, contact @hawking520.