
Building Burn's MCP Server: 3 Patterns That Actually Work

April 22, 2026 · 10 min read

MCP has been open-source for over a year now. Cursor shipped Bugbot MCP support earlier this month. Anthropic published a 2026 roadmap that explicitly moves spec evolution out of fixed release cadences and into working groups. And my own burn-mcp-server just crossed a thousand installs on npm. Somewhere in the middle of all that, MCP stopped being "the new Anthropic thing" and became the default way I think about exposing any product to an agent.

I build a read-later app called Burn 451. About a month ago I shipped burn-mcp-server on npm: 26 tools that let an agent read, triage, and curate my saved articles. Last 7 days: 349 installs. Last 30 days: 1,152. Most of that is self-use plus a handful of early users. The adoption curve is not the point of this post. The point is that somewhere between the first tool and the twenty-sixth, I made three design calls that kept the thing from collapsing under its own weight, and I want to write those down while the code is still fresh.

I am not a professional engineer. I ship with Claude Code, I read my own diffs, I break things, I fix them. If you are building your first MCP server you will probably make the opposite call on at least one of these, and that is fine; the point is to know why you made it.

How should you expose your data to an MCP client: as tools or as resources?

Tools win for agents that act, resources win for read-only dumps, and in practice almost every MCP client I care about treats tools as first-class and resources as an afterthought. I went tool-first for Burn and I would do it again.

Here is what the code actually looks like. burn-mcp-server registers 26 server.tool() calls and exactly 2 server.resource() entries. The resources expose burn://vault/bookmarks and burn://vault/categories: bulk JSON dumps of the user's permanent knowledge base. They exist because the MCP spec says they should. In 30 days of production use I have not once seen Claude Code, Cursor, or Windsurf actually prefer a resource read over a tool call, even when both are wired up to return the same data.

The reason is simple. A tool has a schema. The agent knows what to pass, what it will get back, and what the side effects are. A resource is a URI you fetch; the agent has to guess whether fetching it helps. Claude Desktop will happily list your resources in the UI, but Claude Code in the terminal, which is where agents actually work, ignores resources most of the time unless you explicitly prompt the model to go look at them.

So I pushed everything that an agent might want to call into tools. search_vault with a query and limit. list_flame for the inbox. get_flame_detail for full article extraction. Even the bulk views got tools (list_vault, list_categories) because a tool call with an explicit filter is cheaper than dumping the whole vault as a resource and asking the model to grep it.
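To make the schema point concrete, here is a hypothetical sketch in plain TypeScript (not the actual burn-mcp-server source, and deliberately not using the MCP SDK) of what the agent effectively sees in each case: a tool descriptor carries an input contract it can check arguments against, while a resource is only a URI and a label.

```typescript
// Illustrative shapes only; field names are assumptions, not the real package.
type ToolDescriptor = {
  name: string;
  description: string;
  // JSON-Schema-style contract: the agent knows what to pass before calling.
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string }>;
    required: string[];
  };
};

type ResourceDescriptor = {
  // A resource is just a URI plus a label; the agent must guess when to fetch it.
  uri: string;
  name: string;
};

const searchVault: ToolDescriptor = {
  name: "search_vault",
  description: "Search permanent bookmarks by title or tag",
  inputSchema: {
    type: "object",
    properties: { query: { type: "string" }, limit: { type: "number" } },
    required: ["query"],
  },
};

const vaultDump: ResourceDescriptor = {
  uri: "burn://vault/bookmarks",
  name: "All vault bookmarks (bulk JSON)",
};

// A tiny validator: the schema tells the agent up front which calls are legal.
function validateInput(
  tool: ToolDescriptor,
  args: Record<string, unknown>
): string[] {
  const errors: string[] = [];
  for (const key of tool.inputSchema.required) {
    if (!(key in args)) errors.push(`missing required argument: ${key}`);
  }
  for (const [key, value] of Object.entries(args)) {
    const spec = tool.inputSchema.properties[key];
    if (!spec) errors.push(`unknown argument: ${key}`);
    else if (typeof value !== spec.type) errors.push(`${key} should be ${spec.type}`);
  }
  return errors;
}
```

Nothing equivalent exists for the resource: there is no contract to validate against, which is exactly why clients lean on tools.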

The one place resources still pull their weight is discoverability. When someone plugs burn-mcp-server into Claude Desktop for the first time and opens the MCP inspector, the two resources show up as a hint: "this server has a vault, here is what's in it." They are not the workhorse. They are the welcome mat.

If you are designing an MCP server from scratch, the rule I follow now: every action goes in a tool, every read that needs a filter goes in a tool, and you only reach for resources when you want a stable URI that external tools can reference by name. Everything else is noise.

Where should the filtering and business logic live: in the client or on the server?

On the server, every time, even when the client is perfectly capable of doing it. This is the call I made earliest and the one I am most glad about.

When I first sketched burn-mcp-server, I thought about thin server + thick client: expose raw Supabase queries, let the agent figure out filters, let each MCP client (Claude Code, Cursor, Windsurf, future ones I haven't seen yet) build its own opinionated UX on top. That lasted about one afternoon. The problem is that "let the agent figure out filters" is a line item that shows up in every agent prompt, on every client, forever. Each client has a slightly different personality. Each agent has a different definition of "recent." The status flow, Flame (24h inbox) → Spark (read, 30-day lifespan) → Vault (permanent) → Ash (expired), is load-bearing for how Burn works, and letting a model reinvent it at the edge is how you ship bugs users cannot reproduce.
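One way to see why this flow belongs on the server: it is small enough to write down as a transition table. The sketch below is illustrative, not the real burn-mcp-server internals; in particular, exactly which states may expire to Ash is my assumption from the descriptions above (Vault is permanent, so nothing leaves it).

```typescript
// Hypothetical server-side transition table for Burn's status flow.
type Status = "flame" | "spark" | "vault" | "ash";

// Assumed legal moves: Flame -> Spark (read), Spark -> Vault (keep),
// Flame/Spark -> Ash (expired). Vault is permanent; Ash is terminal.
const TRANSITIONS: Record<Status, Status[]> = {
  flame: ["spark", "ash"],
  spark: ["vault", "ash"],
  vault: [], // permanent: nothing leaves the Vault
  ash: [],   // expired: terminal state
};

function canMove(from: Status, to: Status): boolean {
  return TRANSITIONS[from].includes(to);
}
```

One table, one implementation, one place to change, instead of a paraphrase of the rules living in every client's prompt.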

So burn-mcp-server carries the weight. Every tool has a single, explicit purpose. list_flame only returns status=active bookmarks and computes remainingHours, isBurning (≤ 6h), and isCritical (≤ 1h) on the server. The agent does not need to know the countdown expires at countdown_expires_at; the tool tells it "this one burns in 2.3 hours, that one in 47 minutes." Same with search_vault: the client passes a query string, the server handles the title ilike, the tag fallback, and the deduplication. The response is shaped for an agent to reason about, not for a developer to query.
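The countdown shaping is simple enough to sketch. Field and threshold names below follow the post (remainingHours, isBurning at ≤ 6h, isCritical at ≤ 1h); the exact implementation in burn-mcp-server is assumed, not quoted.

```typescript
// Hedged sketch: turn a raw countdown_expires_at timestamp into the
// agent-friendly fields described above, entirely on the server.
function shapeFlame(countdownExpiresAt: string, now: Date = new Date()) {
  const msLeft = new Date(countdownExpiresAt).getTime() - now.getTime();
  const remainingHours = Math.max(0, msLeft / 3_600_000);
  return {
    remainingHours: Math.round(remainingHours * 10) / 10, // e.g. 2.3
    isBurning: remainingHours <= 6,
    isCritical: remainingHours <= 1,
  };
}
```

The client never sees the raw timestamp semantics; every client gets the same definition of "burning."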

The most useful piece of this pattern is a helper I wrote called verifyBookmark(id, expectedStatus). Any tool that moves a bookmark through the status flow (move_flame_to_spark, move_spark_to_vault, move_flame_to_ash) calls it first. If the agent tries to promote a bookmark that is already in the Vault, the tool returns a readable error: "Bookmark is in Vault (expected Flame)". The agent recovers, the state stays consistent, and I do not spend evenings debugging "why did a Vault entry lose its category." Pushing this into the server means one implementation, one test surface, one place to change.

There is a commit I keep coming back to as a reminder of why this matters. Early on I let the server depend on whatever Supabase project the caller happened to point at. A config drift bug later (commit 03da444, "correct Supabase project ref in MCP server, add zod dependency") reminded me that the server is the ground truth. Clients should not even have the option of pointing at the wrong database. zod showed up in that same commit because I added runtime validation on every tool input; a schema in the SDK is nice, but trusting it is not.

Thin client, thick server. If you are writing your own MCP server and you find yourself thinking "I'll let the agent handle this case," stop and put it in the server.

How do you debug a server whose errors the client never shows you?

You log every tool call on the server, from the first line of every handler, and you keep the logs local-only so users are not shipping their reading history to your analytics pipeline.

This one hit me during the week I shipped MCP 2.0 with write capabilities (commit e21d4f5, "MCP 2.0 write capabilities + free tier + Web Collection + remove Cluster"). Suddenly agents were calling move_spark_to_vault, create_collection, and write_bookmark_analysis: real writes, real state changes. Any one of them could fail silently. Claude Code in particular has a habit of quietly retrying a failed tool call with slightly different args, which looks like nothing to the user, and like panic to me when I check Supabase an hour later and see three half-written collections.

The observability layer is not fancy. It is console.error at the top of every handler, a sliding-window rate limiter that logs every cap hit (RATE_LIMIT_MAX_CALLS = 30 per minute), and a session cache at ~/.burn/mcp-session.json that records when tokens were exchanged and refreshed. console.error, because stdio MCP servers use stdout for the protocol and stderr for everything else; I tripped over that on day one, wrote a textResult with a leading log line, watched Claude Code crash with a JSON parse error, fixed it, and moved on.
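For flavor, here is what a sliding-window limiter like that can look like. The constant names mirror the post; the internals are my own assumption, not the shipped code.

```typescript
// Hypothetical sliding-window rate limiter: 30 calls per 60-second window.
const RATE_LIMIT_MAX_CALLS = 30;
const WINDOW_MS = 60_000;

// Timestamps (ms) of recent calls, in-memory per MCP session.
const callTimes: number[] = [];

function checkRateLimit(now: number): { allowed: boolean; retryAfterSec?: number } {
  // Drop timestamps that have slid out of the 60-second window.
  while (callTimes.length > 0 && now - callTimes[0] >= WINDOW_MS) callTimes.shift();
  if (callTimes.length >= RATE_LIMIT_MAX_CALLS) {
    const retryAfterSec = Math.ceil((callTimes[0] + WINDOW_MS - now) / 1000);
    // Log the cap hit to stderr: stdout is reserved for the MCP protocol.
    console.error(`rate limit hit, retry after ${retryAfterSec}s`);
    return { allowed: false, retryAfterSec };
  }
  callTimes.push(now);
  return { allowed: true };
}
```

Note the console.error: on a stdio transport, a single stray log line on stdout corrupts the JSON stream, so stderr is the only safe channel.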

What matters is that the server writes its own audit trail. When a user tells me "the agent said it moved 10 Flames to Spark but nothing happened," I do not need them to reproduce. I ask them to check ~/.burn/ for the session cache and pull the last 50 stderr lines from their Claude Code logs. Nine times out of ten the answer is obvious: token expired mid-batch, rate limit hit at bookmark 7, Supabase RLS rejected a move because the user changed accounts between launches. A separate commit (9519cbb, "fix: MCP auth independent session + rescue quota display") came directly out of one of those reports, where the session cache and the quota display were stepping on each other.

There is a second half to this that I care about more than the first. I do not send any of this to a server I control. No Sentry, no PostHog, no backend analytics hook. The audit trail lives on the user's machine, behind 0o600 file permissions, and it deletes itself when the cache rotates. An MCP server runs on the user's local stdio; the agent sees their full reading history; the last thing I want is a phone-home log that records which articles they asked about. If you are building an MCP server that touches any kind of personal data, resist the urge to wire in observability that leaves the machine. Your future self, and your users, will thank you.

FAQ

What is an MCP server?

An MCP (Model Context Protocol) server is a process that exposes tools, resources, and prompts to an AI agent over a standard protocol. The client (Claude Desktop, Claude Code, Cursor, Windsurf) connects to the server and calls tools as if they were local functions. MCP is the equivalent of a REST API for agents, except the schema is self-describing and the agent discovers the tools at connection time. Anthropic open-sourced the protocol in late 2024, and by 2026 it has become the de facto standard across the major agent-facing editors.

How do I install burn-mcp-server?

Run `npx burn-mcp-server` after setting BURN_MCP_TOKEN in your environment. Get the token from Burn App → Settings → MCP Server. For Claude Desktop, add an entry to mcpServers in your config pointing at burn-mcp-server with the token in env. The package is on npm and works with Claude Code, Cursor, and Windsurf without changes.
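A minimal Claude Desktop config entry following the standard mcpServers format; the server key name ("burn") is arbitrary, and the token value is a placeholder:

```json
{
  "mcpServers": {
    "burn": {
      "command": "npx",
      "args": ["burn-mcp-server"],
      "env": { "BURN_MCP_TOKEN": "your-token-here" }
    }
  }
}
```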

Why did you go tool-first instead of resource-first?

Because every MCP client I tested (Claude Code, Cursor, Windsurf, Claude Desktop) prefers tools in practice. Tools carry a schema the agent can reason about; resources are URIs the agent has to decide to fetch. For 26 operations across search, triage, curation, and analysis, tool-first keeps behavior consistent across clients. Resources still earn their keep as welcome-mat hints (burn://vault/bookmarks, burn://vault/categories), but not as the main interface.

How do you keep the server stateless across restarts?

Session caching at ~/.burn/mcp-session.json with 0o600 permissions. On first run the server exchanges the long-lived MCP token for a short-lived Supabase session, then caches both the access token and refresh token. Subsequent launches restore the session from disk with zero network calls. Supabase's onAuthStateChange listener writes fresh tokens back to the cache when they auto-refresh. If the cache is corrupt or the refresh fails, the server falls back to a full exchange.
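A sketch of that cache, assuming Node's fs with an explicit 0o600 mode on write; the cached field names are my assumption, since the real cache shape is not published.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";
import * as os from "node:os";

const CACHE_PATH = path.join(os.homedir(), ".burn", "mcp-session.json");

// Assumed shape: the real cache likely stores more than this.
type CachedSession = { accessToken: string; refreshToken: string; savedAt: string };

function saveSession(session: CachedSession, cachePath: string = CACHE_PATH): void {
  fs.mkdirSync(path.dirname(cachePath), { recursive: true });
  // mode 0o600: readable and writable by the owner only (applied at creation).
  fs.writeFileSync(cachePath, JSON.stringify(session), { mode: 0o600 });
}

function loadSession(cachePath: string = CACHE_PATH): CachedSession | null {
  try {
    return JSON.parse(fs.readFileSync(cachePath, "utf8")) as CachedSession;
  } catch {
    // Corrupt or missing cache: caller falls back to a full token exchange.
    return null;
  }
}
```

The null fallback is the important part: a bad cache degrades to one extra network round-trip instead of a crash.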

What happens if an agent hits the rate limit?

The rate limiter is a sliding 60-second window with a 30-call cap per MCP session, held in memory. When an agent trips it, the tool returns a readable message ("Rate limit exceeded (30 calls/min). Retry after 12s.") instead of silently failing. Claude Code and Cursor both handle this gracefully; the agent usually pauses and retries. The cap exists to keep requests to Supabase within a sane budget, not to block abuse.

Is this the right time to build an MCP server?

Yes, if you already have a product with a data model worth exposing. The protocol has been open-source for over a year, Anthropic has moved spec evolution into working groups rather than fixed releases, and the same server I ship on npm runs on Claude Code, Cursor, and Windsurf without per-client forks. Tooling is mature enough that you spend your time on the data model, not on protocol plumbing. No, if you are building the server before the product: the 26 tools in burn-mcp-server are a direct mapping of the app's existing status flow, and trying to design tools without a product underneath is a quick path to an agent surface no one uses.

Further reading on Burn

Ship your own MCP server. If you want mine, npx burn-mcp-server: 26 tools, MIT, works with whatever client you already have open.

Turn your read-later pile into agent-ready tools.

Try Burn 451 Free