Problem
mcp serve already does smart resource management — shared backends, lazy init, adaptive idle shutdown, kill_on_drop — but only for clients that stay connected to the proxy (Claude Code, Cursor, Windsurf, etc.). Users running the CLI directly from the terminal get none of that.
Every mcp <server> <tool> ... invocation is a fresh process lifecycle:
- Spawn stdio backend (npx -y slack-mcp-server@latest ...).
- Wait for initialization (handshake, tool discovery).
- Send one tools/call.
- Tear everything down.
For stdio backends that are slow to boot — Node/npx packages, Python servers with heavy imports, chrome-devtools/mobile-mcp that hold device sessions — this is a pure latency tax on every call, and it discards state that is expensive to rebuild (browser tabs, auth handshakes, warm caches inside the server). A script that calls the same server 10 times pays the startup cost 10 times.
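A rough illustration, assuming a slack stdio backend is configured in servers.json (tool arguments elided):

```sh
# Each command below is a complete lifecycle: spawn slack-mcp-server via npx,
# wait for the MCP handshake and tool discovery, run one tools/call, tear down.
time mcp slack list_channels
sleep 2
time mcp slack list_channels   # pays the full startup cost again; nothing is reused
```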
It also makes some backends effectively unusable from the CLI: anything that only becomes meaningful after the first call (opened browser, connected device, resolved context) is a cold MCP every time.
Desired outcome
A per-user background daemon that:
- Keeps warm stdio backends alive between isolated CLI invocations, so mcp slack list_channels followed 2s later by mcp slack send_message reuses the same child process (see the sketch after this list).
- Participates in the same lifecycle rules mcp serve already has (lazy init, adaptive idle shutdown, zero orphans on restart/crash).
- Is observable and controllable — start, stop, restart, status, list of managed backends.
- Is opt-in per server (or per user), so ephemeral one-off calls don't pay for a daemon they don't need.
- Coexists with mcp serve: if the proxy is already running a backend, the CLI should reuse that instead of spawning a second copy.
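A hypothetical terminal session to make this concrete; the mcp daemon subcommand names are placeholders, since the exact CLI surface is out of scope below:

```sh
mcp daemon start            # placeholder name: launch the per-user daemon
mcp slack list_channels     # first call warms the slack backend inside the daemon
sleep 2
mcp slack send_message ...  # reuses the same warm child process instead of respawning
mcp daemon status           # placeholder name: list managed backends and their idle timers
mcp daemon stop             # placeholder name: shut down the daemon and its warm backends
```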
Out of scope
- Exact CLI surface (mcp daemon start|stop|status|...), IPC mechanism (unix domain socket vs named pipe on Windows), and interaction with mcp serve — to be decided in design.
- Per-backend lifecycle policy syntax in servers.json (keep-alive vs ephemeral).
- HTTP backends (not affected — daemon is specifically for stdio).
Related
Prior art
mcporter 0.9.0 ships a per-login daemon (mcporter daemon start|stop|status|restart) with opt-in via "lifecycle": "keep-alive" per server. See: https://github.com/steipete/mcporter
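Purely for illustration, a sketch of what a per-server opt-in could look like in servers.json; the field name echoes mcporter's lifecycle setting, but both the field and the surrounding structure are hypothetical here, and the real syntax is deliberately deferred to design (see the out-of-scope list above):

```json
{
  "slack": {
    "command": "npx",
    "args": ["-y", "slack-mcp-server@latest"],
    "lifecycle": "keep-alive"
  }
}
```

Servers without the field would keep today's spawn-per-invocation behavior, so one-off calls never pay for a daemon they don't need.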