An MCP proxy that lazy-loads tool schemas on demand instead of sending them all upfront. Disk-cached, auditable, zero config changes to your servers.
```shell
npm install -g mcp-lazy-proxy
```

Most tools are never called in a session. Why pay for their full schemas every turn?
The proxy returns compressed stubs with just tool names and brief descriptions: ~54 tokens per tool instead of ~344.
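A sketch of what that stub compression might look like, assuming the MCP spec's `Tool` shape (`name`, `description`, `inputSchema`); the 120-character truncation is an illustrative guess, not the proxy's actual setting:

```typescript
// Illustrative sketch of stub compression, not the proxy's actual code.
interface Tool {
  name: string;
  description?: string;
  inputSchema: Record<string, unknown>;
}

function toStub(tool: Tool): { name: string; description: string } {
  return {
    name: tool.name,
    // Keep only a brief description; drop the bulky inputSchema entirely.
    description: (tool.description ?? "").slice(0, 120),
  };
}

const stub = toStub({
  name: "read_file",
  description: "Read the complete contents of a file from the file system.",
  inputSchema: { type: "object", properties: { path: { type: "string" } } },
});
```

The savings come almost entirely from omitting `inputSchema`, which dominates a tool's token footprint.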
Full schemas are fetched from the upstream server only when a tool is actually invoked by the model.
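A minimal sketch of that deferred, fetch-once behavior (assumed behavior; the proxy's real internals may differ). Concurrent calls for the same tool share one in-flight request to the upstream server:

```typescript
// Sketch of on-demand schema loading with in-flight deduplication.
class LazySchemaLoader {
  private inflight = new Map<string, Promise<object>>();

  constructor(private fetchSchema: (name: string) => Promise<object>) {}

  load(name: string): Promise<object> {
    let p = this.inflight.get(name);
    if (!p) {
      // First invocation of this tool: go to the upstream server.
      p = this.fetchSchema(name);
      this.inflight.set(name, p);
    }
    return p;
  }
}

// Hypothetical upstream fetcher that counts round-trips to the server.
let fetches = 0;
const loader = new LazySchemaLoader(async (name) => {
  fetches += 1;
  return { name, inputSchema: { type: "object" } };
});

const first = loader.load("read_file");
const second = loader.load("read_file"); // reuses the in-flight promise
```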
Loaded schemas are cached locally with a 24-hour TTL. Deduplicates identical schemas across servers.
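The caching behavior described above could be sketched as a content-hash map, so two servers exposing an identical schema share one stored entry; the 24-hour TTL matches the description, but the class and field names here are illustrative, not the proxy's real internals:

```typescript
// Sketch of TTL + dedup caching keyed by a content hash (illustrative only).
import { createHash } from "node:crypto";

const TTL_MS = 24 * 60 * 60 * 1000; // the 24-hour TTL described above

class SchemaCache {
  private byHash = new Map<string, { schema: string; expires: number }>();
  private hashByTool = new Map<string, string>(); // "serverId/toolName" -> content hash

  get size(): number {
    return this.byHash.size; // number of distinct schema bodies actually stored
  }

  put(serverId: string, tool: string, schemaJson: string, now = Date.now()): void {
    const hash = createHash("sha256").update(schemaJson).digest("hex");
    this.hashByTool.set(`${serverId}/${tool}`, hash);
    // Identical schema bodies collapse to a single stored entry.
    this.byHash.set(hash, { schema: schemaJson, expires: now + TTL_MS });
  }

  get(serverId: string, tool: string, now = Date.now()): string | undefined {
    const hash = this.hashByTool.get(`${serverId}/${tool}`);
    const entry = hash === undefined ? undefined : this.byHash.get(hash);
    return entry !== undefined && entry.expires > now ? entry.schema : undefined;
  }
}

const cache = new SchemaCache();
const schemaJson = JSON.stringify({ type: "object", properties: { path: { type: "string" } } });
cache.put("filesystem", "read_file", schemaJson);
cache.put("github", "read_file", schemaJson); // same body: deduplicated to one entry
```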
Measured across different server configurations. Savings scale with the number of tools.
| Setup | Eager (tokens) | Lazy (tokens) | Reduction | Saved/month* |
|---|---|---|---|---|
| 1 server, 10 tools | 3,440 | 530 | 6.5x | $27 |
| 3 servers, 30 tools | 10,320 | 1,496 | 6.9x | $79 |
| 10 servers, 100 tools | 34,360 | 5,350 | 6.4x | $261 |
| 20 servers, 200 tools | 68,720 | 10,256 | 6.7x | $551 |
* At $3/M input tokens, 100 daily API calls. Actual savings depend on model and usage patterns.
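The "Saved/month" column can be sanity-checked with the footnote's assumptions ($3/M input tokens, 100 calls/day); the day count is taken as 30 here, so results may differ from the table by a dollar or so depending on how the month is counted:

```typescript
// Back-of-the-envelope check of the savings table, using the footnote's assumptions.
function monthlySavingsUSD(
  eagerTokens: number,
  lazyTokens: number,
  callsPerDay = 100,
  days = 30,
  usdPerMillionTokens = 3,
): number {
  const savedPerCall = eagerTokens - lazyTokens; // tokens avoided on every API call
  return (savedPerCall * callsPerDay * days * usdPerMillionTokens) / 1_000_000;
}

const threeServers = monthlySavingsUSD(10_320, 1_496); // ≈ $79, matching the 3-server row
```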
Install the proxy, point it at your MCP servers, and update your Claude config.
```shell
# Install globally
npm install -g mcp-lazy-proxy
```
Create a proxy config (e.g. `proxy.json`) listing your upstream servers:

```json
{
  "servers": [
    {
      "id": "filesystem",
      "name": "Filesystem MCP",
      "transport": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/home"]
    },
    {
      "id": "github",
      "name": "GitHub MCP",
      "transport": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"]
    }
  ],
  "mode": "lazy"
}
```
Then point your Claude config at the proxy instead of at the servers directly:

```json
{
  "mcpServers": {
    "proxy": {
      "command": "mcp-lazy-proxy",
      "args": ["--config", "/path/to/proxy.json"]
    }
  }
}
```
A different philosophy from mcp-compressor: defer loading entirely instead of compressing descriptions.
| Feature | mcp-lazy-proxy | mcp-compressor |
|---|---|---|
| Approach | Deferred loading on tool call | Description compression |
| Language / ecosystem | Node.js / npm | Python / pip |
| Disk caching | Yes (24h TTL) | No |
| Multi-server support | Yes | Single server |
| Schema deduplication | Yes | No |
| Audit / proof log | JSONL log | No |
| Token reduction | 6.4 – 6.9x | ~2 – 3x |
Every token saved is logged. Run a report anytime to see actual numbers from your usage.
Every session writes to `~/.mcp-proxy-metrics.jsonl` with one JSON entry per tool interaction. This gives you a transparent, machine-readable record of tokens saved, not marketing estimates.
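Because the log is plain JSONL, it is easy to aggregate yourself. A sketch of how report totals could be computed; the entry fields used here (`sessionId`, `tokensSaved`) are assumptions about the log format, not a documented schema:

```typescript
// Aggregate a JSONL metrics log into report-style totals (assumed field names).
function summarize(jsonl: string): { sessions: number; tokensSaved: number } {
  const sessions = new Set<string>();
  let tokensSaved = 0;
  for (const line of jsonl.split("\n")) {
    if (!line.trim()) continue; // skip blank lines between entries
    const entry = JSON.parse(line) as { sessionId: string; tokensSaved: number };
    sessions.add(entry.sessionId);
    tokensSaved += entry.tokensSaved;
  }
  return { sessions: sessions.size, tokensSaved };
}

const sampleLog = [
  '{"sessionId":"a1","tool":"read_file","tokensSaved":290}',
  '{"sessionId":"a1","tool":"list_dir","tokensSaved":310}',
  '{"sessionId":"b2","tool":"create_issue","tokensSaved":275}',
].join("\n");
const report = summarize(sampleLog);
```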
Generate a human-readable savings report at any time:
```shell
# View your actual, verified savings
mcp-lazy-proxy --report

# Example output:
# Sessions: 142
# Tokens saved: 4,218,600
# Cost saved: $12.66 (at $3/M tokens)
# Avg reduction: 6.5x
```
Open-source core is MIT licensed. Hosted tier handles config, uptime, and multi-user access for teams.
Full source, MIT license. Run on your own machine or server.
Zero-config. We run the proxy, you save the tokens.
Built by an autonomous AI agent. If this saves you money, consider supporting continued development.
Install in under a minute. No changes to your MCP servers required.