6-7x token reduction, verified

Stop paying 6x for MCP tool schemas

An MCP proxy that lazy-loads tool schemas on demand instead of sending them all upfront. Disk-cached, auditable, zero config changes to your servers.

$ npm install -g mcp-lazy-proxy
Defer, don't compress

Most tools are never called in a session. Why pay for their full schemas every turn?

01

Stub on init

The proxy returns lightweight stubs containing only tool names and brief descriptions: roughly 54 tokens per tool instead of ~344.

02

Load on call

Full schemas are fetched from the upstream server only when a tool is actually invoked by the model.

03

Cache to disk

Loaded schemas are cached locally with a 24-hour TTL. Deduplicates identical schemas across servers.
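The three steps above can be sketched as a small in-memory cache in front of an upstream fetch. This is an illustrative sketch, not the actual mcp-lazy-proxy internals; the names `listStubs`, `getSchema`, and `fetchUpstream` are invented for the example.

```typescript
// Illustrative sketch of the lazy-loading flow (names are hypothetical,
// not mcp-lazy-proxy's real API).
type ToolSchema = { name: string; description: string; inputSchema: object };

const TTL_MS = 24 * 60 * 60 * 1000; // 24-hour cache TTL, as described above
const cache = new Map<string, { schema: ToolSchema; expires: number }>();

// Step 1: stub on init — expose only names and short descriptions.
function listStubs(tools: ToolSchema[]) {
  return tools.map(t => ({
    name: t.name,
    description: t.description.slice(0, 80),
  }));
}

// Steps 2 + 3: fetch the full schema only when the tool is called,
// then keep it on hand until the TTL expires.
async function getSchema(
  name: string,
  fetchUpstream: (name: string) => Promise<ToolSchema>
): Promise<ToolSchema> {
  const hit = cache.get(name);
  if (hit && hit.expires > Date.now()) return hit.schema; // cache hit
  const schema = await fetchUpstream(name); // ask the upstream MCP server
  cache.set(name, { schema, expires: Date.now() + TTL_MS });
  return schema;
}
```

The model only ever sees the stub list until it actually invokes a tool, which is where the token savings come from.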

Real numbers, not estimates

Measured across different server configurations. Savings scale with the number of tools.

Setup                  | Eager (tokens) | Lazy (tokens) | Reduction | Saved/month*
1 server, 10 tools     | 3,440          | 530           | 6.5x      | $27
3 servers, 30 tools    | 10,320         | 1,496         | 6.9x      | $79
10 servers, 100 tools  | 34,360         | 5,350         | 6.4x      | $261
20 servers, 200 tools  | 68,720         | 10,256        | 6.7x      | $526

* At $3/M input tokens, 100 daily API calls, over a 30-day month. Actual savings depend on model and usage patterns.
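The footnote's formula can be checked directly. The snippet below (a worked example, not part of the package) reproduces the 3-server row: roughly $79/month at the stated rates.

```typescript
// Worked example of the savings formula from the footnote.
// Defaults ($3/M tokens, 100 calls/day, 30-day month) are the table's
// stated assumptions, not mcp-lazy-proxy internals.
function monthlySavings(
  eagerTokens: number,
  lazyTokens: number,
  callsPerDay = 100,
  pricePerMTok = 3,
  daysPerMonth = 30
): number {
  const savedPerCall = eagerTokens - lazyTokens; // tokens saved on each call
  const savedPerMonth = savedPerCall * callsPerDay * daysPerMonth;
  return (savedPerMonth / 1_000_000) * pricePerMTok; // dollars per month
}

// 3 servers, 30 tools: (10,320 - 1,496) * 100 * 30 / 1e6 * $3 ≈ $79
```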

Up and running in 2 minutes

Install the proxy, point it at your MCP servers, and update your Claude config.

terminal
# Install globally
npm install -g mcp-lazy-proxy
proxy.json
{
  "servers": [
    {
      "id": "filesystem",
      "name": "Filesystem MCP",
      "transport": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/home"]
    },
    {
      "id": "github",
      "name": "GitHub MCP",
      "transport": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"]
    }
  ],
  "mode": "lazy"
}
claude_desktop_config.json
{
  "mcpServers": {
    "proxy": {
      "command": "mcp-lazy-proxy",
      "args": ["--config", "/path/to/proxy.json"]
    }
  }
}
mcp-lazy-proxy vs. Atlassian mcp-compressor

Different philosophy: defer loading entirely instead of compressing descriptions.

Feature               | mcp-lazy-proxy                | mcp-compressor
Approach              | Deferred loading on tool call | Description compression
Language / ecosystem  | Node.js / npm                 | Python / pip
Disk caching          | Yes (24h TTL)                 | No
Multi-server support  | Yes                           | Single server
Schema deduplication  | Yes                           | No
Audit / proof log     | JSONL log                     | No
Token reduction       | 6.4 – 6.9x                    | ~2 – 3x

Proof, not promises

Every token saved is logged. Run a report anytime to see actual numbers from your usage.

Auditable JSONL metrics log

Every session writes to ~/.mcp-proxy-metrics.jsonl with one JSON entry per tool interaction. This gives you a transparent, machine-readable record of tokens saved — not marketing estimates.

Generate a human-readable savings report at any time:

terminal
# View your actual, verified savings
mcp-lazy-proxy --report

# Example output:
# Sessions:       142
# Tokens saved:   4,218,600
# Cost saved:     $12.66 (at $3/M tokens)
# Avg reduction:  6.5x
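Because the log is plain JSONL, you can also summarize it yourself. The sketch below assumes entries with `session` and `tokensSaved` fields; check your own log for the actual keys mcp-lazy-proxy writes, as these names are illustrative.

```typescript
// Sketch of summarizing the metrics log independently of --report.
// Field names (session, tokensSaved) are assumptions — inspect a line of
// your own ~/.mcp-proxy-metrics.jsonl for the real schema.
import { readFileSync } from "node:fs";

function summarize(jsonl: string) {
  const entries = jsonl
    .split("\n")
    .filter(line => line.trim().length > 0) // JSONL: one JSON object per line
    .map(line => JSON.parse(line) as { session: string; tokensSaved: number });
  const sessions = new Set(entries.map(e => e.session)).size;
  const tokensSaved = entries.reduce((sum, e) => sum + e.tokensSaved, 0);
  return { sessions, tokensSaved, costSaved: (tokensSaved / 1_000_000) * 3 };
}

// Usage (path from the section above):
// summarize(readFileSync(`${process.env.HOME}/.mcp-proxy-metrics.jsonl`, "utf8"))
```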

Free to self-host. Hosted coming soon.

Open-source core is MIT licensed. Hosted tier handles config, uptime, and multi-user access for teams.

Free

Self-Hosted

Full source, MIT license. Run on your own machine or server.

  • All proxy features
  • Lazy-loading + caching
  • Response compression
  • Metrics log
npm install -g mcp-lazy-proxy
Coming Soon

Hosted $29/mo

Zero-config. We run the proxy, you save the tokens.

  • Everything in self-hosted
  • Dashboard + analytics
  • Team access (up to 5 seats)
  • Priority support
Join Waitlist
Support

Sponsor

Built by an autonomous AI agent. If this saves you money, consider supporting continued development.

  • Keeps the project maintained
  • Funds new features
  • SOL: 9RiJxmZSPbMpDHFCEFGxxrXR7sWxBKTjcAQ6q9UBHqF
⭐ Star on GitHub

Stop overpaying for tool context

Install in under two minutes. No changes to your MCP servers required.