Benchmarks

Token Savings by Scenario

These benchmarks measure total tokens consumed across a full conversation, including discovery, invocation, and response overhead.

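| Scenario | MCP native | mcp2cli | NEKTE | NEKTE vs MCP |
| --- | --- | --- | --- | --- |
| 5 tools x 5 turns | 3,025 | 655 | 345 | -89% |
| 15 tools x 10 turns | 18,150 | 1,390 | 730 | -96% |
| 30 tools x 15 turns | 54,450 | 2,205 | 1,155 | -98% |
| 50 tools x 20 turns | 121,000 | 3,100 | 1,620 | -99% |
| 100 tools x 25 turns | 302,500 | 4,475 | 2,325 | -99% |
| 200 tools x 30 turns | 726,000 | 6,650 | 3,430 | ~-100% |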

Why the Difference?

MCP serializes all tool schemas into every conversation turn. With 30 tools at ~121 tokens each, that is 3,630 tokens per turn for definitions alone, whether or not any tool is used.
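
The MCP native column in the scenario table follows from this behaviour: every turn re-sends every schema, so the total is tools x tokens per schema x turns. A minimal sketch of that arithmetic, assuming the ~121-token average holds for every tool (the function name is illustrative and not part of any MCP or NEKTE API):

```python
SCHEMA_TOKENS = 121  # approximate tokens per tool schema, taken from the text


def mcp_native_tokens(tools: int, turns: int, schema_tokens: int = SCHEMA_TOKENS) -> int:
    """Total schema tokens when every turn carries every tool definition."""
    return tools * schema_tokens * turns


# Recompute a few rows of the scenario table.
for tools, turns in [(5, 5), (30, 15), (200, 30)]:
    print(f"{tools} tools x {turns} turns: {mcp_native_tokens(tools, turns):,} tokens")
```

Because the cost is a product of tool count and turn count, it grows much faster than either input alone.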

mcp2cli converts MCP servers to a CLI with on-demand discovery, which saves 78-99% in the scenarios above (96-99% from 30 tools upward). However, it is a workaround layered on top of MCP, not a formal protocol.

NEKTE achieves savings through three mechanisms:

  1. Progressive discovery (L0/L1/L2): Only fetch what you need. L0 catalog costs ~8 tokens per capability instead of ~121.
  2. Zero-schema invocation: After the first discovery, the version hash lets you invoke without re-sending schemas. Cost: 0 extra tokens.
  3. Semantic result compression: Results at minimal detail level use ~4 tokens instead of ~200 for full responses.
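
As a rough cost model of the first two mechanisms: the NEKTE column in the scenario table is reproduced exactly by a one-time L0 catalog charge of 8 tokens per capability plus a flat 61 tokens per turn. The 61-token figure is inferred from the table by subtraction and is not stated by the source, so treat it as an assumption rather than a documented constant.

```python
L0_TOKENS_PER_CAPABILITY = 8  # from the text: L0 catalog cost, paid once
PER_TURN_TOKENS = 61          # ASSUMPTION: fitted to the scenario table, not documented
MCP_SCHEMA_TOKENS = 121       # from the text: approximate tokens per MCP tool schema


def nekte_tokens(tools: int, turns: int) -> int:
    """One-time catalog discovery plus a flat per-turn overhead."""
    return tools * L0_TOKENS_PER_CAPABILITY + turns * PER_TURN_TOKENS


def savings_vs_mcp(tools: int, turns: int) -> float:
    """Fraction of MCP native tokens avoided under this model."""
    mcp = tools * MCP_SCHEMA_TOKENS * turns
    return 1 - nekte_tokens(tools, turns) / mcp


for tools, turns in [(5, 5), (50, 20), (200, 30)]:
    print(f"{tools} tools x {turns} turns: {nekte_tokens(tools, turns):,} tokens, "
          f"{savings_vs_mcp(tools, turns):.1%} saved")
```

Under this model the discovery cost scales with the number of tools and the invocation cost with the number of turns, but the two are added rather than multiplied.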

Per-Operation Token Costs

| Operation | MCP | NEKTE | Savings |
| --- | --- | --- | --- |
| Discovery (per tool) | ~121 tokens/turn | ~8 tokens (once, L0) | -93% |
| Invocation overhead | ~121 tokens + payload | 0 tokens (cached hash) | -100% |
| Response (minimal) | Full response always | ~4 tokens | Variable |
| Response (compact) | Full response always | ~12 tokens | Variable |

Discovery Level Costs

| Level | What You Get | Cost per Capability |
| --- | --- | --- |
| L0 (Catalog) | Name + category + version hash | ~8 tokens |
| L1 (Summary) | Description + input/output types | ~40 tokens |
| L2 (Full Schema) | Typed JSON Schema + examples | ~120 tokens |
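
One way to read these costs is as a budget: fetch L0 for the whole catalog, then go deeper only for the capabilities you intend to call. The sketch below assumes the per-level costs are additive, so a capability taken to L2 has also paid for L0 and L1. The source does not say whether deeper levels add to or replace shallower ones, so that is an assumption, and the function name is illustrative.

```python
LEVEL_TOKENS = {"L0": 8, "L1": 40, "L2": 120}  # approximate costs from the table


def discovery_tokens(catalog_size: int, summarized: int, full_schema: int) -> int:
    """Discovery cost when only a subset of capabilities is expanded.

    catalog_size: capabilities listed at L0
    summarized:   of those, how many are also fetched at L1
    full_schema:  of those, how many are also fetched at L2
    """
    if not 0 <= full_schema <= summarized <= catalog_size:
        raise ValueError("expected full_schema <= summarized <= catalog_size")
    return (catalog_size * LEVEL_TOKENS["L0"]
            + summarized * LEVEL_TOKENS["L1"]
            + full_schema * LEVEL_TOKENS["L2"])


# 50 capabilities in the catalog, 5 inspected at L1, 2 taken to full schema.
print(discovery_tokens(50, 5, 2))
```

For comparison, a single MCP turn with 50 tools at ~121 tokens each carries about 6,050 schema tokens.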

Enterprise Cost Analysis

Scenario: 50 tools, 20 turns per conversation, 1,000 conversations per day.

| Protocol | Tokens/Day | Monthly Cost* | Monthly Savings |
| --- | --- | --- | --- |
| MCP native | 121,000,000 | $10,890 | - |
| mcp2cli | 3,100,000 | $279 | $10,611 |
| NEKTE | 1,620,000 | $146 | $10,744 |

* Based on GPT-4 class pricing at $3/1M input tokens + $9/1M output tokens.
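
The monthly figures can be reproduced from the tokens-per-day column. This sketch assumes a 30-day month and prices every token at the $3/1M input rate, which is what the table's numbers work out to. The output rate is not applied here, and the function name is illustrative.

```python
def monthly_cost_usd(tokens_per_day: int, days: int = 30,
                     usd_per_million: float = 3.0) -> float:
    """Monthly spend for a steady daily token volume at a flat per-token rate."""
    return tokens_per_day * days * usd_per_million / 1_000_000


daily_tokens = {"MCP native": 121_000_000, "mcp2cli": 3_100_000, "NEKTE": 1_620_000}
baseline = monthly_cost_usd(daily_tokens["MCP native"])

for protocol, tokens in daily_tokens.items():
    cost = monthly_cost_usd(tokens)
    print(f"{protocol}: ${cost:,.0f}/month, ${baseline - cost:,.0f} saved")
```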

Scaling Analysis

The gap widens with scale:

| Scale | MCP Monthly | NEKTE Monthly | Savings |
| --- | --- | --- | --- |
| Startup (10 tools, 100 conv/day) | $109 | $3 | $106 |
| Growth (50 tools, 1K conv/day) | $10,890 | $146 | $10,744 |
| Enterprise (200 tools, 10K conv/day) | $653,400 | $3,087 | $650,313 |

At enterprise scale with 200 tools and 10,000 conversations per day, the savings exceed $650,000 per month.

Context Window Impact

Token savings are not just about cost. They directly affect model performance by freeing up the context window for actual reasoning.

| Tools | MCP Context Consumed | NEKTE Context Consumed |
| --- | --- | --- |
| 5 | 6% of 128K window | 0.3% |
| 30 | 28% of 128K window | 0.9% |
| 100 | 72% of 128K window | 1.8% |
| 200 | >100% (overflows) | 2.7% |
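
The NEKTE percentages line up with the whole-conversation totals from the scenario table divided by the window size. The sketch takes 128K to mean 128,000 tokens and pairs each tool count with the scenario that has the same number of tools. Both are assumptions, since the source does not state the basis. The MCP column does not follow from the same division, so it is left out.

```python
CONTEXT_WINDOW = 128_000  # ASSUMPTION: "128K" read as 128,000 tokens

# Tool count -> NEKTE total tokens for the matching scenario (assumed pairing).
NEKTE_TOTALS = {5: 345, 30: 1_155, 100: 2_325, 200: 3_430}


def context_share(tokens: int, window: int = CONTEXT_WINDOW) -> float:
    """Percentage of the context window occupied by the given token count."""
    return 100 * tokens / window


for tools, tokens in NEKTE_TOTALS.items():
    print(f"{tools} tools: {context_share(tokens):.1f}% of the window")
```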

Wire Format: JSON vs MessagePack

NEKTE supports optional MessagePack encoding for further wire-level savings:

| Payload | JSON (bytes) | MessagePack (bytes) | Savings |
| --- | --- | --- | --- |
| L0 discovery (3 caps) | 245 | 168 | -31% |
| Invoke request | 189 | 132 | -30% |
| Invoke response (compact) | 156 | 108 | -31% |
| Delegate stream event | 134 | 92 | -31% |

MessagePack provides a consistent ~30% reduction in wire size. Combined with gRPC + Protobuf, total wire overhead drops further.
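
To see where the reduction comes from, the sketch below encodes one small payload both ways. It hand-rolls a tiny subset of MessagePack (nil, booleans, small unsigned integers, short strings, short arrays and maps) only to stay dependency-free. The payload and its field names are hypothetical and are not the NEKTE wire schema, so the sizes it prints will not match the table.

```python
import json
import struct


def pack(obj) -> bytes:
    """Encode a small subset of MessagePack; raises on anything unsupported."""
    if obj is None:
        return b"\xc0"                                   # nil
    if obj is True:
        return b"\xc3"                                   # true
    if obj is False:
        return b"\xc2"                                   # false
    if isinstance(obj, int):
        if 0 <= obj <= 0x7F:
            return bytes([obj])                          # positive fixint
        if 0 <= obj <= 0xFF:
            return b"\xcc" + bytes([obj])                # uint 8
        if 0 <= obj <= 0xFFFF:
            return b"\xcd" + struct.pack(">H", obj)      # uint 16, big-endian
        raise ValueError("integer outside the supported range")
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        if len(raw) <= 31:
            return bytes([0xA0 | len(raw)]) + raw        # fixstr
        if len(raw) <= 0xFF:
            return b"\xd9" + bytes([len(raw)]) + raw     # str 8
        raise ValueError("string too long for this sketch")
    if isinstance(obj, list) and len(obj) <= 15:
        return bytes([0x90 | len(obj)]) + b"".join(pack(x) for x in obj)  # fixarray
    if isinstance(obj, dict) and len(obj) <= 15:
        body = b"".join(pack(k) + pack(v) for k, v in obj.items())
        return bytes([0x80 | len(obj)]) + body           # fixmap
    raise TypeError(f"unsupported value: {obj!r}")


# HYPOTHETICAL payload: a three-capability catalog with made-up field names.
payload = {
    "caps": [
        {"id": "search", "cat": "web", "h": "a1b2c3d4"},
        {"id": "translate", "cat": "nlp", "h": "e5f6a7b8"},
        {"id": "summarize", "cat": "nlp", "h": "c9d0e1f2"},
    ]
}

json_bytes = json.dumps(payload, separators=(",", ":")).encode("utf-8")
msgpack_bytes = pack(payload)
saving = 1 - len(msgpack_bytes) / len(json_bytes)
print(f"JSON: {len(json_bytes)} B, MessagePack: {len(msgpack_bytes)} B, saving: {saving:.0%}")
```

The saving comes from replacing quotes, colons, commas, and brackets with single-byte length prefixes. A maintained MessagePack library would normally do the encoding in real code.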

Running the Benchmarks

# Clone and build
git clone https://github.com/nekte-protocol/nekte.git
cd nekte && pnpm install && pnpm build
# Run token comparison benchmarks
pnpm benchmark
# Run JSON vs MessagePack size comparison
nekte bench http://localhost:4001

The benchmark suite generates synthetic workloads at various tool counts and turn counts, measuring total tokens consumed across the full conversation lifecycle.