Protocol Overview

The Problem

Current AI agent communication protocols prioritize expressiveness over efficiency. The result:

  • MCP serializes all tool schemas into every conversation turn. With 30 tools, roughly 3,600 tokens per turn are burned on definitions alone — regardless of whether they are used. In enterprise scenarios (100+ tools), 72% of the context window is consumed before the model sees the first user message.
  • A2A (Google) solves agent-to-agent coordination but inherits the same verbose schema pattern via JSON and full Agent Cards in every interaction.

Reference Data

Metric                       MCP native              mcp2cli               NEKTE
Cost per tool (discovery)    ~121 tokens/turn        ~16 tokens (once)     ~8 tokens (once)
Cost per invocation          ~121 tokens + payload   ~120 tokens (first)   0 tokens (cached)
30 tools × 10 turns          36,310 tokens           1,734 tokens          <800 tokens
120 tools × 25 turns         362,000 tokens          ~5,000 tokens         <2,000 tokens
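The scaling in the table can be reproduced with a back-of-the-envelope model. This is an illustrative sketch only, using the per-schema and per-catalog-entry costs assumed above (~121 and ~8 tokens); the function names are not part of any protocol.

```python
def eager_cost(tools: int, turns: int, tokens_per_schema: int = 121) -> int:
    """Eager loading: every tool schema is re-serialized into every turn."""
    return tools * tokens_per_schema * turns

def lazy_cost(tools: int, tokens_per_entry: int = 8) -> int:
    """Lazy L0 discovery: each capability is listed once, at catalog cost."""
    return tools * tokens_per_entry

print(eager_cost(30, 10))   # 36300, matching the table's ~36k figure
print(lazy_cost(30))        # 240, well under the <800-token budget
```

The key difference is the `turns` multiplier: eager schemas grow linearly with conversation length, while a one-time catalog does not.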

Competitive Landscape

MCP (Anthropic, 2024)

Agent-to-Tool. De facto standard for connecting LLMs to external tools. 10K+ servers, 97M SDK downloads. Problem: eager schema loading, immature security, unsustainable overhead at scale.

A2A (Google - Linux Foundation, 2025)

Agent-to-Agent. 100+ partners (Salesforce, SAP, AWS). Under Linux Foundation governance. Covers discovery, delegation, peer-to-peer coordination. Enterprise critical mass. Does not address token efficiency. Supports task lifecycle (cancel, resume) but with polling-based status queries.

RTK (rtk-ai, 2026)

CLI proxy that compresses terminal output before it reaches the agent’s context. 17K+ GitHub stars. 60-90% reduction in command output tokens. Operates at the shell level, not the protocol level.

mcp2cli / CLIHub (Open Source, 2026)

Converts MCP servers to CLI with on-demand discovery. 96-99% savings on schema tokens. Proves that lazy discovery works, but as a hack — not a formal protocol.

Feature Comparison

Feature                  MCP                   A2A                       RTK          NEKTE
Focus                    Agent-to-Tool         Agent-to-Agent            CLI output   Agent-to-Agent (efficient)
Discovery                Eager (all upfront)   Full Agent Card           N/A          Progressive L0/L1/L2
Zero-schema invocation   No                    No                        N/A          Yes (version hash)
Native token budget      No                    No                        N/A          Yes (first-class)
Result compression       No                    No                        Yes (CLI)    Yes (minimal/compact/full)
Context permissions      No                    Limited                   N/A          Envelopes with TTL
Verification             No                    No                        N/A          Native primitive
Task lifecycle           No                    Cancel/resume (polling)   N/A          Cancel/suspend/resume (push)
gRPC transport           No                    Yes                       N/A          Yes (native)
MCP Bridge               N/A                   N/A                       N/A          @nekte/bridge

Design Principles

Token Budget as a First-Class Citizen

Every NEKTE message includes a budget field indicating the tokens available for the response. The receiving agent MUST respect this budget, adapting the granularity of its response.

{
  "budget": {
    "max_tokens": 500,
    "detail_level": "compact"
  }
}

The three detail levels:

  • minimal — A single-line string representation (~4 tokens)
  • compact — Structured data with abbreviated fields (~12 tokens)
  • full — Complete analysis with explanations (~200 tokens)
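A receiver honoring the budget might resolve the final detail level roughly like this. The helper name and fallback logic are assumptions for illustration; only the `budget` fields and the ~4/~12/~200 token estimates come from the text above.

```python
# Estimated response cost per detail level (figures from the list above).
LEVEL_COST = {"minimal": 4, "compact": 12, "full": 200}

def resolve_level(budget: dict) -> str:
    """Pick the richest level that satisfies both the requested
    detail_level hint and the hard max_tokens cap."""
    requested = budget.get("detail_level", "full")
    max_tokens = budget.get("max_tokens", float("inf"))
    order = ["full", "compact", "minimal"]  # richest to cheapest
    for level in order[order.index(requested):]:
        if LEVEL_COST[level] <= max_tokens:
            return level
    return "minimal"  # always expressible in a handful of tokens

print(resolve_level({"max_tokens": 500, "detail_level": "compact"}))  # compact
print(resolve_level({"max_tokens": 10}))                              # minimal
```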

Progressive Discovery (Lazy Discovery)

No agent loads full schemas by default. Discovery occurs at three resolution levels, and the consuming agent decides how much it needs:

Level              What You Get                        Estimated Cost
L0 — Catalog       List of names + categories          ~8 tokens/capability
L1 — Summary       Description + main inputs/outputs   ~40 tokens/capability
L2 — Full Schema   Typed JSON Schema with examples     ~120 tokens/capability

Zero-Schema Invocation

If an agent already knows a capability from a previous interaction, it can invoke it directly using a version hash. The receiver validates the hash — if it matches, execution proceeds without re-sending the schema. If the hash is stale, the updated schema is returned in the error response, eliminating extra round-trips.
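The receiver-side flow described above can be sketched as follows. The registry contents, field names (`capability`, `version_hash`), and error shape are assumptions for illustration; only the hash-check-then-execute-or-return-schema behavior comes from the text.

```python
# Toy capability registry: hash identifies the current schema version.
REGISTRY = {"sentiment.analyze": {"hash": "a3f9", "schema": {"input": "text"}}}

def invoke(capability: str, version_hash: str, args: dict) -> dict:
    entry = REGISTRY.get(capability)
    if entry is None:
        return {"error": {"code": "unknown_capability"}}
    if entry["hash"] != version_hash:
        # Stale hash: ship the fresh schema inside the error itself,
        # so the caller can retry without an extra discovery round-trip.
        return {"error": {"code": "stale_hash",
                          "hash": entry["hash"],
                          "schema": entry["schema"]}}
    # Hash matches: execute with zero schema bytes on the wire.
    return {"result": {"minimal": "ok"}}

print(invoke("sentiment.analyze", "a3f9", {"text": "great"}))
print(invoke("sentiment.analyze", "old1", {})["error"]["code"])  # stale_hash
```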

Semantic Result Compression

Results are returned at the detail level requested by the budget:

{
  "result": {
    "minimal": "positive 0.87",
    "compact": { "sentiment": "positive", "score": 0.87 },
    "full": { "sentiment": "positive", "score": 0.87, "explanation": "..." }
  },
  "resolved_level": "compact"
}
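The JSON above shows all three representations side by side for illustration. One plausible reading, sketched here as an assumption rather than spec behavior, is that a server builds the representations internally but emits only the one matching the resolved level, so a single compact form crosses the wire:

```python
def compress(result: dict, resolved_level: str) -> dict:
    """Keep only the representation the budget resolved to."""
    return {"result": {resolved_level: result[resolved_level]},
            "resolved_level": resolved_level}

representations = {
    "minimal": "positive 0.87",
    "compact": {"sentiment": "positive", "score": 0.87},
    "full": {"sentiment": "positive", "score": 0.87, "explanation": "..."},
}
print(compress(representations, "compact"))
```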

Transport Agnostic

NEKTE is transport agnostic (HTTP, gRPC, WebSocket, stdio) but opinionated about format:

  • JSON-RPC 2.0 as the envelope
  • gRPC with Protobuf for high-performance wire format
  • Compact fields by default, expanded only with detail_level: "full"
  • No schemas on the wire unless explicitly requested
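Putting the format rules together, a full request envelope might look like the sketch below. The method name `nekte.invoke` and the params layout are assumptions; the JSON-RPC 2.0 envelope, budget field, version hash, and compact-by-default serialization are from this document.

```python
import json

envelope = {
    "jsonrpc": "2.0",
    "id": 42,
    "method": "nekte.invoke",          # hypothetical method name
    "params": {
        "capability": "sentiment.analyze",
        "version_hash": "a3f9",        # zero-schema invocation
        "budget": {"max_tokens": 500, "detail_level": "compact"},
        "args": {"text": "I love this protocol"},
    },
}
# Compact fields by default: no whitespace on the wire.
wire = json.dumps(envelope, separators=(",", ":"))
print(wire)
```

Note what the envelope does not contain: no tool schema, no Agent Card, no capability description — only the hash and the arguments.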

Positioning

NEKTE does not compete with existing protocols — it complements them:

  • + MCP: MCP connects agents to tools. NEKTE connects agents to agents efficiently. The bridge enables instant adoption.
  • + A2A: Where A2A prioritizes enterprise governance, NEKTE prioritizes token efficiency. For startups, indie devs, and high-volume apps.
  • + RTK: RTK compresses CLI output. NEKTE compresses protocol overhead. Different layers, fully combinable.