Protocol Overview
The Problem
Current AI agent communication protocols prioritize expressiveness over efficiency. The result:
- MCP serializes all tool schemas into every conversation turn. With 30 tools, roughly 3,600 tokens per turn are burned on definitions alone — regardless of whether they are used. In enterprise scenarios (100+ tools), 72% of the context window is consumed before the model sees the first user message.
- A2A (Google) solves agent-to-agent coordination but inherits the same verbose schema pattern via JSON and full Agent Cards in every interaction.
Reference Data
| Metric | MCP native | mcp2cli | NEKTE |
|---|---|---|---|
| Cost per tool (discovery) | ~121 tokens/turn | ~16 tokens (once) | ~8 tokens (once) |
| Cost per invocation | ~121 tokens + payload | ~120 tokens (first) | 0 tokens (cached) |
| 30 tools x 10 turns | 36,310 tokens | 1,734 tokens | less than 800 tokens |
| 120 tools x 25 turns | 362,000 tokens | ~5,000 tokens | less than 2,000 tokens |
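As a sanity check on the table, the eager-loading rows follow from simple arithmetic. The sketch below is illustrative only; it assumes the ~121 tokens/schema figure from the table and ignores payload, which is why totals land slightly under the table's values:

```typescript
// Illustrative arithmetic for the discovery-cost table above.
// Assumes ~121 tokens per tool schema, re-sent every turn (eager model).
const TOKENS_PER_SCHEMA = 121;

function eagerCost(tools: number, turns: number): number {
  // Every schema is serialized into every turn.
  return tools * TOKENS_PER_SCHEMA * turns;
}

function lazyDiscoveryCost(tools: number, tokensPerDiscovery: number): number {
  // Each schema is paid for once, on first discovery.
  return tools * tokensPerDiscovery;
}

eagerCost(30, 10);          // 36,300: consistent with the table's 36,310
lazyDiscoveryCost(30, 16);  // 480: one-time mcp2cli-style discovery
```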
Competitive Landscape
MCP (Anthropic, 2024)
Agent-to-Tool. De facto standard for connecting LLMs to external tools. 10K+ servers, 97M SDK downloads. Problem: eager schema loading, immature security, unsustainable overhead at scale.
A2A (Google - Linux Foundation, 2025)
Agent-to-Agent. 100+ partners (Salesforce, SAP, AWS). Under Linux Foundation governance. Covers discovery, delegation, peer-to-peer coordination. Enterprise critical mass. Does not address token efficiency. Supports task lifecycle (cancel, resume) but with polling-based status queries.
RTK (rtk-ai, 2026)
CLI proxy that compresses terminal output before it reaches the agent’s context. 17K+ GitHub stars. 60-90% reduction in command output tokens. Operates at the shell level, not the protocol level.
mcp2cli / CLIHub (Open Source, 2026)
Converts MCP servers to CLI with on-demand discovery. 96-99% savings on schema tokens. Proves that lazy discovery works, but as a hack — not a formal protocol.
Feature Comparison
| Feature | MCP | A2A | RTK | NEKTE |
|---|---|---|---|---|
| Focus | Agent-to-Tool | Agent-to-Agent | CLI output | Agent-to-Agent (efficient) |
| Discovery | Eager (all upfront) | Full Agent Card | N/A | Progressive L0/L1/L2 |
| Zero-schema invocation | No | No | N/A | Yes (version hash) |
| Native token budget | No | No | N/A | Yes (first-class) |
| Result compression | No | No | Yes (CLI) | Yes (minimal/compact/full) |
| Context permissions | No | Limited | N/A | Envelopes with TTL |
| Verification | No | No | N/A | Native primitive |
| Task lifecycle | No | Cancel/resume (polling) | N/A | Cancel/suspend/resume (push) |
| gRPC transport | No | Yes | N/A | Yes (native) |
| MCP Bridge | N/A | N/A | N/A | @nekte/bridge |
Design Principles
Token Budget as a First-Class Citizen
Every NEKTE message includes a budget field indicating the tokens available for the response. The receiving agent MUST respect this budget, adapting the granularity of its response.
```json
{ "budget": { "max_tokens": 500, "detail_level": "compact" } }
```
The three detail levels:
- minimal — A single-line string representation (~4 tokens)
- compact — Structured data with abbreviated fields (~12 tokens)
- full — Complete analysis with explanations (~200 tokens)
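A responder might honor the budget as sketched below. The `DetailLevel` names and per-level costs come from the list above; the selection heuristic itself is an illustrative assumption, not part of the spec:

```typescript
type DetailLevel = "minimal" | "compact" | "full";

interface Budget {
  max_tokens: number;
  detail_level?: DetailLevel;
}

// Approximate response cost per level (figures from the list above).
const LEVEL_COST: Record<DetailLevel, number> = {
  minimal: 4,
  compact: 12,
  full: 200,
};

// Pick the richest level that fits the caller's budget, capped by an
// explicitly requested detail_level if one was sent.
function resolveLevel(budget: Budget): DetailLevel {
  const order: DetailLevel[] = ["full", "compact", "minimal"];
  const requested = budget.detail_level ?? "full";
  for (const level of order.slice(order.indexOf(requested))) {
    if (LEVEL_COST[level] <= budget.max_tokens) return level;
  }
  return "minimal"; // nothing fits: degrade to the cheapest form
}

resolveLevel({ max_tokens: 500, detail_level: "compact" }); // "compact"
resolveLevel({ max_tokens: 50 });                           // "compact" (full exceeds 50)
```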
Progressive Discovery (Lazy Discovery)
No agent loads full schemas by default. Discovery occurs at three resolution levels, and the consuming agent decides how much it needs:
| Level | What You Get | Estimated Cost |
|---|---|---|
| L0 — Catalog | List of names + categories | ~8 tokens/capability |
| L1 — Summary | Description + main inputs/outputs | ~40 tokens/capability |
| L2 — Full Schema | Typed JSON Schema with examples | ~120 tokens/capability |
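The escalation pattern behind the table can be modeled as a client that pays for resolution only when needed. The provider interface and method names below are assumptions for illustration; the protocol defines the levels, not this exact API:

```typescript
interface CatalogEntry { name: string; category: string }        // L0: ~8 tokens
interface Summary extends CatalogEntry { description: string }   // L1: ~40 tokens
interface FullSchema extends Summary { inputSchema: object }     // L2: ~120 tokens

// Hypothetical provider interface, one method per resolution level.
interface Provider {
  catalog(): CatalogEntry[];            // L0
  summary(name: string): Summary;       // L1
  fullSchema(name: string): FullSchema; // L2
}

// Typical flow: scan the cheap L0 catalog, escalate a single match to L1,
// and pay for the L2 schema only when the agent decides to invoke it.
function pickCapability(p: Provider, keyword: string): FullSchema | undefined {
  const hit = p.catalog().find((e) => e.name.includes(keyword));
  if (!hit) return undefined;
  const s = p.summary(hit.name);
  return p.fullSchema(s.name);
}
```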
Zero-Schema Invocation
If an agent already knows a capability from a previous interaction, it can invoke it directly using a version hash. The receiver validates the hash — if it matches, execution proceeds without re-sending the schema. If the hash is stale, the updated schema is returned in the error response, eliminating extra round-trips.
Semantic Result Compression
Results are returned at the detail level requested by the budget:
```json
{
  "result": {
    "minimal": "positive 0.87",
    "compact": { "sentiment": "positive", "score": 0.87 },
    "full": { "sentiment": "positive", "score": 0.87, "explanation": "..." }
  },
  "resolved_level": "compact"
}
```
Transport Agnostic
NEKTE is transport agnostic (HTTP, gRPC, WebSocket, stdio) but opinionated about format:
- JSON-RPC 2.0 as the envelope
- gRPC with Protobuf for high-performance wire format
- Compact fields by default, expanded only with `detail_level: "full"`
- No schemas on the wire unless explicitly requested
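Putting the format rules together, a request envelope might look like the sketch below. The method name `nekte.invoke` and the exact param fields are assumptions for illustration; the point is that the JSON-RPC 2.0 envelope carries a version hash and a budget, never a schema:

```typescript
// Hypothetical compact envelope: JSON-RPC 2.0 with a budget field and a
// version hash standing in for the full schema.
const envelope = {
  jsonrpc: "2.0",
  id: 1,
  method: "nekte.invoke", // assumed method name, not from the spec
  params: {
    capability: "sentiment.analyze",
    version_hash: "a1b2c3",
    args: { text: "great product" },
    budget: { max_tokens: 500, detail_level: "compact" },
  },
};

// No schema appears on the wire; the whole request stays small.
const wire = JSON.stringify(envelope);
```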
Positioning
NEKTE does not compete with existing protocols — it complements them:
- + MCP: MCP connects agents to tools. NEKTE connects agents to agents efficiently. The bridge enables instant adoption.
- + A2A: Where A2A prioritizes enterprise governance, NEKTE prioritizes token efficiency. For startups, indie devs, and high-volume apps.
- + RTK: RTK compresses CLI output. NEKTE compresses protocol overhead. Different layers, fully combinable.