
[EXPERIMENTAL][ADR-044] feat(grpc-uds): gRPC-over-UDS internal IPC for Rust MCP runtime#3726

Draft
gandhipratik203 wants to merge 4 commits into main from feat/grpc-uds-internal-ipc

Conversation


@gandhipratik203 commented Mar 18, 2026

Summary

Closes #3730

Experiment to explore the ideas in ADR-044 using an Axum dispatch strategy — companion to #3729 which takes the pure tonic approach. Together these two PRs represent two different ways of implementing the same gRPC-over-UDS boundary.

This PR's approach:

  • Adds McpRuntime gRPC service (Invoke, InvokeStream, CloseSession, HealthCheck) with Rust tonic server and Python grpc.aio client
  • McpRuntimeService wraps the existing Axum Router and dispatches each RPC into it via Tower oneshot
  • Gates all new Rust deps behind the grpc-uds Cargo feature flag; zero overhead when disabled
  • Wires RustMCPRuntimeGrpcProxy into main.py — selected when MCP_RUST_GRPC_UDS is set, HTTP proxy remains the default

ADR-044 Goals — What This Experiment Covers

The breakdown below shows which goals this implementation demonstrates, which are only partially demonstrated, and which remain future work.

ADR-044 goal coverage for this approach (Axum dispatch)

✅ Demonstrated

Typed contract via protobuf
The .proto file is the single source of truth. Python and Rust both generate stubs from it. If a field is added or a method renamed, both sides fail to compile before anything ships. The HTTP/JSON IPC boundary had no schema — a wrong header name would silently do nothing.
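As a sketch of what that single source of truth can look like: the four RPC names below come from this PR's description, while the message and field shapes are illustrative assumptions, not the actual proto in tools_rust/mcp_runtime/proto/.

```proto
syntax = "proto3";

package mcp_runtime;

// RPC names as listed in this PR; message shapes are assumptions.
service McpRuntime {
  rpc Invoke(McpRequest) returns (McpResponse);
  rpc InvokeStream(McpRequest) returns (stream McpChunk);
  rpc CloseSession(CloseSessionRequest) returns (CloseSessionResponse);
  rpc HealthCheck(HealthCheckRequest) returns (HealthCheckResponse);
}

message McpRequest {
  string method = 1;                // relayed HTTP method
  string path = 2;                  // relayed request path
  map<string, string> headers = 3;  // relayed headers
  bytes body = 4;                   // raw body bytes
  AuthContext auth_context = 5;     // structured auth context
  string session_id = 6;
}

// McpResponse, McpChunk, AuthContext, and the session/health messages
// are elided from this sketch.
```

Renaming a method or renumbering a field in a file like this forces regeneration on both sides, which is exactly the compile-time safety the HTTP/JSON boundary lacked.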

Language-neutral boundary
The McpRuntime proto contract does not care what is on either side. A Go module, a Java module, or a second Rust service could plug into the same contract today without any changes to the Python gateway. The HTTP/JSON IPC was Python-to-Rust only and relied on undocumented internal headers.

Native streaming RPCs
InvokeStream is a proper gRPC server-streaming RPC — not HTTP chunked encoding over a socket. The Python side gets an async iterator, the Rust side yields chunks. Backpressure, framing, and cancellation are handled correctly by construction.
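On the Python side that contract surfaces as a plain async iterator. A minimal stand-in sketch, using a stub generator in place of a real grpc.aio InvokeStream call:

```python
import asyncio

async def invoke_stream(request):
    """Stand-in for stub.InvokeStream(request): yields chunks like the RPC would."""
    for part in (b"event: message\n", b"data: {}\n", b"\n"):
        yield part

async def relay():
    chunks = []
    # The real proxy would forward each chunk to the ASGI send callable;
    # here we just collect them.
    async for chunk in invoke_stream(request=None):
        chunks.append(chunk)
    return b"".join(chunks)

print(asyncio.run(relay()))  # b'event: message\ndata: {}\n\n'
```

With a real grpc.aio stub the loop body is identical; cancellation propagates by simply breaking out of the `async for`.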

Unix Domain Socket — no TCP overhead
All traffic stays in kernel memory. No TCP handshake, no loopback routing, no port allocation. HTTP/2 multiplexing means multiple concurrent RPCs share one connection rather than opening new connections per request.
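The kernel-only hop is visible even without gRPC; a stdlib sketch of an AF_UNIX pair:

```python
import socket

# A connected AF_UNIX pair: bytes written on one end appear on the other
# without touching TCP handshakes, loopback routing, or port allocation.
parent, child = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
parent.sendall(b"ping")
print(child.recv(4))  # b'ping'
child.close()
parent.close()
```

gRPC-over-UDS layers HTTP/2 framing on exactly this kind of socket, which is where the multiplexing of concurrent RPCs comes from.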

Clean process boundary — crash isolation
If the Rust sidecar panics, Python keeps running and the gRPC channel reconnects. The proto contract is the explicit, versioned API surface — not an internal HTTP path that could drift silently.
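grpc.aio channels handle the reconnect themselves; the effect on callers can be sketched with a generic retry wrapper (a hypothetical helper, not code from this PR):

```python
import asyncio

async def call_with_retry(rpc, retries=3, delay=0.01):
    """Retry an async call, mimicking a channel that reconnects after a sidecar crash."""
    for attempt in range(retries):
        try:
            return await rpc()
        except ConnectionError:
            if attempt == retries - 1:
                raise
            await asyncio.sleep(delay)  # real code would back off / wait for channel readiness

# Simulate a sidecar that is down for the first call, then healthy again.
state = {"calls": 0}

async def flaky_invoke():
    state["calls"] += 1
    if state["calls"] == 1:
        raise ConnectionError("sidecar restarting")
    return "ok"

print(asyncio.run(call_with_retry(flaky_invoke)))  # ok
```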

Aligns with the existing plugin gRPC pattern
The external plugin framework in mcpgateway/plugins/framework/external/grpc/ already uses gRPC. The MCP runtime boundary now uses the same pattern — one mental model for module boundaries across the platform.


⚠️ Partially demonstrated

Independent scaling of modules
The process boundary exists — Rust is a separate binary on a separate socket. But in the current container setup they are co-located. The groundwork is there; exploiting it requires running the sidecar as a separate container or pod.

Structured auth context propagation
The AuthContext proto message is a better contract than the x-contextforge-auth-context header string used in HTTP/JSON IPC. The current implementation encodes it as base64 JSON inside the proto field — which works, but does not yet use the proto's ability to carry it as proper typed fields.
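A sketch of that interim base64-JSON encoding; the dict keys in the example are assumptions, not the gateway's actual auth-context fields:

```python
import base64
import json

def encode_auth_context(ctx: dict) -> str:
    """Pack the auth context as base64(JSON), the interim encoding this PR
    carries inside the AuthContext proto field."""
    return base64.b64encode(json.dumps(ctx).encode()).decode()

def decode_auth_context(blob: str) -> dict:
    return json.loads(base64.b64decode(blob))

ctx = {"user": "alice", "scopes": ["tools.invoke"]}
assert decode_auth_context(encode_auth_context(ctx)) == ctx
```

Promoting these keys to typed proto fields would let the compiler, rather than a JSON round-trip, enforce the shape.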


❌ Not yet demonstrated

Catalog change subscriptions and session broadcast
ADR-044 specifically calls out streaming RPCs for catalog change notifications and session broadcast patterns. InvokeStream currently handles only SSE relay. The push patterns the ADR envisioned are not implemented.

Multi-language beyond Python and Rust
The value of language-neutral codegen becomes concrete when a third language plugs into the same contract. Right now it is two languages that could have communicated over HTTP just fine. The payoff is visible only when a Go or Java module joins.

Changes

Rust sidecar (tools_rust/mcp_runtime/)

  • proto/mcp_runtime.proto — service contract
  • src/grpc.rs — tonic server; dispatches to existing Axum router as a Tower service (no extra network hop)
  • build.rs — compiles proto via tonic-build at build time
  • src/config.rs — MCP_RUST_GRPC_UDS env var added to RuntimeConfig
  • src/lib.rs — spawns gRPC UDS server alongside existing Axum servers
  • Cargo.toml — optional deps: tonic, prost, tokio-stream, tower, bytes, http, http-body-util

Python gateway (mcpgateway/)

  • transports/grpc_gen/ — generated protobuf stubs
  • transports/rust_mcp_runtime_grpc_proxy.py — async gRPC proxy (same ASGI interface as HTTP proxy)
  • config.py — experimental_rust_mcp_runtime_grpc_uds setting
  • main.py — selects gRPC proxy when MCP_RUST_GRPC_UDS is configured

Infrastructure

  • Containerfile.lite — ENABLE_RUST_MCP_GRPC_UDS build arg; installs protoc when compiling with gRPC
  • Makefile — ENABLE_RUST_MCP_GRPC_UDS_BUILD propagated through the build chain; docker-prod-rust-grpc-uds convenience target
  • docker-compose.yml — MCP_RUST_GRPC_UDS and EXPERIMENTAL_RUST_MCP_RUNTIME_GRPC_UDS passthroughs
  • docker-entrypoint.sh — unsets empty MCP_RUST_GRPC_UDS (prevents clap parse error); mkdir -p socket directory
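The entrypoint fix in the last bullet amounts to a guard along these lines (a behavioural sketch, not the literal script):

```shell
# If MCP_RUST_GRPC_UDS is set but empty, unset it so clap never sees
# an empty string it cannot parse as a socket path.
if [ -z "${MCP_RUST_GRPC_UDS:-}" ]; then
  unset MCP_RUST_GRPC_UDS
fi

# Ensure the socket's parent directory exists before the sidecar binds it.
if [ -n "${MCP_RUST_GRPC_UDS:-}" ]; then
  mkdir -p "$(dirname "$MCP_RUST_GRPC_UDS")"
fi
```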

Tests

  • 15 unit tests: header stripping, request construction, POST/GET/DELETE ASGI flows, fallback paths, 502 error handling

Benchmark (125 users / 60s, MacBook + Colima)

| Metric      | Python | HTTP/JSON IPC | gRPC-UDS this PR (Axum dispatch) | gRPC-UDS #3729 (pure tonic) |
|-------------|--------|---------------|----------------------------------|------------------------------|
| RPS         | 512    | 1,930         | 1,690                            | 1,778                        |
| Avg latency | 167 ms | 2.84 ms       | 11.51 ms                         | 8.71 ms                      |
| p99         | 610 ms | 11 ms         | 79 ms                            | 37 ms                        |
| Failures    | 24     | 0 ✅          | 0 ✅                             | 0 ✅                         |

gRPC-UDS is slower than HTTP/JSON IPC for this topology because Axum already speaks HTTP natively — gRPC adds a protobuf serialization round-trip on both sides. The value of this transport is typed contracts, structured auth context propagation, and a foundation for streaming-first protocols where HTTP/JSON would require chunked encoding.

Flow diagrams: HTTP/JSON IPC vs gRPC-UDS IPC (Axum dispatch)

HTTP/JSON IPC (faster)

Python gateway                  Unix socket              Rust sidecar
─────────────────────────────────────────────────────────────────────

  [incoming MCP request]
        │
        │  Build HTTP request
        │  (headers already exist,
        │   body already bytes)
        │
        ▼
  ┌─────────────┐
  │  httpx      │ ──── raw HTTP bytes ──────────────────► Axum router
  │  AsyncClient│                                              │
  └─────────────┘                                              │
                                                         (Axum speaks
                                                          HTTP natively,
        ◄──────────── raw HTTP response bytes ───────────  no conversion)
        │
  [send response to client]

No format change. Python speaks HTTP, Rust speaks HTTP, the socket carries HTTP. One format end to end.


gRPC-UDS IPC (this PR — Axum dispatch)

Python gateway                  Unix socket              Rust sidecar
─────────────────────────────────────────────────────────────────────

  [incoming MCP request]
        │
        │  1. Serialize to protobuf          ← extra work
        │     (headers map, body bytes,
        │      auth context, session id...)
        │
        ▼
  ┌─────────────┐
  │  grpc.aio   │ ──── protobuf bytes ──────────────────► tonic server
  │  stub       │                                              │
  └─────────────┘                                    2. Deserialize protobuf
                                                        ← extra work
                                                              │
                                                     3. Re-encode as
                                                        http::Request
                                                        ← extra work
                                                              │
                                                              ▼
                                                         Axum router
                                                              │
                                                     4. Axum returns
                                                        http::Response
                                                              │
                                                     5. Serialize back
                                                        to protobuf
                                                        ← extra work
                                                              │
        ◄──────────────── protobuf bytes ─────────────────────
        │
  6. Deserialize protobuf                             ← extra work
        │
  [send response to client]

Six steps, five of them overhead. Steps 1, 2, 3, 5, and 6 are pure overhead that did not exist in the HTTP path. See #3729 for the pure tonic approach, which eliminates step 3 and the Tower dispatch.

Usage

# Build with gRPC-UDS support
docker build --build-arg ENABLE_RUST=true --build-arg ENABLE_RUST_MCP_GRPC_UDS=true -f Containerfile.lite .

# Run with gRPC-UDS enabled
RUST_MCP_MODE=full MCP_RUST_GRPC_UDS=/tmp/contextforge-mcp-grpc.sock make testing-up

…time (ADR-044)

Replace the HTTP/JSON Python→Rust sidecar boundary with a typed protobuf
contract over a Unix Domain Socket, as described in ADR-044.

Rust side:
- Add proto/mcp_runtime.proto defining the McpRuntime service (Invoke,
  InvokeStream, CloseSession, HealthCheck RPCs)
- Add src/grpc.rs implementing McpRuntimeService using tonic; handlers
  convert proto McpRequest into http::Request and call the existing Axum
  router directly as a Tower service — no additional network hop
- Add build.rs to compile the proto via tonic-build at build time
- Gate all new deps behind the grpc-uds Cargo feature flag
- Add MCP_RUST_GRPC_UDS env var to RuntimeConfig (src/config.rs)
- Wire serve_grpc_uds into run() alongside the existing Axum servers

Python side:
- Add mcpgateway/transports/grpc_gen/ with generated pb2 stubs
- Add rust_mcp_runtime_grpc_proxy.py: async gRPC proxy implementing the
  same ASGI interface as RustMCPRuntimeProxy but using grpc.aio over UDS
- Add experimental_rust_mcp_runtime_grpc_uds config setting
- Wire RustMCPRuntimeGrpcProxy into main.py: selected when
  MCP_RUST_GRPC_UDS is set; HTTP proxy remains the default

Tests:
- Add 15 unit tests covering header stripping, request construction,
  POST/GET/DELETE ASGI flows, fallback paths, and 502 error handling

Signed-off-by: Pratik Gandhi <[email protected]>
…uds feature

Extends the Containerfile.lite cargo build logic to conditionally compile
the Rust MCP runtime with the grpc-uds Cargo feature when
--build-arg ENABLE_RUST_MCP_GRPC_UDS=true is passed. Features are combined
so rmcp-upstream-client and grpc-uds can be enabled together.

Signed-off-by: Pratik Gandhi <[email protected]>
…ture is enabled

tonic-build requires protoc at compile time to generate code from the
proto file. Install the official protobuf release binary from GitHub
before the cargo build step when ENABLE_RUST_MCP_GRPC_UDS=true.

Signed-off-by: Pratik Gandhi <[email protected]>
… and fix socket setup

- Makefile: propagate ENABLE_RUST_MCP_GRPC_UDS_BUILD through
  docker-prod-rust-no-cache → container-build, add docker-prod-rust-grpc-uds
  convenience target, add GRPC_UDS_ARG to container-build
- docker-compose.yml: pass MCP_RUST_GRPC_UDS and
  EXPERIMENTAL_RUST_MCP_RUNTIME_GRPC_UDS into gateway containers
- docker-entrypoint.sh: unset MCP_RUST_GRPC_UDS when empty (prevents
  clap empty-string parse error) and mkdir -p the socket directory
- mcp_runtime_pb2_grpc.py: lower GRPC_GENERATED_VERSION to 1.78.0 to
  match grpcio installed in the image

Signed-off-by: Pratik Gandhi <[email protected]>
@gandhipratik203 gandhipratik203 marked this pull request as draft March 18, 2026 14:53
@gandhipratik203 gandhipratik203 changed the title feat(grpc-uds): implement gRPC-over-UDS internal IPC for Rust MCP runtime (ADR-044) WIP: feat(grpc-uds): implement gRPC-over-UDS internal IPC for Rust MCP runtime (ADR-044) Mar 18, 2026
@gandhipratik203 gandhipratik203 marked this pull request as ready for review March 18, 2026 19:30
@gandhipratik203 gandhipratik203 changed the title WIP: feat(grpc-uds): implement gRPC-over-UDS internal IPC for Rust MCP runtime (ADR-044) feat(grpc-uds): implement gRPC-over-UDS internal IPC for Rust MCP runtime (ADR-044) Mar 18, 2026
@gandhipratik203 gandhipratik203 changed the title feat(grpc-uds): implement gRPC-over-UDS internal IPC for Rust MCP runtime (ADR-044) [EXPERIMENTAL][ADR-044] feat(grpc-uds): gRPC-over-UDS internal IPC for Rust MCP runtime Mar 18, 2026
@gandhipratik203 gandhipratik203 added experimental Experimental features, test proposed MCP Specification changes rust Rust programming mcp-protocol Alignment with MCP protocol or specification performance Performance related items labels Mar 19, 2026
@crivetimihai crivetimihai added the COULD P3: Nice-to-have features with minimal impact if left out; included if time permits label Mar 20, 2026
@crivetimihai crivetimihai added this to the Release 1.3.0 milestone Mar 20, 2026
@jonpspri jonpspri marked this pull request as draft April 9, 2026 21:17


Linked issue (may be closed by this PR): [FEATURE][RUST]: ADR-044 gRPC-over-UDS module communication boundary — POC