[EXPERIMENTAL][ADR-044] feat(grpc-uds): gRPC-over-UDS internal IPC for Rust MCP runtime#3726
Draft
gandhipratik203 wants to merge 4 commits into main
…time (ADR-044)

Replace the HTTP/JSON Python→Rust sidecar boundary with a typed protobuf contract over a Unix Domain Socket, as described in ADR-044.

Rust side:
- Add proto/mcp_runtime.proto defining the McpRuntime service (Invoke, InvokeStream, CloseSession, HealthCheck RPCs)
- Add src/grpc.rs implementing McpRuntimeService using tonic; handlers convert proto McpRequest into http::Request and call the existing Axum router directly as a Tower service (no additional network hop)
- Add build.rs to compile the proto via tonic-build at build time
- Gate all new deps behind the grpc-uds Cargo feature flag
- Add MCP_RUST_GRPC_UDS env var to RuntimeConfig (src/config.rs)
- Wire serve_grpc_uds into run() alongside the existing Axum servers

Python side:
- Add mcpgateway/transports/grpc_gen/ with generated pb2 stubs
- Add rust_mcp_runtime_grpc_proxy.py: async gRPC proxy implementing the same ASGI interface as RustMCPRuntimeProxy but using grpc.aio over UDS
- Add experimental_rust_mcp_runtime_grpc_uds config setting
- Wire RustMCPRuntimeGrpcProxy into main.py: selected when MCP_RUST_GRPC_UDS is set; HTTP proxy remains the default

Tests:
- Add 15 unit tests covering header stripping, request construction, POST/GET/DELETE ASGI flows, fallback paths, and 502 error handling

Signed-off-by: Pratik Gandhi <[email protected]>
…uds feature Extends the Containerfile.lite cargo build logic to conditionally compile the Rust MCP runtime with the grpc-uds Cargo feature when --build-arg ENABLE_RUST_MCP_GRPC_UDS=true is passed. Features are combined so rmcp-upstream-client and grpc-uds can be enabled together. Signed-off-by: Pratik Gandhi <[email protected]>
…ture is enabled tonic-build requires protoc at compile time to generate code from the proto file. Install the official protobuf release binary from GitHub before the cargo build step when ENABLE_RUST_MCP_GRPC_UDS=true. Signed-off-by: Pratik Gandhi <[email protected]>
… and fix socket setup

- Makefile: propagate ENABLE_RUST_MCP_GRPC_UDS_BUILD through docker-prod-rust-no-cache → container-build, add docker-prod-rust-grpc-uds convenience target, add GRPC_UDS_ARG to container-build
- docker-compose.yml: pass MCP_RUST_GRPC_UDS and EXPERIMENTAL_RUST_MCP_RUNTIME_GRPC_UDS into gateway containers
- docker-entrypoint.sh: unset MCP_RUST_GRPC_UDS when empty (prevents clap empty-string parse error) and mkdir -p the socket directory
- mcp_runtime_pb2_grpc.py: lower GRPC_GENERATED_VERSION to 1.78.0 to match grpcio installed in the image

Signed-off-by: Pratik Gandhi <[email protected]>
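The empty-string handling and socket-directory setup from the last commit, together with the selection rule from the first one, can be sketched in Python roughly as follows. This is an illustration only: the real entrypoint is a shell script and the real selection lives in main.py, and the function names, the "grpc-uds"/"http" labels, and the example socket path are all hypothetical.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def normalize_uds_path(env: Mapping[str, str]) -> Optional[str]:
    """Treat an unset or empty MCP_RUST_GRPC_UDS as 'not configured'."""
    value = env.get("MCP_RUST_GRPC_UDS", "")
    return value or None


def prepare_socket_dir(socket_path: str) -> Path:
    """Create the socket's parent directory, like `mkdir -p` would."""
    parent = Path(socket_path).parent
    parent.mkdir(parents=True, exist_ok=True)
    return parent


def select_transport(env: Mapping[str, str]) -> str:
    """Pick gRPC-over-UDS only when a socket path is set; HTTP stays the default."""
    return "grpc-uds" if normalize_uds_path(env) else "http"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical socket location, used only for this demo.
        sock = os.path.join(tmp, "run", "runtime.sock")
        print(select_transport({"MCP_RUST_GRPC_UDS": ""}))
        print(select_transport({"MCP_RUST_GRPC_UDS": sock}))
        print(prepare_socket_dir(sock).is_dir())
```

Normalizing the empty string in one place means an empty value and a missing variable are handled identically everywhere else.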
Summary
Closes #3730
Experiment to explore the ideas in ADR-044 using an Axum dispatch strategy — companion to #3729 which takes the pure tonic approach. Together these two PRs represent two different ways of implementing the same gRPC-over-UDS boundary.
This PR's approach:
- McpRuntime gRPC service (Invoke, InvokeStream, CloseSession, HealthCheck) with Rust tonic server and Python grpc.aio client
- McpRuntimeService wraps the existing Axum Router and dispatches each RPC into it via Tower oneshot
- grpc-uds Cargo feature flag; zero overhead when disabled
- RustMCPRuntimeGrpcProxy wired into main.py: selected when MCP_RUST_GRPC_UDS is set, HTTP proxy remains the default

ADR-044 Goals: What This Experiment Covers
ADR-044 goal coverage for this approach (Axum dispatch)
✅ Demonstrated
Typed contract via protobuf
The .proto file is the single source of truth. Python and Rust both generate stubs from it. If a field or method is renamed, the Rust side stops compiling and the Python side raises as soon as the stale name is used against the regenerated stubs, so the mismatch is loud rather than silent. The HTTP/JSON IPC boundary had no schema: a wrong header name would silently do nothing.
Language-neutral boundary
The McpRuntime proto contract does not care what is on either side. A Go module, a Java module, or a second Rust service could plug into the same contract today without any changes to the Python gateway. The HTTP/JSON IPC was Python-to-Rust only and relied on undocumented internal headers.
Native streaming RPCs
InvokeStream is a proper gRPC server-streaming RPC, not HTTP chunked encoding over a socket. The Python side gets an async iterator, the Rust side yields chunks. Backpressure, framing, and cancellation are handled correctly by construction.
Unix Domain Socket: no TCP overhead
All traffic stays in kernel memory. No TCP handshake, no loopback routing, no port allocation. HTTP/2 multiplexing means multiple concurrent RPCs share one connection rather than opening new connections per request.
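As a rough illustration of the two points above (a socket path instead of a port, and a stream consumed as an async iterator), here is a stdlib-only sketch. It does not use gRPC: the 4-byte length prefix is a stand-in for gRPC message framing, and `handle`, `read_frames`, and the `runtime.sock` name are made up for the example.

```python
import asyncio
import os
import struct
import tempfile


async def handle(reader, writer):
    """Server side: send three length-prefixed chunks, then close."""
    for chunk in (b"event: 1", b"event: 2", b"event: 3"):
        writer.write(struct.pack(">I", len(chunk)) + chunk)
        await writer.drain()  # waits if the peer is not keeping up
    writer.close()
    await writer.wait_closed()


async def read_frames(reader):
    """Client side: expose the framed byte stream as an async iterator."""
    while True:
        try:
            header = await reader.readexactly(4)
        except asyncio.IncompleteReadError:
            return  # peer closed the stream
        (length,) = struct.unpack(">I", header)
        yield await reader.readexactly(length)


async def main():
    with tempfile.TemporaryDirectory() as tmp:
        # A filesystem path replaces host:port; no TCP port is allocated.
        path = os.path.join(tmp, "runtime.sock")
        server = await asyncio.start_unix_server(handle, path=path)
        async with server:
            reader, writer = await asyncio.open_unix_connection(path)
            frames = [frame async for frame in read_frames(reader)]
            writer.close()
            await writer.wait_closed()
        return frames


if __name__ == "__main__":
    print(asyncio.run(main()))
```

With grpc.aio the framing and iteration come from the generated stubs instead, so none of the length-prefix code would be written by hand.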
Clean process boundary — crash isolation
If the Rust sidecar panics, Python keeps running and the gRPC channel reconnects. The proto contract is the explicit, versioned API surface — not an internal HTTP path that could drift silently.
Aligns with the existing plugin gRPC pattern
The external plugin framework in mcpgateway/plugins/framework/external/grpc/ already uses gRPC. The MCP runtime boundary now uses the same pattern: one mental model for module boundaries across the platform.
Independent scaling of modules
The process boundary exists: Rust is a separate binary on a separate socket. In the current container setup, however, the gateway and the sidecar are co-located. The groundwork is there; exploiting it requires running the sidecar as a separate container or pod.
Structured auth context propagation
The AuthContext proto message is a better contract than the x-contextforge-auth-context header string used in HTTP/JSON IPC. The current implementation encodes it as base64 JSON inside the proto field, which works but does not yet use the proto's ability to carry it as proper typed fields.
❌ Not yet demonstrated
Catalog change subscriptions and session broadcast
ADR-044 specifically calls out streaming RPCs for catalog change notifications and session broadcast patterns. InvokeStream currently handles only SSE relay. The push patterns the ADR envisioned are not implemented.
Multi-language beyond Python and Rust
The value of language-neutral codegen becomes concrete when a third language plugs into the same contract. Right now it is two languages that could have communicated over HTTP just fine. The payoff is visible only when a Go or Java module joins.
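For the structured auth context item in the coverage list above, the "base64 JSON inside the proto field" encoding can be sketched as below. This is a minimal sketch under assumptions: the PR does not say which base64 variant or JSON layout it uses, so standard base64 and compact sorted-key JSON are chosen here, and the `user` and `teams` keys are purely hypothetical.

```python
import base64
import json
from typing import Any, Dict


def encode_auth_context(context: Dict[str, Any]) -> str:
    """Serialize the context to compact JSON, then base64-encode it."""
    raw = json.dumps(context, separators=(",", ":"), sort_keys=True)
    return base64.b64encode(raw.encode("utf-8")).decode("ascii")


def decode_auth_context(blob: str) -> Dict[str, Any]:
    """Reverse of encode_auth_context; rejects non-base64 input."""
    return json.loads(base64.b64decode(blob, validate=True))


if __name__ == "__main__":
    # Hypothetical payload; real field names are not given in the PR.
    blob = encode_auth_context({"user": "alice", "teams": ["core"]})
    print(blob)
    print(decode_auth_context(blob))
```

The receiver still has to parse and validate an opaque string here. Moving the same keys into typed proto fields would let the .proto file declare them instead.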
Changes
Rust sidecar (tools_rust/mcp_runtime/)
- proto/mcp_runtime.proto: service contract
- src/grpc.rs: tonic server; dispatches to existing Axum router as a Tower service (no extra network hop)
- build.rs: compiles proto via tonic-build at build time
- src/config.rs: MCP_RUST_GRPC_UDS env var added to RuntimeConfig
- src/lib.rs: spawns gRPC UDS server alongside existing Axum servers
- Cargo.toml: optional deps tonic, prost, tokio-stream, tower, bytes, http, http-body-util

Python gateway (mcpgateway/)
- transports/grpc_gen/: generated protobuf stubs
- transports/rust_mcp_runtime_grpc_proxy.py: async gRPC proxy (same ASGI interface as HTTP proxy)
- config.py: experimental_rust_mcp_runtime_grpc_uds setting
- main.py: selects gRPC proxy when MCP_RUST_GRPC_UDS is configured

Infrastructure
- Containerfile.lite: ENABLE_RUST_MCP_GRPC_UDS build arg; installs protoc when compiling with gRPC
- Makefile: ENABLE_RUST_MCP_GRPC_UDS_BUILD propagated through build chain; docker-prod-rust-grpc-uds convenience target
- docker-compose.yml: MCP_RUST_GRPC_UDS and EXPERIMENTAL_RUST_MCP_RUNTIME_GRPC_UDS passthroughs
- docker-entrypoint.sh: unsets empty MCP_RUST_GRPC_UDS (prevents clap parse error); mkdir -p socket directory

Tests
Benchmark (125 users / 60s, MacBook + Colima)
gRPC-UDS is slower than HTTP/JSON IPC for this topology because Axum already speaks HTTP natively — gRPC adds a protobuf serialization round-trip on both sides. The value of this transport is typed contracts, structured auth context propagation, and a foundation for streaming-first protocols where HTTP/JSON would require chunked encoding.
Flow diagrams: HTTP/JSON IPC vs gRPC-UDS IPC (Axum dispatch)
HTTP/JSON IPC (faster)
No format change. Python speaks HTTP, Rust speaks HTTP, the socket carries HTTP. One format end to end.
gRPC-UDS IPC (this PR — Axum dispatch)
Six steps with overhead. Steps 1, 2, 3, 5, and 6 are pure overhead that did not exist in the HTTP path. See #3729 for the pure tonic approach, which eliminates step 3 and the Tower dispatch.
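To make the extra format conversions concrete, here is a rough Python sketch of the kind of round trip the Axum-dispatch path implies: an ASGI request is packed into a message and later unpacked into an HTTP request again. It is not tied to the step numbers above. `McpRequestSketch` and its fields are hypothetical stand-ins for the real proto message, and the stripped header set is an assumption (the standard hop-by-hop headers), not the list the PR uses.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Assumption: strip the standard hop-by-hop headers before crossing the boundary.
HOP_BY_HOP = frozenset({
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
})


@dataclass
class McpRequestSketch:
    """Hypothetical stand-in for the proto request message."""
    method: str
    path: str
    headers: Dict[str, str] = field(default_factory=dict)
    body: bytes = b""


def from_asgi(scope: dict, body: bytes) -> McpRequestSketch:
    """Conversion 1: ASGI scope (bytes header pairs) into the message."""
    headers = {}
    for raw_name, raw_value in scope.get("headers", []):
        name = raw_name.decode("latin-1").lower()
        if name not in HOP_BY_HOP:
            headers[name] = raw_value.decode("latin-1")  # duplicates collapse
    return McpRequestSketch(scope["method"], scope["path"], headers, body)


def to_http_request(msg: McpRequestSketch) -> Tuple[str, List[Tuple[str, str]], bytes]:
    """Conversion 2: the message back into request line, headers, and body."""
    return f"{msg.method} {msg.path} HTTP/1.1", sorted(msg.headers.items()), msg.body


if __name__ == "__main__":
    scope = {
        "method": "POST",
        "path": "/mcp",
        "headers": [(b"content-type", b"application/json"), (b"connection", b"keep-alive")],
    }
    print(to_http_request(from_asgi(scope, b"{}")))
```

In the HTTP/JSON path neither conversion exists, because the request stays HTTP end to end.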
Usage