Hash-bound (or cryptographic) approval for high-risk MCP mutations? #751

mirusser · 2026-05-14T13:34:05Z


Pre-submission Checklist

I have verified that this discussion would not be more appropriate as an issue in a specific repository
I have searched existing discussions to avoid duplicates

Discussion Topic

Hash-bound (or cryptographic) approval for high-risk MCP mutations?

I opened a more concrete Kubernetes-focused RFC here:

containers/kubernetes-mcp-server#1150

I wanted to ask whether this problem belongs at the broader MCP level as an optional mutation-approval profile.

MCP already has useful foundations: tool annotations, human-in-the-loop guidance, and URL-mode elicitation for out-of-band flows. What I’m exploring is whether high-risk mutation tools need a portable lifecycle around the approval itself:

plan: server creates a concrete mutation plan, dry-run result, policy findings, and digest
challenge: server creates a time-bounded, identity-bound approval challenge for that exact plan
execute: server verifies the digest, approval state, TTL, requester/approver binding, and current dry-run result before mutating

The concern is not simply “should the user be asked before a dangerous tool runs?” Existing confirmation/elicitation mechanisms can do that.

The narrower concern is: did the human approve the exact mutation payload that was later executed?

This seems especially relevant for MCP servers that mutate real infrastructure: Kubernetes, cloud resources, CI/CD, IAM, databases, incident response systems, etc.

I have a reference implementation and SafetyE2E tests in Kubernetes-MCP-Guard, but I’m not proposing that implementation specifically. I’m asking whether MCP would benefit from a small optional interoperability profile for digest-bound mutation approval.

Questions:

Is this already covered by existing or planned MCP primitives?
Would this be better handled as implementation guidance, an extension, or a formal SEP?
Are there existing discussions/workstreams around mutation approval, replay protection, or payload-bound HITL that I should join instead?

armorer-labs · 2026-05-14T20:52:36Z


I like the hash-bound approval direction, especially for mutations where the agent can transform intent into a materially different final action.

One extra boundary I would want is a final pre-execution scan over the exact tools/call arguments whose hash is being approved. The approval hash proves the user saw the same payload, but it does not by itself classify whether that payload contains prompt-injected instructions, credential material, or exfiltration-shaped destinations.

A pattern we are testing in Armorer Guard is:

canonicalize the final tool-call arguments
scan locally for credential leakage, exfiltration, prompt injection, and dangerous action reasons
if clean, bind the approval hash to that final payload
if changed, force a fresh scan and fresh approval
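
As a rough illustration of the four steps above (this is not Armorer Guard's actual implementation; the function names, the sorted-key JSON canonicalization, and the two toy patterns are all assumptions made for the example), the flow might look something like this:

```python
import hashlib
import hmac
import json
import re
from typing import List

# Hypothetical, deliberately tiny rule set; a real scanner would cover far more.
SUSPICIOUS_PATTERNS = {
    "credential": re.compile(r"AKIA[0-9A-Z]{16}|-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "prompt_injection": re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
}


def canonicalize(arguments: dict) -> bytes:
    """Assumed canonical form: sorted keys, compact separators, UTF-8."""
    text = json.dumps(arguments, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def scan(payload: bytes) -> List[str]:
    """Return the names of every pattern that matches the canonical payload."""
    text = payload.decode("utf-8")
    return [name for name, pattern in SUSPICIOUS_PATTERNS.items() if pattern.search(text)]


def approval_hash(arguments: dict) -> str:
    """Scan the final arguments and, only if clean, return the hash to approve."""
    payload = canonicalize(arguments)
    findings = scan(payload)
    if findings:
        raise ValueError(f"payload rejected by scan: {findings}")
    return hashlib.sha256(payload).hexdigest()


def may_execute(approved_hash: str, final_arguments: dict) -> bool:
    """True only if the arguments about to be sent still match the approved hash."""
    current = hashlib.sha256(canonicalize(final_arguments)).hexdigest()
    return hmac.compare_digest(current, approved_hash)
```

If `may_execute` returns False, the arguments changed after approval, so the caller would go back through `approval_hash` (fresh scan) and ask for a fresh approval.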

The MCP proxy shape is useful here because it can sit immediately before the wrapped server receives tools/call:

armorer-guard mcp-proxy -- npx your-mcp-server

Demo if useful for testing the UX: https://huggingface.co/spaces/armorer-labs/armorer-guard-demo

Curious whether you imagine the hash-bound approval as purely client-side UX, or whether servers should also receive a policy/approval artifact they can verify.

1 reply

mirusser May 14, 2026
Author

I've been thinking about this problem and designing a possible approach to it. Here is the design I've come up with so far:

This document sketches a possible MCP mutation-approval profile.

The profile's goal is narrow: bind human approval to an exact mutation intent and an exact review snapshot, then allow execution only after approval and required pre-execution gates pass. It is not a full policy engine, authorization model, workflow product, or domain-specific planning format (for Kubernetes, for example).

Core Idea

The generic part is the approval lifecycle:

plan identity
intent and review digest binding
plan validity and challenge TTL
requester and approver binding
approval challenge lifecycle and terminal challenge outcomes
replay prevention through execution reuse policy
audit spine
approval-bound execution semantics

The domain-specific (Kubernetes in my case) part is the evidence and execution meaning:

what a mutation intent means
how impact is previewed
whether dry-run exists
whether diff exists
how freshness or drift is checked
how domain policy is evaluated
what the human can safely see
how execution retries and idempotency work

Roles

Generic Approval Core owns the domain-independent approval lifecycle: plan envelopes, lifecycle state, digest checks, approval challenges, challenge outcomes, approval grants, audit spine, review snapshot canonicalization, and pre-execution gate orchestration. It does not define domain-specific review content, but it may host a review surface while domain adapters supply the domain-specific evidence artifacts rendered there.

Domain Adapter defines, explains, and executes mutation intents for one target system. The Kubernetes adapter is the first adapter and owns Kubernetes mutation meaning, dry-run evidence, diffs, drift detection, Kubernetes policy checks, Kubernetes mutation-intent canonicalization, execution behavior, evidence artifact digests, and adapter audit payloads.

Approval Authority creates approval challenges, enforces approval policies, records challenge outcomes, and issues or exposes approval grants for execution. In my repository, that role is currently implemented by the gateway plus approval store. Another implementation could delegate the role to an external workflow system.

Review Surface renders the immutable review snapshot identified by the review digest to the approver. It must not rely on model-supplied approval content as the source of truth.

Plan Envelope

A plan envelope is a generic wrapper around one domain-specific mutation intent. It is not the mutation itself.

Minimum generic fields:

{
  "planId": "opaque workflow identifier",
  "profile": "mcp.mutation-approval",
  "operationType": "adapter-specific operation label",
  "requester": {
    "subject": "authenticated requester subject"
  },
  "approvalPolicy": {
    "type": "same-subject"
  },
  "executionReusePolicy": {
    "type": "single-execution"
  },
  "validFrom": "2026-05-14T00:00:00Z",
  "validUntil": "2026-05-14T01:00:00Z",
  "freshnessPolicy": {
    "checks": [
      {
        "type": "adapter-defined"
      }
    ]
  },
  "intentDigest": {
    "algorithm": "sha-256",
    "canonicalization": "adapter-defined",
    "value": "..."
  },
  "reviewDigest": {
    "algorithm": "sha-256",
    "canonicalization": "profile-defined",
    "value": "..."
  },
  "evidenceArtifacts": {
    "adapter": "kubernetes",
    "items": [
      {
        "type": "diff",
        "digest": {
          "algorithm": "sha-256",
          "canonicalization": "adapter-defined",
          "value": "..."
        }
      }
    ]
  }
}

planId is an opaque workflow handle for MCP calls, approval URLs, audit correlation, and storage. It is not an integrity mechanism. intentDigest proves the executable mutation intent is the same. reviewDigest proves the human-approved review snapshot is the same.

Digests

The profile uses two digest bindings:

Intent Digest binds the exact executable mutation intent.
Review Digest binds the immutable review snapshot, including plan-envelope metadata, the intent digest, evidence artifact digests or digest-bound references, redaction metadata, approval policy, execution reuse policy, freshness policy, requester, plan validity window, and review-surface context.

Every digest declares its algorithm and canonicalization. The generic approval core defines canonicalization for generic envelope metadata and the review digest. Each domain adapter defines canonicalization for its mutation intent and evidence artifacts.
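
A minimal sketch of how the two digests could nest, assuming sorted-key compact JSON as a stand-in for both the adapter-defined and profile-defined canonicalizations (the profile does not pick one here; something like RFC 8785 JCS would be a candidate). The intent and snapshot contents are hypothetical examples:

```python
import hashlib
import json


def canonical_json(value) -> bytes:
    # Stand-in canonicalization (assumption): sorted keys, compact, UTF-8.
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def make_digest(value, canonicalization: str) -> dict:
    """Build a digest object in the same shape the plan envelope uses."""
    return {
        "algorithm": "sha-256",
        "canonicalization": canonicalization,
        "value": hashlib.sha256(canonical_json(value)).hexdigest(),
    }


# Hypothetical Kubernetes-flavoured mutation intent.
intent = {"kind": "Deployment", "namespace": "prod", "name": "api", "patch": {"replicas": 5}}
intent_digest = make_digest(intent, "adapter-defined")

# The review snapshot embeds the intent digest, so changing the intent
# changes the review digest as well.
review_snapshot = {
    "planId": "plan-123",
    "requester": {"subject": "alice"},
    "approvalPolicy": {"type": "same-subject"},
    "executionReusePolicy": {"type": "single-execution"},
    "validFrom": "2026-05-14T00:00:00Z",
    "validUntil": "2026-05-14T01:00:00Z",
    "intentDigest": intent_digest,
}
review_digest = make_digest(review_snapshot, "profile-defined")
```

The point of the nesting is that an approver who signs off on `review_digest` has transitively signed off on `intent_digest`, while the executor can still check the intent on its own.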

Approval Lifecycle

The generic lifecycle is:

Create a plan envelope for a domain-specific mutation intent.
Compute intent and review digests using declared canonicalization.
Expose trusted plan evidence through a review surface.
Create one or more short-lived approval challenges while the plan remains valid.
Resolve the approval challenge by recording a challenge outcome, such as approved, denied, rejected, expired, or canceled, through the approval authority.
If the challenge is approved, issue or reference an approval grant bound to the plan identifier, requester, approver, intent digest, review digest, approval policy, expiry, and reuse constraints.
Before execution, verify all pre-execution gates.
Execute through the domain adapter only if every required gate passes.
Record the outcome in the audit trail.

Same-subject approval is the default approval policy: the approver must be the same authenticated subject as the requester. Other approval policies, such as delegated approval or multi-party approval, are future extension points.
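
To make the challenge-resolution and grant steps more concrete, here is a small sketch under the default same-subject policy. The class and field names are hypothetical, and mapping an approver mismatch to the "rejected" outcome is my assumption for this example; the profile text above does not yet pin down how "denied" and "rejected" differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Challenge:
    plan_id: str
    requester: str
    intent_digest: str
    review_digest: str
    expires_at: datetime
    outcome: str = "pending"  # becomes one terminal outcome, exactly once


@dataclass(frozen=True)
class Grant:
    plan_id: str
    requester: str
    approver: str
    intent_digest: str
    review_digest: str
    expires_at: datetime


def resolve(challenge: Challenge, approver: str, decision: str,
            now: datetime, grant_ttl: timedelta) -> Optional[Grant]:
    """Record a terminal outcome; return a grant only for a valid approval."""
    if challenge.outcome != "pending":
        raise ValueError("challenge already has a terminal outcome")
    if decision not in ("approved", "denied", "canceled"):
        raise ValueError(f"unsupported decision: {decision}")
    if now >= challenge.expires_at:
        challenge.outcome = "expired"
        return None
    if decision == "approved" and approver != challenge.requester:
        # Same-subject policy violated (assumed to map to "rejected").
        challenge.outcome = "rejected"
        return None
    challenge.outcome = decision
    if decision != "approved":
        return None
    # The grant copies both digests so execution can verify them later.
    return Grant(challenge.plan_id, challenge.requester, approver,
                 challenge.intent_digest, challenge.review_digest,
                 expires_at=now + grant_ttl)
```

For example, a challenge created for requester `alice` yields a grant only when `alice` approves it before `expires_at`; any other path ends in a terminal outcome with no grant.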

Pre-Execution Gates

Approval is necessary but not sufficient. Immediately before mutation, approval-bound execution verifies:

plan validity window still allows execution
authorization check still passes
approval authority reports a valid approval grant
intent digest still matches executable intent
review digest still matches approved review snapshot
execution reuse policy allows another successful execution
declared freshness policy passes
required domain policy checks still pass

Freshness and domain policy meanings are adapter-owned, but the generic profile requires declared gates to be evaluated before execution. A freshness policy may contain zero or more freshness checks so adapters can combine checks such as dry-run, resource-version matching, preview recomputation, or other target-specific freshness signals.
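
One way the gate orchestration could be sketched: every declared gate is evaluated, a gate that raises is treated as failed (fail closed), and the mutation runs only when nothing failed. The gate names and the `execute_if_allowed` helper are hypothetical; real gates would call the approval authority, the adapter's freshness checks, and so on.

```python
from typing import Callable, Dict, List

Gate = Callable[[], bool]


def failed_gates(gates: Dict[str, Gate]) -> List[str]:
    """Evaluate every gate and return the names of those that did not pass."""
    failures = []
    for name, check in gates.items():
        try:
            passed = bool(check())
        except Exception:
            passed = False  # fail closed: an erroring gate blocks execution
        if not passed:
            failures.append(name)
    return failures


def execute_if_allowed(gates: Dict[str, Gate], mutate: Callable[[], None]) -> dict:
    """Run the mutation only if all gates pass; return an audit-style result."""
    failures = failed_gates(gates)
    if failures:
        return {"event": "execution.blocked", "failedGates": failures}
    try:
        mutate()
    except Exception as exc:
        return {"event": "execution.failed", "error": str(exc)}
    return {"event": "execution.succeeded"}
```

Evaluating all gates instead of stopping at the first failure is a design choice in this sketch: it gives the audit trail the full list of reasons an execution was blocked.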

Replay And Reuse

The default execution reuse policy is single-execution: one approved plan envelope may authorize at most one successful execution.

Reusable plans are an explicit future extension point. They must opt in through an execution reuse policy that defines how many successful executions are allowed and under what conditions.

Retry behavior for failed or unknown execution attempts is domain-adapter-owned because target systems differ sharply in idempotency and failure semantics.
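
A minimal in-memory sketch of single-execution enforcement, assuming a hypothetical `SingleExecutionLedger` that is claimed atomically before the mutation starts. A real implementation would need a durable store, and whether a failed attempt may be released for retry is exactly the adapter-owned decision mentioned above; this example allows it only to show where that decision plugs in.

```python
import threading
from typing import Dict


class SingleExecutionLedger:
    """Tracks which plans have an execution in progress or already succeeded."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._state: Dict[str, str] = {}  # plan_id -> "in_progress" | "succeeded"

    def try_claim(self, plan_id: str) -> bool:
        """Atomically claim the plan; False means it was already claimed or used."""
        with self._lock:
            if plan_id in self._state:
                return False
            self._state[plan_id] = "in_progress"
            return True

    def record_success(self, plan_id: str) -> None:
        """Mark the plan as consumed; it can never be claimed again."""
        with self._lock:
            self._state[plan_id] = "succeeded"

    def release_after_failure(self, plan_id: str) -> None:
        """Adapter decision (assumed here): a known-failed attempt may be retried."""
        with self._lock:
            if self._state.get(plan_id) == "in_progress":
                del self._state[plan_id]
```

Claiming before the mutation, and not after it, means two concurrent execute calls for the same plan cannot both pass the reuse gate.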

Audit Spine

The profile requires a generic audit spine that proves the lifecycle:

plan.created
challenge.created
challenge.approved
challenge.denied
challenge.expired
challenge.rejected
challenge.canceled
grant.issued
execution.started
execution.blocked
execution.failed
execution.succeeded

Terminal challenge events record challenge outcomes. grant.issued exists only when the approval authority issues or references durable execution authorization for an approved challenge.

Generic audit events should carry plan identifier, intent digest, review digest, requester, approver when relevant, approval policy, grant identifier when relevant, timestamps, and event result. Domain adapters may attach adapter audit payloads such as Kubernetes object references, namespaces, dry-run summaries, drift messages, and policy findings.
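
As a sketch of what one generic audit record might look like, assuming JSON lines as the storage format and hypothetical field names (`planId`, `grantId`, `adapterPayload`, and so on are illustrative, not part of any agreed schema):

```python
import json
from datetime import datetime, timezone
from typing import Optional

AUDIT_EVENTS = frozenset({
    "plan.created",
    "challenge.created", "challenge.approved", "challenge.denied",
    "challenge.expired", "challenge.rejected", "challenge.canceled",
    "grant.issued",
    "execution.started", "execution.blocked", "execution.failed", "execution.succeeded",
})


def audit_event(event: str, plan_id: str, intent_digest: str, review_digest: str,
                requester: str, approver: Optional[str] = None,
                grant_id: Optional[str] = None,
                adapter_payload: Optional[dict] = None,
                now: Optional[datetime] = None) -> str:
    """Serialize one audit-spine event as a single JSON line."""
    if event not in AUDIT_EVENTS:
        raise ValueError(f"unknown audit event: {event}")
    record = {
        "event": event,
        "timestamp": (now or datetime.now(timezone.utc)).isoformat(),
        "planId": plan_id,
        "intentDigest": intent_digest,
        "reviewDigest": review_digest,
        "requester": requester,
    }
    # Optional fields are included only when relevant to the event.
    if approver is not None:
        record["approver"] = approver
    if grant_id is not None:
        record["grantId"] = grant_id
    if adapter_payload is not None:
        record["adapterPayload"] = adapter_payload
    return json.dumps(record, sort_keys=True)
```

Keeping the generic fields flat and pushing everything domain-specific under a single adapter payload key is one way to let generic tooling read the spine without understanding Kubernetes.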
