🔌 Epic: Content Limit Plugin - Resource Exhaustion Protection
Goal
Implement a native plugin that protects MCP Gateway from resource exhaustion by limiting the number of content items returned from tool invocations. This prevents malicious or misconfigured MCP servers from overwhelming downstream clients with excessively large response payloads, ensuring system stability and predictable resource consumption.
Why Now?
With PR #1189 fixing the StreamableHTTP transport to return all content items instead of just the first one, the gateway now correctly honors the MCP protocol's multi-content specification. However, this opens a potential attack vector:
Uncontrolled Resource Growth: A malicious or buggy MCP server could return hundreds or thousands of content items (e.g., 100 items × 1MB each = 100MB), consuming excessive memory and bandwidth
Denial of Service Risk: Downstream clients (AI agents, web browsers) could be overwhelmed by large payloads, leading to crashes or degraded performance
Operational Visibility Gap: Currently no logging or metrics exist to detect when tools return abnormally large response sets
Policy Enforcement Need: Administrators need configurable controls to enforce organization-specific limits without code changes
By implementing this as a plugin, we leverage the existing plugin framework's capabilities (priority ordering, conditional execution, metadata tracking) and allow operators to enable/disable/tune the behavior independently of core gateway logic.
📖 User Stories
US-1: Platform Admin - Configure Content Limits
As a Platform Administrator
I want to configure maximum content item limits globally and per-tool
So that I can protect my infrastructure from resource exhaustion
Acceptance Criteria:
Given the plugin is registered in plugins/config.yaml
When I set the configuration:
config:
  max_content_items: 50
  truncate_mode: "truncate"  # or "block"
  log_violations: true
  per_tool_limits:
    - tool_pattern: "large-dataset-.*"
      max_items: 10
    - tool_pattern: "bulk-export"
      max_items: 5
Then tools returning more than 50 items should be truncated to 50
And tools matching "large-dataset-.*" should be limited to 10 items
And violations should be logged with metadata
Technical Requirements:
Global max_content_items limit (default: 50)
Per-tool overrides via regex pattern matching
Two enforcement modes: truncate (trim excess) or block (reject entirely)
Structured logging of violations with tool name, original count, enforced limit
Metadata injection for observability
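The limit resolution described above can be sketched as follows. This is a minimal illustration using standard-library dataclasses, not the plugin's actual model; `PerToolLimit`, `limit_for`, the use of `re.fullmatch`, and the first-match-wins rule are all assumptions made for the example.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PerToolLimit:
    """One per-tool override (hypothetical name)."""

    tool_pattern: str  # regex, matched against the whole tool name
    max_items: int


@dataclass
class ContentLimitConfig:
    """Sketch of the plugin configuration using the field names from the spec."""

    max_content_items: int = 50
    truncate_mode: str = "truncate"  # or "block"
    log_violations: bool = True
    per_tool_limits: list[PerToolLimit] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.max_content_items < 1:
            raise ValueError("max_content_items must be >= 1")
        if self.truncate_mode not in ("truncate", "block"):
            raise ValueError("truncate_mode must be 'truncate' or 'block'")

    def limit_for(self, tool_name: str) -> int:
        """Return the first matching per-tool limit, else the global limit.

        Assumption: patterns are tried in order and the first match wins.
        """
        for rule in self.per_tool_limits:
            if re.fullmatch(rule.tool_pattern, tool_name):
                return rule.max_items
        return self.max_content_items


config = ContentLimitConfig(
    per_tool_limits=[
        PerToolLimit("large-dataset-.*", 10),
        PerToolLimit("bulk-export", 5),
    ]
)
print(config.limit_for("large-dataset-sales"))
print(config.limit_for("unrelated-tool"))
```

Whether patterns should match the full tool name or any substring is a design choice the spec leaves open; full matching is used here because it avoids surprising partial hits.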
US-2: AI Agent - Receive Truncated Results with Metadata
As an AI Agent invoking tools via MCP
I want to receive a warning when results are truncated
So that I know the response is partial and can adjust my strategy
Acceptance Criteria:
Given a tool returns 100 content items
And the content limit plugin is configured with max_content_items: 25
When I invoke the tool via the gateway
Then I should receive:
- Exactly 25 content items (the first 25)
- Metadata indicating truncation occurred
- Original count (100) and enforced limit (25) in metadata
- A warning annotation in the response
Technical Requirements:
Preserve first N items (configurable whether to keep first or last)
Add content_truncated: true to response metadata
Add original_count and enforced_limit fields
Optionally append a warning TextContent item describing the truncation
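A rough sketch of the truncation step, under stated assumptions: `truncate_content` is a hypothetical helper, content items are treated as opaque values, and the warning item is a plain dict shaped like an MCP text content item rather than the gateway's own `TextContent` class.

```python
from __future__ import annotations

from typing import Any


def truncate_content(
    items: list[Any],
    limit: int,
    strategy: str = "first",
    add_warning: bool = False,
) -> tuple[list[Any], dict[str, Any]]:
    """Trim `items` to `limit` entries and describe what happened.

    Returns the (possibly shortened) list plus a metadata dict using the
    field names from the requirements above.
    """
    if limit < 1:
        raise ValueError("limit must be >= 1")
    if strategy not in ("first", "last"):
        raise ValueError("strategy must be 'first' or 'last'")

    original_count = len(items)
    if original_count <= limit:
        return list(items), {"content_truncated": False}

    # Keep the first N or the last N items, depending on the strategy.
    kept = items[:limit] if strategy == "first" else items[-limit:]
    metadata = {
        "content_truncated": True,
        "original_count": original_count,
        "enforced_limit": limit,
    }
    if add_warning:
        # Assumption: the warning is appended after the kept items, so the
        # response then carries limit + 1 entries.
        warning = {
            "type": "text",
            "text": f"Response truncated: {original_count} items reduced to {limit}.",
        }
        kept = kept + [warning]
    return kept, metadata


kept, meta = truncate_content(list(range(100)), 25)
print(len(kept), meta)
```

Note that an appended warning makes the response one item longer than the limit. Whether the warning should count against the limit is not settled by the acceptance criteria.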
US-3: Security Engineer - Block Excessive Responses
As a Security Engineer
I want to completely reject tool responses exceeding limits
So that no truncated data reaches untrusted clients
Acceptance Criteria:
Given the plugin is in "block" mode
And a tool returns 200 content items
And the limit is 50
When the tool_post_invoke hook executes
Then the plugin should:
- Return continue_processing=False
- Set a PluginViolation with code "CONTENT_LIMIT_EXCEEDED"
- Include violation details: tool name, count, limit
- Log the violation at WARNING level
- Return an error to the client instead of partial data
Technical Requirements:
truncate_mode: "block" configuration option
Violation reason: "Content limit exceeded"
Violation code: "CONTENT_LIMIT_EXCEEDED"
Detailed violation metadata for audit trails
Client receives MCP error instead of partial response
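Block mode can be illustrated with the sketch below. `Violation` and `Result` are simplified stand-ins written for this example, not the framework's real `PluginViolation` and `PluginResult` classes, and `check_block_mode` is a hypothetical helper; only the reason, code, and detail fields come from the requirements above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Violation:
    """Stand-in for the framework's PluginViolation (shape assumed)."""

    reason: str
    description: str
    code: str
    details: dict[str, Any] = field(default_factory=dict)


@dataclass
class Result:
    """Stand-in for the framework's PluginResult (shape assumed)."""

    continue_processing: bool = True
    violation: Optional[Violation] = None


def check_block_mode(tool_name: str, content_count: int, limit: int) -> Result:
    """Reject the response outright when it exceeds the limit."""
    if content_count <= limit:
        return Result()
    return Result(
        continue_processing=False,
        violation=Violation(
            reason="Content limit exceeded",
            description=f"Tool '{tool_name}' returned {content_count} items; limit is {limit}",
            code="CONTENT_LIMIT_EXCEEDED",
            details={
                "tool_name": tool_name,
                "original_count": content_count,
                "enforced_limit": limit,
            },
        ),
    )


result = check_block_mode("bulk-export", 200, 50)
print(result.continue_processing, result.violation.code)
```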
US-4: DevOps - Monitor Content Limit Violations
As a DevOps Engineer
I want to track content limit violations via logs and metrics
So that I can identify problematic tools and tune limits
Acceptance Criteria:
Given the plugin has log_violations: true
When a tool exceeds the content limit
Then the following should be logged:
- Event type: "CONTENT_LIMIT_VIOLATION"
- Tool name and invocation ID
- Original content count vs. enforced limit
- Action taken (truncate or block)
- Timestamp and request context
And metrics should be incremented:
- content_limit_violations_total (counter)
- content_items_truncated_total (counter)
- Tool-specific counters with labels
Technical Requirements:
Log level: WARNING for violations, INFO for normal operation
Correlation with request_id and user context
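One way the logging and counters could fit together is sketched below. It uses only the standard library: the `Counter` is an in-memory stand-in for a real metrics backend, `record_violation` is a hypothetical helper, and interpreting `content_items_truncated_total` as the number of dropped items is an assumption.

```python
from __future__ import annotations

import json
import logging
from collections import Counter
from typing import Any, Optional

logger = logging.getLogger("content_limit")

# In-memory stand-in for a metrics backend; keys are (metric_name, tool_name).
metrics: Counter = Counter()


def record_violation(
    tool_name: str,
    original_count: int,
    enforced_limit: int,
    action: str,
    request_id: Optional[str] = None,
) -> dict[str, Any]:
    """Log one violation as structured JSON and bump the counters."""
    event = {
        "event": "CONTENT_LIMIT_VIOLATION",
        "tool_name": tool_name,
        "original_count": original_count,
        "enforced_limit": enforced_limit,
        "action": action,  # "truncate" or "block"
        "request_id": request_id,
    }
    # The timestamp is left to the logging formatter (e.g. %(asctime)s).
    logger.warning(json.dumps(event))

    metrics[("content_limit_violations_total", tool_name)] += 1
    if action == "truncate":
        # Assumption: this counter tracks how many items were dropped.
        dropped = original_count - enforced_limit
        metrics[("content_items_truncated_total", tool_name)] += dropped
    return event


record_violation("bulk-export", 200, 50, "truncate", request_id="req-1")
print(metrics[("content_limit_violations_total", "bulk-export")])
```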
US-5: Developer - Conditional Execution by Context
As a Developer
I want to apply content limits only to specific tools or tenants
So that trusted internal tools aren't unnecessarily restricted
Acceptance Criteria:
Given the plugin configuration includes:
conditions:
  - tools: ["external-.*"]
    tenant_ids: ["tenant-untrusted"]
When I invoke an internal tool "admin-dashboard"
Then the content limit should NOT be enforced
When I invoke "external-api-search" as "tenant-untrusted"
Then the content limit SHOULD be enforced
Technical Requirements:
Leverage plugin framework's conditions matching
Support tool name patterns (regex)
Support tenant_id and server_id filters
Document condition examples in plugin README
Test conditional execution in unit tests
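The matching behaviour in the acceptance criteria can be approximated as below. This is a standalone illustration rather than the plugin framework's own condition evaluator: `Condition` and `should_enforce` are hypothetical, and the semantics (all specified fields of a condition must match, any one condition suffices, an empty list means no restriction) are assumptions.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Condition:
    """One entry under `conditions:` (simplified, assumed shape)."""

    tools: list[str] = field(default_factory=list)  # regex patterns
    tenant_ids: list[str] = field(default_factory=list)  # exact matches

    def matches(self, tool_name: str, tenant_id: Optional[str]) -> bool:
        # An empty list places no restriction on that field.
        tool_ok = not self.tools or any(
            re.fullmatch(pattern, tool_name) for pattern in self.tools
        )
        tenant_ok = not self.tenant_ids or tenant_id in self.tenant_ids
        return tool_ok and tenant_ok


def should_enforce(
    conditions: list[Condition], tool_name: str, tenant_id: Optional[str]
) -> bool:
    """Enforce when no conditions are configured or any condition matches."""
    if not conditions:
        return True
    return any(c.matches(tool_name, tenant_id) for c in conditions)


conditions = [Condition(tools=["external-.*"], tenant_ids=["tenant-untrusted"])]
print(should_enforce(conditions, "admin-dashboard", "tenant-untrusted"))
print(should_enforce(conditions, "external-api-search", "tenant-untrusted"))
```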
🏗 Architecture
Plugin Hook Flow
sequenceDiagram
participant Client as MCP Client
participant Gateway as Gateway Core
participant Plugin as ContentLimitPlugin
participant Tool as Tool Service
Client->>Gateway: POST /tools/invoke
Gateway->>Tool: invoke_tool(name, args)
Tool-->>Gateway: ToolResult{content: [100 items]}
Gateway->>Plugin: tool_post_invoke(payload, context)
alt Content count > limit
Plugin->>Plugin: Count content items (100)
Plugin->>Plugin: Check per-tool limits
alt Mode: truncate
Plugin->>Plugin: Slice content to first N items
Plugin->>Plugin: Add truncation metadata
Plugin-->>Gateway: PluginResult{modified_payload, metadata}
Gateway-->>Client: ToolResult{content: [N items], metadata}
else Mode: block
Plugin->>Plugin: Create violation
Plugin-->>Gateway: PluginResult{continue_processing=False, violation}
Gateway-->>Client: Error: CONTENT_LIMIT_EXCEEDED
end
else Content count <= limit
Plugin-->>Gateway: PluginResult{continue_processing=True}
Gateway-->>Client: ToolResult{content: [original]}
end
Component Architecture
graph TB
subgraph "Plugin Framework"
A[PluginManager]
B[ContentLimitPlugin]
C[PluginContext]
end
subgraph "Configuration"
D[plugins/config.yaml]
E[ContentLimitConfig]
end
subgraph "Core Gateway"
F[ToolService]
G[ToolResult]
end
D -->|loads| E
E -->|configures| B
A -->|executes| B
F -->|returns| G
G -->|passes to| A
A -->|invokes hook| B
B -->|accesses| C
B -->|modifies| G
📋 Implementation Tasks
Phase 1: Core Plugin Implementation ✅
Create Plugin Structure
Create plugins/content_limit/ directory
Create content_limit_plugin.py with ContentLimitPlugin class
Extend Plugin base class from framework
Implement tool_post_invoke hook method
Add comprehensive docstrings with examples
Configuration Schema
Define ContentLimitConfig dataclass/Pydantic model
Fields: max_content_items, truncate_mode, log_violations, add_warning_message, per_tool_limits, item_selection_strategy
truncate_mode enum: truncate | block
item_selection_strategy enum: first | last
Validation: max_content_items >= 1
per_tool_limits schema: [{tool_pattern: str, max_items: int}]
Core Logic
Content source: ToolPostInvokePayload.result
Content types: TextContent, ImageContent, EmbeddedResource
Metadata: content_truncated, original_count, enforced_limit, truncation_strategy
Warning TextContent item when truncating
Phase 2: Error Handling & Logging ✅
Violation Handling
PluginViolation for block mode
Violation details: {tool_name, original_count, enforced_limit}
continue_processing=False in block mode
Logging
(… log_violations: true)
(… log_violations: true)
Observability
content_limit.violated, content_limit.original_count, content_limit.enforced_limit, content_limit.action
Phase 3: Configuration & Registration ✅
Plugin Manifest
plugin-manifest.yaml describing the plugin
Gateway Registration
plugins/config.yaml
permissive (log but don't block)
Environment Configuration
.env.example with example plugin config
PLUGINS_ENABLED=true requirement
Phase 4: Testing ✅
Unit Tests
Integration Tests
Edge Cases
Phase 5: Documentation ✅
Plugin README
plugins/content_limit/README.md
User Guide
docs/docs/using/plugins.md
Code Documentation
Phase 6: Quality & Polish ✅
Code Quality
make autoflake isort black
make flake8 and fix all issues
make pylint and address warnings
make doctest to validate examples
make verify checks
Performance
Security Review
⚙️ Configuration Example
plugins/config.yaml
Block Mode Example
✅ Success Criteria
🏁 Definition of Done
tool_post_invoke hook
make verify checks
plugins/config.yaml with examples
📝 Additional Notes
🔹 Plugin Framework Benefits: By implementing this as a plugin rather than core logic, operators can:
🔹 Security Considerations:
per_tool_limits must be validated to prevent ReDoS attacks
🔹 Performance Impact:
🔹 Future Enhancements:
🔹 Relationship to PR #1189:
🔗 Related Issues