[FEATURE][PLUGIN]: Content limit plugin - Resource exhaustion protection

# 🔌 Epic: Content Limit Plugin - Resource Exhaustion Protection

## Goal

Implement a **native plugin** that protects MCP Gateway from resource exhaustion by **limiting the number of content items** returned from tool invocations. This prevents malicious or misconfigured MCP servers from overwhelming downstream clients with excessively large response payloads, ensuring system stability and predictable resource consumption.

## Why Now?

With PR #1189 fixing the StreamableHTTP transport to return **all content items** instead of just the first one, the gateway now correctly honors MCP's multi-content responses. However, this also exposes several risks and operational gaps:

1. **Uncontrolled Resource Growth**: A malicious or buggy MCP server could return hundreds or thousands of content items (e.g., 100 items × 1MB each = 100MB), consuming excessive memory and bandwidth
2. **Denial of Service Risk**: Downstream clients (AI agents, web browsers) could be overwhelmed by large payloads, leading to crashes or degraded performance
3. **Operational Visibility Gap**: Currently no logging or metrics exist to detect when tools return abnormally large response sets
4. **Policy Enforcement Need**: Administrators need configurable controls to enforce organization-specific limits without code changes

By implementing this as a **plugin**, we leverage the existing plugin framework's capabilities (priority ordering, conditional execution, metadata tracking) and allow operators to enable/disable/tune the behavior independently of core gateway logic.

---

## 📖 User Stories

<details>
<summary>US-1: Platform Admin - Configure Content Limits</summary>

**As a** Platform Administrator
**I want** to configure maximum content item limits globally and per-tool
**So that** I can protect my infrastructure from resource exhaustion

**Acceptance Criteria:**

```gherkin
Given the plugin is registered in plugins/config.yaml
When I set the configuration:
  config:
    max_content_items: 50
    truncate_mode: "truncate"  # or "block"
    log_violations: true
    per_tool_limits:
      - tool_pattern: "large-dataset-.*"
        max_items: 10
      - tool_pattern: "bulk-export"
        max_items: 5
Then tools returning more than 50 items should be truncated to 50
And tools matching "large-dataset-.*" should be limited to 10 items
And violations should be logged with metadata
```

**Technical Requirements:**
- Global `max_content_items` limit (default: 50)
- Per-tool overrides via regex pattern matching
- Two enforcement modes: `truncate` (trim excess) or `block` (reject entirely)
- Structured logging of violations with tool name, original count, enforced limit
- Metadata injection for observability
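
A minimal sketch of how the effective limit could be resolved from the global default and the per-tool overrides. The function name `effective_limit`, the "first matching pattern wins" rule, and the use of anchored matching are assumptions made for illustration; the issue does not specify them.

```python
import re
from typing import Sequence, Tuple


def effective_limit(
    tool_name: str,
    global_limit: int,
    per_tool_limits: Sequence[Tuple[str, int]],
) -> int:
    """Return the limit of the first matching pattern, else the global limit."""
    for pattern, max_items in per_tool_limits:
        # Assumption: anchored matching, so "bulk-export" does not also
        # match "bulk-export-archive".
        if re.fullmatch(pattern, tool_name):
            return max_items
    return global_limit


overrides = [("large-dataset-.*", 10), ("bulk-export", 5)]
print(effective_limit("large-dataset-orders", 50, overrides))  # 10
print(effective_limit("bulk-export", 50, overrides))           # 5
print(effective_limit("weather-lookup", 50, overrides))        # 50
```

Whether matching should be anchored (`fullmatch`) or substring-based (`search`) changes which tools an override catches, so the choice is worth documenting in the plugin README.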

</details>

<details>
<summary>US-2: AI Agent - Receive Truncated Results with Metadata</summary>

**As an** AI Agent invoking tools via MCP
**I want** to receive a warning when results are truncated
**So that** I know the response is partial and can adjust my strategy

**Acceptance Criteria:**

```gherkin
Given a tool returns 100 content items
And the content limit plugin is configured with max_content_items: 25
When I invoke the tool via the gateway
Then I should receive:
 - Exactly 25 content items (the first 25)
 - Metadata indicating truncation occurred
 - Original count (100) and enforced limit (25) in metadata
 - A warning annotation in the response
```

**Technical Requirements:**
- Preserve the first N items by default; `item_selection_strategy` selects whether the first or last N items are kept
- Add `content_truncated: true` to response metadata
- Add `original_count` and `enforced_limit` fields
- Optionally append a warning TextContent item describing the truncation
- No data corruption or malformed responses
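
The truncation step and its metadata could look roughly like the sketch below. `truncate_content` is a hypothetical helper, and plain dictionaries stand in for the gateway's content types; only the metadata field names come from this issue.

```python
from typing import Any, Dict, List, Tuple


def truncate_content(
    items: List[Any], limit: int, strategy: str = "first"
) -> Tuple[List[Any], Dict[str, Any]]:
    """Return (kept_items, metadata); metadata is empty when nothing was cut."""
    if limit < 1:
        raise ValueError("limit must be >= 1")
    if strategy not in ("first", "last"):
        raise ValueError(f"unknown strategy: {strategy!r}")
    if len(items) <= limit:
        return list(items), {}
    # Slicing copies at most `limit` items and leaves the input untouched.
    kept = items[:limit] if strategy == "first" else items[-limit:]
    metadata = {
        "content_truncated": True,
        "original_count": len(items),
        "enforced_limit": limit,
        "truncation_strategy": strategy,
    }
    return kept, metadata


content = [{"type": "text", "text": f"row {i}"} for i in range(100)]
kept, meta = truncate_content(content, limit=25)
print(len(kept), meta["original_count"], meta["enforced_limit"])  # 25 100 25
```

If the optional warning item is appended after truncation, the response would carry N + 1 items, which the acceptance criteria above should account for.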

</details>

<details>
<summary>US-3: Security Engineer - Block Excessive Responses</summary>

**As a** Security Engineer
**I want** to completely reject tool responses exceeding limits
**So that** no truncated data reaches untrusted clients

**Acceptance Criteria:**

```gherkin
Given the plugin is in "block" mode
And a tool returns 200 content items
And the limit is 50
When the tool_post_invoke hook executes
Then the plugin should:
 - Return continue_processing=False
 - Set a PluginViolation with code "CONTENT_LIMIT_EXCEEDED"
 - Include violation details: tool name, count, limit
 - Log the violation at WARNING level
 - Return an error to the client instead of partial data
```

**Technical Requirements:**
- `truncate_mode: "block"` configuration option
- Violation reason: "Content limit exceeded"
- Violation code: "CONTENT_LIMIT_EXCEEDED"
- Detailed violation metadata for audit trails
- Client receives MCP error instead of partial response
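
As a rough illustration of block mode, the sketch below uses two stand-in dataclasses, `SketchViolation` and `SketchResult`. They are hypothetical and only mirror the fields named in this issue; the framework's real `PluginViolation` and `PluginResult` classes may have different constructors.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class SketchViolation:  # stand-in for the framework's PluginViolation
    reason: str
    code: str
    details: Dict[str, Any] = field(default_factory=dict)


@dataclass
class SketchResult:  # stand-in for the framework's PluginResult
    continue_processing: bool = True
    violation: Optional[SketchViolation] = None


def check_block_mode(tool_name: str, count: int, limit: int) -> SketchResult:
    """Reject the whole response when it carries more items than allowed."""
    if count <= limit:
        return SketchResult()
    return SketchResult(
        continue_processing=False,
        violation=SketchViolation(
            reason="Content limit exceeded",
            code="CONTENT_LIMIT_EXCEEDED",
            # Counts and tool name only; no raw content in the details.
            details={
                "tool_name": tool_name,
                "original_count": count,
                "enforced_limit": limit,
            },
        ),
    )


result = check_block_mode("external-api-search", count=200, limit=50)
print(result.continue_processing, result.violation.code)
```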

</details>

<details>
<summary>US-4: DevOps - Monitor Content Limit Violations</summary>

**As a** DevOps Engineer
**I want** to track content limit violations via logs and metrics
**So that** I can identify problematic tools and tune limits

**Acceptance Criteria:**

```gherkin
Given the plugin has log_violations: true
When a tool exceeds the content limit
Then the following should be logged:
 - Event type: "CONTENT_LIMIT_VIOLATION"
 - Tool name and invocation ID
 - Original content count vs. enforced limit
 - Action taken (truncate or block)
 - Timestamp and request context
And metrics should be incremented:
 - content_limit_violations_total (counter)
 - content_items_truncated_total (counter)
 - Tool-specific counters with labels
```

**Technical Requirements:**
- Structured JSON logging with violation details
- OpenTelemetry span attributes for observability
- Prometheus-compatible metrics (if metrics enabled)
- Log level: WARNING for violations, INFO for normal operation
- Correlation with request_id and user context
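
One way to emit the structured event and bump counters, using only the standard library. The logger name, the `record_violation` helper, and the in-memory `Counter` (a stand-in for Prometheus counters) are assumptions for illustration; the event type and metric names are taken from the acceptance criteria above.

```python
import json
import logging
from collections import Counter
from datetime import datetime, timezone
from typing import Any, Dict

logger = logging.getLogger("content_limit_sketch")
# Stand-in for labeled Prometheus counters, keyed by (metric_name, tool_name).
metrics: Counter = Counter()


def record_violation(
    tool_name: str,
    request_id: str,
    original_count: int,
    enforced_limit: int,
    action: str,
) -> Dict[str, Any]:
    """Log one violation as a JSON line at WARNING level and update counters."""
    event = {
        "event": "CONTENT_LIMIT_VIOLATION",
        "tool_name": tool_name,
        "request_id": request_id,
        "original_count": original_count,
        "enforced_limit": enforced_limit,
        "action": action,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    logger.warning(json.dumps(event))
    metrics[("content_limit_violations_total", tool_name)] += 1
    if action == "truncate":
        dropped = original_count - enforced_limit
        metrics[("content_items_truncated_total", tool_name)] += dropped
    return event
```

Calling `record_violation("bulk-export", "req-123", 100, 25, "truncate")` would log one JSON line and add 75 to the truncated-items counter for that tool.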

</details>

<details>
<summary>US-5: Developer - Conditional Execution by Context</summary>

**As a** Developer
**I want** to apply content limits only to specific tools or tenants
**So that** trusted internal tools aren't unnecessarily restricted

**Acceptance Criteria:**

```gherkin
Given the plugin configuration includes:
  conditions:
    - tools: ["external-.*"]
      tenant_ids: ["tenant-untrusted"]
When I invoke an internal tool "admin-dashboard"
Then the content limit should NOT be enforced
When I invoke "external-api-search" as "tenant-untrusted"
Then the content limit SHOULD be enforced
```

**Technical Requirements:**
- Leverage plugin framework's `conditions` matching
- Support tool name patterns (regex)
- Support tenant_id and server_id filters
- Document condition examples in plugin README
- Test conditional execution in unit tests
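
The scenario above can be expressed as a small predicate, which may be useful as a unit-test oracle. This is a sketch of the intended semantics only: it assumes that all keys inside one condition must match, that any one matching condition is enough, and that `tools` entries are anchored regex patterns. The plugin framework's actual `conditions` matching may differ.

```python
import re
from typing import Dict, List, Optional


def condition_matches(
    condition: Dict[str, List[str]], tool: str, tenant: Optional[str]
) -> bool:
    """True when every filter present in the condition accepts the request."""
    tools = condition.get("tools")
    if tools is not None and not any(re.fullmatch(p, tool) for p in tools):
        return False
    tenants = condition.get("tenant_ids")
    if tenants is not None and tenant not in tenants:
        return False
    return True


def should_enforce(
    conditions: List[Dict[str, List[str]]], tool: str, tenant: Optional[str]
) -> bool:
    """No conditions means always enforce; otherwise any match is enough."""
    if not conditions:
        return True
    return any(condition_matches(c, tool, tenant) for c in conditions)


conditions = [{"tools": ["external-.*"], "tenant_ids": ["tenant-untrusted"]}]
print(should_enforce(conditions, "admin-dashboard", "tenant-untrusted"))      # False
print(should_enforce(conditions, "external-api-search", "tenant-untrusted"))  # True
```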

</details>

---

## 🏗 Architecture

### Plugin Hook Flow

```mermaid
sequenceDiagram
    participant Client as MCP Client
    participant Gateway as Gateway Core
    participant Plugin as ContentLimitPlugin
    participant Tool as Tool Service

    Client->>Gateway: POST /tools/invoke
    Gateway->>Tool: invoke_tool(name, args)
    Tool-->>Gateway: ToolResult{content: [100 items]}
    Gateway->>Plugin: tool_post_invoke(payload, context)

    alt Content count > limit
        Plugin->>Plugin: Count content items (100)
        Plugin->>Plugin: Check per-tool limits

        alt Mode: truncate
            Plugin->>Plugin: Slice content to first N items
            Plugin->>Plugin: Add truncation metadata
            Plugin-->>Gateway: PluginResult{modified_payload, metadata}
            Gateway-->>Client: ToolResult{content: [N items], metadata}
        else Mode: block
            Plugin->>Plugin: Create violation
            Plugin-->>Gateway: PluginResult{continue_processing=False, violation}
            Gateway-->>Client: Error: CONTENT_LIMIT_EXCEEDED
        end
    else Content count <= limit
        Plugin-->>Gateway: PluginResult{continue_processing=True}
        Gateway-->>Client: ToolResult{content: [original]}
    end
```

### Component Architecture

```mermaid
graph TB
    subgraph "Plugin Framework"
        A[PluginManager]
        B[ContentLimitPlugin]
        C[PluginContext]
    end

    subgraph "Configuration"
        D[plugins/config.yaml]
        E[ContentLimitConfig]
    end

    subgraph "Core Gateway"
        F[ToolService]
        G[ToolResult]
    end

    D -->|loads| E
    E -->|configures| B
    A -->|executes| B
    F -->|returns| G
    G -->|passes to| A
    A -->|invokes hook| B
    B -->|accesses| C
    B -->|modifies| G

---

## 📋 Implementation Tasks

### Phase 1: Core Plugin Implementation ✅

- [ ] **Create Plugin Structure**
  - [ ] Create `plugins/content_limit/` directory
  - [ ] Create `content_limit_plugin.py` with `ContentLimitPlugin` class
  - [ ] Extend `Plugin` base class from framework
  - [ ] Implement `tool_post_invoke` hook method
  - [ ] Add comprehensive docstrings with examples

- [ ] **Configuration Schema**
  - [ ] Define `ContentLimitConfig` dataclass/Pydantic model
  - [ ] Fields: `max_content_items`, `truncate_mode`, `log_violations`, `add_warning_message`, `per_tool_limits`, `item_selection_strategy`
  - [ ] Validate `truncate_mode` enum: `truncate | block`
  - [ ] Validate `item_selection_strategy` enum: `first | last`
  - [ ] Validate `max_content_items` >= 1
  - [ ] Per-tool limits schema: `[{tool_pattern: str, max_items: int}]`

- [ ] **Core Logic**
  - [ ] Implement content counting from `ToolPostInvokePayload.result`
  - [ ] Handle different content types: `TextContent`, `ImageContent`, `EmbeddedResource`
  - [ ] Parse per-tool limits and match tool name against patterns (regex)
  - [ ] Determine effective limit (per-tool override or global)
  - [ ] Implement truncation logic (first N or last N items)
  - [ ] Implement blocking logic (return violation)
  - [ ] Add metadata fields: `content_truncated`, `original_count`, `enforced_limit`, `truncation_strategy`
  - [ ] Optional: append warning `TextContent` item when truncating
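
The configuration schema and its validation rules could be sketched with a plain dataclass as below. The class name `ContentLimitConfigSketch` is hypothetical, and every default except `max_content_items: 50` is an assumption; a Pydantic model with field validators would express the same checks.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ContentLimitConfigSketch:
    max_content_items: int = 50             # default stated in this issue
    truncate_mode: str = "truncate"         # assumed default
    log_violations: bool = True             # assumed default
    add_warning_message: bool = False       # assumed default
    item_selection_strategy: str = "first"  # assumed default
    per_tool_limits: List[Dict[str, Any]] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.max_content_items < 1:
            raise ValueError("max_content_items must be >= 1")
        if self.truncate_mode not in ("truncate", "block"):
            raise ValueError("truncate_mode must be 'truncate' or 'block'")
        if self.item_selection_strategy not in ("first", "last"):
            raise ValueError("item_selection_strategy must be 'first' or 'last'")
        for entry in self.per_tool_limits:
            # Each override needs a pattern and a positive item count.
            if not isinstance(entry.get("tool_pattern"), str):
                raise ValueError("per_tool_limits entries need a tool_pattern string")
            max_items = entry.get("max_items")
            if not isinstance(max_items, int) or max_items < 1:
                raise ValueError("per_tool_limits max_items must be an int >= 1")


config = ContentLimitConfigSketch(
    max_content_items=25,
    per_tool_limits=[{"tool_pattern": "bulk-export-.*", "max_items": 10}],
)
print(config.max_content_items, config.truncate_mode)  # 25 truncate
```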

### Phase 2: Error Handling & Logging ✅

- [ ] **Violation Handling**
  - [ ] Create `PluginViolation` for block mode
  - [ ] Set reason: "Content limit exceeded"
  - [ ] Set code: "CONTENT_LIMIT_EXCEEDED"
  - [ ] Include details: `{tool_name, original_count, enforced_limit}`
  - [ ] Set `continue_processing=False` in block mode

- [ ] **Logging**
  - [ ] Log INFO when plugin initializes with config
  - [ ] Log WARNING when content is truncated (if `log_violations: true`)
  - [ ] Log WARNING when request is blocked (if `log_violations: true`)
  - [ ] Include structured fields: tool_name, request_id, user, original_count, limit, action
  - [ ] Use LoggingService from framework

- [ ] **Observability**
  - [ ] Add OpenTelemetry span attributes when limits enforced
  - [ ] Span attributes: `content_limit.violated`, `content_limit.original_count`, `content_limit.enforced_limit`, `content_limit.action`
  - [ ] Add metadata to plugin result for downstream metrics

### Phase 3: Configuration & Registration ✅

- [ ] **Plugin Manifest**
  - [ ] Create `plugin-manifest.yaml` describing the plugin
  - [ ] Document all configuration options
  - [ ] Provide usage examples

- [ ] **Gateway Registration**
  - [ ] Add entry to `plugins/config.yaml`
  - [ ] Set default priority (e.g., 200 - after auth/PII but before other transforms)
  - [ ] Set default mode: `permissive` (log but don't block)
  - [ ] Document recommended priority ordering
  - [ ] Example config with per-tool limits

- [ ] **Environment Configuration**
  - [ ] Update `.env.example` with example plugin config
  - [ ] Document `PLUGINS_ENABLED=true` requirement
  - [ ] Add Jinja template variables if needed

### Phase 4: Testing ✅

- [ ] **Unit Tests**
  - [ ] Test truncation mode with various content counts (0, 1, 50, 100)
  - [ ] Test block mode violation generation
  - [ ] Test per-tool limit matching (regex patterns)
  - [ ] Test item selection strategies (first vs last)
  - [ ] Test metadata injection
  - [ ] Test warning message appending
  - [ ] Test plugin with no limits (should pass through)
  - [ ] Test with empty content list
  - [ ] Test with mixed content types (TextContent + ImageContent)
  - [ ] Test condition filtering (only execute for certain tools)

- [ ] **Integration Tests**
  - [ ] Test via PluginManager with real ToolPostInvokePayload
  - [ ] Test priority ordering with other plugins
  - [ ] Test context isolation between requests
  - [ ] Verify logging output structure
  - [ ] Test configuration parsing from YAML

- [ ] **Edge Cases**
  - [ ] Test with malformed content structure
  - [ ] Test with None or missing result.content
  - [ ] Test with negative or zero limits (should error)
  - [ ] Test regex pattern compilation errors
  - [ ] Test timeout behavior (should complete quickly)

### Phase 5: Documentation ✅

- [ ] **Plugin README**
  - [ ] Create `plugins/content_limit/README.md`
  - [ ] Overview and use cases
  - [ ] Configuration reference with all options
  - [ ] Examples: global limit, per-tool limits, block mode
  - [ ] Troubleshooting section
  - [ ] Performance notes

- [ ] **User Guide**
  - [ ] Add section to `docs/docs/using/plugins.md`
  - [ ] Describe content limit plugin capabilities
  - [ ] Configuration walkthrough
  - [ ] Integration with observability stack

- [ ] **Code Documentation**
  - [ ] Comprehensive docstrings in plugin class
  - [ ] Inline comments for complex logic (regex matching, item selection)
  - [ ] Type hints for all methods
  - [ ] Doctests in class/method docstrings

### Phase 6: Quality & Polish ✅

- [ ] **Code Quality**
  - [ ] Run `make autoflake isort black`
  - [ ] Run `make flake8` and fix all issues
  - [ ] Run `make pylint` and address warnings
  - [ ] Run `make doctest` to validate examples
  - [ ] Pass `make verify` checks

- [ ] **Performance**
  - [ ] Ensure O(1) limit checking
  - [ ] Minimize overhead for unconstrained tools
  - [ ] Profile plugin execution time (<5ms typical)
  - [ ] Test with large content arrays (1000+ items)

- [ ] **Security Review**
  - [ ] Validate regex patterns safely (no ReDoS)
  - [ ] Ensure no information leakage in violation messages
  - [ ] Verify metadata doesn't expose sensitive data
  - [ ] Test with malicious content payloads
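
For the regex validation item, a conservative starting point is to compile every `tool_pattern` once at configuration load and reject anything that fails or looks oversized. The helper name and the 128-character cap below are arbitrary assumptions. A length cap does not by itself rule out catastrophic backtracking, so this is a sketch of fail-fast validation rather than a complete ReDoS defense.

```python
import re
from typing import List, Pattern

MAX_PATTERN_LENGTH = 128  # arbitrary cap chosen for this sketch


def compile_tool_patterns(patterns: List[str]) -> List[Pattern[str]]:
    """Compile patterns up front so bad config fails at load, not per request."""
    compiled = []
    for raw in patterns:
        if len(raw) > MAX_PATTERN_LENGTH:
            raise ValueError(f"tool_pattern too long ({len(raw)} chars)")
        try:
            compiled.append(re.compile(raw))
        except re.error as exc:
            # Surface the offending pattern without echoing anything else.
            raise ValueError(f"invalid tool_pattern {raw!r}: {exc}") from exc
    return compiled


compiled = compile_tool_patterns(["bulk-export-.*", "large-dataset-query"])
print(len(compiled))  # 2
```

Precompiling also keeps per-request work to one match call per configured pattern.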

---

## ⚙️ Configuration Example

### plugins/config.yaml

```yaml
plugins:
  - name: "ContentLimitPlugin"
    kind: "plugins.content_limit.content_limit_plugin.ContentLimitPlugin"
    description: "Limits content items in tool responses to prevent resource exhaustion"
    version: "1.0.0"
    author: "MCP Gateway Team"
    hooks:
      - tool_post_invoke
    mode: "permissive"  # Log violations but allow truncated results
    priority: 200  # After auth/PII filters, before other transforms

    conditions:
      # Apply to all tools by default; can be restricted here
      - tools: [".*"]  # All tools

    config:
      # Global limit (default: 50 content items)
      max_content_items: 50

      # Enforcement mode: "truncate" (trim excess) or "block" (reject entirely)
      truncate_mode: "truncate"

      # Log violations for monitoring
      log_violations: true

      # Add a warning TextContent item when truncating
      add_warning_message: true

      # Item selection strategy: "first" (keep first N) or "last" (keep last N)
      item_selection_strategy: "first"

      # Per-tool overrides (regex patterns)
      per_tool_limits:
        - tool_pattern: "bulk-export-.*"
          max_items: 10
        - tool_pattern: "large-dataset-query"
          max_items: 5
        - tool_pattern: "streaming-logs"
          max_items: 100
```

### Block Mode Example

```yaml
plugins:
  - name: "ContentLimitPlugin-Strict"
    kind: "plugins.content_limit.content_limit_plugin.ContentLimitPlugin"
    hooks: [tool_post_invoke]
    mode: "enforce"  # Block requests that violate limits
    priority: 200
    config:
      max_content_items: 25
      truncate_mode: "block"  # Reject instead of truncate
      log_violations: true
      per_tool_limits:
        - tool_pattern: "external-api-.*"
          max_items: 10
```

---

## ✅ Success Criteria

- [ ] **Functionality**: Plugin correctly limits content items in truncate and block modes
- [ ] **Configurability**: Global and per-tool limits work as documented
- [ ] **Observability**: Violations are logged with structured metadata
- [ ] **Performance**: Negligible overhead (<5ms) for compliant responses
- [ ] **Testing**: 20+ unit tests with 90%+ coverage for plugin code
- [ ] **Integration**: Works seamlessly with existing plugin framework
- [ ] **Documentation**: Complete README with examples and troubleshooting
- [ ] **Quality**: Passes all linting, formatting, and verification checks

---

## 🏁 Definition of Done

- [x] Plugin class implemented with `tool_post_invoke` hook
- [x] Configuration schema defined and validated
- [x] Truncate and block modes functional
- [x] Per-tool limit overrides working (regex patterns)
- [x] Metadata injection for observability
- [x] Structured logging of violations
- [x] 20+ unit tests with 90%+ coverage
- [x] Integration tests with PluginManager
- [x] README documentation complete
- [x] Code passes `make verify` checks
- [x] Registered in `plugins/config.yaml` with examples
- [x] Tested with real tool responses (100+ items)
- [x] Security review completed (no ReDoS, no leaks)
- [x] Performance benchmarked (<5ms overhead)
- [x] Team review and approval

---

## 📝 Additional Notes

🔹 **Plugin Framework Benefits**: By implementing this as a plugin rather than core logic, operators can:
 - Enable/disable without code changes
 - Tune limits per environment (dev vs prod)
 - Customize behavior via conditions (tenant-specific limits)
 - Combine with other plugins (e.g., PII filter first, then content limit)

🔹 **Security Considerations**:
 - Regex patterns in `per_tool_limits` must be validated to prevent ReDoS attacks
 - Warning messages should not leak sensitive information about internal limits
 - Violation metadata is safe to expose (tool name, counts) but not raw content

🔹 **Performance Impact**:
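 - Content counting is O(1) when `content` is a list (a length lookup); any per-item inspection would be O(n), where n = number of items and typically n < 100
 - Per-tool matching costs one regex match per configured pattern (m patterns), and each match depends on pattern complexity; keep patterns simple
 - Truncation/slicing is O(k), where k = the enforced limit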
 - Expected overhead: <5ms for typical responses

🔹 **Future Enhancements**:
 - Size-based limits (e.g., max total bytes across all content items)
 - Rate limiting (max items per time window)
 - Content type-specific limits (e.g., max 10 images, unlimited text)
 - Dynamic limits based on client role/tier
 - Integration with quota systems

🔹 **Relationship to PR #1189**:
 - This plugin directly addresses the security concern raised during #1189 review
 - PR #1189 fixed a bug (truncation to first item); this plugin adds policy enforcement
 - Both changes are complementary: correct protocol behavior + protective guardrails

---

## 🔗 Related Issues

- #1188 - StreamableHTTP multiple content support (bug report)
- #1189 - Fix return multiple StreamableHTTP content (implementation)

[FEATURE][PLUGIN]: Content limit plugin - Resource exhaustion protection #1191
