Add opt-in prompt injection detection middleware with pluggable strategy

## Summary

As MCP servers increasingly handle untrusted input from LLM-generated tool calls, there's growing need for built-in safety mechanisms. This issue proposes an opt-in middleware for detecting potential prompt injection attacks before tool execution.

## Problem

MCP servers may receive malicious inputs designed to:
- Override system instructions ("ignore previous instructions and...")
- Exfiltrate data through tool calls
- Escalate privileges by manipulating tool arguments

Currently, FastMCP provides no built-in protection against these attacks. Each server author must implement their own detection logic.

## Proposed Solution

Add a **`PromptInjectionMiddleware`** with a **pluggable detection strategy**:

```python
from fastmcp.server.middleware import PromptInjectionMiddleware
from fastmcp.server.middleware.safety import HeuristicDetector, InjectionDetector

# Default: heuristic-based detection (fast, offline, no dependencies)
mcp = FastMCP("secure-server")
mcp.add_middleware(PromptInjectionMiddleware())

# Custom: LLM-based detection (more accurate, requires API key)
class LLMDetector(InjectionDetector):
    async def detect(self, input_text: str) -> DetectionResult:
        # Call classifier LLM
        ...

mcp.add_middleware(PromptInjectionMiddleware(detector=LLMDetector()))
```

### Detection Strategy Protocol

```python
from typing import Protocol
from dataclasses import dataclass

@dataclass
class DetectionResult:
    is_suspicious: bool
    confidence: float  # 0.0-1.0
    reason: str | None = None
    matched_pattern: str | None = None

class InjectionDetector(Protocol):
    async def detect(self, input_text: str) -> DetectionResult: ...

# Built-in heuristic detector
class HeuristicDetector:
    """Pattern-based detection using known attack signatures."""
    
    PATTERNS = [
        r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions?",
        r"disregard\s+(everything|all)\s+(above|before)",
        r"you\s+are\s+now\s+(a|an)\s+",
        r"new\s+instructions?:\s*",
        r"system\s*:\s*",
        # ... more patterns
    ]
    
    async def detect(self, input_text: str) -> DetectionResult:
        for pattern in self.PATTERNS:
            if re.search(pattern, input_text, re.IGNORECASE):
                return DetectionResult(
                    is_suspicious=True,
                    confidence=0.7,
                    matched_pattern=pattern
                )
        return DetectionResult(is_suspicious=False, confidence=0.0)
```

### Middleware Configuration

```python
PromptInjectionMiddleware(
    detector=HeuristicDetector(),  # Pluggable strategy
    action="block",                # "block" | "warn" | "log"
    confidence_threshold=0.6,      # Minimum confidence to trigger action
    scan_arguments=True,           # Scan tool arguments
    scan_resources=False,          # Scan resource URIs
)
```

## Scope (v1)

This first iteration focuses on:
- ✅ **Input detection only** (not output sanitization - that's a follow-up)
- ✅ **Tool arguments** (not resource content or prompt messages)
- ✅ **Heuristic default** (LLM-based detector as example, not built-in)

## Benefits

- **Opt-in**: Zero overhead for servers that don't need it
- **Pluggable**: Enterprise users can provide sophisticated detectors
- **Extensible**: Community can contribute detector implementations
- **Configurable**: Adjustable sensitivity and actions

## Implementation Notes

- Ship with `HeuristicDetector` as default (no external dependencies)
- Document known bypass limitations of heuristic approach
- Provide example `LLMDetector` implementation in docs
- Consider publishing pattern database as separate updatable resource

## Related

- OWASP LLM Top 10: LLM01 Prompt Injection
- Follows middleware patterns established in FastMCP
- Complements auth middleware for defense in depth

---

*This issue was identified during an architecture review of the FastMCP codebase.*

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add opt-in prompt injection detection middleware with pluggable strategy #3080

Summary

Problem

Proposed Solution

Detection Strategy Protocol

Middleware Configuration

Scope (v1)

Benefits

Implementation Notes

Related

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Add opt-in prompt injection detection middleware with pluggable strategy #3080

Description

Summary

Problem

Proposed Solution

Detection Strategy Protocol

Middleware Configuration

Scope (v1)

Benefits

Implementation Notes

Related

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions