Feat : Integrate hooks in LiteLLM to modify/reject requests and responses #3959
Conversation
🤖 AI Security analysis: "Docker Compose exposes a service port on all interfaces, allowing access from any network. This increases risk of unauthorized access, data exposure, and remote exploitation if the service lacks proper authentication and network restrictions."
| Risk Level | AI Score |
|---|---|
| 🟢 LOW | 35.0/100 |
Top 4 security issues / 4 total (Critical: 0, High: 0, Medium: 4, Low: 0)
```yaml
    restart: always
    networks:
      - guardrails-network
    ports:
```
🟡 MEDIUM: Service port exposed on all interfaces
Bind the published port to localhost to avoid exposing the service on all host interfaces. If external access is required, bind to a specific allowed IP or use a reverse proxy with proper access controls.
Suggested fix: `- "127.0.0.1:80:8080"`
```yaml
    image: docker.litellm.ai/berriai/litellm:main-stable
    networks:
      - guardrails-network
    ports:
```
🟡 MEDIUM: Service port exposed on all interfaces
Bind the published port to localhost (127.0.0.1) so the service is not exposed on all host interfaces; this limits access to the local machine and reduces unintended external exposure.
Suggested fix: `- "127.0.0.1:4000:4000"`
```yaml
      POSTGRES_DB: "${POSTGRES_DB}"
      POSTGRES_USER: "${POSTGRES_USER}"
      POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
    ports:
```
🟡 MEDIUM: Service port exposed on all interfaces
Bind the published PostgreSQL port to localhost to avoid exposing the database on all host interfaces (prevents external access from remote hosts). If remote access is required, use a secure network or firewall instead.
Suggested fix: `- "127.0.0.1:5432:5432"`
```yaml
    volumes:
      - prometheus_data:/prometheus
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
```
🟡 MEDIUM: Service port exposed on all interfaces
Bind the Prometheus host port to localhost to avoid exposing the service on all network interfaces. This restricts access to the local machine (prevents external access).
Suggested fix: `- "127.0.0.1:9091:9090"`
Pull request overview
This PR integrates LiteLLM proxy support into the Akto platform, enabling traffic capture and guardrails validation for LiteLLM-based AI agent deployments. The integration follows the existing pattern used for N8N, Langchain, and Copilot Studio connectors.
Changes:
- Added LiteLLM connector type with configuration constants and validation in Java backend
- Implemented Python-based custom hooks for LiteLLM proxy to intercept and validate requests via the guardrails service (a minimal sketch of this hook pattern follows this list)
- Created Docker Compose setup with LiteLLM, PostgreSQL, Prometheus, and guardrails service integration
- Added frontend UI components and constants for LiteLLM connector configuration
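For orientation, here is a minimal sketch of the custom-hook pattern the second item refers to, based on LiteLLM's documented `CustomLogger` / `async_pre_call_hook` callback. The class name, the `_check_with_guardrails` helper, and the rejection message are illustrative placeholders, not the PR's actual code, and the exact hook signature can vary across LiteLLM versions.

```python
from litellm.integrations.custom_logger import CustomLogger


class GuardrailsHandlerSketch(CustomLogger):
    """Illustrative only: reject a request when a guardrails check denies it."""

    async def async_pre_call_hook(self, user_api_key_dict, cache, data: dict, call_type: str):
        # A real implementation would POST `data` to the guardrails service here.
        allowed = await self._check_with_guardrails(data, call_type)  # hypothetical helper
        if not allowed:
            # Raising inside the pre-call hook makes the LiteLLM proxy reject the request.
            raise ValueError("Request blocked by guardrails policy")
        # Returning the (possibly modified) request data lets the call proceed.
        return data

    async def _check_with_guardrails(self, data: dict, call_type: str) -> bool:
        return True  # placeholder so the sketch stays self-contained


# config.yaml references an instance like this: callbacks: [custom_hooks.proxy_handler_instance]
proxy_handler_instance = GuardrailsHandlerSketch()
```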
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 16 comments.
Summary per file:
| File | Description |
|---|---|
| libs/utils/src/main/java/com/akto/jobs/executors/AIAgentConnectorConstants.java | Added LITELLM connector type constant and configuration keys |
| libs/utils/src/main/java/com/akto/jobs/executors/AIAgentConnectorUtils.java | Extended validation to include LITELLM connector type |
| apps/dashboard/src/main/java/com/akto/action/AIAgentConnectorImportAction.java | Added LiteLLM-specific parameters and configuration building logic |
| apps/account-job-executor/src/main/java/com/akto/account_job_executor/executor/executors/AIAgentConnectorExecutor.java | Implemented LiteLLM connector execution logic |
| apps/guardrails-service/litellm/custom_hooks.py | Core Python implementation of LiteLLM hooks for request validation |
| apps/guardrails-service/litellm/docker-compose.yaml | Multi-service Docker setup for LiteLLM with guardrails integration |
| apps/guardrails-service/litellm/config.yaml | LiteLLM proxy configuration with custom hooks |
| apps/guardrails-service/litellm/prometheus.yml | Prometheus monitoring configuration |
| apps/guardrails-service/litellm/.env.example | Environment variable template for LiteLLM setup |
| apps/dashboard/web/public/litellm.svg | LiteLLM logo asset |
| apps/dashboard/web/polaris_web/web/src/apps/dashboard/pages/quick_start/constants/aiAgentConnectorConstants.js | Frontend constants for LiteLLM connector |
| apps/dashboard/web/polaris_web/web/src/apps/dashboard/pages/quick_start/transform.js | UI integration for LiteLLM connector setup |
```java
case CONNECTOR_TYPE_LITELLM:
    if (litellmUrl == null || litellmUrl.isEmpty() || litellmApiKey == null
            || litellmApiKey.isEmpty()) {
        loggerMaker.error("Missing required LiteLLM configuration", LogDb.DASHBOARD);
        return null;
    }
    config.put(CONFIG_LITELLM_BASE_URL, litellmUrl);
    config.put(CONFIG_LITELLM_API_KEY, litellmApiKey);
    break;
```
Copilot AI · Jan 19, 2026
The dataIngestionUrl parameter is added to the config map without any validation. If this URL is null or empty, it will still be added to the configuration, potentially causing issues downstream. While the other connector types validate their required fields, there's no validation for this common configuration parameter that appears to be mandatory across all connector types.
```diff
@@ -0,0 +1,8 @@
+global:
+  scrape_interval: 15s
```
Copilot AI · Jan 19, 2026
The Prometheus scrape interval is set to 15 seconds, which may be too aggressive for a guardrails service that could be processing high volumes of requests. This could add unnecessary load on the LiteLLM service. Consider increasing the interval to 30s or 60s unless real-time monitoring at 15-second granularity is specifically required.
Suggested change:
```diff
-  scrape_interval: 15s
+  scrape_interval: 30s
```
```python
class GuardrailsHandler(CustomLogger):
    def __init__(self):
        super().__init__()
        self.client = httpx.AsyncClient(timeout=TIMEOUT)
```
Copilot AI · Jan 19, 2026
The httpx.AsyncClient is created without any connection limits or proper resource management in the constructor. This could lead to connection pool exhaustion under heavy load. Consider adding connection limits using httpx.Limits and implementing proper lifecycle management to ensure the client is closed when the handler is destroyed.
Suggested change:
```python
MAX_CONNECTIONS = int(os.getenv("GUARDRAILS_MAX_CONNECTIONS", "100"))
MAX_KEEPALIVE_CONNECTIONS = int(os.getenv("GUARDRAILS_MAX_KEEPALIVE_CONNECTIONS", "20"))


class GuardrailsHandler(CustomLogger):
    def __init__(self):
        super().__init__()
        self.client = httpx.AsyncClient(
            timeout=TIMEOUT,
            limits=httpx.Limits(
                max_connections=MAX_CONNECTIONS,
                max_keepalive_connections=MAX_KEEPALIVE_CONNECTIONS,
            ),
        )
```
```python
async def _validate_background(self, data: dict, call_type: str):
    try:
        allowed, _ = await self._call_guardrails(data, call_type)
        if not allowed:
            logger.warning("Guardrails violation detected (async)")
    except Exception as e:
        logger.error(f"Guardrails error (async): {e}")
```
Copilot AI · Jan 19, 2026
The error handling catches all exceptions broadly with 'except Exception', which includes system-level exceptions that should typically propagate. This makes it difficult to distinguish between expected errors (like network timeouts) and unexpected errors (like programming bugs). Consider catching specific exception types like httpx.HTTPError, httpx.TimeoutException, and json.JSONDecodeError separately for better error diagnostics and handling.
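A minimal sketch of what narrower handling could look like for this method, assuming the same `logger` and `_call_guardrails` as in the snippet above. Logging the `reason` is also illustrative; it is not what the current code does.

```python
import httpx

async def _validate_background(self, data: dict, call_type: str):
    try:
        allowed, reason = await self._call_guardrails(data, call_type)
        if not allowed:
            logger.warning("Guardrails violation detected (async): %s", reason)
    except httpx.TimeoutException:
        # Expected when the guardrails service is slow or unreachable in time.
        logger.warning("Guardrails request timed out (async)")
    except httpx.HTTPError as exc:
        # Transport / protocol errors while talking to the guardrails service.
        logger.error("Guardrails HTTP error (async): %s", exc)
    except ValueError as exc:
        # resp.json() raises json.JSONDecodeError, a ValueError subclass.
        logger.error("Guardrails returned an unparsable response (async): %s", exc)
```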
```python
async def _call_guardrails(
    self,
    data: dict,
    call_type: str,
) -> Tuple[bool, str]:
    if not GUARDRAILS_URL:
        return True, ""

    query = ""
    if "messages" in data:
        for m in data["messages"]:
            content = m.get("content", "")
            if isinstance(content, list):
                for item in content:
                    if item.get("type") == "text":
                        query += item.get("text", "") + " "
            elif isinstance(content, str):
                query += content + " "
    else:
        query = data.get("prompt", "")

    payload = {
        "query": query.strip(),
        "model": data.get("model", ""),
    }

    resp = await self.client.post(
        f"{GUARDRAILS_URL}/api/validate/request",
        json={
            "payload": json.dumps(payload),
            "call_type": call_type,
        },
    )

    if resp.status_code != 200:
        raise RuntimeError(f"Guardrails HTTP {resp.status_code}")

    result = resp.json()
    return (
        result.get("Allowed", result.get("allowed", True)),
        result.get("Reason", result.get("reason", "")),
    )
```
Copilot AI · Jan 19, 2026
The function returns a tuple of (bool, str) for the allowed status and reason, but the reason value is retrieved from the response and never used in the calling code. The _validate_and_block and _validate_background methods both ignore the second return value. Consider either using the reason for logging/error messages, or simplifying the return type to just return a boolean.
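One hedged option, sketched below: keep the tuple but surface the reason when blocking. The exception type and the body of `_validate_and_block` are assumptions for illustration, since that method's implementation is not shown in this thread.

```python
async def _validate_and_block(self, data: dict, call_type: str):
    allowed, reason = await self._call_guardrails(data, call_type)
    if not allowed:
        logger.warning("Guardrails violation detected: %s", reason)
        # Raising here is how a pre-call hook can reject the request; including
        # the reason makes the rejection actionable for callers.
        raise ValueError(f"Request blocked by guardrails: {reason or 'policy violation'}")
    return data
```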
```yaml
      - "--detailed_debug"
    environment:
      DATABASE_URL: "postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@db:5432/${POSTGRES_DB}"
      STORE_MODEL_IN_DB: "True"
```
Copilot AI · Jan 19, 2026
The docker-compose.yaml file references environment variables from a .env file, but the litellm service also needs access to GUARDRAILS_BASE_URL, SYNC_MODE, and GUARDRAILS_TIMEOUT which are defined in .env.example. However, these are not explicitly listed in the environment section of the litellm service, relying only on env_file. While this works, it would be more explicit and maintainable to list these critical environment variables in the environment section with their default values or references to make the dependencies clear.
Suggested change:
```yaml
      STORE_MODEL_IN_DB: "True"
      GUARDRAILS_BASE_URL: "${GUARDRAILS_BASE_URL}"
      SYNC_MODE: "${SYNC_MODE}"
      GUARDRAILS_TIMEOUT: "${GUARDRAILS_TIMEOUT}"
```
```python
if resp.status_code != 200:
    raise RuntimeError(f"Guardrails HTTP {resp.status_code}")

result = resp.json()
```
Copilot AI · Jan 19, 2026
The response from the guardrails service is parsed as JSON without checking the content-type header. If the service returns an error page (HTML) or plain text instead of JSON, the resp.json() call will raise an exception that gets caught in the generic exception handler, potentially masking the real error. Consider checking the response content-type or handling JSON decode errors specifically to provide better error messages.
Suggested change:
```python
content_type = resp.headers.get("content-type", "")
if "application/json" not in content_type:
    # Guardrails service responded with a non-JSON payload; surface a clear error.
    body_preview = resp.text[:200]
    raise RuntimeError(
        f"Guardrails response is not JSON (content-type={content_type!r}). "
        f"Body preview: {body_preview!r}"
    )
try:
    result = resp.json()
except (ValueError, httpx.DecodingError) as exc:
    body_preview = resp.text[:200]
    raise RuntimeError(
        f"Failed to decode Guardrails JSON response: {exc}. "
        f"Body preview: {body_preview!r}"
    ) from exc
```
```yaml
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: [ "CMD-SHELL", "pg_isready -d litellm -U llmproxy" ]
```
Copilot AI · Jan 19, 2026
The hardcoded database name 'litellm' and username 'llmproxy' in the healthcheck command don't match the dynamic environment variables POSTGRES_DB and POSTGRES_USER. This will cause the healthcheck to fail if different values are provided in the environment variables. The test command should use the environment variables instead.
Suggested change:
```diff
-      test: [ "CMD-SHELL", "pg_isready -d litellm -U llmproxy" ]
+      test: [ "CMD-SHELL", "pg_isready -d \"$POSTGRES_DB\" -U \"$POSTGRES_USER\"" ]
```
```python
asyncio.create_task(
    self._validate_background(data, call_type)
)
```
Copilot AI · Jan 19, 2026
The async background task created with asyncio.create_task is not being tracked or awaited. This could lead to unhandled exceptions being silently ignored and potential resource leaks. The task should be stored and properly managed, or exception handling should be added to ensure errors are logged appropriately.
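A minimal sketch of one common pattern: keep a strong reference to each task and log failures from a done callback. The `self._background_tasks` set is an assumed addition to the handler's `__init__`, not existing code in this PR.

```python
# In __init__ (assumed addition): self._background_tasks = set()

task = asyncio.create_task(self._validate_background(data, call_type))
self._background_tasks.add(task)  # strong reference so the task isn't garbage-collected

def _on_done(t: asyncio.Task) -> None:
    self._background_tasks.discard(t)
    if not t.cancelled() and t.exception() is not None:
        logger.error("Background guardrails validation failed: %s", t.exception())

task.add_done_callback(_on_done)
```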
```python
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

GUARDRAILS_URL = os.getenv("GUARDRAILS_BASE_URL")
```
Copilot AI · Jan 19, 2026
The guardrails service URL is retrieved from the environment but there's no validation that it's a properly formatted URL or that it uses a secure protocol (HTTPS). If an invalid URL is provided, it will only fail at runtime when the first request is made. Consider validating the URL format during initialization and logging a warning if HTTP is used instead of HTTPS in production environments.
Suggested change:
```python
def _validate_guardrails_url(url: str) -> str:
    """
    Validate the guardrails base URL and log any potential issues.
    This does not raise; it only logs, to avoid changing existing behavior.
    """
    if not url:
        logger.warning(
            "GUARDRAILS_BASE_URL is not set; guardrails calls may fail at runtime."
        )
        return url
    try:
        parsed = httpx.URL(url)
    except Exception as exc:
        logger.error(
            "Invalid GUARDRAILS_BASE_URL '%s': %s. Guardrails calls may fail at runtime.",
            url,
            exc,
        )
        return url
    scheme = parsed.scheme
    if scheme not in ("http", "https"):
        logger.warning(
            "GUARDRAILS_BASE_URL uses unsupported URL scheme '%s'.", scheme
        )
    elif scheme == "http":
        env = os.getenv("ENV") or os.getenv("ENVIRONMENT") or os.getenv("PYTHON_ENV")
        if env and env.lower() == "production":
            logger.warning(
                "GUARDRAILS_BASE_URL is using HTTP in a production environment; HTTPS is recommended."
            )
    return url


RAW_GUARDRAILS_URL = os.getenv("GUARDRAILS_BASE_URL")
GUARDRAILS_URL = _validate_guardrails_url(RAW_GUARDRAILS_URL)
```
Commits:
- …e query validation
- …d improve query validation" (This reverts commit 03bc3e0.)
```diff
@@ -0,0 +1,23 @@
+# LiteLLM Configuration
+LITELLM_MASTER_KEY=sk-1234
```
Do not commit the key
```java
 * Downloads the LiteLLM shield binary from Azure Storage and executes it with
 * config as env vars.
 */
private void executeLiteLLMConnector(AccountJob job, Map<String, Object> config) throws Exception {
```
This won't be necessary for this integration
```java
        executeCopilotStudioConnector(job, config);
        break;

    case "LITELLM":
```
This won't be necessary
```java
        config.put(CONFIG_DATAVERSE_CLIENT_SECRET, dataverseClientSecret);
        break;

    case CONNECTOR_TYPE_LITELLM:
```
This won't be necessary
```javascript
];

// LiteLLM Field Configuration
export const LITELLM_FIELDS = [
```
```yaml
  callbacks: [custom_hooks.proxy_handler_instance]
  drop_params: true
  set_verbose: false
  request_timeout: 600
```
Keep the timeout very small for the callback, ideally < 5 seconds
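For illustration, a hedged sketch of keeping the guardrails callback timeout small in custom_hooks.py. The GUARDRAILS_TIMEOUT variable comes from the .env.example discussed earlier; the 5-second default and hard cap are assumptions taken from this comment, not existing code.

```python
import os
import httpx

# Keep the guardrails callback timeout small so a slow guardrails service
# cannot stall proxied LLM traffic; cap it at 5 seconds regardless of config.
TIMEOUT = min(float(os.getenv("GUARDRAILS_TIMEOUT", "5")), 5.0)
client = httpx.AsyncClient(timeout=httpx.Timeout(TIMEOUT))
```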
```yaml
global:
  scrape_interval: 15s

scrape_configs:
```
Why is Prometheus needed?
```java
    return CONNECTOR_TYPE_N8N.equals(connectorType) ||
            CONNECTOR_TYPE_LANGCHAIN.equals(connectorType) ||
            CONNECTOR_TYPE_COPILOT_STUDIO.equals(connectorType);
            CONNECTOR_TYPE_LANGCHAIN.equals(connectorType) ||
```
This won't be needed as we are not scheduling any job.
```diff
@@ -0,0 +1,195 @@
+from litellm.integrations.custom_logger import CustomLogger
```
Move all the files for the LiteLLM integration to the folder apps/mcp-endpoint-shield/litellm.
Merged commit ddb913c into akto-api-security:feat/litellm-integration

No description provided.