Is your feature request related to a problem?
- Yes, it is related to a problem
Describe the feature you'd like
Feature Description
Reduce unnecessary LLM API calls in the message classification system by adding smart caching and simple pattern matching.
This feature will (a rough sketch of the building blocks follows this list):
- Detect common messages (e.g. greetings, thanks, acknowledgments) without calling the LLM
- Cache previous LLM classification results using an LRU cache with TTL
- Normalize messages (lowercase, trim spaces, etc.) to improve cache hits
- Track basic metrics to measure cache usage and saved LLM calls
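A minimal sketch of those building blocks, assuming nothing about the existing codebase; `normalize`, `NON_ACTIONABLE`, and `TTLCache` are hypothetical names used only for illustration:

```python
import re
import time
from collections import OrderedDict
from typing import Optional

def normalize(message: str) -> str:
    """Lowercase, trim, and collapse internal whitespace to improve cache hits."""
    return re.sub(r"\s+", " ", message.strip().lower())

# Simple patterns for messages that never need LLM triage (illustrative list).
NON_ACTIONABLE = re.compile(
    r"^(hi|hello|hey|thanks|thank you|ty|ok|okay|got it|np)[.!?]*$"
)

class TTLCache:
    """LRU cache whose entries expire after ttl seconds."""

    def __init__(self, maxsize: int = 1024, ttl: float = 3600.0):
        self.maxsize = maxsize
        self.ttl = ttl
        self._data: "OrderedDict[str, tuple[float, bool]]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[bool]:
        entry = self._data.get(key)
        if entry is None or time.monotonic() - entry[0] > self.ttl:
            self._data.pop(key, None)       # drop expired entry, if any
            self.misses += 1
            return None
        self._data.move_to_end(key)         # refresh LRU position
        self.hits += 1
        return entry[1]

    def put(self, key: str, value: bool) -> None:
        self._data[key] = (time.monotonic(), value)
        self._data.move_to_end(key)
        if len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # evict least recently used
```

The `hits`/`misses` counters double as the basic metrics mentioned above: cache hit rate is `hits / (hits + misses)`, and each hit is one saved LLM call.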
Problem Statement
Currently, the ClassificationRouter makes an LLM API call for every single Discord message, even for very simple or repeated messages.
This leads to:
- Unnecessarily high API usage
- Increased latency
- Higher operational costs
Current Behavior
```python
async def should_process_message(self, message: str, context: Dict[str, Any] = None):
    # Every message goes straight to the LLM; there is no pre-filtering or caching
    response = await self.llm.ainvoke([HumanMessage(content=triage_prompt)])
```

Every incoming message triggers the LLM, regardless of whether it is:
- A simple greeting like "hi"
- A repeated message
- A non-actionable acknowledgment
Expected Outcome
After this enhancement:
- Simple messages are handled using pattern matching
- Repeated messages reuse results from the cache
- LLM calls are made only when truly needed
- Overall performance and efficiency improve significantly
This will reduce API calls, lower costs, and make the system faster and more scalable.
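One possible shape for the updated `should_process_message`, as a sketch rather than the project's actual design. It reuses the hypothetical `normalize`, `NON_ACTIONABLE`, and `TTLCache` helpers sketched above; the `llm.ainvoke`/`HumanMessage` usage mirrors the snippet under Current Behavior, and the triage prompt here is only a placeholder:

```python
from langchain_core.messages import HumanMessage  # matches the usage in Current Behavior

class ClassificationRouter:
    def __init__(self, llm):
        self.llm = llm
        self.cache = TTLCache(maxsize=1024, ttl=3600)
        self.llm_calls_saved = 0  # basic metric: triage decisions made without the LLM

    async def should_process_message(self, message: str, context=None) -> bool:
        key = normalize(message)

        # Tier 1: pattern matching catches greetings/acknowledgments with no I/O.
        if NON_ACTIONABLE.match(key):
            self.llm_calls_saved += 1
            return False

        # Tier 2: repeated messages reuse the cached classification.
        cached = self.cache.get(key)
        if cached is not None:
            self.llm_calls_saved += 1
            return cached

        # Tier 3: fall back to the LLM only when truly needed.
        # Placeholder prompt; the real triage prompt is defined elsewhere in the project.
        triage_prompt = f"Should this Discord message be processed? {message}"
        response = await self.llm.ainvoke([HumanMessage(content=triage_prompt)])
        result = "yes" in response.content.lower()  # assumes a yes/no style reply
        self.cache.put(key, result)
        return result
```

Ordering the tiers from cheapest to most expensive means the LLM is reached only after both the pattern matcher and the cache have declined, which is exactly where the API-call savings come from.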
Record
- I agree to follow this project's Code of Conduct
- I want to work on implementing this feature