Feature request: Add structured memory primitives for better context engineering #791
deathbyknowledge
left a comment
This is not a final review yet, but I'm struggling to see the bigger picture here. It seems quite opinionated, and I'm not sure the benefits outweigh the negatives.
Could you update one or two of the existing examples (or make a new one) to use this approach and see how it feels?
/**
 * Token limit processor - ensures context fits within token limits
 */
export const tokenLimitRequestProcessor: RequestProcessor<
This relies on estimateTokenCount, which only guesses token counts. Since tokenizers vary quite a bit depending on the model in use, I'd prefer that we remove this one and let users provide their own.
Maybe we can add "character" truncation instead?
True. Ideally the choice of processor implementation would be left to the user; that would be step 0. It could still be included at a later stage, once context compaction for keeping long conversations going becomes more common, and if there are benefits to including it in the SDK at that point. The entire processors.ts can be part of a later PR.
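To make the suggestion in this thread concrete, here is a minimal sketch of truncation against a budget measured by a caller-supplied counter, with character count as the default. The names `Message`, `Counter`, `charCounter`, and `truncateToBudget` are hypothetical and are not part of the PR; a user could pass a model-specific tokenizer as the counter.

```typescript
// Hypothetical message shape; the PR's real types may differ.
interface Message {
  role: "user" | "assistant" | "system";
  content: string;
}

// A counter measures the "size" of a string. Character count is the default;
// callers can plug in a model-specific tokenizer instead.
type Counter = (text: string) => number;

const charCounter: Counter = (text) => text.length;

// Keep the most recent messages whose combined size fits within `budget`.
function truncateToBudget(
  messages: Message[],
  budget: number,
  count: Counter = charCounter
): Message[] {
  const kept: Message[] = [];
  let used = 0;
  // Walk backwards so the newest messages are retained first.
  for (let i = messages.length - 1; i >= 0; i--) {
    const size = count(messages[i].content);
    if (used + size > budget) break;
    kept.unshift(messages[i]);
    used += size;
  }
  return kept;
}
```

With this shape, the SDK never has to guess at tokenization: the default is exact for characters, and anything model-specific stays in user code.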
 */
export class Session {
  readonly metadata: SessionMetadata;
  readonly events: Event[];
Do we want to use the DO sqlite to handle storage here? Or perhaps storing events directly, kind of like an event store?
Good call. SQLite storage would be better and fits the existing pattern. A cf_agents_session table might be a better way to store a user's session: the primary key would be a session_id, and events would be stored in order so that the last N user-agent turns in a given session are easy to retrieve. Those turns would serve as the working context for the current request. What are your thoughts?
(Apologies for the late response. I missed the email notification for this set of comments and only noticed it earlier today.)
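As a rough illustration of the retrieval pattern described in this reply, the sketch below models ordered events per session in memory and returns the events belonging to the last N turns. It is not the proposed SQLite implementation: `EventRow`, `InMemoryEventStore`, the `seq` column, and the rule that a turn starts at a user event are all assumptions made for the example.

```typescript
// Hypothetical row shape for a `cf_agents_session`-style table.
interface EventRow {
  session_id: string;
  seq: number; // assumed: increases by one per event within a session
  role: "user" | "agent";
  content: string;
}

class InMemoryEventStore {
  private rows: EventRow[] = [];
  private nextSeq = new Map<string, number>();

  // Append an event to a session, assigning the next sequence number.
  append(sessionId: string, role: EventRow["role"], content: string): EventRow {
    const seq = this.nextSeq.get(sessionId) ?? 0;
    this.nextSeq.set(sessionId, seq + 1);
    const row: EventRow = { session_id: sessionId, seq, role, content };
    this.rows.push(row);
    return row;
  }

  // Return the events of the last `n` turns, oldest first.
  // Assumption: a turn begins at a user event and includes what follows it.
  lastTurns(sessionId: string, n: number): EventRow[] {
    if (n <= 0) return [];
    const events = this.rows
      .filter((r) => r.session_id === sessionId)
      .sort((a, b) => a.seq - b.seq);
    let turns = 0;
    for (let i = events.length - 1; i >= 0; i--) {
      if (events[i].role === "user" && ++turns === n) {
        return events.slice(i);
      }
    }
    return events; // the session has fewer than `n` turns
  }
}
```

In a SQLite-backed version, the same query would presumably be an ordered read filtered by session, but the exact schema is left open in the thread.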
I'll review further on Wednesday.
feat(memory): Add Session, WorkingContext, and Processor Pipeline primitives
Summary
This PR introduces structured memory primitives for building context-aware agents, based on the tiered memory architecture principles from Google's ADK whitepaper.
Why?
Currently, agent developers must manually manage conversation history, handle context window limits, and implement their own compaction strategies. This leads to:
These primitives provide a model-agnostic foundation that separates the ground truth (Session) from the computed view (WorkingContext), enabling clean abstractions for context management and compaction.
What's Added
1. Session - The Ground Truth
A durable, structured log of agent interactions. Sessions are model-agnostic and serve as the source of truth.
2. WorkingContext - The Computed View
An ephemeral, computed view sent to the LLM. Rebuilt for each invocation.
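The split between the two can be sketched as follows. This is a simplified illustration, not the PR's actual classes: `SessionEvent`, `SessionLog`, `WorkingContextView`, and `buildWorkingContext` are hypothetical names, and "take the most recent events" is an assumed policy chosen only to keep the example short.

```typescript
// Hypothetical event shape stored in the durable log.
interface SessionEvent {
  kind: "user_message" | "agent_message";
  text: string;
  timestamp: number;
}

// The ground truth: an append-only log that is never rewritten per request.
class SessionLog {
  private readonly log: SessionEvent[] = [];

  append(event: SessionEvent): void {
    this.log.push(event);
  }

  get events(): readonly SessionEvent[] {
    return this.log;
  }
}

type Role = "user" | "assistant";

// The computed view: a plain structure rebuilt for every model invocation.
interface WorkingContextView {
  messages: { role: Role; content: string }[];
}

// Derive a fresh view from the log without mutating it.
function buildWorkingContext(
  session: SessionLog,
  maxEvents: number
): WorkingContextView {
  const recent = maxEvents > 0 ? session.events.slice(-maxEvents) : [];
  const messages = recent.map((e) => {
    const role: Role = e.kind === "user_message" ? "user" : "assistant";
    return { role, content: e.text };
  });
  return { messages };
}
```

Because the view is derived on demand, changing how context is built never requires migrating stored session data.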
3. Processor Pipeline - Modular Context Building
A pipeline of processors that transform Session state into WorkingContext, inspired by Google ADK's processor pattern.
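A minimal sketch of the pipeline idea, assuming each processor is a pure function from one context to the next. The names `Ctx`, `ContextProcessor`, `runPipeline`, `dropEmpty`, and `keepLast` are hypothetical; the PR's `RequestProcessor` type may have a different signature.

```typescript
// Hypothetical, simplified context: just a list of message strings.
interface Ctx {
  messages: string[];
}

// Assumed processor shape: takes a context and returns a new one.
type ContextProcessor = (ctx: Ctx) => Ctx;

// Apply processors left to right, feeding each output into the next.
function runPipeline(initial: Ctx, processors: ContextProcessor[]): Ctx {
  return processors.reduce((ctx, processor) => processor(ctx), initial);
}

// Example processor: remove blank messages.
const dropEmpty: ContextProcessor = (ctx) => ({
  messages: ctx.messages.filter((m) => m.trim() !== ""),
});

// Example processor factory: keep only the last `n` messages.
const keepLast = (n: number): ContextProcessor => (ctx) => ({
  messages: n > 0 ? ctx.messages.slice(-n) : [],
});
```

Order matters in this design: running `dropEmpty` before `keepLast(2)` can yield a different result than the reverse, which is one reason an explicit, ordered list of processors is easier to reason about than implicit hooks.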
Built-in Request Processors:
Built-in Response Processors:
Bonus: Compaction Without Model Coupling
The primitives enable context compaction without worrying about underlying model structure:
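As a hedged sketch of what such compaction could look like, the function below replaces older events with a single summary event while leaving recent events untouched. `LogEvent`, `Summarizer`, and `compact` are hypothetical names, and the summarizer is supplied by the caller, so nothing here depends on a particular model or message format.

```typescript
// Hypothetical event shape; a "summary" event stands in for compacted history.
interface LogEvent {
  type: "message" | "summary";
  text: string;
}

// Caller-supplied summarizer (for example, a call to any LLM of their choice).
type Summarizer = (events: LogEvent[]) => string;

// Replace everything except the last `keepRecent` events with one summary event.
// Returns a new array; the input is left unchanged.
function compact(
  events: LogEvent[],
  keepRecent: number,
  summarize: Summarizer
): LogEvent[] {
  if (keepRecent < 0) {
    throw new RangeError("keepRecent must be non-negative");
  }
  if (events.length <= keepRecent) {
    return events.slice(); // nothing old enough to compact
  }
  const cut = events.length - keepRecent;
  const older = events.slice(0, cut);
  const recent = events.slice(cut);
  const summary: LogEvent = { type: "summary", text: summarize(older) };
  return [summary, ...recent];
}
```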
Why this matters:
Event Types
Strongly-typed events for all agent interactions:
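For illustration, strongly-typed events are commonly modeled in TypeScript as a discriminated union. The variants and field names below (`AgentEvent`, `toolName`, `args`, `output`) are assumptions for the sketch and may not match the event types defined in the PR.

```typescript
// Hypothetical event union, discriminated by the `type` field.
type AgentEvent =
  | { type: "user_message"; text: string }
  | { type: "agent_message"; text: string }
  | { type: "tool_call"; toolName: string; args: Record<string, unknown> }
  | { type: "tool_result"; toolName: string; output: string };

// Narrowing on `type` gives each branch access to only its own fields.
function describeEvent(event: AgentEvent): string {
  switch (event.type) {
    case "user_message":
      return `user: ${event.text}`;
    case "agent_message":
      return `agent: ${event.text}`;
    case "tool_call":
      return `call ${event.toolName}(${JSON.stringify(event.args)})`;
    case "tool_result":
      return `result ${event.toolName}: ${event.output}`;
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a
      // variant is added to AgentEvent without a matching case.
      const unreachable: never = event;
      return unreachable;
    }
  }
}
```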
Current Limitations
Files Added