Feature request: Add structured memory primitives for better context engineering #791

mchockal · 2026-01-20T17:53:36Z

feat(memory): Add Session, WorkingContext, and Processor Pipeline primitives

Summary

This PR introduces structured memory primitives for building context-aware agents, based on the tiered memory architecture principles from Google's ADK whitepaper.

Why?

Currently, agent developers must manually manage conversation history, handle context window limits, and implement their own compaction strategies. This leads to:

Duplicated effort across agent implementations
Inconsistent approaches to memory management
Tight coupling between conversation state and model-specific formats
Complex compaction logic scattered throughout application code

These primitives provide a model-agnostic foundation that separates the ground truth (Session) from the computed view (WorkingContext), enabling clean abstractions for context management and compaction.

What's Added

1. Session - The Ground Truth

A durable, structured log of agent interactions. Sessions are model-agnostic and serve as the source of truth.

import { Session, EventAction, generateEventId } from 'agents/memory';

// Create a new session
const session = new Session('my-agent');

// Add events
session.addEvent({
  id: generateEventId(),
  action: EventAction.USER_MESSAGE,
  timestamp: Date.now(),
  content: 'What is the weather in San Francisco?'
});

// Get conversation turns (user message + agent response + tool calls)
const turns = session.getConversationTurns();

// Serialize for storage
const serialized = session.serialize();
const restored = Session.deserialize(serialized);
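
How `getConversationTurns()` groups events is not shown in this description. A minimal sketch of one plausible rule, assuming a turn starts at each user message and runs until the next one. `SketchEvent`, `SketchAction`, and `groupIntoTurns` are hypothetical names for illustration, not the PR's API:

```typescript
// Hypothetical event shape for illustration; the PR's real Event type may differ.
type SketchAction = "user_message" | "agent_message" | "tool_call" | "tool_result";

interface SketchEvent {
  id: string;
  action: SketchAction;
  timestamp: number;
  content: string;
}

// Assumption: a turn begins at a user message and includes everything
// (agent responses, tool calls, tool results) up to the next user message.
function groupIntoTurns(events: readonly SketchEvent[]): SketchEvent[][] {
  const turns: SketchEvent[][] = [];
  for (const ev of events) {
    if (ev.action === "user_message" || turns.length === 0) {
      turns.push([ev]); // start a new turn
    } else {
      turns[turns.length - 1].push(ev); // attach to the current turn
    }
  }
  return turns;
}

const sampleLog: SketchEvent[] = [
  { id: "e1", action: "user_message", timestamp: 1, content: "What is the weather in San Francisco?" },
  { id: "e2", action: "tool_call", timestamp: 2, content: "getWeather(\"San Francisco\")" },
  { id: "e3", action: "tool_result", timestamp: 3, content: "62F, fog" },
  { id: "e4", action: "agent_message", timestamp: 4, content: "It is 62F and foggy." },
  { id: "e5", action: "user_message", timestamp: 5, content: "And in New York?" },
];

const sampleTurns = groupIntoTurns(sampleLog);

// Plain-data events survive a JSON round trip, which keeps the log easy to persist.
const restoredLog: SketchEvent[] = JSON.parse(JSON.stringify(sampleLog));
```

Keeping events as plain data (no class instances, no functions) is what makes the serialize/deserialize pair above cheap to implement.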

2. WorkingContext - The Computed View

An ephemeral, computed view sent to the LLM. Rebuilt for each invocation.

import { WorkingContext } from 'agents/memory';

const context = new WorkingContext('session-123');

// Add system instructions
context.addSystemInstruction('You are a helpful assistant.');

// Add conversation content
context.addContent({ role: 'user', content: 'Hello!' });
context.addContent({ role: 'assistant', content: 'Hi there!' });

// Convert to model format (currently supports workers-ai)
const modelInput = context.toModelFormat('workers-ai', {
  format: 'chat_completions'
});
// { messages: [{ role: 'system', content: '...' }, ...] }
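
To make the "computed view" idea concrete, here is a rough sketch of what a chat-completions adapter might do. It assumes system instructions are emitted first, one message each, followed by the conversation contents. `SketchContext` and `toChatCompletions` are hypothetical stand-ins, not the PR's `WorkingContext` or `toModelFormat()`:

```typescript
// Hypothetical shapes for illustration; not the PR's actual types.
type ChatRole = "system" | "user" | "assistant";

interface ChatMessage {
  role: ChatRole;
  content: string;
}

interface SketchContext {
  systemInstructions: string[];
  contents: ChatMessage[];
}

// Assumption: system instructions come first, one message each,
// followed by the conversation contents in their original order.
function toChatCompletions(ctx: SketchContext): { messages: ChatMessage[] } {
  const system = ctx.systemInstructions.map(
    (text): ChatMessage => ({ role: "system", content: text })
  );
  return { messages: [...system, ...ctx.contents] };
}

const sketchInput = toChatCompletions({
  systemInstructions: ["You are a helpful assistant."],
  contents: [
    { role: "user", content: "Hello!" },
    { role: "assistant", content: "Hi there!" },
  ],
});
```

Because the adapter is a pure function of the context, supporting another provider would mean adding another function like this one rather than touching the stored data.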

3. Processor Pipeline - Modular Context Building

A pipeline of processors that transform Session state into WorkingContext, inspired by Google ADK's processor pattern.

import {
  ProcessorPipeline,
  basicRequestProcessor,
  instructionsRequestProcessor,
  compactionFilterRequestProcessor
} from 'agents/memory';

// Create pipeline with processors
const pipeline = new ProcessorPipeline();
pipeline.addRequestProcessor('basic', basicRequestProcessor);
pipeline.addRequestProcessor('instructions', instructionsRequestProcessor, {
  instructions: ['You are a helpful assistant.']
});
pipeline.addRequestProcessor('compaction-filter', compactionFilterRequestProcessor, {
  keepCompactionSummaries: true
});

// Execute pipeline to build WorkingContext from Session
const workingContext = await pipeline.executeRequestPipeline(session);
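
The pipeline internals are not shown here. One way such a pipeline could work, sketched under the assumption that each processor receives the context built so far and returns the next one. `SketchPipeline`, `SketchProcessor`, `addInstructions`, and `addRecentContents` are hypothetical names, not the PR's classes:

```typescript
// Hypothetical minimal shapes; the PR's Session and WorkingContext are richer.
interface PipelineSession {
  id: string;
  events: { role: "user" | "assistant"; text: string }[];
}

interface PipelineContext {
  sessionId: string;
  instructions: string[];
  contents: { role: "user" | "assistant"; text: string }[];
}

// A processor receives the context built so far and returns the next one.
type SketchProcessor<O> = (
  ctx: PipelineContext,
  session: PipelineSession,
  options: O
) => PipelineContext | Promise<PipelineContext>;

type BoundStep = (ctx: PipelineContext, session: PipelineSession) => Promise<PipelineContext>;

class SketchPipeline {
  private readonly steps: { name: string; run: BoundStep }[] = [];

  // Bind the options at registration time so execution only needs the session.
  add<O>(name: string, processor: SketchProcessor<O>, options: O): this {
    this.steps.push({
      name,
      run: async (ctx, session) => processor(ctx, session, options),
    });
    return this;
  }

  // Run the processors in registration order, threading the context through.
  async execute(session: PipelineSession): Promise<PipelineContext> {
    let ctx: PipelineContext = { sessionId: session.id, instructions: [], contents: [] };
    for (const step of this.steps) {
      ctx = await step.run(ctx, session);
    }
    return ctx;
  }
}

const addInstructions: SketchProcessor<{ instructions: string[] }> = (ctx, _session, options) => ({
  ...ctx,
  instructions: [...ctx.instructions, ...options.instructions],
});

// Keeps only the most recent `maxEvents` events (a crude stand-in for a sliding window).
const addRecentContents: SketchProcessor<{ maxEvents: number }> = (ctx, session, options) => ({
  ...ctx,
  contents: options.maxEvents > 0 ? session.events.slice(-options.maxEvents) : [],
});

const demoPipeline = new SketchPipeline()
  .add("instructions", addInstructions, { instructions: ["You are a helpful assistant."] })
  .add("contents", addRecentContents, { maxEvents: 2 });

const demoSession: PipelineSession = {
  id: "session-123",
  events: [
    { role: "user", text: "Hello!" },
    { role: "assistant", text: "Hi there!" },
    { role: "user", text: "What can you do?" },
  ],
};

// execute() returns a promise; callers would normally `await` it.
const demoContextPromise = demoPipeline.execute(demoSession);
```

Processors in this sketch never mutate the session, which matches the split between ground truth and computed view described above.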

Built-in Request Processors:

Processor	Description
basicRequestProcessor	Initializes context with session metadata
instructionsRequestProcessor	Adds system instructions
identityRequestProcessor	Adds agent identity
contentsRequestProcessor	Transforms events to content
slidingWindowRequestProcessor	Keeps recent conversation turns
compactionFilterRequestProcessor	Filters compacted events, includes summaries
contextCacheRequestProcessor	Marks stable prefixes for caching
tokenLimitRequestProcessor	Truncates to fit token limits

Built-in Response Processors:

Processor	Description
statisticsResponseProcessor	Updates session statistics

Bonus: Compaction Without Model Coupling

The primitives enable context compaction without coupling the compaction logic to any model's message format:

import { Session, EventAction, generateEventId, type CompactionEvent } from 'agents/memory';

// Session tracks compaction configuration
session.updateCompactionConfig({
  enabled: true,
  windowSize: 5,        // Keep last 5 turns
  strategy: 'sliding_window'
});

// When context exceeds limits, create a compaction event
const compactionEvent: CompactionEvent = {
  id: generateEventId(),
  action: EventAction.COMPACTION,
  timestamp: Date.now(),
  summary: 'User asked about weather in SF, NY, and London. All responses provided.',
  compactedEventIds: ['event-1', 'event-2', 'event-3'],
  compactionStrategy: 'sliding_window',
  originalTokenCount: 2000,
  compactedTokenCount: 150
};

// Add the compaction event to the session log
session.addEvent(compactionEvent);

// The compactionFilterRequestProcessor automatically:
// 1. Filters out events that were compacted
// 2. Includes the compaction summary in the context
// 3. Preserves recent turns for continuity
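
The three steps in the comment above can be sketched as a pure function over the event log. This is an illustration under stated assumptions, not the PR's `compactionFilterRequestProcessor`: `LogEntry`, `ViewItem`, and `applyCompactionFilter` are hypothetical, and placing summaries ahead of the surviving events is an assumed ordering:

```typescript
// Hypothetical log entry shapes for illustration.
type LogEntry =
  | { kind: "message"; id: string; role: "user" | "assistant"; text: string }
  | { kind: "compaction"; id: string; summary: string; compactedEventIds: string[] };

interface ViewItem {
  role: "system" | "user" | "assistant";
  text: string;
}

function applyCompactionFilter(log: readonly LogEntry[]): ViewItem[] {
  // 1. Collect the ids of every event covered by a compaction.
  const compacted = new Set<string>();
  const summaries: ViewItem[] = [];
  for (const entry of log) {
    if (entry.kind === "compaction") {
      for (const id of entry.compactedEventIds) compacted.add(id);
      // 2. Carry the summary into the view (assumed here to be a system message).
      summaries.push({ role: "system", text: `Summary of earlier conversation: ${entry.summary}` });
    }
  }
  // 3. Keep the events that were not compacted, in their original order.
  const recent: ViewItem[] = [];
  for (const entry of log) {
    if (entry.kind === "message" && !compacted.has(entry.id)) {
      recent.push({ role: entry.role, text: entry.text });
    }
  }
  return [...summaries, ...recent];
}

const compactionLog: LogEntry[] = [
  { kind: "message", id: "e1", role: "user", text: "Weather in SF?" },
  { kind: "message", id: "e2", role: "assistant", text: "62F, fog." },
  { kind: "message", id: "e3", role: "user", text: "And NY?" },
  { kind: "message", id: "e4", role: "assistant", text: "45F, clear." },
  { kind: "message", id: "e5", role: "user", text: "And London?" },
  {
    kind: "compaction",
    id: "c1",
    summary: "User asked about weather in SF and NY.",
    compactedEventIds: ["e1", "e2", "e3"],
  },
];

const compactedView = applyCompactionFilter(compactionLog);
```

The session log itself is left untouched here; only the view handed to the model shrinks.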

Why this matters:

Compaction logic is decoupled from model format - the Session stores structured events, not raw messages
The Processor Pipeline handles the transformation - compaction summaries are injected at the right place
Developers can implement custom summarization (e.g., using an LLM) while the SDK handles the plumbing
Future model support requires only adding new toModelFormat() adapters, not changing compaction logic

Event Types

Strongly-typed events for all agent interactions:

Event Type	Description
USER_MESSAGE	User input
AGENT_MESSAGE	Agent response
TOOL_CALL	Tool invocation
TOOL_RESULT	Tool execution result
COMPACTION	Context summarization
ERROR	Error occurrence
CONTROL_SIGNAL	Control flow signals
AGENT_TRANSFER	Multi-agent handoff
SYSTEM_INSTRUCTION	Dynamic instruction updates
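
What "strongly-typed" buys in practice can be shown with a discriminated union and an exhaustive `switch`. The sketch below covers only a subset of the table, and its string tags and payload fields (`toolName`, `args`, `result`, `message`) are assumptions rather than the PR's definitions in events.ts:

```typescript
// Hypothetical subset of the event types, modeled as a discriminated union.
type TypedEvent =
  | { action: "user_message"; content: string }
  | { action: "agent_message"; content: string }
  | { action: "tool_call"; toolName: string; args: Record<string, unknown> }
  | { action: "tool_result"; toolName: string; result: string }
  | { action: "error"; message: string };

// The compiler narrows `ev` in each branch, so only valid fields are accessible.
function describeEvent(ev: TypedEvent): string {
  switch (ev.action) {
    case "user_message":
      return `user: ${ev.content}`;
    case "agent_message":
      return `agent: ${ev.content}`;
    case "tool_call":
      return `call ${ev.toolName}(${JSON.stringify(ev.args)})`;
    case "tool_result":
      return `result from ${ev.toolName}: ${ev.result}`;
    case "error":
      return `error: ${ev.message}`;
    default: {
      // Adding a new variant without handling it makes this assignment a type error.
      const unhandled: never = ev;
      return unhandled;
    }
  }
}

const describedCall = describeEvent({
  action: "tool_call",
  toolName: "getWeather",
  args: { city: "SF" },
});
```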

Current Limitations

⚠️ Note: For now, this only supports the Workers AI chat completions format via toModelFormat('workers-ai'). Support for additional native model providers (OpenAI, Anthropic, etc.) and integration with the Vercel AI SDK will follow shortly.

Files Added

packages/agents/src/memory/
├── index.ts           # Module exports
├── events.ts          # Event types and EventAction enum
├── session.ts         # Session class
├── working-context.ts # WorkingContext class
└── processors.ts      # ProcessorPipeline and built-in processors

changeset-bot · 2026-01-20T17:53:41Z

⚠️ No Changeset found

Latest commit: 8078063

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

pkg-pr-new · 2026-01-20T19:01:54Z

npm i https://pkg.pr.new/cloudflare/agents@791

commit: 8078063

deathbyknowledge

This is not a final review yet, but I'm struggling to see the bigger picture here. It seems quite opinionated, and I'm not sure the benefits outweigh the negatives.

Could you update one or two of the existing examples (or make a new one) to use this approach and see how it feels?

deathbyknowledge · 2026-01-28T11:01:17Z

packages/agents/src/memory/processors.ts

+/**
+ * Token limit processor - ensures context fits within token limits
+ */
+export const tokenLimitRequestProcessor: RequestProcessor<


This relies on estimateTokenCount, which only guesses the token count. Since tokenizers vary quite a bit depending on the model in use, I'd prefer that we remove this one and let users provide their own.

Maybe we can add "character" truncation instead?

mchockal · 2026-02-03

True. Ideally, the choice of this processor's implementation would be left to the user; that would be step 0. It could be included at a later stage, once context compaction for keeping long conversations going becomes more common and if there are benefits to having it in the SDK at that point. The entire processors.ts can be part of a later PR.
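
Both suggestions in this thread (character truncation, and letting users supply their own counter) could be combined along these lines. This is a sketch under assumptions, not code from the PR: `truncateToBudget`, `CostFn`, and `characterCost` are hypothetical names, and dropping the oldest messages first is an assumed policy:

```typescript
interface BudgetMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Users can plug in a real tokenizer here; the default just counts characters.
type CostFn = (text: string) => number;
const characterCost: CostFn = (text) => text.length;

// Keeps the newest messages whose combined cost fits within `budget`.
function truncateToBudget(
  messages: readonly BudgetMessage[],
  budget: number,
  cost: CostFn = characterCost
): BudgetMessage[] {
  const kept: BudgetMessage[] = [];
  let used = 0;
  // Walk from newest to oldest so the most recent messages survive.
  for (let i = messages.length - 1; i >= 0; i--) {
    const c = cost(messages[i].content);
    if (used + c > budget) break; // stop at the first message that does not fit
    used += c;
    kept.unshift(messages[i]);
  }
  return kept;
}

const budgetMessages: BudgetMessage[] = [
  { role: "user", content: "aaaaa" },
  { role: "assistant", content: "bbbbbbbbbb" },
  { role: "user", content: "ccc" },
];

// Default: a 13-character budget.
const byCharacters = truncateToBudget(budgetMessages, 13);

// Custom counter: every message costs 1, so the budget acts as a message count.
const byMessageCount = truncateToBudget(budgetMessages, 3, () => 1);
```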

deathbyknowledge · 2026-01-28T11:08:58Z

packages/agents/src/memory/session.ts

+ */
+export class Session {
+  readonly metadata: SessionMetadata;
+  readonly events: Event[];


Do we want to use the DO sqlite to handle storage here? Or perhaps storing events directly, kind of like an event store?

mchockal · 2026-02-03

Good call. SQLite storage would be better and fits the existing pattern. A cf_agents_session table might be a good way to store a user's session: the primary key would be a session_id, and events would be stored in order so that the last N user-agent turns of a session are easy to retrieve and can serve as the working context for the current request. What are your thoughts?

(Apologies for the late response. I missed the email notifications for this set of comments and only noticed them earlier today.)
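
The access pattern proposed here (events kept in order per session, last N turns read back) can be sketched without committing to a schema. The in-memory class below is only a stand-in for a SQLite-backed store; `InMemoryEventStore`, `StoredEvent`, and the rule that a turn starts at a user event are assumptions for illustration:

```typescript
interface StoredEvent {
  seq: number; // insertion order within the session
  role: "user" | "agent";
  text: string;
}

// In-memory stand-in for a table keyed by session id with ordered events.
class InMemoryEventStore {
  private readonly bySession = new Map<string, StoredEvent[]>();

  append(sessionId: string, role: StoredEvent["role"], text: string): StoredEvent {
    const list = this.bySession.get(sessionId) ?? [];
    const stored: StoredEvent = { seq: list.length + 1, role, text };
    list.push(stored);
    this.bySession.set(sessionId, list);
    return stored;
  }

  // Returns the events of the last `n` turns, where a turn starts at a user event.
  lastTurns(sessionId: string, n: number): StoredEvent[] {
    const list = this.bySession.get(sessionId) ?? [];
    if (n <= 0) return [];
    let seen = 0;
    for (let i = list.length - 1; i >= 0; i--) {
      if (list[i].role === "user" && ++seen === n) return list.slice(i);
    }
    return list.slice(); // fewer than n turns stored: return everything
  }
}

const demoStore = new InMemoryEventStore();
for (const turn of [1, 2, 3]) {
  demoStore.append("session-123", "user", `u${turn}`);
  demoStore.append("session-123", "agent", `a${turn}`);
}

const lastTwoTurns = demoStore.lastTurns("session-123", 2);
```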

threepointone · 2026-02-02T13:45:42Z

I'll review further on Wednesday

mchockal-cf and others added 2 commits January 20, 2026 08:40

Add memory primitives v0

4a37d1f

Merge branch 'cloudflare:main' into main

8078063
