Memory Architecture
The 4-tier memory system — session, working, long-term, and semantic memory powered by pgvector.
Memory Architecture — How Agents Remember and Learn
Memory is not recall. Memory is identity. Without persistent memory, an agent is just a very expensive function call. OpenClaw uses files on disk. Flowwink evolved this into a 4-tier PostgreSQL system with vector search.
Think about your best colleague at work. What makes them valuable isn’t just what they know today — it’s that they remember your last conversation, they know how your business operates, they recall what didn’t work last time, and they’ve built up an intuition about what you actually need. That accumulated context is what separates a trusted colleague from a new hire.
An agent without memory is the new hire every single morning. No matter how many tasks it completed yesterday, it wakes up fresh — blank slate, no context, no continuity. Every interaction starts from zero.
Memory is what makes the “digital employee” metaphor real rather than rhetorical. Without it, you don’t have an employee. You have a very capable temp worker who forgets everything at the end of each shift.
The Memory Problem
LLMs are stateless. Each API call starts fresh. Without a memory system, the agent has no continuity — it can’t remember what it did yesterday, what it learned last week, or who it’s talking to.
OpenClaw’s solution (verified from source): Workspace files (AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, USER.md, HEARTBEAT.md, MEMORY.md) are injected into every agent turn. Daily memory files (memory/*.md) are accessed on-demand via memory_search and memory_get tools. Simple, no database required, but all injected files consume tokens (capped at 20k per file, 150k total).
Flowwink’s evolution: A 4-tier PostgreSQL system with vector search, designed for multi-tenant business operations where memory must be isolated per tenant via RLS.
L1: Session Memory (ephemeral)
│ Current conversation
│ Cleared when session ends
│
L2: Working Memory (short-term)
│ Top 20-30 recent entries
│ Ordered by recency
│
L3: Long-term Memory (persistent)
│ All persisted facts and lessons
│ Full-text searchable
│
L4: Semantic Memory (vector)
Embeddings for similarity search
pgvector cosine similarity
L1: Session Memory
The current conversation history. This is what the LLM sees directly.
Storage: chat_messages table (or in-memory for edge functions)
Access: Linear scan (most recent N messages)
Lifetime: Session duration
Session memory includes:
- User messages
- Agent responses
- Tool calls and results
- System messages (skill instructions, context injection)
Pruning: When session memory approaches the context limit, older messages are summarized and replaced with a single [SUMMARY] message. Key facts are extracted and saved to L2/L3 before pruning.
L2: Working Memory
The agent’s “top of mind” — recent memories that are most likely relevant.
Storage: agent_memory table
Access: Key/category filter, ordered by updated_at DESC LIMIT 30
Lifetime: Until displaced by newer entries
Working memory is loaded into the system prompt at every reasoning cycle. It’s limited to 30 entries to prevent context bloat.
SELECT * FROM agent_memory
WHERE category IN ('preference', 'context', 'fact')
ORDER BY updated_at DESC
LIMIT 30;
L3: Long-term Memory
All persisted facts, lessons, and learned patterns.
Storage: agent_memory table (full)
Access: Full-text search via pg_trgm
Lifetime: Permanent (until explicitly deleted)
Long-term memory is searched when the agent needs specific information that’s not in working memory.
SELECT * FROM agent_memory
WHERE to_tsvector('english', content) @@ plainto_tsquery('english', 'blog engagement')
ORDER BY updated_at DESC;
L4: Semantic Memory
Vector embeddings for similarity-based retrieval.
Storage: agent_memory.embedding column (768-dimensional vector)
Access: pgvector cosine similarity
Lifetime: Permanent
Semantic memory enables the agent to find relevant memories even when the exact keywords don’t match.
SELECT *, 1 - (embedding <=> $query_embedding) as similarity
FROM agent_memory
WHERE embedding IS NOT NULL
ORDER BY embedding <=> $query_embedding
LIMIT 5;
Embedding providers (auto-fallback):
- OpenAI
text-embedding-3-small(768d) - Gemini
text-embedding-004(768d)
Memory Categories
Memories are organized by category:
| Category | Purpose | Example |
|---|---|---|
soul | Agent personality & values | { purpose: "Help businesses grow", tone: "professional but warm" } |
identity | Agent identity & boundaries | { name: "FlowPilot", role: "Digital Operator", boundaries: ["No financial decisions"] } |
preference | Learned preferences | preferred_blog_tone: "conversational" |
context | Operational context | last_campaign_date: "2026-01-15" |
fact | Learned facts & patterns | lesson:blog_engagement: { data_viz_posts: "3x more engagement" } |
agents | Operational rules | { direct_action_rules: [...], self_improvement: [...] } |
The Memory Lifecycle
Creation → Working → Long-term → Semantic → Retrieval → Influence
│ │ │ │ │ │
│ │ │ │ │ └─ Shapes future
│ │ │ │ │ decisions
│ │ │ │ └─ Found when
│ │ │ │ needed
│ │ │ └─ Embedded for
│ │ │ similarity search
│ │ └─ Persisted indefinitely
│ └─ Top of mind (loaded every cycle)
└─ Created by: reflection, learning,
user input, or agent observation
Creation
Memories are created by:
- Reflection — The agent analyzes performance and saves learnings
- Learning — The
flowpilot-learnfunction distills feedback - User input — The user tells the agent something to remember
- Observation — The agent notices a pattern during operation
Embedding
New memories are automatically embedded:
// In handleMemoryWrite()
const embedding = await generateEmbedding(content);
await supabase.from('agent_memory').upsert({
key, category, content, embedding, importance
});
Retrieval
When the agent needs information:
- Check working memory (top 30) — fast, always loaded
- If not found, search long-term memory — full-text search
- If still not found, search semantic memory — vector similarity
Workspace Files as Memory
Both OpenClaw and Flowwink use the same conceptual categories for agent configuration. The storage mechanism differs:
| Concept | OpenClaw (verified) | Flowwink (DB-driven) | Purpose |
|---|---|---|---|
| Persona | SOUL.md (file, auto-injected) | agent_memory key soul | Persona, boundaries, tone |
| Identity | IDENTITY.md (file, auto-injected) | agent_memory key identity | Agent name, role, emoji |
| Rules | AGENTS.md (file, auto-injected) | agent_memory key agents | Operating instructions, conventions |
| Heartbeat | HEARTBEAT.md (file, auto-injected) | heartbeat_protocol memory key | Protocol text (7-step loop) |
| Tool policy | TOOLS.md (file, auto-injected) | tool_policy memory key | { blocked: string[] } |
| Memory | MEMORY.md (file, auto-injected) | L2/L3/L4 tiers in agent_memory | Persistent facts and lessons |
| Daily notes | memory/*.md (on-demand via tools) | — | Day-to-day observations |
These are loaded into the system prompt at every reasoning cycle. They’re the agent’s “constitution” — the rules it lives by.
Safety: Internal system keys (tool_policy, heartbeat_state) are excluded from LLM context. This prevents infrastructure metadata from polluting the agent’s reasoning.
Memory Compression
As memory grows, compression becomes necessary:
preCompactionFlush(oldMessages, supabase)
│
├── AI extracts up to 5 discrete facts from old messages
├── Each fact saved as agent_memory entry with vector embedding
└── Old messages replaced with [SUMMARY]
pruneConversationHistory(messages, supabase)
│
├── AI generates summary of old messages
├── Replaces old messages with [SUMMARY] message
└── Keeps recent N messages intact
This ensures the agent doesn’t lose important information when conversations get long.
The Anti-Patterns
| Anti-Pattern | Problem | Solution |
|---|---|---|
| No memory | Agent starts from zero every session | Implement L2/L3 memory |
| Everything in context | Context window fills up instantly | Lazy loading + budget tiers |
| No categorization | Can’t find relevant memories | Category-based filtering |
| No embedding | Can’t find semantically similar memories | pgvector embedding |
| No compression | Context grows unbounded | Pre-compaction flush |
| No workspace files | Agent has no personality or rules | Soul/identity/agents keys |
Memory is what transforms a stateless function into a persistent entity. Without it, the agent is a goldfish. With it, the agent is a colleague.
Next: how the agent learns and improves through feedback loops. Feedback Loops →