Max Memory

SQLite Memory Layer for OpenClaw

Persistent hybrid-search memory system using SQLite + embeddings, giving a personal AI agent durable recall across sessions without relying on the context window.

01 · The Problem

AI agents restart fresh at every session. Without persistent memory, they can't remember preferences, decisions, or context from previous sessions, forcing the human to repeat themselves constantly. The context window is too precious to waste on long-term reference material, and files alone don't support semantic search.

02 · What the AI Does

The agent maintains a SQLite database at `~/.openclaw/workspace/memory/brain.db` with four memory categories: `core` (permanent), `daily` (90-day retention), `conversation` (30-day retention), and `project` (permanent). Retrieval uses hybrid search: 0.7 weight on vector similarity (OpenAI text-embedding-3-small) plus 0.3 weight on FTS5 BM25 keyword search. Memories are stored with timestamps and category tags. Cron jobs auto-capture session transcripts (every 2 hours) and enforce retention policies (daily at 3am EST). Built on SQLite with FTS5, Python, and the OpenAI embeddings API.
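The storage layout might look something like the following sketch. The `memories` table, its columns, and the FTS5 sync step are illustrative assumptions, not the actual brain.db schema:

```python
import sqlite3

# Illustrative schema sketch -- table and column names are assumptions,
# not the actual brain.db layout.
conn = sqlite3.connect(":memory:")  # the real system opens brain.db on disk
conn.executescript("""
CREATE TABLE memories (
    id INTEGER PRIMARY KEY,
    category TEXT CHECK (category IN ('core', 'daily', 'conversation', 'project')),
    content TEXT NOT NULL,
    embedding BLOB,  -- serialized text-embedding-3-small vector
    created_at TEXT DEFAULT (datetime('now'))
);
-- External-content FTS5 index over the same rows, for BM25 keyword search
CREATE VIRTUAL TABLE memories_fts USING fts5(
    content, content='memories', content_rowid='id'
);
""")
conn.execute("INSERT INTO memories (category, content) VALUES (?, ?)",
             ("core", "Prefers concise status updates"))
# Keep the keyword index in sync with the base table
conn.execute("INSERT INTO memories_fts (rowid, content) SELECT id, content FROM memories")
hits = conn.execute("SELECT content FROM memories_fts WHERE memories_fts MATCH ?",
                    ("concise",)).fetchall()
```

In the real system the FTS index would be kept in sync with triggers rather than a manual insert, but the shape is the same: one row per memory, one keyword index over it.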

03 · Design Decisions

01 · Choice

SQLite over a dedicated vector database

Why

Simplicity and portability. `brain.db` lives at a known filesystem path, can be backed up with any file tool, and requires no separate service. The overhead of a dedicated vector DB (Pinecone, Weaviate) wasn't justified for a single-user personal memory system.

Constraint

SQLite is not built for concurrent writes from multiple processes. The system assumes a single OpenClaw instance writing at a time.
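One standard mitigation for the single-writer constraint, not something the project specifies: WAL mode plus a busy timeout lets occasional overlapping writers queue briefly instead of failing immediately with "database is locked".

```python
import sqlite3

# Standard SQLite pragmas that soften the single-writer constraint.
# These are general SQLite practice, not the project's actual settings.
conn = sqlite3.connect("brain.db", timeout=5.0)  # block up to 5s on a lock
conn.execute("PRAGMA journal_mode=WAL")    # readers no longer block the writer
conn.execute("PRAGMA busy_timeout=5000")   # writers retry for up to 5s
```

This doesn't make SQLite a multi-writer database; it just makes the single-instance assumption fail gracefully when it's briefly violated.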

02 · Choice

Hybrid search (0.7 vector + 0.3 BM25)

Why

Pure vector search misses keyword-specific queries ("revenue sprint," "CM 100-day," "H1"). Pure BM25 misses semantic variation ("cold outreach" vs "outbound prospecting"). The hybrid blend handles both without separate infrastructure.

Constraint

The 0.7/0.3 ratio is a judgment call: it reflects a bias toward semantic understanding while acknowledging that keyword precision matters for named entities and technical terms.
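As a sketch, the blend is just a convex combination of two normalized scores. The `hybrid_score` helper and the [0, 1] normalization assumption are illustrative, not the actual implementation:

```python
def hybrid_score(vector_sim: float, keyword_score: float,
                 w_vec: float = 0.7, w_kw: float = 0.3) -> float:
    """Blend vector and keyword relevance for one candidate memory.

    Assumes both inputs are already normalized to [0, 1]. Note that
    FTS5's bm25() returns lower-is-better values, so the keyword side
    needs rescaling before it lands here.
    """
    return w_vec * vector_sim + w_kw * keyword_score

# A semantically close memory with no keyword hit still ranks decently,
# while an exact keyword match keeps a weaker vector match competitive.
semantic_only = hybrid_score(0.9, 0.0)
keyword_heavy = hybrid_score(0.4, 1.0)
```

Shifting the weights shifts which of those two candidates wins, which is exactly the judgment call the 0.7/0.3 ratio encodes.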

03 · Choice

Category-based retention with mandatory commits

Why

Some memory should be permanent (core preferences, decisions); some should be ephemeral (transient conversation). The category system lets the system age out noise without losing durable knowledge.

Constraint

Six mandatory commit triggers are defined in the skill: significant milestones, decisions, new preferences, and failures always get written to core memory regardless of retention policy.
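The aging-out half of the policy can be sketched as a small SQL sweep. The `memories` table and its columns are assumptions, not the real schema:

```python
import sqlite3

# Hypothetical retention sweep mirroring the stated policy: 90 days for
# `daily`, 30 for `conversation`; `core` and `project` are never aged out.
RETENTION_DAYS = {"daily": 90, "conversation": 30}

def enforce_retention(conn: sqlite3.Connection) -> int:
    """Delete expired rows; return how many were removed."""
    deleted = 0
    for category, days in RETENTION_DAYS.items():
        cur = conn.execute(
            "DELETE FROM memories WHERE category = ? "
            "AND created_at < datetime('now', ?)",
            (category, f"-{days} days"),
        )
        deleted += cur.rowcount
    conn.commit()
    return deleted
```

Because `core` and `project` never appear in the retention map, rows committed there survive every sweep, which is what makes the mandatory commit triggers safe.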

04 · Choice

Transcript auto-capture every 2 hours

Why

Session transcripts contain the richest context about what's being worked on. Rather than relying on the agent to manually write memory after every session, the cron job captures transcripts automatically.

Constraint

Transcripts are captured raw — the agent still needs to write curated memory entries. Auto-capture provides the source material; human-instructed writing produces the durable record.
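A minimal sketch of the capture job, assuming transcripts land as text files in a known directory. The directory, the `transcripts` table, and the dedup-by-filename scheme are all illustrative:

```python
"""Hypothetical auto-capture job. A crontab entry such as
`0 */2 * * * python3 capture.py` would run it every 2 hours."""
import sqlite3
from pathlib import Path

def capture(conn: sqlite3.Connection, transcript_dir: Path) -> int:
    """Copy new transcript files into the DB raw; return how many were new."""
    captured = 0
    for path in sorted(transcript_dir.glob("*.txt")):
        cur = conn.execute(
            # OR IGNORE makes re-runs idempotent: already-seen files are skipped
            "INSERT OR IGNORE INTO transcripts (name, content) VALUES (?, ?)",
            (path.name, path.read_text()),
        )
        captured += cur.rowcount  # 0 when the row already existed
    conn.commit()
    return captured
```

Idempotency matters here: a job that runs every 2 hours will see the same files repeatedly, and raw capture should never duplicate them.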

05 · Choice

OpenAI embeddings API for vector storage

Why

The same API key used for Whisper transcription also powers embeddings. Shared infrastructure, single token budget to track.

Constraint

Embeddings cost money per token. Large transcript captures can accumulate costs. The retention cleanup cron is partly there to prevent unbounded growth.
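A back-of-envelope sketch of why cleanup matters for cost. Both constants are assumptions to check against current OpenAI pricing; the ~4-characters-per-token figure is a rough heuristic for English text:

```python
# Assumed figures -- verify against OpenAI's published pricing.
PRICE_PER_MILLION_TOKENS = 0.02  # USD for text-embedding-3-small, assumed
CHARS_PER_TOKEN = 4              # rough heuristic for English text

def estimate_embedding_cost(texts: list[str]) -> float:
    """Estimate USD cost of embedding the given texts."""
    tokens = sum(len(t) for t in texts) / CHARS_PER_TOKEN
    return tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

# A 200,000-character transcript is ~50k tokens: a fraction of a cent,
# but hundreds of uncleaned captures accumulate.
per_transcript = estimate_embedding_cost(["x" * 200_000])
```

Under these assumptions any single capture is cheap; the risk is the unbounded tail, which is exactly what the retention cron trims.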

04 · Tradeoffs & Limits

- **Embedding quality degrades with very short or very long texts.** A single sentence won't have strong vectors; a 50-page transcript creates embedding noise. Chunking strategy matters and isn't yet optimized.
- **SQLite FTS5 has no native vector similarity.** The hybrid search is a workaround using SQLite's full-text search for keyword matching, not a real vector database. For a high-volume multi-user system, this would be inadequate.
- **No access control.** Anyone with access to the brain.db file can read all memories. For Brett's personal use this is fine; for shared environments it would be a problem.
- **Retention cleanup is aggressive on daily/conversation memory.** 90-day and 30-day retention means some context is intentionally lost. If a project spans 6 months and the agent only worked on it in month 1 and month 6, the month 1 memory may be gone.
- **The "core" category has no enforced size limit.** If everything gets marked core, the database grows unbounded. There's no eviction strategy for core memory.

05 · Key Insight

Personal AI memory systems need eviction policies as much as storage policies. The value of memory isn't measured by how much you store — it's measured by what you can reliably retrieve when it matters.