Memory & RAG
An agent that forgets everything between turns can't do much. Gerbil's memory layer gives it durable recall: store text with embeddings, keep it on the device across sessions, and pull back the most relevant pieces — within a token budget — to ground each response. The whole retrieval loop runs on the user's machine.
How it works
Memory is three small pieces you compose:
- An embedder turns text into vectors. Gerbil's native embedding model does this on-device.
- A store holds the text and its vectors. In the browser that's IndexedDB, so memory survives reloads.
- Recall finds the closest matches to a query and packs them into a context block that fits a token budget.
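To make the three pieces concrete, here is a minimal, self-contained sketch of the same loop. It is not Gerbil's implementation: `toyEmbed`, `toyRecall`, `estimateTokens`, and the `MemoryRecord` type are hypothetical names, the "embedding" is a bag-of-words count standing in for a real embedding model, the store is a plain array rather than IndexedDB, and the four-characters-per-token estimate is an assumption.

```typescript
// Hypothetical sketch: none of these names come from Gerbil's API.
type SparseVector = Record<string, number>;
type MemoryRecord = { text: string; vector: SparseVector };

// 1. Embedder: a toy bag-of-words count, standing in for a real embedding model.
function toyEmbed(text: string): SparseVector {
  // A prototype-free object, so words like "constructor" are counted safely.
  const vector: SparseVector = Object.create(null);
  const words = text.toLowerCase().match(/[a-z0-9]+/g) || [];
  for (const word of words) {
    vector[word] = (vector[word] || 0) + 1;
  }
  return vector;
}

function dot(a: SparseVector, b: SparseVector): number {
  let sum = 0;
  for (const key of Object.keys(a)) {
    sum += a[key] * (b[key] || 0);
  }
  return sum;
}

// Cosine similarity: 1 means same direction, 0 means no shared words.
function cosine(a: SparseVector, b: SparseVector): number {
  const denominator = Math.sqrt(dot(a, a) * dot(b, b));
  return denominator === 0 ? 0 : dot(a, b) / denominator;
}

// Rough token estimate, assuming about four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// 3. Recall: rank by similarity, then greedily pack entries that fit the budget.
function toyRecall(store: MemoryRecord[], query: string, tokenBudget: number) {
  const queryVector = toyEmbed(query);
  const ranked = store
    .map((record) => ({ record, score: cosine(queryVector, record.vector) }))
    .sort((a, b) => b.score - a.score);

  const records: MemoryRecord[] = [];
  let tokensUsed = 0;
  for (const { record } of ranked) {
    const cost = estimateTokens(record.text);
    if (tokensUsed + cost > tokenBudget) continue; // skip entries that don't fit
    records.push(record);
    tokensUsed += cost;
  }
  const context = records.map((r) => r.text).join("\n");
  return { context, records, tokensUsed };
}

// 2. Store: a plain in-memory array here; a durable store would persist these.
const store: MemoryRecord[] = [
  "The user prefers TypeScript and pnpm.",
  "Ship date moved to March.",
].map((text) => ({ text, vector: toyEmbed(text) }));

const result = toyRecall(store, "Does the user prefer TypeScript?", 12);
console.log(result.context, result.tokensUsed);
```

With a budget of 12, only the best-matching entry fits, so the unrelated one is left out of the context. A real embedder would also match paraphrases that share no words with the query, which this word-count stand-in cannot do.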
One-line memory in React
In a component, reach for useMemory from @tryhamster/gerbil/hooks. It wires the native embedder to a durable IndexedDB store for you — zero config, no model argument — so storing and recalling is just add() and recall():

    "use client";

    import { useChat, useMemory } from "@tryhamster/gerbil/hooks";

    function Assistant() {
      const memory = useMemory({ namespace: "agent-memory" });
      const { send } = useChat();

      async function remember() {
        // Remember things across the conversation, and across sessions.
        await memory.add("The user prefers TypeScript and pnpm.");
        await memory.add("They're building a browser extension, not a server app.");
      }

      async function answer(question: string) {
        // Recall relevant context, capped to a token budget, and ground the reply.
        const { context } = await memory.recall(question, { tokenBudget: 256 });
        return send(`${context}\n\n${question}`);
      }
      // ...
    }

Because the store is IndexedDB, the agent still remembers those preferences the next time the user opens your app: no account, no backend, no data leaving the device. Pass namespace to isolate memories (per user, per app area), and model to choose a different embedder.
Under the hood: createMemory
useMemory is a thin wrapper over the lower-level createMemory API. Drop to it when you want to control the embedder or store directly — the layer is engine-agnostic, so the embedder is just a function from text to vectors:

    import { useEngine } from "@tryhamster/gerbil/hooks";
    import { createMemory, createIndexedDBStore } from "@tryhamster/gerbil/memory";

    // Native embeddings + a durable IndexedDB-backed store.
    // No model argument needed: useEngine picks a default per capability.
    const embedder = useEngine({ embedding: true });
    const chat = useEngine();

    const memory = createMemory({
      embed: async (texts) => Promise.all(texts.map((t) => embedder.embed(t))),
      store: createIndexedDBStore({ dbName: "agent-memory" }),
    });

    await memory.add("The user prefers TypeScript and pnpm.");

    const { context } = await memory.recall("what should I recommend for their setup?", {
      tokenBudget: 256,
    });
    const answer = await chat.complete(
      `${context}\n\nRecommend a project setup for the user.`,
    );

Token-budgeted recall
recall() ranks stored entries by similarity to the query and packs as many as fit into tokenBudget, so the context you prepend never blows past the model's window:

    const { context, records, tokensUsed } = await memory.recall("billing question", {
      tokenBudget: 512,
    });

    context;    // a ready-to-prepend string of the most relevant memories
    records;    // the individual entries that were included, in packed order
    tokensUsed; // approximate token count of context (≤ tokenBudget)

Tagging and filtering memories
Attach metadata when you add an entry, then scope recall to a subset — per user, per topic, per source:

    await memory.add("Ship date moved to March.", {
      metadata: { kind: "fact", project: "atlas" },
    });

    // Only recall memories from this project.
    const { context } = await memory.recall("what's the latest on the timeline?", {
      tokenBudget: 256,
      filter: { project: "atlas" },
    });

Adding long documents
Pass chunk: true to split a long document into overlapping passages so each can be embedded and retrieved on its own:

    await memory.add(longArticleText, {
      chunk: true,
      metadata: { source: "handbook" },
    });

On the server
The same memory API runs in Node — swap the IndexedDB store for a file-backed one to persist memory to disk:

    import { createMemory, createFileStore } from "@tryhamster/gerbil/memory";

    const memory = createMemory({
      embed, // any (texts) => Promise<Float32Array[]>
      store: createFileStore("./.gerbil/memory.json"),
    });

Part of the agent harness
Memory is one piece of building agents that act, not just answer. It pairs with tool calling (let the model fetch data and take actions), skills (package reusable capabilities), and MCP (connect to external tool servers) — all running on-device. Together they let an agent remember context, decide what to do, and carry it out, entirely on the user's machine.