Persistence & GerbilGate
Gerbil caches model weights on-device so they download once and load instantly afterward, and the <GerbilGate> component gates your app behind that download with a splash screen while the engine warms up.
Persistence
Gerbil caches downloaded model weights in IndexedDB automatically. The engine owns this — there is nothing extra to configure. The first time you create the engine it downloads the model; every run after that loads straight from the on-device cache, with no network and no re-download.
On iOS especially, the browser can evict cached data under storage pressure. To keep your model cached across visits, request persistent storage. Once the request is granted, the browser exempts the cache from automatic eviction:
```typescript
import {
  requestPersistentStorage,
  getStorageStatus,
} from "@tryhamster/gerbil/browser";

// Ask the browser to keep cached data across visits.
// On iOS this is most reliably granted from the Home Screen (PWA).
const persisted = await requestPersistentStorage();
console.log(persisted); // true once the browser grants it

// Inspect the current storage status (persisted flag + usage/quota).
const status = await getStorageStatus();
console.log(status);
```

Before kicking off a download, you can check whether a model of a given size will fit in the remaining quota. canCacheModel(sizeMB) answers that directly, and checkStorageQuota() returns the raw usage and quota numbers so you can surface them in your UI:
```typescript
import {
  canCacheModel,
  checkStorageQuota,
} from "@tryhamster/gerbil/browser";

// Will a ~550 MB model fit in the remaining quota?
const fit = await canCacheModel(550);
if (!fit.fits) {
  // Warn the user, free up space, or pick a smaller model.
}

// Or read the raw numbers (all in MB) and decide yourself.
const { usedMB, quotaMB } = await checkStorageQuota(550);
console.log(`Using ${usedMB} of ${quotaMB} MB`);
```

All four helpers (requestPersistentStorage(), getStorageStatus(), canCacheModel(sizeMB), and checkStorageQuota()) live in @tryhamster/gerbil/browser.
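For intuition about what a fit check involves, here is a minimal sketch built on the standard StorageManager API (`estimate()`, which reports usage and quota in bytes). It is not Gerbil's implementation: `fitsInQuota`, `estimateFit`, the `FitResult` shape, and the `headroomMB` safety margin are hypothetical names introduced for illustration.

```typescript
const BYTES_PER_MB = 1024 * 1024;

// Hypothetical result shape, loosely mirroring the fields used above.
interface FitResult {
  fits: boolean;
  usedMB: number;
  quotaMB: number;
  freeMB: number;
}

// Pure arithmetic: does a model of `sizeMB` fit in what is left of the quota?
// `headroomMB` is an assumed safety margin, not something Gerbil documents.
function fitsInQuota(
  sizeMB: number,
  usageBytes: number,
  quotaBytes: number,
  headroomMB = 0,
): FitResult {
  const usedMB = usageBytes / BYTES_PER_MB;
  const quotaMB = quotaBytes / BYTES_PER_MB;
  const freeMB = Math.max(0, quotaMB - usedMB);
  return { fits: sizeMB + headroomMB <= freeMB, usedMB, quotaMB, freeMB };
}

// Minimal slice of the StorageManager interface that the sketch depends on.
interface StorageEstimator {
  estimate(): Promise<{ usage?: number; quota?: number }>;
}

// Reads the browser's estimate and applies the arithmetic above.
// Missing fields are treated as 0, so an unknown quota never reports a fit.
async function estimateFit(
  sizeMB: number,
  storage: StorageEstimator,
): Promise<FitResult> {
  const { usage = 0, quota = 0 } = await storage.estimate();
  return fitsInQuota(sizeMB, usage, quota);
}
```

In a browser you would call `estimateFit(550, navigator.storage)`. Keeping the arithmetic in a pure function makes it easy to unit-test without a browser.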
For how weights are stored and reused across runs, see the Caching guide. For the iOS and PWA specifics of persistent storage, see the Mobile & PWA guide.
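As background for the persistent-storage request described earlier in this section, the sketch below shows the underlying browser calls such a request typically relies on: the standard StorageManager methods `persisted()` and `persist()`. This is an assumption about the general technique, not a description of what requestPersistentStorage() does internally; `ensurePersisted` and `PersistableStorage` are hypothetical names.

```typescript
// Minimal slice of the StorageManager interface used by this sketch.
interface PersistableStorage {
  persisted(): Promise<boolean>;
  persist(): Promise<boolean>;
}

// Hypothetical helper: ask for persistence only when it is not already granted.
// Resolves to false when the API is unavailable or the browser declines.
async function ensurePersisted(
  storage: PersistableStorage | undefined,
): Promise<boolean> {
  if (!storage) return false; // Storage API not available in this environment
  if (await storage.persisted()) return true; // already granted, nothing to do
  return storage.persist(); // the browser decides; may resolve to false
}
```

In a browser you would call `ensurePersisted(navigator.storage)`. Checking `persisted()` first avoids a redundant request on every visit.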
GerbilGate
<GerbilGate> is a React component (added in 1.0.2, exported from @tryhamster/gerbil/hooks) that gates your app behind the model download. Wrap your app in it: while the engine downloads and initializes, it shows a splash/loading screen; once the engine is ready, it renders its children. Your UI only mounts when on-device inference is live, so no child component ever has to handle a not-yet-loaded model.
Because the gate stays mounted for the lifetime of your app, it also keeps the shared engine warm. Client-side navigation never re-uploads the weights to the GPU and never trips the engine’s ~30s teardown window — every route change keeps inference instantly available.
```tsx
import { GerbilGate } from "@tryhamster/gerbil/hooks";
import { App } from "./App";

export default function Root() {
  return (
    <GerbilGate
      model="mlx-community/Qwen3.5-0.8B-4bit"
      fallback={<div>Loading…</div>}
    >
      <App />
    </GerbilGate>
  );
}
```

The full prop surface:
```typescript
interface GerbilGateProps {
  model?: string;
  children: ReactNode;
  fallback?: GateSlot; // splash while loading
  errorFallback?: GateSlot; // shown on error
  enableVision?: boolean;
  dtype?: "auto" | "f32" | "q4";
  maxSeqLen?: number;
}

// A slot is either static UI or a render function that receives live state.
type GateSlot =
  | ReactNode
  | ((s: {
      progress: number | null; // 0–100, or null if unknown
      status: string | null; // e.g. "Downloading model…"
      error: unknown;
      errorKind: EngineErrorKind | null;
    }) => ReactNode);
```

Splash page
Pass a render function to fallback to build a real splash screen with live progress. It receives progress (0–100, or null while the total is unknown) and a human-readable status. Use errorFallback the same way to render a recovery screen keyed off errorKind:
```tsx
import { GerbilGate } from "@tryhamster/gerbil/hooks";
import { Splash } from "./Splash";
import { ErrorScreen } from "./ErrorScreen";
import { App } from "./App";

export default function Root() {
  return (
    <GerbilGate
      model="mlx-community/Qwen3.5-0.8B-4bit"
      fallback={({ progress, status }) => (
        <Splash>
          {status} {progress != null && `${progress}%`}
        </Splash>
      )}
      errorFallback={({ error, errorKind }) => (
        <ErrorScreen kind={errorKind} />
      )}
    >
      <App />
    </GerbilGate>
  );
}
```

Combine the two. GerbilGate keeps the engine warm and persistence keeps the weights cached, so the model downloads only once: the splash appears on the very first visit, and every later load is instant and works offline.
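The splash render function shown earlier interpolates progress straight into the markup. If you prefer to keep that formatting out of JSX, a small pure helper can build the label. This is a sketch, not part of Gerbil: `splashLabel` and `SplashState` are hypothetical names, and the default text, rounding, and clamping are assumptions, since the prop documentation only states that progress is 0–100 or null.

```typescript
// The subset of slot state that the splash label needs.
interface SplashState {
  progress: number | null; // 0–100, or null while the total is unknown
  status: string | null; // human-readable status, may be absent
}

// Hypothetical helper: builds one line of splash text from the live state.
function splashLabel({ progress, status }: SplashState): string {
  const text = status ?? "Loading…"; // assumed default when no status is set
  if (progress == null) return text; // unknown total: show the status only
  // Round and clamp defensively so the label never shows e.g. "41.6%" or "120%".
  const pct = Math.min(100, Math.max(0, Math.round(progress)));
  return `${text} ${pct}%`;
}
```

Inside the fallback render function this becomes `<Splash>{splashLabel({ progress, status })}</Splash>`.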