React Hooks
Client-side React patterns for text generation, chat, vision, embeddings, and speech: the server-backed useGerbil and useChat hooks, plus useEngine, useSTT, and useTTS on the native WebGPU engine.
Installation
    npm install @tryhamster/gerbil
useGerbil
The main hook for text generation and streaming:
    "use client";

    import { useGerbil } from "@tryhamster/gerbil/react";

    function MyComponent() {
      const {
        generate,  // Generate text (returns Promise)
        stream,    // Stream text (returns AsyncIterator)
        isLoading, // Loading state
        error,     // Error object if any
        reset,     // Clear the last result + error
      } = useGerbil({
        endpoint: "/api/ai", // Your API endpoint
      });

      const handleGenerate = async () => {
        const result = await generate("Write a haiku");
        console.log(result.text);
      };

      const handleStream = async () => {
        for await (const chunk of stream("Tell me a story")) {
          console.log(chunk);
        }
      };

      return (
        <div>
          <button onClick={handleGenerate} disabled={isLoading}>
            Generate
          </button>
          <button onClick={handleStream} disabled={isLoading}>
            Stream
          </button>
          <button onClick={reset}>Reset</button>
          {error && <p>Error: {error.message}</p>}
        </div>
      );
    }
useGerbil Options
    const gerbil = useGerbil({
      // API endpoint your server route is mounted at (default: "/api/gerbil")
      endpoint: "/api/ai",
      // Connect on mount instead of on first call
      autoConnect: true,

      // GerbilConfig defaults forwarded to the server
      model: "mlx-community/Qwen3.5-0.8B-4bit",
      cache: { enabled: true, ttl: 3600 },
    });

    // Per-call options (maxTokens, temperature, system, thinking, onToken)
    // are passed to generate()/stream(), not to useGerbil().
generate()
Generate text and wait for the complete response:
    const { generate } = useGerbil({ endpoint: "/api/ai" });

    // Basic usage
    const result = await generate("Hello!");
    console.log(result.text);

    // With options (a second name avoids redeclaring `result`)
    const detailed = await generate("Explain React", {
      maxTokens: 500,
      temperature: 0.8,
      system: "You are a helpful teacher.",
    });

    // Result shape
    interface GenerateResult {
      text: string;
      thinking?: string; // If thinking mode enabled
      tokensGenerated: number;
      tokensPerSecond: number;
      totalTime: number;
    }
stream()
Stream text token by token:
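The consumption loop can be tried without a server. The sketch below is a minimal, self-contained illustration, assuming only that the stream is an async iterable of string chunks. `fakeStream` and `collectStream` are hypothetical helpers written for this example and are not part of Gerbil.

```typescript
// Hypothetical stand-in for stream(): yields chunks one at a time.
async function* fakeStream(chunks: string[]): AsyncGenerator<string> {
  for (const chunk of chunks) {
    yield chunk;
  }
}

// Consume any async iterable of text chunks and return the full text.
// onChunk receives the text accumulated so far (for example, a state setter).
async function collectStream(
  source: AsyncIterable<string>,
  onChunk?: (textSoFar: string) => void,
): Promise<string> {
  let text = "";
  for await (const chunk of source) {
    text += chunk;
    onChunk?.(text);
  }
  return text;
}
```

With the hook, the same loop runs over `stream(prompt)` inside an event handler, as in the example that follows.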
    const { stream } = useGerbil({ endpoint: "/api/ai" });

    // Basic streaming
    const [text, setText] = useState("");

    const handleStream = async () => {
      setText("");
      for await (const chunk of stream("Tell me a story")) {
        setText((prev) => prev + chunk);
      }
    };

    // With options
    for await (const chunk of stream("Explain hooks", {
      maxTokens: 500,
      onToken: (token) => console.log(token),
    })) {
      // Process each chunk
    }
useChat
Full-featured chat hook with message history:
    "use client";

    import { useChat } from "@tryhamster/gerbil/react";

    function ChatUI() {
      const {
        messages,     // Array of messages
        input,        // Current input value
        setInput,     // Set input value
        handleSubmit, // Submit handler for forms
        isLoading,    // Loading state
        error,        // Error object
        reset,        // Clear the conversation
      } = useChat({
        endpoint: "/api/chat",
        system: "You are a helpful assistant.",
      });

      return (
        <div>
          {/* Message list */}
          <div>
            {messages.map((m, i) => (
              <div key={i}>
                <strong>{m.role}:</strong> {m.content}
              </div>
            ))}
          </div>

          {/* Input form */}
          <form onSubmit={handleSubmit}>
            <input
              value={input}
              onChange={(e) => setInput(e.target.value)}
              placeholder="Type a message..."
            />
            <button type="submit" disabled={isLoading}>
              Send
            </button>
          </form>
          {error && <p>Error: {error.message}</p>}
        </div>
      );
    }
useChat Options
    const chat = useChat({
      // API endpoint (default: "/api/gerbil")
      endpoint: "/api/chat",

      // System prompt prepended to the conversation
      system: "You are a helpful assistant.",

      // Seed the conversation
      initialMessages: [
        { role: "assistant", content: "Hello! How can I help?" },
      ],

      // GerbilConfig defaults forwarded to the server
      model: "mlx-community/Qwen3.5-0.8B-4bit",
    });
Message Type
    interface Message {
      role: "user" | "assistant";
      content: string;
    }
Thinking Mode
Qwen3 models reason before answering. generate() returns the chain-of-thought separately on result.thinking — pass thinking: true per call and display it:
    import { useState } from "react";
    import { useGerbil } from "@tryhamster/gerbil/react";

    function ReasoningView() {
      const { generate, isLoading } = useGerbil({ endpoint: "/api/ai" });
      const [result, setResult] = useState<{ text: string; thinking?: string }>();

      const ask = async (prompt: string) => {
        setResult(await generate(prompt, { thinking: true }));
      };

      return (
        <div>
          <button onClick={() => ask("What is 127 × 43?")} disabled={isLoading}>
            Ask
          </button>
          {result?.thinking && (
            <div className="text-gray-500 italic text-sm mb-2">
              <strong>Thinking:</strong> {result.thinking}
            </div>
          )}
          {result && <div>{result.text}</div>}
        </div>
      );
    }
Streaming UI Pattern
    import { useState } from "react";
    import { useGerbil } from "@tryhamster/gerbil/react";

    function StreamingChat() {
      const [output, setOutput] = useState("");
      const { stream, isLoading } = useGerbil({
        endpoint: "/api/ai",
      });

      // Call this from a form or button handler with the user's prompt
      const handleSubmit = async (prompt: string) => {
        setOutput("");
        for await (const chunk of stream(prompt)) {
          setOutput((prev) => prev + chunk);
        }
      };

      return (
        <div className="whitespace-pre-wrap">
          {output}
          {isLoading && <span className="animate-pulse">▌</span>}
        </div>
      );
    }
JSON Generation
Structured output is generated with the top-level json() helper. Run it in a server route and call it from the client — the schema-validated object comes back as JSON:
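On the client, `res.json()` is untyped, so it can help to narrow the response before using it. The sketch below is a dependency-free illustration. The `Person` interface and `isPerson` guard are hypothetical names for this example; they mirror the three fields of the `PersonSchema` in the route that follows.

```typescript
// Shape mirrored from the PersonSchema fields (name, age, city).
interface Person {
  name: string;
  age: number;
  city: string;
}

// Runtime type guard: true only when all three fields exist with the right types.
function isPerson(value: unknown): value is Person {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.name === "string" &&
    typeof v.age === "number" &&
    typeof v.city === "string"
  );
}

// Example: narrow an untyped payload before reading its fields.
const payload: unknown = JSON.parse('{"name":"John","age":32,"city":"NYC"}');
if (isPerson(payload)) {
  console.log(`${payload.name} is ${payload.age}`);
}
```

The server remains the source of truth for validation. A guard like this only protects client code from an unexpected response shape, such as an error body.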
    // app/api/extract/route.ts (server)
    import { json } from "@tryhamster/gerbil";
    import { z } from "zod";

    const PersonSchema = z.object({
      name: z.string(),
      age: z.number(),
      city: z.string(),
    });

    export async function POST(req: Request) {
      const { text } = await req.json();
      const person = await json(text, { schema: PersonSchema });
      return Response.json(person);
    }

    // Client component
    function ExtractForm() {
      const handleExtract = async (text: string) => {
        const res = await fetch("/api/extract", {
          method: "POST",
          body: JSON.stringify({ text }),
        });
        const data = await res.json();
        console.log(data); // { name: "John", age: 32, city: "NYC" }
      };

      return (
        <button onClick={() => handleExtract("John is 32 from NYC")}>
          Extract
        </button>
      );
    }
Error Handling
    function ChatWithErrors() {
      const { messages, error, handleSubmit, reset } = useChat({
        endpoint: "/api/chat",
      });

      if (error) {
        return (
          <div className="p-4 bg-red-100 text-red-800 rounded">
            <p>Something went wrong: {error.message}</p>
            <button onClick={reset}>Start Over</button>
          </div>
        );
      }

      return <div>{/* ... chat UI */}</div>;
    }
Persistence
Save and restore chat history. The on-device useChat from @tryhamster/gerbil/hooks exposes setMessages for hydration:
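Stored history is untrusted input: it may be missing, corrupted, or written by an older version of the app. As a hedged sketch, the hypothetical helper `parseStoredMessages` below (not part of Gerbil) parses a raw `localStorage` value and keeps only well-formed entries. It assumes the on-device hook uses the same `Message` shape shown under Message Type.

```typescript
interface Message {
  role: "user" | "assistant";
  content: string;
}

// Parse a raw localStorage value into messages, dropping anything malformed.
// Returns an empty array for null, invalid JSON, or a non-array payload.
function parseStoredMessages(raw: string | null): Message[] {
  if (raw === null) return [];
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return [];
  }
  if (!Array.isArray(parsed)) return [];
  return parsed.filter(
    (m): m is Message =>
      typeof m === "object" &&
      m !== null &&
      (m.role === "user" || m.role === "assistant") &&
      typeof m.content === "string",
  );
}
```

In the component that follows, `setMessages(parseStoredMessages(saved))` could stand in for the bare `JSON.parse` call.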
    import { useEffect, useState } from "react";
    import { useChat } from "@tryhamster/gerbil/hooks";

    function PersistentChat() {
      const { messages, setMessages } = useChat();
      // Guards the save effect so the initial empty list never overwrites saved history
      const [restored, setRestored] = useState(false);

      // Restore on mount
      useEffect(() => {
        const saved = localStorage.getItem("chat-history");
        if (saved) {
          setMessages(JSON.parse(saved));
        }
        setRestored(true);
      }, [setMessages]);

      // Save to localStorage once the restore has run
      useEffect(() => {
        if (!restored) return;
        localStorage.setItem("chat-history", JSON.stringify(messages));
      }, [messages, restored]);

      return <div>{/* ... chat UI */}</div>;
    }
With React Context
    // GerbilProvider.tsx
    import { createContext, useContext, type ReactNode } from "react";
    import { useGerbil } from "@tryhamster/gerbil/react";

    const GerbilContext = createContext<ReturnType<typeof useGerbil> | null>(null);

    export function GerbilProvider({ children }: { children: ReactNode }) {
      const gerbil = useGerbil({
        endpoint: "/api/ai",
      });

      return (
        <GerbilContext.Provider value={gerbil}>
          {children}
        </GerbilContext.Provider>
      );
    }

    export function useGerbilContext() {
      const context = useContext(GerbilContext);
      if (!context) {
        throw new Error("useGerbilContext must be used within GerbilProvider");
      }
      return context;
    }

    // Usage
    function MyComponent() {
      const { generate, isLoading } = useGerbilContext();
      // ...
    }
Native In-Browser Inference: useEngine
For fully local inference, use useEngine from @tryhamster/gerbil/hooks. It drives Gerbil's native WebGPU engine — a lean bundle that runs on any device with WebGPU (Chrome/Edge 113+, Firefox 141+, desktop Safari 18+, iPad/iPhone on iOS/iPadOS 26+). The hook lazy-loads the model, streams tokens into completion, and shares one engine per model across your app. Call it with no arguments for a default, or pass any repo:
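Since availability depends on the browser, a feature check before rendering can help. The sketch below relies only on the standard WebGPU entry point, `navigator.gpu`. `supportsWebGPU` is a hypothetical helper name and not part of Gerbil. Note that the presence of `navigator.gpu` does not guarantee a usable adapter, since `requestAdapter()` can still resolve to `null`.

```typescript
// True when the given navigator-like object exposes the WebGPU entry point.
// Defaults to the global navigator when one exists (it may not during SSR).
function supportsWebGPU(
  nav: unknown = (globalThis as { navigator?: unknown }).navigator,
): boolean {
  return typeof nav === "object" && nav !== null && "gpu" in nav;
}
```

Rendering a fallback when this returns `false` keeps a component like the one that follows from trying to load a model on an unsupported browser.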
    "use client";

    import { useState } from "react";
    import { useEngine } from "@tryhamster/gerbil/hooks";

    function BrowserAI() {
      // No model → a device-aware default. Or: useEngine({ model: "..." }).
      const { complete, completion, isGenerating, isLoading, tps } = useEngine();

      return (
        <div>
          <button
            onClick={() => complete("Explain React in one sentence")}
            disabled={isGenerating || isLoading}
          >
            {isLoading ? "Loading model…" : "Generate"}
          </button>
          <div>{completion}</div>
          {tps > 0 && <span>{tps.toFixed(1)} tok/s</span>}
        </div>
      );
    }
Embeddings
Pass embedding: true and call embed(). It returns a unit-L2-normalized vector, so cosine similarity is just a dot product. EmbeddingGemma is asymmetric — use taskType: "query" for searches and taskType: "document" for the corpus:
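Because the vectors are unit length, ranking a corpus against a query reduces to sorting by dot product. The framework-free sketch below illustrates that with toy 3-dimensional vectors in place of real embeddings. `normalize` and `rankBySimilarity` are hypothetical helpers for this example and are not part of Gerbil.

```typescript
// Dot product of two equal-length vectors; equals cosine similarity
// when both inputs are L2-normalized.
function dot(a: Float32Array, b: Float32Array): number {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += a[i] * b[i];
  return s;
}

// Scale a vector to unit length (used here only to build toy inputs).
function normalize(v: number[]): Float32Array {
  const norm = Math.hypot(...v);
  return Float32Array.from(v, (x) => x / norm);
}

interface ScoredDoc {
  id: string;
  score: number;
}

// Rank documents by similarity to the query, best match first.
function rankBySimilarity(
  query: Float32Array,
  docs: { id: string; vector: Float32Array }[],
): ScoredDoc[] {
  return docs
    .map((d) => ({ id: d.id, score: dot(query, d.vector) }))
    .sort((x, y) => y.score - x.score);
}

// Toy vectors standing in for real embeddings.
const queryVec = normalize([1, 0, 0]);
const ranked = rankBySimilarity(queryVec, [
  { id: "far", vector: normalize([0, 1, 0]) },
  { id: "near", vector: normalize([0.9, 0.1, 0]) },
]);
console.log(ranked.map((d) => d.id));
```

The component that follows applies the same dot product to a single query and document pair returned by `embed()`.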
    "use client";

    import { useState } from "react";
    import { useEngine } from "@tryhamster/gerbil/hooks";

    function SemanticSearch() {
      const { embed } = useEngine({ embedding: true }); // default embedding model
      const [score, setScore] = useState<number | null>(null);

      function dot(a: Float32Array, b: Float32Array) {
        let s = 0;
        for (let i = 0; i < a.length; i++) s += a[i] * b[i];
        return s; // vectors are L2-normalized, so dot == cosine
      }

      async function compare(query: string, doc: string) {
        const q = await embed(query, { taskType: "query" });
        const d = await embed(doc, { taskType: "document" });
        setScore(dot(q, d));
      }

      return (
        <div>
          <button
            onClick={() =>
              compare(
                "how do I run a model offline?",
                "Gerbil caches models in IndexedDB and works with no network.",
              )
            }
          >
            Compare
          </button>
          {score !== null && <p>Similarity: {score.toFixed(3)}</p>}
        </div>
      );
    }
Vision
Pass enableVision: true and call describeImage with an image URL, File, or decoded RGB pixels — the hook decodes for you:
    "use client";

    import { useState } from "react";
    import { useEngine } from "@tryhamster/gerbil/hooks";

    function VisionDemo() {
      const { describeImage, completion, isGenerating } = useEngine({ enableVision: true });
      const [image, setImage] = useState<string | null>(null);

      return (
        <div>
          <input
            type="file"
            accept="image/*"
            onChange={(e) => {
              const file = e.target.files?.[0];
              if (file) {
                const reader = new FileReader();
                reader.onload = () => setImage(reader.result as string);
                reader.readAsDataURL(file);
              }
            }}
          />
          {image && <img src={image} alt="Preview" className="max-w-xs" />}
          <button
            onClick={() => image && describeImage(image, "What's in this image?")}
            disabled={!image || isGenerating}
          >
            Analyze Image
          </button>
          {completion && <p>{completion}</p>}
        </div>
      );
    }
Speech-to-Text: useSTT
useSTT wraps native Moonshine ASR on the WebGPU engine. It captures the mic between startRecording() and stopRecording(), resamples to 16 kHz mono for you, and surfaces the transcript. Defaults to Moonshine — no model argument needed:
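The hook handles resampling internally, so none of this is needed to use it. For intuition about what "16 kHz mono" involves, the sketch below downmixes interleaved stereo and resamples by linear interpolation. It is an illustration only and not Gerbil's implementation; `downmixToMono` and `resampleLinear` are hypothetical names.

```typescript
// Average interleaved stereo samples [L0, R0, L1, R1, ...] into one mono channel.
function downmixToMono(interleaved: Float32Array): Float32Array {
  const mono = new Float32Array(Math.floor(interleaved.length / 2));
  for (let i = 0; i < mono.length; i++) {
    mono[i] = (interleaved[2 * i] + interleaved[2 * i + 1]) / 2;
  }
  return mono;
}

// Resample by linear interpolation between neighboring input samples.
// No low-pass filter is applied, so downsampling real audio this way can alias.
function resampleLinear(
  input: Float32Array,
  fromRate: number,
  toRate: number,
): Float32Array {
  const out = new Float32Array(Math.round((input.length * toRate) / fromRate));
  for (let i = 0; i < out.length; i++) {
    const pos = (i * fromRate) / toRate; // fractional position in the input
    const i0 = Math.floor(pos);
    const i1 = Math.min(i0 + 1, input.length - 1);
    const frac = pos - i0;
    out[i] = input[i0] * (1 - frac) + input[i1] * frac;
  }
  return out;
}
```

One second of 48 kHz audio (48,000 samples) becomes 16,000 samples after `resampleLinear(samples, 48000, 16000)`.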
    "use client";

    import { useSTT } from "@tryhamster/gerbil/hooks";

    function VoiceInput() {
      const { startRecording, stopRecording, isRecording, isTranscribing, transcript } = useSTT();

      return (
        <div>
          <button
            onClick={() => (isRecording ? stopRecording() : startRecording())}
            disabled={isTranscribing}
          >
            {isRecording ? "Stop" : "Record"}
          </button>
          {isTranscribing && <span>Transcribing…</span>}
          {transcript && <p>You said: {transcript}</p>}
        </div>
      );
    }
Text-to-Speech: useTTS
useTTS wraps native Kani-TTS-2 on the WebGPU engine. Call speak(text) — it synthesizes and plays the audio, and keeps the clip around for replay(). Defaults to the Kani model:
    "use client";

    import { useTTS } from "@tryhamster/gerbil/hooks";

    function Speaker() {
      const { speak, replay, isSynthesizing, isPlaying, hasAudio } = useTTS();

      return (
        <div>
          <button
            onClick={() => speak("Hello from on-device text-to-speech.")}
            disabled={isSynthesizing || isPlaying}
          >
            {isSynthesizing ? "Synthesizing…" : "Speak"}
          </button>
          {hasAudio && <button onClick={replay} disabled={isPlaying}>Replay</button>}
        </div>
      );
    }