Vercel AI SDK

Use Gerbil as a drop-in provider for the Vercel AI SDK. It works with generateText, streamText, and generateObject. Speech synthesis and transcription are handled by the native WebGPU engine rather than this provider (see Speech & Audio).

Recommended: The AI SDK integration is the best way to use Gerbil in production apps. It provides streaming, structured output, and tool calling out of the box.

Node-side integration. The @tryhamster/gerbil/ai provider runs on the server-side Gerbil class. For in-browser, GPU-accelerated inference — including audio — use the native WebGPU engine instead.

Installation

Terminal
npm install @tryhamster/gerbil ai

Quick Start

basic.ts
import { generateText } from "ai";
import { gerbil } from "@tryhamster/gerbil/ai";

const { text } = await generateText({
  model: gerbil("qwen3.5-0.8b"),
  prompt: "Write a haiku about TypeScript",
});

console.log(text);

Streaming Responses

Stream text token by token for real-time UIs:

streaming.ts
import { streamText } from "ai";
import { gerbil } from "@tryhamster/gerbil/ai";

const stream = streamText({
  model: gerbil("qwen3.5-0.8b"),
  prompt: "Explain quantum computing",
});

// Print each chunk as soon as it arrives
for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);
}

// Token usage is a promise that resolves once the stream has finished
const usage = await stream.usage;
console.log("\nTokens:", usage.totalTokens);
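
For UIs that need both incremental updates and the final text, the chunks can be accumulated while they are rendered. The sketch below assumes only that the stream is an AsyncIterable<string>, which is what the for await loop above relies on. collectStream, onChunk, and fakeStream are hypothetical names; fakeStream stands in for stream.textStream so the snippet runs without a model.

```typescript
// Hypothetical helper: forwards each chunk to a callback and returns the full text.
async function collectStream(
  stream: AsyncIterable<string>,
  onChunk: (chunk: string) => void = () => {},
): Promise<string> {
  let full = "";
  for await (const chunk of stream) {
    onChunk(chunk); // e.g. append the chunk to the UI
    full += chunk;
  }
  return full;
}

// Stand-in for stream.textStream (an assumption, used only for this demo).
async function* fakeStream(): AsyncGenerator<string> {
  yield "Quantum ";
  yield "computing ";
  yield "uses qubits.";
}
```

With the real provider, the same helper could presumably be called as collectStream(stream.textStream, (c) => process.stdout.write(c)).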

Thinking Mode

Enable chain-of-thought reasoning with Qwen3 models:

thinking.ts
import { generateText } from "ai";
import { gerbil } from "@tryhamster/gerbil/ai";

const { text, providerMetadata } = await generateText({
  model: gerbil("qwen3.5-0.8b", { thinking: true }),
  prompt: "What is 127 × 43? Show your work.",
});

// Access the thinking process
const thinking = providerMetadata?.gerbil?.thinking;
console.log("Reasoning:", thinking);
console.log("Answer:", text);
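
Qwen3-family models emit their reasoning inside <think>...</think> tags ahead of the answer. How Gerbil separates the two internally is not documented here, so the following is only a sketch of the idea. splitThinking and SplitResult are hypothetical names, not part of the Gerbil API.

```typescript
interface SplitResult {
  thinking: string | null; // null when the output has no <think> block
  answer: string;
}

// Hypothetical helper: separates the first <think>...</think> block from the rest.
function splitThinking(raw: string): SplitResult {
  const match = raw.match(/<think>([\s\S]*?)<\/think>/);
  if (!match) {
    return { thinking: null, answer: raw.trim() };
  }
  const answer = raw.replace(match[0], "").trim();
  return { thinking: (match[1] ?? "").trim(), answer };
}
```

When using the provider, reading the thinking field from the provider metadata as shown above should make a helper like this unnecessary.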

Structured Output

Generate type-safe structured data with Zod schemas:

structured.ts
// Requires zod as a separate dependency: npm install zod
import { generateObject } from "ai";
import { gerbil } from "@tryhamster/gerbil/ai";
import { z } from "zod";

const { object } = await generateObject({
  model: gerbil("qwen3.5-0.8b"),
  schema: z.object({
    name: z.string(),
    age: z.number(),
    city: z.string(),
    interests: z.array(z.string()),
  }),
  prompt: "Extract: John is 32, lives in NYC, loves hiking and photography",
});

console.log(object);
// Example output: { name: "John", age: 32, city: "NYC", interests: ["hiking", "photography"] }
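
Small local models sometimes produce output that does not match the schema, in which case generateObject rejects instead of returning an object. A retry wrapper is one way to handle that. The sketch below is generic and self-contained; withRetry is a hypothetical helper and the default of three attempts is arbitrary.

```typescript
// Hypothetical helper: retries an async operation a fixed number of times.
async function withRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  if (attempts < 1) {
    throw new RangeError("attempts must be at least 1");
  }
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  // Every attempt failed: surface the most recent error.
  throw lastError;
}
```

A call would then look something like withRetry(() => generateObject({ ... })), with the same arguments as in the example above.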

Tool Calling

Let the model call functions to accomplish tasks:

tools.ts
import { generateText, tool } from "ai";
import { gerbil } from "@tryhamster/gerbil/ai";
import { z } from "zod";

const { text, toolCalls } = await generateText({
  model: gerbil("qwen3.5-0.8b"),
  prompt: "What's the weather in San Francisco?",
  tools: {
    getWeather: tool({
      description: "Get weather for a city",
      // AI SDK v5 names the tool's argument schema inputSchema
      inputSchema: z.object({
        city: z.string(),
      }),
      execute: async ({ city }) => {
        return `Weather in ${city}: 72°F, sunny`;
      },
    }),
  },
});

console.log("Tool calls:", toolCalls);
console.log("Response:", text);
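
Conceptually, tool calling is a lookup-and-execute step: the model emits a tool name plus JSON arguments, and the runtime runs the matching function. The AI SDK handles this when a tool defines execute, so the sketch below only illustrates the mechanism. ToolCall, ToolRegistry, and runToolCalls are hypothetical names, not AI SDK or Gerbil types.

```typescript
type ToolFn = (args: Record<string, unknown>) => Promise<string>;
type ToolRegistry = Record<string, ToolFn>;

interface ToolCall {
  toolName: string;
  args: Record<string, unknown>;
}

// Hypothetical dispatcher: runs each requested tool and collects the results in order.
async function runToolCalls(calls: ToolCall[], registry: ToolRegistry): Promise<string[]> {
  const results: string[] = [];
  for (const call of calls) {
    const fn = registry[call.toolName];
    if (!fn) {
      // The model asked for a tool that was never registered.
      results.push(`Unknown tool: ${call.toolName}`);
      continue;
    }
    results.push(await fn(call.args));
  }
  return results;
}

const registry: ToolRegistry = {
  getWeather: async (args) => `Weather in ${String(args["city"])}: 72°F, sunny`,
};
```

Reporting unknown tools as a result string rather than throwing is a design choice in this sketch; it lets the remaining calls still run.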

System Prompts

system.ts
import { generateText } from "ai";
import { gerbil } from "@tryhamster/gerbil/ai";

const { text } = await generateText({
  model: gerbil("qwen3.5-0.8b"),
  system: "You are a helpful coding assistant. Be concise.",
  prompt: "How do I reverse a string in JavaScript?",
});

Multi-turn Conversations

messages.ts
import { generateText } from "ai";
import { gerbil } from "@tryhamster/gerbil/ai";

const { text } = await generateText({
  model: gerbil("qwen3.5-0.8b"),
  messages: [
    { role: "user", content: "My name is Alice" },
    { role: "assistant", content: "Hello Alice! Nice to meet you." },
    { role: "user", content: "What's my name?" },
  ],
});

console.log(text); // e.g. "Your name is Alice!"
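
Because the whole messages array is sent on every call, a long chat keeps growing until it no longer fits the model's context window. One common approach is to keep any system messages and only the most recent turns. The sketch below is a minimal illustration; ChatMessage and trimHistory are hypothetical names, and the right limit depends on the model.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Hypothetical helper: keeps all system messages plus the last maxMessages other messages.
function trimHistory(messages: ChatMessage[], maxMessages: number): ChatMessage[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  // slice(-0) would return everything, so handle non-positive limits explicitly.
  const recent = maxMessages > 0 ? rest.slice(-maxMessages) : [];
  return [...system, ...recent];
}
```

The trimmed array can then be passed as messages in place of the full history.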

Speech & Audio

Speech runs on the native WebGPU engine, not through the AI SDK provider. Text-to-speech uses Kani-TTS-2 via engine.speak(), and speech-to-text uses Moonshine via MoonshineSTT — both running on-device on WebGPU. See the Text-to-Speech and Speech-to-Text docs.

Custom Provider Configuration

Create a custom provider with specific settings:

custom-provider.ts
import { createGerbil } from "@tryhamster/gerbil/ai";
import { generateText } from "ai";

// Create a custom provider
const local = createGerbil({
  device: "gpu",        // "auto" | "gpu" | "cpu"
  dtype: "q4",          // "q4" | "q8" | "fp16" | "fp32"
  cacheDir: "./models", // Custom cache directory
});

// Text generation
const { text } = await generateText({
  model: local("qwen3.5-0.8b", { thinking: true }),
  prompt: "Write a poem",
});
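
Settings like these are often driven by environment variables so that the same code runs on a GPU workstation and on a CPU-only server. The sketch below shows one way to do that. GERBIL_DEVICE, GERBIL_DTYPE, and GERBIL_CACHE_DIR are hypothetical variable names chosen for this example, not variables Gerbil reads itself, and the fallback values are arbitrary.

```typescript
type Device = "auto" | "gpu" | "cpu";
type Dtype = "q4" | "q8" | "fp16" | "fp32";

interface GerbilSettings {
  device: Device;
  dtype: Dtype;
  cacheDir: string;
}

const DEVICES: readonly Device[] = ["auto", "gpu", "cpu"];
const DTYPES: readonly Dtype[] = ["q4", "q8", "fp16", "fp32"];

// Hypothetical helper: reads settings from an env-like object, ignoring invalid values.
function settingsFromEnv(env: Record<string, string | undefined>): GerbilSettings {
  const device = DEVICES.find((d) => d === env["GERBIL_DEVICE"]) ?? "auto";
  const dtype = DTYPES.find((d) => d === env["GERBIL_DTYPE"]) ?? "q4";
  const cacheDir = env["GERBIL_CACHE_DIR"] ?? "./models";
  return { device, dtype, cacheDir };
}
```

The result has the same shape as the object passed to createGerbil above, so it could be used as createGerbil(settingsFromEnv(process.env)).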

Model Preloading & Caching

Download models ahead of time so users don't wait on first use. The provider maintains a model cache — preloaded models are automatically reused across all subsequent calls.

preload.ts
import { gerbil } from "@tryhamster/gerbil/ai";
import { generateText } from "ai";

// Preload once at app startup
await gerbil.preload("qwen3.5-0.8b", {
  keepLoaded: true, // Keep in memory
  onProgress: (p) => console.log(p.status, p.progress),
});

// All subsequent calls reuse the cached model instance, so there is no load delay
const { text } = await generateText({
  model: gerbil("qwen3.5-0.8b"),
  prompt: "Hello!",
});

// This also uses the same cached instance
const { text: text2 } = await generateText({
  model: gerbil("qwen3.5-0.8b"),
  prompt: "Another prompt",
});

keepLoaded Option

Value | Behavior | Load Time
false | Download → dispose → loads from disk later | ~1-2s
true ⭐ | Download → cache instance → reuse everywhere | 0ms

keep-loaded.ts
import { gerbil } from "@tryhamster/gerbil/ai";

// Download only (default): frees RAM, loads from disk later (~1-2s)
await gerbil.preload("qwen3.5-0.8b");

// Keep in memory: instant inference, model cached for reuse
await gerbil.preload("qwen3.5-0.8b", { keepLoaded: true });
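
The difference between the two modes can be pictured as a map from model ID to a loaded instance. The sketch below is a conceptual model only, not Gerbil's implementation. ModelCache, FakeModel, and the loader function are all hypothetical.

```typescript
interface FakeModel {
  id: string;
}

type Loader = (id: string) => Promise<FakeModel>;

// Conceptual model of keepLoaded: retained instances are reused, others are reloaded.
class ModelCache {
  private readonly instances = new Map<string, Promise<FakeModel>>();
  private readonly loader: Loader;

  constructor(loader: Loader) {
    this.loader = loader;
  }

  // With keepLoaded the instance stays in the map; otherwise it is dropped after loading.
  async preload(id: string, keepLoaded = false): Promise<void> {
    const model = this.loader(id);
    if (keepLoaded) {
      this.instances.set(id, model);
    }
    await model;
  }

  // Returns the retained instance if there is one, otherwise loads again.
  get(id: string): Promise<FakeModel> {
    return this.instances.get(id) ?? this.loader(id);
  }
}
```

In this picture, the "0ms" row of the table corresponds to get finding an entry in the map, and the "~1-2s" row to calling the loader again.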

Cache Checking

Check if a model is already cached before downloading:

cache-check.ts
import { gerbil } from "@tryhamster/gerbil/ai";

// Check the disk cache and download only if the model is missing
const isCached = await gerbil.isCached("qwen3.5-0.8b");
if (!isCached) {
  await gerbil.preload("qwen3.5-0.8b", { keepLoaded: true });
}

Preload methods on the provider:

Method | Description
gerbil.preload(modelId, opts?) | Preload model (with optional caching)
gerbil.isCached(modelId) | Check if model is in disk cache

Model Options

Option | Type | Default | Description
thinking | boolean | false | Enable chain-of-thought reasoning
maxTokens | number | 256 | Maximum tokens to generate
temperature | number | 0.7 | Sampling temperature (0-2)
topP | number | 0.9 | Nucleus sampling threshold
topK | number | 50 | Top-k sampling
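
To make the defaults concrete, the sketch below merges partial options with the values from the table. ModelOptions and resolveOptions are hypothetical names. The range check is an assumption based on the "0-2" note for temperature; whether Gerbil validates this itself is not stated here.

```typescript
interface ModelOptions {
  thinking: boolean;
  maxTokens: number;
  temperature: number;
  topP: number;
  topK: number;
}

// Defaults copied from the table above.
const DEFAULTS: ModelOptions = {
  thinking: false,
  maxTokens: 256,
  temperature: 0.7,
  topP: 0.9,
  topK: 50,
};

// Hypothetical helper: fills in missing options and rejects an out-of-range temperature.
function resolveOptions(opts: Partial<ModelOptions> = {}): ModelOptions {
  const merged: ModelOptions = { ...DEFAULTS, ...opts };
  if (merged.temperature < 0 || merged.temperature > 2) {
    throw new RangeError("temperature must be between 0 and 2");
  }
  return merged;
}
```

For example, resolveOptions({ thinking: true }) keeps every default except thinking.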

Full Next.js Example

Complete example with API route and React component:

app/api/chat/route.ts
import { convertToModelMessages, streamText, type UIMessage } from "ai";
import { gerbil } from "@tryhamster/gerbil/ai";

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json();

  const result = streamText({
    model: gerbil("qwen3.5-0.8b"),
    // useChat sends UI messages; convert them to model messages first
    messages: convertToModelMessages(messages),
  });

  return result.toUIMessageStreamResponse();
}

app/page.tsx
"use client";
import { useState } from "react";
import { useChat } from "@ai-sdk/react"; // npm install @ai-sdk/react

export default function Chat() {
  const [input, setInput] = useState("");
  // useChat posts to /api/chat by default
  const { messages, sendMessage, status } = useChat();
  const isLoading = status === "submitted" || status === "streaming";

  return (
    <div className="flex flex-col h-screen p-4">
      <div className="flex-1 overflow-auto space-y-4">
        {messages.map((m) => (
          <div key={m.id} className={m.role === "user" ? "text-right" : ""}>
            <span className="font-bold">{m.role}:</span>{" "}
            {m.parts.map((part, i) =>
              part.type === "text" ? <span key={i}>{part.text}</span> : null
            )}
          </div>
        ))}
      </div>
      <form
        onSubmit={(e) => {
          e.preventDefault();
          if (!input.trim()) return;
          sendMessage({ text: input });
          setInput("");
        }}
        className="flex gap-2 mt-4"
      >
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Say something..."
          className="flex-1 p-2 border rounded"
        />
        <button type="submit" disabled={isLoading}>
          {isLoading ? "..." : "Send"}
        </button>
      </form>
    </div>
  );
}

Works with AI SDK Tools

Gerbil works seamlessly with ai-sdk-tools.dev for multi-agent orchestration, state management, and artifact streaming:

agent.ts
import { createAgent } from "ai-sdk-tools";
import { gerbil } from "@tryhamster/gerbil/ai";

const agent = createAgent({
  model: gerbil("qwen3.5-0.8b"),
  tools: {
    // Your tools here
  },
});

const result = await agent.run("Help me plan a trip to Japan");

Specification

Gerbil implements the following AI SDK v5 interfaces:

Interface | Purpose | Method
LanguageModelV2 | Text generation | gerbil(modelId)
EmbeddingModelV2 | Embeddings | gerbil.embedding()
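
Embedding vectors are usually compared with cosine similarity. The helper below is a self-contained sketch of that calculation and is not taken from Gerbil; it assumes the vectors come back as plain number arrays. The ai package also exports its own cosineSimilarity function.

```typescript
// Cosine similarity of two equal-length vectors: 1 means same direction, 0 means orthogonal.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // A zero vector has no direction; treat its similarity as 0.
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```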