Vercel AI SDK

Name: Gerbil
Author: Gerbil

Use Gerbil as a drop-in provider for the Vercel AI SDK. It works with generateText, streamText, and generateObject. Speech synthesis and transcription are not served through this provider; they run on the native WebGPU engine (see Speech & Audio).

Recommended: The AI SDK integration is the best way to use Gerbil in production apps. It provides streaming, structured output, and tool calling out of the box.

Node-side integration. The @tryhamster/gerbil/ai provider runs on the server-side Gerbil class. For in-browser, GPU-accelerated inference — including audio — use the native WebGPU engine instead.

Installation

Terminal

npm install @tryhamster/gerbil ai

Quick Start

basic.ts

import { generateText } from "ai";
import { gerbil } from "@tryhamster/gerbil/ai";

const { text } = await generateText({
  model: gerbil("qwen3.5-0.8b"),
  prompt: "Write a haiku about TypeScript",
});

console.log(text);

Streaming Responses

Stream text token by token for real-time UIs:

streaming.ts

import { streamText } from "ai";
import { gerbil } from "@tryhamster/gerbil/ai";

const stream = streamText({
  model: gerbil("qwen3.5-0.8b"),
  prompt: "Explain quantum computing",
});

for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);
}

// Usage is exposed as a promise that resolves once the stream has finished
const usage = await stream.usage;
console.log("\nTokens:", usage.totalTokens);
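The loop above treats `textStream` as an async iterable of string chunks. As a minimal sketch, the helper below forwards each chunk to a callback (for example a UI update) and also returns the accumulated text. `fakeTextStream` and `collectText` are hypothetical names used for illustration; `fakeTextStream` stands in for `stream.textStream` so the snippet runs without a model.

```typescript
// Hypothetical stand-in for `stream.textStream`: yields string chunks one at a time.
async function* fakeTextStream(chunks: string[]): AsyncGenerator<string> {
  for (const chunk of chunks) {
    yield chunk;
  }
}

// Forward each chunk to a callback and return the full text once the stream ends.
async function collectText(
  textStream: AsyncIterable<string>,
  onChunk: (chunk: string) => void = () => {},
): Promise<string> {
  let full = "";
  for await (const chunk of textStream) {
    onChunk(chunk);
    full += chunk;
  }
  return full;
}
```

In a real app you would pass `stream.textStream` as the first argument in place of the fake stream.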

Thinking Mode

Enable chain-of-thought reasoning with Qwen3 models:

thinking.ts

import { generateText } from "ai";
import { gerbil } from "@tryhamster/gerbil/ai";

// providerMetadata replaces the deprecated experimental_providerMetadata field
const { text, providerMetadata } = await generateText({
  model: gerbil("qwen3.5-0.8b", { thinking: true }),
  prompt: "What is 127 × 43? Show your work.",
});

// Access the thinking process
const thinking = providerMetadata?.gerbil?.thinking;
console.log("Reasoning:", thinking);
console.log("Answer:", text);
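Qwen3-family models typically emit their reasoning inside `<think>…</think>` tags ahead of the final answer. The sketch below shows one way such raw output could be separated into reasoning and answer. It is an illustration only, not Gerbil's implementation; `ThinkingSplit` and `splitThinking` are hypothetical names.

```typescript
interface ThinkingSplit {
  thinking: string | null; // reasoning text, or null if no <think> block was found
  answer: string;          // everything outside the <think> block
}

// Hypothetical helper: split raw model output into reasoning and answer.
function splitThinking(raw: string): ThinkingSplit {
  const match = raw.match(/<think>([\s\S]*?)<\/think>/);
  if (!match) {
    return { thinking: null, answer: raw.trim() };
  }
  // Remove the first <think> block and keep the rest as the answer.
  const answer = raw.replace(match[0], "").trim();
  return { thinking: match[1].trim(), answer };
}
```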

Structured Output

Generate type-safe structured data with Zod schemas:

structured.ts

import { generateObject } from "ai";
import { gerbil } from "@tryhamster/gerbil/ai";
import { z } from "zod";

const { object } = await generateObject({
  model: gerbil("qwen3.5-0.8b"),
  schema: z.object({
    name: z.string(),
    age: z.number(),
    city: z.string(),
    interests: z.array(z.string()),
  }),
  prompt: "Extract: John is 32, lives in NYC, loves hiking and photography",
});

console.log(object);
// Example output:
// { name: "John", age: 32, city: "NYC", interests: ["hiking", "photography"] }

Tool Calling

Let the model call functions to accomplish tasks:

tools.ts

import { generateText, tool } from "ai";
import { gerbil } from "@tryhamster/gerbil/ai";
import { z } from "zod";

const { text, toolCalls } = await generateText({
  model: gerbil("qwen3.5-0.8b"),
  prompt: "What's the weather in San Francisco?",
  tools: {
    getWeather: tool({
      description: "Get weather for a city",
      // AI SDK v4 uses `parameters`; v5 renames this field to `inputSchema`
      parameters: z.object({
        city: z.string(),
      }),
      execute: async ({ city }) => {
        return `Weather in ${city}: 72°F, sunny`;
      },
    }),
  },
});

console.log("Tool calls:", toolCalls);
console.log("Response:", text);
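Conceptually, tool calling is a lookup-and-execute step: the model emits a tool name plus JSON arguments, and the runtime runs the matching `execute` function. The sketch below shows that step without the AI SDK. `ToolCall`, `ToolExecutor`, `toolRegistry`, and `runToolCall` are hypothetical names, and the manual argument check stands in for the schema validation the SDK performs.

```typescript
// Hypothetical shapes for illustration; the AI SDK's own types are richer.
interface ToolCall {
  toolName: string;
  args: Record<string, unknown>;
}

type ToolExecutor = (args: Record<string, unknown>) => Promise<string>;

const toolRegistry: Record<string, ToolExecutor> = {
  // Mirrors the getWeather tool above, with a manual check in place of Zod.
  getWeather: async (args) => {
    const city = args.city;
    if (typeof city !== "string") {
      throw new Error("getWeather expects a string `city`");
    }
    return `Weather in ${city}: 72°F, sunny`;
  },
};

// Look up the requested tool and run it, rejecting unknown tool names.
async function runToolCall(call: ToolCall): Promise<string> {
  if (!Object.prototype.hasOwnProperty.call(toolRegistry, call.toolName)) {
    throw new Error(`Unknown tool: ${call.toolName}`);
  }
  return toolRegistry[call.toolName](call.args);
}
```

Rejecting unknown names explicitly matters with small models, which are more likely to request a tool that was never registered.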

System Prompts

system.ts

import { generateText } from "ai";
import { gerbil } from "@tryhamster/gerbil/ai";

const { text } = await generateText({
  model: gerbil("qwen3.5-0.8b"),
  system: "You are a helpful coding assistant. Be concise.",
  prompt: "How do I reverse a string in JavaScript?",
});

Multi-turn Conversations

messages.ts

import { generateText } from "ai";
import { gerbil } from "@tryhamster/gerbil/ai";

const { text } = await generateText({
  model: gerbil("qwen3.5-0.8b"),
  messages: [
    { role: "user", content: "My name is Alice" },
    { role: "assistant", content: "Hello Alice! Nice to meet you." },
    { role: "user", content: "What's my name?" },
  ],
});

console.log(text); // e.g. "Your name is Alice!"
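The `messages` array is the entire conversation state: the model only sees what the caller passes in on each request. A minimal sketch of maintaining that array across turns, with a cap on its length so the prompt stays within a small model's context window. `ChatMessage`, `appendMessage`, and the default cap of 20 are assumptions for illustration, not part of Gerbil or the AI SDK.

```typescript
type Role = "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Return a new array with the message appended, keeping only the last `maxMessages`.
function appendMessage(
  turns: ChatMessage[],
  message: ChatMessage,
  maxMessages = 20,
): ChatMessage[] {
  const next = [...turns, message];
  return next.length > maxMessages ? next.slice(next.length - maxMessages) : next;
}
```

Trimming is lossy: once the "My name is Alice" turn falls outside the window, the model can no longer answer "What's my name?".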

Speech & Audio

Speech runs on the native WebGPU engine, not through the AI SDK provider. Text-to-speech uses Kani-TTS-2 via engine.speak(), and speech-to-text uses Moonshine via MoonshineSTT — both running on-device on WebGPU. See the Text-to-Speech and Speech-to-Text docs.

Custom Provider Configuration

Create a custom provider with specific settings:

custom-provider.ts

import { createGerbil } from "@tryhamster/gerbil/ai";
import { generateText } from "ai";

// Create a custom provider
const local = createGerbil({
  device: "gpu",        // "auto" | "gpu" | "cpu"
  dtype: "q4",          // "q4" | "q8" | "fp16" | "fp32"
  cacheDir: "./models", // Custom cache directory
});

// Text generation
const { text } = await generateText({
  model: local("qwen3.5-0.8b", { thinking: true }),
  prompt: "Write a poem",
});
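The `dtype` setting trades precision for memory. As a rough, back-of-the-envelope sketch, weight memory can be estimated as parameter count times bits per weight. This assumes the model has about 0.8 billion parameters (inferred from its name) and that q4 and q8 use roughly 4 and 8 bits per weight; it ignores quantization scales, the KV cache, and activations. `estimateWeightGB` is a hypothetical helper, not a Gerbil API.

```typescript
type Dtype = "q4" | "q8" | "fp16" | "fp32";

// Nominal bits per weight; real quantized formats add some overhead for scales.
const BITS_PER_WEIGHT: Record<Dtype, number> = { q4: 4, q8: 8, fp16: 16, fp32: 32 };

// Approximate weight memory in gigabytes (1 GB = 1e9 bytes).
function estimateWeightGB(paramCount: number, dtype: Dtype): number {
  return (paramCount * BITS_PER_WEIGHT[dtype]) / 8 / 1e9;
}
```

Under these assumptions, a 0.8B-parameter model needs roughly 0.4 GB at q4 and roughly 3.2 GB at fp32.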

Model Preloading & Caching

Download models ahead of time so users don't wait on first use. The provider maintains a model cache — preloaded models are automatically reused across all subsequent calls.

preload.ts

import { gerbil } from "@tryhamster/gerbil/ai";
import { generateText } from "ai";

// Preload once at app startup
await gerbil.preload("qwen3.5-0.8b", {
  keepLoaded: true,  // Keep in memory
  onProgress: (p) => console.log(p.status, p.progress),
});

// All subsequent calls reuse the cached model instance - instant!
const { text } = await generateText({
  model: gerbil("qwen3.5-0.8b"),
  prompt: "Hello!",
});

// This also uses the same cached instance
const { text: text2 } = await generateText({
  model: gerbil("qwen3.5-0.8b"),
  prompt: "Another prompt",
});

keepLoaded Option

Value	Behavior	Load Time
false (default)	Download → dispose → loads from disk later	~1-2s
true ⭐	Download → cache instance → reuse everywhere	0ms

keep-loaded.ts

// Download only (default) - frees RAM, loads from disk later (~1-2s)
await gerbil.preload("qwen3.5-0.8b");

// Keep in memory - instant inference, model cached for reuse
await gerbil.preload("qwen3.5-0.8b", { keepLoaded: true });

Cache Checking

Check if a model is already cached before downloading:

cache-check.ts

// Check disk cache
const isCached = await gerbil.isCached("qwen3.5-0.8b");

if (!isCached) {
  await gerbil.preload("qwen3.5-0.8b", { keepLoaded: true });
}

Preload methods on the provider:

Method	Description
gerbil.preload(modelId, opts?)	Preload model (with optional caching)
gerbil.isCached(modelId)	Check if model is in disk cache
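To illustrate the idea of an in-memory model cache that is reused across calls, here is a minimal sketch of a loader wrapper that loads each model ID at most once and lets concurrent callers share the same in-flight load. This is a generic pattern, not Gerbil's actual implementation; `Loader` and `createModelCache` are hypothetical names.

```typescript
// Hypothetical loader type: resolves to some loaded-model handle.
type Loader<T> = (modelId: string) => Promise<T>;

// Wrap a loader so each model ID is loaded at most once.
// Concurrent callers receive the same promise instead of starting a second load.
function createModelCache<T>(load: Loader<T>) {
  const entries = new Map<string, Promise<T>>();
  return {
    get(modelId: string): Promise<T> {
      let entry = entries.get(modelId);
      if (!entry) {
        entry = load(modelId).catch((err) => {
          entries.delete(modelId); // allow a retry after a failed load
          throw err;
        });
        entries.set(modelId, entry);
      }
      return entry;
    },
    has(modelId: string): boolean {
      return entries.has(modelId);
    },
  };
}
```

Caching the promise rather than the resolved value is the design choice that prevents duplicate loads when two requests arrive before the first load finishes.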

Model Options

Option	Type	Default	Description
thinking	boolean	false	Enable chain-of-thought reasoning
maxTokens	number	256	Maximum tokens to generate
temperature	number	0.7	Sampling temperature (0-2)
topP	number	0.9	Nucleus sampling threshold
topK	number	50	Top-k sampling
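The temperature, topK, and topP options all reshape the next-token probability distribution before a token is sampled. The sketch below shows the textbook definitions: a temperature-scaled softmax, followed by a top-k cut and a nucleus (top-p) cut with renormalization. It is a generic illustration that assumes temperature > 0, not Gerbil's sampling code, and the order in which filters are applied varies between implementations. `softmax` and `filterTokens` are hypothetical names.

```typescript
// Softmax with temperature: lower values sharpen the distribution, higher values flatten it.
function softmax(logits: number[], temperature: number): number[] {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Keep at most `topK` of the most likely tokens, stop early once their cumulative
// probability reaches `topP`, zero out the rest, and renormalize what remains.
function filterTokens(probs: number[], topK: number, topP: number): number[] {
  const order = probs.map((p, i) => ({ p, i })).sort((a, b) => b.p - a.p);
  const kept: { p: number; i: number }[] = [];
  let cumulative = 0;
  for (const entry of order.slice(0, topK)) {
    kept.push(entry);
    cumulative += entry.p;
    if (cumulative >= topP) break;
  }
  const total = kept.reduce((a, e) => a + e.p, 0);
  const out = probs.map(() => 0);
  for (const e of kept) out[e.i] = e.p / total;
  return out;
}
```

With the defaults from the table, a call would look like `filterTokens(softmax(logits, 0.7), 50, 0.9)`, after which a token is drawn from the resulting distribution.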

Full Next.js Example

Complete example with API route and React component:

app/api/chat/route.ts

// app/api/chat/route.ts
import { streamText } from "ai";
import { gerbil } from "@tryhamster/gerbil/ai";

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: gerbil("qwen3.5-0.8b"),
    messages,
  });

  // toDataStreamResponse() is the AI SDK v4 name; v5 uses toUIMessageStreamResponse()
  return result.toDataStreamResponse();
}

app/page.tsx

// app/page.tsx
"use client";
// Written against the AI SDK v4 useChat API. In v5 the hook is imported from
// "@ai-sdk/react" and input state is managed by the caller (sendMessage replaces handleSubmit).
import { useChat } from "ai/react";

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit, isLoading } = useChat();

  return (
    <div className="flex flex-col h-screen p-4">
      <div className="flex-1 overflow-auto space-y-4">
        {messages.map((m) => (
          <div key={m.id} className={m.role === "user" ? "text-right" : ""}>
            <span className="font-bold">{m.role}:</span> {m.content}
          </div>
        ))}
      </div>
      <form onSubmit={handleSubmit} className="flex gap-2 mt-4">
        <input
          value={input}
          onChange={handleInputChange}
          placeholder="Say something..."
          className="flex-1 p-2 border rounded"
        />
        <button type="submit" disabled={isLoading}>
          {isLoading ? "..." : "Send"}
        </button>
      </form>
    </div>
  );
}

Works with AI SDK Tools

Gerbil works seamlessly with ai-sdk-tools.dev for multi-agent orchestration, state management, and artifact streaming:

agent.ts

import { createAgent } from "ai-sdk-tools";
import { gerbil } from "@tryhamster/gerbil/ai";

const agent = createAgent({
  model: gerbil("qwen3.5-0.8b"),
  tools: {
    // Your tools here
  },
});

const result = await agent.run("Help me plan a trip to Japan");

Specification

Gerbil implements the following AI SDK v5 interfaces:

Interface	Purpose	Method
LanguageModelV2	Text generation	gerbil(modelId)
EmbeddingModelV2	Embeddings	gerbil.embedding()