API

One endpoint. OpenAI-compatible.

Stream completions from Llana 3.2 with the OpenAI client you already have. Bring your own SDK; we'll play along.

§ I — Quickstart
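// Uses the official OpenAI Node SDK (npm install openai).
// Top-level await requires an ES module context.
import OpenAI from "openai";

// Point the stock client at the Kapllan base URL and use a Kapllan key.
const client = new OpenAI({
  baseURL: "https://api.kapllan.ai/v1",
  apiKey: process.env.KAPLLAN_API_KEY,
});

// With stream: true, create() resolves to an async iterable of chunks.
const stream = await client.chat.completions.create({
  model: "llana-3.2",
  stream: true,
  messages: [{ role: "user", content: "Why does ice float?" }],
});

// Print each text delta as it arrives; chunks without text contribute "".
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}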
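
If you want the whole reply as a single string instead of writing it to stdout, the loop above can be wrapped in a small helper. This is a minimal sketch: `ChunkLike` and `collectText` are hypothetical names, and the type covers only the fields the Quickstart loop reads, not the SDK's full chunk type.

```typescript
// Minimal structural type for the parts of a streamed chunk used here.
// It mirrors the fields the Quickstart loop reads; it is not the SDK's type.
type ChunkLike = {
  choices: Array<{ delta?: { content?: string | null } }>;
};

// Hypothetical helper: concatenates the text deltas of a stream into one string.
async function collectText(stream: AsyncIterable<ChunkLike>): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    // Chunks with no choices or no text delta contribute nothing.
    text += chunk.choices[0]?.delta?.content ?? "";
  }
  return text;
}
```

Because the helper accepts any async iterable with that shape, the `stream` returned in the Quickstart should be usable as `await collectText(stream)`, and the helper can be exercised offline with a plain async generator.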
§ II — Endpoints
POST  /v1/chat/completions  OpenAI-compatible chat. Streaming optional.
POST  /v1/completions       Legacy text completions. Use chat unless you have a reason.
POST  /v1/embeddings        Multimodal embeddings via Llana-VL.
GET   /v1/models            List available model IDs and context windows.
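
If you are not using an SDK, the four routes above can be called with any HTTP client. The sketch below only builds the request; it does not send it. `buildRequest`, `RequestSpec`, and `ENDPOINTS` are hypothetical names, and the `Authorization: Bearer` header is an assumption based on the API being OpenAI-compatible rather than something stated on this page.

```typescript
// Base URL taken from the Quickstart.
const BASE_URL = "https://api.kapllan.ai/v1";

// The four routes from the endpoint list, with their HTTP methods.
const ENDPOINTS = {
  chat: { method: "POST", path: "/chat/completions" },
  completions: { method: "POST", path: "/completions" },
  embeddings: { method: "POST", path: "/embeddings" },
  models: { method: "GET", path: "/models" },
} as const;

type EndpointName = keyof typeof ENDPOINTS;

// Plain description of a request, independent of any HTTP library.
type RequestSpec = {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
  body?: string;
};

// Hypothetical helper: builds the URL, method, headers, and JSON body.
// Assumption: the API accepts "Authorization: Bearer <key>", as OpenAI's does.
function buildRequest(name: EndpointName, apiKey: string, body?: unknown): RequestSpec {
  const { method, path } = ENDPOINTS[name];
  const spec: RequestSpec = {
    url: `${BASE_URL}${path}`,
    method,
    headers: { Authorization: `Bearer ${apiKey}` },
  };
  if (method === "POST") {
    // Only POST routes carry a JSON payload.
    spec.headers["Content-Type"] = "application/json";
    spec.body = JSON.stringify(body ?? {});
  }
  return spec;
}
```

A spec built this way can be handed to `fetch(spec.url, { method: spec.method, headers: spec.headers, body: spec.body })`. Keeping construction separate from sending makes the request easy to inspect or log before any network call happens.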
§ III — Models
llana-3.2 (128K context)

Flagship reasoning model. Default choice.

llana-3.2-fast (32K context)

Distilled, lower latency, ~80% of full quality.

llana-coder (128K context)

Code-specialized variant. SWE-bench tuned.

llana-vl (32K context)

Vision-language. Charts, diagrams, scans.
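
The context sizes above matter mostly when deciding whether a prompt fits a given model. Here is a rough sketch of that check. `CONTEXT_WINDOW` and `fitsContext` are hypothetical names. It makes two assumptions: that "K" means 1,000 tokens (the service may count 1,024), and that the window covers prompt and output tokens combined.

```typescript
// Context windows from the model list.
// Assumption: "128K" is read as 128,000 tokens and "32K" as 32,000.
const CONTEXT_WINDOW: Record<string, number> = {
  "llana-3.2": 128000,
  "llana-3.2-fast": 32000,
  "llana-coder": 128000,
  "llana-vl": 32000,
};

// Hypothetical helper: true if the prompt plus the reserved output budget
// fits within the model's window.
// Assumption: prompt and output tokens share a single window.
function fitsContext(model: string, promptTokens: number, maxOutputTokens = 0): boolean {
  const limit = CONTEXT_WINDOW[model];
  if (limit === undefined) {
    throw new Error(`Unknown model: ${model}`);
  }
  return promptTokens + maxOutputTokens <= limit;
}
```

Under these assumptions, a 40,000-token prompt does not fit llana-3.2-fast but fits llana-3.2 with room to spare. For authoritative limits, query GET /v1/models instead of hard-coding the table.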

§ IV — Pricing
Model            Input / 1M tokens   Output / 1M tokens
llana-3.2        $3.00               $15.00
llana-3.2-fast   $0.50               $2.00
llana-coder      $3.00               $15.00
llana-vl         $2.00               $8.00
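
To turn the table into a per-request estimate, multiply each token count by its rate and divide by one million. The sketch below assumes "1M" means one million tokens and that input and output are billed independently at the listed rates. `PRICING` and `estimateCostUSD` are hypothetical names, not part of the API.

```typescript
// USD per 1M tokens, copied from the pricing table.
// Assumption: "1M" means one million tokens.
const PRICING = {
  "llana-3.2": { input: 3.0, output: 15.0 },
  "llana-3.2-fast": { input: 0.5, output: 2.0 },
  "llana-coder": { input: 3.0, output: 15.0 },
  "llana-vl": { input: 2.0, output: 8.0 },
} as const;

type PricedModel = keyof typeof PRICING;

// Hypothetical helper: estimated cost in USD for one request.
function estimateCostUSD(model: PricedModel, inputTokens: number, outputTokens: number): number {
  const { input, output } = PRICING[model];
  // Sum the weighted token counts first, then scale down by 1M.
  return (inputTokens * input + outputTokens * output) / 1000000;
}
```

For example, a request with 2,000 input tokens and 500 output tokens on llana-3.2 works out to (2,000 × 3.00 + 500 × 15.00) / 1,000,000 = $0.0135. The same request on llana-3.2-fast comes to (2,000 × 0.50 + 500 × 2.00) / 1,000,000 = $0.002.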