Long chains without drift.
Llana sustains multi-step arguments over thousands of tokens, re-reading its own premises when it needs to.
Llana is Kapllan's flagship reasoning model — built for long-horizon problems where the shape of a good answer is not obvious. It reads carefully, shows its working, and prefers being right to being fast.
128K context with structural awareness — call graphs, test intent, the difference between a bug and a choice.
Calibrated uncertainty — Llana will decline, hedge, or ask a clarifying question before it invents an answer.
A tool-use interface that treats every action as revocable — Llana narrates its intent before it takes one.
Charts, diagrams, scanned pages, handwritten notes — Llana reads images with the same care it brings to text.
Every refusal comes with a justification you can argue with — not a flat wall. Transparency is a design goal, not a patch.
| Benchmark | What it measures | Llana 3.2 | Prior SOTA |
|---|---|---|---|
| MMLU-Pro | Multi-discipline reasoning | 84.1 | 81.3 |
| GPQA-Diamond | Graduate science Q&A | 71.8 | 68.0 |
| SWE-bench Verified | Real-world coding tasks | 62.4 | 58.9 |
| HumanEval | Code synthesis | 94.7 | 94.2 |
| MATH-500 | Competition mathematics | 88.5 | 85.1 |
| AIME 2025 | Olympiad-level problems | 54.2 | 52.0 |
"We do not want a model that speaks confidently about everything. We want one that knows the shape of its own ignorance."— From the Llana 3 technical report
We release models when their behavior is understood, not when a demo looks clean. We would rather publish a late, calibrated model than an early, charismatic one.
Every capability claim is tied to a public evaluation, a dataset, or a paper. If we cannot describe how we measured it, we do not ship it.
Research that doesn't reduce to a screenshot is still research. A good question is a legitimate deliverable. We pay for depth.
Free during public beta. API access for researchers and developers. Enterprise pilots on request.