Cascadia×Otari
Otari × Cascadia · gateway + on-prem mesh

Real agentic workflows.
Metered by Otari. Every token on-prem.

Four regulated-industry agents running live through Otari — Mozilla.ai’s self-hostable, OpenAI-compatible gateway — onto a Cascadia mesh: open-weight 8B-class models spread across a room of Intel AI PCs, one pipeline-parallel across two machines. Otari authenticates a virtual key, enforces a budget, and meters every call; each step still shows the serving node, latency, and signed receipt.

Healthcare
Clinical referral triage
extract → triage criteria → urgency → ICD-10 coding assist → schedule → SBAR + letters → safety gate
Why on-prem: PHI never leaves the premises
Run the demo
Finance
KYC onboarding + AML screening
extract → watchlist screen → adjudicate hits → adverse media → risk rules → MLRO memo → policy gate
Why on-prem: BSA/AML · SAR-adjacent confidentiality
Run the demo
Finance
Financial model builder
extract assumptions → DCF + scenarios + Monte Carlo (deterministic) → IC valuation memo → QA gate
Why on-prem: Deal data stays on the premises
Run the demo
Government
FOIA redaction
intake → responsiveness → PII detection → exemption review → redact + verify (zero-leak) → response letter → gate
Why on-prem: IRS Pub 1075 — FTI on-prem is the compliant default
Run the demo
Otariusage & budget/v1/usage · /v1/budgets

Pulled live from the Otari gateway’s own ledger — every call below was authenticated against a virtual key, priced, and metered by the gateway before the tokens were served on-prem by Cascadia.

loading usage…
The fleet
qwen3-8bsingle nodeextraction · classification · adjudication · QA gates
llama-8b-2stagepipeline-parallel × 2 AI PCslong-form synthesis, streamed live off the chain
phi-3.5-minisingle nodeJSON repair rung · gate fallback