Cost forensics for AI coding assistants — open source, privacy-first, multi-provider. Built for the only honest opening in a crowded market: not "how much did we spend?", but "why did this cost so much?"
Teams run Claude Code, Codex, and Cursor all day. A single action can cost surprisingly much. Invoices show totals. Assistants show raw usage. Neither explains where the money went.
Cost per feature, workflow, customer, developer.
Cache writes, re-read files, retry loops, sub-agent fan-out.
Subscription value vs API-equivalent. Are we over-paying or maxing out?
On a real local dataset (44 Claude Code sessions, 34,872 events), the cost a naive tool reports is wrong by an order of magnitude.
In one real call: 6 input + 119 output tokens looks like ~$0.003. The same call carried 22,830 cache-read + 28,134 cache-write tokens — the actual cost was $0.296. ~99% lives in cache.
| Built for | Answers | Open / self-host | Sees content | Cache-accurate | |
|---|---|---|---|---|---|
| Langfuse / Helicone | production apps | what happened in the call | yes | via trace / proxy | partial |
| CloudZero / Finout / AI Vyuh | enterprise FinOps | what the org spent | no | invoices / keys | invoice-level |
| OpenAI / Anthropic dashboards | per-account billing | tokens & total | no | no | partial |
| Nomira | coding assistants | WHY this cost so much | yes | never | core |
We don't claim the market is empty. It isn't. Our wedge is the specific intersection — forensics + coding assistants + privacy + accuracy — that nobody else stands in.
Not "how much this month" — why this turn, conversation, or developer cost what it did. Plus waste signals: retry loops, repeated reads, cache rebuilt instead of reused, sub-agent fan-out.
Cache-aware, multi-provider (Anthropic, OpenAI/Codex, Gemini), versioned rates with live OpenRouter feed, reconcilable against the real provider invoice. Unknown models are flagged — never priced from a guess.
Token counts and business tags only — never prompt or response content. Schema-enforced. Self-hosted. No proxy in your request path. Closed FinOps SaaS can't honestly say this.
Read logs already on disk. Nothing leaves your machine.
# analyze your newest Claude Code session pip install nomira nomira nomira --all nomira --compare --by-project
One Docker host. Everyone else just --ship.
# admin (once) docker compose up -d # each developer nomira --ship --remote https://.../ingest \ --token $NOMIRA_TOKEN \ --developer alice
Cost-tag every LLM call from your product code.
import nomira nomira.track(model="claude-opus-4-7", usage=resp.usage, feature="doc-summary", user_tier="free")
The events table has no columns for prompt, response, content, text, or messages — by schema. The collector rejects any event that tries to carry them. Verified by test.
butt-dial — $5,016 API-equivalent across 16,543 calls. 49% of total AI value. Drill-down finds the money:
head/grep. ~$199.This is "why did this cost so much" — not "what was the total." It's the difference between a bill and a verdict.
Fixed monthly price + token limits + top-ups. The dollar shown is the API-equivalent value of your usage — not a bill.
"Am I getting my plan's worth?" / "How much allowance did I burn?"
The dollar shown is the bill. The true source is the provider's usage/cost API.
Reconciliation is the auditor's final answer.
Read existing logs on disk (Claude Code, Codex). Zero egress, zero install for teammates. The individual wedge.
Lightweight SDK / shipper sends counts + tags only to a self-hosted endpoint. Off the critical path. The team product.
For teams that prefer routing over instrumentation. Only ever self-hosted, never cloud middleman — content stays in your infra.
feature / workflow / tier / customer / environment dimensions, if widely adopted, are the actual "standard" — more than the code.Charts are copyable in a sprint. Being right about cost — for the coding assistants nobody else instruments — without ever touching content, isn't.
pip install nomiraDiscipline: no paid feature is built before ten teams have asked for it. The open core stays genuinely useful.
CLI · Team dashboard (self-hosted) · Optimizer drill-down per project · Developer leaderboard & drill-down · Reconciliation (CSV + Cost API) · Time-window controls · Dynamic pricing (OpenRouter + override + fallback) · SDK · Dockerized · Auth (bearer + Basic) · 18 docs rendered to HTML.
Three to five design partners. Real Claude Code / Codex teams who'll run it for a month, give honest feedback, and tell us whether "why did this cost so much" is the question they've been waiting to ask.
Open source · privacy-first · multi-provider · cache-aware. Built for the question your invoice can't answer.
Built honestly — no "empty market" claims. The opening we stand in is real, specific, and defensible.