Nomira is open-source cost forensics for AI coding assistants. It reads the logs you already have and explains where the money went — cache-aware, multi-provider, and without your prompts ever leaving your machine.
Most tools price only visible input/output. Real spend lives in cache reads, cache writes (5-min vs 1-hour), and reasoning tokens — priced differently by every provider. Getting this right is the whole point. We call it the auditor.
Not "how much this month" — why this turn, conversation, or developer cost what it did. Plus waste signals: retry loops, repeated reads, cache rebuilt instead of reused.
Cache-aware, multi-provider (Anthropic, OpenAI/Codex, Gemini), versioned rates, reconcilable against your invoice. Unknown models are flagged, never guessed.
Token counts and business tags only — never prompt or response content. Schema-enforced. Self-hosted. No proxy in your request path.
python nomira.py reads a local Claude Code or Codex session and shows top-cost turns, the cache gap, and waste.
--compare --by-project ranks efficiency: $/call, $/1k output, cache-reuse %. Why does A burn 5× B?
--ship then --serve for a self-hosted dashboard. Usage-only events; content stays home.
Python 3.9+, standard library only. No account, no install, no data egress.
| Langfuse / Helicone | CloudZero / Finout | Nomira | |
|---|---|---|---|
| Built for | production apps | enterprise FinOps | coding assistants |
| Answers | what happened in the call | what the org spent | why THIS cost so much |
| Open / self-host | yes | no | yes |
| Sees your content | via proxy/trace | via invoices | never |
| Cache-accurate | partial | invoice-level | core |
We don't claim the market is empty — it isn't. Our wedge is forensics + coding assistants + privacy + accuracy.