NOMIRA

Know what your AI actually costs.

Cost forensics for AI coding assistants — open source, privacy-first, multi-provider. Built for the only honest opening in a crowded market: not "how much did we spend?", but "why did this cost so much?"

See the problem Live demo Deploy GitHub

01 · the problem

Two developers on the same tools burn 5× differently — and nobody can explain why.

Teams run Claude Code, Codex, and Cursor all day. A single action can cost surprisingly much. Invoices show totals. Assistants show raw usage. Neither explains where the money went.

"We pay $200 a month per developer for Claude Code, plus our Anthropic API bill. We have no idea which features, agents, or developers drive that cost — or whether we're even getting our plan's worth."

Where did the money go?

Cost per feature, workflow, customer, developer.

Why did this turn cost so much?

Cache writes, re-read files, retry loops, sub-agent fan-out.

Am I getting my plan's worth?

Subscription value vs API-equivalent. Are we over-paying or maxing out?

02 · the truth

The cost isn't where you think.

On a real local dataset (44 Claude Code sessions, 34,872 events), the cost a naive tool reports is wrong by an order of magnitude.

True cost (cache-aware)

$10,318

across 44 local sessions

What a naive tool reports

~$970

visible input + output only

Of cost is hidden in cache

91%

cache-read + cache-write tokens

In one real call: 6 input + 119 output tokens looks like ~$0.003. The same call carried 22,830 cache-read + 28,134 cache-write tokens — the actual cost was $0.296. ~99% lives in cache.

03 · the market

"Cost aggregation" is taken. Forensics for coding assistants isn't.

	Built for	Answers	Open / self-host	Sees content	Cache-accurate
Langfuse / Helicone	production apps	what happened in the call	yes	via trace / proxy	partial
CloudZero / Finout / AI Vyuh	enterprise FinOps	what the org spent	no	invoices / keys	invoice-level
OpenAI / Anthropic dashboards	per-account billing	tokens & total	no	no	partial
Nomira	coding assistants	WHY this cost so much	yes	never	core

We don't claim the market is empty. It isn't. Our wedge is the specific intersection — forensics + coding assistants + privacy + accuracy — that nobody else stands in.

04 · what we are

Three pillars.

Forensic

Not "how much this month" — why this turn, conversation, or developer cost what it did. Plus waste signals: retry loops, repeated reads, cache rebuilt instead of reused, sub-agent fan-out.

Accurate

Cache-aware, multi-provider (Anthropic, OpenAI/Codex, Gemini), versioned rates with live OpenRouter feed, reconcilable against the real provider invoice. Unknown models are flagged — never priced from a guess.

Private

Token counts and business tags only — never prompt or response content. Schema-enforced. Self-hosted. No proxy in your request path. Closed FinOps SaaS can't honestly say this.

05 · the product

One CLI. One self-hosted dashboard. An SDK for your own app.

Individual — 30 seconds

Read logs already on disk. Nothing leaves your machine.

# analyze your newest Claude Code session
pip install nomira
nomira
nomira --all
nomira --compare --by-project

Team — 5 minutes

One Docker host. Everyone else just --ship.

# admin (once)
docker compose up -d

# each developer
nomira --ship --remote https://.../ingest \
       --token $NOMIRA_TOKEN \
       --developer alice

Your own app — 3 lines

Cost-tag every LLM call from your product code.

import nomira
nomira.track(model="claude-opus-4-7",
             usage=resp.usage,
             feature="doc-summary",
             user_tier="free")

Privacy invariant

The events table has no columns for prompt, response, content, text, or messages — by schema. The collector rejects any event that tries to carry them. Verified by test.

06 · proof, on real data

The optimizer drill-down, on a single project.

butt-dial — $5,016 API-equivalent across 16,543 calls. 49% of total AI value. Drill-down finds the money:

Top tools by attributed cost

Bash

$985 · 3,462

Edit

$452 · 1,722

Read

$378 · 1,571

Grep

$242 · 995

Sub-agents (Task)

$21 · 17

Recommendations (rough $)

router.ts read 172× — pin / .claudeignore. ~$36 saved.
register-router.ts read 131× — same. ~$24.
Bash dominates (3,475 calls). Pipe heavy outputs through head/grep. ~$199.
17 sub-agent runs — batch or cheaper model. ~$6.

This is "why did this cost so much" — not "what was the total." It's the difference between a bill and a verdict.

07 · the regimes

Two cost worlds — modeled explicitly.

Subscription (Claude Code, Codex)

Fixed monthly price + token limits + top-ups. The dollar shown is the API-equivalent value of your usage — not a bill.

Plan price$200/mo

API-equivalent (this dataset)$10,318

Value vs plan52×

"Am I getting my plan's worth?" / "How much allowance did I burn?"

API / token-billed (Anthropic, OpenAI, Gemini)

The dollar shown is the bill. The true source is the provider's usage/cost API.

Computed (API regime)— this build —

Billed (provider invoice)imported via CSV / Cost API

Delta±N% on the /reconcile page

Reconciliation is the auditor's final answer.

08 · architecture

Privacy by construction — three ingestion modes, one schema.

Local readers

Read existing logs on disk (Claude Code, Codex). Zero egress, zero install for teammates. The individual wedge.

Usage-only collector (recommended core)

Lightweight SDK / shipper sends counts + tags only to a self-hosted endpoint. Off the critical path. The team product.

Self-hosted gateway (optional)

For teams that prefer routing over instrumentation. Only ever self-hosted, never cloud middleman — content stays in your infra.

Wire formatprovider · model · token counts · business tags · timestamps

Forbiddenno prompt, no response, no content — by schema

True sourcethe provider's invoice (Anthropic Cost API + CSV import)

09 · why this is defensible

The moat isn't the charts.

Accuracy. Cache + regime + multi-provider + version matching + invoice reconciliation. Hard to get right, hard to keep right — the auditor's brand.
Privacy posture. Closed FinOps SaaS structurally cannot promise content stays local. We can — and our schema enforces it.
Canonical schema. The feature / workflow / tier / customer / environment dimensions, if widely adopted, are the actual "standard" — more than the code.
Trust + mindshare. Open source + GitHub-first + reconciled numbers means a CFO can trust the verdict. That's slow to copy.

Charts are copyable in a sprint. Being right about cost — for the coding assistants nobody else instruments — without ever touching content, isn't.

10 · GTM & business model

Open source first. Commercial on proven demand.

Phase A — Community (now)

MIT license · GitHub-first · stdlib-only · pip install nomira
Self-hosted dashboard via one Docker compose
Multi-provider, cache-aware cost engine + invoice reconciliation
Project & developer drill-down + recommendations
SDK for your own app

Phase B — Commercial (only after 10 teams ask)

Managed cloud (for those who don't want to self-host)
Unit economics layer (cost ↔ revenue / customer)
Reconciliation & compliance reporting (SSO, RBAC, PDF)
Forecasting & anomaly alerting
Industry benchmarking

Discipline: no paid feature is built before ten teams have asked for it. The open core stays genuinely useful.

11 · status

Already shipped (Stage 0).

Sources working

Claude Code · Codex · Cursor (probe)

Providers priced

Anthropic · OpenAI · Google

Tests passing

10/10

incl. privacy guard

Real cost catch

10×

vs a naive calculator on real data

Built & live

CLI · Team dashboard (self-hosted) · Optimizer drill-down per project · Developer leaderboard & drill-down · Reconciliation (CSV + Cost API) · Time-window controls · Dynamic pricing (OpenRouter + override + fallback) · SDK · Dockerized · Auth (bearer + Basic) · 18 docs rendered to HTML.

The ask

Three to five design partners. Real Claude Code / Codex teams who'll run it for a month, give honest feedback, and tell us whether "why did this cost so much" is the question they've been waiting to ask.

NOMIRA

The auditor for AI coding spend.

Open source · privacy-first · multi-provider · cache-aware. Built for the question your invoice can't answer.

nomiraai.com demo docs contact github (soon)

Built honestly — no "empty market" claims. The opening we stand in is real, specific, and defensible.