NOMIRA

Know what your AI actually costs.

Cost forensics for AI coding assistants — open source, privacy-first, multi-provider. Built for the only honest opening in a crowded market: not "how much did we spend?", but "why did this cost so much?"

See the problem Live demo Deploy GitHub
01 · the problem

Two developers on the same tools burn 5× differently — and nobody can explain why.

Teams run Claude Code, Codex, and Cursor all day. A single action can cost surprisingly much. Invoices show totals. Assistants show raw usage. Neither explains where the money went.

"We pay $200 a month per developer for Claude Code, plus our Anthropic API bill. We have no idea which features, agents, or developers drive that cost — or whether we're even getting our plan's worth."

Where did the money go?

Cost per feature, workflow, customer, developer.

Why did this turn cost so much?

Cache writes, re-read files, retry loops, sub-agent fan-out.

Am I getting my plan's worth?

Subscription value vs API-equivalent. Are we over-paying or maxing out?

02 · the truth

The cost isn't where you think.

On a real local dataset (44 Claude Code sessions, 34,872 events), the cost a naive tool reports is wrong by an order of magnitude.

True cost (cache-aware)
$10,318
across 44 local sessions
What a naive tool reports
~$970
visible input + output only
Of cost is hidden in cache
91%
cache-read + cache-write tokens

In one real call: 6 input + 119 output tokens looks like ~$0.003. The same call carried 22,830 cache-read + 28,134 cache-write tokens — the actual cost was $0.296. ~99% lives in cache.

03 · the market

"Cost aggregation" is taken. Forensics for coding assistants isn't.

Built forAnswersOpen / self-hostSees contentCache-accurate
Langfuse / Heliconeproduction appswhat happened in the callyesvia trace / proxypartial
CloudZero / Finout / AI Vyuhenterprise FinOpswhat the org spentnoinvoices / keysinvoice-level
OpenAI / Anthropic dashboardsper-account billingtokens & totalnonopartial
Nomiracoding assistantsWHY this cost so muchyesnevercore

We don't claim the market is empty. It isn't. Our wedge is the specific intersection — forensics + coding assistants + privacy + accuracy — that nobody else stands in.

04 · what we are

Three pillars.

Forensic

Not "how much this month" — why this turn, conversation, or developer cost what it did. Plus waste signals: retry loops, repeated reads, cache rebuilt instead of reused, sub-agent fan-out.

Accurate

Cache-aware, multi-provider (Anthropic, OpenAI/Codex, Gemini), versioned rates with live OpenRouter feed, reconcilable against the real provider invoice. Unknown models are flagged — never priced from a guess.

Private

Token counts and business tags only — never prompt or response content. Schema-enforced. Self-hosted. No proxy in your request path. Closed FinOps SaaS can't honestly say this.

05 · the product

One CLI. One self-hosted dashboard. An SDK for your own app.

Individual — 30 seconds

Read logs already on disk. Nothing leaves your machine.

# analyze your newest Claude Code session
pip install nomira
nomira
nomira --all
nomira --compare --by-project

Team — 5 minutes

One Docker host. Everyone else just --ship.

# admin (once)
docker compose up -d

# each developer
nomira --ship --remote https://.../ingest \
       --token $NOMIRA_TOKEN \
       --developer alice

Your own app — 3 lines

Cost-tag every LLM call from your product code.

import nomira
nomira.track(model="claude-opus-4-7",
             usage=resp.usage,
             feature="doc-summary",
             user_tier="free")

Privacy invariant

The events table has no columns for prompt, response, content, text, or messages — by schema. The collector rejects any event that tries to carry them. Verified by test.

06 · proof, on real data

The optimizer drill-down, on a single project.

butt-dial — $5,016 API-equivalent across 16,543 calls. 49% of total AI value. Drill-down finds the money:

Top tools by attributed cost

Bash
$985 · 3,462
Edit
$452 · 1,722
Read
$378 · 1,571
Grep
$242 · 995
Sub-agents (Task)
$21 · 17

Recommendations (rough $)

  • router.ts read 172× — pin / .claudeignore. ~$36 saved.
  • register-router.ts read 131× — same. ~$24.
  • Bash dominates (3,475 calls). Pipe heavy outputs through head/grep. ~$199.
  • 17 sub-agent runs — batch or cheaper model. ~$6.

This is "why did this cost so much" — not "what was the total." It's the difference between a bill and a verdict.

07 · the regimes

Two cost worlds — modeled explicitly.

Subscription (Claude Code, Codex)

Fixed monthly price + token limits + top-ups. The dollar shown is the API-equivalent value of your usage — not a bill.

Plan price$200/mo
API-equivalent (this dataset)$10,318
Value vs plan52×

"Am I getting my plan's worth?" / "How much allowance did I burn?"

API / token-billed (Anthropic, OpenAI, Gemini)

The dollar shown is the bill. The true source is the provider's usage/cost API.

Computed (API regime)— this build —
Billed (provider invoice)imported via CSV / Cost API
Delta±N% on the /reconcile page

Reconciliation is the auditor's final answer.

08 · architecture

Privacy by construction — three ingestion modes, one schema.

Local readers

Read existing logs on disk (Claude Code, Codex). Zero egress, zero install for teammates. The individual wedge.

Usage-only collector (recommended core)

Lightweight SDK / shipper sends counts + tags only to a self-hosted endpoint. Off the critical path. The team product.

Self-hosted gateway (optional)

For teams that prefer routing over instrumentation. Only ever self-hosted, never cloud middleman — content stays in your infra.

Wire formatprovider · model · token counts · business tags · timestamps
Forbiddenno prompt, no response, no content — by schema
True sourcethe provider's invoice (Anthropic Cost API + CSV import)
09 · why this is defensible

The moat isn't the charts.

Charts are copyable in a sprint. Being right about cost — for the coding assistants nobody else instruments — without ever touching content, isn't.

10 · GTM & business model

Open source first. Commercial on proven demand.

Phase A — Community (now)

  • MIT license · GitHub-first · stdlib-only · pip install nomira
  • Self-hosted dashboard via one Docker compose
  • Multi-provider, cache-aware cost engine + invoice reconciliation
  • Project & developer drill-down + recommendations
  • SDK for your own app

Phase B — Commercial (only after 10 teams ask)

  • Managed cloud (for those who don't want to self-host)
  • Unit economics layer (cost ↔ revenue / customer)
  • Reconciliation & compliance reporting (SSO, RBAC, PDF)
  • Forecasting & anomaly alerting
  • Industry benchmarking

Discipline: no paid feature is built before ten teams have asked for it. The open core stays genuinely useful.

11 · status

Already shipped (Stage 0).

Sources working
3
Claude Code · Codex · Cursor (probe)
Providers priced
3
Anthropic · OpenAI · Google
Tests passing
10/10
incl. privacy guard
Real cost catch
10×
vs a naive calculator on real data

Built & live

CLI · Team dashboard (self-hosted) · Optimizer drill-down per project · Developer leaderboard & drill-down · Reconciliation (CSV + Cost API) · Time-window controls · Dynamic pricing (OpenRouter + override + fallback) · SDK · Dockerized · Auth (bearer + Basic) · 18 docs rendered to HTML.

The ask

Three to five design partners. Real Claude Code / Codex teams who'll run it for a month, give honest feedback, and tell us whether "why did this cost so much" is the question they've been waiting to ask.

NOMIRA

The auditor for AI coding spend.

Open source · privacy-first · multi-provider · cache-aware. Built for the question your invoice can't answer.

nomiraai.com demo docs contact github (soon)

Built honestly — no "empty market" claims. The opening we stand in is real, specific, and defensible.

01 / 13
NOMIRA · v0.1 · ↑ ↓ ← → space