Overview

Nomira — cost forensics for AI coding assistants

What Nomira is, in one screen.

Install: pip install nomira · Self-host the team dashboard: docker compose up -d

· Full deployment guide: docs/DEPLOY.md

Stage 0 prototype. Answers a question no other tool does:

"Why did this conversation / turn cost so much?" — read straight from your

local Claude Code transcripts, with cache-aware (accurate) pricing.

Helicone, Langfuse and the FinOps tools all measure production app spend. Nobody explains coding-assistant spend. Nomira does — and shows you the part a naive token calculator misses: in real sessions ~90% of the cost is cache tokens, not the visible input/output.

Run it (no install, Python 3.9+)

python nomira.py                     # analyze your newest Claude Code session
python nomira.py --all               # aggregate across every local transcript
python nomira.py --compare           # efficiency comparison across sessions
python nomira.py --compare --by-project   # ...grouped by project
python nomira.py --list              # list transcripts it can see
python nomira.py path/to/session.jsonl --json
python nomira.py --regime api        # force "this is the bill" framing
python nomira.py --source codex      # analyze OpenAI Codex sessions (~/.codex/sessions)
python nomira.py --source codex --all

Reads Claude Code (~/.claude/projects) and OpenAI Codex (~/.codex/sessions) today. The cost engine is multi-provider (Anthropic verified; OpenAI/Codex + Gemini included), each with its own cache pricing. For Codex it also shows your subscription allowance used (e.g. "23% of your 5h window").

Team dashboard (self-hosted, usage-only)

python nomira.py --ship --developer alice   # write usage-only events to ~/.nomira/usage.db
python nomira.py --serve                     # http://127.0.0.1:8787

Each teammate ships; the dashboard shows cost by developer, project, model, and trend. Only token counts + tags are stored — never prompt/response content (enforced by the schema; the ingest endpoint rejects content). See PRIVACY.md.

Two cost worlds

Subscription (Claude Code, Codex, Cursor): fixed monthly + limits + top-ups. The dollar shown is the API-equivalent value of your usage — "am I getting my plan's worth", not a bill.
API / token-based (Claude API, Gemini, OpenAI): the dollar shown is the bill.

Nomira labels which world it's showing. The cost engine is multi-provider (Anthropic exact today; OpenAI/Google scaffolded) with each provider's own cache pricing.

Reads ~/.claude/projects/<project>/<session>.jsonl. Nothing leaves your machine. No SDK, no proxy, no server.

What it tells you

Total cost, split by component (input / output / cache-read / cache-write 5m & 1h / tools).
The most expensive turns, and why each one was expensive.
Waste signals: turns that kept rebuilding cache, repeated identical tool calls, repeated file reads.

Pricing (dynamic, multi-source)

Rates resolve through a source chain, best first:

Local override — ~/.nomira/prices.local.json (you edit; closest to "true").
OpenRouter — live per-token prices for 350+ models, cached locally

(python nomira.py --update-prices, auto-refreshed daily).

Built-in table — published-rate fallback in nomira/pricing.py.

Reports show which source priced the run. Matching is version-aware (claude-opus-4-7 → Opus 4.7, not an older or -fast variant). The true source — reconciliation against your real provider invoice — is the remaining goal.

Tests

python tests/test_cost_engine.py

Next →Product