Nomira — Changelog

What shipped in each round.

Session 1 — 2026-05-27/28

Research & alignment

Extracted and digested the Nomira deck (11 slides) → docs/references/nomira-deck.md.
Researched the LLM cost / AI-FinOps landscape; verified every deck claim → docs/references/market-research.md.
Wrote deep problem analysis (ANALYSIS.md), strategy (STRATEGY.md), draft spec, and independent conclusions (CONCLUSIONS.md) — all before reading the developer's doc.
Read developer's Financial Aspects of Token Use.docx; wrote a structured comparison (COMPARISON.md). Independent convergence on the deck's flaws; adopted the developer's stronger forensics reframe.

Decision

Pivot: from cost aggregation to cost forensics for AI coding assistants.
Stage 0: a local CLI that reads Claude Code transcripts and explains where the money went. Focus order 1 → 3 → 2.

Built (Stage 0, focus 1 + 3)

nomira/ package (stdlib-only): transcript loader, cache-aware pricing engine (the auditor core), per-turn forensics, waste signals, terminal + JSON report, CLI.
Verified on real data: a 679-call session = $217 (unverified Opus rates), 87% of it cache a naive tool would miss. Across all 44 local transcripts: ~$10,231, 91% cache (naive view would show ~$970 — off ~10×).
Waste signals surface cache-rebuild-heavy turns.
Regression tests (tests/test_cost_engine.py) — 4/4 passing; caught an arithmetic slip in the expected values (the gate working).
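
The gap between a naive view and a cache-aware view can be illustrated with a minimal sketch. The rates, the cache multipliers, and the names `Usage`, `naive_cost`, and `cache_aware_cost` are assumptions for illustration; they are not taken from the nomira package.

```python
from dataclasses import dataclass

# Hypothetical per-million-token rates and cache multipliers (assumptions).
INPUT_RATE = 15.00
OUTPUT_RATE = 75.00
CACHE_WRITE_MULT = 1.25  # cache writes priced as a multiple of the input rate
CACHE_READ_MULT = 0.10   # cache reads priced as a fraction of the input rate


@dataclass
class Usage:
    input_tokens: int
    output_tokens: int
    cache_write_tokens: int = 0
    cache_read_tokens: int = 0


def naive_cost(u: Usage) -> float:
    """Price only plain input and output tokens, ignoring the cache fields."""
    return (u.input_tokens * INPUT_RATE + u.output_tokens * OUTPUT_RATE) / 1_000_000


def cache_aware_cost(u: Usage) -> float:
    """Add the cache-write and cache-read components to the naive cost."""
    cache_units = (u.cache_write_tokens * CACHE_WRITE_MULT
                   + u.cache_read_tokens * CACHE_READ_MULT)
    return naive_cost(u) + cache_units * INPUT_RATE / 1_000_000


def cache_share(u: Usage) -> float:
    """Fraction of the cache-aware cost that a naive tool would not see."""
    total = cache_aware_cost(u)
    return 0.0 if total == 0 else (total - naive_cost(u)) / total


turn = Usage(input_tokens=1_000, output_tokens=500,
             cache_write_tokens=20_000, cache_read_tokens=400_000)
print(f"naive ${naive_cost(turn):.4f}, cache-aware ${cache_aware_cost(turn):.4f}, "
      f"cache share {cache_share(turn):.0%}")
```

With a long-running coding session, the cached context is re-read on every call, which is why the cache components can dominate the total even at a fraction of the input rate.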

Built (continued — multi-provider + regime + focus 2)

Developer raised the two cost regimes: subscription (Claude Code/Codex/Cursor — fixed price + limits + top-up; dollar = API-equivalent shadow value) vs API/token-based (dollar = the bill). Modeled both; report now labels which regime it shows.
Refactored the pricing engine into a generic provider registry with per-provider cache semantics (Anthropic exact; OpenAI cached-input 0.5×/no write cost; Google scaffold) + a NormalizedUsage seam for future source/provider adapters. OpenAI path sanity-checked.
Focus 2 shipped: --compare (by session) and --compare --by-project — efficiency table (cost, $/call, $/1k output, cache-reuse %). On local data: butt-dial priciest project (~$5,008 API-equiv), agents least efficient per output. True dev-vs-dev deferred (needs cross-machine identity).
Tests still 4/4.
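
A provider registry with per-provider cache semantics might look roughly like the sketch below. The class names, field names, and the Anthropic multipliers are assumptions; only the OpenAI values (cached input at 0.5×, no write cost) come from the entry above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheSemantics:
    read_mult: float   # cached-input price as a fraction of the input rate
    write_mult: float  # cache-write price as a fraction of the input rate


@dataclass(frozen=True)
class NormalizedUsage:
    """Provider-neutral token counts (hypothetical field names)."""
    provider: str
    uncached_input: int
    cache_read: int
    cache_write: int
    output: int


PROVIDERS = {
    "anthropic": CacheSemantics(read_mult=0.10, write_mult=1.25),  # assumed values
    "openai": CacheSemantics(read_mult=0.50, write_mult=0.0),      # per the entry above
}


def price(usage: NormalizedUsage, input_rate: float, output_rate: float) -> float:
    """Cost in dollars; rates are dollars per million tokens."""
    sem = PROVIDERS[usage.provider]  # KeyError for unknown providers: no guessed price
    input_units = (usage.uncached_input
                   + usage.cache_read * sem.read_mult
                   + usage.cache_write * sem.write_mult)
    return (input_units * input_rate + usage.output * output_rate) / 1_000_000
```

Keeping the multipliers in data rather than in branches means a new provider adapter only has to produce a `NormalizedUsage` and register its semantics.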

Built (continued — pricing verify, Codex adapter, architecture)

(a) Pricing verified vs published rates: Anthropic confirmed (my originals were right), Gemini cache fixed to 0.10×, added GPT-4.1/o3/GPT-5-codex models with per-model cache overrides + per-model verified flags. Report now flags only unverified models.
(b) Codex adapter shipped: reads ~/.codex/sessions, diffs cumulative token totals into per-call usage, captures subscription allowance (used_percent). Added a NormalizedUsage seam so forensics prices any provider. On real local data: 2 Codex sessions, ~$5.73 API-equivalent, 64% cached input, allowance 23% of the 5h window. Regime now follows the source (Codex → subscription).
Architecture design (ARCHITECTURE.md): local readers vs usage-only plugin vs gateway. Recommendation: usage-only plugin as the team core (privacy invariant: counts + tags, never content); gateway only if self-hosted.
GTM/DD backlog added to TODO (branding, site, whitepaper, deck, product desc, user guide, DD-ready product) — what the 5-person gate actually needs.
Tests now 6/6 (added Codex + provider-cache-semantics tests).
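
Diffing cumulative token totals into per-call usage, as the Codex adapter is described as doing, can be sketched as follows. The snapshot shape and the key names are hypothetical; the real session format may differ.

```python
def diff_cumulative(snapshots):
    """Turn a sequence of cumulative token totals into per-call deltas.

    Each snapshot is a dict of counter name -> running total (assumed shape).
    """
    calls = []
    prev = {}
    for snap in snapshots:
        delta = {name: total - prev.get(name, 0) for name, total in snap.items()}
        if any(value < 0 for value in delta.values()):
            # A counter went backwards: assume the totals were reset and
            # treat this snapshot as the usage of a fresh call.
            delta = dict(snap)
        if any(delta.values()):
            calls.append(delta)  # skip snapshots that added no tokens
        prev = snap
    return calls


snapshots = [
    {"input": 100, "output": 10},
    {"input": 250, "output": 30},
    {"input": 250, "output": 30},  # repeated total, no new usage
]
print(diff_cumulative(snapshots))
```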

Built (continued — "do all": team product, Cursor, readiness package)

Team product (Mode B): store.py (SQLite, no content columns), events.py (usage-only schema + content-rejection guard), ship.py (local → events → store), collector.py (http.server /ingest + /api/summary + server-rendered dashboard). CLI --ship/--serve/--developer/--db/--port. Shipped 34,949 real events ($10,300, 24 projects, both providers) and smoke-tested the server (content ingest rejected with 400).
Cursor adapter: investigated local data — Cursor stores no per-call token usage (0/98 conversations; server-side metering). Adapter reports availability honestly and points to the Cursor Admin API; --source cursor. We do not fabricate costs.
Readiness package: LICENSE (MIT), PRIVACY.md, SECURITY.md, branding, product one-pager, user guide, whitepaper, investor deck, and a self-contained landing page (site/index.html). HTML validated.
Tests now 9/9 (added tests/test_team.py — privacy + store round-trip + idempotency).
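
The content-rejection guard can be pictured as a check that runs before anything is stored. This is a sketch only: the forbidden field list and the function names are assumptions, and it inspects top-level keys only.

```python
# Hypothetical list of fields that would carry prompt or response content.
FORBIDDEN_FIELDS = frozenset({"prompt", "completion", "content", "messages", "text"})


def validate_event(event):
    """Return the event unchanged, or raise if it carries content fields."""
    if not isinstance(event, dict):
        raise ValueError("event must be a JSON object")
    leaked = sorted(FORBIDDEN_FIELDS.intersection(event))
    if leaked:
        raise ValueError(f"content fields are not accepted: {leaked}")
    return event


def ingest_status(event):
    """Map validation onto the HTTP status an /ingest handler could return."""
    try:
        validate_event(event)
    except ValueError:
        return 400
    return 200


print(ingest_status({"model": "example-model", "input_tokens": 12}))
print(ingest_status({"model": "example-model", "prompt": "hello"}))
```

Rejecting the whole event, rather than silently dropping the offending field, makes a misconfigured shipper visible instead of hiding it.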

Built (continued — dynamic pricing + server running)

Dynamic multi-source pricing (prices.py): live OpenRouter feed (356 models, cached 24h, --update-prices) → local override → published fallback. Per-call source provenance shown in reports ("rate source: openrouter:DATE"). USE_DYNAMIC toggle keeps tests deterministic.
Version-aware matching: claude-opus-4-7 → Opus 4.7 ($5/$25), not Opus 4 ($15/$75) or -fast ($30/$150). Excludes fast/free/preview variants.
Idempotency fix: event keys exclude cost, so re-ship after a rate change won't duplicate. Re-shipped store: 34,872 events, $10,320 (market-sourced).
Dashboard server running at http://127.0.0.1:8787 (verified 200).
Tests 9/9 (pinned to published rates for determinism).
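
The idempotency fix can be illustrated with a key derived only from identity fields, so a changed rate produces the same key. The field names, the table layout, and the replace-on-conflict behavior are assumptions for this sketch.

```python
import hashlib
import json
import sqlite3

# Hypothetical identity fields; cost is deliberately not among them.
KEY_FIELDS = ("source", "session_id", "call_index", "model")


def event_key(event):
    """Stable hash over identity fields only, so repricing keeps the same key."""
    identity = {name: event.get(name) for name in KEY_FIELDS}
    blob = json.dumps(identity, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


def ship(conn, event):
    """Insert the event, replacing any earlier row with the same key."""
    conn.execute(
        "INSERT OR REPLACE INTO events (key, model, cost) VALUES (?, ?, ?)",
        (event_key(event), event["model"], event["cost"]),
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (key TEXT PRIMARY KEY, model TEXT, cost REAL)")

event = {"source": "claude-code", "session_id": "s1", "call_index": 7,
         "model": "example-model", "cost": 1.20}
ship(conn, event)
ship(conn, {**event, "cost": 0.40})  # same call, re-shipped after a rate change

count, total = conn.execute("SELECT COUNT(*), SUM(cost) FROM events").fetchone()
print(count, total)
```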

Built (continued — regime/plan correction)

Subscription reframed as a plan, not a bill. Added plans.py (Claude Code Max $200/mo default; configurable). Reports/dashboard now show subscription usage as API-equivalent VALUE vs your plan ("52× value"), explicitly not a bill; value-vs-plan shown only at period scope. Dashboard KPIs: API-equivalent value · Value vs $200/mo plan (52×) · Cache share (91%) · Developers. --plan-price flag.
Dashboard project labels shortened to the folder name with the full path in the tooltip.
Server restarted on :8787 with all changes; tests 9/9.
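
The changelog does not state the exact formula behind the "52×" figure. One plausible reading, assumed here, is API-equivalent value divided by the plan price over the period; with the store total of $10,320 and the $200/mo default that gives 51.6, which rounds to 52. The function names are hypothetical.

```python
def value_multiple(api_equiv_value, plan_price=200.0, months=1.0):
    """API-equivalent value divided by what the plan cost over the period."""
    if plan_price <= 0 or months <= 0:
        raise ValueError("plan_price and months must be positive")
    return api_equiv_value / (plan_price * months)


def format_multiple(multiple):
    """Render the multiple the way a KPI tile might show it."""
    return f"{round(multiple)}× value"


print(format_multiple(value_multiple(10_320)))
```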

Built (continued — optimizer drill-down)

Project drill-down shipped. nomira/optimizer.py + /project/<slug> page. Click any project on the dashboard → KPIs (value, vs-plan, sub-agents, Bash), attributed cost by tool, most-repeated file reads with $ saving estimates, sub-agent usage, top sessions, and waste→savings recommendations.
Privacy preserved: optimizer data computed on demand from local transcripts; file paths / tool inputs never enter the team store (schema invariant intact).
Concrete payoff on real data: found a file read 171× in butt-dial — a clear pin / .claudeignore optimization target.
Tests 9/9; server live on :8787.
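
Finding the most-repeated file reads reduces to counting Read calls per path. The sketch below assumes a simplified tool-call shape (`tool` and `path` keys) that is not the actual transcript format, and it leaves out the dollar attribution.

```python
from collections import Counter


def repeated_reads(tool_calls, min_count=2):
    """Return (path, count) pairs for files read at least min_count times."""
    counts = Counter(
        call["path"]
        for call in tool_calls
        if call.get("tool") == "Read" and call.get("path")
    )
    return [(path, n) for path, n in counts.most_common() if n >= min_count]


calls = (
    [{"tool": "Read", "path": "src/app.py"}] * 3
    + [{"tool": "Read", "path": "README.md"}, {"tool": "Bash", "path": "src/app.py"}]
)
print(repeated_reads(calls))
```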

Built (continued — developer comparison + drill-down)

/compare/developers — efficiency leaderboard: sessions, projects, calls, cost, $/call, $/1k output, cache reuse %. Auto-headline "A spent N× more than B; least efficient X" when 2+ devs ship. Answers Pain-1.
/developer/<name> — drill-down: KPIs (value, share of team total, $/1k output, cache reuse), daily trend, cost by project (linked), cost by model.
Dashboard "Cost by developer" bars are clickable; card header has a compare → link.
The per-developer share-of-value KPI, already shown on project pages, now appears on the developer pages too. Tests 9/9.

Built (continued — six items in one round)

Sharpened optimizer attribution (per-turn cache share across Reads in the same call; top files ranked by attributed cost, not raw frequency).
Per-developer optimizer detail on /developer/<name>, shown only when the dev matches this machine (privacy intact).
Invoice reconciliation — bills table, --reconcile-import (CSV/JSON), --reconcile-fetch (Anthropic Cost API with admin key). Dashboard card + /reconcile daily comparison. Subscription excluded from the comparison (not a bill).
Time-window controls (today/7d/30d/all) on all aggregate views via ?period=.
Remote collector with bearer token (NOMIRA_INGEST_TOKEN) + --host for LAN bind; ship.py --remote URL --token.
SDK nomira.track() for the user's own app — local SQLite or remote POST, privacy invariant verified by test.
Tests 10/10 (added SDK test).
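
A bearer-token check for the remote collector could be as small as the sketch below. The function name and the fail-closed behavior when no token is configured are assumptions; only the `NOMIRA_INGEST_TOKEN` variable name comes from the entry above.

```python
import hmac
import os


def check_bearer(authorization, expected_token):
    """Return True only if the Authorization header carries the expected token."""
    if not expected_token or not authorization:
        return False  # assumption for this sketch: fail closed
    scheme, _, presented = authorization.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(presented.strip().encode(), expected_token.encode())


expected = os.environ.get("NOMIRA_INGEST_TOKEN", "")
print(check_bearer("Bearer example-token", expected))
```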

Built (continued — packaging & docs for public deploy)

pyproject.toml + __main__.py: pip install nomira gives a nomira console script.
Dockerfile + docker-compose.yml + .dockerignore: one-command self-hosted team dashboard.
Optional dashboard auth (HTTP Basic via NOMIRA_DASHBOARD_PASSWORD); startup warns if binding non-localhost without it.
tools/build_docs.py renders every docs/*.md to a styled HTML page in site/docs/ (stdlib only) — for non-engineers / DD reviewers. Landing page links to it.
docs/DEPLOY.md: comprehensive deployment guide with onboarding scenarios, hosting recipes (Docker, Fly.io, Render, Cloud Run, bare VM), security posture, and a sizing note.

Built (continued — branded deck)

site/deck.html: 13-slide, self-contained, keyboard-navigable investor/sales deck. Brand-consistent dark+teal palette. Embeds real numbers from this build (cache gap, butt-dial drill-down, 52× plan leverage). Print-friendly. Linked from the marketing landing nav.
Note: the ChatGPT share link provided as additional context didn't expose its content to the fetcher (login shell); the deck was built from project material. Easy to revise once that review's content is pasted.

Built (continued — public web hosting)

.github/workflows/pages.yml: pushes to main re-render docs and publish the site/ folder (landing + deck + docs) to GitHub Pages automatically.
DEPLOY.md gained a "Hosting the public site" section: GH Pages (recommended) · Netlify Drop / Vercel / Cloudflare Pages (drag-and-drop) · own Caddy server. Notes on custom domain + the marketing-vs-dashboard subdomain split.
Verified site/ is fully self-contained — zero external CSS or JS references; drop on any host and it just works.

Session 2 — 2026-06-02

SDK framework adapters (`nomira/integrations/`)

LangChain — NomiraCallbackHandler (lazy import; safe to import without langchain installed). on_llm_end / on_chat_model_end extract token usage from the LLMResult / ChatResult and funnel into nomira.track() with constructor-supplied default tags.
OpenAI — wrap_client(client, **tags) patches chat.completions.create and responses.create in place. Idempotent (rewrap is a no-op). Preserves return values and exceptions. Tolerant of pydantic v1/v2/dict usage shapes and of cached-input details.
Vercel AI SDK — both flavors: a trackNomira TypeScript helper (embedded as a string for the wiki) that POSTs a usage-only event to /ingest from onFinish, plus a Python track_ai_sdk_response(payload, **tags) for the AI SDK Python flavor. Maps camelCase (promptTokens, promptTokensDetails.cachedTokens) to the engine's normalized usage.
Tests: tests/test_integrations.py (4/4) — module imports without optional deps; wrap_client records a priced content-free event; the LangChain handler records via a stubbed BaseCallbackHandler (no langchain required); AI SDK helper handles camelCase. All forbidden content fields rejected. Pricing pinned to published rates.
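
The idempotent in-place patching described for wrap_client can be sketched against a stub object. Everything named here (`wrap_create`, `extract_usage`, the marker attribute, the `record` callback) is hypothetical; the usage keys follow the shape of chat-completions usage, and the stub stands in for a real client.

```python
import functools
from types import SimpleNamespace

_MARK = "_nomira_wrapped"  # hypothetical marker attribute


def _get(obj, name, default=None):
    """Read a field from either a dict or an attribute-style object."""
    if isinstance(obj, dict):
        return obj.get(name, default)
    return getattr(obj, name, default)


def extract_usage(response):
    """Pull token counts only; message content is never touched."""
    usage = _get(response, "usage") or {}
    details = _get(usage, "prompt_tokens_details") or {}
    return {
        "input_tokens": _get(usage, "prompt_tokens", 0) or 0,
        "output_tokens": _get(usage, "completion_tokens", 0) or 0,
        "cached_tokens": _get(details, "cached_tokens", 0) or 0,
    }


def wrap_create(obj, record, **tags):
    """Patch obj.create in place; wrapping an already wrapped object is a no-op."""
    original = obj.create
    if getattr(original, _MARK, False):
        return obj

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        response = original(*args, **kwargs)  # exceptions propagate unchanged
        try:
            record(extract_usage(response), tags)
        except Exception:
            pass  # tracking must never break the caller's request
        return response

    setattr(wrapper, _MARK, True)
    obj.create = wrapper
    return obj


events = []
stub = SimpleNamespace(create=lambda **kw: {
    "usage": {"prompt_tokens": 10, "completion_tokens": 5,
              "prompt_tokens_details": {"cached_tokens": 4}}})
wrap_create(stub, lambda usage, tags: events.append((usage, tags)), team="platform")
response = stub.create(model="example-model")
print(events)
```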
Privacy invariant unchanged: only token counts + business tags reach the store.

Cursor Admin API adapter (`nomira/cursor_admin.py`)

Server-side fetch into the same event store, via the documented endpoints (POST /teams/filtered-usage-events, POST /teams/daily-usage-data). Basic auth (key as username, empty password). Defensive error handling: a failed endpoint leaves verified=False and lists the error rather than silently shipping an empty result.
events_from_cursor() produces usage-only events: token-based rows get full token detail (input/output/cache-read/cache-write) plus the provider-charged cost as the source of truth; request-priced rows ship with the charged cost and zero tokens (clearly labeled). Cache-share cost split is heuristic, called out as UNVERIFIED — Cursor doesn't itemize per-component cost over the API.
Unknown models stay as cursor/<raw> placeholders — never fabricate a price. Recognized models (Claude/GPT) map into the existing registry so they aggregate cleanly with Claude Code / Codex events.
CLI: --cursor-fetch [--cursor-from YYYY-MM-DD] [--cursor-to YYYY-MM-DD]. Default window: last 30 days.
cursor.py::availability_report() now also reports whether NOMIRA_CURSOR_ADMIN_KEY is set and what command to run next.
Tests: tests/test_cursor_admin.py (4/4) — urllib.request.urlopen mocked. Verifies Basic-auth header, epoch-ms date conversion, clean event shape with no content fields, request-priced vs token-based paths, and that 404s surface as errors instead of being swallowed.
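
Two of the details verified by those tests, the Basic-auth header (key as username, empty password) and the epoch-millisecond date conversion, can be sketched with the standard library. The function names are hypothetical, and interpreting the date as midnight UTC is an assumption.

```python
import base64
from datetime import datetime, timezone


def basic_auth_header(api_key):
    """Basic auth with the key as username and an empty password."""
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def to_epoch_ms(date_str):
    """Convert YYYY-MM-DD to epoch milliseconds, assuming midnight UTC."""
    moment = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(moment.timestamp() * 1000)


print(basic_auth_header("key"))
print(to_epoch_ms("2000-01-01"))
```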

Reconciliation polish (`nomira/reconciliation.py`, `nomira/collector.py`)

Cost API parsing toughened: _walk_results now walks nested results/groups/data/items/rows containers depth-first, so newer Anthropic Cost API shapes (per-model breakdowns nested under groups) parse without code changes. _coerce_amount accepts plain numbers, {amount, currency}, AND {value, currency}; non-USD rows are skipped, never silently converted.
Structured error reporting: HTTP errors include the response body (Anthropic's JSON error message is very informative), network errors are caught separately from JSON-decode errors, and the raised RuntimeError text is now safe to surface in a UI.
Missing-day callout: reconciliation.missing_days(pairs) classifies days into billed_no_compute (provider charged us where we observed nothing — probably missing ingest) and computed_no_bill (observed API spend with no matching bill — probably an import gap). The /reconcile page renders these as a warning card above the daily-comparison table, with a clear next step.
Verified store.computed_by_day already returns a daily series and pairs cleanly even when one side is empty.
Tests: tests/test_reconciliation.py (6/6) — CSV parser (incl. quoted-comma costs, mixed date formats, garbage rows), JSON parser (list + envelope), Cost API shape walker, _coerce_amount, missing-day classifier, full store round-trip.
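
The depth-first walk and the amount coercion described above can be sketched as follows. This is an illustration rather than the code in reconciliation.py: the function bodies, the treatment of a missing currency as USD, and the sample payload are all assumptions.

```python
CONTAINER_KEYS = ("results", "groups", "data", "items", "rows")


def walk_results(node):
    """Yield leaf dicts depth-first, descending through known container keys."""
    if isinstance(node, list):
        for item in node:
            yield from walk_results(item)
    elif isinstance(node, dict):
        nested = [node[key] for key in CONTAINER_KEYS
                  if isinstance(node.get(key), (list, dict))]
        if nested:
            for child in nested:
                yield from walk_results(child)
        else:
            yield node


def coerce_amount(raw):
    """Return a USD float, or None for non-USD or unparseable amounts."""
    if isinstance(raw, bool):
        return None
    if isinstance(raw, (int, float)):
        return float(raw)
    if isinstance(raw, dict):
        # Assumption: a missing currency is treated as USD.
        if str(raw.get("currency", "USD")).upper() != "USD":
            return None  # skipped, never converted
        try:
            return float(raw.get("amount", raw.get("value")))
        except (TypeError, ValueError):
            return None
    return None


payload = {"data": [{"results": [
    {"amount": {"amount": "1.50", "currency": "USD"}},
    {"groups": [
        {"amount": {"value": 2, "currency": "usd"}},
        {"amount": {"amount": 3, "currency": "EUR"}},
    ]},
]}]}
amounts = [coerce_amount(leaf["amount"]) for leaf in walk_results(payload)]
print(amounts)
```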

Total tests: 24/24 (6 cost engine + 4 team + 4 integrations + 4 cursor admin + 6 reconciliation).

Session 4 — 2026-06-02 (hybrid MVP)

Shipped: multi-tenant store + API keys + log-load mode.

Multi-tenant store. tenant_id on events and bills; defensive ALTER TABLE for existing DBs; every aggregate scoped via _scope_where / _scope_and. Backward-compatible — self-host installs sit under tenant default.
API keys. api_keys table, SHA-256 hash only (raw key shown once on create). --create-tenant NAME mints; --list-tenants shows registry.
Auth on /ingest. Three layers: legacy NOMIRA_INGEST_TOKEN, per-tenant bearer keys, or open (localhost). Authenticated tenant is forced onto every event server-side.
nomira --export events.json. Walks Claude Code + Codex sources, dumps usage-only events with the configured --tenant tag.
/import page. Drag-and-drop upload form. The log-load mode: zero live connection between developer machine and dashboard.
Event idempotency key now includes tenant + developer.
Tests: 26/26 (added tenant isolation + import-page check).
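
Storing only a SHA-256 hash of each API key, and forcing the authenticated tenant onto incoming events, can be sketched like this. The key prefix, the in-memory registry, and the function names are assumptions standing in for the api_keys table.

```python
import hashlib
import hmac
import secrets


def mint_key(prefix="nom_"):
    """Return (raw_key, sha256_hex); only the hash should be persisted."""
    raw = prefix + secrets.token_urlsafe(32)
    return raw, hashlib.sha256(raw.encode("utf-8")).hexdigest()


def tenant_for_key(raw_key, registry):
    """Look up the tenant for a presented key; registry maps hash -> tenant id."""
    digest = hashlib.sha256(raw_key.encode("utf-8")).hexdigest()
    for stored_hash, tenant_id in registry.items():
        if hmac.compare_digest(stored_hash, digest):
            return tenant_id
    return None


def stamp_tenant(event, tenant_id):
    """Overwrite any client-supplied tenant with the authenticated one."""
    return {**event, "tenant_id": tenant_id}


raw, key_hash = mint_key()
registry = {key_hash: "acme"}  # the raw key is shown once and not stored
print(tenant_for_key(raw, registry))
print(stamp_tenant({"tenant_id": "globex", "model": "example-model"}, "acme"))
```

Stamping the tenant server-side means a client cannot write into another tenant's data by editing the event payload.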

Built (continued — dashboard tenant context)

Every dashboard route accepts ?tenant=<id>; queries scoped via the existing _scope_where/_scope_and helpers.
Period picker, project bars, developer bars, "compare developers" links — all preserve ?tenant=X (single helper: _with_tenant).
Visible "viewing tenant X" chip when a non-default tenant is being viewed, with a "switch →" link back to default.
Smoke-tested: cross-tenant isolation holds at the rendered-HTML layer (acme's developer-a never appears under globex view, and vice versa).
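
A link helper that preserves the tenant across dashboard routes might look like the sketch below. The signature and the choice to omit the parameter for the default tenant are assumptions, not the actual `_with_tenant` implementation.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_tenant(url, tenant, default="default"):
    """Return url with ?tenant= set, replacing any existing tenant parameter."""
    parts = urlsplit(url)
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "tenant"]
    if tenant and tenant != default:
        query.append(("tenant", tenant))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_tenant("/compare/developers?period=7d", "acme"))
print(with_tenant("/project/x?tenant=globex", "acme"))
print(with_tenant("/reconcile", "default"))
```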

Open / next

Per-tenant rate limits, signup/billing UI.
Managed cloud Phase B — only on proven demand.
