Budgets, Quotas, Usage, And Billing
Understand spend controls, quota controls, usage records, and billing calculations.
Odock records usage and can enforce budget and quota policies before upstream calls are made.
Usage Data Model
The platform stores three kinds of usage data:
| Table | Purpose |
|---|---|
| Usage | Rollups for dashboards and aggregate metrics |
| UsageRecord | Lossless per-request LLM usage, billing, provider usage, and routing metadata |
| McpUsageRecord | MCP-specific usage, bytes, tool, server, transport, and cost details |
UsageRecord stores:
- request ID,
- organisation, team, user, API key,
- provider,
- model,
- model ID,
- region,
- token JSON,
- billing JSON,
- raw provider usage,
- routing metadata,
- timestamp,
- latency,
- HTTP status,
- provider request ID,
- billable tokens,
- total cost in nanos USD.
McpUsageRecord stores:
- MCP server,
- organisation, team, user, API key,
- transport,
- JSON-RPC method,
- tool name,
- endpoint,
- status,
- latency,
- input and output bytes,
- billing JSON,
- metadata,
- total cost in nanos USD.
Token Breakdown
The usage pipeline normalizes provider usage into:
- prompt tokens,
- completion tokens,
- input details by modality,
- output details by modality,
- cached tokens,
- rejected tokens,
- tool usage,
- embeddings tokens,
- total tokens.
Providers with normalizers:
- OpenAI
- Anthropic
- Gemini
- vLLM
- MCP
Endpoint-aware semantics handle special cases such as embeddings and image generation so those tokens are billed in the right bucket.
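The normalization step can be illustrated with a minimal sketch. The `TokenBreakdown` shape, the function names, and the `endpoint` values below are hypothetical and are not taken from Odock. The provider field names (`prompt_tokens`, `prompt_tokens_details.cached_tokens`, `input_tokens`, `cache_read_input_tokens`, `cache_creation_input_tokens`) follow the public OpenAI and Anthropic usage payloads. Folding Anthropic cache tokens into the prompt count is one possible convention, assumed here so that both providers share the same meaning for "prompt tokens".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenBreakdown:
    """Hypothetical normalized shape; field names are illustrative only."""
    prompt_tokens: int = 0
    completion_tokens: int = 0
    cached_tokens: int = 0
    embeddings_tokens: int = 0
    total_tokens: int = 0


def normalize_openai(usage: dict, endpoint: str) -> TokenBreakdown:
    """Map an OpenAI-style usage payload onto the normalized breakdown."""
    prompt = int(usage.get("prompt_tokens", 0))
    completion = int(usage.get("completion_tokens", 0))
    details = usage.get("prompt_tokens_details") or {}
    cached = int(details.get("cached_tokens", 0))  # subset of prompt_tokens
    total = int(usage.get("total_tokens", prompt + completion))
    if endpoint == "embeddings":
        # Assumed endpoint-aware rule: embeddings input goes to its own bucket.
        return TokenBreakdown(embeddings_tokens=prompt, total_tokens=total)
    return TokenBreakdown(prompt, completion, cached, 0, total)


def normalize_anthropic(usage: dict) -> TokenBreakdown:
    """Map an Anthropic-style usage payload onto the same breakdown."""
    cached = int(usage.get("cache_read_input_tokens", 0))
    created = int(usage.get("cache_creation_input_tokens", 0))
    # Anthropic reports cache tokens separately from input_tokens, so they are
    # added back here to match the OpenAI meaning of "prompt tokens".
    prompt = int(usage.get("input_tokens", 0)) + cached + created
    completion = int(usage.get("output_tokens", 0))
    return TokenBreakdown(prompt, completion, cached, 0, prompt + completion)
```

Keeping one normalized shape means the billing step does not need to know which provider produced the usage.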
Billing
Billing uses model pricing to produce a deterministic BillingTokens object.
Costs are computed in integer nanos USD:
1 USD = 1,000,000,000 nanos USD
The calculator supports:
- input text/image/audio/video/reasoning rates,
- output text/image/audio/video/reasoning rates,
- cached input discount,
- tool definition/input/output rates,
- embeddings rate,
- total cost in nanos and human USD fields.
If pricing is missing, usage can still be recorded, but cost fields may be absent.
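As a rough illustration of integer nanos arithmetic, the sketch below prices a text-only request. It assumes rates are quoted in nanos USD per million tokens, that cached tokens are a subset of prompt tokens, and that rounding is half up. The pricing keys and function names are hypothetical; the real BillingTokens object may be shaped differently.

```python
from typing import Optional

NANOS_PER_USD = 1_000_000_000
TOKENS_PER_UNIT = 1_000_000  # assumption: rates are nanos USD per million tokens


def cost_nanos(tokens: int, rate_nanos_per_million: int) -> int:
    """Integer-only cost, rounded half up so the result is reproducible."""
    return (tokens * rate_nanos_per_million + TOKENS_PER_UNIT // 2) // TOKENS_PER_UNIT


def format_usd(nanos: int) -> str:
    """Render non-negative nanos as a decimal USD string without floats."""
    return f"{nanos // NANOS_PER_USD}.{nanos % NANOS_PER_USD:09d}"


def bill(prompt: int, cached: int, completion: int,
         pricing: Optional[dict]) -> Optional[dict]:
    if pricing is None:
        return None  # usage is still recorded; cost fields stay absent
    input_nanos = cost_nanos(prompt - cached, pricing["input_text"])
    cached_nanos = cost_nanos(cached, pricing["cached_input"])  # discounted rate
    output_nanos = cost_nanos(completion, pricing["output_text"])
    total = input_nanos + cached_nanos + output_nanos
    return {
        "input_nanos": input_nanos,
        "cached_nanos": cached_nanos,
        "output_nanos": output_nanos,
        "total_nanos": total,
        "total_usd": format_usd(total),
    }
```

With hypothetical rates of 2.50 USD (input), 1.25 USD (cached input), and 10 USD (output) per million tokens, a request with 1,000 prompt tokens (400 of them cached) and 200 completion tokens costs 1,500,000 + 500,000 + 2,000,000 = 4,000,000 nanos, or 0.004 USD. Staying in integers avoids floating-point drift when many small costs are summed.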
Budgets
Budgets cap spend.
Fields:
- name
- ownerType: ORG, TEAM, USER, APIKEY
- owner IDs according to owner type
- period: DAILY, WEEKLY, MONTHLY, QUARTERLY
- optional startAt and endAt
- optional timezone
- currency
- amountNanosUsd
- rollover
- thresholds
- active
Only active budgets participate in enforcement.
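How a period and timezone translate into a concrete window is not spelled out here. The sketch below shows one plausible interpretation: calendar-aligned windows computed in the budget's timezone, with weeks starting on Monday. The function names and the Monday convention are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def _add_months(first_of_month: datetime, months: int) -> datetime:
    # Safe only because the day is always 1, which exists in every month.
    index = first_of_month.year * 12 + (first_of_month.month - 1) + months
    return first_of_month.replace(year=index // 12, month=index % 12 + 1)


def window_bounds(period: str, now: datetime,
                  tz: tzinfo = timezone.utc) -> tuple[datetime, datetime]:
    """Return the [start, end) window containing `now` (timezone-aware)."""
    local = now.astimezone(tz)
    day = local.replace(hour=0, minute=0, second=0, microsecond=0)
    if period == "DAILY":
        return day, day + timedelta(days=1)
    if period == "WEEKLY":
        start = day - timedelta(days=day.weekday())  # assumption: Monday start
        return start, start + timedelta(days=7)
    if period == "MONTHLY":
        start = day.replace(day=1)
        return start, _add_months(start, 1)
    if period == "QUARTERLY":
        first_month = 3 * ((day.month - 1) // 3) + 1
        start = day.replace(month=first_month, day=1)
        return start, _add_months(start, 3)
    raise ValueError(f"unknown period: {period}")
```

A named zone can be passed as `zoneinfo.ZoneInfo("Europe/Paris")`. The timezone matters because the same instant can fall on different calendar days, and therefore in different windows, depending on where the owner is.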
Quotas
Quotas cap usage metrics.
Fields:
- name
- ownerType: ORG, TEAM, USER, APIKEY
- owner IDs according to owner type
- metric: REQUESTS, TOKENS, TOKENS_IN, TOKENS_OUT, ERRORS, COST, LATENCY_MS
- limit
- period
- optional startAt and endAt
- optional timezone
- rollover
- active
Enforcement Flow
For LLM requests:
- Gateway authenticates and decodes enough request context.
- Gateway estimates usage from requested model, max tokens, pricing, and rate-limit receipt.
- budgetenforcer.Reserve creates a BudgetRequest.
- Active budgets and quotas for organisation, team/user, and API key scopes are collected.
- Budget and quota windows are upserted.
- Reserved counters are increased only if used + reserved + estimate <= limit.
- Reservation rows are inserted.
- The upstream request proceeds.
- Usage is normalized and billed.
- budgetenforcer.Settle replaces reserved values with actual values.
- If the request fails before completion, budgetenforcer.Release frees reservations.
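The reserve, settle, and release steps above can be sketched with in-memory counters. This is a simplified model, not the budgetenforcer implementation: the `Window` and `Enforcer` names are hypothetical, and a real gateway would presumably perform the check and the increment atomically in its database. The sketch assumes a reservation is all-or-nothing across every scope that applies to the request.

```python
from dataclasses import dataclass


@dataclass
class Window:
    limit: int
    used: int = 0
    reserved: int = 0


class Enforcer:
    def __init__(self, windows: dict[str, Window]):
        self.windows = windows  # scope name -> current window counters
        self.reservations: dict[str, dict[str, int]] = {}  # request ID -> rows

    def reserve(self, request_id: str, scopes: list[str], estimate: int) -> bool:
        targets = [s for s in scopes if s in self.windows]
        # Reject if any scope would exceed its limit; nothing is reserved then.
        for scope in targets:
            w = self.windows[scope]
            if w.used + w.reserved + estimate > w.limit:
                return False
        for scope in targets:
            self.windows[scope].reserved += estimate
        self.reservations[request_id] = {s: estimate for s in targets}
        return True

    def settle(self, request_id: str, actual: int) -> None:
        # Replace the reserved estimate with the actual billed value.
        for scope, amount in self.reservations.pop(request_id, {}).items():
            self.windows[scope].reserved -= amount
            self.windows[scope].used += actual

    def release(self, request_id: str) -> None:
        # Free the reservation without recording any usage.
        for scope, amount in self.reservations.pop(request_id, {}).items():
            self.windows[scope].reserved -= amount
```

Counting `reserved` alongside `used` is what prevents concurrent in-flight requests from jointly overshooting a limit before any of them has settled.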
For MCP requests:
- estimated cost is based on MCP pricing and request bytes;
- actual MCP cost uses input/output bytes and call pricing;
- usage sidecar records are stored in McpUsageRecord.
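The difference between the estimate and the actual MCP cost can be sketched as follows. The pricing shape (a flat per-call fee plus per-KiB rates for input and output) and the round-up rule are assumptions made for illustration; the real MCP pricing model may differ.

```python
def bytes_cost_nanos(byte_count: int, nanos_per_kib: int) -> int:
    """Byte-based cost in nanos USD, rounded up to the next whole nano."""
    return (byte_count * nanos_per_kib + 1023) // 1024


def estimate_mcp_nanos(request_bytes: int, pricing: dict) -> int:
    # Before the call only the request size is known, so output is not priced.
    return pricing["per_call_nanos"] + bytes_cost_nanos(
        request_bytes, pricing["input_nanos_per_kib"])


def actual_mcp_nanos(input_bytes: int, output_bytes: int, pricing: dict) -> int:
    # After the call both directions are measured and priced.
    return (
        pricing["per_call_nanos"]
        + bytes_cost_nanos(input_bytes, pricing["input_nanos_per_kib"])
        + bytes_cost_nanos(output_bytes, pricing["output_nanos_per_kib"])
    )
```

Under this model the estimate can be lower than the actual cost, because response bytes are unknown at reservation time; settling with the actual value corrects the counters.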
Idempotency
Budget operations are keyed by request ID:
- reserve twice is safe,
- settle twice is safe,
- release twice is safe.
This matters for retries and worker reconciliation.
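One way to get this behaviour is to store a state per request ID and make every operation a no-op unless the request is in the expected state. The sketch below uses a single counter and hypothetical names (`State`, `Ledger`); it illustrates the idea rather than the actual storage schema.

```python
from enum import Enum


class State(Enum):
    RESERVED = "reserved"
    SETTLED = "settled"
    RELEASED = "released"


class Ledger:
    def __init__(self, limit: int):
        self.limit = limit
        self.used = 0
        self.reserved = 0
        self.requests: dict[str, tuple[State, int]] = {}  # request ID -> (state, amount)

    def reserve(self, request_id: str, estimate: int) -> bool:
        if request_id in self.requests:
            return True  # replay: this sketch treats a known ID as already handled
        if self.used + self.reserved + estimate > self.limit:
            return False
        self.reserved += estimate
        self.requests[request_id] = (State.RESERVED, estimate)
        return True

    def settle(self, request_id: str, actual: int) -> None:
        state, amount = self.requests.get(request_id, (None, 0))
        if state is not State.RESERVED:
            return  # unknown, already settled, or released: nothing to do
        self.reserved -= amount
        self.used += actual
        self.requests[request_id] = (State.SETTLED, actual)

    def release(self, request_id: str) -> None:
        state, amount = self.requests.get(request_id, (None, 0))
        if state is not State.RESERVED:
            return
        self.reserved -= amount
        self.requests[request_id] = (State.RELEASED, 0)
```

Because each transition is guarded by the stored state, a retried HTTP request or a worker that races the gateway cannot double-count usage.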
Worker Reconciliation
The gateway starts a budget worker. It periodically:
- expires stale reserved requests,
- releases stale reservations,
- settles pending requests when a matching usage record exists.
This protects correctness when the gateway crashes after reserve but before settle.
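The worker's decision logic can be sketched as a pure function that looks at pending reservations and decides what to do with each. The names (`PendingRequest`, `plan_sweep`), the staleness threshold, and the shape of the usage lookup are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class PendingRequest:
    request_id: str
    reserved_at: datetime


def plan_sweep(
    pending: list[PendingRequest],
    usage_cost_nanos: dict[str, int],  # request ID -> cost from a usage record
    now: datetime,
    stale_after: timedelta,
) -> list[tuple[str, str, int]]:
    """Return (action, request ID, amount) tuples for one worker pass."""
    actions = []
    for req in pending:
        if req.request_id in usage_cost_nanos:
            # The upstream call finished and was recorded: settle with real cost.
            actions.append(("settle", req.request_id, usage_cost_nanos[req.request_id]))
        elif now - req.reserved_at >= stale_after:
            # No usage record and the reservation is old: free it.
            actions.append(("release", req.request_id, 0))
        # Otherwise the request may still be in flight; leave it alone.
    return actions
```

Checking for a usage record before checking staleness matters: a request that completed but was never settled should be charged, not silently released.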
Invoices
The current UI invoice surface builds previews and exports from usage data. It does not yet create persisted invoice records. Treat invoice pages as reporting over usage records, not as a finalized billing ledger.