fix(billing): audit P0 contract changes — pricing Option, BillingMode, DebitOutcome by FrozenArcher404 · Pull Request #463 · bitrouter/bitrouter

FrozenArcher404 · 2026-05-16T19:26:11Z

Six commits supporting a follow-up bitrouter-cloud audit-fix PR. All semver-affecting changes are in this PR so cloud's dep bump can land cleanly after the next release.

What's in here

fix(pricing): return Option when a required rate is missing (dedf41c)
bitrouter_core::pricing::calculate_cost and the bitrouter_api::mpp::calculate_usage_cost wrapper now return Option<f64>. The previous unwrap_or(0.0) short-circuit silently undercharged whenever a token bucket had nonzero usage but no rate (a provider with input pricing and output_tokens: {} placeholder, for instance, billed every output token at $0). The four /v1/chat/completions-shaped filter sites log pricing_unavailable and skip the deduct when None comes back — the recommender's skip-incomplete change on the cloud side is what should keep that from happening in practice.
feat(pricing): add ModelPricing::is_complete and reserved output buckets (95b64cf)
ModelPricing::is_complete() is the "safe to bill" predicate the cheapest-picker / recommender call to refuse placeholder entries. Adds reserved OutputTokenPricing.image / .audio fields (serde-round-tripped) so multimodal billing has a place to land without another breaking change.
feat(billing): explicit BillingMode on RoutingTarget (1b68281)
The filters previously inferred "is this a BYOK request?" from target.api_key_override.is_some(). Cloud-routed targets always carry a platform key, so that inference treated every cloud request as BYOK and the deduct gate was unconditionally skipped. New BillingMode { Cloud, Byok } field on RoutingTarget; defaults to Cloud so a forgotten field bills (safer than silently leaking). All four filters now read this field. All in-tree RoutingTarget constructors get an explicit billing_mode: BillingMode::default().
feat(billing): deduct returns DebitOutcome (Settled | Deferred) (65efb60)
Extends the PaymentGate::deduct contract. Deferred lets a gate signal "couldn't draw right now but recorded the debt for later collection" — callers MUST treat it as success, the response goes back unchanged. The credits-backed cloud gate uses this to insert a pending_debits row instead of failing the response. MppState's impl maps its existing success path to Settled (channels can't defer).
feat(mpp): export DebitOutcome from the public mpp module (01ee02c)
One-liner to make the new enum reachable from downstream cloud crates.
fix(stream): bill estimated output on mid-stream client disconnect (720bd77)
StreamObservation::outcome used to return Err whenever the client disconnected mid-stream, routing to on_request_failure with no billing — an attacker who drained a long generation and hung up before the Finish event paid nothing. Now: when Finish landed first, settle with its authoritative usage even if disconnect happened after; when disconnect beats Finish, return a synthesized usage with output tokens estimated from delta.len() / 4 so the request bills something. Input billing on disconnect is a known gap (prompt not plumbed through StreamObservation).

Compat notes

Three of these are technically API-breaking and want a minor bump:

RoutingTarget gains a billing_mode field — any downstream literal constructor needs to add it. Cushioned by BillingMode::default() returning Cloud.
calculate_cost / calculate_usage_cost now return Option<f64> — callers must handle None. Only known consumer is bitrouter-cloud.
PaymentGate::deduct returns Result<DebitOutcome, _> — trait impls update.

bitrouter-observe::ModelSpendObserver is updated in-tree to keep working with the new Option return (logs pricing_unavailable, writes cost=0 to the spend log rather than crashing).

Test plan

cargo nextest run — 968/968 pass
cargo clippy --all-targets -- -D warnings — clean
cargo fmt -- --check — clean
New unit tests in bitrouter-core::pricing::tests (Option / partial pricing / unused-bucket cases)
New unit tests in bitrouter-api::router::stream_observation_tests (disconnect after Finish, disconnect with deltas, disconnect with no output)
Integration / staging snapshot — paired with the bitrouter-cloud follow-up PR before either lands in production

🤖 Generated with Claude Code

calculate_cost previously unwrap_or(0.0)'d every absent rate, so a provider with only input pricing (no output_tokens.text) would bill every output token at $0 — silent undercharging. Callers had no way to distinguish "free model" from "we don't know how to bill this". calculate_cost (and the bitrouter-api mpp wrapper) now return Option<f64>. None means: at least one token bucket has nonzero usage and the matching rate is missing. The four chat/completions filter sites log pricing_unavailable and skip the deduct (response still goes out — the recommender skip-incomplete change is what should keep this from happening in practice). The spend-log observer records cost=0 and warns, since dashboard accuracy degrades but shouldn't block the request. Behaviour preserved when usage is genuinely empty (Some(0.0)) and when an unused bucket's rate happens to be missing (Some(rest)). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

is_complete returns true when no_cache + text are both set — the minimum a provider needs to be billable. Cleanest predicate for the recommender and the /v1/models cheapest-picker to refuse entries with placeholder pricing, complementing the per-bucket Option returned by calculate_cost. OutputTokenPricing gains reserved image/audio fields; they round-trip via serde and surface in /v1/models JSON if upstream catalogs ever publish them. No billing logic yet — that lands when matching usage buckets land in LanguageModelOutputTokens. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The chat/completions filters previously derived "is this a BYOK request?" from `target.api_key_override.is_some()` — which is unconditionally true for cloud-routed targets that fill in the platform credential. As a result the cloud's deduct gate was silently skipped for every routed request, and users were billed nothing. Add a `BillingMode { Cloud, Byok }` enum and a `billing_mode` field on `RoutingTarget`. Default is `Cloud` so a forgotten field still bills (safer to over-charge than to silently leak). All four filter sites (openai/chat, openai/responses, anthropic/messages, google/generate_content) now read this field instead of inferring billing from the api_key field. All in-tree RoutingTarget constructors get an explicit `billing_mode: BillingMode::default()` (cloud-billed). The cloud's RegistryRoutingTable and ByokOverlay will set this explicitly in a follow-up commit on the cloud side. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The post-flight deduct call used to return Result<(), VerificationError> and any Err was logged-and-swallowed by the chat/completions filters, so a user whose balance dropped to zero between auth and settlement still got the response for free. Extend the PaymentGate trait to return DebitOutcome: - Settled: the amount was deducted synchronously. - Deferred: the gate could not deduct right now (typically insufficient balance) but has recorded the debt for later collection. Callers MUST treat this as success — the response goes back to the client. MppState's impl always returns Settled on success (channels can't defer — they either have funds or don't, in which case Err propagates as before). The cloud's HybridPaymentGate will use Deferred in the cloud-side follow-up commit: insert a pending_debits row instead of Err on insufficient balance, returning Deferred so the response still ships while the gate's next verify_payment will refuse the caller until the debt clears. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

An attacker who drained a long generation and then disconnected before the upstream's Finish event landed used to pay nothing (StreamObservation::outcome returned Err on disconnect, which routed to on_request_failure and skipped settlement). StreamObservation now tracks bytes streamed across TextDelta / ReasoningDelta / ToolInputDelta and, on disconnect-without-Finish, returns a synthesized LanguageModelUsage with output_tokens estimated at ~4 bytes/token. This still under-bills (input tokens aren't tracked at this layer — the prompt isn't plumbed through StreamObservation; a follow-up should add that) but closes the "drain output then hang up" path. When the Finish event arrived before the disconnect, the authoritative usage from upstream now wins over the disconnect branch, so the response is billed in full rather than dropped. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The new DebitOutcome enum is part of the PaymentGate contract; downstream cloud crates need to reference it when implementing their own gate (HybridPaymentGate in bitrouter-cloud). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

CI's bitrouter-api matrix builds with --no-default-features and each payment feature in isolation. Without payments-tempo|payments-solana, the `byok_used = matches!(target.billing_mode, BillingMode::Byok)` sites compile out (they live inside cfg(any(payments-tempo, payments-solana))) but the unconditional BillingMode import remained visible — fires unused-imports, which CI treats as -D warnings. Split BillingMode into its own cfg-gated use across all four filter files. Verified each of the failing matrix combos (no-features, accounts, observe, guardrails) now `cargo check`s clean, plus payments-tempo, payments-solana, and --all-features. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Six fixes mapped from the v0 May-17 audit, with v0's links in the matching code comments: P0 — billing leaks - **Stream-disconnect bills $0** (v0 #463-E). `UsageAccumulator` only saw upstream `Usage` parts, so a client that drained a long generation and hung up before the trailing usage chunk paid nothing. Now also tracks `delta_chars` from `TextDelta` / `ReasoningDelta`; on `ClientDisconnected` with no upstream usage seen, `StreamProcessor::finish` synthesises a usage with `completion_tokens = ceil(chars / 4)` (the OpenAI ~4-chars/token heuristic). Prompt-token estimate is the documented v0 gap. Regression test: `disconnect_before_usage_bills_estimated_output`. - **MPP checkpoint lost-update race** (cloud #251 audit B7 / S5). `MppState::sign_checkpoint` previously read `last_checkpoint` and UPDATEd in separate statements; two concurrent debits with different `cumulative` values could compute deltas off the same `prev` and the second UPDATE would overwrite the first, leaking channel balance. Wrap read + compute + write in a transaction. P1 — billing-correctness / security - **Partial pricing silently undercharges** (v0 #463-A). `ModelPricing` rates are now `Option<f64>`; `calculate_charge_micro_usd` returns `Option<i64>` and reports `None` when any nonzero-usage bucket lacks a rate. `CreditCharge` / `MppCharge` emit `PricingUnavailable` and `Pass` — never silently bill the missing bucket at $0. Adds `ModelPricing::is_complete()` predicate + `partial(...)` constructor. - **Adversarial token clamp** (cloud #251 audit B9). New `MAX_TRUSTED_TOKENS = 10_000_000` cap inside `calculate_charge_micro_usd`; an adversarial upstream returning `u64::MAX` tokens can no longer drive the charge to overflow. - **SSRF defence for upstream `api_base`** (cloud #251 audit S3 / S4). New `bitrouter_sdk::url_validator::validate_upstream_url` rejects non-HTTP(S) schemes, metadata DNS aliases (`metadata.google.internal`, `instance-data.ec2.internal`), and literal IPs in link-local / private / carrier-grade-NAT / ULA / documentation ranges. Loopback (127/8, ::1, `localhost`) stays allowed for dev / self-hosted upstreams (Ollama, vLLM). Wired into `insert_byok_key` (rejects malicious BYOK rows) and into `config::parse_with` (rejects malicious `bitrouter.yaml` provider entries). P2 — hygiene - **Credential Debug redaction** (cloud #251 audit S9). Hand-written `Debug` impls on `RoutingTarget`, `ByokCredential`, and `ProviderConfig` redact the `api_key` / `api_key_override` fields, so a future `tracing::error!(?target, ...)` can't dump upstream or user keys into the log stream. 206 tests pass (was 193 + 13 new); clippy clean under `-D warnings`; fmt clean; doc warning-free; `cargo package --workspace` green. Out of scope for this pass (v1 doesn't need them or they need a contract change beyond the audit's intent): `DebitOutcome::Deferred` (no cloud-style deferred-debit recovery), `pending_debits` (same), SELECT FOR UPDATE on `credit_accounts` (v1's `deduct_credits` is already a single atomic UPDATE in a tx; safe under sqlx's serialised writes). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

takasaki404 and others added 6 commits May 17, 2026 02:57

github-actions Bot added the fix label May 16, 2026

FrozenArcher404 merged commit 98ff690 into main May 16, 2026
49 checks passed

FrozenArcher404 deleted the fix/audit-billing-and-pricing branch May 16, 2026 19:56

bitrouter-automation Bot mentioned this pull request May 16, 2026

chore(release): publish v0.33.0 #462

Merged

FrozenArcher404 restored the fix/audit-billing-and-pricing branch May 18, 2026 04:01

FrozenArcher404 deleted the fix/audit-billing-and-pricing branch May 18, 2026 04:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(billing): audit P0 contract changes — pricing Option, BillingMode, DebitOutcome#463

fix(billing): audit P0 contract changes — pricing Option, BillingMode, DebitOutcome#463
FrozenArcher404 merged 7 commits into
mainfrom
fix/audit-billing-and-pricing

FrozenArcher404 commented May 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

FrozenArcher404 commented May 16, 2026

What's in here

Compat notes

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant