Add provider_cache_read/write_input_tokens to Usage type #7032

Merged
AntoineToussaint merged 7 commits into main from feat/cache-tokens-1-usage-type
Mar 23, 2026
AntoineToussaint commented Mar 23, 2026

Summary

Adds provider_cache_read_input_tokens: Option<u32> and provider_cache_write_input_tokens: Option<u32> to the core Usage struct (crates/tensorzero-core/src/inference/types/usage.rs).

Production code changes

  • usage.rs: Two new fields on Usage, updated zero(), total_tokens(), and aggregation logic (lenient summation that preserves Some values)
  • mod.rs: aggregate_usage_across_model_inferences updated to propagate cache tokens
  • streams.rs: Streaming usage accumulation updated for cache tokens
  • Python client: Usage dataclass updated with new optional fields
  • TypeScript bindings: Usage.ts regenerated
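
The "lenient summation that preserves Some values" described above can be sketched roughly as follows. This is a minimal illustration, not the code from `usage.rs`: the struct is reduced to the two new fields, the helper names `lenient_add` and `aggregate_cache_tokens` are hypothetical, and the use of saturating addition is an assumption.

```rust
/// Reduced stand-in for the real `Usage` struct: only the two new cache
/// fields are shown; the derive list is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct Usage {
    pub provider_cache_read_input_tokens: Option<u32>,
    pub provider_cache_write_input_tokens: Option<u32>,
}

/// Lenient sum (hypothetical helper): the result is `None` only when both
/// inputs are `None`; a single `Some` value is preserved as-is.
/// Saturating addition is an assumption made here to avoid overflow panics.
pub fn lenient_add(a: Option<u32>, b: Option<u32>) -> Option<u32> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x.saturating_add(y)),
        (Some(x), None) | (None, Some(x)) => Some(x),
        (None, None) => None,
    }
}

/// Hypothetical aggregation over several usages, returning
/// `(cache_read_total, cache_write_total)`.
pub fn aggregate_cache_tokens(usages: &[Usage]) -> (Option<u32>, Option<u32>) {
    usages.iter().fold((None, None), |(read, write), u| {
        (
            lenient_add(read, u.provider_cache_read_input_tokens),
            lenient_add(write, u.provider_cache_write_input_tokens),
        )
    })
}
```

Under this sketch, a model inference that reports no cache data does not erase counts reported by the others, and a total stays `None` only when no inference reported that count at all.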

Test / mechanical changes

All other files (providers, variants, endpoints, evaluations, etc.) are mechanical: adding provider_cache_read_input_tokens: None, provider_cache_write_input_tokens: None to existing Usage {} constructors so the code compiles.
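
To illustrate why these edits are needed at all: a Rust struct literal must name every field, so adding fields to `Usage` breaks each existing `Usage { ... }` expression until the new fields are listed. The sketch below is illustrative only; the types of `input_tokens` and `output_tokens`, the `Default` derive, and both function names are assumptions, not taken from the PR.

```rust
/// Illustrative stand-in for `Usage`. The types of `input_tokens` and
/// `output_tokens` and the `Default` derive are assumptions.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct Usage {
    pub input_tokens: Option<u32>,
    pub output_tokens: Option<u32>,
    pub provider_cache_read_input_tokens: Option<u32>,
    pub provider_cache_write_input_tokens: Option<u32>,
}

/// The explicit style this PR describes: every constructor spells out
/// the new fields as `None`.
pub fn explicit_usage() -> Usage {
    Usage {
        input_tokens: Some(10),
        output_tokens: Some(20),
        provider_cache_read_input_tokens: None,
        provider_cache_write_input_tokens: None,
    }
}

/// An alternative using struct update syntax, shown for comparison only.
/// It compiles without naming the new fields, at the cost of hiding them.
pub fn defaulted_usage() -> Usage {
    Usage {
        input_tokens: Some(10),
        output_tokens: Some(20),
        ..Default::default()
    }
}
```

Both functions produce the same value. The explicit form is more verbose, but the compiler then flags every constructor whenever a field is added, which is what makes the change mechanical.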

What's NOT in this PR

PR Stack

  1. This PR — Usage type changes
  2. Add ClickHouse and Postgres migrations for cache token columns #7033 — ClickHouse + Postgres migrations
  3. Add cache token parsing for all providers #7034 — Provider type definitions + cache parsing
  4. Thread cache tokens through endpoints and OpenAI-compatible response #7035 — Endpoint threading + OpenAI-compatible response format
  5. Add e2e tests, docs, and test fixture updates for cache tokens #7036 — E2e tests, docs, test fixtures

Test plan

  • cargo check --all-targets --all-features
  • cargo clippy --all-targets --all-features -- -D warnings
  • cargo test-unit-fast (12 pre-existing failures unrelated)

🤖 Generated with Claude Code

Add two new Option<u32> fields to the Usage struct for tracking
provider-reported cache token counts. All existing constructors are
initialized with None. Aggregation helpers in mod.rs and streams.rs are
updated to propagate cache tokens through lenient summation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
AntoineToussaint and others added 2 commits March 23, 2026 11:19
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
virajmehta previously approved these changes Mar 23, 2026
The #[serde(default, skip_serializing_if)] on the new cache fields
is for backward compatibility with existing serialized data that
predates these fields.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Usage is never deserialized from JSON — fields are always constructed
in Rust code. Keep cache fields plain like input_tokens/output_tokens.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
virajmehta previously approved these changes Mar 23, 2026
Per AGENTS.md convention: omit optional fields from API responses
when None. Most responses won't have cache data, so avoid cluttering
every response with null fields.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
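
The effect of omitting `None` fields can be sketched without any dependencies. With serde this behavior normally comes from the field attribute `#[serde(skip_serializing_if = "Option::is_none")]`; the hand-written `to_json` function below is a hypothetical stand-in that only mimics the resulting output, and the struct is again reduced to the two cache fields.

```rust
/// Reduced stand-in for `Usage`, limited to the two cache fields.
pub struct Usage {
    pub provider_cache_read_input_tokens: Option<u32>,
    pub provider_cache_write_input_tokens: Option<u32>,
}

/// Hypothetical serializer that emits a key only when its value is `Some`,
/// mimicking what `skip_serializing_if = "Option::is_none"` does in serde.
pub fn to_json(usage: &Usage) -> String {
    let mut parts: Vec<String> = Vec::new();
    if let Some(n) = usage.provider_cache_read_input_tokens {
        parts.push(format!("\"provider_cache_read_input_tokens\":{}", n));
    }
    if let Some(n) = usage.provider_cache_write_input_tokens {
        parts.push(format!("\"provider_cache_write_input_tokens\":{}", n));
    }
    // `{{` and `}}` are escaped braces, so this wraps the parts in `{...}`.
    format!("{{{}}}", parts.join(","))
}
```

A response without cache data then serializes with no cache keys at all instead of carrying two `null` entries, which matches the motivation given in the commit message.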
AntoineToussaint added this pull request to the merge queue Mar 23, 2026
Merged via the queue into main with commit da77058 Mar 23, 2026
197 of 203 checks passed
AntoineToussaint deleted the feat/cache-tokens-1-usage-type branch March 23, 2026 22:30
3 participants