
fix: cast OpenRouter image generation float token counts to int#28070

Open
sharziki wants to merge 3 commits into
BerriAI:litellm_internal_staging from
sharziki:fix/openrouter-float-token-counts
Conversation

@sharziki

Summary

Fixes #28001

OpenRouter's API returns cost-weighted float token counts (e.g. 14417.92, 14474.92) for image generation models. These are passed directly to ImageUsage, whose Pydantic v2 model declares output_tokens: int and total_tokens: int. Pydantic refuses to coerce non-integral floats to int, so validation fails with int_from_float errors that bubble up as OpenrouterException and abort the response.
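The failure mode can be reproduced with a stand-in model declaring the same int-typed fields (ImageUsageSketch is a hypothetical stand-in, not litellm's actual class):

```python
from pydantic import BaseModel, ValidationError


class ImageUsageSketch(BaseModel):
    # stand-in for litellm's ImageUsage int-typed fields
    output_tokens: int
    total_tokens: int


# OpenRouter's cost-weighted floats fail Pydantic v2's int coercion
error_types = []
try:
    ImageUsageSketch(output_tokens=14417.92, total_tokens=14474.92)
except ValidationError as exc:
    error_types = [err["type"] for err in exc.errors()]
```

Pydantic v2 accepts integral floats like 14417.0 but reports non-integral ones as int_from_float errors, one per failing field — matching the "2 validation errors for ImageUsage" in the linked issue.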

Fix

Wrap the three token-count reads in _set_usage_and_cost() with int():

prompt_tokens = int(usage_data.get("prompt_tokens", 0))
total_tokens = int(usage_data.get("total_tokens", 0))
image_tokens = int(completion_tokens_details.get("image_tokens", 0))

The fractional parts are OpenRouter's internal accounting artifacts — truncating to int is safe and matches the integer semantics callers expect.
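As a quick illustration of the truncation semantics, using sample float values from the issue (the payload shape is an assumption mirroring _set_usage_and_cost's reads):

```python
# hypothetical OpenRouter usage payload with cost-weighted floats
usage_data = {
    "prompt_tokens": 0,
    "total_tokens": 14474.92,
    "completion_tokens_details": {"image_tokens": 14417.92},
}

# int() truncates toward zero, discarding the fractional accounting artifact
prompt_tokens = int(usage_data.get("prompt_tokens", 0))
total_tokens = int(usage_data.get("total_tokens", 0))
completion_tokens_details = usage_data.get("completion_tokens_details", {})
image_tokens = int(completion_tokens_details.get("image_tokens", 0))
```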

Test

Added test_transform_image_generation_response_float_token_counts that reproduces the exact error from the issue (float usage values) and asserts the parsed fields are integers.

All 31 tests pass.

shin-berri and others added 3 commits May 13, 2026 22:37
…_chunk

When Gemini streams thinking/reasoning chunks, the parsed
reasoning_content lives in response_obj["original_chunk"] — not in the
fresh model_response created by model_response_creator(). The existing
reasoning_content check (line 817-820) only inspects the empty
model_response, so thinking-only chunks are silently dropped.

Add a fallback check that inspects
response_obj["original_chunk"].choices[0].delta.reasoning_content.
Once is_chunk_non_empty returns True, return_processed_chunk_logic
already rebuilds choices from original_chunk.model_dump(), which
preserves reasoning_content — no further changes needed.
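The fallback check described above can be sketched with SimpleNamespace stand-ins (not litellm's actual chunk types):

```python
from types import SimpleNamespace

# hypothetical thinking-only chunk: the delta carries only reasoning_content
delta = SimpleNamespace(content=None, reasoning_content="thinking...")
original_chunk = SimpleNamespace(choices=[SimpleNamespace(delta=delta)])
response_obj = {"original_chunk": original_chunk}

# fallback check: inspect original_chunk when the freshly created
# model_response has no reasoning_content of its own
is_chunk_non_empty = (
    response_obj.get("original_chunk") is not None
    and len(getattr(response_obj["original_chunk"], "choices", [])) > 0
    and getattr(
        response_obj["original_chunk"].choices[0].delta, "reasoning_content", None
    )
    is not None
)
```

Without the fallback, a check against the empty model_response alone would report the chunk as empty and drop it.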

Closes BerriAI#28000

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
OpenRouter returns cost-weighted float token counts (e.g. 14417.92)
for image generation models. Pydantic v2's strict int validation
rejects these, raising `int_from_float` errors that abort the response.

Fixes BerriAI#28001

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
0 out of 2 committers have signed the CLA.

❌ shin-berri
❌ sharziki

@greptile-apps
Contributor

greptile-apps Bot commented May 16, 2026

Greptile Summary

This PR fixes a crash caused by OpenRouter returning float-valued token counts (e.g. 14417.92) for image generation models, which Pydantic v2 rejects when populating the int-typed fields of ImageUsage. It also bundles an unrelated streaming handler fix that prevents Gemini thinking chunks from being silently dropped when reasoning_content is carried only in response_obj["original_chunk"] rather than in the freshly built model_response.

  • transformation.py: wraps prompt_tokens, total_tokens, and image_tokens reads with int(), along with a reproduction test that asserts truncation to integer values.
  • streaming_handler.py: adds an or branch to is_chunk_non_empty to check original_chunk.choices[0].delta.reasoning_content, plus a corresponding test — this change is not described in the PR summary.

Confidence Score: 4/5

Both fixes are small and well-tested; the main risk is a None-guard gap in the int() casts and an undocumented bundled change.

The image generation fix is correct and well-covered by the reproduction test. The streaming handler change is a valid independent fix but undocumented in the PR description. The int() casts lack a None guard — if OpenRouter returns explicit null for a token field, int(None) raises TypeError.

litellm/llms/openrouter/image_generation/transformation.py (int() casts could crash on null token values); litellm/litellm_core_utils/streaming_handler.py (unrelated change with no PR description coverage)

Important Files Changed

Filename Overview
litellm/llms/openrouter/image_generation/transformation.py Adds int() casts around prompt_tokens, total_tokens, and image_tokens reads to fix Pydantic validation failures when OpenRouter returns floats
litellm/litellm_core_utils/streaming_handler.py Adds an OR branch to is_chunk_non_empty to detect reasoning_content in response_obj["original_chunk"] for Gemini thinking chunks; this change is unrelated to the PR description
tests/test_litellm/llms/openrouter/image_generation/test_openrouter_image_gen_transformation.py Adds test_transform_image_generation_response_float_token_counts reproducing the exact float-token-count issue and asserting int output
tests/test_litellm/litellm_core_utils/test_streaming_handler.py Adds test for Gemini thinking chunk detection via original_chunk; existing tests have cosmetic reformatting only (no logic changes)

Reviews (1): Last reviewed commit: "fix: cast OpenRouter float token counts ..."

Comment on lines +821 to +830
or (
response_obj.get("original_chunk") is not None
and len(getattr(response_obj["original_chunk"], "choices", [])) > 0
and getattr(
response_obj["original_chunk"].choices[0].delta,
"reasoning_content",
None,
)
is not None
)
Contributor


P2 Undocumented change bundled into unrelated PR

This new or branch — detecting reasoning_content on response_obj["original_chunk"] for Gemini thinking chunks — is a separate bug fix that isn't mentioned anywhere in the PR description or title. The PR is described solely as "fix: cast OpenRouter image generation float token counts to int." The streaming handler change and its accompanying test are independent work that should be documented (ideally in the PR description or in a separate PR) so reviewers can assess it with full context.

Comment on lines +206 to +210
  prompt_tokens = int(usage_data.get("prompt_tokens", 0))
  total_tokens = int(usage_data.get("total_tokens", 0))

  completion_tokens_details = usage_data.get("completion_tokens_details", {})
- image_tokens = completion_tokens_details.get("image_tokens", 0)
+ image_tokens = int(completion_tokens_details.get("image_tokens", 0))
Contributor


P2 int() raises TypeError when passed None. If OpenRouter ever returns "prompt_tokens": null (key present, value None), usage_data.get("prompt_tokens", 0) returns None (the default is only used when the key is absent), and int(None) crashes. The same applies to total_tokens and image_tokens. Using or 0 guards against that case.

Suggested change
- prompt_tokens = int(usage_data.get("prompt_tokens", 0))
- total_tokens = int(usage_data.get("total_tokens", 0))
- completion_tokens_details = usage_data.get("completion_tokens_details", {})
- image_tokens = completion_tokens_details.get("image_tokens", 0)
- image_tokens = int(completion_tokens_details.get("image_tokens", 0))
+ prompt_tokens = int(usage_data.get("prompt_tokens") or 0)
+ total_tokens = int(usage_data.get("total_tokens") or 0)
+ completion_tokens_details = usage_data.get("completion_tokens_details") or {}
+ image_tokens = int(completion_tokens_details.get("image_tokens") or 0)
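The reviewer's point rests on plain dict semantics and can be checked directly (values are hypothetical):

```python
usage_data = {"prompt_tokens": None}  # key present, value is JSON null

# dict.get's default applies only when the key is ABSENT, so this returns None
value = usage_data.get("prompt_tokens", 0)

# unguarded int() on None raises TypeError
crashed = False
try:
    int(value)
except TypeError:
    crashed = True

# `... or 0` falls back for any falsy value, including None
guarded = int(usage_data.get("prompt_tokens") or 0)
```

The trade-off: `or 0` also coerces an explicit `0.0` or empty value to 0, which is the desired behavior for token counts.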

@codecov

codecov Bot commented May 16, 2026

@oss-pr-review-agent-shin
Contributor

🤖 litellm-agent: This PR is currently BLOCKED from merge.

Score: 2/5

Why blocked:

Details: Score docked for: scope drift vs linked issue (Linked issue #28001 is about OpenRouter float token counts for image generation; the diff also modifies streaming_handler.py to fix Gemini thinking chunk detection, a change unrelated to the issue and not described in the PR body.); 1 unresolved reviewer concern (greptile).

Fix the issues above and push an update — the bot will re-review automatically.

Note: This bot is still in beta and might not always work as expected. Please share any feedback via Slack.


Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Error transforming OpenRouter image generation response: 2 validation errors for ImageUsage

3 participants