fix: cast OpenRouter image generation float token counts to int #28070

sharziki wants to merge 3 commits into
Conversation
[Infra] Promote internal staging to main
…_chunk

When Gemini streams thinking/reasoning chunks, the parsed `reasoning_content` lives in `response_obj["original_chunk"]`, not in the fresh `model_response` created by `model_response_creator()`. The existing `reasoning_content` check (lines 817-820) only inspects the empty `model_response`, so thinking-only chunks are silently dropped.

Add a fallback check that inspects `response_obj["original_chunk"].choices[0].delta.reasoning_content`. Once `is_chunk_non_empty` returns True, `return_processed_chunk_logic` already rebuilds choices from `original_chunk.model_dump()`, which preserves `reasoning_content`, so no further changes are needed.

Closes BerriAI#28000

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
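The fallback condition can be illustrated stand-alone with stdlib stand-ins (a minimal sketch: `SimpleNamespace` mimics the chunk objects the handler receives, and the attribute names follow the commit message; none of this imports litellm itself):

```python
from types import SimpleNamespace

# Hypothetical thinking-only chunk: no content, only reasoning_content.
delta = SimpleNamespace(content=None, reasoning_content="step 1: ...")
original_chunk = SimpleNamespace(choices=[SimpleNamespace(delta=delta)])
response_obj = {"original_chunk": original_chunk}

# The fallback check described above: inspect original_chunk's delta when
# the freshly created model_response carries no reasoning_content.
has_reasoning = (
    response_obj.get("original_chunk") is not None
    and len(getattr(response_obj["original_chunk"], "choices", [])) > 0
    and getattr(
        response_obj["original_chunk"].choices[0].delta, "reasoning_content", None
    )
    is not None
)
print(has_reasoning)  # True
```

Because every step uses `.get`/`getattr` with defaults, the same expression safely evaluates to False for chunks with no `original_chunk`, empty `choices`, or a delta without `reasoning_content`.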
OpenRouter returns cost-weighted float token counts (e.g. `14417.92`) for image generation models. Pydantic v2's strict int validation rejects these, raising `int_from_float` errors that abort the response.

Fixes BerriAI#28001

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Greptile Summary

This PR fixes a crash caused by OpenRouter returning float-valued token counts (e.g. `14417.92`) for image generation models.
Confidence Score: 4/5

Both fixes are small and well-tested; the main risk is a None-guard gap in the `int()` casts and an undocumented bundled change.

- The image generation fix is correct and well-covered by the reproduction test.
- The streaming handler change is a valid independent fix but is undocumented in the PR description.
- The `int()` casts lack a None guard: if OpenRouter returns explicit null for a token field, `int(None)` raises `TypeError`.

Files of note: `litellm/llms/openrouter/image_generation/transformation.py` (`int()` casts could crash on null token values); `litellm/litellm_core_utils/streaming_handler.py` (unrelated change with no PR description coverage).
| Filename | Overview |
|---|---|
| litellm/llms/openrouter/image_generation/transformation.py | Adds int() casts around prompt_tokens, total_tokens, and image_tokens reads to fix Pydantic validation failures when OpenRouter returns floats |
| litellm/litellm_core_utils/streaming_handler.py | Adds an OR branch to is_chunk_non_empty to detect reasoning_content in response_obj["original_chunk"] for Gemini thinking chunks; this change is unrelated to the PR description |
| tests/test_litellm/llms/openrouter/image_generation/test_openrouter_image_gen_transformation.py | Adds test_transform_image_generation_response_float_token_counts reproducing the exact float-token-count issue and asserting int output |
| tests/test_litellm/litellm_core_utils/test_streaming_handler.py | Adds test for Gemini thinking chunk detection via original_chunk; existing tests have cosmetic reformatting only (no logic changes) |
Reviews (1): Last reviewed commit: "fix: cast OpenRouter float token counts ..."
```python
or (
    response_obj.get("original_chunk") is not None
    and len(getattr(response_obj["original_chunk"], "choices", [])) > 0
    and getattr(
        response_obj["original_chunk"].choices[0].delta,
        "reasoning_content",
        None,
    )
    is not None
)
```
Undocumented change bundled into unrelated PR
This new `or` branch, which detects `reasoning_content` on `response_obj["original_chunk"]` for Gemini thinking chunks, is a separate bug fix that isn't mentioned anywhere in the PR description or title. The PR is described solely as "fix: cast OpenRouter image generation float token counts to int." The streaming handler change and its accompanying test are independent work that should be documented (ideally in the PR description or in a separate PR) so reviewers can assess it with full context.
```diff
 prompt_tokens = int(usage_data.get("prompt_tokens", 0))
 total_tokens = int(usage_data.get("total_tokens", 0))

 completion_tokens_details = usage_data.get("completion_tokens_details", {})
-image_tokens = completion_tokens_details.get("image_tokens", 0)
+image_tokens = int(completion_tokens_details.get("image_tokens", 0))
```
`int()` raises `TypeError` when passed `None`. If OpenRouter ever returns `"prompt_tokens": null` (key present, value None), `usage_data.get("prompt_tokens", 0)` returns `None` (the default is only used when the key is absent), and `int(None)` crashes. The same applies to `total_tokens` and `image_tokens`. Using `or 0` guards against that case.
```diff
-prompt_tokens = int(usage_data.get("prompt_tokens", 0))
-total_tokens = int(usage_data.get("total_tokens", 0))
-completion_tokens_details = usage_data.get("completion_tokens_details", {})
-image_tokens = int(completion_tokens_details.get("image_tokens", 0))
+prompt_tokens = int(usage_data.get("prompt_tokens") or 0)
+total_tokens = int(usage_data.get("total_tokens") or 0)
+completion_tokens_details = usage_data.get("completion_tokens_details") or {}
+image_tokens = int(completion_tokens_details.get("image_tokens") or 0)
```
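The `.get` default pitfall behind this suggestion can be checked directly in plain Python (a minimal illustration; the dict shape stands in for OpenRouter's usage payload):

```python
usage_data = {"prompt_tokens": None}  # key present, value is JSON null -> None

# The default only applies when the key is ABSENT, so .get still returns None:
value = usage_data.get("prompt_tokens", 0)
print(value)  # None

# int(None) raises TypeError; `or 0` covers both None and a missing key:
safe = int(usage_data.get("prompt_tokens") or 0)
missing = int({}.get("prompt_tokens") or 0)
print(safe, missing)  # 0 0
```

Note that `or 0` also coerces a legitimate `0.0` to `0`, which is harmless here since the cast would produce `0` anyway.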
Codecov Report: ❌ Patch coverage is
🤖 litellm-agent: This PR is currently BLOCKED from merge. Score: 2/5 ❌

Why blocked:
- Scope drift vs linked issue: linked issue #28001 is about OpenRouter float token counts for image generation, but the diff also modifies streaming_handler.py to fix Gemini thinking chunk detection, a change unrelated to the issue and not described in the PR body.
- 1 unresolved reviewer concern (greptile).

Fix the issues above and push an update; the bot will re-review automatically.
Summary
Fixes #28001
OpenRouter's API returns cost-weighted float token counts (e.g. `14417.92`, `14474.92`) for image generation models. These are passed directly to `ImageUsage`, whose Pydantic v2 model declares `output_tokens: int` and `total_tokens: int`. Pydantic's strict validation rejects the floats with `int_from_float` errors, which bubble up as `OpenrouterException` and abort the response.

Fix

Wrap the three token-count reads in `_set_usage_and_cost()` with `int()`.

The fractional parts are OpenRouter's internal accounting artifacts; truncating to int is safe and matches the integer semantics callers expect.
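Sketched stand-alone (the `usage_data` payload shape is an assumption based on the PR diff; the real code reads it off OpenRouter's HTTP response inside `_set_usage_and_cost()`):

```python
# Minimal sketch of the cast fix; usage_data mimics OpenRouter's "usage"
# object with cost-weighted float counts (values are illustrative).
usage_data = {
    "prompt_tokens": 14417.92,
    "total_tokens": 14474.92,
    "completion_tokens_details": {"image_tokens": 57.0},
}

prompt_tokens = int(usage_data.get("prompt_tokens", 0))
total_tokens = int(usage_data.get("total_tokens", 0))
completion_tokens_details = usage_data.get("completion_tokens_details", {})
image_tokens = int(completion_tokens_details.get("image_tokens", 0))

print(prompt_tokens, total_tokens, image_tokens)  # 14417 14474 57
```

`int()` truncates toward zero, so the fractional accounting residue is simply dropped rather than rounded.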
Test

Added `test_transform_image_generation_response_float_token_counts`, which reproduces the exact error from the issue (float usage values) and asserts the parsed fields are integers.

All 31 tests pass.