[None][fix] modify qwen3-next sampling stop_tokens #9331
Conversation
Signed-off-by: jiant <107457950+JadoTu@users.noreply.github.com>
Walkthrough: Adds model-specific logic to extend the stop-word vocabulary during sampling-parameter setup for the qwen3_next model. When generation_config.eos_token_id is a list of integers, EOS tokens not already present in the stop-word list are computed and appended.
Files changed (1): tensorrt_llm/sampling_params.py (1 hunk)
/bot run

PR_Github #25283 [ run ] triggered by Bot. Commit:
PR_Github #25283 [ run ] completed with state
/bot run

PR_Github #25311 [ run ] triggered by Bot. Commit:
PR_Github #25311 [ run ] completed with state
Signed-off-by: jiant <107457950+JadoTu@users.noreply.github.com>
/bot run

PR_Github #25417 [ run ] triggered by Bot. Commit:
PR_Github #25417 [ run ] completed with state
Signed-off-by: jiant <107457950+JadoTu@users.noreply.github.com>
Description
Modify qwen3-next sampling stop_tokens: during sampling-parameter setup, extend the stop-word list with any EOS tokens from generation_config.eos_token_id (when it is a list of integers) that are not already present.