
Conversation

@Shixiaowei02
Collaborator

@Shixiaowei02 Shixiaowei02 commented Jul 24, 2025

Summary by CodeRabbit

  • Refactor

    • Standardized cache transceiver backend identifiers to uppercase: DEFAULT, UCX, NIXL, MPI. Configurations and inputs were updated accordingly.
  • Tests

    • Updated unit and integration tests to use uppercase backend values across disaggregated configs and accuracy suites.
  • Chores

    • Aligned example configs and YAML generation to the new uppercase backend identifiers for consistency.

@Shixiaowei02 Shixiaowei02 requested a review from a team as a code owner July 24, 2025 05:59
@Shixiaowei02 Shixiaowei02 requested a review from nv-guomingz July 24, 2025 05:59
@coderabbitai
Contributor

coderabbitai bot commented Jul 24, 2025

📝 Walkthrough

Updated CacheTransceiverConfig.backend annotation to use uppercase Literal values ("DEFAULT","UCX","NIXL","MPI") and normalized all tests, example configs, and YAML fixtures to use the matching uppercase backend strings. No runtime logic, validation, or control-flow changes.

Changes

  • Core API (tensorrt_llm/llmapi/llm_args.py): Change the type hint for CacheTransceiverConfig.backend from Optional[Literal["default","ucx","nixl","mpi"]] to Optional[Literal["DEFAULT","UCX","NIXL","MPI"]].
  • Disaggregated test YAML configs (tests/integration/defs/disaggregated/test_configs/*): Replace backend literals with uppercase across many YAMLs (e.g., "default"→"DEFAULT", "ucx"→"UCX", "nixl"→"NIXL", "mpi"→"MPI").
  • Accuracy tests, inline configs (tests/integration/defs/accuracy/test_disaggregated_serving.py): Update inline cache_transceiver_config.backend string constants to uppercase (multiple test cases).
  • Disaggregated tests, Python fixtures (tests/integration/defs/disaggregated/test_disaggregated_etcd.py, tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py): Update hard-coded YAML content and test constructor arguments to use uppercase backend strings.
  • Unit tests, LLM args (tests/unittest/llmapi/test_llm_args.py): Adjust test inputs and assertions to use uppercase backend values (e.g., "ucx"→"UCX").
  • Example configs (examples/disaggregated/disagg_config.yaml): Change cache_transceiver_config.backend occurrences to uppercase.
  • Slurm generator (examples/disaggregated/slurm/benchmark/gen_yaml.py): Update the generated backend strings from lowercase to uppercase.
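All of the cohorts above reduce to the same casing migration. As a hedged sketch (the helper name and the standalone `Backend` alias are illustrative, not from the PR; only the four backend strings come from the summary above), downstream scripts could upgrade legacy lowercase values like this:

```python
from typing import Literal, Optional, get_args

# Mirrors the updated annotation described above; the alias name is ours.
Backend = Optional[Literal["DEFAULT", "UCX", "NIXL", "MPI"]]

# Recover the allowed strings at runtime from the Literal itself,
# so this set can never drift from the annotation.
_literal = get_args(Backend)[0]          # the Literal[...] member of the Optional
ALLOWED = frozenset(get_args(_literal))  # {"DEFAULT", "UCX", "NIXL", "MPI"}

def migrate_backend(value: Optional[str]) -> Optional[str]:
    """Uppercase a legacy lowercase backend string, rejecting unknown values."""
    if value is None:
        return None
    upper = value.upper()
    if upper not in ALLOWED:
        raise ValueError(f"unknown cache transceiver backend: {value!r}")
    return upper
```

A one-off script applying this to existing YAML files would cover the config churn listed in the table.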

Sequence Diagram(s)

(omitted — changes are declarative value/casing updates only, no control-flow or feature behavior to diagram)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~15 minutes

Suggested reviewers

  • nv-guomingz
  • qiaoxj07
  • Tabrizian
  • pcastonguay
  • chzblych


@coderabbitai coderabbitai bot requested review from kaiyux and pcastonguay July 24, 2025 05:59
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
tensorrt_llm/llmapi/llm_args.py (2)

863-866: Consider the trade-off between compile-time safety and runtime flexibility.

Changing from Optional[Literal["default", "ucx", "nixl", "mpi"]] to Optional[str] removes compile-time type checking. While the runtime validator compensates for this, invalid values will only be caught at runtime rather than during development or static analysis.

Consider if the flexibility gained (case-insensitive validation) is worth losing compile-time safety. If case-insensitive validation is the primary goal, you could alternatively keep the Literal type and add a field validator that converts input to lowercase before validation.


872-881: Good validator implementation with minor considerations.

The validator correctly implements case-insensitive validation with clear error messaging. A few observations:

  1. The order of valid backends differs from the original Literal order (["default", "ucx", "nixl", "mpi"] vs ["default", "ucx", "mpi", "nixl"]) - consider maintaining consistency.

  2. The truthy check if self.backend: will skip validation for empty strings, which may be intended but differs from None handling.

Consider sorting the valid_backends list alphabetically for consistency:

-        valid_backends = ["default", "ucx", "mpi", "nixl"]
+        valid_backends = ["default", "mpi", "nixl", "ucx"]
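Both observations can be folded into one validator. Below is a dependency-free sketch, with a plain dataclass standing in for the actual pydantic model (pydantic's field validator would play the same role), that sorts the valid list, distinguishes None from the empty string rather than relying on a truthy check, and normalizes case:

```python
from dataclasses import dataclass
from typing import Optional

VALID_BACKENDS = ["default", "mpi", "nixl", "ucx"]  # sorted alphabetically

@dataclass
class CacheTransceiverConfig:
    backend: Optional[str] = None

    def __post_init__(self) -> None:
        # Explicit None check: an empty string is NOT silently skipped,
        # unlike the truthy `if self.backend:` form flagged above.
        if self.backend is not None:
            normalized = self.backend.lower()
            if normalized not in VALID_BACKENDS:
                raise ValueError(
                    f"Invalid backend: {self.backend!r}; "
                    f"expected one of {VALID_BACKENDS}")
            self.backend = normalized
```

With this shape, "UCX" and "ucx" are both accepted and normalized, while "" fails fast with a clear message.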
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 428e340 and 7ce3cce.

📒 Files selected for processing (2)
  • examples/disaggregated/README.md (2 hunks)
  • tensorrt_llm/llmapi/llm_args.py (2 hunks)
🧰 Additional context used
🧠 Learnings (2)
tensorrt_llm/llmapi/llm_args.py (1)

Learnt from: amitz-nv
PR: #5616
File: tensorrt_llm/executor/worker.py:375-384
Timestamp: 2025-07-17T09:01:27.402Z
Learning: In tensorrt_llm/executor/worker.py, the LoRA adapter cache optimization logic that checks is_adapter_in_cpu_cache() and conditionally passes None for weights/config has a known race condition issue that cannot be solved with simple error handling or verification checks. This is a known limitation that requires a more comprehensive solution.

examples/disaggregated/README.md (1)

Learnt from: yechank-nvidia
PR: #6254
File: tensorrt_llm/_torch/pyexecutor/model_engine.py:1201-1204
Timestamp: 2025-07-22T09:22:14.726Z
Learning: In TensorRT-LLM's multimodal processing pipeline, shared tensor recovery using from_shared_tensor() is only needed during the context phase. Generation requests reuse the already-recovered tensor data and only need to call strip_for_generation() to remove unnecessary multimodal data while preserving the recovered tensors. This avoids redundant tensor recovery operations during generation.

🔇 Additional comments (3)
examples/disaggregated/README.md (3)

19-41: Excellent documentation improvement!

The restructuring significantly enhances clarity by:

  • Presenting configuration file contents in proper YAML format instead of inline shell commands
  • Using shorter, more intuitive file names (ctx_extra-llm-api-config.yml vs context_extra-llm-api-config.yml)
  • Moving the overlap scheduler explanation into the YAML as a contextual comment
  • Clearly separating configuration file creation from server startup commands

This makes the setup process much more straightforward for users.


44-51: Commands updated consistently with new config file names.

The server startup commands correctly reference the new config file names while preserving all functional parameters. The clear section headers improve readability.


113-114: Dynamic scaling commands updated consistently.

The commands correctly use the new config file names while maintaining all functional parameters for the dynamic scaling feature.

@Shixiaowei02 Shixiaowei02 force-pushed the user/xiaoweis/doc branch 2 times, most recently from fd64e14 to 836311d on July 24, 2025 06:05
@Shixiaowei02
Collaborator Author

/bot --help

@github-actions

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

Details

run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental)]

Launch build/test pipelines. All previously running jobs will be killed.

--reuse-test (optional)pipeline-id (OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline or the last pipeline if no pipeline-id is indicated. If the Git commit ID has changed, this option will be always ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.

--disable-reuse-test (OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-PyTorch-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-PyTorch-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--test-backend "pytorch, cpp" (OPTIONAL) : Skip test stages which don't match the specified backends. Only support [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx".

--detailed-log (OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.

--debug (OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purpose. Note: Specify exactly one stage in the stage-list parameter to access the appropriate container environment. Note: Does NOT update GitHub check status.

kill

kill

Kill all running builds associated with pull request.

skip

skip --comment COMMENT

Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

@Shixiaowei02
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #12812 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #12812 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #9546 completed with status: 'FAILURE'

@Shixiaowei02
Collaborator Author

/bot run

@Shixiaowei02 Shixiaowei02 changed the title fix: mismatch between docs and actual commands fix: [BREAKING CHANGE] Mismatch between docs and actual commands Jul 28, 2025
@coderabbitai coderabbitai bot added the "Community want to contribute" (PRs initiated from Community) and "Doc" (TRTLLM's textual/illustrative materials: API refs, guides, tutorials; improvement & clarity) labels Jul 28, 2025
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🔭 Outside diff range comments (1)
examples/disaggregated/README.md (1)

126-131: Fix typo: refersh_interval → refresh_interval

The key name in the metadata-server config is misspelled, which will break parsing at runtime.

-refersh_interval: 10.0
+refresh_interval: 10.0
🧹 Nitpick comments (2)
benchmarks/cpp/README.md (1)

338-342: Add language identifier to fenced block

The fenced block that contains the export and mpirun commands is missing a language hint, triggering MD040 warnings and losing syntax highlighting.

-```
+```bash
 export TRTLLM_USE_UCX_KVCACHE=1
 mpirun -n ${proc} benchmarks/disaggServerBenchmark --context_engine_dirs ${context_engine_0},${context_engine_1}...,${context_engine_{m-1}} \
 --generation_engine_dirs ${generation_engine_0},${generation_engine_1}...,${generation_engine_{n-1}} --dataset ${dataset_path}

examples/disaggregated/README.md (1)

96-100: Specify language for the client invocation block

This fenced block is missing a language specifier, again tripping MD040. Add bash (or shell) so linters are quiet and readers get highlighting.

-```
+```bash
 python3 ./clients/disagg_client.py -c disagg_config.yaml -p ./clients/prompts.json -e chat

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 836311d189385e626704ffeda5a906180f45fdbe and 8a5d19719e65ed19d683926c26c8f54dde31d019.

📒 Files selected for processing (31)

  • benchmarks/cpp/README.md (1 hunks)
  • docs/source/advanced/disaggregated-service.md (0 hunks)
  • examples/cpp/executor/README.md (1 hunks)
  • examples/disaggregated/README.md (3 hunks)
  • examples/disaggregated/slurm/gen_yaml.py (2 hunks)
  • tensorrt_llm/llmapi/llm_args.py (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse_deepseek_v3.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_conditional.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp_attention_dp_overlap.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_two_mtp.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1_trt_backend.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one_mtp.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap_cuda_graph.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_overlap_cuda_graph.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cuda_graph_padding.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_trt_backend.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_load_balance.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_mixed.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_overlap.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_trt_backend.yaml (1 hunks)

💤 Files with no reviewable changes (1)

  • docs/source/advanced/disaggregated-service.md

✅ Files skipped from review due to trivial changes (28)

  • tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_mixed.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1_trt_backend.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_conditional.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_two_mtp.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp.yaml
  • examples/disaggregated/slurm/gen_yaml.py
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_overlap_cuda_graph.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse_deepseek_v3.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp_attention_dp_overlap.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cuda_graph_padding.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one_mtp.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap_cuda_graph.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_overlap.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_trt_backend.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_trt_backend.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_load_balance.yaml
  • examples/cpp/executor/README.md
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite.yaml
  • tensorrt_llm/llmapi/llm_args.py


🧰 Additional context used
🧠 Learnings (2)
benchmarks/cpp/README.md (1)

Learnt from: yechank-nvidia
PR: NVIDIA/TensorRT-LLM#6254
File: tensorrt_llm/_torch/pyexecutor/model_engine.py:1201-1204
Timestamp: 2025-07-22T09:22:14.726Z
Learning: In TensorRT-LLM's multimodal processing pipeline, shared tensor recovery using `from_shared_tensor()` is only needed during the context phase. Generation requests reuse the already-recovered tensor data and only need to call `strip_for_generation()` to remove unnecessary multimodal data while preserving the recovered tensors. This avoids redundant tensor recovery operations during generation.

examples/disaggregated/README.md (2)

Learnt from: yechank-nvidia
PR: NVIDIA/TensorRT-LLM#6254
File: tensorrt_llm/_torch/pyexecutor/model_engine.py:1201-1204
Timestamp: 2025-07-22T09:22:14.726Z
Learning: In TensorRT-LLM's multimodal processing pipeline, shared tensor recovery using `from_shared_tensor()` is only needed during the context phase. Generation requests reuse the already-recovered tensor data and only need to call `strip_for_generation()` to remove unnecessary multimodal data while preserving the recovered tensors. This avoids redundant tensor recovery operations during generation.

Learnt from: amitz-nv
PR: NVIDIA/TensorRT-LLM#5616
File: tensorrt_llm/executor/worker.py:375-384
Timestamp: 2025-07-17T09:01:27.402Z
Learning: In tensorrt_llm/executor/worker.py, the LoRA adapter cache optimization logic that checks `is_adapter_in_cpu_cache()` and conditionally passes None for weights/config has a known race condition issue that cannot be solved with simple error handling or verification checks. This is a known limitation that requires a more comprehensive solution.

🪛 markdownlint-cli2 (0.17.2)

benchmarks/cpp/README.md

346-346: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

examples/disaggregated/README.md

28-28: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

45-45: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

113-113: Fenced code blocks should have a language specified

(MD040, fenced-code-language)



@tensorrt-cicd
Collaborator

PR_Github #13175 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #13175 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #9867 completed with status: 'FAILURE'

@tensorrt-cicd
Collaborator

PR_Github #14990 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #11319 completed with status: 'FAILURE'

@Shixiaowei02
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #15083 [ run ] triggered by Bot

@Shixiaowei02 Shixiaowei02 changed the title [None][fix] BREAKING CHANGE: Mismatch between docs and actual commands [TRTLLM-7030][fix] BREAKING CHANGE: Mismatch between docs and actual commands Aug 13, 2025
@tensorrt-cicd
Collaborator

PR_Github #15083 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #11387 completed with status: 'FAILURE'

@Shixiaowei02
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #15113 [ run ] triggered by Bot

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🔭 Outside diff range comments (2)
examples/disaggregated/slurm/benchmark/gen_yaml.py (2)

9-10: Fix Python 3.8-incompatible generic annotations (use typing.Tuple instead of built-in tuple[]).

Our guidelines require Python 3.8+, but PEP 585 built-in generics (tuple[int, ...]) are only valid in 3.9+. This will raise at import time on 3.8 unless future annotations are enabled. Replace with typing.Tuple.

Apply this diff within the selected lines:

-def process_node_and_task() -> tuple[int, List[str], List[str]]:
+def process_node_and_task() -> Tuple[int, List[str], List[str]]:

Also add Tuple to the typing imports near Line 4 (outside the selected range):

from typing import Dict, List, Tuple

85-86: Fix Python 3.8-incompatible return annotation in generate_urls.

Same issue as above; change tuple[List[str], int] to Tuple[List[str], int].

Apply this diff within the selected lines:

-                  task_nodes_offset: int = 0) -> tuple[List[str], int]:
+                  task_nodes_offset: int = 0) -> Tuple[List[str], int]:

Ensure you’ve added:
from typing import Tuple
to the existing typing imports, as noted earlier.
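For reference, a minimal 3.8-compatible sketch of the two signatures (the bodies here are placeholders and the trimmed parameter list is illustrative; the real logic lives in gen_yaml.py):

```python
from typing import List, Tuple

def process_node_and_task() -> Tuple[int, List[str], List[str]]:
    # Placeholder body; the real function inspects the SLURM node/task layout.
    return 0, [], []

def generate_urls(task_nodes_offset: int = 0) -> Tuple[List[str], int]:
    # Placeholder body; the real function builds server URL lists.
    return [], task_nodes_offset
```

Unlike the PEP 585 built-in generics, typing.Tuple is subscriptable on Python 3.8, so these annotations evaluate cleanly at import time without `from __future__ import annotations`.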

♻️ Duplicate comments (2)
tests/unittest/llmapi/test_llm_args.py (2)

669-672: Fix Ruff F405 (star-import) and follow namespace import guideline for CacheTransceiverConfig.

Use the module namespace to avoid F405 and adhere to the coding guideline.

Apply this diff within the selected lines:

-        config = CacheTransceiverConfig(backend="UCX",
+        config = llm_args.CacheTransceiverConfig(backend="UCX",
             max_tokens_in_buffer=1024)
         assert config.backend == "UCX"

Add this import near the other imports at the top of the file (outside selected range):

import tensorrt_llm.llmapi.llm_args as llm_args

677-677: Fix Ruff F405 here as well (use namespaced class).

Mirror the change above for the invalid-argument case.

Apply this diff within the selected lines:

-            CacheTransceiverConfig(backend="UCX", invalid_config="should_fail")
+            llm_args.CacheTransceiverConfig(backend="UCX", invalid_config="should_fail")
🧹 Nitpick comments (3)
examples/disaggregated/slurm/benchmark/gen_yaml.py (1)

1-1: Add NVIDIA copyright header to comply with repository standards.

Per coding guidelines, all Python sources should include the NVIDIA header.

Add this at the top of the file:

# Copyright (c) 2025, NVIDIA CORPORATION.  All rights reserved.
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1.yaml (2)

13-13: Nit: quote DEFAULT to keep YAML formatting consistent with other configs.

Other files quote the backend string; consistency reduces churn and ambiguity.

Apply this diff:

-    backend: DEFAULT
+    backend: "DEFAULT"

21-21: Nit: quote DEFAULT to keep YAML formatting consistent with other configs (generation_servers).

Same as the previous suggestion.

Apply this diff:

-    backend: DEFAULT
+    backend: "DEFAULT"
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c9dde14 and 2ccc7e4.

📒 Files selected for processing (44)
  • examples/disaggregated/disagg_config.yaml (1 hunks)
  • examples/disaggregated/slurm/benchmark/gen_yaml.py (2 hunks)
  • tensorrt_llm/llmapi/llm_args.py (1 hunks)
  • tests/integration/defs/accuracy/test_disaggregated_serving.py (13 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance_deepseek_v3.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse_deepseek_v3.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_conditional.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_genpp2.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_gentp2.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp4_genpp4.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp_attention_dp_overlap.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_two_mtp.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_genpp2.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1_trt_backend.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one_mtp.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap_cuda_graph.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_mpi.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_nixl.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_overlap_cuda_graph.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_ucx.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2pp2_gentp2pp2.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cuda_graph_padding.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_diff_max_tokens.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_trt_backend.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_load_balance.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_mixed.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ngram.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_overlap.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_torch_sampler.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_trt_backend.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_disaggregated_etcd.py (1 hunks)
  • tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py (3 hunks)
  • tests/unittest/llmapi/test_llm_args.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (39)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp4_genpp4.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_trt_backend.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_overlap_cuda_graph.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_torch_sampler.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse_deepseek_v3.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one_mtp.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2pp2_gentp2pp2.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml
  • tests/integration/defs/disaggregated/test_disaggregated_etcd.py
  • examples/disaggregated/disagg_config.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_genpp2.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_gentp2.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_diff_max_tokens.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_conditional.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_two_mtp.yaml
  • tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_mixed.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_nixl.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_overlap.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_ucx.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_mpi.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_genpp2.yaml
  • tests/integration/defs/accuracy/test_disaggregated_serving.py
  • tensorrt_llm/llmapi/llm_args.py
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cuda_graph_padding.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp_attention_dp_overlap.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_load_balance.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ngram.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1_trt_backend.yaml
  • tests/integration/defs/disaggregated/test_configs/disagg_config_trt_backend.yaml
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

**/*.py: Python code must target Python 3.8+
Python indentation: 4 spaces, no tabs
Maintain module namespace in imports (from package.subpackage import foo; then use foo.SomeClass())
Python file names use snake_case
Python class names use PascalCase
Python functions/methods and local variables use snake_case; variables starting with a number get k_ prefix (e.g., k_99th_percentile)
Global variables use G_ prefixed UPPER_SNAKE_CASE (e.g., G_MY_GLOBAL)
Constants use UPPER_SNAKE_CASE in Python
Avoid shadowing variables from outer scopes in Python
Initialize all externally visible members of a Python class in `__init__`
Prefer docstrings for interfaces used outside a file; comments for local code
Use Google-style docstrings for classes and functions (Sphinx-parsable)
Document attributes/variables inline with short docstrings
Avoid reflection when simple alternatives exist (e.g., prefer explicit parameters over dict(**locals()))
In try/except, catch the narrowest exceptions possible
For duck-typing with try/except, keep try body minimal and put logic in else

Files:

  • examples/disaggregated/slurm/benchmark/gen_yaml.py
  • tests/unittest/llmapi/test_llm_args.py
**/*.{cpp,cxx,cc,cu,h,hpp,hxx,hh,cuh,py}

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

Prepend NVIDIA copyright header (current year) to all source files

Files:

  • examples/disaggregated/slurm/benchmark/gen_yaml.py
  • tests/unittest/llmapi/test_llm_args.py
🪛 Ruff (0.12.2)
tests/unittest/llmapi/test_llm_args.py

669-669: CacheTransceiverConfig may be undefined, or defined from star imports

(F405)


677-677: CacheTransceiverConfig may be undefined, or defined from star imports

(F405)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pre-commit Check
🔇 Additional comments (7)
examples/disaggregated/slurm/benchmark/gen_yaml.py (2)

198-201: LGTM: backend normalized to uppercase "DEFAULT" (context_servers).

Matches the updated Literal type and keeps docs/code aligned.


226-229: LGTM: backend normalized to uppercase "DEFAULT" (generation_servers).

Consistent with the rest of the PR’s uppercase normalization.
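
For context, the generated server sections presumably take a shape like the sketch below; `build_server_section` and the surrounding keys are assumptions, and only the uppercase `backend` value reflects the reviewed change:

```python
from typing import Any, Dict

def build_server_section(num_instances: int) -> Dict[str, Any]:
    # Hypothetical shape of one generated server section; only the
    # cache_transceiver_config entry mirrors the reviewed change.
    return {
        "num_instances": num_instances,
        "cache_transceiver_config": {
            # Uppercase per the new Literal["DEFAULT", "UCX", "NIXL", "MPI"].
            "backend": "DEFAULT",
        },
    }
```

Emitting the uppercase string at generation time keeps the YAML consumers and the Pydantic-side Literal in lockstep.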

tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance_deepseek_v3.yaml (3)

20-21: LGTM: backend set to "DEFAULT" (context_servers).

Conforms to the updated uppercase Literal contract.


36-37: LGTM: backend set to "DEFAULT" (generation_servers).

Consistent with the new typing and other configs.


1-1: No lingering lowercase backend values detected
A repository-wide search confirms that all backend entries use uppercase (DEFAULT, UCX, NIXL, MPI). There are no remaining lowercase occurrences of default, ucx, nixl, or mpi.
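
The repository-wide search described above can be sketched as a small script; the function name, file pattern, and regex are assumptions, not the reviewer's actual tooling:

```python
import re
from pathlib import Path
from typing import List

# Hypothetical sweep mirroring the search described above: flag any YAML
# line that still sets a lowercase backend value.
LOWERCASE_BACKEND = re.compile(r'backend:\s*"?(default|ucx|nixl|mpi)"?\s*$')

def find_lowercase_backends(root: str) -> List[str]:
    """Return "path:line: text" entries for every lowercase backend value."""
    hits = []
    for path in sorted(Path(root).rglob("*.yaml")):
        for lineno, line in enumerate(path.read_text().splitlines(), 1):
            if LOWERCASE_BACKEND.search(line):
                hits.append(f"{path}:{lineno}: {line.strip()}")
    return hits
```

An empty result over the test-config tree corresponds to the "no lingering lowercase values" finding.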

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap_cuda_graph.yaml (2)

13-13: LGTM: backend normalized to DEFAULT (context_servers).

Matches the Literal["DEFAULT","UCX","NIXL","MPI"] requirement.


25-25: LGTM: backend normalized to DEFAULT (generation_servers).

Aligned with the rest of the PR.

@tensorrt-cicd
Copy link
Collaborator

PR_Github #15113 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #11410 completed with status: 'FAILURE'

Signed-off-by: ShiXiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
@Shixiaowei02
Copy link
Collaborator Author

/bot run

@tensorrt-cicd
Copy link
Collaborator

PR_Github #15142 [ run ] triggered by Bot

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
tests/unittest/llmapi/test_llm_args.py (1)

669-672: Fix Ruff F405 and adhere to namespace-import guideline without wide refactor

Use a namespaced import for llm_args and reference CacheTransceiverConfig via that namespace to satisfy the “maintain module namespace” rule and remove F405 at these sites. Avoid alias “llm_args” to prevent shadowing local variables named llm_args in other tests.

Apply within the selected lines:

-        config = CacheTransceiverConfig(backend="UCX",
+        config = llm_args_mod.CacheTransceiverConfig(backend="UCX",
             max_tokens_in_buffer=1024)
         assert config.backend == "UCX"
...
-            CacheTransceiverConfig(backend="UCX", invalid_config="should_fail")
+            llm_args_mod.CacheTransceiverConfig(backend="UCX", invalid_config="should_fail")

Add this import near the other imports at the top of the file (outside the selected range):

import tensorrt_llm.llmapi.llm_args as llm_args_mod

Optionally, in a follow-up, consider replacing the star import `from tensorrt_llm.llmapi.llm_args import *` with explicit names or with the llm_args_mod namespace throughout the file to fully eliminate F405 risks elsewhere.

Also applies to: 677-677
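
The F405 class of problems can also be caught with a tiny AST check, independent of any linter; this is a sketch, not the project's tooling:

```python
import ast
from typing import List

def find_star_imports(source: str) -> List[str]:
    """Return the module names star-imported by a Python source string."""
    tree = ast.parse(source)  # parses only; never executes or imports
    return [
        node.module or ""
        for node in ast.walk(tree)
        if isinstance(node, ast.ImportFrom)
        and any(alias.name == "*" for alias in node.names)
    ]
```

Running it over a test file tells you whether any star import remains that could leave names like CacheTransceiverConfig implicitly defined.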

🧹 Nitpick comments (2)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_nixl.yaml (1)

12-12: Optional: unify YAML quoting style

Other configs use unquoted enum values; consider dropping quotes here for consistency. Both are valid YAML; this is purely stylistic.

Apply this minimal diff:

-    backend: "NIXL"
+    backend: NIXL

Also applies to: 20-20
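
Both spellings parse to the same string, so the choice is indeed cosmetic; the two documents below are equivalent after loading:

```yaml
# Quoted and unquoted forms both yield the string NIXL after parsing.
cache_transceiver_config:
  backend: "NIXL"
---
cache_transceiver_config:
  backend: NIXL
```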

tests/unittest/llmapi/test_llm_args.py (1)

669-669: Add NVIDIA copyright header (current year)

Per coding guidelines, prepend the standard NVIDIA header to Python sources.

Add at the very top of the file (outside the selected range):

# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2ccc7e4 and e4a6504.

📒 Files selected for processing (44)
  • examples/disaggregated/disagg_config.yaml (1 hunks)
  • examples/disaggregated/slurm/benchmark/gen_yaml.py (2 hunks)
  • tensorrt_llm/llmapi/llm_args.py (1 hunks)
  • tests/integration/defs/accuracy/test_disaggregated_serving.py (13 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance_deepseek_v3.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse_deepseek_v3.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_conditional.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_genpp2.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_gentp2.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp4_genpp4.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp_attention_dp_overlap.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_two_mtp.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_genpp2.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1_trt_backend.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one_mtp.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap_cuda_graph.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_mpi.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_nixl.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_overlap_cuda_graph.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_ucx.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2pp2_gentp2pp2.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_cuda_graph_padding.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_diff_max_tokens.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_trt_backend.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_load_balance.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_mixed.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_ngram.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_overlap.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_torch_sampler.yaml (2 hunks)
  • tests/integration/defs/disaggregated/test_configs/disagg_config_trt_backend.yaml (1 hunks)
  • tests/integration/defs/disaggregated/test_disaggregated_etcd.py (1 hunks)
  • tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py (3 hunks)
  • tests/unittest/llmapi/test_llm_args.py (1 hunks)
🔇 Additional comments (6)
tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml (2)

21-21: Uppercased backend literal aligns with the new type annotation. LGTM.

Both cache_transceiver_config blocks now use DEFAULT, consistent with the updated Literal["DEFAULT","UCX","NIXL","MPI"].

Also applies to: 36-36


21-21: No leftover lowercase backend values found

The grep scan across YAML configs, Python code, and Markdown/docs returned only uppercase DEFAULT, UCX, MPI, and NIXL entries. There are no remaining lowercase default, ucx, nixl, or mpi occurrences.

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml (1)

14-14: Consistent enum casing change looks correct

Using DEFAULT for both context and generation servers matches the updated enum and keeps configs consistent.

Also applies to: 24-24

tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse.yaml (1)

18-18: LGTM: backend enum normalized to uppercase

DEFAULT here is consistent with the new Literal and with other updated fixtures.

Also applies to: 33-33

tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_nixl.yaml (1)

12-12: Correct NIXL uppercase enum

Switching from "nixl" to "NIXL" aligns with the new Literal. Functionally correct.

Also applies to: 20-20

tests/unittest/llmapi/test_llm_args.py (1)

669-672: LGTM on enum casing in tests

Updating "ucx" -> "UCX" aligns the test with the new Literal contract and expected values.

Also applies to: 677-677
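
The behavior those tests pin down can be sketched without the real Pydantic class; `CacheTransceiverConfigSketch` below is a hypothetical stand-in, and only the Literal values come from the PR:

```python
from typing import Literal, Optional, get_args

# Only these uppercase values come from the PR; the validation logic is a
# hand-rolled approximation of what the Pydantic model enforces.
BackendLiteral = Literal["DEFAULT", "UCX", "NIXL", "MPI"]

class CacheTransceiverConfigSketch:
    """Minimal sketch: reject backend values outside the uppercase Literal."""

    def __init__(self,
                 backend: Optional[str] = None,
                 max_tokens_in_buffer: Optional[int] = None):
        if backend is not None and backend not in get_args(BackendLiteral):
            raise ValueError(f"invalid backend: {backend!r}")
        self.backend = backend
        self.max_tokens_in_buffer = max_tokens_in_buffer
```

With this sketch, backend="UCX" is accepted while the old lowercase "ucx" now fails validation, which is the contract the updated tests assert.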

@tensorrt-cicd
Copy link
Collaborator

PR_Github #15142 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #11435 completed with status: 'SUCCESS'
Pipeline passed with automatically retried tests. Check the rerun report for details.

@Shixiaowei02 Shixiaowei02 merged commit 1095dfd into NVIDIA:main Aug 14, 2025
5 checks passed
@Shixiaowei02 Shixiaowei02 deleted the user/xiaoweis/doc branch August 14, 2025 09:50
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Aug 17, 2025
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Aug 17, 2025 (NVIDIA#6323), Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Aug 17, 2025 (NVIDIA#6323), Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Aug 17, 2025 (NVIDIA#6323), Signed-off-by: Shi Xiaowei <39303645+Shixiaowei02@users.noreply.github.com>
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Aug 18, 2025 (NVIDIA#6323), Signed-off-by: Shi Xiaowei <39303645+Shixiaowei02@users.noreply.github.com>, Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Aug 18, 2025
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Aug 18, 2025 (NVIDIA#6323), Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
Labels

Community want to contribute PRs initiated from Community Doc <NV>TRTLLM's textual/illustrative materials: API refs, guides, tutorials. Improvement & clarity.

6 participants