[TRTLLM-7030][fix] BREAKING CHANGE: Mismatch between docs and actual commands #6323
Conversation
📝 Walkthrough
Updated CacheTransceiverConfig.backend annotation to use uppercase Literal values ("DEFAULT", "UCX", "NIXL", "MPI") and normalized all tests, example configs, and YAML fixtures to use the matching uppercase backend strings. No runtime logic, validation, or control-flow changes.
Changes
Sequence Diagram(s): omitted (changes are declarative value/casing updates only, no control-flow or feature behavior to diagram)
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~15 minutes
Possibly related PRs
Suggested reviewers
7ce3cce to c3bcefe (Compare)
Actionable comments posted: 0
🧹 Nitpick comments (2)
tensorrt_llm/llmapi/llm_args.py (2)
863-866: Consider the trade-off between compile-time safety and runtime flexibility.
Changing from `Optional[Literal["default", "ucx", "nixl", "mpi"]]` to `Optional[str]` removes compile-time type checking. While the runtime validator compensates for this, invalid values will only be caught at runtime rather than during development or static analysis. Consider whether the flexibility gained (case-insensitive validation) is worth losing compile-time safety. If case-insensitive validation is the primary goal, you could alternatively keep the Literal type and add a field validator that converts input to lowercase before validation; a sketch follows below.
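A minimal sketch of that alternative, assuming a Pydantic v2 style model; the class and field names come from this PR, but the validator shown here is illustrative rather than the actual llm_args.py implementation.

```python
from typing import Literal, Optional

from pydantic import BaseModel, field_validator


class CacheTransceiverConfig(BaseModel):
    """Illustrative stand-in for the real config class in llm_args.py."""

    # Keeping the Literal preserves static checking of the allowed values.
    backend: Optional[Literal["default", "ucx", "nixl", "mpi"]] = None
    max_tokens_in_buffer: Optional[int] = None

    @field_validator("backend", mode="before")
    @classmethod
    def _normalize_backend(cls, value):
        # Accept any casing from YAML or CLI input and fold it to the
        # canonical form before the Literal check runs.
        if isinstance(value, str):
            return value.lower()
        return value


# Both spellings validate to the same canonical value.
assert CacheTransceiverConfig(backend="UCX").backend == "ucx"
assert CacheTransceiverConfig(backend="ucx").backend == "ucx"
```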
872-881: Good validator implementation with minor considerations.
The validator correctly implements case-insensitive validation with clear error messaging. A few observations:
- The order of valid backends differs from the original Literal order (`["default", "ucx", "nixl", "mpi"]` vs `["default", "ucx", "mpi", "nixl"]`); consider maintaining consistency.
- The truthy check `if self.backend:` will skip validation for empty strings, which may be intended but differs from None handling (see the short illustration after this list).
- Consider sorting the valid_backends list alphabetically for consistency:
  - valid_backends = ["default", "ucx", "mpi", "nixl"]
  + valid_backends = ["default", "mpi", "nixl", "ucx"]
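And a toy illustration of the truthy-check observation from the list above; the class is hypothetical and exists only to contrast the two guards.

```python
from typing import Optional


class _Example:
    """Toy class showing why `if self.backend:` and `is not None` differ."""

    def __init__(self, backend: Optional[str]) -> None:
        self.backend = backend

    def validated_with_truthy_check(self) -> bool:
        # Skips validation for both None and "" (empty string).
        return bool(self.backend)

    def validated_with_none_check(self) -> bool:
        # Skips validation only for None; "" would still be checked
        # (and rejected as an invalid backend name).
        return self.backend is not None


assert _Example("").validated_with_truthy_check() is False
assert _Example("").validated_with_none_check() is True
```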
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- examples/disaggregated/README.md (2 hunks)
- tensorrt_llm/llmapi/llm_args.py (2 hunks)
🧰 Additional context used
🧠 Learnings (2)
tensorrt_llm/llmapi/llm_args.py (1)
Learnt from: amitz-nv
PR: #5616
File: tensorrt_llm/executor/worker.py:375-384
Timestamp: 2025-07-17T09:01:27.402Z
Learning: In tensorrt_llm/executor/worker.py, the LoRA adapter cache optimization logic that checks is_adapter_in_cpu_cache() and conditionally passes None for weights/config has a known race condition issue that cannot be solved with simple error handling or verification checks. This is a known limitation that requires a more comprehensive solution.
examples/disaggregated/README.md (1)
Learnt from: yechank-nvidia
PR: #6254
File: tensorrt_llm/_torch/pyexecutor/model_engine.py:1201-1204
Timestamp: 2025-07-22T09:22:14.726Z
Learning: In TensorRT-LLM's multimodal processing pipeline, shared tensor recovery using from_shared_tensor() is only needed during the context phase. Generation requests reuse the already-recovered tensor data and only need to call strip_for_generation() to remove unnecessary multimodal data while preserving the recovered tensors. This avoids redundant tensor recovery operations during generation.
🔇 Additional comments (3)
examples/disaggregated/README.md (3)
19-41: Excellent documentation improvement!
The restructuring significantly enhances clarity by:
- Presenting configuration file contents in proper YAML format instead of inline shell commands
- Using shorter, more intuitive file names (ctx_extra-llm-api-config.yml vs context_extra-llm-api-config.yml)
- Moving the overlap scheduler explanation into the YAML as a contextual comment
- Clearly separating configuration file creation from server startup commands
This makes the setup process much more straightforward for users.
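To make the new flow concrete, a minimal sketch that writes one such config file from Python; the cache_transceiver_config and backend keys mirror examples elsewhere in this PR, while the overlap-scheduler comment and the numeric value are assumptions rather than the canonical README contents.

```python
import yaml  # PyYAML

# Hypothetical contents of ctx_extra-llm-api-config.yml; key names mirror
# examples elsewhere in this PR, the comment and values are placeholders.
CTX_CONFIG = """\
# The overlap scheduler is disabled for context-only servers.
disable_overlap_scheduler: True
cache_transceiver_config:
  backend: UCX
  max_tokens_in_buffer: 2048
"""

with open("ctx_extra-llm-api-config.yml", "w", encoding="utf-8") as f:
    f.write(CTX_CONFIG)

# Sanity-check what the server would read back from the file.
config = yaml.safe_load(CTX_CONFIG)
assert config["cache_transceiver_config"]["backend"] == "UCX"
```

The server startup commands described below then point at this file instead of passing the options inline.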
44-51: Commands updated consistently with new config file names.
The server startup commands correctly reference the new config file names while preserving all functional parameters. The clear section headers improve readability.
113-114: Dynamic scaling commands updated consistently.
The commands correctly use the new config file names while maintaining all functional parameters for the dynamic scaling feature.
fd64e14 to 836311d (Compare)
/bot --help
GitHub Bot Help
Provide a user-friendly way for developers to interact with a Jenkins server. See details below for each supported subcommand.
Details
run
Launch build/test pipelines. All previously running jobs will be killed.
kill
Kill all running builds associated with pull request.
skip
Skip testing for latest commit on pull request.
reuse-pipeline
Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.
/bot run --disable-fail-fast
PR_Github #12812 [ run ] triggered by Bot
PR_Github #12812 [ run ] completed with state
836311d to 8a5d197 (Compare)
/bot run
Actionable comments posted: 0
🔭 Outside diff range comments (1)
examples/disaggregated/README.md (1)
126-131: Fix typo refersh_interval → refresh_interval
The key name in the metadata-server config is misspelled, which will break parsing at runtime.
-refersh_interval: 10.0
+refresh_interval: 10.0
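As an aside, a hedged sketch of how a misspelled key like this can be caught at load time by a strict schema; the MetadataServerConfig class here is hypothetical (only the refresh_interval name comes from the config above), and it assumes Pydantic v2.

```python
import yaml
from pydantic import BaseModel, ConfigDict, ValidationError


class MetadataServerConfig(BaseModel):
    """Hypothetical schema; extra='forbid' rejects unknown or misspelled keys."""

    model_config = ConfigDict(extra="forbid")
    refresh_interval: float = 10.0


bad_yaml = "refersh_interval: 10.0\n"  # the typo from the README snippet

try:
    MetadataServerConfig(**yaml.safe_load(bad_yaml))
except ValidationError as err:
    # Pydantic reports the offending key, so the typo surfaces immediately.
    print("rejected unknown key:", err.errors()[0]["loc"])
```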
🧹 Nitpick comments (2)
benchmarks/cpp/README.md (1)
338-342: Add language identifier to fenced block
The fenced block that contains the export and mpirun commands is missing a language hint, triggering MD040 warnings and losing syntax highlighting.

````diff
-```
+```bash
 export TRTLLM_USE_UCX_KVCACHE=1
 mpirun -n ${proc} benchmarks/disaggServerBenchmark --context_engine_dirs ${context_engine_0},${context_engine_1}...,${context_engine_{m-1}} \
     --generation_engine_dirs ${generation_engine_0},${generation_engine_1}...,${generation_engine_{n-1}} --dataset ${dataset_path}
````

examples/disaggregated/README.md (1)
96-100: Specify language for the client invocation block
This fenced block is missing a language specifier, again tripping MD040. Add `bash` (or `shell`) so linters are quiet and readers get highlighting.

````diff
-```
+```bash
 python3 ./clients/disagg_client.py -c disagg_config.yaml -p ./clients/prompts.json -e chat
````

📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📥 Commits
Reviewing files that changed from the base of the PR and between 836311d189385e626704ffeda5a906180f45fdbe and 8a5d19719e65ed19d683926c26c8f54dde31d019.
📒 Files selected for processing (31)
* `benchmarks/cpp/README.md` (1 hunks)
* `docs/source/advanced/disaggregated-service.md` (0 hunks)
* `examples/cpp/executor/README.md` (1 hunks)
* `examples/disaggregated/README.md` (3 hunks)
* `examples/disaggregated/slurm/gen_yaml.py` (2 hunks)
* `tensorrt_llm/llmapi/llm_args.py` (1 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance.yaml` (2 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse.yaml` (2 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse_deepseek_v3.yaml` (2 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_conditional.yaml` (2 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml` (2 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite.yaml` (1 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp.yaml` (2 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp_attention_dp_overlap.yaml` (2 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_two_mtp.yaml` (2 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1.yaml` (1 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1_trt_backend.yaml` (1 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite.yaml` (1 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp.yaml` (2 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one.yaml` (2 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one_mtp.yaml` (2 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml` (2 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap_cuda_graph.yaml` (2 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_overlap_cuda_graph.yaml` (2 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_cuda_graph_padding.yaml` (2 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only.yaml` (1 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_trt_backend.yaml` (1 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_load_balance.yaml` (1 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_mixed.yaml` (1 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_overlap.yaml` (2 hunks)
* `tests/integration/defs/disaggregated/test_configs/disagg_config_trt_backend.yaml` (1 hunks)
💤 Files with no reviewable changes (1)
* docs/source/advanced/disaggregated-service.md
✅ Files skipped from review due to trivial changes (28)
* tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_mixed.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1_trt_backend.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_conditional.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_two_mtp.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp.yaml
* examples/disaggregated/slurm/gen_yaml.py
* tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_overlap_cuda_graph.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse_deepseek_v3.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp_attention_dp_overlap.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_cuda_graph_padding.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one_mtp.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap_cuda_graph.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_overlap.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_trt_backend.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_trt_backend.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml
* tests/integration/defs/disaggregated/test_configs/disagg_config_load_balance.yaml
* examples/cpp/executor/README.md
* tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite.yaml
* tensorrt_llm/llmapi/llm_args.py
🧰 Additional context used
🧠 Learnings (2)
benchmarks/cpp/README.md (1)
Learnt from: yechank-nvidia
PR: NVIDIA/TensorRT-LLM#6254
File: tensorrt_llm/_torch/pyexecutor/model_engine.py:1201-1204
Timestamp: 2025-07-22T09:22:14.726Z
Learning: In TensorRT-LLM's multimodal processing pipeline, shared tensor recovery using `from_shared_tensor()` is only needed during the context phase. Generation requests reuse the already-recovered tensor data and only need to call `strip_for_generation()` to remove unnecessary multimodal data while preserving the recovered tensors. This avoids redundant tensor recovery operations during generation.
examples/disaggregated/README.md (2)
Learnt from: yechank-nvidia
PR: NVIDIA/TensorRT-LLM#6254
File: tensorrt_llm/_torch/pyexecutor/model_engine.py:1201-1204
Timestamp: 2025-07-22T09:22:14.726Z
Learning: In TensorRT-LLM's multimodal processing pipeline, shared tensor recovery using `from_shared_tensor()` is only needed during the context phase. Generation requests reuse the already-recovered tensor data and only need to call `strip_for_generation()` to remove unnecessary multimodal data while preserving the recovered tensors. This avoids redundant tensor recovery operations during generation.
Learnt from: amitz-nv
PR: NVIDIA/TensorRT-LLM#5616
File: tensorrt_llm/executor/worker.py:375-384
Timestamp: 2025-07-17T09:01:27.402Z
Learning: In tensorrt_llm/executor/worker.py, the LoRA adapter cache optimization logic that checks `is_adapter_in_cpu_cache()` and conditionally passes None for weights/config has a known race condition issue that cannot be solved with simple error handling or verification checks. This is a known limitation that requires a more comprehensive solution.
🪛 markdownlint-cli2 (0.17.2)
benchmarks/cpp/README.md
346-346: Fenced code blocks should have a language specified (MD040, fenced-code-language)
examples/disaggregated/README.md
28-28: Fenced code blocks should have a language specified (MD040, fenced-code-language)
45-45: Fenced code blocks should have a language specified (MD040, fenced-code-language)
113-113: Fenced code blocks should have a language specified (MD040, fenced-code-language)
PR_Github #13175 [ run ] triggered by Bot
PR_Github #13175 [ run ] completed with state
8a5d197 to cb91758 (Compare)
PR_Github #14990 [ run ] completed with state
11ede22 to 6592d06 (Compare)
/bot run
PR_Github #15083 [ run ] triggered by Bot
PR_Github #15083 [ run ] completed with state
6592d06 to 2ccc7e4 (Compare)
/bot run
PR_Github #15113 [ run ] triggered by Bot
Actionable comments posted: 0
🔭 Outside diff range comments (2)
examples/disaggregated/slurm/benchmark/gen_yaml.py (2)
9-10: Fix Python 3.8-incompatible generic annotations (use typing.Tuple instead of built-in tuple[]).
Our guidelines require Python 3.8+, but PEP 585 built-in generics (tuple[int, ...]) are only valid in 3.9+. This will raise at import time on 3.8 unless future annotations are enabled. Replace with typing.Tuple.
Apply this diff within the selected lines:
-def process_node_and_task() -> tuple[int, List[str], List[str]]:
+def process_node_and_task() -> Tuple[int, List[str], List[str]]:
Also add Tuple to the typing imports near Line 4 (outside the selected range):
from typing import Dict, List, Tuple
85-86: Fix Python 3.8-incompatible return annotation in generate_urls.
Same issue as above; change tuple[List[str], int] to Tuple[List[str], int].
Apply this diff within the selected lines:
- task_nodes_offset: int = 0) -> tuple[List[str], int]:
+ task_nodes_offset: int = 0) -> Tuple[List[str], int]:
Ensure you’ve added:
from typing import Tuple
to the existing typing imports, as noted earlier.
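For reference, a minimal sketch of the two 3.8-compatible options mentioned above; the function names and the task_nodes_offset parameter come from the review comments, but the bodies are placeholders rather than the real gen_yaml.py logic.

```python
from __future__ import annotations  # postponed evaluation: 3.9-only syntax stays unevaluated

from typing import List, Tuple


def process_node_and_task() -> Tuple[int, List[str], List[str]]:
    """Option 1: typing.Tuple is valid and evaluated on any Python 3.8+ interpreter."""
    return 0, [], []


def generate_urls(task_nodes_offset: int = 0) -> "tuple[List[str], int]":
    """Option 2: a string (or postponed) annotation keeps the PEP 585 form from
    being evaluated at import time, so it no longer raises on 3.8."""
    return [], task_nodes_offset
```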
♻️ Duplicate comments (2)
tests/unittest/llmapi/test_llm_args.py (2)
669-672: Fix Ruff F405 (star-import) and follow namespace import guideline for CacheTransceiverConfig.
Use the module namespace to avoid F405 and adhere to the coding guideline.
Apply this diff within the selected lines:
- config = CacheTransceiverConfig(backend="UCX",
+ config = llm_args.CacheTransceiverConfig(backend="UCX",
                                           max_tokens_in_buffer=1024)
  assert config.backend == "UCX"
Add this import near the other imports at the top of the file (outside selected range):
import tensorrt_llm.llmapi.llm_args as llm_args
677-677: Fix Ruff F405 here as well (use namespaced class).
Mirror the change above for the invalid-argument case.
Apply this diff within the selected lines:
- CacheTransceiverConfig(backend="UCX", invalid_config="should_fail")
+ llm_args.CacheTransceiverConfig(backend="UCX", invalid_config="should_fail")
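A sketch of the namespaced-import pattern being suggested, written as a standalone pytest snippet; it assumes CacheTransceiverConfig is a pydantic model that rejects unknown keyword arguments with a ValidationError, which is an assumption about the real class rather than something stated in this thread.

```python
import pytest
from pydantic import ValidationError

import tensorrt_llm.llmapi.llm_args as llm_args


def test_cache_transceiver_config_backend():
    # Namespaced access keeps Ruff happy (no F405 from star imports).
    config = llm_args.CacheTransceiverConfig(backend="UCX",
                                             max_tokens_in_buffer=1024)
    assert config.backend == "UCX"


def test_cache_transceiver_config_rejects_unknown_field():
    with pytest.raises(ValidationError):
        llm_args.CacheTransceiverConfig(backend="UCX",
                                        invalid_config="should_fail")
```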
🧹 Nitpick comments (3)
examples/disaggregated/slurm/benchmark/gen_yaml.py (1)
1-1: Add NVIDIA copyright header to comply with repository standards.
Per coding guidelines, all Python sources should include the NVIDIA header.
Add this at the top of the file:
# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1.yaml (2)
13-13: Nit: quote DEFAULT to keep YAML formatting consistent with other configs.
Other files quote the backend string; consistency reduces churn and ambiguity.
Apply this diff:
- backend: DEFAULT
+ backend: "DEFAULT"
21-21: Nit: quote DEFAULT to keep YAML formatting consistent with other configs (generation_servers).
Same as the previous suggestion.
Apply this diff:
- backend: DEFAULT
+ backend: "DEFAULT"
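As a quick sanity check of the quoting nitpick: PyYAML loads the quoted and unquoted forms to the same Python value, so the choice really is stylistic.

```python
import yaml

quoted = yaml.safe_load('backend: "DEFAULT"')
unquoted = yaml.safe_load("backend: DEFAULT")

# Both forms load to identical data; only the file's appearance differs.
assert quoted == unquoted == {"backend": "DEFAULT"}
```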
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (44)
- examples/disaggregated/disagg_config.yaml (1 hunks)
- examples/disaggregated/slurm/benchmark/gen_yaml.py (2 hunks)
- tensorrt_llm/llmapi/llm_args.py (1 hunks)
- tests/integration/defs/accuracy/test_disaggregated_serving.py (13 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance_deepseek_v3.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse_deepseek_v3.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_conditional.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_genpp2.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_gentp2.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp4_genpp4.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp_attention_dp_overlap.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_two_mtp.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_genpp2.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1_trt_backend.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one_mtp.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap_cuda_graph.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_mpi.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_nixl.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_overlap_cuda_graph.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_ucx.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2pp2_gentp2pp2.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_cuda_graph_padding.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_diff_max_tokens.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_trt_backend.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_load_balance.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_mixed.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ngram.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_overlap.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_torch_sampler.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_trt_backend.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_disaggregated_etcd.py (1 hunks)
- tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py (3 hunks)
- tests/unittest/llmapi/test_llm_args.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (39)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp4_genpp4.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_trt_backend.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_overlap_cuda_graph.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_torch_sampler.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse_deepseek_v3.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one_mtp.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2pp2_gentp2pp2.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml
- tests/integration/defs/disaggregated/test_disaggregated_etcd.py
- examples/disaggregated/disagg_config.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_genpp2.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_gentp2.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_diff_max_tokens.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_conditional.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_two_mtp.yaml
- tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_mixed.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_nixl.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_overlap.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_ucx.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_mpi.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_genpp2.yaml
- tests/integration/defs/accuracy/test_disaggregated_serving.py
- tensorrt_llm/llmapi/llm_args.py
- tests/integration/defs/disaggregated/test_configs/disagg_config_cuda_graph_padding.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp_attention_dp_overlap.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_load_balance.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ngram.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1_trt_backend.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_trt_backend.yaml
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
**/*.py: Python code must target Python 3.8+
Python indentation: 4 spaces, no tabs
Maintain module namespace in imports (from package.subpackage import foo; then use foo.SomeClass())
Python file names use snake_case
Python class names use PascalCase
Python functions/methods and local variables use snake_case; variables starting with a number get k_ prefix (e.g., k_99th_percentile)
Global variables use G_ prefixed UPPER_SNAKE_CASE (e.g., G_MY_GLOBAL)
Constants use UPPER_SNAKE_CASE in Python
Avoid shadowing variables from outer scopes in Python
Initialize all externally visible members of a Python class in init
Prefer docstrings for interfaces used outside a file; comments for local code
Use Google-style docstrings for classes and functions (Sphinx-parsable)
Document attributes/variables inline with short docstrings
Avoid reflection when simple alternatives exist (e.g., prefer explicit parameters over dict(**locals()))
In try/except, catch the narrowest exceptions possible
For duck-typing with try/except, keep try body minimal and put logic in else
Files:
- examples/disaggregated/slurm/benchmark/gen_yaml.py
- tests/unittest/llmapi/test_llm_args.py
**/*.{cpp,cxx,cc,cu,h,hpp,hxx,hh,cuh,py}
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
Prepend NVIDIA copyright header (current year) to all source files
Files:
- examples/disaggregated/slurm/benchmark/gen_yaml.py
- tests/unittest/llmapi/test_llm_args.py
🪛 Ruff (0.12.2)
tests/unittest/llmapi/test_llm_args.py
669-669: CacheTransceiverConfig may be undefined, or defined from star imports
(F405)
677-677: CacheTransceiverConfig may be undefined, or defined from star imports
(F405)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
🔇 Additional comments (7)
examples/disaggregated/slurm/benchmark/gen_yaml.py (2)
198-201: LGTM: backend normalized to uppercase "DEFAULT" (context_servers).
Matches the updated Literal type and keeps docs/code aligned.
226-229: LGTM: backend normalized to uppercase "DEFAULT" (generation_servers).
Consistent with the rest of the PR’s uppercase normalization.
tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance_deepseek_v3.yaml (3)
20-21: LGTM: backend set to "DEFAULT" (context_servers).
Conforms to the updated uppercase Literal contract.
36-37: LGTM: backend set to "DEFAULT" (generation_servers).
Consistent with the new typing and other configs.
1-1: No lingering lowercase backend values detected
A repository-wide search confirms that all backend entries use uppercase (DEFAULT, UCX, NIXL, MPI). There are no remaining lowercase occurrences of default, ucx, nixl, or mpi.
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap_cuda_graph.yaml (2)
13-13: LGTM: backend normalized to DEFAULT (context_servers).
Matches the Literal["DEFAULT","UCX","NIXL","MPI"] requirement.
25-25: LGTM: backend normalized to DEFAULT (generation_servers).
Aligned with the rest of the PR.
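For completeness, a small sketch of the kind of repository-wide check described above; the test_configs path comes from this PR's fixtures, while the script itself is illustrative rather than part of the change.

```python
from pathlib import Path

import yaml

ALLOWED = {"DEFAULT", "UCX", "NIXL", "MPI"}
CONFIG_DIR = Path("tests/integration/defs/disaggregated/test_configs")


def find_bad_backends(node, path=""):
    """Yield cache_transceiver_config backend values that are not in ALLOWED."""
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}/{key}"
            if key == "cache_transceiver_config" and isinstance(value, dict):
                backend = value.get("backend")
                if isinstance(backend, str) and backend not in ALLOWED:
                    yield f"{child_path}/backend = {backend}"
            yield from find_bad_backends(value, child_path)
    elif isinstance(node, list):
        for i, item in enumerate(node):
            yield from find_bad_backends(item, f"{path}[{i}]")


for cfg in sorted(CONFIG_DIR.glob("*.yaml")):
    data = yaml.safe_load(cfg.read_text())
    for hit in find_bad_backends(data):
        print(f"{cfg.name}: {hit}")
```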
PR_Github #15113 [ run ] completed with state
Signed-off-by: ShiXiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2ccc7e4 to e4a6504 (Compare)
/bot run
PR_Github #15142 [ run ] triggered by Bot
Actionable comments posted: 0
♻️ Duplicate comments (1)
tests/unittest/llmapi/test_llm_args.py (1)
669-672: Fix Ruff F405 and adhere to namespace-import guideline without wide refactor
Use a namespaced import for llm_args and reference CacheTransceiverConfig via that namespace to satisfy the “maintain module namespace” rule and remove F405 at these sites. Avoid alias “llm_args” to prevent shadowing local variables named llm_args in other tests.
Apply within the selected lines:
- config = CacheTransceiverConfig(backend="UCX",
+ config = llm_args_mod.CacheTransceiverConfig(backend="UCX",
                                               max_tokens_in_buffer=1024)
  assert config.backend == "UCX"
  ...
- CacheTransceiverConfig(backend="UCX", invalid_config="should_fail")
+ llm_args_mod.CacheTransceiverConfig(backend="UCX", invalid_config="should_fail")
Add this import near the other imports at the top of the file (outside the selected range):
import tensorrt_llm.llmapi.llm_args as llm_args_mod
Optionally, in a follow-up, consider replacing the star import:
- from tensorrt_llm.llmapi.llm_args import *
with explicit names or with the llm_args_mod namespace throughout the file to fully eliminate F405 risks elsewhere.
Also applies to: 677-677
🧹 Nitpick comments (2)
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_nixl.yaml (1)
12-12: Optional: unify YAML quoting style
Other configs use unquoted enum values; consider dropping quotes here for consistency. Both are valid YAML; this is purely stylistic.
Apply this minimal diff:
- backend: "NIXL"
+ backend: NIXL
tests/unittest/llmapi/test_llm_args.py (1)
669-669: Add NVIDIA copyright header (current year)
Per coding guidelines, prepend the standard NVIDIA header to Python sources.
Add at the very top of the file (outside the selected range):
# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (44)
- examples/disaggregated/disagg_config.yaml (1 hunks)
- examples/disaggregated/slurm/benchmark/gen_yaml.py (2 hunks)
- tensorrt_llm/llmapi/llm_args.py (1 hunks)
- tests/integration/defs/accuracy/test_disaggregated_serving.py (13 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance_deepseek_v3.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse_deepseek_v3.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_conditional.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_genpp2.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_gentp2.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp4_genpp4.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp_attention_dp_overlap.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_two_mtp.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_genpp2.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1_trt_backend.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one_mtp.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap_cuda_graph.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_mpi.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_nixl.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_overlap_cuda_graph.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_ucx.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2pp2_gentp2pp2.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_cuda_graph_padding.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_diff_max_tokens.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_trt_backend.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_load_balance.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_mixed.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ngram.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_overlap.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_torch_sampler.yaml (2 hunks)
- tests/integration/defs/disaggregated/test_configs/disagg_config_trt_backend.yaml (1 hunks)
- tests/integration/defs/disaggregated/test_disaggregated_etcd.py (1 hunks)
- tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py (3 hunks)
- tests/unittest/llmapi/test_llm_args.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (39)
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_overlap_cuda_graph.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_overlap.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only_trt_backend.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_diff_max_tokens.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_conditional.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_trt_backend.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_load_balance.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_mpi.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_gentp2.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp2_genpp2.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_cache_aware_balance_deepseek_v3.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp_attention_dp_overlap.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse_deepseek_v3.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one_mtp.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_gen_only.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_genpp2.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_ucx.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap_cuda_graph.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite.yaml
- examples/disaggregated/slurm/benchmark/gen_yaml.py
- tests/integration/defs/disaggregated/test_configs/disagg_config_cuda_graph_padding.yaml
- tensorrt_llm/llmapi/llm_args.py
- tests/integration/defs/disaggregated/test_disaggregated_etcd.py
- tests/integration/defs/disaggregated/test_configs/disagg_config_ngram.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_torch_sampler.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_mixed.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp1_trt_backend.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp1_gentp1_deepseek_v3_lite_two_mtp.yaml
- tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp.yaml
- tests/integration/defs/accuracy/test_disaggregated_serving.py
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2pp2_gentp2pp2.yaml
- tests/integration/defs/disaggregated/test_configs/disagg_config_ctxpp4_genpp4.yaml
- examples/disaggregated/disagg_config.yaml
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
**/*.py: Python code must target Python 3.8+
Python indentation: 4 spaces, no tabs
Maintain module namespace in imports (from package.subpackage import foo; then use foo.SomeClass())
Python file names use snake_case
Python class names use PascalCase
Python functions/methods and local variables use snake_case; variables starting with a number get k_ prefix (e.g., k_99th_percentile)
Global variables use G_ prefixed UPPER_SNAKE_CASE (e.g., G_MY_GLOBAL)
Constants use UPPER_SNAKE_CASE in Python
Avoid shadowing variables from outer scopes in Python
Initialize all externally visible members of a Python class in init
Prefer docstrings for interfaces used outside a file; comments for local code
Use Google-style docstrings for classes and functions (Sphinx-parsable)
Document attributes/variables inline with short docstrings
Avoid reflection when simple alternatives exist (e.g., prefer explicit parameters over dict(**locals()))
In try/except, catch the narrowest exceptions possible
For duck-typing with try/except, keep try body minimal and put logic in else
Files:
tests/unittest/llmapi/test_llm_args.py
**/*.{cpp,cxx,cc,cu,h,hpp,hxx,hh,cuh,py}
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
Prepend NVIDIA copyright header (current year) to all source files
Files:
tests/unittest/llmapi/test_llm_args.py
🪛 Ruff (0.12.2)
tests/unittest/llmapi/test_llm_args.py
669-669: CacheTransceiverConfig may be undefined, or defined from star imports
(F405)
677-677: CacheTransceiverConfig may be undefined, or defined from star imports
(F405)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
🔇 Additional comments (6)
tests/integration/defs/disaggregated/test_configs/disagg_config_conditional_deepseek_v3.yaml (2)
21-21: Uppercased backend literal aligns with the new type annotation. LGTM.
Both cache_transceiver_config blocks now use DEFAULT, consistent with the updated Literal["DEFAULT","UCX","NIXL","MPI"].
Also applies to: 36-36
21-21: No leftover lowercase backend values found
The grep scan across YAML configs, Python code, and Markdown/docs returned only uppercase DEFAULT, UCX, MPI, and NIXL entries. There are no remaining lowercase default, ucx, nixl, or mpi occurrences.
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml (1)
14-14: Consistent enum casing change looks correct
Using DEFAULT for both context and generation servers matches the updated enum and keeps configs consistent.
Also applies to: 24-24
tests/integration/defs/disaggregated/test_configs/disagg_config_cache_reuse.yaml (1)
18-18: LGTM: backend enum normalized to uppercase
DEFAULT here is consistent with the new Literal and with other updated fixtures.
Also applies to: 33-33
tests/integration/defs/disaggregated/test_configs/disagg_config_ctxtp2_gentp2_deepseek_v3_lite_nixl.yaml (1)
12-12: Correct NIXL uppercase enum
Switching from "nixl" to "NIXL" aligns with the new Literal. Functionally correct.
Also applies to: 20-20
tests/unittest/llmapi/test_llm_args.py (1)
669-672: LGTM on enum casing in tests
Updating "ucx" -> "UCX" aligns the test with the new Literal contract and expected values.
Also applies to: 677-677
PR_Github #15142 [ run ] completed with state
NVIDIA#6323) Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
NVIDIA#6323) Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
NVIDIA#6323) Signed-off-by: Shi Xiaowei <39303645+Shixiaowei02@users.noreply.github.com>
NVIDIA#6323) Signed-off-by: Shi Xiaowei <39303645+Shixiaowei02@users.noreply.github.com> Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
NVIDIA#6323) Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
Summary by CodeRabbit
Refactor
Tests
Chores