[TRTLLM-8260][feat] Add Server-Client Perf Test in pytest for B200 and B300 #7985
Conversation
📝 Walkthrough

Adds a Perf-Sanity test pathway to the Jenkins L0 pipeline with a new filter flag and parallel job orchestration. Introduces and updates perf-sanity scripts: a Docker-run wrapper, a benchmark runner with retries and reproduction scripts, and a result parser with filename-based extraction and dynamic metrics; replaces config YAMLs (removes the old one, adds l0_dgx_b200).
Sequence Diagram

```mermaid
sequenceDiagram
    autonumber
    participant Dev as Jenkins L0 Pipeline
    participant PS as Perf-Sanity Stage
    participant SH as benchmark-serve.sh
    participant PY as run_benchmark_serve.py
    participant Srv as TRT-LLM Server
    participant Bmk as Benchmark Client
    participant Pars as parse_benchmark_results.py
    Dev->>PS: Launch stage (ONLY_PERF_SANITY_TEST or stageName contains "Perf-Sanity")
    PS->>SH: docker run ... benchmark-serve.sh --config_file l0_dgx_b200.yaml ...
    Note over SH: Validate dirs, build mounts, optional TRT-LLM install
    SH->>PY: python run_benchmark_serve.py --config_file ...
    PY->>PY: initialize_test_case_infos()
    loop per test_case with retries
        PY->>Srv: start server
        PY->>PY: wait_for_server() and cleanup caches
        alt server ready
            loop per concurrency
                PY->>Bmk: run benchmark
                Bmk-->>PY: status/logs
                PY->>PY: handle success/error, timeouts
            end
            PY->>Srv: terminate server
        else server not ready
            PY->>PY: mark failed and retry/abort
        end
    end
    PY->>PY: generate_reproduction_scripts()
    SH->>Pars: parse_benchmark_results.py --config_file ...
    Pars-->>SH: CSV output
    SH-->>PS: Exit code
    PS-->>Dev: Stage result
```
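The per-test-case loop in the diagram can be sketched as follows; the function names and the `test_case` fields here are illustrative stand-ins for the actual `run_benchmark_serve.py` internals, not its real API:

```python
def run_test_case(test_case, start_server, wait_for_server, run_benchmark,
                  max_retries=2):
    """Start a server, run one benchmark per concurrency, retry on failure.

    start_server/wait_for_server/run_benchmark are injected callables so the
    control flow (retry, readiness check, per-concurrency loop, teardown)
    can be shown without the real server machinery.
    """
    for _attempt in range(max_retries):
        server = start_server(test_case)
        try:
            if not wait_for_server(server):
                continue  # server never became ready; terminate and retry
            results = {}
            for concurrency in test_case["concurrencies"]:
                results[concurrency] = run_benchmark(server, concurrency)
            return results
        finally:
            # Runs on return, on retry, and on error: server always torn down.
            server.terminate()
    return None  # all retries exhausted
```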
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
tests/scripts/perf-sanity/benchmark-serve.sh (1)
77-82: Fix variable expansion in file existence check

Single quotes prevent ${bench_dir} expansion; the check never triggers.

Apply:

```diff
- if [[ -f '${bench_dir}/parse_benchmark_results.py' ]]; then
+ if [[ -f "${bench_dir}/parse_benchmark_results.py" ]]; then
```

tests/scripts/perf-sanity/parse_benchmark_results.py (1)
1-1: Add NVIDIA Apache-2.0 header (2025) per repo guidelines

All .py files must include the NVIDIA Apache-2.0 header.

Apply (below the shebang):

```diff
 #!/usr/bin/env python3
+#
+# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#     http://www.apache.org/licenses/LICENSE-2.0
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
```

tests/scripts/perf-sanity/run_benchmark_serve.py (2)
1-1: Add NVIDIA Apache-2.0 header (2025) per repo guidelines

All .py files must include the NVIDIA Apache-2.0 header.

Apply (below the shebang):

```diff
 #!/usr/bin/env python3
+#
+# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#     http://www.apache.org/licenses/LICENSE-2.0
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
```
657-666: Always pass --model as separate args to trtllm-serve

Avoid relying on positional parsing; split flags/values as distinct argv elements.

Apply:

```diff
-    serve_cmd = [
-        "trtllm-serve", MODEL, "--backend", "pytorch", "--tp_size",
+    serve_cmd = [
+        "trtllm-serve", "--model", model_identifier, "--backend", "pytorch", "--tp_size",
         str(test_case['tp']), "--ep_size", str(test_case['ep']),
         "--max_batch_size", str(test_case['max_batch_size']),
         "--max_num_tokens", str(test_case['max_num_tokens']),
         "--kv_cache_free_gpu_memory_fraction", str(test_case['free_gpu_mem_fraction']),
         "--extra_llm_api_options", config_path
     ]
```
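As a sketch, the suggested argv could be assembled with a small helper; `build_serve_cmd` and the placeholder model identifier and values below are illustrative, not the script's actual code:

```python
def build_serve_cmd(model_identifier, test_case, config_path):
    """Build the trtllm-serve argv with every flag and value as a
    separate element, so nothing depends on shell word splitting."""
    return [
        "trtllm-serve", "--model", model_identifier,
        "--backend", "pytorch",
        "--tp_size", str(test_case["tp"]),
        "--ep_size", str(test_case["ep"]),
        "--max_batch_size", str(test_case["max_batch_size"]),
        "--max_num_tokens", str(test_case["max_num_tokens"]),
        "--kv_cache_free_gpu_memory_fraction",
        str(test_case["free_gpu_mem_fraction"]),
        "--extra_llm_api_options", config_path,
    ]
```

A list built this way can be handed directly to subprocess.Popen without shell=True.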
🧹 Nitpick comments (5)
tests/scripts/perf-sanity/benchmark-serve.sh (1)
68-69: Quote PYTHONPATH assignment when injecting trtllm_dir

Prevents issues with paths containing spaces or colons.

Apply:

```diff
- export PYTHONPATH=$trtllm_dir
+ export PYTHONPATH="$trtllm_dir"
```

tests/scripts/perf-sanity/parse_benchmark_results.py (2)
194-197: Remove unnecessary f-strings

No placeholders present; flagged by linters.

Apply:

```diff
- print(f"Successfully extracted configuration from filename")
+ print("Successfully extracted configuration from filename")
@@
- print(f"Could not extract configuration from filename either")
+ print("Could not extract configuration from filename either")
```
186-189: Avoid bare except; catch specific exceptions

Catching Exception hides parsing errors and makes debugging harder. Use (e.g.) FileNotFoundError, OSError, or UnicodeDecodeError as appropriate around file reads; same for extract_and_update_metrics.
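A minimal sketch of the recommended pattern, using a hypothetical result-file read rather than the parser's actual code:

```python
def read_result_file(path):
    """Read a result file into lines; only the I/O is guarded."""
    try:
        with open(path, encoding="utf-8") as f:
            text = f.read()
    except (OSError, UnicodeDecodeError) as exc:
        # Only I/O and decoding failures are swallowed; genuine bugs
        # elsewhere still surface as tracebacks.
        print(f"Could not read {path}: {exc}")
        return None
    else:
        # Main logic lives in else, keeping the try body minimal.
        return text.splitlines()
```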
tests/scripts/perf-sanity/run_benchmark_serve.py (2)
900-910: Reproduction script: use resolved id in benchmark --model

Align the benchmark invocation with the fixed identifier.

Apply:

```diff
- --model {model_path} \\
+ --model {model_identifier} \\
```
384-390: Consider avoiding shell=True where not necessary

The cleanup commands can be replaced with subprocess.run([...]) for safer invocation.
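For illustration, a shell=True call can be converted to an argument-list invocation like this; the commands and path below are placeholders, not the script's actual cleanup commands:

```python
import subprocess

def cleanup_caches(dry_run=True):
    """Run cleanup commands as argument lists instead of shell strings."""
    commands = [
        ["sync"],                             # flush filesystem buffers
        ["rm", "-rf", "/tmp/example_cache"],  # placeholder cache path
    ]
    for cmd in commands:
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            # No shell involved: arguments cannot be word-split or injected.
            subprocess.run(cmd, check=True)
```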
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- jenkins/L0_Test.groovy (7 hunks)
- tests/scripts/perf-sanity/benchmark-serve.sh (2 hunks)
- tests/scripts/perf-sanity/benchmark_config.yaml (0 hunks)
- tests/scripts/perf-sanity/l0_dgx_b200.yaml (1 hunk)
- tests/scripts/perf-sanity/parse_benchmark_results.py (8 hunks)
- tests/scripts/perf-sanity/run_benchmark_serve.py (15 hunks)
💤 Files with no reviewable changes (1)
- tests/scripts/perf-sanity/benchmark_config.yaml
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{h,hpp,hh,hxx,cpp,cxx,cc,cu,cuh,py}
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
Use only spaces, no tabs; indent with 4 spaces.
Files:
tests/scripts/perf-sanity/parse_benchmark_results.py
tests/scripts/perf-sanity/run_benchmark_serve.py
**/*.py
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
**/*.py: Python code must target Python 3.8+.
Indent Python code with 4 spaces; do not use tabs.
Maintain module namespace when importing; prefer 'from package.subpackage import foo' then 'foo.SomeClass()' instead of importing the class directly.
Python filenames should be snake_case (e.g., some_file.py).
Python classes use PascalCase names.
Functions and methods use snake_case names.
Local variables use snake_case; prefix 'k' for variables that start with a number (e.g., k_99th_percentile).
Global variables use upper SNAKE_CASE prefixed with 'G' (e.g., G_MY_GLOBAL).
Constants use upper SNAKE_CASE (e.g., MY_CONSTANT).
Avoid shadowing variables from an outer scope.
Initialize all externally visible members of a class in the constructor.
Prefer docstrings for interfaces that may be used outside a file; comments for in-function or file-local interfaces.
Use Google-style docstrings for classes and functions (Sphinx-parsable).
Document attributes and variables inline so they render under the class/function docstring.
Avoid reflection when a simpler, explicit approach suffices (e.g., avoid dict(**locals()) patterns).
In try/except, catch the most specific exceptions possible.
For duck-typing try/except, keep the try body minimal and use else for the main logic.
Files:
tests/scripts/perf-sanity/parse_benchmark_results.py
tests/scripts/perf-sanity/run_benchmark_serve.py
**/*.{cpp,cxx,cc,h,hpp,hh,hxx,cu,cuh,py}
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
Prepend the NVIDIA Apache-2.0 copyright header with current year to the top of all source files (e.g., .cpp, .h, .cu, .py).
Files:
tests/scripts/perf-sanity/parse_benchmark_results.py
tests/scripts/perf-sanity/run_benchmark_serve.py
🧠 Learnings (1)
📚 Learning: 2025-08-20T15:04:42.885Z
Learnt from: dbari
PR: NVIDIA/TensorRT-LLM#7095
File: docker/Dockerfile.multi:168-168
Timestamp: 2025-08-20T15:04:42.885Z
Learning: In docker/Dockerfile.multi, wildcard COPY for benchmarks (${CPP_BUILD_DIR}/benchmarks/*Benchmark) is intentionally used instead of directory copy because the benchmarks directory contains various other build artifacts during C++ builds, and only specific benchmark executables should be copied to the final image.
Applied to files:
tests/scripts/perf-sanity/benchmark-serve.sh
🪛 Shellcheck (0.11.0)
tests/scripts/perf-sanity/benchmark-serve.sh
[warning] 55-55: Quote this to prevent word splitting.
(SC2046)
🪛 Ruff (0.13.1)
tests/scripts/perf-sanity/parse_benchmark_results.py
62-79: Consider moving this statement to an else block
(TRY300)
186-186: Do not catch blind exception: Exception
(BLE001)
194-194: f-string without any placeholders
Remove extraneous f prefix
(F541)
197-197: f-string without any placeholders
Remove extraneous f prefix
(F541)
213-213: Do not catch blind exception: Exception
(BLE001)
tests/scripts/perf-sanity/run_benchmark_serve.py
384-384: subprocess call with shell=True identified, security issue
(S602)
384-384: f-string without any placeholders
Remove extraneous f prefix
(F541)
387-387: subprocess call with shell=True identified, security issue
(S602)
387-387: f-string without any placeholders
Remove extraneous f prefix
(F541)
609-609: f-string without any placeholders
Remove extraneous f prefix
(F541)
615-615: Do not catch blind exception: Exception
(BLE001)
618-618: f-string without any placeholders
Remove extraneous f prefix
(F541)
716-716: Consider moving this statement to an else block
(TRY300)
718-718: Do not catch blind exception: Exception
(BLE001)
728-728: subprocess call with shell=True identified, security issue
(S602)
731-731: subprocess call with shell=True identified, security issue
(S602)
734-734: subprocess call with shell=True identified, security issue
(S602)
734-734: f-string without any placeholders
Remove extraneous f prefix
(F541)
737-737: subprocess call with shell=True identified, security issue
(S602)
737-737: f-string without any placeholders
Remove extraneous f prefix
(F541)
740-740: Do not catch blind exception: Exception
(BLE001)
773-773: Local variable gpu_info is assigned to but never used
Remove assignment to unused variable gpu_info
(F841)
921-925: f-string without any placeholders
Remove extraneous f prefix
(F541)
938-938: Loop control variable i not used within loop body
Rename unused i to _i
(B007)
951-951: os.chmod setting a permissive mask 0o755 on file or directory
(S103)
956-956: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
🔇 Additional comments (4)
tests/scripts/perf-sanity/l0_dgx_b200.yaml (1)
1-17: Config looks consistent with runner expectations; please confirm intended GPU count

YAML structure matches run_benchmark_serve.py (fields and concurrency_iterations). Confirm that gpus: 1 is intentional for the B200 sanity path (execution uses TP/EP; gpus is informational).
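For orientation, a hypothetical shape for such a config; the field names are taken from this review discussion, but the nesting, values, and entry names are invented for illustration and need not match the real l0_dgx_b200.yaml:

```yaml
# Illustrative sketch only; structure and values are assumptions.
server_configs:
  - name: r1_fp4_dep4
    gpus: 1                      # informational; execution uses TP/EP
    tp: 4
    ep: 4
    max_batch_size: 512
    max_num_tokens: 8192
    free_gpu_mem_fraction: 0.9
    concurrency_iterations:
      - name: con1_iter1_1024_1024
        concurrency: 1
        iterations: 1
```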
tests/scripts/perf-sanity/benchmark-serve.sh (2)
13-19: Validate config_file default/semantics

config_file defaults to $(pwd) but is later used as a filename appended to bench_dir. Ensure callers pass a filename (e.g., l0_dgx_b200.yaml) or change the default to a sensible filename; otherwise --config_file resolves to bench_dir/$(pwd), which is nonsense.
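One way to address this, sketched under the assumption that bench_dir holds the scripts; resolve_config and the default filename are hypothetical, not part of benchmark-serve.sh:

```shell
# Hypothetical sketch: treat config_file as a filename under bench_dir,
# defaulting to a real filename instead of $(pwd).
resolve_config() {
    local bench_dir="$1"
    local config_file="${2:-l0_dgx_b200.yaml}"   # default is a filename
    local config_path="${bench_dir}/${config_file}"
    if [[ -f "$config_path" ]]; then
        echo "$config_path"
    else
        echo "Config not found: $config_path" >&2
        return 1
    fi
}
```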
52-55: Quote mount aggregation in docker run to avoid word splitting (SC2046)

Unquoted $mount can split on spaces and break -v args.

Apply:

```diff
- $mount \
+ "$mount" \
```

Likely an incorrect or invalid review comment.
jenkins/L0_Test.groovy (1)
2235-2239: Confirm GPU type for Perf-Sanity k8s pod

createKubernetesPodConfig(..., "b100-ts2", ...) for DGX_B200 Perf-Sanity looks intentional but differs from values[0] ("b200-x8"). Verify the node label is correct.
/bot run --only-perf-sanity-test

PR_Github #19931 Bot args parsing error: usage: /bot [-h]

/bot run --only-perf-sanity-test

PR_Github #19944 Bot args parsing error: usage: /bot [-h]

099eb45 to 999a4c9
/bot run --stage-list "Perf-Sanity"

PR_Github #21172 [ run ] triggered by Bot

/bot run --stage-list "Perf-Sanity"

PR_Github #21179 [ run ] triggered by Bot

PR_Github #21172 [ run ] completed with state

PR_Github #21179 [ run ] completed with state
/bot run --stage-list "DGX_B200-4_GPUs-PyTorch-Perf-Sanity-Post-Merge-1"

PR_Github #21190 [ run ] triggered by Bot

PR_Github #21190 [ run ] completed with state

/bot run --stage-list "DGX_B200-4_GPUs-PyTorch-Perf-Sanity-Post-Merge-1"

PR_Github #21225 [ run ] triggered by Bot

PR_Github #21225 [ run ] completed with state

/bot run --stage-list "DGX_B200-4_GPUs-PyTorch-1,DGX_B200-4_GPUs-PyTorch-Perf-Sanity-Post-Merge-1"

PR_Github #21284 [ run ] triggered by Bot

PR_Github #21786 [ run ] completed with state
9fc867a to 7e0c59e

/bot run

PR_Github #21839 [ run ] triggered by Bot. Commit:

PR_Github #21839 [ run ] completed with state

/bot run

PR_Github #21869 [ run ] triggered by Bot. Commit:

PR_Github #21869 [ run ] completed with state

7e0c59e to 7083622

/bot run

PR_Github #21879 [ run ] triggered by Bot. Commit:

PR_Github #21879 [ run ] completed with state
chzblych
left a comment
Approved for the jenkins/L0_Test.groovy changes.
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
7083622 to cf15a7a
/bot run

PR_Github #21983 [ run ] triggered by Bot. Commit:

PR_Github #21983 [ run ] completed with state

/bot run

PR_Github #22009 [ run ] triggered by Bot. Commit:

PR_Github #22009 [ run ] completed with state
…d B300 (NVIDIA#7985) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Description
This PR adds a trtllm-serve + client perf test for TRT-LLM. It currently supports only R1 FP4 model perf tests on 4xB200 and 4xB300.
You can define the perf tests in a YAML file, e.g. ${LLM_ROOT}/tests/scripts/perf-sanity/l0_dgx_b300.yaml, and then launch the test with pytest. For l0_dgx_b300.yaml, the server name is r1_fp4_dep4 and the client name is con1_iter1_1024_1024.
Currently the perf test system doesn't support history perf data database.
You can also trigger the test by
/bot run --stage-list "DGX_B200-4_GPUs-PyTorch-Perf-Sanity-Post-Merge-1,DGX_B300-4_GPUs-PyTorch-Perf-Sanity-Post-Merge-1".