[feat] Support accurate device iter time #6906
📝 Walkthrough

Adds device-side per-iteration CUDA event timing to the executor’s profiling, alternating two event pairs to compute the previous iteration’s device time. Logging now includes host_step_time and prev_device_step_time. Also updates NVTX range labeling to 1-based iteration indexing. Existing profiling/trace mechanisms remain unchanged.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Executor as PyExecutor
    participant Prof as _profiler ctx
    participant CUDA as CUDA Events
    participant Log as Logger
    loop Each iteration (it)
        Executor->>Prof: Enter profiling iteration
        alt it % 2 == 0
            Executor->>CUDA: Record start_event_1 (create if needed)
        else
            Executor->>CUDA: Record start_event_2 (create if needed)
        end
        Executor->>Executor: _forward_step (NVTX range with it+1)
        alt it % 2 == 0
            Executor->>CUDA: Record end_event_1
            Executor->>CUDA: Sync start_event_2/end_event_2 → compute prev_device_step_time
        else
            Executor->>CUDA: Record end_event_2
            Executor->>CUDA: Sync start_event_1/end_event_1 → compute prev_device_step_time
        end
        Executor->>Executor: Compute host_step_time
        Executor->>Log: Log host_step_time and prev_device_step_time (N/A on first)
        Prof-->>Executor: Continue/exit profiling
    end
```
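The ping-pong bookkeeping in the diagram can be illustrated with a host-only mock — a sketch, not the PR’s actual code: `FakeEvent` and `run_iterations` are hypothetical stand-ins, with `time.perf_counter` replacing GPU timestamps. The point is the parity logic: the event pair that is idle this iteration still holds the previous iteration’s completed timing, so no device-wide sync is needed.

```python
import time


class FakeEvent:
    """Host-side stand-in for torch.cuda.Event(enable_timing=True)."""

    def __init__(self):
        self.timestamp = None

    def record(self):
        self.timestamp = time.perf_counter()

    def elapsed_time(self, end_event):
        # CUDA events report elapsed time in milliseconds; mirror that here.
        return (end_event.timestamp - self.timestamp) * 1000.0


def run_iterations(num_iters, work):
    # Two start/end pairs used alternately ("ping-pong"): the pair not used
    # this iteration holds the previous iteration's start/end timestamps.
    pairs = [(FakeEvent(), FakeEvent()), (FakeEvent(), FakeEvent())]
    timings = []
    prev_device_step_time = None  # logged as "N/A" on the first iteration
    for it in range(num_iters):
        start, end = pairs[it % 2]
        start.record()
        work(it)  # stands in for _forward_step
        end.record()
        if it > 0:
            # The other pair was completed in the previous iteration.
            prev_start, prev_end = pairs[(it + 1) % 2]
            prev_device_step_time = prev_start.elapsed_time(prev_end)
        timings.append(prev_device_step_time)
    return timings


timings = run_iterations(4, lambda it: time.sleep(0.01))
```

Note that each pair is only overwritten one full iteration after it was read, which is why the scheme remains valid even when work from adjacent iterations overlaps.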
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~15 minutes
Actionable comments posted: 1
🧹 Nitpick comments (3)
tensorrt_llm/_torch/pyexecutor/py_executor.py (3)
444-444: Use perf_counter and unify host/device units with consistent formatting
- perf_counter provides higher-precision monotonic timing than time.time for durations.
- Keep units consistent with device timings (ms) and avoid mutating prev_device_step_time’s type between float and str.
```diff
@@
-            end_time = time.time()
+            end_time = time.perf_counter()
@@
-            if prev_device_step_time is None:
-                prev_device_step_time = "N/A"  # Handle first iteration
-            else:
-                prev_device_step_time = f"{prev_device_step_time}ms"
+            if prev_device_step_time is None:
+                prev_device_step_time_str = "N/A"  # Handle first iteration
+            else:
+                prev_device_step_time_str = f"{prev_device_step_time:.3f}ms"
@@
-                f"host_step_time = {end_time - start_time}s, "
-                f"prev_device_step_time = {prev_device_step_time}, "
+                f"host_step_time = {((end_time - start_time) * 1e3):.3f}ms, "
+                f"prev_device_step_time = {prev_device_step_time_str}, "
@@
-            start_time = time.time()
+            start_time = time.perf_counter()
```

Also applies to: 456-460, 468-470, 483-483
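To illustrate why this comment prefers `time.perf_counter` over `time.time` for durations: `perf_counter` is monotonic and high-resolution, so measured intervals cannot go negative or jump when the wall clock is adjusted. A minimal sketch (the `measure_ms` helper is hypothetical, not from the PR):

```python
import time


def measure_ms(fn):
    """Time a callable using a monotonic, high-resolution clock."""
    start = time.perf_counter()
    fn()
    end = time.perf_counter()
    # Report milliseconds so host timings share units with device timings.
    return (end - start) * 1000.0


elapsed_ms = measure_ms(lambda: time.sleep(0.02))
```

With `time.time`, an NTP adjustment mid-measurement could distort the interval; `perf_counter` is immune to that by design.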
1428-1430: Fix long NVTX label line (Ruff E501) by splitting the f-string

This preserves the new 1-based labeling while satisfying the linter.
```diff
-    @nvtx_range(
-        f"[Executor] _forward_step {self.model_engine.iter_counter + 1}: {len(scheduled_requests.context_requests)} ctx reqs, {len(scheduled_requests.generation_requests)} gen reqs"
-    )
+    @nvtx_range(
+        f"[Executor] _forward_step {self.model_engine.iter_counter + 1}: "
+        f"{len(scheduled_requests.context_requests)} ctx reqs, "
+        f"{len(scheduled_requests.generation_requests)} gen reqs"
+    )
```
1-1: Add NVIDIA copyright header (current year) per guidelines

The file is missing the required NVIDIA header.

```diff
+# Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
```

If the repository standard also includes an SPDX identifier (e.g., Apache-2.0), please append it accordingly.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these settings in your CodeRabbit configuration.
📒 Files selected for processing (1)
tensorrt_llm/_torch/pyexecutor/py_executor.py (6 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
**/*.py: Python code must target Python 3.8+
Python indentation: 4 spaces, no tabs
Maintain module namespace in imports (from package.subpackage import foo; then use foo.SomeClass())
Python file names use snake_case
Python class names use PascalCase
Python functions/methods and local variables use snake_case; variables starting with a number get k_ prefix (e.g., k_99th_percentile)
Global variables use G_ prefixed UPPER_SNAKE_CASE (e.g., G_MY_GLOBAL)
Constants use UPPER_SNAKE_CASE in Python
Avoid shadowing variables from outer scopes in Python
Initialize all externally visible members of a Python class in init
Prefer docstrings for interfaces used outside a file; comments for local code
Use Google-style docstrings for classes and functions (Sphinx-parsable)
Document attributes/variables inline with short docstrings
Avoid reflection when simple alternatives exist (e.g., prefer explicit parameters over dict(**locals()))
In try/except, catch the narrowest exceptions possible
For duck-typing with try/except, keep try body minimal and put logic in else
Files:
tensorrt_llm/_torch/pyexecutor/py_executor.py
**/*.{cpp,cxx,cc,cu,h,hpp,hxx,hh,cuh,py}
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
Prepend NVIDIA copyright header (current year) to all source files
Files:
tensorrt_llm/_torch/pyexecutor/py_executor.py
🪛 Ruff (0.12.2)
tensorrt_llm/_torch/pyexecutor/py_executor.py
432-432: Line too long (123 > 120)
(E501)
1428-1428: Line too long (185 > 120)
(E501)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
🔇 Additional comments (1)
tensorrt_llm/_torch/pyexecutor/py_executor.py (1)
444-455: Alternating two event pairs to report the previous iteration’s device time is sound

The parity-based event pair scheme correctly avoids a device-wide sync while giving you the prior iter duration. Once recorded on the right stream, this will be both accurate and low overhead.
/bot run
Actionable comments posted: 0
♻️ Duplicate comments (3)
tensorrt_llm/_torch/pyexecutor/py_executor.py (3)
406-414: Record CUDA events on the actual compute stream to ensure accurate device timings (and fix comment grammar).

Recording without specifying a stream risks skewed device timings when compute runs on non-default streams (sampler/forward can switch streams). Track the stream at iteration start and record both start/end events on it. Also tweak the comment grammar.
Apply this diff:
```diff
-        # These events are used to record the time of the previous batch.
-        # We need two set of the start-end events to record the time through
-        # a ping-pong way so that it works with overlap scheduler.
+        # These events are used to record the time of the previous batch.
+        # We need two sets of start/end events to record the time in a ping-pong
+        # fashion so that it works with the overlap scheduler.
         start_event_1 = None
         end_event_1 = torch.cuda.Event(enable_timing=True)
         start_event_2 = None
         end_event_2 = torch.cuda.Event(enable_timing=True)
         prev_device_step_time = None
+        # Track the streams used to record timing events to avoid default-stream skew.
+        stream_1 = None
+        stream_2 = None
```
450-462: Record end events on the same stream as their corresponding start events.

Without passing the stream handle, Event.record() uses the current/default stream, which can under-report device time if compute ran on a different stream.
Apply this diff:
```diff
-            if it % 2 == 0:
-                end_event_1.record()
+            if it % 2 == 0:
+                end_event_1.record(stream_1)
                 if start_event_2 is not None:
                     end_event_2.synchronize()
                     prev_device_step_time = start_event_2.elapsed_time(
                         end_event_2)
             else:
-                end_event_2.record()
+                end_event_2.record(stream_2)
                 if start_event_1 is not None:
                     end_event_1.synchronize()
                     prev_device_step_time = start_event_1.elapsed_time(
                         end_event_1)
```
492-499: Record start events on the compute stream captured at iteration start.

Capture the current compute stream and record the start event on it; this pairs with the end-event change above.
Apply this diff:
```diff
             if it % 2 == 0:
                 if start_event_1 is None:
                     start_event_1 = torch.cuda.Event(enable_timing=True)
-                start_event_1.record()
+                stream_1 = torch.cuda.current_stream(self.device_id)
+                start_event_1.record(stream_1)
             else:
                 if start_event_2 is None:
                     start_event_2 = torch.cuda.Event(enable_timing=True)
-                start_event_2.record()
+                stream_2 = torch.cuda.current_stream(self.device_id)
+                start_event_2.record(stream_2)
```
🧹 Nitpick comments (4)
tensorrt_llm/_torch/pyexecutor/py_executor.py (4)
437-437: Split long nonlocal (Ruff E501) and add stream handles to scope.

This line exceeds 120 chars and is flagged by Ruff. Split it and add the newly introduced stream variables.
Apply this diff:
```diff
-            nonlocal it, enabled, start_time, start_event_1, end_event_1, start_event_2, end_event_2, prev_device_step_time
+            nonlocal it, enabled, start_time
+            nonlocal start_event_1, end_event_1, start_event_2, end_event_2
+            nonlocal prev_device_step_time, stream_1, stream_2
```
463-467: Format timing values consistently and avoid mixing types.

Right now prev_device_step_time gets turned into a string (or “N/A”), while host_step_time remains a float. Round consistently and keep explicit `_ms` variable names for clarity.
Apply this diff:
```diff
-            if prev_device_step_time is None:
-                prev_device_step_time = "N/A"  # Handle first iteration
-            else:
-                prev_device_step_time = f"{prev_device_step_time}ms"
-            host_step_time = (end_time - start_time) * 1000  # milliseconds
+            if prev_device_step_time is None:
+                prev_device_step_time_str = "N/A"  # Handle first iteration
+            else:
+                prev_device_step_time_str = f"{prev_device_step_time:.3f}ms"
+            host_step_time_ms = (end_time - start_time) * 1000.0  # milliseconds
```
476-477: Align log field names with units and use the formatted variables.

Follow-up to the previous change: use host_step_time_ms and prev_device_step_time_str, and format host time to 3 decimals.
Apply this diff:
```diff
-                f"host_step_time = {host_step_time}ms, "
-                f"prev_device_step_time = {prev_device_step_time}, "
+                f"host_step_time = {host_step_time_ms:.3f}ms, "
+                f"prev_device_step_time = {prev_device_step_time_str}, "
```
1436-1436: Fix long NVTX label (Ruff E501) and keep logging consistent with 1-based indexing.

Split the string to satisfy E501. Also, since the NVTX label moved to 1-based indexing, consider aligning the logged iteration field to avoid confusion.
Apply this diff:
```diff
-    @nvtx_range(
-        f"[Executor] _forward_step {self.model_engine.iter_counter + 1}: {len(scheduled_requests.context_requests)} ctx reqs, {len(scheduled_requests.generation_requests)} gen reqs"
-    )
+    @nvtx_range(
+        f"[Executor] _forward_step {self.model_engine.iter_counter + 1}: "
+        f"{len(scheduled_requests.context_requests)} ctx reqs, "
+        f"{len(scheduled_requests.generation_requests)} gen reqs"
+    )
```

Additionally (outside this hunk), update the logged iteration to match 1-based indexing:
```python
# Suggested change near the logger.info(...) call:
# f"iter = {self.model_engine.iter_counter}, "
f"iter = {self.model_engine.iter_counter + 1}, "
```
🪛 Ruff (0.12.2)
tensorrt_llm/_torch/pyexecutor/py_executor.py
437-437: Line too long (123 > 120)
(E501)
1436-1436: Line too long (185 > 120)
(E501)
PR_Github #15317 [ run ] triggered by Bot

PR_Github #15317 [ run ] completed with state
QiJune
left a comment
LGTM
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Force-pushed from b7f1acb to 67c7540
/bot run --disable-fail-fast
Actionable comments posted: 0
♻️ Duplicate comments (2)
tensorrt_llm/_torch/pyexecutor/py_executor.py (2)
451-463: Ensure CUDA events are recorded on the correct stream

The current implementation records CUDA events without specifying a stream, which defaults to the current stream. However, if forward/sampler operations use different streams, the timing measurements may be inaccurate. Based on code inspection, sampler.py obtains the current stream and other components may create/use non-default streams.

Capture the actual compute stream at iteration start and record events on that stream:
```diff
+        # Track the streams used to record timing events
+        stream_1 = None
+        stream_2 = None
         def profile_step():
-            nonlocal it, enabled, start_time, start_event_1, end_event_1, start_event_2, end_event_2, prev_device_step_time
+            nonlocal it, enabled, start_time
+            nonlocal start_event_1, end_event_1, start_event_2, end_event_2
+            nonlocal prev_device_step_time, stream_1, stream_2
             # ... existing code ...
             if start_time is not None and self.print_log and self.dist.rank == 0:
                 end_time = time.time()
                 if it % 2 == 0:
-                    end_event_1.record()
+                    # Record the end event on the same stream as its start
+                    end_event_1.record(stream_1)
                     if start_event_2 is not None:
                         end_event_2.synchronize()
                         prev_device_step_time = start_event_2.elapsed_time(
                             end_event_2)
                 else:
-                    end_event_2.record()
+                    end_event_2.record(stream_2)
                     if start_event_1 is not None:
                         end_event_1.synchronize()
                         prev_device_step_time = start_event_1.elapsed_time(
                             end_event_1)
```
493-500: Record start events on the correct stream

Similar to the end event recording, start events should be recorded on the actual compute stream to ensure accurate timing measurements.
```diff
             if it % 2 == 0:
                 if start_event_1 is None:
                     start_event_1 = torch.cuda.Event(enable_timing=True)
-                start_event_1.record()
+                stream_1 = torch.cuda.current_stream(self.device_id)
+                start_event_1.record(stream_1)
             else:
                 if start_event_2 is None:
                     start_event_2 = torch.cuda.Event(enable_timing=True)
-                start_event_2.record()
+                stream_2 = torch.cuda.current_stream(self.device_id)
+                start_event_2.record(stream_2)
```
🧹 Nitpick comments (1)
tensorrt_llm/_torch/pyexecutor/py_executor.py (1)
438-438: Consider splitting the long line for better readability

Line 438 exceeds the 120-character limit. Consider splitting the `nonlocal` declaration across multiple lines.

```diff
-            nonlocal it, enabled, start_time, start_event_1, end_event_1, start_event_2, end_event_2, prev_device_step_time
+            nonlocal it, enabled, start_time
+            nonlocal start_event_1, end_event_1, start_event_2, end_event_2
+            nonlocal prev_device_step_time
```
🪛 Ruff (0.12.2)
tensorrt_llm/_torch/pyexecutor/py_executor.py
438-438: Line too long (123 > 120)
(E501)
1437-1437: Line too long (185 > 120)
(E501)
🔇 Additional comments (4)
tensorrt_llm/_torch/pyexecutor/py_executor.py (4)
406-415: Add device-side timing measurement using CUDA events

The implementation correctly introduces CUDA event-based timing to measure device-side execution time, complementing the existing host-side timing. The ping-pong approach with two event pairs allows proper measurement even with overlapping operations.
464-468: Good handling of the first iteration case

The code correctly handles the first iteration where no previous timing is available by setting prev_device_step_time to "N/A", preventing any display issues in the logs.
469-481: Clear and informative profiling output format

The updated log format clearly distinguishes between host and device timing, making it easier to identify performance bottlenecks. The inclusion of both host_step_time and prev_device_step_time provides valuable insights for performance analysis.
1437-1437: Update NVTX range label to use 1-based iteration indexing

The change from self.model_engine.iter_counter to self.model_engine.iter_counter + 1 provides clearer 1-based iteration numbering in profiling output, which is more intuitive for users. The line exceeds 120 characters but is acceptable as it maintains readability of the complex string formatting.
PR_Github #15385 [ run ] triggered by Bot

PR_Github #15385 [ run ] completed with state

/bot run

PR_Github #15524 [ run ] triggered by Bot

PR_Github #15524 [ run ] completed with state

/bot run

PR_Github #15559 [ run ] triggered by Bot

PR_Github #15559 [ run ] completed with state
Summary by CodeRabbit
New Features
Refactor
Description
Test Coverage
GitHub Bot Help
`/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...`

Provide a user friendly way for developers to interact with a Jenkins server.
Run `/bot [-h|--help]` to print this help message.

See details below for each supported subcommand.
Details
`run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental)]`

Launch build/test pipelines. All previously running jobs will be killed.
- `--reuse-test (optional)pipeline-id` (OPTIONAL): Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline, or the last pipeline if no pipeline-id is indicated. If the Git commit ID has changed, this option will always be ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.
- `--disable-reuse-test` (OPTIONAL): Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensures that all builds and tests are run regardless of previous successes.
- `--disable-fail-fast` (OPTIONAL): Disable fail fast on build/tests/infra failures.
- `--skip-test` (OPTIONAL): Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.
- `--stage-list "A10-PyTorch-1, xxx"` (OPTIONAL): Only run the specified test stages. Examples: "A10-PyTorch-1, xxx". Note: Does NOT update GitHub check status.
- `--gpu-type "A30, H100_PCIe"` (OPTIONAL): Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.
- `--test-backend "pytorch, cpp"` (OPTIONAL): Skip test stages which don't match the specified backends. Only supports [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.
- `--only-multi-gpu-test` (OPTIONAL): Only run the multi-GPU tests. Note: Does NOT update GitHub check status.
- `--disable-multi-gpu-test` (OPTIONAL): Disable the multi-GPU tests. Note: Does NOT update GitHub check status.
- `--add-multi-gpu-test` (OPTIONAL): Force run the multi-GPU tests in addition to running the L0 pre-merge pipeline.
- `--post-merge` (OPTIONAL): Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.
- `--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx"` (OPTIONAL): Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx".
- `--detailed-log` (OPTIONAL): Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.
- `--debug` (OPTIONAL): Experimental feature. Enable access to the CI container for debugging purposes. Note: Specify exactly one stage in the `stage-list` parameter to access the appropriate container environment. Note: Does NOT update GitHub check status.

For guidance on mapping tests to stage names, see `docs/source/reference/ci-overview.md` and the `scripts/test_to_stage_mapping.py` helper.

kill
`kill`

Kill all running builds associated with pull request.
skip
`skip --comment COMMENT`

Skip testing for latest commit on pull request.
`--comment "Reason for skipping build/test"` is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline
`reuse-pipeline`

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.