[pytorch][torchelastic] Duplicate stdout and stderr and apply custom filter in torchrun by cnphil · Pull Request #160712 · pytorch/pytorch

cnphil · 2025-08-15T04:50:59Z

Summary:
Part of an effort to extract some important error logs (e.g. #157996) that were tee'ed to stdout and stderr.

The general idea is to:

Duplicate the tees on stdout and stderr to separate files, filtered_stdout.log and filtered_stderr.log, respectively.
As their names suggest, these files contain only the log lines that match a customizable filter.
Later on, in another PR, append the contents of these files to the reply file.
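
The filtering step described above can be pictured with a minimal Python sketch. This is not the code from this PR: the helper name `copy_matching_lines` is hypothetical, the sample log lines are invented, and it assumes a filter is a plain substring (the PR only says the filter is customizable).

```python
import io
from typing import Iterable, TextIO


def copy_matching_lines(src: Iterable[str], dst: TextIO, filters: Iterable[str]) -> int:
    """Copy to dst every line of src that contains at least one filter substring.

    Hypothetical helper for illustration only; substring matching is an assumption.
    Returns the number of lines written.
    """
    wanted = list(filters)
    written = 0
    for line in src:
        if any(f in line for f in wanted):
            dst.write(line)
            written += 1
    return written


# Usage: keep only the lines that mention an error (sample lines are made up).
stdout_lines = ["step 1 ok\n", "RuntimeError: something failed\n", "step 2 ok\n"]
filtered = io.StringIO()
count = copy_matching_lines(stdout_lines, filtered, ["Error"])
```

Here `filtered` stands in for a file such as filtered_stdout.log; the unfiltered stream is left untouched because the helper only reads from it.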

Outline of changes in this PR:

Enhance TailLog to be able to 1) stream to a file, and 2) only write when the line matches the passed filter.
Add filtered_stdout and filtered_stderr to LogsDest and have LogsSpecs reify them.
In start_processes() and PContext, add params duplicate_stdout_filters and duplicate_stderr_filters to filter and write the duplicated stream to the files above. When no filters are passed in, no duplicated streams are created.
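
The last point, that no duplicated stream is created when no filters are passed, can be sketched as follows. This is an illustrative stand-in, not the `start_processes()` or `PContext` code: the function `duplicate_stream` and its parameters are hypothetical, and substring matching is again an assumption.

```python
import os
import tempfile
from typing import Iterable, List, Optional


def duplicate_stream(
    log_dir: str,
    file_name: str,
    lines: Iterable[str],
    filters: Optional[List[str]],
) -> Optional[str]:
    """Write matching lines to log_dir/file_name and return the path.

    Hypothetical sketch: when filters is None or empty, nothing is created
    and None is returned, mirroring the "no filters, no duplicate" rule.
    """
    if not filters:
        return None
    path = os.path.join(log_dir, file_name)
    with open(path, "w") as dst:
        for line in lines:
            if any(f in line for f in filters):
                dst.write(line)
    return path


log_dir = tempfile.mkdtemp()
lines = ["warning: slow step\n", "Traceback (most recent call last):\n"]

# No filters: no file is created.
skipped = duplicate_stream(log_dir, "filtered_stdout.log", lines, None)
# One filter: only the matching line is written.
written = duplicate_stream(log_dir, "filtered_stderr.log", lines, ["Traceback"])
```

Returning `None` rather than creating an empty file keeps the log directory unchanged for callers that never opt in to filtering.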

Test Plan:

$ buck test 'fbcode//mode/opt' caffe2/test/distributed/elastic/multiprocessing:api_test

Buck UI: https://www.internalfb.com/buck2/f5c6b7da-217d-4a0b-872a-c7cd3d05587f
Test UI: https://www.internalfb.com/intern/testinfra/testrun/4222124951617688
Network: Up: 398B  Down: 44MiB  (reSessionID-a489a961-b602-45be-b851-3490ebb7a26a)
Analyzing targets. Remaining 0/200
Executing actions. Remaining 0/12856 0.1s exec time total
Command: test.     Finished 1 local
Time elapsed: 17:37.9s
Tests finished: Pass 52. Fail 0. Fatal 0. Skip 0. Build failure 0

$ buck test 'fbcode//mode/opt' caffe2/test/distributed/elastic/multiprocessing:tail_log_test

Buck UI: https://www.internalfb.com/buck2/d6d5c1c1-db98-4d9c-b608-7ba6fbb5e3ee
Test UI: https://www.internalfb.com/intern/testinfra/testrun/13510798985149262
Network: Up: 94KiB  Down: 417MiB  (reSessionID-27b46fba-d31c-4c04-8ede-a506454e6922)
Analyzing targets. Remaining 0/3 536 actions, 555 artifacts declared
Executing actions. Remaining 0/186 1:05.5s exec time total
Command: test. Finished 7 local, 1 remote, 115 cache (93% hit) 37.0s exec time cached (56%)
Time elapsed: 1:11.5s
Tests finished: Pass 7. Fail 0. Fatal 0. Skip 0. Build failure 0

Rollback Plan:

Differential Revision: D80188995

cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @pragupta @msaroufim @dcci

pytorch-bot · 2025-08-15T04:51:03Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/160712

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 280aefc with merge base 05b2e02:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-08-15T04:51:07Z

This pull request was exported from Phabricator. Differential Revision: D80188995

github-actions · 2025-10-14T17:34:52Z

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

meta-codesync · 2025-10-21T05:57:29Z

@cnphil has exported this pull request. If you are a Meta employee, you can view the originating Diff in D80188995.

cnphil · 2025-10-21T05:58:22Z

@fduwjj Exported new changes from Phabricator, PTAL :)

…filter in torchrun (pytorch#160712)

Summary:
Part of an effort to extract some important error logs (e.g. [pytorch#157996](pytorch#157996)) that were `tee`'ed to `stdout` and `stderr`.

The general idea is to:
- Duplicate the `tee`s on `stdout` and `stderr` to separate files, `filtered_stdout.log` and `filtered_stderr.log`, respectively.
- As their names suggest, these files contain only the log lines that match a customizable filter.
- Later on, in another PR, append the contents of these files to the reply file.

Outline of changes in this PR:
- Enhance `TailLog` to be able to 1) stream to a file, and 2) only write when the line matches the passed filter.
- Add `filtered_stdout` and `filtered_stderr` to `LogsDest` and have `LogsSpecs` `reify` them.
- In `start_processes()` and `PContext`, add params `duplicate_stdout_filters` and `duplicate_stderr_filters` to filter and write the duplicated stream to the files above. When no filters are passed in, no duplicated streams are created.

Test Plan:
```
$ buck test 'fbcode//mode/opt' caffe2/test/distributed/elastic/multiprocessing:api_test
```
```
Buck UI: https://www.internalfb.com/buck2/f5c6b7da-217d-4a0b-872a-c7cd3d05587f
Test UI: https://www.internalfb.com/intern/testinfra/testrun/4222124951617688
Network: Up: 398B Down: 44MiB (reSessionID-a489a961-b602-45be-b851-3490ebb7a26a)
Analyzing targets. Remaining 0/200
Executing actions. Remaining 0/12856 0.1s exec time total
Command: test. Finished 1 local
Time elapsed: 17:37.9s
Tests finished: Pass 52. Fail 0. Fatal 0. Skip 0. Build failure 0
```
```
$ buck test 'fbcode//mode/opt' caffe2/test/distributed/elastic/multiprocessing:tail_log_test
```
```
Buck UI: https://www.internalfb.com/buck2/d6d5c1c1-db98-4d9c-b608-7ba6fbb5e3ee
Test UI: https://www.internalfb.com/intern/testinfra/testrun/13510798985149262
Network: Up: 94KiB Down: 417MiB (reSessionID-27b46fba-d31c-4c04-8ede-a506454e6922)
Analyzing targets. Remaining 0/3 536 actions, 555 artifacts declared
Executing actions. Remaining 0/186 1:05.5s exec time total
Command: test. Finished 7 local, 1 remote, 115 cache (93% hit) 37.0s exec time cached (56%)
Time elapsed: 1:11.5s
Tests finished: Pass 7. Fail 0. Fatal 0. Skip 0. Build failure 0
```
```
$ buck test 'fbcode//mode/opt' caffe2/test/distributed/elastic/agent/server/test:api_test
```
```
Buck UI: https://www.internalfb.com/buck2/34f426fd-25a0-4cf5-8da3-2f3d84767d1e
Test UI: https://www.internalfb.com/intern/testinfra/testrun/14918173871977118
Network: Up: 1.0MiB Down: 2.9GiB (reSessionID-048daa50-9ad4-4826-886f-08cec54c7d72)
Analyzing targets. Remaining 0/5 533 actions, 552 artifacts declared
Executing actions. Remaining 0/176 1:22.7s exec time total
Command: test. Finished 51 local, 13 remote, 50 cache (44% hit) 19.8s exec time cached (23%)
Time elapsed: 1:45.2s
Tests finished: Pass 31. Fail 0. Fatal 0. Skip 0. Build failure 0
```
```
$ buck test 'fbcode//mode/opt' caffe2/test/distributed/elastic/agent/server/test:local_agent_test [DISABLED]
```
```
$ buck test 'fbcode//mode/opt' caffe2/test/distributed/elastic/agent/server/test/fb:local_agent_fb_internal_test [DISABLED]
```
```
$ buck test 'fbcode//mode/opt' caffe2/test/distributed/launcher:api_test
```
```
$ buck test 'fbcode//mode/opt' caffe2/test/distributed/launcher:launch_test
```
```
$ buck test 'fbcode//mode/opt' caffe2/test/distributed/launcher:test_run
```
```
$ buck test 'fbcode//mode/opt' caffe2/test/distributed/launcher/fb:api_test
```
```
$ buck test 'fbcode//mode/opt' caffe2/test/distributed/launcher/fb:local_launch_mast_test
```
```
$ buck test 'fbcode//mode/opt' caffe2/test/distributed/launcher/fb:fb_run_test
```
```
$ buck test 'fbcode//mode/opt' caffe2/test/distributed/launcher/fb:launch_test
```

Reviewed By: mradmila

Differential Revision: D80188995


fduwjj

LGTM


cnphil · 2025-10-23T14:14:11Z

@pytorchbot merge

facebook-github-bot · 2025-10-23T14:14:40Z

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

pytorchmergebot · 2025-10-23T14:16:38Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

pytorch-bot bot added the oncall: distributed (Add this issue/PR to distributed oncall triage queue) and release notes: distributed (torchelastic) labels Aug 15, 2025

facebook-github-bot added the fb-exported label Aug 15, 2025

cnphil requested review from d4l3k, fduwjj and kiukchung August 15, 2025 04:52

github-actions bot added the Stale label Oct 14, 2025

cnphil force-pushed the export-D80188995 branch from ff7ebc4 to 591b883 Compare October 21, 2025 05:57

meta-codesync bot added the meta-exported label Oct 21, 2025

cnphil removed the Stale label Oct 21, 2025

cnphil force-pushed the export-D80188995 branch from 591b883 to 1cada2c Compare October 21, 2025 06:30

cnphil force-pushed the export-D80188995 branch 2 times, most recently from 4440f65 to 9d89360 Compare October 21, 2025 16:56

cnphil force-pushed the export-D80188995 branch 2 times, most recently from 99841b8 to f814794 Compare October 21, 2025 17:52

cnphil force-pushed the export-D80188995 branch 2 times, most recently from 9fa3b52 to b997417 Compare October 21, 2025 18:53

cnphil force-pushed the export-D80188995 branch from b997417 to 5cc9908 Compare October 21, 2025 19:20

cnphil force-pushed the export-D80188995 branch from 5cc9908 to 1ff2dc8 Compare October 21, 2025 20:10

cnphil force-pushed the export-D80188995 branch from 1ff2dc8 to 07e9f9f Compare October 21, 2025 20:14

cnphil force-pushed the export-D80188995 branch from 07e9f9f to 57781d9 Compare October 21, 2025 20:14

cnphil force-pushed the export-D80188995 branch from 57781d9 to 5a6f7c7 Compare October 21, 2025 20:19

fduwjj added the ciflow/trunk (Trigger trunk jobs on your pull request) label Oct 21, 2025

cnphil force-pushed the export-D80188995 branch from 5a6f7c7 to ee434e6 Compare October 21, 2025 21:42

cnphil force-pushed the export-D80188995 branch from ee434e6 to 29e037e Compare October 21, 2025 22:26

fduwjj approved these changes Oct 21, 2025


cnphil force-pushed the export-D80188995 branch from 29e037e to 90f21f9 Compare October 22, 2025 15:50

cnphil force-pushed the export-D80188995 branch from 90f21f9 to 2c7c858 Compare October 22, 2025 15:50

cnphil force-pushed the export-D80188995 branch from 2c7c858 to 280aefc Compare October 22, 2025 18:12

pytorchmergebot added the merging label Oct 23, 2025

pytorchmergebot closed this in cbcb4f7 Oct 23, 2025

pytorchmergebot added Merged and removed merging labels Oct 23, 2025

cnphil wants to merge 1 commit into pytorch:main from cnphil:export-D80188995
