wayi1 commented Mar 31, 2022

Previously, the highest-level process group in `period_process_group_dict` could be `None`, indicating the global group. Now `period_process_group_dict` can no longer contain `None` as a process group, so `_find_process_group` can return a process group directly instead of a tuple. When no group is found it returns `None`, which is unambiguous because a valid process group can no longer be `None`.

Proposal: #71325
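
To illustrate the change in return shape described above, here is a minimal sketch. It is not the PyTorch source: the function names `find_process_group_old` and `find_process_group_new`, the string stand-ins for process groups, and the lookup rule (pick the largest period that divides the current step) are all assumptions made for the example.

```python
from typing import Dict, Optional, Tuple

# Hypothetical stand-in: plain strings play the role of process group handles.
Group = str


def find_process_group_old(
    period_group_dict: Dict[int, Optional[Group]], step: int
) -> Tuple[bool, Optional[Group]]:
    """Old shape: None may be a valid value (the global group), so a
    separate flag is needed to report whether anything was found."""
    # Assumed lookup rule: prefer the largest period that divides the step.
    for period in sorted(period_group_dict, reverse=True):
        if step % period == 0:
            return True, period_group_dict[period]
    return False, None


def find_process_group_new(
    period_group_dict: Dict[int, Group], step: int
) -> Optional[Group]:
    """New shape: no value in the dict is None, so None alone means 'not found'."""
    for period in sorted(period_group_dict, reverse=True):
        if step % period == 0:
            return period_group_dict[period]
    return None


old_groups = {2: "intra_node", 8: None}     # None stood for the global group
new_groups = {2: "intra_node", 8: "world"}  # every level is an explicit group

found_at_8 = find_process_group_old(old_groups, 8)
missed_at_3 = find_process_group_old(old_groups, 3)
new_at_8 = find_process_group_new(new_groups, 8)
new_at_3 = find_process_group_new(new_groups, 3)
```

In the old shape, a hit on the global group and a miss both carry `None` as the group, and only the boolean flag tells them apart. In the new shape, the caller needs just one `is None` check.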

facebook-github-bot commented Mar 31, 2022

💊 CI failures summary and remediations

As of commit 03447c3 (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages

See GitHub Actions build pull / linux-bionic-py3.7-clang9 / test (default, 2, 2, linux.2xlarge) (1/1)

Step: "Test"

2022-04-01T05:51:01.9230438Z ----------------------------------------------------------------------
2022-04-01T05:51:01.9230988Z Ran 2 tests in 0.066s
2022-04-01T05:51:01.9231197Z 
2022-04-01T05:51:01.9231290Z OK
2022-04-01T05:51:01.9231386Z 
2022-04-01T05:51:01.9231483Z Generating XML reports...
2022-04-01T05:51:01.9260577Z Generated XML report: test-reports/python-unittest/test_custom_backend/TEST-TestCustomBackend-20220401055101.xml
2022-04-01T05:51:02.0233058Z + python backend.py --export-module-to=model.pt
2022-04-01T05:51:02.1943252Z OMP: Error #15: Initializing libiomp5.so, but found unknown library already initialized.
2022-04-01T05:51:02.1944371Z OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
2022-04-01T05:51:02.2186690Z .jenkins/pytorch/test.sh: line 366:  6441 Aborted                 (core dumped) python backend.py --export-module-to=model.pt
2022-04-01T05:51:02.2187059Z + cleanup
2022-04-01T05:51:02.2187210Z + retcode=134
2022-04-01T05:51:02.2187557Z + set +x
2022-04-01T05:51:02.2228359Z ##[error]Process completed with exit code 134.
2022-04-01T05:51:02.2253616Z Prepare all required actions
2022-04-01T05:51:02.2278085Z ##[group]Run ./.github/actions/chown-workspace
2022-04-01T05:51:02.2278556Z env:
2022-04-01T05:51:02.2278714Z   IN_CI: 1
2022-04-01T05:51:02.2278863Z   IS_GHA: 1
2022-04-01T05:51:02.2279042Z   GIT_DEFAULT_BRANCH: master

This comment was automatically generated by Dr. CI.

facebook-github-bot added the 'oncall: distributed' label Mar 31, 2022
dagitses added the 'triaged' label Apr 1, 2022
rohan-varma left a comment

LGTM, thanks!

@facebook-github-bot

@rohan-varma has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@rohan-varma

The CI error is unrelated to this PR: OMP: Error #15: Initializing libiomp5.so, but found unknown library already initialized.

facebook-github-bot pushed a commit that referenced this pull request Apr 4, 2022
#75007)

Summary:
Previously the highest-level process group in `period_process_group_dict` could be `None`, indicating the global group. Now `period_process_group_dict` cannot contain `None` as a process group, so the function `_find_process_group` can just return a process group instead of a tuple -- when not found, just return `None`, because now the returned process group cannot be `None`.

Proposal: #71325

Pull Request resolved: #75007

Reviewed By: awgu

Differential Revision: D35357816

Pulled By: rohan-varma

fbshipit-source-id: 4522dba49797df7140227bfd822d668b7e118a66

github-actions bot commented Apr 4, 2022

Hey @wayi1.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.
