Commit 9af59ca
[v1.7 patch] Disallow creation of ProcessGroupNCCL without GPUs. (#45642)
Summary:
Note: This PR has been merged into master at b5a2f04 after the 1.7 branch cut
(see original PR: #45642). This PR is to merge it into the 1.7 branch.
---- Original Commit Description Follows ---
Pull Request resolved: #45642
Prior to #45181, initializing an
NCCL process group would work even if no GPUs were present. Now that
init_process_group calls `barrier()`, that initialization fails.
More generally, the problem was that we could initialize ProcessGroupNCCL without
GPUs; calling a method like `barrier()` afterwards would crash the process,
since we compute a modulo by numGPUs, which is a division by zero when no GPUs are present.
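
As a rough illustration of the failure mode and of the fail-fast approach described above, here is a minimal Python sketch. It is not the actual ProcessGroupNCCL code: the class `FakeNCCLGroup`, its `barrier_device` method, and the error message are hypothetical names used only for this example.

```python
class FakeNCCLGroup:
    """Hypothetical stand-in for a GPU-backed process group (not the real ProcessGroupNCCL)."""

    def __init__(self, rank: int, num_gpus: int) -> None:
        # Fail fast at construction time instead of crashing later in barrier().
        if num_gpus <= 0:
            raise RuntimeError("cannot create an NCCL-style process group without GPUs")
        self.rank = rank
        self.num_gpus = num_gpus

    def barrier_device(self) -> int:
        # Picks a device index by modulo; with num_gpus == 0 this would
        # raise ZeroDivisionError in Python (and crash the process in C++).
        return self.rank % self.num_gpus


if __name__ == "__main__":
    group = FakeNCCLGroup(rank=5, num_gpus=4)
    print(group.barrier_device())
    try:
        FakeNCCLGroup(rank=0, num_gpus=0)
    except RuntimeError as err:
        print(f"rejected: {err}")
```

The design choice is to reject the invalid configuration where the group is created, so callers get a clear error rather than an arithmetic fault in a later collective call.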
ghstack-source-id: 113490343
Test Plan: waitforbuildbot
Reviewed By: osalpekar
Differential Revision: D24038839
fbshipit-source-id: a1f1db52cabcfb83e06c1a11ae9744afbf03f8dc
1 parent 653d766 commit 9af59ca
File tree
3 files changed, +52 -7 lines changed
- test/distributed
- torch
  - lib/c10d
  - testing/_internal