Contributor

@XilunWu XilunWu commented Jan 6, 2023

pytorch-bot bot commented Jan 6, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/91801

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 387458b:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Contributor Author

XilunWu commented Jan 6, 2023

Two meshes are created over the world in test_creat_1d_device_mesh. This PR needs further thought.

Collaborator

@wanchaol wanchaol left a comment

Let's not always run this check; run it only when we initialize world_pg.

)

# TODO: we will support mesh on a subset of WORLD in future
if self.mesh.numel() < world_size:
Collaborator

This check should not happen all the time. It should only happen when no default pg exists and we want to help the user create a world_pg, so I think this check should live only inside get_or_create_group.
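A minimal sketch of this suggestion, assuming a simplified model rather than the actual DeviceMesh code: `MeshRegistry`, its `world_pg` attribute, and the plain rank lists are hypothetical stand-ins, and no torch.distributed call is made. It only illustrates running the size check on the path where the world group is created on the user's behalf.

```python
from typing import List, Optional


class MeshRegistry:
    """Hypothetical stand-in for process-group state; not a PyTorch API."""

    def __init__(self, world_size: int) -> None:
        self.world_size = world_size
        # None models "no default process group has been initialized yet".
        self.world_pg: Optional[List[int]] = None

    def get_or_create_group(self, mesh_ranks: List[int]) -> List[int]:
        """Return the world group, creating and validating it only if missing."""
        if self.world_pg is None:
            # The size check runs only on this path, where the world group
            # is created on the user's behalf.
            if len(mesh_ranks) < self.world_size:
                raise RuntimeError(
                    "mesh must cover every rank when the world group is auto-created"
                )
            self.world_pg = list(range(self.world_size))
        # If a default group already exists, a mesh over a subset of ranks passes.
        return self.world_pg
```

With this shape, a mesh over ranks 0 and 1 in a 4-rank world is rejected only when no default group exists; once the user has set one up, the same mesh is accepted.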

Contributor Author

Makes sense, because IIRC we can define, for example, mesh A on ranks 0, 1 and mesh B on ranks 2, 3. The example we discussed last time was actually about a mesh defined on ranks 0, 1 with no mesh defined on ranks 2, 3. Right?

Collaborator

Yeah, it's possible to create sub-meshes; that's what 2-D currently does, so we should still allow such behavior.
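The two layouts discussed in this thread can be sketched in plain Python. This is only an illustration under assumptions: the `mesh_for_rank` helper and the dictionary representation are hypothetical and are not part of DeviceMesh.

```python
from typing import Dict, List, Optional


def mesh_for_rank(rank: int, meshes: Dict[str, List[int]]) -> Optional[str]:
    """Return the name of the mesh containing `rank`, or None if no mesh covers it."""
    for name, ranks in meshes.items():
        if rank in ranks:
            return name
    return None


# Case 1: two disjoint sub-meshes that together cover a 4-rank world.
two_meshes = {"A": [0, 1], "B": [2, 3]}

# Case 2: one mesh on ranks 0 and 1; ranks 2 and 3 belong to no mesh.
one_mesh = {"A": [0, 1]}
```

In both cases each individual mesh has fewer elements than the world size, which is why an unconditional size check would reject layouts that should remain valid.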

Collaborator

@wanchaol wanchaol left a comment

lgtm

Contributor Author

XilunWu commented Jan 12, 2023

@pytorchmergebot merge -g

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jan 12, 2023
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks on your PR pass since you used the green (-g) flag (ETA: 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@XilunWu XilunWu deleted the gh/XilunWu/9/head branch April 11, 2023 21:40
Labels

ciflow/trunk (Trigger trunk jobs on your pull request), Merged, topic: not user facing (topic category)
