[PP] Add optional argument to not save outputs#165822
[PP] Add optional argument to not save outputs#165822H-Huang wants to merge 4 commits intogh/H-Huang/227/basefrom
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/165822
Note: Links to docs will display an error until the docs builds have been completed. ❗ 1 Active SEVsThere are 1 currently active SEVs. If your PR is affected, please view them below: ✅ No FailuresAs of commit c03169d with merge base fe80f03 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
| if self.rank == self.world_size - 1: | ||
| output = schedule.step(target=target, losses=losses, return_outputs=False) | ||
| else: | ||
| schedule.step(x) |
There was a problem hiding this comment.
technically output is None for this line too, right?
There was a problem hiding this comment.
yep that is also None (every rank without the last stage returns None)
wconstab
left a comment
There was a problem hiding this comment.
looks good to me.
testing that we actually do not save stuff seems (instead of just that we do not return stuff) seems safer in a paranoid sense, if you can think of a way to do that.
Yeah agree there, maybe there can be a way to verify the memory usage doesn't increase significantly when increasing # of microbatches. I will think about it |
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Fix pytorch#159251 Add an optional argument `return_outputs` to the schedule `step` Pull Request resolved: pytorch#165822 Approved by: https://github.com/wconstab
Fix pytorch#159251 Add an optional argument `return_outputs` to the schedule `step` Pull Request resolved: pytorch#165822 Approved by: https://github.com/wconstab
Uses the API added in pytorch/pytorch#165822, since we do not return any output from PP step(). This allows us to release the memory earlier,
Uses the API added in pytorch/pytorch#165822, since we do not return any output from PP step(). This allows us to release the memory earlier,
Stack from ghstack (oldest at bottom):
Fix #159251
Add an optional argument
return_outputsto the schedulestepcc @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @pragupta @msaroufim @dcci