Skip to content

Conversation

@antoniojkim
Copy link
Collaborator

@antoniojkim antoniojkim commented Oct 3, 2022

As described in the issue, this PR adds hooks to be run when register_parameter, register_buffer and register_module are called.

Fixes #85837

cc @albanD @mruberry @jbschlosser @walterddr @kshitij12345 @saketh-are

@pytorch-bot
Copy link

pytorch-bot bot commented Oct 3, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/86148

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 41607f6:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@albanD
Copy link
Collaborator

albanD commented Oct 3, 2022

Hi @antoniojkim thanks for sending a PR for this!

  • we review changes to nn.Module very carefully and in particular changing the direct setting of _modules to a register_module is something we should make sure is safe.
  • could you split the formatting change in a separate PR? That would make it a lot easier to review this one.

@antoniojkim
Copy link
Collaborator Author

* could you split the formatting change in a separate PR? That would make it a lot easier to review this one.

Would this PR pass the linting check in CI without the proper formatting?

@albanD
Copy link
Collaborator

albanD commented Oct 3, 2022

Would this PR pass the linting check in CI without the proper formatting?

No. But we can merge the formatting PR quickly and so it will disappear from this PR.

@antoniojkim antoniojkim force-pushed the antoniojkim/new_registration_hooks branch from 2dd2179 to 546e93d Compare October 3, 2022 19:50
@antoniojkim
Copy link
Collaborator Author

I've undone all the formatting changes

changing the direct setting of _modules to a register_module is something we should make sure is safe.

Turns out that the behaviour is slightly different. So, I've reverted it back to how it was before. There is now multiple copies of the code that calls the buffer registration and module registration hooks

@antoniojkim
Copy link
Collaborator Author

@pytorchbot rebase

@pytorch-bot
Copy link

pytorch-bot bot commented Oct 3, 2022

You don't have permissions to rebase this PR, only people with write permissions may rebase PRs.

@antoniojkim
Copy link
Collaborator Author

@pytorchbot -h

@pytorch-bot
Copy link

pytorch-bot bot commented Oct 3, 2022

PyTorchBot Help

usage: @pytorchbot [-h] {merge,revert,rebase,label} ...

In order to invoke the bot on your PR, include a line that starts with
@pytorchbot anywhere in a comment. That line will form the command; no
multi-line commands are allowed. 

Example:
    Some extra context, blah blah, wow this PR looks awesome

    @pytorchbot merge

optional arguments:
  -h, --help            Show this help message and exit.

command:
  {merge,revert,rebase,label}
    merge               Merge a PR
    revert              Revert a PR
    rebase              Rebase a PR
    label               Add label to a PR

Merge

usage: @pytorchbot merge [-g | -f MESSAGE | -l] [-r [{viable/strict,master}]]

Merge an accepted PR, subject to the rules in .github/merge_rules.json.
By default, this will wait for all required checks (lint, pull) to succeed before merging.

optional arguments:
  -g, --green           Merge when all status checks running on the PR pass. To add status checks, use labels like `ciflow/trunk`.
  -f MESSAGE, --force MESSAGE
                        Merge without checking anything. This requires a reason for auditting purpose, for example:
                        @pytorchbot merge -f 'Minor update to fix lint. Expecting all PR tests to pass'
  -l, --land-checks     Merge with land time checks. This will create a new branch with your changes rebased on viable/strict and run a majority of trunk tests _before_ landing to increase trunk reliability and decrease risk of revert. The tests added are: pull, Lint and trunk. Note that periodic is excluded. (EXPERIMENTAL)
  -r [{viable/strict,master}], --rebase [{viable/strict,master}]
                        Rebase the PR to re run checks before merging.  Accepts viable/strict or master as branch options and will default to viable/strict if not specified.

Revert

usage: @pytorchbot revert -m MESSAGE -c
                          {nosignal,ignoredsignal,landrace,weird,ghfirst}

Revert a merged PR. This requires that you are a Meta employee.

Example:
  @pytorchbot revert -m="This is breaking tests on trunk. hud.pytorch.org/" -c=nosignal

optional arguments:
  -m MESSAGE, --message MESSAGE
                        The reason you are reverting, will be put in the commit message. Must be longer than 3 words.
  -c {nosignal,ignoredsignal,landrace,weird,ghfirst}, --classification {nosignal,ignoredsignal,landrace,weird,ghfirst}
                        A machine-friendly classification of the revert reason.

Rebase

usage: @pytorchbot rebase [-s | -b BRANCH]

Rebase a PR. Rebasing defaults to the stable viable/strict branch of pytorch.
You, along with any member of the pytorch organization, can rebase your PR.

optional arguments:
  -s, --stable          [DEPRECATED] Rebase onto viable/strict
  -b BRANCH, --branch BRANCH
                        Branch you would like to rebase to

Label

usage: @pytorchbot label labels [labels ...]

Adds label to a PR

positional arguments:
  labels  Labels to add to given Pull Request

@antoniojkim antoniojkim force-pushed the antoniojkim/new_registration_hooks branch from 546e93d to 6ad0084 Compare October 3, 2022 22:51
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Oct 3, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: antoniojkim / name: Jae Hoon (Antonio) Kim (6ad0084d17dda3f25ec7247da6d168d5bd9d6c92)

@antoniojkim antoniojkim force-pushed the antoniojkim/new_registration_hooks branch from 6ad0084 to 127f0c9 Compare October 4, 2022 13:21
Copy link
Collaborator

@albanD albanD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good!
We would need a couple tests and update to the signature and that will be good to go!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you name these with a "module" in the name to match the other hooks?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the buffer be passed in?
Also for consistency, could you pass the module first: hook(module, name, buffer) -> None or new buffer

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for the others below

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, good catch. That's my bad for just copying the docstring over and not replacing all instances 😅

Copy link
Collaborator

@albanD albanD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the misunderstanding. I think we can do a small update for the signatures.
The calls will need to be updated as well to pass in the right arguments to the hooks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should still be hook(module, name, buffer) -> None or new buffer

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this one should be hook(module, name, submodule) -> None or new submodule

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one should be hook(module, name, param) -> None or new param

@antoniojkim antoniojkim force-pushed the antoniojkim/new_registration_hooks branch from 5893b2d to 73d6ced Compare October 5, 2022 16:46
Copy link
Collaborator

@albanD albanD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signatures look good!
I think we only need two things now:

  • Update the call sites to pass in the right arguments
  • Add some basic tests for each of these hooks

@bdhirsh bdhirsh added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Oct 6, 2022
@antoniojkim antoniojkim force-pushed the antoniojkim/new_registration_hooks branch 3 times, most recently from 945dff2 to 86fa769 Compare October 11, 2022 22:26
@antoniojkim antoniojkim force-pushed the antoniojkim/new_registration_hooks branch from 86fa769 to 41607f6 Compare October 12, 2022 14:49
Copy link
Collaborator

@albanD albanD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM!
Thanks for taking the time to add full testing!

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 12, 2022
@albanD
Copy link
Collaborator

albanD commented Oct 12, 2022

@pytorchbot merge -g

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks on your PR pass since you used the green (-g) flag (ETA: 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@github-actions
Copy link
Contributor

Hey @antoniojkim.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

pytorchmergebot pushed a commit that referenced this pull request Oct 21, 2022
There is a bug in the implementation of the registration hooks introduced in #86148 whereby if the hook returns a tensor, then the short circuiting logic:
```
value = hook(self, name, value) or value
```
Raises an exception
```
RuntimeError: Boolean value of Tensor with more than one value is ambiguous
```
Fixing the logic so that it only checks to see if the value is `None` before overriding

Fixes #85837

CC: @albanD @jbschlosser
Pull Request resolved: #87369
Approved by: https://github.com/albanD
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/trunk Trigger trunk jobs on your pull request cla signed Merged open source triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Parameter Registration Hook

6 participants