
Conversation

@bdhirsh bdhirsh commented Mar 30, 2022

This PR is a re-land of #69633 (this is the second re-land attempt, the first one is at #72827). The original PR had a memory corruption bug that only surfaced on mobile builds.

**Background: Existing Mobile Optimization**

PyTorch mobile builds have an existing optimization ([here](https://github.com/pytorch/pytorch/blob/cc23725e89713138aa1c81ce5fb4a8dbcd440ccf/c10/core/DispatchKey.h#L382) and [here](https://github.com/pytorch/pytorch/blob/cc23725e89713138aa1c81ce5fb4a8dbcd440ccf/aten/src/ATen/core/dispatch/OperatorEntry.h#L214)), which works as follows:

- Every operator in PyTorch has a "dispatch table" of function pointers, corresponding to all of the (up to 64) different kernels that we might dispatch to when we run an operator (autograd, CPU, CUDA, complex-number support, etc.).
- In mobile builds, that table is shrunk from 64 to 8 entries to save space, because mobile doesn't use the functionality associated with most dispatch keys.
- The dispatcher also has a notion of "fallback kernels": kernels that are registered for a particular dispatch key but should work for any operator. The array of fallback kernels is defined [here](https://github.com/pytorch/pytorch/blob/cc23725e89713138aa1c81ce5fb4a8dbcd440ccf/aten/src/ATen/core/dispatch/Dispatcher.h#L294).
- The mobile optimization currently does **not** extend to this array (it wouldn't be that useful anyway, because there is only one global array of fallback kernels, vs. a separate dispatch table of function pointers per operator). So the per-operator tables on mobile have 8 entries, while the fallback table has 64 (see the sketch below).
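To make that layout concrete, here is a minimal C++ sketch of the two tables. The names (`KernelFunction` alias, `OperatorEntrySketch`, `DispatcherSketch`, `MOBILE_SKETCH`) are hypothetical stand-ins, not the actual c10 types; only the sizes (8 per-operator slots on mobile vs. 64 fallback slots) come from the description above.

```
#include <array>
#include <cstddef>

// Stand-in for c10::KernelFunction: just a plain function pointer here.
using KernelFunction = void (*)();

// On mobile, the per-operator dispatch table is shrunk from 64 to 8 entries,
// covering only the dispatch keys that mobile builds actually use.
#ifdef MOBILE_SKETCH
constexpr std::size_t kNumPerOperatorSlots = 8;
#else
constexpr std::size_t kNumPerOperatorSlots = 64;
#endif

// Each operator owns its own table of kernels, indexed by dispatch key...
struct OperatorEntrySketch {
  std::array<KernelFunction, kNumPerOperatorSlots> dispatchTable_{};
};

// ...while fallback kernels live in a single global table in the dispatcher.
// The shrinking optimization was not applied here, so it stays at 64 entries.
struct DispatcherSketch {
  std::array<KernelFunction, 64> backendFallbackKernels_{};
};
```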

**The Bug**

This PR makes it difficult to enable that optimization separately for the per-operator arrays vs. the fallback array, and the original version incidentally shrank the fallback array from 64 to 8 entries for mobile (that happened on [this line](https://github.com/pytorch/pytorch/pull/69633/files#diff-f735cd7aa68f15b624100cbc4bb3b5ea76ffc7c9d3bec3b0ccabaa09609e5319R294)).
That isn't a problem by itself (since mobile doesn't actually use any of the fallbacks that can no longer be stored). However, PyTorch core still registers all of those fallback kernels on startup in mobile builds, even if they aren't used. When one of those fallbacks was registered on startup, the kernel was written somewhere in memory past the bounds of the (now smaller) `backendFallbackKernels_` array inside the `Dispatcher` object.
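The failure mode boils down to an out-of-bounds array write. Here is a minimal C++ sketch of the pattern; the names and the registration function are hypothetical, with only the sizes taken from the description above, so this is an illustration of the failure mode rather than the real registration code.

```
#include <array>
#include <cstddef>
#include <cstdint>

using KernelFunction = void (*)();

// After the buggy change, the mobile fallback table had only 8 slots...
constexpr std::size_t kShrunkenFallbackSlots = 8;
std::array<KernelFunction, kShrunkenFallbackSlots> backendFallbackKernels_{};

// ...but startup still registered fallbacks for every dispatch key, whose
// indices range over the full 0..63 key space.
void registerFallbackSketch(std::uint8_t dispatchKeyIndex, KernelFunction kernel) {
  // For any key with index >= 8 this writes past the end of the array,
  // corrupting whatever happens to live next to it in memory.
  backendFallbackKernels_[dispatchKeyIndex] = kernel;
}
```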

Stack from ghstack (oldest at bottom):


Original commit changeset: b962de5d5eff

Original Phabricator Diff: D35192346


Back out "Back out "DispatchKeySet perf improvements""

Original commit changeset: e38081810a56

Original Phabricator Diff: D35192317

Differential Revision: [D35222806](https://our.internmc.facebook.com/intern/diff/D35222806/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D35222806/)!

@facebook-github-bot facebook-github-bot commented Mar 30, 2022


💊 CI failures summary and remediations

As of commit 65b26fa (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See GitHub Actions build pull / win-vs2019-cuda11.3-py3 / build (1/1)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

2022-03-31T14:10:42.1949433Z CMAKE_CUDA_COMPILE_WHOLE_COMPILATION
2022-03-31T14:10:42.2068298Z CMake Error: Error required internal CMake variable not set, cmake may not be built correctly.
2022-03-31T14:10:42.2068846Z Missing variable is:
2022-03-31T14:10:42.2069239Z CMAKE_CUDA_COMPILE_WHOLE_COMPILATION
2022-03-31T14:10:42.2177561Z CMake Error: Error required internal CMake variable not set, cmake may not be built correctly.
2022-03-31T14:10:42.2178089Z Missing variable is:
2022-03-31T14:10:42.2178609Z CMAKE_CUDA_COMPILE_WHOLE_COMPILATION
2022-03-31T14:10:42.2231880Z CMake Error: Error required internal CMake variable not set, cmake may not be built correctly.
2022-03-31T14:10:42.2232421Z Missing variable is:
2022-03-31T14:10:42.2232798Z CMAKE_CUDA_COMPILE_WHOLE_COMPILATION
2022-03-31T14:10:42.7649219Z CMake Error: Error required internal CMake variable not set, cmake may not be built correctly.
2022-03-31T14:10:42.7649839Z Missing variable is:
2022-03-31T14:10:42.7650242Z CMAKE_CUDA_COMPILE_WHOLE_COMPILATION
2022-03-31T14:10:42.9656409Z -- Generating done
2022-03-31T14:10:43.0929506Z CMake Warning:
2022-03-31T14:10:43.0930141Z   Manually-specified variables were not used by the project:
2022-03-31T14:10:43.0930512Z 
2022-03-31T14:10:43.0930798Z     BUILD_ENVIRONMENT
2022-03-31T14:10:43.0931123Z     BUILD_TYPE
2022-03-31T14:10:43.0931412Z     BUILD_WHEEL
2022-03-31T14:10:43.0931592Z 

This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.


bdhirsh added a commit that referenced this pull request Mar 30, 2022
Pull Request resolved: #74963

ghstack-source-id: 152614490

Differential Revision: [D35222806](https://our.internmc.facebook.com/intern/diff/D35222806/)

@bdhirsh bdhirsh requested a review from albanD March 30, 2022 20:55
@albanD albanD left a comment

Yes!

bdhirsh added a commit that referenced this pull request Mar 31, 2022
Pull Request resolved: #74963

This is a re-land of D35192346 and D35192317, which together change the internal representation of `DispatchKeySet` in pytorch core to free up the number of dispatch keys that we have available. See a more detailed description of the design in the original PR: #69633.
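For context, a `DispatchKeySet` is essentially a 64-bit bitset of dispatch keys, and dispatch runs the kernel for the highest-priority key in the set. The sketch below is a heavily simplified, hypothetical illustration of that bitset idea (`DispatchKeySetSketch`, `KeySketch`, and the method names are made up); it does not show the representational change this diff actually makes to free up more keys.

```
#include <cstdint>

// Hypothetical, heavily simplified stand-ins for illustration only.
enum class KeySketch : std::uint8_t { CPU = 0, CUDA = 1, AutogradCPU = 2 /* ...up to 63 */ };

class DispatchKeySetSketch {
 public:
  constexpr DispatchKeySetSketch() = default;
  constexpr explicit DispatchKeySetSketch(KeySketch k)
      : repr_(std::uint64_t{1} << static_cast<std::uint8_t>(k)) {}

  // Set union is a single bitwise OR, which is what makes key-set math cheap.
  constexpr DispatchKeySetSketch operator|(DispatchKeySetSketch other) const {
    DispatchKeySetSketch out;
    out.repr_ = repr_ | other.repr_;
    return out;
  }

  constexpr bool has(KeySketch k) const {
    return (repr_ >> static_cast<std::uint8_t>(k)) & 1;
  }

  // The dispatcher picks the kernel for the highest-priority key in the set;
  // with a bitset representation that is just the index of the highest set bit.
  int highestPriorityIndex() const {
    return 63 - __builtin_clzll(repr_);  // assumes repr_ != 0 (GCC/Clang builtin)
  }

 private:
  std::uint64_t repr_ = 0;
};
```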



The original PR broke Milan workflows, which use a PyTorch mobile build; the breakage manifested as a memory corruption bug inside of `liboacrmerged.so`.



**Why didn't this problem show up in OSS CI? Why didn't it break other internal mobile workflows aside from Milan?**

Ideally, this failure would show up as part of the OSS signal on GitHub, since we already have mobile OSS builds. Given that it was another memory corruption issue that only affected Milan (a subset of mobile builds), I'm not sure what's specific about Milan's builds that caused it to manifest only there. @dreiss I wonder if there's another flavor of mobile builds we could run in OSS CI that could potentially help catch this?



**The debugging experience was pretty difficult**

Debugging the Milan-specific failure was made difficult by the following:

(1) Lack of CI
- The original Milan failure didn't surface on my original diff, because the Milan job(s) that failed weren't triggered to run on pytorch changes. There's probably a balance to strike here, since those jobs are only useful if they aren't flaky and can produce reliable failure logs for debugging.

(2) It's difficult to get a repro.
- My work laptop doesn't have the right specs to run the Milan development workflow (not enough disk space).
- There is an existing OnDemand workflow for Milan, but it appears to be relatively new, and even with a bunch of help from @mflporto, we ran into issues forwarding the log output from Milan tests on the emulator back to the terminal (see the original discussion here: https://fb.workplace.com/groups/OnDemandFRL/permalink/1424937774645433/).

(3) Lack of stack traces.
- Most Milan failures didn't include actionable stack traces. @phding generously helped me debug by running my suggested patches locally and reporting back any failures. The failing test didn't include a stack trace, though (just the line where the crash appeared), so I ended up making educated guesses about the issue based on the area of the crash.

Differential Revision: [D35222806](https://our.internmc.facebook.com/intern/diff/D35222806/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D35222806/)!
ghstack-source-id: 152688542
facebook-github-bot pushed a commit to pytorch/nestedtensor that referenced this pull request Mar 31, 2022
Summary:
X-link: pytorch/pytorch#74963

This is a re-land of D35192346 and D35192317, which together change the internal representation of `DispatchKeySet` in pytorch core to free up the number of dispatch keys that we have available. See a more detailed description of the design in the original PR: pytorch/pytorch#69633.

ghstack-source-id: 152688542

Reviewed By: phding, albanD

Differential Revision: D35222806

fbshipit-source-id: 0ad115a0f768bc8ea5d4c203b2990254c7092d30
facebook-github-bot pushed a commit that referenced this pull request Mar 31, 2022
Summary:
Pull Request resolved: #74963

This is a re-land of D35192346 (9872a06) and D35192317 (a9216cd), which together change the internal representation of `DispatchKeySet` in pytorch core to free up the number of dispatch keys that we have available. See a more detailed description of the design in the original PR: #69633.

ghstack-source-id: 152688542

Test Plan: Confirmed with phding that the broken Milan workflow from the previous version of this diff is now passing.

Reviewed By: phding, albanD

Differential Revision: D35222806

fbshipit-source-id: 0ad115a0f768bc8ea5d4c203b2990254c7092d30
@github-actions

Hey @bdhirsh.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

@facebook-github-bot facebook-github-bot deleted the gh/bdhirsh/191/head branch April 4, 2022 14:17
NesrineMHB pushed a commit to NesrineMHB/pytorch that referenced this pull request Apr 8, 2022
ghstack-source-id: eb033a6
Pull Request resolved: pytorch/pytorch#74963
dzdang added a commit that referenced this pull request Apr 9, 2022
…ons for quantized & non-quantized tensors in item"

Summary: This PR is part of a series of PRs addressing #54150,
which is about routing calls to quantized backends through the dispatcher
rather than through if/else conditionals. This particular PR separates the
calls to the quantized and non-quantized backends for item via the dispatcher.
Simultaneous support for the CompositeImplicitAutograd and Quantized dispatch keys
was made possible by #74963.
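To illustrate what routing backends through the dispatcher looks like in general, here is a hedged sketch using the public TORCH_LIBRARY registration API with a made-up `myops::item` operator. The namespace, schema, and kernel bodies are illustrative only; the actual PR changes the existing `aten::item` registration rather than adding a new op.

```
#include <ATen/ATen.h>
#include <torch/library.h>

// Toy kernels: the real change wires up the existing aten::item kernels;
// these bodies only illustrate the two paths.
at::Scalar item_cpu(const at::Tensor& self) {
  return self.item();  // plain (non-quantized) path
}

at::Scalar item_quantized_cpu(const at::Tensor& self) {
  return self.dequantize().item();  // quantized path
}

// Define a toy operator and register a separate kernel per dispatch key, so
// the dispatcher (not an if/else branch) chooses based on the tensor's backend.
TORCH_LIBRARY(myops, m) {
  m.def("item(Tensor self) -> Scalar");
}

TORCH_LIBRARY_IMPL(myops, CPU, m) {
  m.impl("item", item_cpu);
}

TORCH_LIBRARY_IMPL(myops, QuantizedCPU, m) {
  m.impl("item", item_quantized_cpu);
}
```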

Test plan:
There are numerous tests in the suite that make use of torch.Tensor.item.
```
python test/run_test.py
```
can be used for a comprehensive evaluation. Alternatively, because this PR should not affect torch.Tensor.item calls on non-quantized tensors, we can test specifically on quantized tensors:
we can specifically test on quantized tensors:
```
python test/test_quantization.py
```

Differential Revision: [D34004298](https://our.internmc.facebook.com/intern/diff/D34004298)

dzdang added a commit that referenced this pull request Apr 9, 2022
…mentations for quantized & non-quantized tensors in item"

Summary: This PR is part of a series of PRs addressing #54150,
related to using dispatcher for calls to quantized backends as opposed to if/else conditionals.
This particular PR separates the calls to quantized & non-quantized backends for item
using a dispatcher.
Simultaneous support of CompositeImplicitAutograd and Quantized dispatch keys
was made possible with #74963

Test plan:
There are numerous tests in the suite that make use of torch.Tensor.item.
```
python test/run_test.py
```
can be used for comprehensive evaluation, or alternatively, b/c this PR should not affect torch.Tensor.item calls on non-quantized tensors,
we can specifically test on quantized tensors:
```
python test/test_quantization.py
```

Differential Revision: [D35517808](https://our.internmc.facebook.com/intern/diff/D35517808)

dzdang added a commit that referenced this pull request Apr 9, 2022
…non-quantized tensors in item


ghstack-source-id: 3264c38
Pull Request resolved: #72333