Check if input is ChannelsLast or ChannelsLast3d for quantized AdaptivePool3d. #42780
💊 CI failures summary and remediations
As of commit 2c970e3 (more details on the Dr. CI page):
🕵️ 2 new failures recognized by patterns
The following CI failures do not appear to be due to upstream breakages:
z-a-f
left a comment
This is a great catch! Thank you. Cc'ing @jerryzh168 for an extra pair of eyes.
facebook-github-bot
left a comment
@z-a-f has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Hey @z-a-f @jerryzh168, what's the state of this PR?
It's imported to Phabricator, @z-a-f will need to land this.
Rebasing it now. Once it completes, will land it!
@z-a-f Awesome!
cc @z-a-f, @vkuzo. This serves as a very simple first step to the issue mentioned in #42779.
Description
Since ChannelsLast and ChannelsLast3d are not equivalent (MemoryFormat.h), the "fast" path for NDHWC is ignored.
This PR would produce the expected behaviour for 4 (5 if including batch) dimensional tensors.
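To illustrate why the two formats are not interchangeable, here is a minimal pure-Python sketch of the stride patterns involved. It is not the PyTorch implementation: the helper names (`contiguous_strides`, `channels_last_strides`, `detect_layout`) are hypothetical and exist only for this example, and strides are counted in elements. The point is that a check written only for 4-D NHWC input cannot match a 5-D NDHWC tensor, so a dispatcher has to test both layouts.

```python
def contiguous_strides(shape):
    """Row-major strides (in elements) for the given shape."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def channels_last_strides(shape):
    """Strides for a logical (N, C, spatial...) shape stored with C innermost.

    Works for 4-D (NCHW stored as NHWC) and 5-D (NCDHW stored as NDHWC).
    """
    n, c, *spatial = shape
    # Physical memory order is N, spatial..., C.
    physical = contiguous_strides([n, *spatial, c])
    # Reorder back to the logical N, C, spatial... dimension order.
    return (physical[0], physical[-1], *physical[1:-1])


def detect_layout(shape, strides):
    """Hypothetical dispatcher: name the layout that the strides describe."""
    strides = tuple(strides)
    if len(shape) == 4 and strides == channels_last_strides(shape):
        return "channels_last"
    if len(shape) == 5 and strides == channels_last_strides(shape):
        return "channels_last_3d"
    if strides == contiguous_strides(shape):
        return "contiguous"
    return "other"


def is_channels_last_4d_only(shape, strides):
    """A check that only knows the 4-D layout; it can never match 5-D input."""
    return len(shape) == 4 and tuple(strides) == channels_last_strides(shape)


shape_5d = (2, 8, 4, 5, 6)  # N, C, D, H, W
ndhwc = channels_last_strides(shape_5d)
print(ndhwc)
print(is_channels_last_4d_only(shape_5d, ndhwc))
print(detect_layout(shape_5d, ndhwc))
```

With the 4-D-only check, the NDHWC tensor falls through to whatever the slow path is, which matches the behaviour the description reports; testing both layouts lets it be recognised.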
Benchmarks
Notes
- When channels < 8, it is actually slower than before.
- For qint32, it is actually 2x slower than before.
- For qint8 and quint8 when 8 < channels < 64, the execution time decreases up to 9-10 times in the benchmarks.
- [...] contiguous variant when channels > 64.
C++
Python
Reproduce
See #42779.