@vkuzo vkuzo commented Aug 13, 2020

Stack from ghstack:

Summary:

Switches observers to use the new min_max function, which calculates
min and max at the same time. On the microbenchmarks, we see around a 45-50% speedup
on representative input shapes for all observers except HistogramObserver.
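
As a rough illustration of the idea (not the code from this PR, whose min_max function is not shown in this conversation), the sketch below contrasts two separate reductions with a single traversal that tracks both extremes. The names `min_max_two_pass`, `min_max_single_pass`, and `RunningMinMax` are hypothetical and exist only for this example. It is written in pure Python, so it shows the access pattern rather than the measured speedup, which would come from a fused native kernel.

```python
from typing import Iterable, List, Tuple


def min_max_two_pass(values: List[float]) -> Tuple[float, float]:
    """Baseline: two independent reductions, so the data is traversed twice."""
    return min(values), max(values)


def min_max_single_pass(values: Iterable[float]) -> Tuple[float, float]:
    """Track the minimum and maximum together in one traversal."""
    it = iter(values)
    try:
        lo = hi = next(it)
    except StopIteration:
        raise ValueError("min_max_single_pass() requires a non-empty input")
    for v in it:
        if v < lo:
            lo = v
        elif v > hi:  # a value below lo cannot also exceed hi
            hi = v
    return lo, hi


class RunningMinMax:
    """Hypothetical observer-like helper: keeps a running range across batches."""

    def __init__(self) -> None:
        self.min_val = float("inf")
        self.max_val = float("-inf")

    def observe(self, batch: Iterable[float]) -> None:
        # One combined reduction per batch instead of separate min and max calls.
        lo, hi = min_max_single_pass(batch)
        self.min_val = min(self.min_val, lo)
        self.max_val = max(self.max_val, hi)


if __name__ == "__main__":
    data = [3.0, -1.5, 7.25, 0.0]
    # Both approaches agree on the result; only the number of passes differs.
    assert min_max_single_pass(data) == min_max_two_pass(data)
    obs = RunningMinMax()
    obs.observe([1.0, 2.0])
    obs.observe([-3.0, 0.5])
    print(obs.min_val, obs.max_val)
```

In recent PyTorch releases, `torch.aminmax` is a public fused reduction of this kind; whether it matches the function this PR switched to is not stated here.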

Test Plan:

CI for correctness

performance:

cd benchmarks/operator_benchmark
# repeat for (before diff, after diff) x (cpu, cuda)
python -m pt.qobserver_test --tag_filter all --device cpu
# results:
#   before, cpu: https://our.intern.facebook.com/intern/paste/P138633280/
#   before, cuda: https://our.intern.facebook.com/intern/paste/P138639473/
#   after, cpu: https://our.intern.facebook.com/intern/paste/P138635458/
#   after, cuda: https://our.intern.facebook.com/intern/paste/P138636344/

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: D23093995

vkuzo added a commit that referenced this pull request Aug 13, 2020
ghstack-source-id: e0dedf6
Pull Request resolved: #42957
dr-ci bot commented Aug 13, 2020

💊 CI failures summary and remediations

As of commit 6a60cbc (more details on the Dr. CI page):


  • 2/2 failures possibly* introduced in this PR
    • 2/2 non-CircleCI failure(s)

Extra GitHub checks: 1 failed


ci.pytorch.org: 1 failed


vkuzo added a commit that referenced this pull request Sep 2, 2020
ghstack-source-id: 721eb17
Pull Request resolved: #42957
vkuzo added a commit that referenced this pull request Sep 2, 2020
ghstack-source-id: 673b854
Pull Request resolved: #42957
vkuzo added a commit that referenced this pull request Sep 5, 2020
ghstack-source-id: 0562225
Pull Request resolved: #42957
codecov bot commented Sep 5, 2020

Codecov Report

Merging #42957 into gh/vkuzo/123/base will increase coverage by 0.03%.
The diff coverage is 67.54%.

@@                  Coverage Diff                  @@
##           gh/vkuzo/123/base   #42957      +/-   ##
=====================================================
+ Coverage              69.32%   69.35%   +0.03%     
=====================================================
  Files                    381      381              
  Lines                  47190    47323     +133     
=====================================================
+ Hits                   32713    32822     +109     
- Misses                 14477    14501      +24     
| Impacted Files | Coverage Δ |
|---|---|
| `torch/_classes.py` | `87.50% <0.00%> (ø)` |
| `torch/jit/_fuser.py` | `32.60% <0.00%> (ø)` |
| `torch/jit/_serialization.py` | `85.71% <0.00%> (ø)` |
| `torch/jit/supported_ops.py` | `0.00% <0.00%> (ø)` |
| `torch/jit/unsupported_tensor_ops.py` | `0.00% <0.00%> (ø)` |
| `torch/nn/modules/_functions.py` | `63.30% <0.00%> (ø)` |
| `torch/nn/parallel/distributed.py` | `42.53% <ø> (ø)` |
| `torch/nn/qat/__init__.py` | `100.00% <ø> (ø)` |
| `torch/nn/qat/modules/conv.py` | `100.00% <ø> (ø)` |
| `torch/nn/qat/modules/linear.py` | `100.00% <ø> (ø)` |

... and 52 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 307f9e0...6a60cbc.

@facebook-github-bot

This pull request has been merged in fd8e206.

@facebook-github-bot facebook-github-bot deleted the gh/vkuzo/123/head branch September 12, 2020 14:17