Vectorize bitwise_not #45103

xuhdev · 2020-09-22T01:57:37Z

Benchmark (Debian 10, Release build, gcc 8.3, no turbo, Intel(R) Xeon(R)
E-2136 CPU @ 3.30GHz):

import timeit

# dtypes are written as strings so they can be interpolated into the timeit setup code
for dtype in ('torch.int64', 'torch.int32', 'torch.int16', 'torch.int8', 'torch.uint8'):
    # each pair is (number of elements, number of repetitions)
    for n, t in [(10_000, 100_000),
                 (100_000, 10_000)]:
        print(f'torch.bitwise_not(a), numel() == {n} for {t} times, dtype={dtype}')
        print(timeit.timeit('torch.bitwise_not(a)', setup=f'import torch; a = torch.arange(-{n//2}, {n//2}, dtype={dtype})', number=t))

Before:

torch.bitwise_not(a), numel() == 10000 for 100000 times, dtype=torch.int64
0.5479081739904359
torch.bitwise_not(a), numel() == 100000 for 10000 times, dtype=torch.int64
0.3350257440470159
torch.bitwise_not(a), numel() == 10000 for 100000 times, dtype=torch.int32
0.39590477803722024
torch.bitwise_not(a), numel() == 100000 for 10000 times, dtype=torch.int32
0.25563537096604705
torch.bitwise_not(a), numel() == 10000 for 100000 times, dtype=torch.int16
0.31152817397378385
torch.bitwise_not(a), numel() == 100000 for 10000 times, dtype=torch.int16
0.20817365101538599
torch.bitwise_not(a), numel() == 10000 for 100000 times, dtype=torch.int8
0.8573925020173192
torch.bitwise_not(a), numel() == 100000 for 10000 times, dtype=torch.int8
0.4150037349900231
torch.bitwise_not(a), numel() == 10000 for 100000 times, dtype=torch.uint8
0.8551108679967001
torch.bitwise_not(a), numel() == 100000 for 10000 times, dtype=torch.uint8
0.37137620500288904

After:

torch.bitwise_not(a), numel() == 10000 for 100000 times, dtype=torch.int64
0.5232444299617782
torch.bitwise_not(a), numel() == 100000 for 10000 times, dtype=torch.int64
0.33852163201663643
torch.bitwise_not(a), numel() == 10000 for 100000 times, dtype=torch.int32
0.3931163849774748
torch.bitwise_not(a), numel() == 100000 for 10000 times, dtype=torch.int32
0.24392802000511438
torch.bitwise_not(a), numel() == 10000 for 100000 times, dtype=torch.int16
0.3122224889229983
torch.bitwise_not(a), numel() == 100000 for 10000 times, dtype=torch.int16
0.1977886479580775
torch.bitwise_not(a), numel() == 10000 for 100000 times, dtype=torch.int8
0.26711542706470937
torch.bitwise_not(a), numel() == 100000 for 10000 times, dtype=torch.int8
0.18208567495457828
torch.bitwise_not(a), numel() == 10000 for 100000 times, dtype=torch.uint8
0.2615354140289128
torch.bitwise_not(a), numel() == 100000 for 10000 times, dtype=torch.uint8
0.17972210398875177
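
To make the comparison easier to read, the timings can be turned into before/after ratios. The sketch below only re-uses the numbers listed above; the dictionary layout and the `speedup` helper are illustrative names, not part of the PR.

```python
# Total timings in seconds, copied from the "Before" and "After" listings.
# Keys are (dtype, numel); the layout and helper name are illustrative only.
before = {
    ("int64", 10_000): 0.5479081739904359,
    ("int64", 100_000): 0.3350257440470159,
    ("int32", 10_000): 0.39590477803722024,
    ("int32", 100_000): 0.25563537096604705,
    ("int16", 10_000): 0.31152817397378385,
    ("int16", 100_000): 0.20817365101538599,
    ("int8", 10_000): 0.8573925020173192,
    ("int8", 100_000): 0.4150037349900231,
    ("uint8", 10_000): 0.8551108679967001,
    ("uint8", 100_000): 0.37137620500288904,
}
after = {
    ("int64", 10_000): 0.5232444299617782,
    ("int64", 100_000): 0.33852163201663643,
    ("int32", 10_000): 0.3931163849774748,
    ("int32", 100_000): 0.24392802000511438,
    ("int16", 10_000): 0.3122224889229983,
    ("int16", 100_000): 0.1977886479580775,
    ("int8", 10_000): 0.26711542706470937,
    ("int8", 100_000): 0.18208567495457828,
    ("uint8", 10_000): 0.2615354140289128,
    ("uint8", 100_000): 0.17972210398875177,
}


def speedup(key):
    """Return the before/after ratio; values above 1.0 mean the change is faster."""
    return before[key] / after[key]


for key in before:
    dtype, n = key
    print(f"{dtype:>5} numel={n:>6}: {speedup(key):.2f}x")
```

On these numbers, int8 and uint8 come out roughly 2x to 3.3x faster, while int16, int32 and int64 stay within about 5% of the old timings.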

codecov · 2020-09-22T05:47:00Z

Codecov Report

Merging #45103 into master will increase coverage by 0.00%.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master   #45103   +/-   ##
=======================================
  Coverage   67.85%   67.85%           
=======================================
  Files         384      384           
  Lines       50026    50026           
=======================================
+ Hits        33944    33945    +1     
+ Misses      16082    16081    -1

Impacted Files	Coverage Δ
torch/testing/_internal/expecttest.py	`78.57% <0.00%> (+1.02%)`	⬆️
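
The "increase coverage by 0.00%" line follows from the figures in the table: one extra hit out of 50026 lines is too small to show at two decimal places. A quick check, using only the numbers reported above (the `coverage` helper is an illustrative name):

```python
# Figures taken from the Codecov diff table above.
lines = 50026
hits_before, hits_after = 33944, 33945


def coverage(hits, total):
    """Coverage as a percentage of total lines."""
    return 100.0 * hits / total


print(f"{coverage(hits_before, lines):.2f}% -> {coverage(hits_after, lines):.2f}%")
delta = coverage(hits_after, lines) - coverage(hits_before, lines)
print(f"delta: {delta:.4f} percentage points")
```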

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 20f52cd...15522eb.

facebook-github-bot

@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot · 2020-09-25T18:14:58Z

@ezyang merged this pull request in 536580e.

xuhdev added the module: cpu and module: operators labels Sep 22, 2020

xuhdev requested review from VitalyFedyunin and ezyang September 22, 2020 01:57

pytorchbot added the open source label Sep 22, 2020

nitish-awasthi approved these changes Sep 22, 2020

ezyang approved these changes Sep 22, 2020

facebook-github-bot reviewed Sep 22, 2020

facebook-github-bot closed this in 536580e Sep 25, 2020

xuhdev deleted the bitwise-not branch September 25, 2020 17:52

facebook-github-bot added the merged label Sep 25, 2020

mruberry added the Merged label Oct 28, 2020
