Optimize SiLU (Swish) op in PyTorch #42976
Conversation
This pull request was exported from Phabricator. Differential Revision: D23093593
Dr. CI: As of commit ac3c803, no CI failures reported.
Linked #43017
ngimel
left a comment
This looks great, thank you!
Summary:
Pull Request resolved: pytorch#42976

Optimize SiLU (Swish) op in PyTorch.

Some benchmark results:

input = torch.rand(1024, 32768, dtype=torch.float, device="cpu")
forward: 221ms -> 133ms
backward: 600ms -> 170ms

input = torch.rand(1024, 32768, dtype=torch.double, device="cpu")
forward: 479ms -> 297ms
backward: 1438ms -> 387ms

input = torch.rand(8192, 32768, dtype=torch.float, device="cuda")
forward: 24.34ms -> 9.83ms
backward: 97.05ms -> 29.03ms

input = torch.rand(4096, 32768, dtype=torch.double, device="cuda")
forward: 44.24ms -> 30.15ms
backward: 126.21ms -> 49.68ms

Test Plan: buck test mode/dev-nosan //caffe2/test:nn -- "SiLU"

Reviewed By: houseroad

Differential Revision: D23093593

fbshipit-source-id: ca6129d9fce76b569ac7ea0690dd38eac23c221d
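For context on what the optimized kernels compute: SiLU is x * sigmoid(x), and the speedup comes from fused forward/backward kernels rather than composing elementwise ops. The following is a stdlib-only sketch of the math being vectorized, not the PR's actual C++/CUDA implementation; it checks the closed-form gradient used by a fused backward pass against a finite difference.

```python
import math

def silu(x):
    # SiLU (Swish): x * sigmoid(x), written as x / (1 + exp(-x))
    return x / (1.0 + math.exp(-x))

def silu_grad(x):
    # d/dx [x * sigmoid(x)] = sigmoid(x) * (1 + x * (1 - sigmoid(x)))
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 + x * (1.0 - s))

# Sanity-check the analytic gradient against a central finite difference.
for x in (-3.0, -0.5, 0.0, 1.0, 2.5):
    h = 1e-6
    fd = (silu(x + h) - silu(x - h)) / (2.0 * h)
    assert abs(silu_grad(x) - fd) < 1e-5
```

Computing the backward pass from this single closed-form expression avoids materializing intermediate tensors, which is consistent with the large backward-pass gains (e.g. 600ms -> 170ms on CPU float) reported above.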
This pull request has been merged in 4ae832e.