@xiaomengy commented Aug 13, 2020

Summary:
Optimize SiLU (Swish) op in PyTorch.
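
For readers unfamiliar with the op, here is a minimal pure-Python sketch of what SiLU computes in the forward and backward pass. It only illustrates the math, silu(x) = x * sigmoid(x). It is not the kernel code from this PR, and the function names are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def silu(x: float) -> float:
    """Forward pass: silu(x) = x * sigmoid(x)."""
    return x * sigmoid(x)


def silu_grad(x: float) -> float:
    """Derivative of silu: sigmoid(x) * (1 + x * (1 - sigmoid(x)))."""
    s = sigmoid(x)
    return s * (1.0 + x * (1.0 - s))


if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(f"x={x:+.1f}  silu={silu(x):+.6f}  grad={silu_grad(x):+.6f}")
```

The backward pass multiplies the incoming gradient by `silu_grad(x)` elementwise, so both directions are pointwise operations over the input tensor.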

Benchmark results (before -> after):

input = torch.rand(1024, 32768, dtype=torch.float, device="cpu")
forward: 221ms -> 133ms
backward: 600ms -> 170ms

input = torch.rand(1024, 32768, dtype=torch.double, device="cpu")
forward: 479ms -> 297ms
backward: 1438ms -> 387ms

input = torch.rand(8192, 32768, dtype=torch.float, device="cuda")
forward: 24.34ms -> 9.83ms
backward: 97.05ms -> 29.03ms

input = torch.rand(4096, 32768, dtype=torch.double, device="cuda")
forward: 44.24ms -> 30.15ms
backward: 126.21ms -> 49.68ms
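
To make the improvement easier to compare across configurations, the short sketch below turns the timings above into speedup ratios (before / after). The numbers are copied from this description; the dictionary layout and the `speedup` helper are only illustrative.

```python
# (configuration, pass) -> (before_ms, after_ms), copied from the benchmark above.
timings = {
    ("cpu float 1024x32768", "forward"): (221.0, 133.0),
    ("cpu float 1024x32768", "backward"): (600.0, 170.0),
    ("cpu double 1024x32768", "forward"): (479.0, 297.0),
    ("cpu double 1024x32768", "backward"): (1438.0, 387.0),
    ("cuda float 8192x32768", "forward"): (24.34, 9.83),
    ("cuda float 8192x32768", "backward"): (97.05, 29.03),
    ("cuda double 4096x32768", "forward"): (44.24, 30.15),
    ("cuda double 4096x32768", "backward"): (126.21, 49.68),
}


def speedup(before_ms: float, after_ms: float) -> float:
    """Ratio of old runtime to new runtime; values above 1 mean faster."""
    return before_ms / after_ms


speedups = {key: speedup(*pair) for key, pair in timings.items()}

if __name__ == "__main__":
    for (config, direction), ratio in speedups.items():
        print(f"{config:24s} {direction:8s} {ratio:.2f}x")
```

By this measure the forward pass is roughly 1.5x to 2.5x faster and the backward pass roughly 2.5x to 3.7x faster. The CUDA double case uses a smaller input (4096 rows) than the CUDA float case (8192 rows), so their absolute times are not directly comparable.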

Test Plan: buck test mode/dev-nosan //caffe2/test:nn -- "SiLU"

Differential Revision: D23093593

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D23093593

@xiaomengy requested a review from ngimel August 13, 2020 05:00

dr-ci bot commented Aug 13, 2020

💊 CI failures summary and remediations

As of commit ac3c803 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



@xiaomengy
Contributor Author

Linked: #43017

Collaborator

@ngimel left a comment

This looks great, thank you!

Summary:
Pull Request resolved: pytorch#42976

Reviewed By: houseroad

Differential Revision: D23093593

fbshipit-source-id: ca6129d9fce76b569ac7ea0690dd38eac23c221d

@facebook-github-bot
Contributor

This pull request has been merged in 4ae832e.
