Added SiLU activation function #41034
Conversation
glaringlee left a comment
@albanD
This is an initial draft; should this be put in nn?
Will there be a …

I think that if you want to have the full …
Please modify this pull request to rename the swish to the SiLU. The swish was originally coined the "SiLU" in https://arxiv.org/pdf/1606.08415.pdf and https://arxiv.org/abs/1702.03118, long before the swish paper. Renaming other people's exact same ideas is unacceptable, as we all know. Here's a possible docstring:
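The proposed docstring itself did not survive in this transcript, but the function under discussion is simple to sketch. Below is an illustrative pure-Python scalar version of SiLU, silu(x) = x * sigmoid(x); this is a sketch only, not PyTorch's actual tensor implementation (which is element-wise and lives in ATen/C++):

```python
import math

def silu(x: float) -> float:
    """SiLU (sometimes called swish): x * sigmoid(x).

    Scalar sketch for illustration only; the real PyTorch operator
    works element-wise on tensors and is implemented in ATen.
    """
    return x * (1.0 / (1.0 + math.exp(-x)))

# SiLU is smooth, non-monotonic just below zero, and approaches the
# identity for large positive inputs.
print(silu(0.0))  # 0.0
```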
💊 CI failures summary and remediations: as of commit 20ea162 (more details on the Dr. CI page): 💚 Looks good so far! There are no failures yet. 💚 (This comment was automatically generated by Dr. CI and has been revised 32 times.)
That sounds like a good point @TFUsers. @heitorschueroff, could you make the following changes then?
This is just to make sure that people looking for swish in our documentation page hit something that points them to what they're looking for. Also, the doc is missing from your PR:
CC @mruberry: we have an alias mechanism, right? One that we can use to alias swish to silu.
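As an aside, the simplest Python-level form of such an alias is a wrapper that forwards to the canonical name and optionally warns the caller. This is a hypothetical sketch of the aliasing idea, not the mechanism PyTorch actually uses (which operates at the operator-registration level):

```python
import math
import warnings

def silu(x: float) -> float:
    """Canonical name: x * sigmoid(x) (scalar sketch)."""
    return x * (1.0 / (1.0 + math.exp(-x)))

def swish(x: float) -> float:
    """Hypothetical deprecated alias that forwards to silu."""
    warnings.warn("swish is an alias for silu; prefer silu",
                  DeprecationWarning)
    return silu(x)
```

Users searching for swish would still find a working name, while the docs steer them toward silu.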
Should swish have `python_module: nn`?
Not yet. File a follow-up issue and assign it to me, and I'll add it when we finalize aliasing in a few weeks.
this is related to …
@gchanan (see pytorch/torch/nn/functional.py, line 1119 at 5e03a1e)
Should we add `python_module: nn` and add silu into nn/functional.py? Otherwise, silu can only be accessed through torch._C._nn.silu. Or should we remove `python_module: nn` and just provide torch.silu?
Originally torch.gelu (the activation function used in BERT) was provided, but then it became F.gelu.
To be consistent, let's do nn.functional.silu (F.silu) |
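For context on the `python_module: nn` question above: entries in aten/src/ATen/native/native_functions.yaml declare each operator, and `python_module: nn` routes its Python binding to torch._C._nn so it can be re-exported from torch.nn.functional. A hedged sketch of what a silu entry could look like, modeled on existing entries such as gelu and not copied from the actual PR diff:

```yaml
# Sketch only; not the exact entry from this PR.
- func: silu(Tensor self) -> Tensor
  python_module: nn
```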
glaringlee left a comment
So far this looks good to me.
There are several things that need to be done before shipping this:

1. For the documentation, please update the following two rst files so the new function gets indexed:
   a. docs/source/nn.functional.rst: in the 'Non-linear activation functions' section, probably after hardsigmoid.
   b. docs/source/nn.rst: in the 'Non-linear Activations (weighted sum, nonlinearity)' section, add SiLU.
   Here is what it should look like (you can click nn.SiLU and nn.functional.silu): https://glaringlee.github.io/search.html?q=silu&check_keywords=yes&area=default
   After the rst files are updated, you can try building the docs locally. Here is how: https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md#writing-documentation
2. Update torch/_overrides.py: add silu into the list returned by get_testing_overrides(); this is the root cause of the CI test failure.
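To illustrate the get_testing_overrides() point: it returns a mapping from each overridable torch function to a dummy callable with a matching signature, which the __torch_function__ test suite uses to check coverage. The following is a simplified, self-contained sketch of that idea with made-up stand-in names, not the actual torch code:

```python
def silu(input):
    """Stand-in for torch.nn.functional.silu (sketch only)."""
    raise NotImplementedError

def get_testing_overrides():
    """Map each overridable function to a dummy lambda whose
    signature matches the real function, mirroring the idea behind
    torch's helper of the same name."""
    return {
        silu: lambda input: -1,  # dummy override, signature matches silu
    }

overrides = get_testing_overrides()
print(overrides[silu](0))  # -1
```

Every newly added operator has to appear in this mapping, which is why omitting silu broke CI.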
facebook-github-bot left a comment
@heitorschueroff has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
glaringlee left a comment
LGTM now.
@heitorschueroff merged this pull request in 75a4862.
Did this not make it into 1.6 in the end?
The original branch cut happened on the 23rd of July, IIRC, and after that only important bugfixes are cherry-picked into the release.
Implemented the SiLU activation function as discussed in #3169.
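Besides the functional form, the thread above also mentions a module form (nn.SiLU) so the activation can sit inside a model definition. A minimal dependency-free sketch of that callable-module pattern, for illustration only and not the torch.nn source:

```python
import math

class SiLU:
    """Callable object mimicking the nn.Module-style API (sketch)."""

    def __call__(self, x: float) -> float:
        # Same formula as the functional form: x * sigmoid(x).
        return x * (1.0 / (1.0 + math.exp(-x)))

act = SiLU()
print(act(0.0))  # 0.0
```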