Make activation functions available from modeling_utils (PyTorch) #1371
Conversation
* This commit replaces references to activation functions/modules with a dict of functions that lives in `modeling_utils`. This ensures that all activation functions are available to all modules.
* In addition, the native PyTorch gelu function will be used when it is available.
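A minimal sketch of what such a mapping might look like. The `ACT2FN` name comes from the discussion below; the exact keys and the erf-based gelu fallback shown here are illustrative assumptions, not the repository's actual code:

```python
import math

import torch
import torch.nn.functional as F


def _gelu_python(x):
    # Erf-based GELU, used as a fallback when the running PyTorch
    # version does not ship a native gelu implementation.
    return x * 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0)))


def swish(x):
    # Swish / SiLU: x * sigmoid(x).
    return x * torch.sigmoid(x)


# Prefer the native PyTorch gelu when it exists.
gelu = F.gelu if hasattr(F, "gelu") else _gelu_python

# Hypothetical registry mapping config strings to activation functions.
ACT2FN = {
    "gelu": gelu,
    "relu": F.relu,
    "swish": swish,
    "tanh": torch.tanh,
}
```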
Codecov Report
@@ Coverage Diff @@
## master #1371 +/- ##
==========================================
+ Coverage 83.76% 84.74% +0.97%
==========================================
Files 84 84
Lines 12596 12559 -37
==========================================
+ Hits 10551 10643 +92
+ Misses 2045 1916 -129
Continue to review full report at Codecov.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Unstale.
Would feel syntactically cleaner if we could do …
Sounds good, but note that this is not something I introduced. The ACT2FN dict already existed, but it seems it wasn't used consistently.
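For context, a hedged sketch of the kind of consistent ACT2FN usage being discussed. The layer and its constructor arguments are hypothetical; only the ACT2FN-lookup pattern is the point, and it assumes a PyTorch version that provides `torch.nn.functional.gelu`:

```python
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical registry standing in for the ACT2FN dict in modeling_utils.
ACT2FN = {"gelu": F.gelu, "relu": F.relu}


class Intermediate(nn.Module):
    # Hypothetical layer: the activation is looked up once from ACT2FN
    # instead of being re-implemented in every model file.
    def __init__(self, hidden_size, intermediate_size, hidden_act="gelu"):
        super().__init__()
        self.dense = nn.Linear(hidden_size, intermediate_size)
        self.act_fn = ACT2FN[hidden_act]

    def forward(self, hidden_states):
        return self.act_fn(self.dense(hidden_states))
```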
Ah yeah, I see. Would you want to make this change, if you have the time/bandwidth? (+ rebasing on current master so we can merge easily?)
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
AFAICT, this has been done by @sshleifer on master. Re-open if necessary!
This commit replaces references to activation functions/modules with a dict of functions that lives in `modeling_utils`. This ensures that all activation functions are available to all modules, particularly custom functions such as swish and new_gelu.

NOTE that this replaces all `nn.Module`s by bare functions, except for one which was required for testing to be of the type `nn.Module`. If requested, this can be reverted so that only function calls are replaced by `ACT2FN` functions, and existing `nn.Module`s are left untouched.

NOTE that one would thus also expect all usages of activation functions to be taken from `ACT2FN` for consistency's sake.

NOTE that since the Module counterpart of PyTorch's GeLU isn't available (yet), it might be worth waiting to implement this pull request, and then use Modules and functions in the places where one would expect them, i.e. a Module when part of the architecture, a function when processing other kinds of data.
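To make that Module-vs-function convention concrete, here is a small hypothetical sketch. It assumes a PyTorch version that ships both `nn.GELU` and `torch.nn.functional.gelu`, which was not yet the case when this PR was opened; the class and function names are illustrative, not the repository's code:

```python
import torch.nn as nn
import torch.nn.functional as F


class FeedForward(nn.Module):
    # Hypothetical block: a Module-style activation when it is a fixed
    # part of the architecture ...
    def __init__(self, hidden_size):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.act = nn.GELU()  # assumes a PyTorch version that provides nn.GELU

    def forward(self, x):
        return self.act(self.dense(x))


def rescore(logits):
    # ... and the functional form when an activation is applied ad hoc
    # to other data, outside of a module definition.
    return F.gelu(logits)
```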