
Conversation

@ngimel (Collaborator) commented Oct 12, 2018

This PR gets rid of an unnecessary copy of weight gradients in the cudnn RNN. It also removes an unnecessary check on input size when deciding whether to use the persistent RNN algorithm, and adds a doc string explaining when the persistent RNN can be used. cc @ezyang

@soumith (Contributor) commented Oct 12, 2018

test failures are unrelated

@facebook-github-bot (Contributor) left a comment

soumith is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@houseroad (Member) commented

@pytorchbot retest this please

zdevito pushed a commit to zdevito/ATen that referenced this pull request Oct 12, 2018
Summary:
This PR gets rid of an unnecessary copy of weight gradients in the cudnn RNN. It also removes an unnecessary check on input size when deciding whether to use the persistent RNN algorithm, and adds a doc string explaining when the persistent RNN can be used. cc ezyang
Pull Request resolved: pytorch/pytorch#12600

Differential Revision: D10359981

Pulled By: soumith

fbshipit-source-id: 0fce11b527d543fabf21e6e9213fb2879853d7fb
if (copy) {
  // Copy values from the source parameter into the destination slot, viewing the
  // source as the destination's shape in case the two shapes differ.
  param_to.copy_(param_from.view_as(param_to));
} else {
  // No data copy: only adjust the source parameter's shape to match the destination.
  param_from.resize_as_(param_to);
}
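
For intuition, here is a minimal Python-level sketch (my own illustration, not the ATen code above) of the distinction the `copy` flag controls: copying a gradient out of a flat buffer versus exposing it as a view that aliases the buffer.

```python
import torch

# Hypothetical stand-in for the flat buffer cudnn writes all weight gradients into.
flat_grad_buf = torch.randn(20)

# Copy semantics: the per-parameter gradient gets its own storage (one extra read/write).
grad_copy = torch.empty(4, 5)
grad_copy.copy_(flat_grad_buf.view_as(grad_copy))

# View semantics: the per-parameter gradient aliases the flat buffer, so no copy is made.
grad_view = flat_grad_buf.view(4, 5)
assert grad_view.data_ptr() == flat_grad_buf.data_ptr()
```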

3) input data has dtype ``torch.float16``,
4) a V100 GPU is used,
5) input data is not in ``PackedSequence`` format,
the persistent algorithm can be selected to improve performance.
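
As a rough usage example (sizes and module choice are mine, not from the PR), an LSTM call satisfying the quoted conditions on a suitable GPU might look like:

```python
import torch

# Hypothetical example meeting the quoted conditions: float16 input on the GPU,
# plain tensor input (not a PackedSequence). Requires a CUDA build with cudnn.
rnn = torch.nn.LSTM(input_size=256, hidden_size=256, num_layers=2).cuda().half()
x = torch.randn(50, 32, 256, device="cuda", dtype=torch.float16)  # (seq, batch, feature)
output, (h_n, c_n) = rnn(x)
```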

@ngimel deleted the rnn_copies branch January 16, 2019 19:51
facebook-github-bot pushed a commit that referenced this pull request Aug 18, 2020
Summary:
Should close #36428.

The cudnn RNN API expects weights to occupy a flat buffer in memory with a particular layout.  This PR implements a "speed of light" fix:  [`_cudnn_rnn_cast_reflatten`](https://github.com/pytorch/pytorch/pull/42385/files#diff-9ef93b6a4fb5a06a37c562b83737ac6aR327) (the autocast wrapper assigned to `_cudnn_rnn`) copies weights to the right slices of a flat FP16 buffer with a single read/write per weight (as opposed to casting them to FP16 individually then reflattening the individual FP16 weights, which would require 2 read/writes per weight).
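
For intuition only, a hedged Python-level sketch of the two strategies described above (the function names, shapes, and pure-Python loops are mine; the actual change lives in the C++ autocast binding):

```python
import torch

def cast_then_reflatten(weights):
    # Two reads/writes per weight: first cast each weight to FP16, then copy the
    # resulting FP16 tensors into the flat buffer cudnn expects.
    halves = [w.half() for w in weights]                       # read/write #1 per weight
    flat = torch.empty(sum(w.numel() for w in weights), dtype=torch.float16)
    offset = 0
    for h in halves:
        flat[offset:offset + h.numel()].copy_(h.reshape(-1))   # read/write #2 per weight
        offset += h.numel()
    return flat

def cast_into_flat_buffer(weights):
    # One read/write per weight: copy_ casts FP32 -> FP16 while writing directly
    # into the destination slice of the flat buffer.
    flat = torch.empty(sum(w.numel() for w in weights), dtype=torch.float16)
    offset = 0
    for w in weights:
        flat[offset:offset + w.numel()].copy_(w.reshape(-1))   # fused cast + copy
        offset += w.numel()
    return flat

weights = [torch.randn(4, 3), torch.randn(4)]
assert torch.equal(cast_then_reflatten(weights), cast_into_flat_buffer(weights))
```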

It isn't pretty but IMO it doesn't make rnn bindings much more tortuous than they already are.

The [test](https://github.com/pytorch/pytorch/pull/42385/files#diff-e68a7bc6ba14f212e5e7eb3727394b40R2683) tries a forward under autocast and a backward for the full cross product of RNN options and input/weight/hidden dtypes.  As with all FP16-list autocast tests, forward output and backward grads are checked against a control where inputs (including RNN module weights in this case) are pre-cast to FP16 on the Python side.
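
A hedged sketch of that comparison pattern (the module, sizes, and tolerances are mine, not the actual test):

```python
import copy
import torch

# Hypothetical sketch: forward/backward under autocast vs. a control whose module
# weights and inputs were pre-cast to FP16 by hand. Requires a CUDA build with cudnn.
rnn = torch.nn.LSTM(input_size=16, hidden_size=16).cuda()
x = torch.randn(8, 4, 16, device="cuda", requires_grad=True)

with torch.cuda.amp.autocast():
    out_autocast, _ = rnn(x)       # cudnn RNN runs in FP16 via the autocast wrapper
out_autocast.sum().backward()

control_rnn = copy.deepcopy(rnn).half()
control_rnn.flatten_parameters()   # re-pack the copied FP16 weights contiguously
x_control = x.detach().clone().half().requires_grad_()
out_control, _ = control_rnn(x_control)
out_control.sum().backward()

assert torch.allclose(out_autocast.float(), out_control.float(), atol=1e-3)
assert torch.allclose(x.grad.float(), x_control.grad.float(), atol=1e-3)
```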

Not sure who to ask for review, tagging ezyang and ngimel because Ed wrote this file (almost 2 years ago) and Natalia did the most recent major [surgery](#12600).

Side quests discovered:
- Should we update [persistent RNN heuristics](https://github.com/pytorch/pytorch/blob/dbdd28207c5cf6c4a35ceb1de0811c4812e8882c/aten/src/ATen/native/cudnn/RNN.cpp#L584) to include compute capability 8.0 (see the capability-check sketch after this list)?  Could be another PR but seems easy enough to include.
- Many (maybe all?!) of the raw cudnn API calls in [RNN.cpp](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cudnn/RNN.cpp) are deprecated in cudnn 8.  I don't mind taking the action item to update them since my mental cache is full of RNN stuff, but that would be a substantial separate PR.
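
A hedged sketch of the capability check mentioned in the first bullet (this only shows how the capability would be queried from Python; it is not the RNN.cpp heuristic itself):

```python
import torch

# Hedged sketch: a compute capability 8.0 update to the heuristic would key on the
# device's (major, minor) capability, which can be queried like this.
major, minor = torch.cuda.get_device_capability(torch.cuda.current_device())
is_volta = major == 7            # the V100-class devices the current doc string mentions
is_ampere_or_newer = major >= 8  # what a compute capability 8.0 check would add
print(f"sm_{major}{minor}: volta={is_volta}, ampere_or_newer={is_ampere_or_newer}")
```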

Pull Request resolved: #42385

Reviewed By: zhangguanheng66

Differential Revision: D23077782

Pulled By: ezyang

fbshipit-source-id: a2afb1bdab33ba0442879a703df13dc87f03ec2e