Description
Hi everyone,
I'm using the new torch.split function with a list of chunk sizes, together with an LSTM/GRU network (both trigger the bug).
On CPU, the code works perfectly.
On GPU, if I do something other than an RNN forward pass while iterating over the output of torch.split, it works fine. Otherwise it crashes.
StackTrace
...
File "/home/diego/Github/DocAgg/pygcn_modified/models.py", line 80, in forward
output, hidden = self.document_rnn(sentence_embeddings_per_doc, self.document_rnn_hidden)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 357, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/rnn.py", line 181, in forward
output, hidden = func(input, self.all_weights, hx, batch_sizes)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/_functions/rnn.py", line 315, in forward
return func(input, *fargs, **fkwargs)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/_functions/rnn.py", line 284, in forward
Variable(dropout_desc.state) if dropout_desc.state is not None else None)
RuntimeError: CUDNN_STATUS_EXECUTION_FAILED
Code
sentence_hidden_embeddings is a FloatTensor of shape [657, 700].
nb_sentences_per_doc is a Python list: [26, 13, 12, 20, 25, 26, 535]
```python
all_sentence_embeddings_per_doc = torch.split(sentence_hidden_embeddings.unsqueeze(0), nb_sentences_per_doc, dim=1)[:-1]
document_embeddings = []
for sentence_embeddings_per_doc in all_sentence_embeddings_per_doc:
    self.document_rnn_hidden = self.init_hidden()
    output, hidden = self.document_rnn(sentence_embeddings_per_doc, self.document_rnn_hidden)
    # output[-1][-1] == hidden[-1][-1] (GRU) and output[-1][-1] == hidden[0][-1][-1] (LSTM)
    doc_emb = hidden[-1] if self.mode == 'GRU' else (hidden[0][-1] if self.mode == 'LSTM' else None)
    document_embeddings.append(doc_emb)
    # TODO Remove. Doing only this perfectly works on GPU
    #doc_emb = torch.mean(sentence_embeddings_per_doc, dim=1)
    #document_embeddings.append(doc_emb)
cluster_embedding = torch.mean(torch.cat(document_embeddings), dim=0)
```
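For reference, the slices that torch.split produces from a list of sizes can be sketched in plain Python. This is only an illustration of the boundaries along dim=1; `split_bounds` is a hypothetical helper, not part of PyTorch. Each chunk returned by torch.split is a view into the original tensor covering one of these ranges.

```python
from itertools import accumulate

# Sizes from the report; they sum to 657, the first dimension of the input.
nb_sentences_per_doc = [26, 13, 12, 20, 25, 26, 535]


def split_bounds(sizes):
    """Return (start, end) index pairs for consecutive chunks of the given sizes.

    Hypothetical helper for illustration only.
    """
    ends = list(accumulate(sizes))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))


bounds = split_bounds(nb_sentences_per_doc)
# The trailing [:-1] in the original code drops the last (535-row) chunk.
kept = bounds[:-1]

for start, end in kept:
    print("rows {}..{} -> chunk of length {}".format(start, end - 1, end - start))
```

So the RNN is called six times, on sequences of length 26, 13, 12, 20, 25 and 26, each starting at a different offset into the same underlying storage.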
RNN
```python
if self.mode == 'GRU':
    self.document_rnn = nn.GRU(embedding_size, embedding_size, num_layers=self.nb_layers, bias=True, dropout=self.dropout, bidirectional=False, batch_first=True)
elif self.mode == 'LSTM':
    self.document_rnn = nn.LSTM(embedding_size, embedding_size, num_layers=self.nb_layers, bias=True, dropout=self.dropout, bidirectional=False, batch_first=True)
self.document_rnn_hidden = self.init_hidden()
```
Hidden_init
```python
def init_hidden(self):
    document_rnn_init_h = nn.Parameter(nn.init.xavier_uniform(torch.Tensor(self.nb_layers, self.batch_size, self.embedding_size).type(torch.FloatTensor)), requires_grad=True)
    if self.mode == 'GRU':
        return document_rnn_init_h
    elif self.mode == 'LSTM':
        document_rnn_init_c = nn.Parameter(nn.init.xavier_uniform(torch.Tensor(self.nb_layers, self.batch_size, self.embedding_size).type(torch.FloatTensor)), requires_grad=True)
        return (document_rnn_init_h, document_rnn_init_c)
```
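A self-contained script along the following lines might make the problem easier to reproduce outside my model. It is a sketch under several assumptions, not the original code: it targets a recent PyTorch release (no `Variable`), uses random input, a zero initial hidden state instead of the Xavier-initialised parameter, and guessed values for `nb_layers` and `dropout`. The `run_repro` function and its `clone_chunks` flag are hypothetical; cloning each chunk so that it owns its own storage is only a diagnostic worth trying, not a confirmed fix.

```python
import torch
import torch.nn as nn


def run_repro(device, clone_chunks=False, embedding_size=700, nb_layers=2, dropout=0.5):
    """Split sentence embeddings per document and run a GRU on each chunk.

    nb_layers and dropout are assumed values; the report does not state them.
    """
    nb_sentences_per_doc = [26, 13, 12, 20, 25, 26, 535]  # sizes from the report
    total = sum(nb_sentences_per_doc)  # 657 rows

    rnn = nn.GRU(embedding_size, embedding_size, num_layers=nb_layers, bias=True,
                 dropout=dropout, bidirectional=False, batch_first=True).to(device)

    # Random stand-in for sentence_hidden_embeddings ([657, 700] in the report).
    sentence_hidden_embeddings = torch.randn(total, embedding_size, device=device)

    # Same split as in the report: batch of 1, chunks along the sequence dimension,
    # last chunk dropped.
    chunks = torch.split(sentence_hidden_embeddings.unsqueeze(0),
                         nb_sentences_per_doc, dim=1)[:-1]

    document_embeddings = []
    for chunk in chunks:
        if clone_chunks:
            # Diagnostic only: copy the view into fresh storage.
            chunk = chunk.clone()
        # Simplification: zero initial hidden state of shape (layers, batch, hidden).
        h0 = torch.zeros(nb_layers, 1, embedding_size, device=device)
        output, hidden = rnn(chunk, h0)
        document_embeddings.append(hidden[-1])  # last layer, shape (1, embedding_size)

    return torch.mean(torch.cat(document_embeddings), dim=0)


if __name__ == "__main__":
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(run_repro(device).shape)
```

Running it once with `clone_chunks=False` and once with `clone_chunks=True` on the GPU would show whether the failure depends on the chunks being views into a shared tensor.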
- OS: Linux Mint 18.2 Sonya
- PyTorch version: From source (2b2d56d)
- How you installed PyTorch (conda, pip, source): pip
- Python version: 3.5
- CUDA/cuDNN version: 9.1/7.0.5 (latest versions)
- GPU models and configuration: Titan Xp 12 GB (Driver 390.12)
- GCC version (if compiling from source): (Ubuntu 5.4.0-6ubuntu1~16.04.6) 5.4.0 20160609