-
Notifications
You must be signed in to change notification settings - Fork 26.3k
Description
Issue description
I can run this code normally through CPU. but got the Error RuntimeError: cuda runtime error (9) : invalid configuration argument at /data/users/mabing/pytorch/aten/src/ATen/native/cuda/EmbeddingBag.cu:257 when run loss.backward(). through cuda().
Code example
import torch.optim as optim
import torch
import torch.nn as nn
import numpy as np
from scipy.special import expit
import os
import time
class SkipGramModel(nn.Module):
def __init__(self, component_size, word_size, dim):
super(SkipGramModel, self).__init__()
self.emb_size = dim
self.component_size = component_size
self.word_size = word_size
self.atten_layers = nn.Embedding(word_size,1)
self.u_embeddings = nn.EmbeddingBag(component_size,dim)
self.word_embeddings = nn.Embedding(word_size,dim,sparse=True)
self.v_embeddings = nn.Embedding(word_size,dim,sparse=True)
self.m = nn.Sigmoid()
self.init_emb()
def init_emb(self):
initrange = 0.5 / self.emb_size
self.word_embeddings.weight.data.uniform_(-initrange,initrange)
self.u_embeddings.weight.data.uniform_(-initrange, initrange)
self.v_embeddings.weight.data.uniform_(-0, 0)
atten = torch.zeros([self.word_size, 5])
atten[:, 0] += torch.log(torch.FloatTensor([4]))
self.atten_layers.weight.data = atten
def forward(self, word_in,component_in, word_out, offset):
char_in = torch.cuda.LongTensor(component_in[0])
redical_in = torch.cuda.LongTensor(component_in[1])
com1_in = torch.cuda.LongTensor(component_in[2])
com2_in = torch.cuda.LongTensor(component_in[3])
offset1 = torch.cuda.LongTensor(offset[0])
offset2 = torch.cuda.LongTensor(offset[1])
offset3 = torch.cuda.LongTensor(offset[2])
offset4 = torch.cuda.LongTensor(offset[3])
attention = torch.softmax(self.atten_layers(word_in),dim=-1).unsqueeze(1)
emb_uword = self.word_embeddings(word_in)
emb_char = self.u_embeddings(char_in,offset1)
emb_redical = self.u_embeddings(redical_in,offset2)
emb_com1 = self.u_embeddings(com1_in,offset3)
emb_com2 = self.u_embeddings(com2_in,offset4)
emb_all = torch.stack((emb_uword,emb_char,emb_redical,emb_com1,emb_com2),1)
emb_vword = self.v_embeddings(word_out)
emb_mixin = torch.bmm(attention,emb_all).squeeze(1)
score = torch.mul(emb_mixin, emb_vword)
score = torch.sum(score, dim=-1)
score = self.m(score)
return score
if __name__ == '__main__':
model = SkipGramModel(364, 180, 100).cuda()
optimizer = optim.SGD(model.parameters(), lr=0.025)
Lossfunc = nn.BCELoss(reduction='sum')
for _ in range(100):
word_in = torch.cuda.LongTensor([2]*128)
word_out = torch.cuda.LongTensor([2]*128)
label = torch.cuda.FloatTensor([1]*128)
component_in = [[3,5],[2,4,5],[2,3,4],[]]
offset = [[0]*127+[1],[0]*127+[1],[0]*128,[0]*128]
outs = model.forward(word_in, component_in, word_out, offset)
loss = Lossfunc(outs, label)
optimizer.zero_grad()
loss.backward()
optimizer.step()
System Info
`Collecting environment information...
PyTorch version: 1.0.0a0+7f0dd24
Is debug build: No
CUDA used to build PyTorch: 8.0.61
OS: Ubuntu 16.04.5 LTS
GCC version: (Ubuntu 4.9.3-13ubuntu2) 4.9.3
CMake version: version 3.12.2
Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: Tesla M40 24GB
GPU 1: Tesla M40 24GB
GPU 2: Tesla M40 24GB
GPU 3: Tesla M40 24GB
Nvidia driver version: 384.130
cuDNN version: Probably one of the following:
/usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn.so
/usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn.so.6
/usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn.so.6.0.21
/usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn.so.7
/usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn.so.7.0.5
/usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn_static.a
/usr/local/cuda-9.0/targets/x86_64-linux/lib/libcudnn.so
/usr/local/cuda-9.0/targets/x86_64-linux/lib/libcudnn.so.7
/usr/local/cuda-9.0/targets/x86_64-linux/lib/libcudnn.so.7.0.4
/usr/local/cuda-9.0/targets/x86_64-linux/lib/libcudnn.so.7.0.5
/usr/local/cuda-9.0/targets/x86_64-linux/lib/libcudnn_static.a
Versions of relevant libraries:
[pip] Could not collect
[conda] cuda80 1.0 0 soumith
[conda] cuda91 1.0 h4c16780_0 pytorch
[conda] pytorch 0.4.1 py36_cuda8.0.61_cudnn7.1.2_1 [cuda80] soumith
[conda] torch 1.0.0a0+7f0dd24
[conda] torch 0.4.1
[conda] torchvision 0.1.9 py36h7584368_1 soumith
[conda] torchvision 0.2.1 `
- PyTorch or Caffe2: PyTorch
- How you installed PyTorch (conda, pip, source): source
- Build command you used (if compiling from source):
python setup.py install - OS: Ubantu 16.04.5 LTS (Xenial Xerus)
- PyTorch version:0.4.1
- Python version:3.6.6
- CUDA/cuDNN version: 8.0.61
- GPU models and configuration:
- GCC version (if compiling from source): 4.9.3
- CMake version:3.12.2
- Versions of any other relevant libraries: