Description
I am trying to speed up neural style transfer (https://github.com/leongatys/PytorchNeuralStyleTransfer) on an Nvidia Tesla V100 by using FP16.
I modified the code to move the VGG network to cuda().half(). In addition, all three images (the style image, the content image, and opt_img) are in FP16. I tried to keep the loss functions in FP32 because FP16 easily produces NaN and infinity.
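For clarity, here is a minimal sketch of the FP16/FP32 split I mean (the variable names here are illustrative, not the exact gist code): the forward pass runs in FP16, but I upcast the activations to FP32 before computing the loss, since the squared-error sum overflows FP16 easily.

```python
import torch

# Illustrative stand-ins for FP16 activations coming out of the network
# (in the real code these come from the VGG forward pass on the GPU).
pred = torch.randn(1, 64, 32, 32).half()
target = torch.randn(1, 64, 32, 32).half()

# Upcast to FP32 before the loss so the reduction cannot overflow FP16.
loss = torch.nn.functional.mse_loss(pred.float(), target.float())
print(loss.dtype)  # torch.float32
```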
The code is at https://gist.github.com/michaelhuang74/009e149a2002b84696731fb599408c90
When I ran the code, I encountered the following error.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
Traceback (most recent call last):
  File "neural-style-Gatys-half.py", line 167, in <module>
    style_targets = [GramMatrix()(A).detach().cuda() for A in vgg(style_image, style_layers)]
  File "/home/mqhuang2/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 319, in __call__
    result = self.forward(*input, **kwargs)
  File "neural-style-Gatys-half.py", line 86, in forward
    G.div_(hw)
RuntimeError: value cannot be converted to type Half without overflow: 960000
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
It seems that although I tried to keep the GramMatrix (see line 82) and the loss functions (see line 155) in FP32, PyTorch somehow converted the computation to FP16 inside the GramMatrix forward() method, and the divisor hw = 960000 cannot be represented as a Half.
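For reference, FP16 tops out at 65504, so a scalar like 960000 overflows when converted to Half. Below is a small sketch (my own illustrative code, assuming the Gram matrix normalizes by h*w as in the gist) showing the overflow and a possible workaround: upcast the FP16 feature map to FP32, do the Gram computation and division there, then cast back.

```python
import torch

# The largest finite FP16 value; anything above it overflows to inf.
print(torch.finfo(torch.float16).max)  # 65504.0
print(torch.tensor(960000.0).half())   # tensor(inf, dtype=torch.float16)

def gram_matrix_fp32(feat):
    """Hypothetical workaround: compute the Gram matrix in FP32, return FP16."""
    b, c, h, w = feat.size()
    f = feat.float().view(b, c, h * w)      # upcast FP16 features to FP32
    G = torch.bmm(f, f.transpose(1, 2))    # (b, c, c) Gram matrix
    G.div_(h * w)                           # safe in FP32, overflows in FP16
    return G.half()                         # cast back for the FP16 pipeline

feat = torch.randn(1, 4, 8, 8).half()
G = gram_matrix_fp32(feat)
print(G.dtype)  # torch.float16
```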
Any idea how to resolve this error?