
I'm training a CNN model on images. Initially, I was training on image patches of size (256, 256) and everything was fine. Then I changed my dataloader to load full HD images (1080, 1920) and to crop them after some processing. Since that change, the GPU memory keeps increasing with every batch. Why is this happening?

PS: While tracking losses, I'm calling loss.detach().item() so that the loss tensor is not retained in the computation graph.
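For context, here is a minimal, self-contained sketch of the kind of loop I mean; the tiny model, random tensors, and batch size are just placeholders for my real CNN and HD dataloader:

import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Placeholder model, standing in for the real CNN
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

running_loss = 0.0
for step in range(4):  # stand-in for iterating over the HD dataloader
    images = torch.randn(2, 3, 1080, 1920, device=device)   # full HD batch
    targets = torch.randint(0, 10, (2,), device=device)

    optimizer.zero_grad()
    outputs = model(images)
    loss = criterion(outputs, targets)
    loss.backward()
    optimizer.step()

    running_loss += loss.detach().item()  # track only a Python float, not the graph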

2 Answers


As suggested here, explicitly deleting the input, output, and loss tensors helped.

Additionally, my data was stored in a dictionary. Deleting the dictionary alone wasn't sufficient; I had to iterate over its entries and delete each of them, as in the sketch below.
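For illustration only, here is a minimal sketch of that cleanup step at the end of an iteration; the dict keys, the fake batch, and the placeholder computations are not the actual pipeline:

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in for one batch dict produced by the dataloader (the real one holds HD images etc.)
batch = {
    "image": torch.randn(2, 3, 1080, 1920, device=device),
    "target": torch.randint(0, 10, (2,), device=device),
}

# ... forward pass, loss computation, backward, and optimizer step would happen here ...
outputs = batch["image"].mean(dim=(1, 2, 3))               # placeholder for model(batch["image"])
loss = (outputs - batch["target"].float()).pow(2).mean()   # placeholder for the real loss

# Drop the per-iteration tensors so nothing keeps the graph or the HD images alive
del outputs, loss

# Deleting the dict by itself wasn't enough in my case: delete each entry, then the dict
for key in list(batch.keys()):
    del batch[key]
del batch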



I had a similar issue, but the memory accumulated much more slowly: after millions of iterations a lot of memory was in use (hard to debug, as you can imagine). I think it was because I had run export CUDA_LAUNCH_BLOCKING=1 and export TORCH_USE_CUDA_DSA=1 to turn on the debugging flags before starting my run.

Another thing worth trying for anyone hitting this issue is to clear memory at the end of each epoch:

import gc
import torch

gc.collect()              # collect unreachable Python objects, including reference cycles
torch.cuda.empty_cache()  # release unused cached blocks back to the GPU driver
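For example, placed at the end of each epoch (the epoch count and the omitted loop body are placeholders):

import gc
import torch

num_epochs = 10  # placeholder

for epoch in range(num_epochs):
    # ... training iterations for this epoch would run here ...

    gc.collect()              # collect unreachable Python objects at the epoch boundary
    torch.cuda.empty_cache()  # release unused cached blocks back to the GPU driver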
