Environment
- OS: Ubuntu 14.04, 16.04
- PyTorch version: 0.3
- How you installed PyTorch (conda, pip, source): pip
- Python version: 2.7
- CUDA/cuDNN version: 8.0/6.0, 9.0/7.0
- GPU models and configuration: GeForce GTX 1080 Ti, TITAN Xp
Symptom
https://discuss.pytorch.org/t/dataparallel-makes-garbages/12771
https://nbviewer.jupyter.org/gist/khanrc/a21dbe0dc316c31387a56b683b94aa2d
- A memory leak arises when using the DataParallel module
- Based on my tests, it mainly comes from the replicate step
- So more GPU devices (more replicas) produce more leaked objects
- The Jupyter notebook linked above shows the collected garbage objects, which indicates the leak (see the sketch after this list)
- The leak comes from the DataParallel module itself; there is no leak without it
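For reference, here is a minimal sketch of how the garbage can be observed (not the notebook's exact code; the model and tensor sizes are made up for illustration):

```python
import gc

import torch
import torch.nn as nn
from torch.autograd import Variable

# Keep everything the collector frees in gc.garbage so it can be inspected.
gc.set_debug(gc.DEBUG_SAVEALL)

model = nn.DataParallel(nn.Linear(10, 10).cuda())
x = Variable(torch.randn(32, 10).cuda())
model(x).sum().backward()

# With DataParallel, gc.collect() reports unreachable (cyclic) objects
# after the iteration; without it, the count stays at zero.
print('collected: %d objects' % gc.collect())
print(gc.garbage[:5])  # a sample of the collected objects
```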
Reproduce
Reproducing is easy: just run the code from the Jupyter notebook linked above. I have hit this problem every single time I have used the DataParallel module.
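A self-contained repro along these lines (again a sketch under the same assumptions as above, not the notebook's exact code) could look like:

```python
import gc

import torch
import torch.nn as nn
from torch.autograd import Variable

def collected_after(model, steps=10):
    """Run a few forward/backward passes and report what gc collects."""
    gc.collect()  # clear any pre-existing garbage first
    for _ in range(steps):
        x = Variable(torch.randn(32, 10).cuda())
        model(x).sum().backward()
    return gc.collect()

base = nn.Linear(10, 10).cuda()
print('plain module: %d' % collected_after(base))                  # 0 in my tests
print('DataParallel: %d' % collected_after(nn.DataParallel(base))) # nonzero, grows with replicas
```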