
DataParallel module leaks memory #4865

@khanrc

Description

Environment

  • OS: Ubuntu 14.04, 16.04
  • PyTorch version: 0.3
  • How you installed PyTorch (conda, pip, source): pip
  • Python version: 2.7
  • CUDA/cuDNN version: 8.0/6.0, 9.0/7.0
  • GPU models and configuration: GeForce GTX 1080 Ti, TITAN Xp

Symptom

https://discuss.pytorch.org/t/dataparallel-makes-garbages/12771
https://nbviewer.jupyter.org/gist/khanrc/a21dbe0dc316c31387a56b683b94aa2d

  • A memory leak arises when using the DataParallel module
  • According to my tests, it mainly comes from the replicate step
    • So more GPU devices (more replicas) produce a larger leak
  • In the Jupyter notebook linked above, you can see the collected garbage, which indicates the memory leak
  • The leak comes from the DataParallel module itself; there is no leak without it
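The garbage check in the notebook above boils down to forcing a collection and inspecting `gc.garbage`. A minimal sketch of that technique in plain Python, where the `Replica` class is a hypothetical stand-in for the leaked object graphs (the real check needs torch and multiple GPUs); `gc.DEBUG_SAVEALL` is one way to make every collected cycle visible:

```python
import gc

# With DEBUG_SAVEALL, every object the cyclic collector reclaims is
# kept in gc.garbage instead of being freed, so leaks can be inspected.
gc.set_debug(gc.DEBUG_SAVEALL)

class Replica:
    """Hypothetical stand-in for a per-iteration module replica: an
    object graph containing a reference cycle, which plain reference
    counting cannot free."""
    def __init__(self):
        self.ref = self  # reference cycle

def train_step():
    Replica()  # created and immediately dropped, once per iteration

gc.collect()
del gc.garbage[:]  # start from a clean slate

for _ in range(5):
    train_step()

unreachable = gc.collect()   # number of unreachable objects found
print(len(gc.garbage) > 0)   # non-empty gc.garbage means leaked cycles
```

If no cycles were created per iteration, `gc.garbage` would stay empty after the final collection; a growing `gc.garbage` across iterations is the symptom reported here.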

Reproduce

Reproducing is easy: run the Jupyter notebook code linked above. Moreover, I hit this problem every time I use the DataParallel module.
