🐛 Bug
While trying to enable pin_memory=True in fairseq (facebookresearch/fairseq#3560), I noticed that a GPU 0 context was being created in the GPU 1 worker. I eventually root-caused this to the following code in dataloader.py:
pytorch/torch/utils/data/dataloader.py, line 930 in 8cf85a1:

    torch.cuda.current_device(),
The problem is that fairseq creates a worker thread which, as a side effect of iterating the DataLoader, spawns the pin_memory thread.
That worker thread gets the default device, GPU 0, so the pin_memory thread it spawns targets GPU 0 as well. I worked around it by calling torch.cuda.set_device() in that thread, but this was a hard bug to track down, and I am not sure how to avoid it in general. Perhaps torch could add a threading wrapper, similar to torch.multiprocessing, that would ensure the default device is consistent across threads (a sketch follows the note below).
Note: the GPU 1 worker calls torch.cuda.set_device() immediately after process creation. The problem is that sub-threads do not inherit the current device of the main thread.
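This is not an existing PyTorch API, just a minimal sketch of the kind of wrapper suggested above (the CudaThread name and its shape are hypothetical): it captures the creating thread's current device and re-applies it inside the new thread before running the target.

```python
import threading

import torch


class CudaThread(threading.Thread):
    """Hypothetical wrapper: remember the creating thread's current CUDA
    device and re-apply it in the new thread before running the target."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Captured in the creating thread (e.g. the GPU 1 worker's main thread).
        self._device = torch.cuda.current_device() if torch.cuda.is_available() else None

    def run(self):
        # Without this, the new thread starts on the default device (GPU 0),
        # and anything it spawns (such as the pin_memory thread) follows suit.
        if self._device is not None:
            torch.cuda.set_device(self._device)
        super().run()
```

With something like this, fairseq's sub-thread would see GPU 1 as its current device, and the pin_memory thread created while iterating would be pinned to GPU 1 as well.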
To Reproduce
Steps to reproduce the behavior:
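A minimal standalone sketch of the fairseq-like pattern (hypothetical script; assumes a machine with at least two GPUs):

```python
import threading

import torch
from torch.utils.data import DataLoader, TensorDataset


def iterate_loader():
    # Sub-thread: it does NOT inherit the main thread's current device,
    # so current_device() is 0 here even though the main thread set 1.
    print("current device in sub-thread:", torch.cuda.current_device())
    loader = DataLoader(TensorDataset(torch.randn(64, 4)),
                        batch_size=8, num_workers=1, pin_memory=True)
    # Iterating starts the pin_memory thread, which targets GPU 0.
    for _ in loader:
        pass


if __name__ == "__main__":
    # Mimics the fairseq GPU 1 worker process: the main thread pins itself to cuda:1.
    torch.cuda.set_device(1)
    t = threading.Thread(target=iterate_loader)
    t.start()
    t.join()
```

After running this, nvidia-smi shows a context for the process on GPU 0 even though only GPU 1 was ever requested; adding torch.cuda.set_device(1) at the top of iterate_loader makes it go away.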
Expected behavior
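Iterating a DataLoader with pin_memory=True from a process that only uses GPU 1 should not create a CUDA context (or allocate anything) on GPU 0.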
Environment
Please copy and paste the output from our environment collection script (or fill out the checklist below manually).
You can get the script and run it with:
wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py
- PyTorch Version (e.g., 1.0):
- OS (e.g., Linux):
- How you installed PyTorch (conda, pip, source):
- Build command you used (if compiling from source):
- Python version:
- CUDA/cuDNN version:
- GPU models and configuration:
- Any other relevant information: