Status: Closed
Labels: module: cuda (Related to torch.cuda, and CUDA support in general), triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
Description
🐛 Describe the bug
The following code snippet silently fails to capture the graph because memory is allocated on one device while the graph capture is attempted on another.
```python
import torch

device = torch.device("cuda:0")
x = torch.randn(10, dtype=torch.float32, device=device)
y = torch.randn(10, dtype=torch.float32, device=device)
z = torch.zeros(10, dtype=torch.float32, device=device)

with torch.cuda.device('cuda:1'):  # Wrong device: the tensors live on cuda:0
    g = torch.cuda.CUDAGraph()
    with torch.cuda.graph(g):
        z = x + y

for i in range(3):
    x.normal_()
    y.normal_()
    g.replay()
    print(z)  # One would expect this to print different values each iteration,
              # but it does not, because capture ran with current device cuda:1
              # while all the tensors are on cuda:0
print('Test passed')
```

I believe it should raise an error. However, I couldn't think of any easy way of doing it.
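For comparison, here is a minimal sketch of the same reproduction with capture performed on the device that actually owns the tensors; with the device context and the tensors in agreement, z is updated on every replay (the sketch assumes a machine with at least one CUDA device):

```python
import torch

device = torch.device("cuda:0")
x = torch.randn(10, dtype=torch.float32, device=device)
y = torch.randn(10, dtype=torch.float32, device=device)

g = torch.cuda.CUDAGraph()
with torch.cuda.device(device):  # capture on the tensors' own device
    with torch.cuda.graph(g):
        z = x + y

for _ in range(3):
    x.normal_()
    y.normal_()
    g.replay()
    print(z)  # now prints fresh values on each replay
```

Until PyTorch itself raises an error here, a user-side guard is one possible workaround. The helper below is hypothetical (checked_graph is not a PyTorch API); it only compares tensor devices against torch.cuda.current_device() before handing off to torch.cuda.graph:

```python
import torch

def checked_graph(g, *tensors):
    # Hypothetical guard, not part of PyTorch: refuse to start capture when
    # any input tensor lives on a device other than the current one.
    cur = torch.cuda.current_device()
    for t in tensors:
        if t.is_cuda and t.device.index != cur:
            raise RuntimeError(
                f"tensor on {t.device}, but capture would run on cuda:{cur}"
            )
    return torch.cuda.graph(g)

# Usage: with checked_graph(g, x, y): z = x + y
```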
cc @ngimel @ptrblck
Versions
Latest