[BugFix] chunk_size should always be int64_t #165971

Closed
lingebeng wants to merge 4 commits into pytorch:main from
lingebeng:linhaifeng/bug_fix/Integer-overflow

Conversation

@lingebeng
Contributor

@lingebeng lingebeng commented Oct 21, 2025

Inspired by #156872
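
For context, a minimal sketch (not code from this PR) of the overflow class the title refers to: element counts above 2**31 - 1 no longer fit in a 32-bit chunk_size and silently wrap.

numel = 27000008 * 192          # elements in the repro discussed below
wrapped = numel & 0xFFFFFFFF    # the low 32 bits a 32-bit counter would keep
print(numel, wrapped)           # 5184001536 889034240 -- silently wrong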

@pytorch-bot

pytorch-bot bot commented Oct 21, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/165971

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 4d17f59 with merge base 03f3f78:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot bot added the release notes: cuda label Oct 21, 2025
@cyyever
Collaborator

cyyever commented Oct 21, 2025

Since it is too expensive to create large dense tensors, is it possible to create large sparse tensors to test the ops?
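
A minimal sketch of that idea, with hypothetical sizes: a sparse COO tensor can have a logical shape far beyond INT32_MAX elements while only materializing a few nonzeros, so it is cheap to construct.

import torch

# Two nonzeros inside a logical shape of 2**32 x 2 elements (hypothetical sizes).
indices = torch.tensor([[0, 2**31], [0, 1]])
values = torch.tensor([1.0, 2.0])
big = torch.sparse_coo_tensor(indices, values, size=(2**32, 2))
print(big.shape, big._nnz())  # torch.Size([4294967296, 2]) 2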

@lingebeng changed the title from "[BugFix] chunk_size should always be int64_t for Foreach functors" to "[BugFix] chunk_size should always be int64_t" Oct 21, 2025
@lingebeng
Contributor Author

Of course!

@lingebeng
Contributor Author

import torch
from torch.optim import Adagrad


def test_torch_adagrad():
    # 27000008 * 192 = 5,184,001,536 elements, well past INT32_MAX (2**31 - 1)
    num_params = 27000008
    param_size = 192
    param = torch.randn(num_params, param_size, device="cuda", dtype=torch.float32, requires_grad=True)
    grad = torch.randn_like(param) * 0.01
    param.grad = grad
    optimizer = Adagrad([param], lr=0.01)
    optimizer.step()
    torch.cuda.synchronize()


if __name__ == "__main__":
    test_torch_adagrad()

I am so sorry, I cannot run the code on my GPU; could you run it? @cyyever

torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 19.31 GiB. GPU 0 has a total capacity of 31.74 GiB of which 12.12 GiB is free. Including non-PyTorch memory, this process has 19.61 GiB memory in use. Of the allocated memory 19.31 GiB is allocated by PyTorch, and 0 bytes is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

@cyyever
Collaborator

cyyever commented Oct 21, 2025

@lingebeng

import torch
from torch.optim import Adagrad


def test_torch_adagrad():
    # 27000008 * 150 = 4,050,001,200 elements, still above INT32_MAX,
    # while bfloat16 halves the per-tensor memory cost
    num_params = 27000008
    param_size = 150
    param = torch.randn(num_params, param_size, device="cuda:1", dtype=torch.bfloat16, requires_grad=True)
    grad = torch.randn_like(param) * 0.01
    param.grad = grad
    optimizer = Adagrad([param], lr=0.01)
    optimizer.step()
    torch.cuda.synchronize()


if __name__ == "__main__":
    test_torch_adagrad()

This one allocates less than 40 GB, but no error is raised.
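
For reference, a quick back-of-the-envelope check (not from the PR) that this repro does cross the 32-bit element boundary, and that the memory figure is plausible:

INT32_MAX = 2**31 - 1
numel = 27000008 * 150        # 4050001200 elements in the bfloat16 tensor
print(numel > INT32_MAX)      # True: past the int32 range
# param + grad + Adagrad's accumulator, roughly 2 bytes each in bfloat16:
print(3 * numel * 2 / 1e9)    # ~24.3 GB, consistent with "less than 40 GB"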

@lingebeng
Contributor Author

Thanks, I see. Maybe it's not a bug!

@cyyever
Collaborator

cyyever commented Oct 21, 2025

@lingebeng We can have further chats via e-mail.

@lingebeng
Contributor Author

OK, I have contacted you!

@cyyever requested a review from albanD October 21, 2025 16:05
Collaborator

@albanD left a comment

@cyyever my layman understanding is that int is int64_t on Linux/Mac but int32_t on Windows. So I would only expect to see this fail on Windows.

Generally, we do want to make these types explicit to avoid Windows-only issues.
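
For what it's worth, under the common data models the platform-dependent width is usually long rather than int: LP64 Linux/macOS make long 64-bit, LLP64 Windows keeps it 32-bit, and int is 32-bit on all three. A quick way to check from Python:

import ctypes

print(ctypes.sizeof(ctypes.c_int))    # 4 on Linux, macOS, and Windows
print(ctypes.sizeof(ctypes.c_long))   # 8 on Linux/macOS (LP64), 4 on Windows (LLP64)
print(ctypes.sizeof(ctypes.c_int64))  # 8 everywhere, hence the explicit int64_t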

@Skylion007
Collaborator

@pytorchbot merge

@pytorch-bot bot added the ciflow/trunk label Oct 21, 2025
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@cyyever
Collaborator

cyyever commented Oct 22, 2025

@albanD Yes, MSVC still recognises int as int32_t, likely for Win32 compatibility.

zhudada0120 pushed a commit to zhudada0120/pytorch that referenced this pull request Oct 22, 2025