Allow np.memmap objects (numpy arrays based on files) to be processed… #39847
💊 CI failures summary and remediations: As of commit 3584a7b, there are no failures yet. (This comment was automatically generated by Dr. CI and has been revised 11 times.)
This sounds like a reasonable ask, but I don't have context on this part of the codebase. @ssnl does this sound reasonable to you? @gonglinyuan this would also require a test in order to be shipped. Something like pytorch/test/test_dataloader.py Line 1654 in d21ee2d
LGTM. The CI failures do not look relevant, but can we rebase to make sure?
Do you know how I can rebase so that these CI jobs can run? Thanks!
@gonglinyuan you didn't allow maintainers to push to the fork so I can't do that for you. But you should be able to do so just by
… by default_collate.
11049fe to 3584a7b
Thank you! I rebased and the CI runs successfully.
@zou3519 could you merge this? thanks :)
facebook-github-bot
@zou3519 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Allow np.memmap objects to be processed by default_collate
np.memmap objects behave the same as ordinary NumPy arrays; the only difference is that they are backed by a binary file on disk. However, the default_collate function used by the PyTorch DataLoader only accepts plain NumPy arrays (np.ndarray) and rejects np.memmap through its type check. This commit allows np.memmap objects to be processed by default_collate, so users can feed large on-disk arrays to the PyTorch DataLoader.
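To illustrate why a type check can reject np.memmap even though it behaves like an array, here is a minimal NumPy-only sketch. The helpers `accepts_exact_ndarray`, `accepts_ndarray_like`, and `stack_batch` are hypothetical names made up for this example and are not PyTorch's actual default_collate code; the file name and shape are also illustrative.

```python
import os
import tempfile

import numpy as np


def accepts_exact_ndarray(sample):
    """Hypothetical strict check: only the exact np.ndarray type passes."""
    return type(sample) is np.ndarray


def accepts_ndarray_like(sample):
    """Hypothetical relaxed check: np.ndarray and its subclasses pass."""
    return isinstance(sample, np.ndarray)


def stack_batch(samples):
    """Stack samples into one in-memory array, roughly as a collate step might."""
    if not all(accepts_ndarray_like(s) for s in samples):
        raise TypeError("batch must contain array-like samples")
    # np.asarray views each sample as a base ndarray; np.stack then copies
    # the rows into a new in-memory array.
    return np.stack([np.asarray(s) for s in samples])


# Build a small on-disk array (illustrative path and shape).
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "data.bin")
writer = np.memmap(path, dtype=np.float32, mode="w+", shape=(8, 3))
writer[:] = np.arange(24, dtype=np.float32).reshape(8, 3)
writer.flush()

# Reopen read-only; rows are read from the file when accessed.
data = np.memmap(path, dtype=np.float32, mode="r", shape=(8, 3))
rows = [data[i] for i in range(4)]

strict_ok = accepts_exact_ndarray(rows[0])   # np.memmap is a subclass, not np.ndarray itself
relaxed_ok = accepts_ndarray_like(rows[0])   # subclass instances pass isinstance
batch = stack_batch(rows)
```

With the change described in this commit, a Dataset whose `__getitem__` returns np.memmap rows like `data[i]` should be usable with the DataLoader's default collation, without first converting each sample to a plain array by hand.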