
Conversation

@ngimel (Collaborator) commented May 24, 2019

Uses an int64_t OffsetCalculator when necessary; fix for #20888.
cc @colesbury.

@colesbury, should it be uint32_t here?

    int64_t max_value = std::numeric_limits<int32_t>::max();

The default indexing type is uint32_t, not int32_t.
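For context, the cutoff being discussed can be sketched like this (a hypothetical illustration of the dispatch, not PyTorch's actual TensorIterator code — the function name is made up):

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical sketch (not PyTorch's actual code): use the 32-bit offset
// path only while every element offset fits under the cap. Note the cap is
// int32_t max rather than uint32_t max; the reply below explains why the
// fast-division trick needs that extra bit of headroom.
bool can_use_32bit_indexing(int64_t numel) {
  constexpr int64_t max_value = std::numeric_limits<int32_t>::max();
  return numel <= max_value;
}
```

Anything above the cap would fall through to the int64_t OffsetCalculator this PR adds.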

@pytorchbot added labels on May 24, 2019: module: cuda (Related to torch.cuda, and CUDA support in general), module: operators
@colesbury (Member) commented:
> should it be uint32_t here?

No, because the 32-bit code path uses fast integer division (THCIntegerDivider.cuh), and N-bit division can require an (N+1)-bit magic number.

I slightly prefer my patch (#20919). The 32-bit + split approach is slightly faster than 64-bit indexing (23.5 ms vs. 33.3 ms on the repro you wrote for #20888) and has fewer template specializations. Neither point matters much: we aren't optimizing for the giant-tensor case, and the extra specializations are unlikely to have a significant impact on overall compilation time.
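The (N+1)-bit magic-number point can be seen in a standalone sketch of the technique THCIntegerDivider.cuh uses (this is illustrative C++ of my own, not PyTorch's code): for a 32-bit divisor the multiplier is conceptually 33 bits, the implicit top bit is folded into an extra add, and that add stays overflow-free only when indices are capped at int32_t max.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative magic-number unsigned division (the general technique behind
// THCIntegerDivider.cuh; not PyTorch's actual code).
struct FastDivider {
  uint32_t shift;
  uint32_t magic;  // low 32 bits of a conceptually 33-bit multiplier

  explicit FastDivider(uint32_t d) {
    assert(d >= 2);
    shift = 0;
    while ((uint64_t{1} << shift) < d) ++shift;  // shift = ceil(log2(d))
    // Full multiplier is 2^32 + magic, i.e. 33 bits. Only the low word is
    // stored; the implicit top bit becomes the "+ n" in div() below.
    magic = uint32_t((uint64_t{1} << 32) * ((uint64_t{1} << shift) - d) / d + 1);
  }

  uint32_t div(uint32_t n) const {
    // Requires n <= INT32_MAX so that t + n cannot overflow 32 bits --
    // which is why the fast path caps at int32_t max, not uint32_t max.
    assert(n <= uint32_t(INT32_MAX));
    uint32_t t = uint32_t((uint64_t(n) * magic) >> 32);  // mulhi(n, magic)
    return (t + n) >> shift;
  }
};
```

On the GPU the high-word multiply would be `__umulhi`; the host sketch above emulates it with a 64-bit product.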

@ngimel (Collaborator, Author) commented May 24, 2019

Closed in favor of #20919.


Labels: module: cuda (Related to torch.cuda, and CUDA support in general), open source
