Commit 0702b5f

xuhdev authored and facebook-github-bot committed
Partially parallelize randperm on CPU. (#21529)
Summary: This commit parallelizes the initialization step of randperm on CPU, which fills the result with the values 0 to n - 1.
Pull Request resolved: #21529
Differential Revision: D15855402
Pulled By: VitalyFedyunin
fbshipit-source-id: f1ba54587451f9cb0eb5e542c3c5b458b48e1a3d
1 parent e388f70 commit 0702b5f

1 file changed: +5 -3 lines changed


aten/src/ATen/native/TensorFactories.cpp

Lines changed: 5 additions & 3 deletions
@@ -444,9 +444,11 @@ void randperm_cpu(Tensor& result, int64_t n, CPUGenerator* generator) {
   result.resize_({n});
   int64_t r__stride_0 = result.stride(0);
 
-  for(int64_t i = 0; i < n; i++) {
-    r__data[i*r__stride_0] = static_cast<scalar_t>(i);
-  }
+  at::parallel_for(0, n, internal::GRAIN_SIZE,
+                  [&r__data, &r__stride_0](int64_t p_begin, int64_t p_end) {
+    for(int64_t i = p_begin; i < p_end; i++)
+      r__data[i*r__stride_0] = static_cast<scalar_t>(i);
+  });
 
   for(int64_t i = 0; i < n - 1; i++)
   {
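
For readers without an ATen build at hand, the chunked initialization pattern in this change can be approximated with the standard library alone. The following is a minimal sketch, not the ATen implementation: `parallel_for_sketch` and `iota_strided` are hypothetical names, `std::thread` stands in for ATen's thread pool, and the assumption is that ranges no larger than the grain size are simply run on the calling thread. Each worker writes a disjoint index range, which is why the initialization step needs no locking.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for at::parallel_for (NOT the ATen implementation).
// Splits [begin, end) into contiguous chunks and calls f(chunk_begin, chunk_end)
// on each chunk, one std::thread per chunk. Small ranges run serially.
template <typename F>
void parallel_for_sketch(int64_t begin, int64_t end, int64_t grain_size, const F& f) {
  assert(grain_size > 0);
  if (begin >= end) return;
  const int64_t total = end - begin;
  // hardware_concurrency() may return 0 when unknown, so clamp to at least 1.
  const int64_t hw = std::max<int64_t>(1, std::thread::hardware_concurrency());
  // Never use more threads than there are grain-sized chunks of work.
  const int64_t max_chunks = (total + grain_size - 1) / grain_size;
  const int64_t num_threads = std::min(hw, max_chunks);
  if (num_threads == 1) {
    f(begin, end);  // not enough work to justify spawning threads
    return;
  }
  const int64_t chunk = (total + num_threads - 1) / num_threads;
  std::vector<std::thread> workers;
  for (int64_t start = begin; start < end; start += chunk) {
    const int64_t stop = std::min(end, start + chunk);
    workers.emplace_back([&f, start, stop] { f(start, stop); });
  }
  for (auto& w : workers) w.join();
}

// Hypothetical helper mirroring the initialization step in the diff:
// writes i at position i * stride for i in [0, n). Unused slots stay -1.
std::vector<int64_t> iota_strided(int64_t n, int64_t stride, int64_t grain_size) {
  const std::size_t size = n > 0 ? static_cast<std::size_t>((n - 1) * stride + 1) : 0;
  std::vector<int64_t> buf(size, -1);
  int64_t* data = buf.data();
  parallel_for_sketch(0, n, grain_size,
                      [data, stride](int64_t p_begin, int64_t p_end) {
    for (int64_t i = p_begin; i < p_end; i++)
      data[i * stride] = i;  // disjoint writes, so no synchronization is needed
  });
  return buf;
}
```

Calling `iota_strided(1000, 2, 64)` fills every second slot with its logical index. Note that only this fill step is parallelized in the commit; the shuffle loop that follows it is unchanged and still runs serially.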
