[tensorexpr] Add support for aten::stack #73801
Conversation
CI failures summary: as of commit 7819277, 1 new failure was recognized by patterns; it does not appear to be due to upstream breakages (more details on the Dr. CI page).
@huiguoo has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
eellison left a comment:
Also, we need shape analysis support for this for dynamic shapes.
      size_t min_group_size = 2,
      bool add_composed_op = false,
-     bool fuse_to_dynamic_shapes = false);
+     bool fuse_to_dynamic_shapes = true);
I don't think we should be changing the default as part of this PR.
Oh, sorry, it was a draft PR. Just updated it, and now it's ready for review!
In this PR, aten::stack is not enabled for fusion yet. Will enable the fusion (with and without dynamic shapes) in separate PRs.
Differential Revision: [D34647822](https://our.internmc.facebook.com/intern/diff/D34647822) [ghstack-poisoned]
static bool checkStackInputShape(const std::vector<BufHandle>& bufList) {
  if (bufList.size() == 0) {
    throw std::runtime_error("Empty input list is passed to aten::stack");
Nit: let's be consistent in handling error cases. Two lines below we use TORCH_INTERNAL_ASSERT, and here we're throwing an error.
load = ifThenElse(
    CompareSelect::make(
Why do we generate both ifThenElse and CompareSelect?
std::vector<ExprHandle> newAxes(axes.begin(), axes.begin() + dim);
newAxes.insert(newAxes.end(), axes.begin() + dim + 1, axes.end());
This doesn't look right. The first line makes
newAxes[:] = axes[:]
and the second line then appends axes[dim+1:] to it.
That means that if we have 10-d tensors and dim=0, the stacked tensor will have rank 19. I think it should be 11-d instead; am I missing something?
Also, please update the comment to describe what new axes are created by this op and where they are inserted.
The first line makes
newAxes = axes[:dim]
Overall, there will be N-1 elements in newAxes, where N is the size of axes.
Ah, got it, that makes sense. Maybe it would be easier to grasp if we renamed newAxes to inputAxes and axes to outputAxes; currently "new" corresponds to the inputs, which is somewhat counterintuitive (even though I understand why it's done that way: we're deducing input indexes from output indexes).
test/cpp/tensorexpr/test_ops.cpp (outdated)
auto at = at::rand(dims, at::kFloat);
auto bt = at::rand(dims, at::kFloat);
The test only checks stacking two tensors with equal dimensions. We should also check other cases, e.g.:
- stacking just one tensor
- stacking N>2 tensors
- stacking tensors with different dimensions
It would be good to also have such tests in python, so that whenever opinfo covers stack we can use that too.
It might also be a good idea to have a test verifying the IR that we generate.
The test only checks stacking two tensors with equal dimensions. We should also check other cases...
That's a good point! Will add more tests.
It would be good to also have such tests in python, so that whenever opinfo covers stack we can use that too.
Not sure if I understand this. Are you suggesting testing the above cases from Python instead of C++? We don't have many unit tests for ops right now; I only know of aten::sum, aten::cat, and aten::conv. For these 3 ops, we put the unit tests in 3 files under test/cpp/tensorexpr. Maybe we should put them together in one file like test_ops.cpp.
It might be also a good idea to have a test verifying IR that we generate.
Agree. Will add it.
Are you suggesting testing the above cases from Python instead of C++?
Having C++ tests is fine, but currently the most comprehensive coverage that we have is in python:
pytorch/test/test_jit_fuser_te.py, line 2476 (at 122f864): class TestNNCOpInfo(JitCommonTestCase)
It leverages OpInfo, which is basically a collection of representative inputs for an op. What I suggested was to add a test for stack to those tests as well (if possible). If stack is not supported by OpInfo, then I think it still would make sense to write a python test manually, as most of our ops are currently tested from python rather than from C++.
Thanks for the pointer!! Added such a test for aten::stack in PR #74077.
This PR adds the lowering function for aten::stack in NNC. Differential Revision: [D34647822](https://our.internmc.facebook.com/intern/diff/D34647822) [ghstack-poisoned]
  return hasEmptyDims;
}

Tensor computeStack(
Can we add an implementation that lowers into multiple loops, like in the case of aten::cat, as done in computeCatWoConditionals?
The conditionals-based lowering does not give good perf on CPUs, mainly because we can't vectorize it. I think the conditionals-based lowering still makes sense for CUDA.
The conditionals based lowering does not give good perf on CPUs, mainly because we can't vectorize it.
aten::stack only uses CompareSelect conditionals, which can be vectorized. I'll see if the other implementation has better performance, and will add it if it does.
Please use some representative shapes from the models you have been working on for the perf analysis.
I still believe the multiple-loops approach would be faster than conditionals because we will have both spatial and temporal cache locality for all inputs. It would be good to confirm with a perf analysis.
If you want to do this as a followup, that's fine too.
Sounds great! I'll first enable aten::stack in the fusion, and then do a perf analysis on real models for both approaches.
}

int64_t dim_ = c10::get<int64_t>(argDim);
auto dim = normalizeAndCheckIndex(dim_, axes.size());
It looks like the function errors out when dim_ == axes.size(). Isn't that a correct input for stack?
The axes are for the output of aten::stack. For example,
a = [1, 2]
b = [3, 4]
output = aten::stack([a, b], dim=dim_)
Here the output is 2-D, so axes.size() equals 2, and dim_ should be a number between 0 and 1, or -1 and -2, right? It cannot be 2. We have such dim tests for stack.
Aah, I missed the point that axes is for the output of stack. Makes sense.
navahgar left a comment:
LGTM
Please ensure you address all comments from Mikhail.
ZolotukhinM left a comment:
Looks good! I think it's still a good idea to add a python test though.
# CHECK: for
# CHECK-NEXT: for
# CHECK-NEXT: for
# CHECK-NEXT: for
# CHECK-NEXT: aten_stack)IR";
This check essentially just verifies that the output is 4-d. Do we want to check the actual IR?
Attempted to check the stmt IR in the following way:
CHECK-NEXT: aten_stack[Ramp(5ll * (j + 2ll * i), 1ll, 4)] = (Broadcast(j, 4))==(Broadcast(1ll, 4)) ? (ty_1[Ramp(5ll * i, 1ll, 4)]) : (tx_1[Ramp(5ll * i, 1ll, 4)]))
but it failed one CI test (linux-xenial-py3.7-clang7-asan). With the configuration in that CI test, vectorization failed, so the generated stmt is different. There are two ways to fix this:
- implement a CHECK that can match patterns in a string, such as ".?.:" for stack
- save the original stmt before prepare_to_codegen in TEK for checking purposes
Option 2 does not seem good, as it introduces additional member vars in TEK purely for testing; Option 1 can be implemented in a separate PR. Landing this PR with the original checks for now.
Summary: Pull Request resolved: #73801
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D34647822
Pulled By: huiguoo
fbshipit-source-id: 3b863c71886c7c6616b16f5d3313079714c8b82a
Hey @huiguoo.
@pytorchbot revert this |
Reverting PR 73801 failed: can't revert a PR that was landed via Phabricator (D34647822).
Reverting, as it broke the ASAN Kernel.Stack test; see https://github.com/pytorch/pytorch/runs/5761874340?check_suite_focus=true
This pull request has been reverted by 3cc49984f4116be253e08ede8408021cfde30509. To re-land this change, please open another pull request, assign the same reviewers, fix the CI failures that caused the revert, and make sure that the failing CI runs on the PR by applying the proper ciflow label (e.g., ciflow/trunk).
This pull request has been reverted by 43313cb. To re-land this change, please open another pull request, assign the same reviewers, fix the CI failures that caused the revert, and make sure that the failing CI runs on the PR by applying the proper ciflow label (e.g., ciflow/trunk).
Stack from ghstack (oldest at bottom):
This PR adds the lowering function for aten::stack in NNC.
Differential Revision: D34647822