To vectorize long datatype as mask index #91076

EikanWang · 2022-12-18T14:36:35Z

Stack from ghstack (oldest at bottom):

In this PR, we record the current fx node being executed so that we can cache additional information and simplify the vectorization checker. In addition, this PR supports `masked` by simplifying it to `mask_load`, which is needed to support `max_pool2d`.

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 @Guobing-Chen @chunyuan-w @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @desertfire

[ghstack-poisoned]

pytorch-bot · 2022-12-18T14:36:37Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/91076


✅ No Failures

As of commit 7ed96f1:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ghstack-source-id: 4527867 Pull Request resolved: #91076


ghstack-source-id: d020ab9 Pull Request resolved: #91076


ghstack-source-id: a78c8f4 Pull Request resolved: #91076


jgong5 · 2023-01-13T10:18:30Z

torch/_inductor/codegen/cpp.py

+
+    def is_indirect_indexing(self, index: sympy.Expr):
+        for _load_res in self.load_results:
+            # The index expression cotains a value that loads from memory


Suggested change

-            # The index expression cotains a value that loads from memory
+            # The index expression contains a value that loads from memory


jgong5 · 2023-01-13T12:54:18Z

torch/_inductor/codegen/cpp.py

+        ]
+        self.store_supported_dtypes: list[torch.dtype] = [torch.float, torch.float32]
+        # Cache the dtypes of the store operation. If the store is mixing dtypes, the
+        # vectorization would not support it as it is hard to determin the vec dtype


Suggested change

-        # vectorization would not support it as it is hard to determin the vec dtype
+        # vectorization would not support it as it is hard to determine the vec dtype



jansel · 2023-01-24T03:18:15Z

torch/_inductor/codegen/cpp.py

+                f"auto {var} = at::vec::Vectorized<float>(std::numeric_limits<float>::infinity());"
+            )
+        elif isinstance(other, float):
+            code.writeline(f"auto {var} = at::vec::Vectorized<float>({other});")


Won't the next case do the same thing for a float?

Fixed by removing this logic.

jansel · 2023-01-24T03:19:37Z

torch/_inductor/codegen/cpp_prefix.h

+  return at::vec::Vectorized<float>(_mm512_cvtepi32_ps(src));
+#endif
+}
+#endif


Missing newline at end of file.

desertfire · 2023-01-24T15:41:13Z

torch/_inductor/codegen/cpp.py

+    def is_supported_cmp(self, node: torch.fx.Node):
+        def get_node_dtype(node):
+            if type(node) == torch.fx.Node:
+                return None if "dtype" not in node.meta else node.meta["dtype"]


Nit: you can use `node.meta.get("dtype", None)`.
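To illustrate the nit, here is a minimal sketch showing that the two spellings return the same value. `FakeNode` is a hypothetical stand-in for `torch.fx.Node` (only its `meta` dict matters here), and the two helper names are made up for this example.

```python
class FakeNode:
    """Hypothetical stand-in for torch.fx.Node; only the `meta` dict is modeled."""

    def __init__(self, meta=None):
        self.meta = dict(meta or {})


def get_dtype_verbose(node):
    # The form used in the diff: explicit membership test, then lookup.
    return None if "dtype" not in node.meta else node.meta["dtype"]


def get_dtype_concise(node):
    # The suggested form: dict.get returns the default when the key is absent.
    return node.meta.get("dtype", None)


with_dtype = FakeNode({"dtype": "float32"})
without_dtype = FakeNode()
```

Since `None` is already the default of `dict.get`, `node.meta.get("dtype")` would behave the same way; passing it explicitly just makes the fallback visible.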

desertfire · 2023-01-24T15:42:15Z

torch/_inductor/codegen/cpp.py

+
+        left_dtype, right_dtype = get_cmp_dtypes(node)
+        if left_dtype is None or right_dtype is None:
+            # TODO(Eikan): Should be conservative?


Shouldn't we start with being conservative and leave the aggressive option as TODO?

Recording, deducing, and propagating the data type of every expression is a piece that is still missing, so we cannot get the real data type of the left or the right expression. I refined the comment. I'm also working on the data type analysis.
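For readers following the thread, this is roughly what the "conservative" option in the question could look like. It is only a sketch under stated assumptions, not the code in this PR: dtypes are modeled as plain strings, and both `SUPPORTED_CMP_DTYPES` and `is_supported_cmp_conservative` are hypothetical names.

```python
from typing import Optional

# Hypothetical set of dtypes a vectorized comparison can handle (an assumption).
SUPPORTED_CMP_DTYPES = {"float32"}


def is_supported_cmp_conservative(
    left_dtype: Optional[str], right_dtype: Optional[str]
) -> bool:
    # Conservative policy: if either operand's dtype is unknown, do not vectorize.
    if left_dtype is None or right_dtype is None:
        return False
    # Mixed dtypes are rejected as well in this sketch.
    if left_dtype != right_dtype:
        return False
    return left_dtype in SUPPORTED_CMP_DTYPES
```

The trade-off is that every comparison whose operand dtypes cannot yet be deduced falls back to the non-vectorized path, which is why a full data type analysis matters.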


EikanWang · 2023-01-29T05:02:32Z

@jansel @desertfire, may I know if I have addressed your comments?

desertfire

Looks good to me, but please address Horace's and my comments on why test/functorch/test_aotdispatch.py is relevant for this PR before you merge.


ghstack-source-id: 046e850 Pull Request resolved: #91076


EikanWang · 2023-02-05T00:30:36Z

@pytorchbot merge

pytorchmergebot · 2023-02-05T00:34:39Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

To vectorize long datatype as mask index

a655b8c

[ghstack-poisoned]

EikanWang mentioned this pull request Dec 18, 2022

Vectorize exmp1 and log1p #91074

Closed

github-actions bot added ciflow/inductor module: inductor labels Dec 18, 2022

EikanWang added a commit that referenced this pull request Dec 18, 2022

To vectorize long datatype as mask index

90d0710

ghstack-source-id: 4527867 Pull Request resolved: #91076

EikanWang marked this pull request as draft December 18, 2022 14:37

EikanWang added the topic: not user facing label Dec 18, 2022

pytorchbot added the open source label Dec 18, 2022

Update on "To vectorize long datatype as mask index"

9a660a5


Update on "To vectorize long datatype as mask index"

fbd02b2


Update on "To vectorize long datatype as mask index"

8d18982


EikanWang added a commit that referenced this pull request Dec 30, 2022

To vectorize long datatype as mask index

a7bf8ac

ghstack-source-id: d020ab9 Pull Request resolved: #91076

Update on "To vectorize long datatype as mask index"

3f40c4e


Update on "To vectorize long datatype as mask index"

82dacee


Update on "To vectorize long datatype as mask index"

2b908b6


Update on "To vectorize long datatype as mask index"

956a3fd


EikanWang added a commit that referenced this pull request Jan 3, 2023

To vectorize long datatype as mask index

2d7ba5c

ghstack-source-id: a78c8f4 Pull Request resolved: #91076

EikanWang marked this pull request as ready for review January 4, 2023 05:25

EikanWang requested review from Chillee and ezyang as code owners January 4, 2023 05:25

EikanWang requested review from desertfire, jansel and jgong5 January 4, 2023 05:25

Update on "To vectorize long datatype as mask index"

bd856a1


jgong5 reviewed Jan 13, 2023


EikanWang added 4 commits January 15, 2023 06:06

EikanWang requested a review from jgong5 January 16, 2023 14:06

jgong5 approved these changes Jan 17, 2023


jansel reviewed Jan 24, 2023


desertfire reviewed Jan 24, 2023


EikanWang added 3 commits January 24, 2023 16:34

EikanWang requested review from desertfire and jansel January 24, 2023 16:47

desertfire approved these changes Jan 31, 2023


jansel approved these changes Feb 2, 2023


EikanWang added a commit that referenced this pull request Feb 2, 2023

To vectorize long datatype as mask index

ca35c62

ghstack-source-id: 046e850 Pull Request resolved: #91076

EikanWang added 2 commits February 2, 2023 08:33

pytorch-bot bot added the ciflow/trunk label Feb 5, 2023

pytorchmergebot added the Merged label Feb 5, 2023

pytorchmergebot closed this in 9895c19 Feb 5, 2023

facebook-github-bot deleted the gh/EikanWang/19/head branch June 8, 2023 14:30


8 participants
