Skip to content

Conversation

@EikanWang
Copy link
Collaborator

@EikanWang EikanWang commented Dec 18, 2022

Stack from ghstack (oldest at bottom):

In this PR, we record the current fx node being executed to cache additional information to simply the vectorization checker. In addition, we supported masked in this PR by simplifying it as mask_load to support max_pool2d.

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 @Guobing-Chen @chunyuan-w @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @desertfire

@pytorch-bot
Copy link

pytorch-bot bot commented Dec 18, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/91076

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 7ed96f1:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

EikanWang added a commit that referenced this pull request Dec 18, 2022
ghstack-source-id: 4527867
Pull Request resolved: #91076
@EikanWang EikanWang marked this pull request as draft December 18, 2022 14:37
@EikanWang EikanWang added the topic: not user facing topic category label Dec 18, 2022
cc mlazos soumith voznesenskym yanboliang penguinwu anijain2305 jgong5 Guobing-Chen chunyuan-w XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
cc mlazos soumith voznesenskym yanboliang penguinwu anijain2305 jgong5 Guobing-Chen chunyuan-w XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
cc mlazos soumith voznesenskym yanboliang penguinwu anijain2305 jgong5 Guobing-Chen chunyuan-w XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
EikanWang added a commit that referenced this pull request Dec 30, 2022
ghstack-source-id: d020ab9
Pull Request resolved: #91076
cc mlazos soumith voznesenskym yanboliang penguinwu anijain2305 jgong5 Guobing-Chen chunyuan-w XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
cc mlazos soumith voznesenskym yanboliang penguinwu anijain2305 jgong5 Guobing-Chen chunyuan-w XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
cc mlazos soumith voznesenskym yanboliang penguinwu anijain2305 jgong5 Guobing-Chen chunyuan-w XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
cc mlazos soumith voznesenskym yanboliang penguinwu anijain2305 jgong5 Guobing-Chen chunyuan-w XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
EikanWang added a commit that referenced this pull request Jan 3, 2023
ghstack-source-id: a78c8f4
Pull Request resolved: #91076
@EikanWang EikanWang marked this pull request as ready for review January 4, 2023 05:25
cc mlazos soumith voznesenskym yanboliang penguinwu anijain2305 jgong5 Guobing-Chen chunyuan-w XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]

def is_indirect_indexing(self, index: sympy.Expr):
for _load_res in self.load_results:
# The index expression cotains a value that loads from memory
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# The index expression cotains a value that loads from memory
# The index expression contains a value that loads from memory

]
self.store_supported_dtypes: list[torch.dtype] = [torch.float, torch.float32]
# Cache the dtypes of the store operation. If the store is mixing dtypes, the
# vectorization would not support it as it is hard to determin the vec dtype
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# vectorization would not support it as it is hard to determin the vec dtype
# vectorization would not support it as it is hard to determine the vec dtype

In this PR, we record the current fx node being executed to cache additional information to simply the vectorization checker. In addition, we supported `masked` in this PR by simplifying it as `mask_load` to support `max_pool2d`.

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 mlazos soumith voznesenskym yanboliang penguinwu anijain2305 Guobing-Chen chunyuan-w zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
In this PR, we record the current fx node being executed to cache additional information to simply the vectorization checker. In addition, we supported `masked` in this PR by simplifying it as `mask_load` to support `max_pool2d`.

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 mlazos soumith voznesenskym yanboliang penguinwu anijain2305 Guobing-Chen chunyuan-w zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
In this PR, we record the current fx node being executed to cache additional information to simply the vectorization checker. In addition, we supported `masked` in this PR by simplifying it as `mask_load` to support `max_pool2d`.

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 mlazos soumith voznesenskym yanboliang penguinwu anijain2305 Guobing-Chen chunyuan-w zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
In this PR, we record the current fx node being executed to cache additional information to simply the vectorization checker. In addition, we supported `masked` in this PR by simplifying it as `mask_load` to support `max_pool2d`.

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 mlazos soumith voznesenskym yanboliang penguinwu anijain2305 Guobing-Chen chunyuan-w zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
@EikanWang EikanWang requested a review from jgong5 January 16, 2023 14:06
f"auto {var} = at::vec::Vectorized<float>(std::numeric_limits<float>::infinity());"
)
elif isinstance(other, float):
code.writeline(f"auto {var} = at::vec::Vectorized<float>({other});")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't the next case do the same thing for a float?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by removing this logic.

return at::vec::Vectorized<float>(_mm512_cvtepi32_ps(src));
#endif
}
#endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newline end of file

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

def is_supported_cmp(self, node: torch.fx.Node):
def get_node_dtype(node):
if type(node) == torch.fx.Node:
return None if "dtype" not in node.meta else node.meta["dtype"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: you can use node.meta.get("dtype", None)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


left_dtype, right_dtype = get_cmp_dtypes(node)
if left_dtype is None or right_dtype is None:
# TODO(Eikan): Should be conservative?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we start with being conservative and leave the aggressive option as TODO?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it is a missing piece to record, deduce and propagate the data type of every expression. Hence, we could not get the real data type of the left expression or the right expression. I refined the comment. By the way, I'm working on the data type analysis.

In this PR, we record the current fx node being executed to cache additional information to simply the vectorization checker. In addition, we supported `masked` in this PR by simplifying it as `mask_load` to support `max_pool2d`.

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 mlazos soumith voznesenskym yanboliang penguinwu anijain2305 Guobing-Chen chunyuan-w zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
In this PR, we record the current fx node being executed to cache additional information to simply the vectorization checker. In addition, we supported `masked` in this PR by simplifying it as `mask_load` to support `max_pool2d`.

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 mlazos soumith voznesenskym yanboliang penguinwu anijain2305 Guobing-Chen chunyuan-w zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
In this PR, we record the current fx node being executed to cache additional information to simply the vectorization checker. In addition, we supported `masked` in this PR by simplifying it as `mask_load` to support `max_pool2d`.

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 mlazos soumith voznesenskym yanboliang penguinwu anijain2305 Guobing-Chen chunyuan-w zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
@EikanWang
Copy link
Collaborator Author

@jansel @desertfire , may I know if I have addressed your comments?

Copy link
Contributor

@desertfire desertfire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, but please address Horace's and my comments to why test/functorch/test_aotdispatch.py is relevant for this PR before your merge.

In this PR, we record the current fx node being executed to cache additional information to simply the vectorization checker. In addition, we supported `masked` in this PR by simplifying it as `mask_load` to support `max_pool2d`.

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 mlazos soumith voznesenskym yanboliang penguinwu anijain2305 Guobing-Chen chunyuan-w zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
In this PR, we record the current fx node being executed to cache additional information to simply the vectorization checker. In addition, we supported `masked` in this PR by simplifying it as `mask_load` to support `max_pool2d`.

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 mlazos soumith voznesenskym yanboliang penguinwu anijain2305 Guobing-Chen chunyuan-w zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
EikanWang added a commit that referenced this pull request Feb 2, 2023
ghstack-source-id: 046e850
Pull Request resolved: #91076
In this PR, we record the current fx node being executed to cache additional information to simply the vectorization checker. In addition, we supported `masked` in this PR by simplifying it as `mask_load` to support `max_pool2d`.

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 mlazos soumith voznesenskym yanboliang penguinwu anijain2305 Guobing-Chen chunyuan-w zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
In this PR, we record the current fx node being executed to cache additional information to simply the vectorization checker. In addition, we supported `masked` in this PR by simplifying it as `mask_load` to support `max_pool2d`.

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 mlazos soumith voznesenskym yanboliang penguinwu anijain2305 Guobing-Chen chunyuan-w zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
@EikanWang
Copy link
Collaborator Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Feb 5, 2023
@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@facebook-github-bot facebook-github-bot deleted the gh/EikanWang/19/head branch June 8, 2023 14:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/inductor ciflow/trunk Trigger trunk jobs on your pull request intel This tag is for PR from Intel Merged module: inductor open source topic: not user facing topic category

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

8 participants