[MPS] Compute `offset2bag/bag_size/max_indices` in `_embedding_bag` by kurtamohler · Pull Request #163281 · pytorch/pytorch

kurtamohler · 2025-09-18T19:36:58Z

Stack from ghstack (oldest at bottom):

-> [MPS] Compute offset2bag/bag_size/max_indices in _embedding_bag #163281

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben

[ghstack-poisoned]

pytorch-bot · 2025-09-18T19:37:02Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/163281

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 703a149 with merge base f8f230a:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Part of #162270 ghstack-source-id: 4196466 Pull-Request: #163281

[ghstack-poisoned]

Part of #162270 ghstack-source-id: 79a39ce Pull-Request: #163281

malfet · 2025-09-18T19:51:42Z

aten/src/ATen/native/mps/kernels/EmbeddingBag.metal

+template <typename T>
+struct ReductionOp<EmbeddingBagMode::MAX, T> {
+  inline opmath_t<T> operator()(opmath_t<T> weight_val, opmath_t<T> out_val) {
+    return max(weight_val, out_val);


Q: Which max are you using there? The one from metal:: or the one from c10::metal::?
I don't know if embedding bag is supposed to care about NaN, but if it is, make sure to use the c10::metal:: wrapper, as the regular one will not handle NaN correctly.

Good catch. Looks like the only way to make it match the CPU impl is to use metal::max and also initialize out_val with nan. I'll make those changes and add some nans to the test

Ok I've updated it to have the same nan behavior as the CPU impl. It doesn't use max at all, and instead uses comparison. Let me know what you think

By the way, I don't know which one is faster: ternary or max

malfet · 2025-09-18T19:53:42Z

aten/src/ATen/native/mps/kernels/EmbeddingBag.metal

+      thread I& max_idx,
+      I weight_idx,
+      bool pad) {
+    max_idx = (pad || new_out_val == out_val) ? max_idx : weight_idx;


I'm not sure if the compiler is smart enough, but wouldn't it be better to do something like

Suggested change

-    max_idx = (pad || new_out_val == out_val) ? max_idx : weight_idx;
+    if (!pad && new_out_val != out_val) {
+      max_idx = weight_idx;
+    }

I'm not sure if the compiler is smart enough to avoid the thread divergence either, but the Metal documentation does recommend avoiding if statements that could potentially cause divergence: link

Apparently, Xcode has a thread divergence counter in the profiler, so it would be possible to check this. But I don't have access to a graphical interface on the machine I'm using

Let me experiment locally and I'll let you know
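For reference, the two forms discussed above select the same index for every input, including NaN, because `!=` is the exact negation of `==` under IEEE comparison rules. The following is a minimal Python sketch of that check. The helper names `update_ternary` and `update_branch` and the sample values are hypothetical and not part of the PR. It says nothing about thread divergence or speed on the GPU, only about the result.

```python
import itertools
import math


def update_ternary(max_idx, weight_idx, new_out_val, out_val, pad):
    # Mirrors: max_idx = (pad || new_out_val == out_val) ? max_idx : weight_idx;
    return max_idx if (pad or new_out_val == out_val) else weight_idx


def update_branch(max_idx, weight_idx, new_out_val, out_val, pad):
    # Mirrors: if (!pad && new_out_val != out_val) { max_idx = weight_idx; }
    if not pad and new_out_val != out_val:
        max_idx = weight_idx
    return max_idx


# Sample values chosen for illustration; NaN and inf cover the unusual comparisons.
samples = [-1.0, 0.0, 2.5, math.inf, math.nan]

# Collect every input combination on which the two forms disagree.
mismatches = [
    (new, old, pad)
    for new, old, pad in itertools.product(samples, samples, [False, True])
    if update_ternary(7, 3, new, old, pad) != update_branch(7, 3, new, old, pad)
]
```

`mismatches` stays empty, so the choice between the two forms is a question of generated code and performance rather than behavior.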

malfet · 2025-09-18T19:54:30Z

test/test_mps.py

        with self.assertRaisesRegex(RuntimeError, "Index to scalar can have only 1 value"):
            helper(22, 0, [])

+    # TODO: This test can be removed once the backward pass of embedding_bag is


Hmm, what stops us from running the existing forward tests from op_info and just adding embedding_bag to GRAD_FAILURES?

We're already running the torch.nn.functional.embedding_bag opinfo tests, but that function does not return offset2bag, bag_size, and max_indices. There currently is no forward mode opinfo test that checks those
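To make the three extra outputs concrete, here is a rough pure-Python sketch of what they hold for max mode, assuming 1-D `indices` with `offsets` and no padding index. The function `embedding_bag_max_reference` is a hypothetical reference for illustration, not the MPS kernel or the CPU implementation. Leaving `max_indices` at 0 for an empty bag is an assumption of this sketch.

```python
def embedding_bag_max_reference(weight, indices, offsets):
    """Illustrative max-mode reference over plain Python lists."""
    num_bags = len(offsets)
    dim = len(weight[0])
    # Bag b covers positions offsets[b] .. ends[b] - 1 of `indices`.
    ends = list(offsets[1:]) + [len(indices)]

    output = [[0.0] * dim for _ in range(num_bags)]
    offset2bag = [0] * len(indices)      # bag id for each position in `indices`
    bag_size = [0] * num_bags            # number of indices in each bag
    # Assumption: entries stay 0 when a bag is empty.
    max_indices = [[0] * dim for _ in range(num_bags)]

    for bag, (start, end) in enumerate(zip(offsets, ends)):
        bag_size[bag] = end - start
        for pos in range(start, end):
            offset2bag[pos] = bag
            row = indices[pos]
            for d in range(dim):
                is_first = pos == start
                if is_first or weight[row][d] > output[bag][d]:
                    output[bag][d] = weight[row][d]
                    max_indices[bag][d] = row  # weight row that supplied the max
    return output, offset2bag, bag_size, max_indices


weight = [[1.0, 5.0], [3.0, 2.0], [0.0, 9.0]]
output, offset2bag, bag_size, max_indices = embedding_bag_max_reference(
    weight, indices=[0, 1, 2, 1], offsets=[0, 2]
)
```

With two bags of two indices each, `offset2bag` is `[0, 0, 1, 1]`, `bag_size` is `[2, 2]`, and `max_indices` is `[[1, 0], [1, 2]]`. None of these are visible through `torch.nn.functional.embedding_bag`, which is why a separate test is needed.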

[ghstack-poisoned]

Part of #162270 ghstack-source-id: 2e5ca7f Pull-Request: #163281

[ghstack-poisoned]

Part of #162270 ghstack-source-id: 7d71a4a Pull-Request: #163281

malfet · 2025-09-19T00:21:35Z

aten/src/ATen/native/mps/kernels/EmbeddingBag.metal

+      opmath_t<T> weight_val,
+      opmath_t<T> out_val,
+      bool is_first) {
+    return (is_first || weight_val > out_val) ? weight_val : out_val;


This also doesn't preserve NaN, but I guess it's the same behavior as the CPU
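The NaN behavior of the comparison-based reduction above can be illustrated with a small Python sketch, assuming ordinary IEEE float comparisons. `bag_max` is a hypothetical helper that applies the same `(is_first || weight_val > out_val)` rule to one bag of scalars. It is not code from the PR.

```python
import math


def bag_max(values):
    """Reduce one bag with the rule (is_first || w > out) ? w : out."""
    out = 0.0  # placeholder; always overwritten by the first element
    for i, w in enumerate(values):
        is_first = i == 0
        out = w if (is_first or w > out) else out
    return out


# A NaN in the first slot seeds the accumulator, and no later value can
# replace it because `w > NaN` is always false.
first_nan = bag_max([math.nan, 1.0, 2.0])

# A NaN in a later slot never wins the `>` comparison, so it is dropped.
later_nan = bag_max([1.0, math.nan, 2.0])
```

Here `first_nan` is NaN while `later_nan` is `2.0`, so whether a NaN survives depends on its position in the bag.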

[ghstack-poisoned]

Part of #162270 ghstack-source-id: 729bb5e Pull-Request: #163281

kurtamohler · 2025-09-23T19:38:46Z

@pytorchbot merge

pytorchmergebot · 2025-09-23T19:40:42Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

Update

6395663

[ghstack-poisoned]

kurtamohler requested review from kulinseth and malfet as code owners September 18, 2025 19:36

kurtamohler added a commit that referenced this pull request Sep 18, 2025

[MPS] Compute offset2bag/bag_size/max_indices in _embedding_bag

7d93745

Part of #162270 ghstack-source-id: 4196466 Pull-Request: #163281

pytorch-bot bot added the ciflow/mps and release notes: mps labels Sep 18, 2025

Update

36de91d

[ghstack-poisoned]

kurtamohler added a commit that referenced this pull request Sep 18, 2025

[MPS] Compute offset2bag/bag_size/max_indices in _embedding_bag

0c0453a

Part of #162270 ghstack-source-id: 79a39ce Pull-Request: #163281

pytorchbot added the open source label Sep 18, 2025

malfet reviewed Sep 18, 2025


Update

1ad3a85

[ghstack-poisoned]

kurtamohler added a commit that referenced this pull request Sep 19, 2025

[MPS] Compute offset2bag/bag_size/max_indices in _embedding_bag

0bfcac1

Part of #162270 ghstack-source-id: 2e5ca7f Pull-Request: #163281

pytorch-bot bot added ciflow/inductor module: inductor labels Sep 19, 2025

Update

2cffc13

[ghstack-poisoned]

kurtamohler added a commit that referenced this pull request Sep 19, 2025

[MPS] Compute offset2bag/bag_size/max_indices in _embedding_bag

e5281b0

Part of #162270 ghstack-source-id: 7d71a4a Pull-Request: #163281

malfet approved these changes Sep 19, 2025


malfet reviewed Sep 19, 2025


Update

703a149

[ghstack-poisoned]

kurtamohler added a commit that referenced this pull request Sep 19, 2025

[MPS] Compute offset2bag/bag_size/max_indices in _embedding_bag

aae43c9

Part of #162270 ghstack-source-id: 729bb5e Pull-Request: #163281

pytorch-bot bot added the ciflow/trunk label Sep 23, 2025

pytorchmergebot added the merging label Sep 23, 2025

pytorchmergebot added the Merged label Sep 23, 2025

pytorchmergebot closed this in 2014908 Sep 23, 2025

pytorchmergebot removed the merging label Sep 23, 2025

dsashidh pushed a commit to dsashidh/pytorch that referenced this pull request Sep 26, 2025

[MPS] Compute offset2bag/bag_size/max_indices in _embedding_bag (p…

e4dcb5d

…ytorch#163281) Part of pytorch#162270 Pull Request resolved: pytorch#163281 Approved by: https://github.com/malfet

jainapurva pushed a commit that referenced this pull request Sep 29, 2025

[MPS] Compute offset2bag/bag_size/max_indices in _embedding_bag (#…

3537934

…163281) Part of #162270 Pull Request resolved: #163281 Approved by: https://github.com/malfet

github-actions bot deleted the gh/kurtamohler/52/head branch October 24, 2025 02:08
