support batch size=0 for flash attention #166318
Conversation
🔗 Helpful links: see artifacts and rendered test results at hud.pytorch.org/pr/166318
Note: links to docs will display an error until the docs builds have completed. ✅ No failures as of commit e3e06d9 with merge base 7ce723d. (This comment was automatically generated by Dr. CI and updates every 15 minutes.)
Review thread on aten/src/ATen/native/transformers/cuda/flash_attn/flash_api.cpp (outdated, resolved):
    const int seqlen_k = k.size(1);
    const int num_heads_k = k.size(2);

    if (batch_size == 0) {
@soulitzer what are the semantics for outputs/grads when the batch is empty: should they be zeros or empty tensors?
hmm probably not a huge difference when tensors are zero-numel
I'll change it to empty_like, since filling with zeros afterwards isn't necessary.
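For reference, a minimal Python illustration (not part of the PR) of why the choice between empty_like and zeros_like is immaterial for zero-numel tensors:

```python
import torch

# Query-shaped tensor with batch_size=0; the other dims are arbitrary examples.
q = torch.empty(0, 128, 8, 64)

# With zero elements there is nothing to fill, so empty_like and zeros_like
# yield tensors with identical shape, dtype, and (trivially) contents.
a = torch.empty_like(q)
b = torch.zeros_like(q)

print(a.shape)            # torch.Size([0, 128, 8, 64])
print(a.numel())          # 0
print(torch.equal(a, b))  # True -- there are no elements to differ
```

The only practical difference is that empty_like skips the (here zero-cost) fill, which is why the thread settles on it.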
Force-pushed from a94a66f to 5ec1bd0.
@liangel-02 has imported this pull request. If you are a Meta employee, you can view this in D85592445.
Force-pushed from 5ec1bd0 to e3e06d9.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Fixes #165944
Summary
Today, if we attempt to run flash attention with batch_size 0, we get the error "RuntimeError: batch size must be positive". This PR fixes this by returning early with empty tensors in the forward and backward.
Test plan
python test/test_transformers.py -k test_scaled_dot_product_attention
- Added a case for batch_size=0.
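A hedged sketch of the kind of check such a test case could perform. The shapes, dtype, and backend-forcing context manager here are assumptions for illustration, and running it requires a CUDA device with flash attention support:

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

device = "cuda"
# batch_size=0 inputs in (batch, num_heads, seq_len, head_dim) layout.
q = torch.randn(0, 8, 128, 64, device=device, dtype=torch.float16, requires_grad=True)
k = torch.randn(0, 8, 128, 64, device=device, dtype=torch.float16, requires_grad=True)
v = torch.randn(0, 8, 128, 64, device=device, dtype=torch.float16, requires_grad=True)

# Force the flash-attention backend so the fixed kernel path is exercised.
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = F.scaled_dot_product_attention(q, k, v)

# With this PR, the forward returns an empty output instead of raising
# "batch size must be positive", and the backward produces empty grads.
assert out.shape == (0, 8, 128, 64)
out.sum().backward()
assert q.grad is not None and q.grad.shape == q.shape
```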