Pull requests: Dao-AILab/flash-attention
[ROCM] Add support with Infinity Cache (LLC) awareness for performance improvement - [PR#2147 rebased on PR#2178]
#2217 opened Jan 29, 2026 by tianwyan

[Cute, Flex, Fwd, Sm100] Allow vectorized score_mod definitions
#2215 opened Jan 28, 2026 by reubenconducts

Add shift scheduler for deterministic full-mask FA3 bwd on Hopper (sm90)
#2207 opened Jan 23, 2026 by tie-pilot-qxw

[Cute, SM100, BWD] Refactor get_n_block_max_for_m_block into a method of BlockInfo
#2203 opened Jan 23, 2026 by henrylhtsang

Fix compute_block_sparsity import in benchmark_mask_mod
#2190 opened Jan 17, 2026 by blueberrycongee

[Cute,Fwd,Sm100] support irregular qhead / kvhead ratios
#2186 opened Jan 16, 2026 by timmy-feng (Draft)

Update mha_fwd.cpp, Normalize the commented-out parameters
#2160 opened Jan 9, 2026 by breakfei

Add FLASH_ATTENTION_FORCE_NON_STABLE_API option to allow building on NVidia Pytorch 25.09 image
#2140 opened Jan 5, 2026 by jp-gr

[ROCM] Fix AMD Triton backend crash when dropout != 0 and return_attn_probs = False
#2111 opened Dec 30, 2025 by Logiquo