What's Changed
- fix sage_fp16_triton repeat by @ITerydh in #146
- bump to 0.6.3.post1 by @feifeibear in #147
- fix: remove redundant code to avoid error in specific env by @ZDJeffrey in #150
- fix: flashattention3 call by @yuyu5333 in #151
- [feature] adapt for Moore Threads GPUs by @houchen-li in #152
- [bugfix] correct the operator alias in attention.py by @houchen-li in #153
- Fix backward gradient count mismatch by @MartinPernus in #156
- Add AttnType.AITER by @kTorp in #159 (see the usage sketch after this list)
- Adapted for Huawei Ascend NPU by @endymion-ni in #162
- Adapted Ascend NPU for LongContextAttention/UlyssesAttention/RingAttention, with calculation based on torch_npu.npu_fusion_attention_v2 and npu_fusion_attention_grad_v2 for both training and inference, by @L4-1024 in #167
- Refine AttnType.TORCH by @genghisun in #165
- bump to 0.6.4 by @feifeibear in #168
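For the new attention backends above (AttnType.AITER from #159, and AttnType.TORCH refined in #165), the sketch below shows how a backend is typically selected through yunchang's LongContextAttention. This is a minimal sketch, not the project's documented example: the exact signatures of set_seq_parallel_pg and the LongContextAttention constructor, the 2x4 parallel degrees, and the tensor shapes are assumptions chosen to illustrate the pattern.

```python
# Minimal sketch (assumed signatures; check the project README for the exact
# API): picking the AITER backend added in #159, which targets AMD's AITER
# kernels, with AttnType.TORCH (#165) as a portable fallback.
import torch
import torch.distributed as dist

from yunchang import LongContextAttention, set_seq_parallel_pg
from yunchang.kernels import AttnType

dist.init_process_group("nccl")  # launched via torchrun; "nccl" is assumed
rank, world_size = dist.get_rank(), dist.get_world_size()

# Carve the world into a Ulysses x Ring sequence-parallel grid
# (2 x 4 = 8 ranks here; the degrees are illustrative).
set_seq_parallel_pg(2, 4, rank, world_size)

# Swap AttnType.AITER for AttnType.TORCH (or AttnType.FA) to change backends.
attn = LongContextAttention(ring_impl_type="zigzag", attn_type=AttnType.AITER)

# Each rank holds its local sequence shard: (batch, local_seqlen, heads, head_dim).
q = k = v = torch.randn(1, 1024, 8, 64, dtype=torch.float16, device="cuda")
out = attn(q, k, v, causal=True)  # output has the same shape as q
```

Per #162 and #167, the same entry points are routed to torch_npu.npu_fusion_attention_v2 / npu_fusion_attention_grad_v2 on Ascend NPUs, so backend selection follows the same pattern there.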
New Contributors
- @ITerydh made their first contribution in #146
- @yuyu5333 made their first contribution in #151
- @houchen-li made their first contribution in #152
- @MartinPernus made their first contribution in #156
- @kTorp made their first contribution in #159
- @endymion-ni made their first contribution in #162
- @L4-1024 made their first contribution in #167
- @genghisun made their first contribution in #165
Full Changelog: 0.6.3...0.6.4