bf16 support for fake_quantize_learnable_per_channel_affine #165098
Closed
liangel-02 wants to merge 1 commit into main
Conversation
🔗 Helpful Links 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/165098
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 9d89081 with merge base 34ac9b6. This comment was automatically generated by Dr. CI and updates every 15 minutes.
@liangel-02 has imported this pull request. If you are a Meta employee, you can view this in D84286904.
jerryzh168 reviewed Oct 9, 2025
liangel-02 force-pushed from 98bdb26 to 366d198 (Compare)
jerryzh168 approved these changes Oct 9, 2025
andrewor14 approved these changes Oct 9, 2025
andrewor14 (Contributor) left a comment:
Thanks, please add a TODO somewhere (PR description is fine) for fixing the per_tensor version
liangel-02 force-pushed from 366d198 to 9d89081 (Compare)
liangel-02 (Author, Contributor):
@pytorchbot merge
Collaborator:
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
liangel-02 added a commit that referenced this pull request on Oct 13, 2025:
Follow up to #165098, adding bf16 support for the backward pass. To avoid BC-breaking changes / losing precision, we upcast the parameters to fp32 after the op gets called, and downcast the gradients to bf16 before returning. For testing, we upcast to fp32 before calling the reference function. [ghstack-poisoned]
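As a rough Python illustration of the dtype handling described in the commit message above, here is a minimal sketch. The PR itself does this in the C++ op / autograd code; the wrapper name and the choice to also cast `zero_point` are assumptions made for the example.

```python
import torch

def learnable_fake_quant_per_channel_bf16(x, scale, zero_point, axis,
                                          quant_min, quant_max, grad_factor=1.0):
    # Hypothetical wrapper: compute in fp32, return the result in the caller's
    # bf16 dtype. Because the .to(torch.float32) casts are differentiable,
    # gradients flowing back to bf16 leaves are downcast automatically,
    # mirroring the "let autograd/optimizer handle it" approach above.
    orig_dtype = x.dtype
    out = torch._fake_quantize_learnable_per_channel_affine(
        x.to(torch.float32), scale.to(torch.float32), zero_point.to(torch.float32),
        axis, quant_min, quant_max, grad_factor,
    )
    return out.to(orig_dtype)
```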
liangel-02 added a commit that referenced this pull request on Oct 13, 2025:
Follow up to #165098, adding bf16 support for the backward pass. To avoid BC-breaking changes / losing precision, we upcast the parameters to fp32 after the op gets called, and downcast the gradients to bf16 before returning. For testing, we upcast to fp32 before calling the reference function. We increase the tolerance to 1e-2 for bf16 inputs because of a difference in casting calculations between Python's `x.to(torch.bfloat16)` and C++'s `x.to(at::kBFloat16)` (after comparing intermediate tensors, we found that the numerics diverge after the final casting). We don't explicitly cast in the C++ op but rather let autograd/optimizer handle it. [ghstack-poisoned]
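A test-style sketch of the comparison described above, assuming the relaxed check accepts bf16 `x`/`scale` with an fp32 `zero_point`; the reference function here is a simplified stand-in, not the actual one in `test_workflow_ops.py`.

```python
import torch

def _ref_fake_quant_per_channel(x, scale, zero_point, axis, quant_min, quant_max):
    # Simplified fp32 reference for per-channel fake quantization.
    shape = [1] * x.dim()
    shape[axis] = -1
    s = scale.reshape(shape)
    zp = zero_point.reshape(shape)
    q = torch.clamp(torch.round(x / s + zp), quant_min, quant_max)
    return (q - zp) * s

x = torch.randn(4, 8, dtype=torch.bfloat16)
scale = torch.rand(4, dtype=torch.bfloat16) + 0.1
zero_point = torch.zeros(4)  # assumed to stay fp32

out = torch._fake_quantize_learnable_per_channel_affine(
    x, scale, zero_point, 0, -128, 127, 1.0)

# Upcast to fp32 before calling the reference, and compare with the relaxed
# 1e-2 tolerance for bf16 inputs, as described in the commit message above.
ref = _ref_fake_quant_per_channel(x.float(), scale.float(), zero_point, 0, -128, 127)
torch.testing.assert_close(out.float(), ref, atol=1e-2, rtol=1e-2)
```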
pytorchmergebot pushed a commit that referenced this pull request on Oct 14, 2025:
Follow up to #165098, adding bf16 support for the backward pass. Pull Request resolved: #165325. Approved by: https://github.com/andrewor14
Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request on Oct 21, 2025:
…165098) Adding bf16 support for `torch._fake_quantize_learnable_per_channel_affine()` op by relaxing the type check on scale.
TODO: need to add bf16 support to `per_tensor_affine_` as `torch._fake_quantize_learnable_per_tensor_affine_backward` gets called in the backward pass.
**Test**: Modified unit test in `test_workflow_ops.py`.
Pull Request resolved: pytorch#165098. Approved by: https://github.com/jerryzh168, https://github.com/andrewor14
Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request on Oct 21, 2025:
Follow up to pytorch#165098, adding bf16 support for the backward pass. Pull Request resolved: pytorch#165325. Approved by: https://github.com/andrewor14
Adding bf16 support for `torch._fake_quantize_learnable_per_channel_affine()` op by relaxing the type check on scale.
TODO: need to add bf16 support to `per_tensor_affine_` as `torch._fake_quantize_learnable_per_tensor_affine_backward` gets called in the backward pass.
**Test**
Modified unit test in `test_workflow_ops.py`
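For reference, a minimal forward call along the lines of the sketch below should now be accepted, where the previous fp32-only check on `scale` would have rejected a bf16 scale. The exact dtype combinations accepted (e.g. whether `zero_point` may stay fp32) are assumptions here, not taken from the diff.

```python
import torch

# Illustrative forward call with bf16 input and bf16 learnable scale.
x = torch.randn(2, 4, 8, dtype=torch.bfloat16)
scale = torch.full((4,), 0.1, dtype=torch.bfloat16, requires_grad=True)
zero_point = torch.zeros(4, requires_grad=True)  # assumed to remain fp32

out = torch._fake_quantize_learnable_per_channel_affine(
    x, scale, zero_point, 1, -128, 127, 1.0)  # axis=1, int8 range
print(out.shape, out.dtype)
```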