Commit d90dfae
Update on "redo of add quantized layer norm implementation"
Summary:
This is a redo of #35329 with a better test.
Adds a quantized implementation of LayerNorm for server.
A future PR will add the Python wrapper.
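
As a rough illustration of what a quantized LayerNorm computes, the pure-Python sketch below dequantizes one row of uint8 values, normalizes it in float, and requantizes the result. The function names (`quantize`, `dequantize`, `quantized_layer_norm_ref`), the per-tensor affine quantization scheme, and all parameter names are assumptions made for illustration; this is a reference for the semantics, not the kernel added by this commit, and it omits the optional weight and bias.

```python
import math


def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Map a float to an integer in [qmin, qmax] (assumed affine scheme)."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map a quantized integer back to a float."""
    return (q - zero_point) * scale


def quantized_layer_norm_ref(q_in, in_scale, in_zp, out_scale, out_zp, eps=1e-5):
    """Hypothetical reference: dequantize, normalize in float, requantize."""
    x = [dequantize(q, in_scale, in_zp) for q in q_in]
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n  # biased variance, as in LayerNorm
    inv_std = 1.0 / math.sqrt(var + eps)
    return [quantize((v - mean) * inv_std, out_scale, out_zp) for v in x]


out = quantized_layer_norm_ref([10, 20, 30, 40], 0.1, 0, 0.02, 128)
```

A check that "numerics match the floating point implementation" could compare such a reference against the quantized kernel's output, allowing for a small rounding difference.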
Test Plan:
Numerics match the floating-point implementation.
Benchmarks by input size:
v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b
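
The v3 label only says that mean and variance are "vectorized in int". One plausible reading, sketched below under that assumption, is that the kernel accumulates the sum and the sum of squares of the raw quantized values as integers and converts to float statistics once at the end. The function name `mean_var_from_int_sums` and the per-tensor affine scheme are hypothetical; the actual kernel may be organized differently.

```python
def mean_var_from_int_sums(q_in, scale, zero_point):
    """Mean and biased variance of the dequantized values, computed from
    integer sums of the quantized values (illustrative assumption)."""
    n = len(q_in)
    sum_q = sum(q_in)                  # exact integer accumulation
    sum_q2 = sum(q * q for q in q_in)  # exact integer accumulation
    mean_q = sum_q / n
    # Var[q] = E[q^2] - E[q]^2; clamp tiny negatives caused by rounding.
    var_q = max(sum_q2 / n - mean_q * mean_q, 0.0)
    # Dequantization x = scale * (q - zero_point) shifts the mean and
    # scales the variance by scale^2; the zero point does not affect variance.
    mean = scale * (mean_q - zero_point)
    var = scale * scale * var_q
    return mean, var


mean, var = mean_var_from_int_sums([10, 20, 30, 40], scale=0.1, zero_point=0)
```

Accumulating in integers keeps the inner loop free of per-element dequantization, which is presumably where the speedup over the float variant would come from.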
Differential Revision: [D21030268](https://our.internmc.facebook.com/intern/diff/D21030268)
[ghstack-poisoned]
1 parent ab805d3 commit d90dfae
1 file changed, +2 -1 lines changed (original line 349 replaced by new lines 349-350)