Commit 18eb6cd
Update on "redo of add quantized layer norm implementation"
Summary:
This is a redo of #35329 with a better test.
Adds a quantized implementation of LayerNorm for server.
A future PR will add the Python wrapper.
Test Plan:
numerics match the floating point implementation
benchmarks by input size:
v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b
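
For readers unfamiliar with the idea behind the benchmark variants, the following is a minimal Python sketch of computing the mean and variance from integer sums and checking the numerics against a floating point reference. It is not the kernel added by this commit. The function names, the uint8 affine quantization scheme, and the omission of the learnable weight and bias are all assumptions made for illustration.

```python
import math

def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Affine-quantize floats to integers clamped to [qmin, qmax] (assumed uint8 range)."""
    return [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in x]

def dequantize(q, scale, zero_point):
    """Map quantized integers back to floats."""
    return [(v - zero_point) * scale for v in q]

def layer_norm_float(x, eps=1e-5):
    """Reference floating point layer norm over one row (no weight or bias)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std for v in x]

def quantized_layer_norm(q, in_scale, in_zp, out_scale, out_zp, eps=1e-5):
    """Hypothetical sketch: mean and variance derived from integer accumulators."""
    n = len(q)
    # Accumulate sum and sum of squares directly on the integer values.
    total = sum(q)
    total_sq = sum(v * v for v in q)
    # Convert the integer statistics to statistics of the dequantized values.
    q_mean = total / n
    q_var = total_sq / n - q_mean * q_mean
    mean = in_scale * (q_mean - in_zp)
    var = max(in_scale * in_scale * q_var, 0.0)
    inv_std = 1.0 / math.sqrt(var + eps)
    # Normalize each dequantized element, then requantize to the output params.
    y = [((v - in_zp) * in_scale - mean) * inv_std for v in q]
    return quantize(y, out_scale, out_zp)

# Numerics check in the spirit of the test plan: compare against the float path.
x = [0.1 * i for i in range(-8, 8)]
q_in = quantize(x, 0.1, 128)
q_out = quantized_layer_norm(q_in, 0.1, 128, 1.0 / 64, 128)
q_ref = quantize(layer_norm_float(dequantize(q_in, 0.1, 128)), 1.0 / 64, 128)
max_diff = max(abs(a - b) for a, b in zip(q_out, q_ref))
```

The check allows the two paths to differ by at most one quantization step, since rounding can fall on either side of a boundary when the intermediate floats differ slightly.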
Differential Revision: [D21030268](https://our.internmc.facebook.com/intern/diff/D21030268)
[ghstack-poisoned]
1 parent d90dfae commit 18eb6cd
1 file changed, +2 -6 lines changed
Original lines 359-363 and 367 removed; new lines 362-363 added.