Commit bce59e9
Update on "add quantized layer norm implementation"
Summary:
Adds a quantized implementation of LayerNorm for server.
Relevant PRs:
* #20345 (floating point LN)
* #33080 (quantized BN)
A future PR will add the Python wrapper.
Test Plan:
numerics match the floating point implementation
TODO: benchmarks
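
One way such a numerics check could be sketched is shown below. This is only a reference illustration in plain Python, not the kernel added by this commit: it assumes per-tensor affine quantization to 8-bit unsigned integers and uses a dequantize, normalize, requantize path. All function names (`quantize`, `dequantize`, `layer_norm`, `quantized_layer_norm`, `max_abs_error`) and all scale and zero-point values are hypothetical.

```python
import math


def quantize(xs, scale, zero_point, qmin=0, qmax=255):
    """Affine-quantize floats to integers clamped to [qmin, qmax]."""
    return [min(qmax, max(qmin, round(x / scale) + zero_point)) for x in xs]


def dequantize(qs, scale, zero_point):
    """Map quantized integers back to floats."""
    return [(q - zero_point) * scale for q in qs]


def layer_norm(xs, gamma, beta, eps=1e-5):
    """Floating point LayerNorm over a single 1-D vector."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n  # biased variance
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(x - mean) * inv_std * g + b for x, g, b in zip(xs, gamma, beta)]


def quantized_layer_norm(qx, x_scale, x_zp, gamma, beta, y_scale, y_zp, eps=1e-5):
    """Reference path (assumption): dequantize, normalize in float, requantize."""
    x = dequantize(qx, x_scale, x_zp)
    y = layer_norm(x, gamma, beta, eps)
    return quantize(y, y_scale, y_zp)


def max_abs_error(a, b):
    """Largest elementwise absolute difference between two sequences."""
    return max(abs(u - v) for u, v in zip(a, b))


if __name__ == "__main__":
    # Hypothetical input and quantization parameters, chosen for illustration.
    qx = [0, 32, 64, 96, 128, 160, 192, 255]
    gamma, beta = [1.0] * len(qx), [0.0] * len(qx)
    x_scale, x_zp, y_scale, y_zp = 0.1, 128, 0.05, 128

    qy = quantized_layer_norm(qx, x_scale, x_zp, gamma, beta, y_scale, y_zp)
    ref = layer_norm(dequantize(qx, x_scale, x_zp), gamma, beta)

    # When no output value is clamped, the error is bounded by half a step.
    err = max_abs_error(dequantize(qy, y_scale, y_zp), ref)
    print("max abs error:", err, "bound:", y_scale / 2)
```

With these parameters the output range is wide enough that nothing is clamped, so the dequantized result should differ from the floating point reference by at most half of one output quantization step.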
Reviewers:
Subscribers:
Tasks:
Tags:
[ghstack-poisoned]
1 parent 15d19f7 commit bce59e9
2 files changed, +2 -2 lines changed
- CMakeLists.txt -2
- bench/BenchUtils.cc -21
- bench/BenchUtils.h -4
- bench/EmbeddingSpMDM8BitBenchmark.cc +34 -37
- bench/EmbeddingSpMDMBenchmark.cc +3 -2
- bench/EmbeddingSpMDMNBitBenchmark.cc -1
- bench/EmbeddingSpMDMNBitRowWiseSparseBenchmark.cc -1
- bench/SpConvFP32Benchmark.cc -51
- bench/SpConvI8Benchmark.cc -63
- bench/SpMMFP32Benchmark.cc -88
- bench/SpMMI8Benchmark.cc -107
- include/fbgemm/FbgemmSpConv.h -34
- include/fbgemm/FbgemmSpMM.h -120
- include/fbgemm/Utils.h +5 -2
- src/EmbeddingSpMDM.cc +15 -10
- src/EmbeddingSpMDMNBit.cc +16 -13
- src/FbgemmFP16UKernelsAvx2.cc +6 -6
- src/FbgemmFP16UKernelsAvx512.cc +14 -14
- src/FbgemmFP16UKernelsAvx512_256.cc +8 -8
- src/FbgemmI64.cc +6 -6
- src/FbgemmSpConv.cc -369
- src/FbgemmSpMM-inl.h -471
- src/FbgemmSpMM.cc -1.1k
- src/GenerateKernelU8S8S32ACC16.cc +3 -3
- src/GenerateKernelU8S8S32ACC16Avx512.cc +5 -5
- src/GenerateKernelU8S8S32ACC32.cc +4 -4
- src/GenerateKernelU8S8S32ACC32Avx512.cc +6 -6
- src/GenerateKernelU8S8S32ACC32Avx512VNNI.cc +5 -5
- src/RefImplementations.h +2 -13
- src/RowWiseSparseAdagradFused.cc +5 -9
- src/codegen_fp16fp32.cc +1 -1
- test/SpConvFP32Test.cc -85
- test/SpConvI8Test.cc -96
- test/SpMMFP32Test.cc -149
- test/SpMMI8Test.cc -127
Submodule protobuf updated 1548 files