
Commit d90dfae

Update on "redo of add quantized layer norm implementation"
Summary:
This is a redo of #35329 with a better test. Adds a quantized implementation of LayerNorm for server. A future PR will add the Python wrapper.

Test Plan:
Numerics match the floating point implementation.

Benchmarks by input size:
- v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
- v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
- v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b

Differential Revision: [D21030268](https://our.internmc.facebook.com/intern/diff/D21030268)

[ghstack-poisoned]
1 parent ab805d3 commit d90dfae
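
The test plan states that the numerics match the floating point implementation. Below is a minimal sketch of what such a check can look like; it assumes the new kernel is reachable as `torch.ops.quantized.layer_norm` with arguments `(input, normalized_shape, weight, bias, eps, output_scale, output_zero_point)` — the op name, signature, and all values here are assumptions for illustration, and the Python wrapper is explicitly deferred to a future PR.

```python
import torch

# Quantize a random float input (illustrative scale/zero point).
X = torch.randn(4, 8, 16)
X_scale, X_zero_point = 0.1, 127
qX = torch.quantize_per_tensor(X, scale=X_scale, zero_point=X_zero_point,
                               dtype=torch.quint8)

normalized_shape = [16]
weight = torch.ones(16)
bias = torch.zeros(16)
eps = 1e-5
Y_scale, Y_zero_point = 0.2, 128

# Quantized path (hypothetical call into the kernel added by this stack).
qY = torch.ops.quantized.layer_norm(
    qX, normalized_shape, weight, bias, eps, Y_scale, Y_zero_point)

# Floating point reference, requantized with the same output parameters.
Y_ref = torch.nn.functional.layer_norm(
    qX.dequantize(), normalized_shape, weight=weight, bias=bias, eps=eps)
qY_ref = torch.quantize_per_tensor(Y_ref, scale=Y_scale,
                                   zero_point=Y_zero_point, dtype=torch.quint8)

# Allow up to one quantization step of difference for rounding.
assert torch.allclose(qY.dequantize(), qY_ref.dequantize(), atol=Y_scale)
```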

File tree

1 file changed, +2 -1 lines changed


test/quantization/test_quantized.py

Lines changed: 2 additions & 1 deletion
```diff
@@ -346,7 +346,8 @@ def test_qlayer_norm(self, shapes, torch_type, X_rand_scale, Y_scale, Y_zero_point
         X_zero_point = 127
         num_bins = 2 ** 8
         X_scale = float(max_val - min_val) / num_bins
-
+        if X_scale == 0:
+            X_scale = 1e-10

         X = torch.from_numpy(X)
         qX = torch.quantize_per_tensor(X, scale=X_scale,
```
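
The two added lines guard a degenerate case: when the generated input is constant, `max_val == min_val`, so the computed scale is 0, and `torch.quantize_per_tensor` cannot accept a zero scale. A standalone sketch of the failure mode and the fallback (the constant input and values are illustrative, not the test's actual hypothesis-generated data):

```python
import torch

# Constant input => max_val == min_val, so the derived scale collapses to 0.
X = torch.full((2, 2), 1.5)
min_val, max_val = 1.5, 1.5
num_bins = 2 ** 8
X_zero_point = 127

X_scale = float(max_val - min_val) / num_bins   # 0.0 for a constant input
if X_scale == 0:
    X_scale = 1e-10                              # fallback added by this commit

# Without the fallback, scale=0 would make this call fail.
qX = torch.quantize_per_tensor(X, scale=X_scale, zero_point=X_zero_point,
                               dtype=torch.quint8)
```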

0 commit comments
