Update on "redo of add quantized layer norm implementation"

vkuzo · vkuzo · commit d90dfae3486c · 2020-04-14T23:43:43.000-07:00
Summary:
This is a redo of #35329 with a better test. Adds a quantized implementation of LayerNorm for server. A future PR will add the Python wrapper.

Test Plan:
Numerics match the floating point implementation.

Benchmarks by input size:
- v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
- v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
- v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b

Differential Revision: [D21030268](https://our.internmc.facebook.com/intern/diff/D21030268)

[ghstack-poisoned]
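
Note on the test change in this update: the guard added to `test_qlayer_norm` handles the case where the computed `X_scale` is exactly zero. Assuming `min_val` and `max_val` are the minimum and maximum of the generated input (that part of the test is not shown in the hunk), this happens when the input is constant, and a zero scale cannot be used to quantize because quantization divides by the scale. The sketch below reproduces that logic in plain Python. `compute_input_scale` and `fallback_scale` are hypothetical names used only for illustration and are not part of the PR.

```python
def compute_input_scale(values, num_bins=2 ** 8, fallback_scale=1e-10):
    """Return a strictly positive quantization scale for `values`.

    Hypothetical helper mirroring the guard in the test: the scale is the
    input range divided by the number of bins, and a zero result (constant
    input) is replaced by a tiny positive fallback.
    """
    min_val = min(values)
    max_val = max(values)
    scale = float(max_val - min_val) / num_bins
    if scale == 0:
        # Constant input: the range is zero, so fall back to a small
        # positive scale, as the test does with 1e-10.
        scale = fallback_scale
    return scale


# A constant input would otherwise produce a scale of zero.
constant_scale = compute_input_scale([0.5, 0.5, 0.5])

# A non-constant input keeps its range-derived scale.
varied_scale = compute_input_scale([-1.0, 0.0, 1.0])
```

The fallback value only needs to be positive. `1e-10` matches the constant used in the test, and this sketch makes no claim about how the quantized kernel itself treats such inputs.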
diff --git a/test/quantization/test_quantized.py b/test/quantization/test_quantized.py
@@ -346,7 +346,8 @@ def test_qlayer_norm(self, shapes, torch_type, X_rand_scale, Y_scale, Y_zero_poi
                 X_zero_point = 127
                 num_bins = 2 ** 8
                 X_scale = float(max_val - min_val) / num_bins
-
+            if X_scale == 0:
+                X_scale = 1e-10
 
             X = torch.from_numpy(X)
             qX = torch.quantize_per_tensor(X, scale=X_scale,