
Commit d8bd7ab

Author: wxie (committed)
Update on "Pass Scalar by reference"
Summary: `Scalar` takes 32 bytes because `c10::complex<double>` requires 16-byte alignment (a size-check sketch follows this message). Passing `Scalar` by reference shows about a 1% improvement in instruction count. All the changes in this commit are codemodded except for the following 4 files, which generate the signatures:

```
tools/codegen/api/cpp.py
tools/codegen/api/native.py
tools/codegen/api/structured.py
caffe2/contrib/aten/gen_op.py
```

# Codemod

## Main Step

For the codemod part, here are the main commands used:

```
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)optional<Scalar> (\w+)' '${1}const optional<Scalar>& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)optional<Scalar> (\w+)' '${1}const optional<Scalar>& ${2}'
```

As you can tell, this codemods both `Scalar` and `optional<Scalar>`. Apply these commands iteratively until reaching a fix-point, since one method signature might contain multiple `Scalar` parameters (illustrated in a sketch after this message). In retrospect, excluding `third_party` and `torch/csrc/jit` would have been a good idea; I reverted those manually later (see #53479 as a reference).

## Pre-Step

Prior to applying the main commands: some occurrences of `Scalar` are spelled `at::Scalar` or `c10::Scalar`, so I codemodded some of them in advance. Here is an incomplete list:

```
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)at::Scalar (\w+)' '${1}const at::Scalar& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)at::Scalar (\w+)' '${1}const at::Scalar& ${2}'
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)c10::optional<Scalar> (\w+)' '${1}const c10::optional<Scalar>& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)c10::optional<Scalar> (\w+)' '${1}const c10::optional<Scalar>& ${2}'
```

## Fixup

There are a couple of post-codemod fixups. For example, `const Scalar` gets codemodded into `const const Scalar&`, and `at::Scalar` gets codemodded into `at::const Scalar&` (when the pre-step was not done comprehensively). Here is an incomplete list:

```
fastmod --extensions cpp 'const const Scalar' 'const Scalar'
fastmod --extensions h 'const const c10::optional<Scalar>' 'const c10::optional<Scalar>'
fastmod --extensions cpp 'const const c10::optional<Scalar>' 'const c10::optional<Scalar>'
fastmod 'at::const Scalar&' 'const at::Scalar&'
```

## Supplementary

`cu` and `mm` files also need to be codemodded, for example:

```
fastmod --extensions cu 'at::const Scalar&' 'const at::Scalar&'
fastmod --extensions mm '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
```

Function pointers are not covered by the main patterns, so they get their own commands. Here is an incomplete list:

```
# Cover case: using index_fill_fn = void(*)(TensorIterator & iter, int64_t dim, int64_t self_dim_size, int64_t self_dim_stride, Scalar source);
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'

# Cover case: using softplus_fn = void (*)(TensorIterator&, Scalar, Scalar);
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar([, \)])' '${1}const Scalar&${2}'
fastmod --extensions cpp '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar([, \)])' '${1}const Scalar&${2}'
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)optional<Scalar>([, \)])' '${1}const optional<Scalar>&${2}'
```

Some corner cases need to be fixed manually.

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D26904445](https://our.internmc.facebook.com/intern/diff/D26904445)

[ghstack-poisoned]
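As a sanity check on the 32-byte claim above, here is a minimal sketch, assuming the `c10` headers from this revision of the tree (the asserts simply mirror the summary's numbers and are not part of the commit):

```cpp
#include <c10/core/Scalar.h>
#include <c10/util/complex.h>

// c10::Scalar stores a tagged union whose largest member is
// c10::complex<double> (two doubles, declared with 16-byte alignment),
// so the whole struct rounds up to 32 bytes. Passing it by value copies
// all 32 bytes; passing `const Scalar&` copies a single pointer.
static_assert(alignof(c10::complex<double>) == 16,
              "complex<double> is 16-byte aligned");
static_assert(sizeof(c10::Scalar) == 32, "Scalar occupies 32 bytes");
```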
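To see why the main step must be iterated to a fix-point, consider a signature with two `Scalar` parameters (the `add` declaration below is illustrative, not taken from the tree); each fastmod pass rewrites only one of them:

```cpp
// Stand-in types so the sketch compiles on its own; in the real tree
// they come from the ATen headers.
struct Tensor {};
struct Scalar {};

// With the greedy `[^)]*` prefix, each pass converts one remaining
// `Scalar` (the rightmost un-converted one), so two passes are needed:
//
//   pass 0: Tensor add(const Tensor& self, Scalar other, Scalar alpha);
//   pass 1: Tensor add(const Tensor& self, Scalar other, const Scalar& alpha);
//   pass 2 (fix-point; no further matches):
Tensor add(const Tensor& self, const Scalar& other, const Scalar& alpha);
```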
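The Fixup step's two artifact shapes, shown as hypothetical before/after signatures (the function names are illustrative; the artifact patterns are the ones named in the message):

```cpp
// Stand-in types so the declarations compile on their own.
struct Tensor {};
struct Scalar {};
namespace at { using Scalar = ::Scalar; }

// 1. A parameter that already read `const Scalar value` picks up a second
//    qualifier, `const const Scalar& value`; the fixup collapses it:
//      before fixup: void fill_(Tensor& self, const const Scalar& value);
void fill_(Tensor& self, const Scalar& value);

// 2. A missed `at::Scalar other` becomes `at::const Scalar& other`, since
//    only the trailing `Scalar` token matched; the fixup reorders it:
//      before fixup: Tensor mul(const Tensor& self, at::const Scalar& other);
Tensor mul(const Tensor& self, const at::Scalar& other);
```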
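Finally, the function-pointer case from the Supplementary step, before and after, using the `softplus_fn` alias quoted in the message (`TensorIterator` is stubbed out so the sketch stands alone):

```cpp
// Stand-ins so the alias compiles on its own.
struct TensorIterator {};
struct Scalar {};

// Unnamed parameters never match `Scalar (\w+)`, hence the dedicated
// `void (*)` patterns above.
//   before: using softplus_fn = void (*)(TensorIterator&, Scalar, Scalar);
using softplus_fn = void (*)(TensorIterator&, const Scalar&, const Scalar&);
```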
1 parent 0591135 commit d8bd7ab

File tree

1 file changed: +2 −2 lines


aten/src/ATen/native/mkldnn/Relu.cpp

Lines changed: 2 additions & 2 deletions

```diff
@@ -15,7 +15,7 @@ Tensor& mkldnn_relu_(Tensor& input) {
   TORCH_CHECK(false, "mkldnn_relu_: ATen not compiled with MKLDNN support");
 }
 
-Tensor mkldnn_relu_backward(const Tensor& grad_output, const Tensor& input, Scalar threshold) {
+Tensor mkldnn_relu_backward(const Tensor& grad_output, const Tensor& input, const Scalar& threshold) {
   TORCH_CHECK(false, "mkldnn_relu_backward: ATen not compiled with MKLDNN support");
 }
 
@@ -54,7 +54,7 @@ Tensor& mkldnn_relu_(Tensor& input) {
   return input;
 }
 
-Tensor mkldnn_relu_backward(const Tensor& grad_output, const Tensor& input, Scalar threshold) {
+Tensor mkldnn_relu_backward(const Tensor& grad_output, const Tensor& input, const Scalar& threshold) {
   ideep::tensor& x = itensor_from_mkldnn(input);
   ideep::tensor grady = itensor_from_mkldnn(grad_output);
   ideep::tensor gradx;
```
