
Conversation

@supriyar (Contributor) commented Jul 7, 2020

Stack from ghstack:

Summary:
In eager mode there is no cast operator for the activation tensor in the fbgemm fp16 operator, so remove it from graph mode as well.
For the weight tensor, we handle saturation by clipping the values to the fp16 range.
This makes the numerics match between the debug model and the final quantized model.
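As a rough sketch of what saturating the weight means (illustrative only; the constant and helper name below are assumptions, not the code used by the pass), the weight is clamped to the representable fp16 range so that the subsequent fp32 -> fp16 conversion cannot overflow to inf:

```python
import torch

FP16_MAX = torch.finfo(torch.float16).max  # 65504.0

def clip_weight_to_fp16_range(weight: torch.Tensor) -> torch.Tensor:
    # Clamp out-of-range values so they saturate at the fp16 limits instead of
    # becoming +/-inf when the weight is later packed as fp16 for fbgemm.
    return torch.clamp(weight, -FP16_MAX, FP16_MAX)
```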

Test Plan:
python test/test_quantization.py test_linear_dynamic_fp16
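Roughly, the test compares the eager-mode and graph-mode fp16 dynamic paths on the same weights. A minimal sketch of that kind of check, assuming the quantize_dynamic / quantize_dynamic_jit APIs of this era (the toy module and shapes are made up for illustration):

```python
import copy
import torch
from torch.quantization import quantize_dynamic, float16_dynamic_qconfig
from torch.quantization.quantize_jit import quantize_dynamic_jit

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(5, 5)

    def forward(self, x):
        return self.fc(x)

m = M().eval()
x = torch.randn(2, 5)

# Eager-mode fp16 dynamic quantization as the numerics reference.
eager_q = quantize_dynamic(copy.deepcopy(m), {torch.nn.Linear}, dtype=torch.float16)

# Graph-mode fp16 dynamic quantization of the same weights.
graph_q = quantize_dynamic_jit(torch.jit.script(copy.deepcopy(m)),
                               {'': float16_dynamic_qconfig})

# With this change the two paths should produce matching results.
torch.testing.assert_allclose(eager_q(x), graph_q(x))
```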
Reviewers:

Subscribers:

Tasks:

Tags:

Summary:
In eager mode there is no cast operator for the activation tensor in the fbgemm fp16 operator, so remove it from graph mode as well. This makes the numerics match between the debug model and the final quantized model.

Test Plan:
python test/test_quantization.py test_linear_dynamic_fp16
Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
@supriyar requested a review from apaszke as a code owner July 7, 2020 00:02
@facebook-github-bot added the oncall: jit label Jul 7, 2020
supriyar added a commit that referenced this pull request Jul 7, 2020
Summary:
In eager mode there is no cast operator for the activation tensor in the fbgemm fp16 operator, so remove it from graph mode as well. This makes the numerics match between the debug model and the final quantized model.

Test Plan:
python test/test_quantization.py test_linear_dynamic_fp16
Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 1a7c6c6
Pull Request resolved: #41049
…cast op"

Summary:
In eager mode there is no cast operator for the activation tensor in the fbgemm fp16 operator, so remove it from graph mode as well. This makes the numerics match between the debug model and the final quantized model.

Test Plan:
python test/test_quantization.py test_linear_dynamic_fp16
Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
supriyar added a commit that referenced this pull request Jul 7, 2020
Summary:
In eager mode there is no cast operator for the activation tensor in the fbgemm fp16 operator, so remove it from graph mode as well. This makes the numerics match between the debug model and the final quantized model.

Test Plan:
python test/test_quantization.py test_linear_dynamic_fp16
Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: c973a37
Pull Request resolved: #41049
dr-ci bot commented Jul 7, 2020

💊 CI failures summary and remediations

As of commit 31c0a60 (more details on the Dr. CI page):


None of the CI failures appear to be your fault 💚



❄️ 7 failures tentatively classified as flaky, but reruns have not yet been triggered to confirm. All are CircleCI "Build" steps that failed because the pinned Docker image manifest was not found:

1/7 pytorch_linux_xenial_py3_clang5_asan_build
    Error response from daemon: manifest for 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3-clang5-asan:fff7795428560442086f7b2bb6004b65245dc11a not found
2/7 pytorch_xla_linux_bionic_py3_6_clang9_build
    Error response from daemon: manifest for 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-py3.6-clang9:fff7795428560442086f7b2bb6004b65245dc11a not found
3/7 pytorch_linux_xenial_py3_clang5_mobile_build
    Error response from daemon: manifest for 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3-clang5-asan:fff7795428560442086f7b2bb6004b65245dc11a not found
4/7 pytorch_linux_bionic_py3_6_clang9_build
    Error response from daemon: manifest for 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-py3.6-clang9:fff7795428560442086f7b2bb6004b65245dc11a not found
5/7 pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_build
    Error response from daemon: manifest for 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-cuda10.2-cudnn7-py3-gcc7:fff7795428560442086f7b2bb6004b65245dc11a not found
6/7 pytorch_libtorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_build
    Error response from daemon: manifest for 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-cuda10.2-cudnn7-py3-gcc7:fff7795428560442086f7b2bb6004b65245dc11a not found
7/7 pytorch_linux_xenial_py3_clang5_mobile_custom_build_static
    Error response from daemon: manifest for 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3-clang5-asan:fff7795428560442086f7b2bb6004b65245dc11a not found


…cast op"

Summary:
In eager mode there is no cast operator for the activation tensor in the fbgemm fp16 operator, so remove it from graph mode as well. This makes the numerics match between the debug model and the final quantized model.

Test Plan:
python test/test_quantization.py test_linear_dynamic_fp16
Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
@supriyar changed the title from "[quant][graphmode] Fp16 quant support - remove activation cast op" to "[quant][graphmode] Fp16 quant support - match numerics with eager mode" Jul 7, 2020
…h eager mode"


Summary:
In eager mode there is no cast operator for the activation tensor in the fbgemm fp16 operator, so remove it from graph mode as well.
For the weight tensor, we handle saturation by clipping the values to the fp16 range.
This makes the numerics match between the debug model and the final quantized model.

Test Plan:
python test/test_quantization.py test_linear_dynamic_fp16
Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
…h eager mode"


Summary:
In eager mode there is no cast operator for the activation tensor in the fbgemm fp16 operator, so remove it from graph mode as well.
For the weight tensor, we handle saturation by clipping the values to the fp16 range.
This makes the numerics match between the debug model and the final quantized model.

Test Plan:
python test/test_quantization.py test_linear_dynamic_fp16
Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
supriyar added a commit that referenced this pull request Jul 7, 2020
Summary:
In eager mode there is no cast operator for the activation tensor in the fbgemm fp16 operator, so remove it from graph mode as well.
For the weight tensor, we handle saturation by clipping the values to the fp16 range.
This makes the numerics match between the debug model and the final quantized model.

Test Plan:
python test/test_quantization.py test_linear_dynamic_fp16
Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 510d046
Pull Request resolved: #41049
Comment on lines +237 to +238
// We don't need to insert cast operators for activation tensors for fp16
// quant.
Contributor:

Can we filter this in a different place? E.g., could we just not insert an observer for activation tensors?

Contributor Author:

Is it possible for the user to not specify an activation observer in the qconfig? Does the prepare_jit pass ensure that observers aren't inserted for activation tensors in that case?

Contributor Author:

That may have issues, since for FP16 quant we don't specify the dtype anywhere in the qconfig. We set the quant type to dynamic, so there is no way to distinguish int8 dynamic quant from fp16 dynamic quant. Hence I was wondering if not specifying any activation observer (since we don't want it observed) would work here.

Contributor:

Looks like right now we are checking for the noop observer to do fp16 quantization; this sounds like a hack. Can we expose fp16 as an argument to the API?

Contributor:

I feel it makes more sense to expose this in the top-level API; why don't we do that?
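For reference, the eager-mode top-level API already exposes this as a dtype argument; a minimal sketch (the toy model is only for illustration):

```python
import torch
from torch.quantization import quantize_dynamic

model = torch.nn.Sequential(torch.nn.Linear(8, 8)).eval()

# dtype=torch.float16 selects fp16 dynamic quantization for the Linear modules;
# internally this maps them to float16_dynamic_qconfig (the placeholder/noop observer).
qmodel = quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.float16)

x = torch.randn(2, 8)
print(qmodel(x).dtype)  # torch.float32 -- activations are not cast; only weights are stored in fp16
```

The question above is essentially whether graph mode should grow an equivalent knob instead of keying off the observer type.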

@supriyar marked this pull request as draft July 7, 2020 23:46
@facebook-github-bot deleted the gh/supriyar/148/head branch August 13, 2020 14:15