Conversation

Contributor

@supriyar supriyar commented Sep 22, 2020

Stack from ghstack:

Summary:
choose_qparams_optimized calculates the optimized qparams. It uses a greedy approach that nudges the min and max values, computes the L2 norm of the quantization error via `torch.norm(x - fake_quant(x, s, z))`, and keeps the values that minimize it.
Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: D23848060
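The greedy min/max search described in the summary can be sketched in plain Python. This is a hypothetical illustration, not the actual choose_qparams_optimized implementation; the function names, the fixed step size, and the stopping rule are all assumptions:

```python
def fake_quant(xs, scale, zero_point, qmin=0, qmax=255):
    """Quantize then dequantize each value to simulate quantization error."""
    out = []
    for x in xs:
        q = round(x / scale) + zero_point
        q = max(qmin, min(qmax, q))  # clamp to the quantized range
        out.append((q - zero_point) * scale)
    return out

def l2_error(xs, lo, hi, qmin=0, qmax=255):
    """L2 norm of x - fake_quant(x, s, z) for qparams derived from [lo, hi]."""
    scale = (hi - lo) / (qmax - qmin) or 1e-8  # guard against lo == hi
    zero_point = qmin - round(lo / scale)
    dq = fake_quant(xs, scale, zero_point, qmin, qmax)
    return sum((a - b) ** 2 for a, b in zip(xs, dq)) ** 0.5

def choose_qparams_greedy(xs, steps=100):
    """Greedily nudge min up / max down while the quant error decreases."""
    lo, hi = min(xs), max(xs)
    step = (hi - lo) / steps
    best_err, best_lo, best_hi = l2_error(xs, lo, hi), lo, hi
    improved = True
    while improved:
        improved = False
        # try shrinking the range from either end; keep a move if it helps
        for nlo, nhi in ((best_lo + step, best_hi), (best_lo, best_hi - step)):
            if nlo < nhi:
                err = l2_error(xs, nlo, nhi)
                if err < best_err:
                    best_err, best_lo, best_hi = err, nlo, nhi
                    improved = True
    return best_lo, best_hi
```

By construction the search starts from the full [min, max] range and only accepts moves that lower the error, so the result is never worse than plain min/max quantization; with an outlier in the data, shrinking the range trades clipping error on the outlier for finer resolution everywhere else.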


dr-ci bot commented Sep 22, 2020

💊 CI failures summary and remediations

As of commit b91bd0d (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



'std::tuple<Tensor,Tensor,Tensor,Tensor,int64_t>',
'std::tuple<Tensor,Tensor,double,Tensor,int64_t>',
'std::tuple<double,int64_t>',
'std::tuple<double,double>',
Contributor

Is this related to your change?

Contributor Author

Yes, this adds a new combination of return types that is not currently supported.

// and packs the float weight tensor. In the next step it will be replaced by a
// quantize and pack function once we support FP scale and FP zero_point
Tensor qembeddingbag_byte_prepack(const Tensor& weight) {
Tensor qembeddingbag_byte_prepack(const Tensor& weight, bool optimized_qparams) {
Contributor

This function is not doing the optimized_qparam calculation; can you please add that?

Contributor Author

In caffe2 we currently don't have optimized_qparam calculation for byte_prepack. Do we need it in PT?
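For context, row-wise 8-bit prepacking with plain min/max qparams can be sketched as below. This is a simplified, hypothetical illustration of per-row affine quantization, not the actual qembeddingbag_byte_prepack code; the function names and the list-of-lists weight layout are assumptions:

```python
def byte_prepack_row(row):
    """Affine-quantize one embedding row to uint8 using its min/max."""
    lo, hi = min(row), max(row)
    scale = (hi - lo) / 255.0 or 1e-8  # guard against constant rows
    # clamp to [0, 255] to absorb float rounding at the range edges
    qrow = [min(255, max(0, round((x - lo) / scale))) for x in row]
    return qrow, scale, lo  # quantized bytes plus per-row scale and offset

def byte_prepack(weight):
    """Prepack a 2-D list of float rows, one (bytes, scale, offset) per row."""
    return [byte_prepack_row(row) for row in weight]
```

Each row's minimum maps to 0 and its maximum to 255; the per-row scale and offset are kept alongside the bytes so the row can be dequantized at lookup time.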

m.def("embedding_bag_prepack(Tensor weight) -> __torch__.torch.classes.quantized.EmbeddingPackedParamsBase W_prepack");
m.def("embedding_bag_unpack(__torch__.torch.classes.quantized.EmbeddingPackedParamsBase W_prepack) -> Tensor W_origin");
m.def("embedding_bag_byte_prepack(Tensor weight) -> Tensor");
m.def("embedding_bag_byte_prepack(Tensor weight, bool optimized_qparams=False) -> Tensor");
Contributor

When we enable this at the Python level, how does the user control which observer is used for embeddings? I.e., should we change the default for this op to optimized_qparams=True?

Contributor Author

We can have a separate observer that calls the aten op to calculate optimized qparams.
I'm not sure if setting this to true will have any implications on PyPer perf.

@supriyar supriyar requested a review from apaszke as a code owner September 22, 2020 23:40
supriyar added a commit that referenced this pull request Sep 23, 2020
ghstack-source-id: c8b3262
Pull Request resolved: #45149
supriyar added a commit that referenced this pull request Sep 23, 2020
ghstack-source-id: 21f5b9d
Pull Request resolved: #45149
@facebook-github-bot

This pull request has been merged in 60665ac.

@facebook-github-bot facebook-github-bot deleted the gh/supriyar/186/head branch September 27, 2020 14:15