
Conversation

@supriyar (Contributor) commented Jul 11, 2020

Stack from ghstack:

Summary:

Add new operators that quantize and pack the weights for the 8-bit and 4-bit embedding bag operators.
This is an initial change to help unblock testing. It will be followed by graph mode passes to enable quantization of the embedding_bag module.

Note to reviewers: Future PRs will replace this op with a separate quantize and pack operator and add support for floating point scale and zero point.

Test Plan:
python test/test_quantization.py TestQuantizedEmbeddingBag
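To make the summary concrete, here is a rough sketch of what rowwise 8-bit quantize-and-pack means: each float row is mapped to uint8 values with a per-row scale and zero point, which are appended to the packed row. This is an illustrative sketch only, not the actual op; the real kernel stores the scale and zero point as fp16 halves (see the review discussion below), and the function names here are made up for illustration.

```python
import struct

def rowwise_quantize_8bit(row):
    """Quantize one float row to uint8 with a per-row scale/zero_point.

    Packed layout (illustrative): [uint8 data ...][float32 scale][float32 zero_point].
    """
    lo, hi = min(row), max(row)
    scale = (hi - lo) / 255.0 or 1.0  # avoid division by zero for constant rows
    qdata = bytes(int(round((x - lo) / scale)) for x in row)
    return qdata + struct.pack("<ff", scale, lo)

def rowwise_dequantize_8bit(packed, num_cols):
    """Recover approximate floats from a packed row."""
    qdata, tail = packed[:num_cols], packed[num_cols:]
    scale, zero_point = struct.unpack("<ff", tail)
    return [q * scale + zero_point for q in qdata]
```

The round trip is lossy in general, but values that land exactly on quantization steps are recovered within float32 precision.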

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: D22506700

supriyar added a commit that referenced this pull request Jul 11, 2020
ghstack-source-id: cfbca83
Pull Request resolved: #41293
@dr-ci

dr-ci bot commented Jul 11, 2020

💊 CI failures summary and remediations

As of commit d7b399d (more details on the Dr. CI page):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-CircleCI failure(s)

ci.pytorch.org: 1 failed



supriyar added a commit that referenced this pull request Jul 13, 2020
ghstack-source-id: 782b7da
Pull Request resolved: #41293
constexpr int NUM_ELEM_PER_BYTE = 8 / BIT_RATE;
TORCH_CHECK(
    weight_contig.size(weight.dim() - 1) % NUM_ELEM_PER_BYTE == 0,
    "FloatToFused4BitRowwiseQuantizedOp only works for the number of "


I think this is a nit.

Contributor Author

You mean this check isn't required?


The error message says "FloatToFused4BitRowwiseQuantizedOp". :) It should be "qembeddingbag_4bit_prepack only works for the number of columns a multiple of 2".
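The check under discussion exists because two 4-bit values are packed into each byte (NUM_ELEM_PER_BYTE = 8 / 4 = 2), so the number of columns must be even. A small sketch of that constraint, with the error text the reviewer suggests; the nibble order and function name are assumptions for illustration, not taken from the actual kernel:

```python
BIT_RATE = 4
NUM_ELEM_PER_BYTE = 8 // BIT_RATE  # 2 elements per byte at 4 bits

def pack_4bit_row(qvals):
    """Pack a row of 4-bit values (0..15), two per byte, low nibble first.

    An odd column count would leave a half-filled byte, hence the check.
    """
    if len(qvals) % NUM_ELEM_PER_BYTE != 0:
        raise ValueError(
            "qembeddingbag_4bit_prepack only works for the number of "
            "columns a multiple of 2")
    return bytes(qvals[i] | (qvals[i + 1] << 4)
                 for i in range(0, len(qvals), NUM_ELEM_PER_BYTE))
```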

@vkuzo vkuzo left a comment


lg, feel free to ignore the comments if this implementation will be replaced in the near future

auto* output_data = output.data_ptr<uint8_t>();
const auto output_columns = output.size(output.dim() - 1);

for (int row = 0; row < embedding_rows; ++row) {
Contributor

does performance matter, or is this a reference implementation? Could probably parallelize if needed (same for the other op)

Contributor Author

This op is a temporary solution until we de-couple quantize and packing. We can re-visit optimizations then.

Contributor

makes sense

const float* input_row = weight_data + row * embedding_cols;
std::uint8_t* output_row = output_data + row * output_columns;

at::Half* output_row_scale_zp = reinterpret_cast<at::Half*>(
Contributor

optional readability nit: if this is packed at the end of a row, maybe we can move the code down to be below the weight packing, so the code structure follows the data format?
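The layout being discussed stores the quantized data bytes first, with the fp16 scale and zero point appended at the end of each row, so output_columns is wider than embedding_cols. A minimal sketch of that per-row layout, using Python's binary16 struct code for the halves; the helper names are hypothetical:

```python
import struct

HALF_BYTES = 2  # sizeof(at::Half)

def packed_row_width(embedding_cols):
    """Bytes per packed 8-bit row: one uint8 per column, plus two fp16
    values (scale, zero_point) appended at the end of the row."""
    return embedding_cols + 2 * HALF_BYTES

def pack_row(qvals, scale, zero_point):
    # Data bytes first, then fp16 scale and zero_point ('e' = binary16),
    # following the row structure described in the review thread.
    return bytes(qvals) + struct.pack("<ee", scale, zero_point)
```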

supriyar added a commit that referenced this pull request Jul 13, 2020
ghstack-source-id: 80fec1a
Pull Request resolved: #41293
supriyar added a commit that referenced this pull request Jul 15, 2020
ghstack-source-id: fe24a52
Pull Request resolved: #41293
@facebook-github-bot
Contributor

This pull request has been merged in 008ab27.
