Conversation

@kimishpatel
Contributor

@kimishpatel kimishpatel commented Jul 28, 2020

Stack from ghstack:

Summary:
Forcing a contiguous layout here would otherwise introduce unnecessary calls to
contiguous later in the network, where certain ops want channels-last format.

Test Plan:
Quantization tests.

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: D22796479

kimishpatel added a commit that referenced this pull request Jul 28, 2020
ghstack-source-id: 65f145c
Pull Request resolved: #42178
@dr-ci

dr-ci bot commented Jul 28, 2020

💊 CI failures summary and remediations

As of commit ecb3848 (more details on the Dr. CI page):


  • 3/3 failures possibly* introduced in this PR
    • 3/3 non-CircleCI failure(s)

Extra GitHub checks: 1 failed


ci.pytorch.org: 2 failed


@kimishpatel kimishpatel requested a review from jerryzh168 July 28, 2020 18:53
@jerryzh168
Contributor

looks like there are some failing tests

kimishpatel added a commit that referenced this pull request Aug 5, 2020
ghstack-source-id: 8b8d12c
Pull Request resolved: #42178
Contributor

@dreiss dreiss left a comment

I'm not sure the existing tests are sufficient to test this. At the very least, I think we need tests for quantizing channels-first, channels-last, and fully non-contiguous input tensors. Unless those are already there and I'm just missing them.

qtensor.scalar_type(), "quantize_tensor_per_tensor_affine_cpu", [&]() {
TORCH_CHECK(
rtensor.is_contiguous(), "Float tensor should be contiguous");
const float* const rdata = rtensor.data_ptr<float>();
Contributor

I don't understand this. It's usually not safe to call data_ptr on a non-contiguous tensor. Maybe the check should be that rtensor is contiguous in the memory format of qtensor? And what if qtensor is not contiguous?
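
A minimal Python sketch of the semantics being discussed here, assuming standard PyTorch memory-format behavior (the shape is arbitrary):

import torch

# A 4D tensor stored channels-last: same logical values, but the raw buffer
# behind data_ptr() is laid out NHWC rather than NCHW.
x = torch.randn(2, 3, 4, 5).contiguous(memory_format=torch.channels_last)

print(x.is_contiguous())                                   # False: not contiguous in the default (NCHW) sense
print(x.is_contiguous(memory_format=torch.channels_last))  # True: contiguous for its own layout
print(x.suggest_memory_format())                           # torch.channels_last

# A kernel that walks data_ptr() linearly therefore sees NHWC element order,
# so the float input and the quantized output need to agree on memory format.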

Contributor Author

Oh yes, thanks for the catch. Don't know what I was thinking removing this; maybe it will come back.

@kimishpatel
Contributor Author

I'm not sure the existing tests are sufficient to test this. At the very least, I think we need tests for quantizing channels-first, channels-last, and fully non-contiguous input tensors. Unless those are already there and I'm just missing them.

That sounds good. Let's check with Jerry. @jerryzh168, do you know if the quantization unit tests cover all these cases? If not, I will add them.

@jerryzh168
Contributor

jerryzh168 commented Aug 5, 2020

I'm not sure the existing tests are sufficient to test this. At the very least, I think we need tests for quantizing channels-first, channels-last, and fully non-contiguous input tensors. Unless those are already there and I'm just missing them.

That sounds good. Let's check with Jerry. @jerryzh168, do you know if the quantization unit tests cover all these cases? If not, I will add them.

Right, we don't have tests for different memory formats; we always make the tensor contiguous beforehand. Please add them in test/quantization/test_quantized_tensor.py. Thanks!
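
As a rough illustration only (not the test that was eventually landed), a memory-format case in test/quantization/test_quantized_tensor.py could look something like this; the helper name and the scale/zero_point values are made up:

import numpy as np
import torch

def check_quantize_per_tensor_memory_formats():
    # Hypothetical test body: quantize the same data in NCHW and NHWC layouts
    # and check the quantize/dequantize round trip stays close in both cases.
    scale, zero_point = 0.02, 128
    r = torch.rand(3, 2, 4, 5, dtype=torch.float) * 4 - 2
    for memory_format in [torch.contiguous_format, torch.channels_last]:
        x = r.contiguous(memory_format=memory_format)
        qx = torch.quantize_per_tensor(x, scale, zero_point, torch.quint8)
        assert np.allclose(x.numpy(), qx.dequantize().numpy(), atol=2 * scale)

check_quantize_per_tensor_memory_formats()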

kimishpatel added a commit that referenced this pull request Aug 6, 2020
ghstack-source-id: 295234b
Pull Request resolved: #42178
kimishpatel added a commit that referenced this pull request Aug 7, 2020
Test Plan:
Quantization tests.
pytest test/quantization/test_quantized_tensor.py

ghstack-source-id: c3a7da3
Pull Request resolved: #42178
@kimishpatel kimishpatel requested a review from jerryzh168 August 7, 2020 19:17
kimishpatel added a commit that referenced this pull request Aug 10, 2020
Test Plan:
Quantization tests.
pytest test/quantization/test_quantized_tensor.py

ghstack-source-id: 9364161
Pull Request resolved: #42178
Comment on lines 2627 to 2633
TORCH_CHECK(
qtensor.is_contiguous(qtensor.suggest_memory_format()),
"Quantized tensor should be contiguous");
TORCH_CHECK(
rtensor.is_contiguous(qtensor.suggest_memory_format()),
"Float tensor should be contiguous "
"in same memory format as quantizd tensor");
Contributor

can we create a helper function for these two checks?

Contributor Author

Just for checks? I know there are a few places we are doing checks, but it seems strange to have a separate two-line function to just do asserts.

zero_points = torch.tensor([5, 10], dtype=torch.long)
axis = 1

def quantize_c_4d(data, scales, zero_points):
Contributor

Can you make this function general to all dimensions? We can probably reshape the tensor to 1D, do the computation on the 1D tensor, and then reshape it back, I think.

Contributor Author

We can't use a 1D tensor because then we would have to deconstruct the channel dimension to apply per-channel quantization. But I think we can combine the h, w, and d dimensions and then use just one function.
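
A rough sketch of that idea (the helper name and signature are hypothetical, and it assumes a PyTorch with Tensor.movedim; a pair of transposes would do as well): move the channel axis to the front, flatten the remaining dimensions, quantize per row, and restore the shape.

import torch

def _quantize_per_channel_ref(data, scales, zero_points, axis=1, quant_min=0, quant_max=255):
    # Reference per-channel affine quantization for a tensor of any rank.
    moved = data.movedim(axis, 0)                    # channel axis first
    flat = moved.reshape(moved.size(0), -1).double() # combine all non-channel dims
    q = torch.round(flat / scales.unsqueeze(1) + zero_points.unsqueeze(1))
    q = q.clamp(quant_min, quant_max)
    return q.reshape(moved.shape).movedim(0, axis)

The result can then be compared against torch.quantize_per_channel(...).int_repr() for both 4D and 5D inputs.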

Comment on lines 260 to 265
for memory_format in [torch.contiguous_format, torch.channels_last]:
r = r.contiguous(memory_format=memory_format)
qr = torch.quantize_per_channel(r, scales, zero_points, axis, torch.quint8)
rqr = qr.dequantize()
self.assertTrue(np.allclose(qr.int_repr(), quantize_c_4d(r, scales, zero_points)))
self.assertTrue(np.allclose(r.numpy(), rqr.numpy(), atol=2 / np.min(scales.numpy())))
Contributor

and merge the test code as well

out = torch.nn.functional.max_pool2d(out, 2, 2)
out = self.cat.cat([out, out])
out = out.view(-1, 3 * 2 * 2)
out = out.reshape(-1, 3 * 2 * 2)
Contributor

why is this change needed?

Contributor Author

@kimishpatel kimishpatel Aug 11, 2020

Because view is broken for the channels_last format and there is no easy way to fix that. Although it's not clear why this PR in particular exposes that.
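
A small illustration of why reshape is the safer call here, assuming standard PyTorch view/reshape semantics:

import torch

x = torch.randn(2, 3, 4, 4).contiguous(memory_format=torch.channels_last)

# view() requires the requested shape to be expressible over the existing
# strides without copying, which a channels-last tensor generally is not.
try:
    x.view(2, -1)
except RuntimeError as e:
    print("view failed:", e)

# reshape() falls back to a copy when the strides don't allow a view.
y = x.reshape(2, -1)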

qtensor.scalar_type(), "quantize_tensor_per_tensor_affine_cpu", [&]() {
TORCH_CHECK(
rtensor.is_contiguous(), "Float tensor should be contiguous");
rtensor.is_contiguous(rtensor.suggest_memory_format()),
Contributor

how do we guarantee that these functions are always passed contiguous tensors as inputs?

Contributor Author

It is already broken if we are not passing contiguous tensors, because the kernels extract raw pointers and operate on them. This check is to ensure that the kernels get what they expect.

Comment on lines 281 to 286
ref_res = _quantize_per_channel_ref_nd(r, scales, zero_points)
r = r.contiguous(memory_format=memory_format)
qr = torch.quantize_per_channel(r, scales, zero_points, axis, torch.quint8)
rqr = qr.dequantize()
self.assertTrue(np.allclose(qr.int_repr(), ref_res))
self.assertTrue(np.allclose(r.numpy(), rqr.numpy(), atol=2 / np.min(scales.numpy())))
Contributor

same here

@kimishpatel kimishpatel requested a review from jerryzh168 August 20, 2020 14:13
Comment on lines 2535 to 2541
TORCH_CHECK(
qtensor.is_contiguous(qtensor.suggest_memory_format()),
"Quantized tensor should be contiguous");
TORCH_CHECK(
rtensor.is_contiguous(qtensor.suggest_memory_format()),
"Float tensor should be contiguous "
"in same memory format as quantizd tensor");
Contributor

Looks like these are repeated a lot of times; I think it would be better to put them into a function.

Contributor Author

Yes, but it seems weird to just move this to a function whose sole purpose is to assert. If you insist I can do this, but I don't think it is very meaningful.

Contributor

You can put these in an anonymous namespace to avoid affecting other files.

Contributor Author

Sure, although my feeling is that it is a little too much deduplication just for the sake of it.

Comment on lines 53 to 56
print("--- reference output---")
print(Y_exp)
print("--- actual output---")
print(Y_act)
Contributor

remove?

Contributor Author

Aaah. Thanks for the catch.

Comment on lines +276 to +294
# Check 4D tensor with 2 different memory formats.
r = torch.rand(3, 2, 4, 5, dtype=torch.float) * 4 - 2
scales = torch.tensor([0.2, 0.03], dtype=torch.double)
zero_points = torch.tensor([5, 10], dtype=torch.long)
self._test_quantize_per_channel(r, scales, zero_points, 1 , False)

scales = torch.tensor([0.2, 0.03, 0.5], dtype=torch.double)
zero_points = torch.tensor([5, 10, 7], dtype=torch.long)
self._test_quantize_per_channel(r, scales, zero_points, 0, False)

# Check 5D tensor.
r = torch.rand(3, 2, 4, 5, 7, dtype=torch.float) * 4 - 2
scales = torch.tensor([0.2, 0.03], dtype=torch.double)
zero_points = torch.tensor([5, 10], dtype=torch.long)
self._test_quantize_per_channel(r, scales, zero_points, 1, False)

scales = torch.tensor([0.2, 0.03, 0.5], dtype=torch.double)
zero_points = torch.tensor([5, 10, 7], dtype=torch.long)
self._test_quantize_per_channel(r, scales, zero_points, 0, False)
Contributor

nit: these can be in a loop as well, with r, scale, zero_points, axis being configurable
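
A sketch of that nit, reusing the existing _test_quantize_per_channel helper inside the test method; the cases table simply mirrors the calls quoted above:

# Inside the test method, reusing the existing helper:
cases = [
    # (shape,          scales,           zero_points, axis)
    ((3, 2, 4, 5),     [0.2, 0.03],      [5, 10],     1),
    ((3, 2, 4, 5),     [0.2, 0.03, 0.5], [5, 10, 7],  0),
    ((3, 2, 4, 5, 7),  [0.2, 0.03],      [5, 10],     1),
    ((3, 2, 4, 5, 7),  [0.2, 0.03, 0.5], [5, 10, 7],  0),
]
for shape, scales, zero_points, axis in cases:
    r = torch.rand(*shape, dtype=torch.float) * 4 - 2
    scales = torch.tensor(scales, dtype=torch.double)
    zero_points = torch.tensor(zero_points, dtype=torch.long)
    self._test_quantize_per_channel(r, scales, zero_points, axis, False)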

self.assertTrue(np.allclose(qr.int_repr(), ref))
self.assertTrue(np.allclose(r.numpy(), dequant_tensor.numpy(), atol=1))

# Check 4D tensor with 2 different memory formats.
Contributor

Same here. I think maybe you can also merge this test with the previous test.

Contributor Author

That introduces unrelated changes. We should merge it with the previous one in a separate PR if we want to do that.

Contributor

sure, sounds good

Contributor

@jerryzh168 jerryzh168 left a comment

Thanks, had a few inline nit comments

@facebook-github-bot
Contributor

This pull request has been merged in b52e6d0.
