Conversation

@Baranowski
Contributor

Fixes #33439

This introduces torch._sparse_coo_tensor_unsafe(...) and
torch._validate_sparse_coo_tensor_args(...)
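For context, here is a minimal pure-Python sketch of the kind of COO invariants such a validator enforces. This is illustrative only, not the actual torch implementation: the function name and the nested-list stand-in for tensors are assumptions.

```python
def validate_sparse_coo_args(indices, values, size):
    """Illustrative check of basic COO invariants (not the real torch code).

    indices: list of sparse_dim rows, each holding nnz column entries
    values:  list of nnz values
    size:    declared shape of the dense result
    """
    sparse_dim = len(indices)
    if sparse_dim > len(size):
        raise ValueError("more index dimensions than declared size")
    nnz = len(values)
    if any(len(row) != nnz for row in indices):
        raise ValueError("indices and values disagree on nnz")
    for dim, row in enumerate(indices):
        for idx in row:
            if not (0 <= idx < size[dim]):
                raise ValueError(
                    f"index {idx} out of range for dim {dim} of size {size[dim]}")

# A well-formed 2x3 COO tensor with two nonzeros passes silently:
validate_sparse_coo_args([[0, 1], [2, 0]], [1.0, 2.0], (2, 3))
```

The point of a separate validator is that an "unsafe" constructor can skip these checks during deserialization and run them once afterwards.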

@Baranowski (Contributor Author) left a comment

I won't be surprised if I did something wrong working with the Python-C++ interface.

@ezyang, could you please suggest a reviewer?

@dr-ci

dr-ci bot commented Mar 2, 2020

💊 CI failures summary and remediations

As of commit 5b84ca0 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



@ezyang
Contributor

ezyang commented Mar 2, 2020

@Baranowski to start, can you get one of your teammates working on sparse to do a first review of this PR? I can take a look after that.

@ezyang ezyang self-requested a review March 2, 2020 17:21
@Baranowski Baranowski requested a review from hameerabbasi March 2, 2020 17:36
@ezyang
Contributor

ezyang commented Mar 2, 2020

btw it looks like there are some test failures

@ezyang ezyang added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Mar 2, 2020
@Baranowski
Contributor Author

Baranowski commented Mar 2, 2020 via email

@Baranowski
Contributor Author

For the record, (at least some) CI failures are due to #34126; I have put up a PR for that.

@Baranowski Baranowski force-pushed the wbaranowski-sparse_load-33439 branch from b780c06 to 85f3ad4 Compare March 12, 2020 22:11
@Baranowski
Contributor Author

@hameerabbasi, this is ready for a review

@Baranowski Baranowski changed the title serialization: validate sparse tensors after loading [WiP] serialization: validate sparse tensors after loading Mar 16, 2020
Collaborator

I think you need a TORCH_API here, since this is the symbol that is "not found".

Contributor Author

There is TORCH_API in the header file. Besides, I believe TORCH_API doesn't do anything when building with gcc, as is the case on our dev machine.

Anyway, this was taking way more time than I considered worth it, so I abandoned this attempt and just left _validate_sparse_coo_tensor_args as a method on torch. I can try some more if you or another reviewer feel strongly about it.

Collaborator

I don't feel too strongly; it seems like the methods these are related to naturally belong in torch. I was just hoping to give some pointers on how to make it work.

@Baranowski Baranowski force-pushed the wbaranowski-sparse_load-33439 branch from 3b5d71d to 421b985 Compare March 18, 2020 15:08
@Baranowski Baranowski changed the title [WiP] serialization: validate sparse tensors after loading serialization: validate sparse tensors after loading Mar 18, 2020
@Baranowski
Contributor Author

@ezyang this is ready for a review. No rush.

Contributor

It's a bit misleading that this returns a bool but will actually never return false (instead it raises an error). I think raising errors is right, but then the return type should probably just be void.

Contributor Author

Neither void nor None works as a return type in native_functions.yaml. Or at least I couldn't find a way to make them work.

Contributor

lol, it's because it's spelled (). So blah_blah(...) -> ()
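Concretely, a void-returning native function is declared with an empty tuple in the dispatch schema. The following native_functions.yaml entry is a plausible sketch of what that looks like, not necessarily the exact line in this PR:

```yaml
# Hypothetical entry in aten/src/ATen/native/native_functions.yaml;
# "-> ()" declares a function with no return value.
- func: _validate_sparse_coo_tensor_args(Tensor indices, Tensor values, int[] size) -> ()
```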

Contributor

I don't think this is necessary here, or else you should remove the expand_values_if_needed from the validate/unsafe calls below.

torch/tensor.py Outdated
Contributor

nit: The logic here seems inverted. IF it is sparse, then I should validate sparse coo tensor args (not the other way around!)

Contributor

What if the sparse tensor is recursively stored inside of another struct, e.g., a dict?

@ezyang
Contributor

ezyang commented Mar 19, 2020

You need to actually make sure the hook is called for every object you unpickle, and no, I don't want you to write some sort of function to recursively traverse arbitrary structs looking for sparse tensors, that way leads to madness. Maybe we can maintain some global variable (ugh!) of "deserialized sparse tensors that need validating" and then check them at the end with load.

@Baranowski Baranowski force-pushed the wbaranowski-sparse_load-33439 branch from 95ef830 to eb912e5 Compare March 21, 2020 10:23
@Baranowski
Contributor Author

Done. The CI failures don't look related.

@Baranowski Baranowski force-pushed the wbaranowski-sparse_load-33439 branch from eb912e5 to 27eab3f Compare March 21, 2020 19:04
@ezyang ezyang requested a review from gchanan March 23, 2020 19:02
@ezyang
Contributor

ezyang commented Mar 23, 2020

Getting a second opinion on this

@Baranowski Baranowski force-pushed the wbaranowski-sparse_load-33439 branch from 27eab3f to c2af0b8 Compare March 31, 2020 15:04
@Baranowski
Contributor Author

@ezyang, any updates? There is no rush. I'm just making sure this doesn't fall off the radar.

@ezyang
Contributor

ezyang commented Mar 31, 2020

cc @gchanan

@Baranowski
Contributor Author

@hameerabbasi Do you have time for another review? I've changed it substantially in response to Ed's review, and it seems that FB folks have been too busy recently.

@hameerabbasi hameerabbasi self-requested a review April 9, 2020 20:22
@ezyang
Contributor

ezyang commented Apr 10, 2020

gchanan is aware but tied up with 1.5 release right now.

@hameerabbasi (Collaborator) left a comment

A small question.

torch/_utils.py Outdated
Comment on lines +157 to +162
Collaborator

Am I correct in thinking that this would skip all Tensors if one raised? If so, is that the intention?

Contributor Author

That's correct. That was my intention. Do you think that I should leave the ones that haven't been verified yet?

@Baranowski (Contributor Author) commented Apr 10, 2020

Actually, that should be fine: the exception will propagate straight through _legacy_load() or _load() without being caught, so no result is returned and all the unvalidated sparse tensors are discarded anyway.

torch/_utils.py Outdated
Collaborator

Is it possible to get rid of the global state here? It could cause issues with threading or leave things in an invalid state.

Contributor Author

Not that I can think of. I don't know enough about Python threading and GIL to understand the dangers. Do you think I should guard this with some mutex in _legacy_load() in serialization.py? That would get really ugly.

Collaborator

Can you somehow put it inside _legacy_load and pass it around?

Contributor Author

I cannot think of a way to do that.

@Baranowski
Contributor Author

@gchanan No rush, just a friendly periodic ping to make sure you are still aware. I will send another ping in a week or so, unless instructed otherwise.

@Baranowski
Contributor Author

@gchanan, this is another friendly low-pri ping for a review

@Baranowski
Contributor Author

@gchanan, another friendly ping for a review but I'm in no rush. Next ping coming in about a month or so.

@Baranowski
Contributor Author

@gchanan, another friendly ping for a review. I won't send any more of those.

@ezyang (Contributor) left a comment

I'm going to stick my neck out and approve this. Can we get a merge to master?

@facebook-github-bot (Contributor) left a comment

@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@Baranowski Baranowski force-pushed the wbaranowski-sparse_load-33439 branch from c2af0b8 to 0f8f570 Compare June 23, 2020 05:57
@Baranowski Baranowski force-pushed the wbaranowski-sparse_load-33439 branch from 0f8f570 to 5b84ca0 Compare June 24, 2020 14:18
@Baranowski
Contributor Author

@ezyang Done. Sorry for the delay. I was chasing a spurious CI failure that looked like it was my fault.

@facebook-github-bot (Contributor) left a comment

@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@ezyang merged this pull request in fcadca1.


Labels

Merged, open source, triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)


Development

Successfully merging this pull request may close these issues.

Sparse tensor persistency - error when loading

6 participants