Export tensor descriptor #8313
Conversation
|
Hi, I'm all in for exporting more things that make life easier when writing extensions! |
|
@fmassa what were you thinking of specifically? API documentation, or something more like additional cpp_extension examples and tests? Those specific APIs don't seem to have changed much recently, but the build itself has been refactored a lot. That said, there is a lot of code in ATen that could be refactored, especially around creating descriptors; for example, I'm pretty sure you could eliminate 3/4 duplicate descriptors in this file: My specific use case is just implementing cudnnCTCLoss, but it should be similar to any cudnn extension. These are what I'm using, and everything builds and tests correctly, but please let me know if there are others I should be using and whether this list seems reasonable:
These are the includes I ended up with: Would it make sense to add an extension test? It would just run cpp_extension to check that some specified list of headers and functions compiles and links correctly in an extension. |
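For concreteness, a rough sketch of the kind of link test being described. This is an illustration only: the source file name cudnn_link_test.cpp, the module name, and the test class are hypothetical, and with_cuda refers to the flag discussed later in this thread.

import unittest

import torch
from torch.utils.cpp_extension import CUDA_HOME, load

TEST_CUDA = torch.cuda.is_available() and CUDA_HOME is not None


class TestCudnnHeaderLinking(unittest.TestCase):
    @unittest.skipIf(not TEST_CUDA, "CUDA not available")
    def test_headers_link(self):
        # cudnn_link_test.cpp would only #include the exported headers
        # (ATen/cudnn/Descriptors.h, Handles.h, Exceptions.h) and reference a
        # few of the exported symbols. If the headers are installed and the
        # symbols are exported, the build succeeds; otherwise load() raises a
        # compilation or linking error and the test fails.
        module = load(
            name="cudnn_link_test",
            sources=["cudnn_link_test.cpp"],
            with_cuda=True,  # pure C++ source, but it needs the CUDA/cuDNN toolchain
            verbose=True,
        )
        self.assertIsNotNone(module)


if __name__ == "__main__":
    unittest.main()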
|
@pytorchbot retest this please. |
|
cc @ssnl, who was looking into implementing cudnnCTCLoss bindings as well; looks like Ben has done some work :) |
|
I think that everything currently exposed in the public ATen API is considered stable, but it would be good to have some clear instructions to avoid problems like the one in #8267. |
|
@ssnl just saw dotaPredict. Awesome! If you're ever doing any dota-based projects, please let me know. |
|
@soumith @peterjc123 Any thoughts on an approach to this current error (unrelated to this PR)? It looks like onnx sets /WX (treat warnings as errors) and includes protobuf, and protobuf has a small warning in it (forcing int to bool). Making that an atomic bool instead of an atomic int is marked as a TODO in protobuf (at least in the version that is the submodule). Maybe remove /WX from onnx, fix or suppress the warning in protobuf, or try updating the submodules to see if there are any changes. Not sure if there would be a good way to inject a pragma disable from torch that would propagate through onnx to protobuf. Relevant build log below: |
|
This looks like the culprit: onnx/onnx#1104 If we just set onnx to the version before that, it would be a temporary fix. |
|
And this looks like the fix (1 hr ago): onnx/onnx#1105 Thanks @bddppq ! |
|
Hey @bstriner, as author of these descriptor classes, I can say a little more about what @fmassa is talking about. Basically, if I'm writing some code which I know is only going to be used in one context, I may take shortcuts / make assumptions that I wouldn't make if I was designing a general purpose API. Glancing over the descriptor classes, I can see a few assumptions I made:
And there are some maybe questionable design choices, such as having … So, the danger is that at some point someone will want to actually add support for something that is not in the code, and the most straightforward thing to do will be to just refactor the (public!) APIs of these classes so that they have one more parameter or something. We'll accept that patch because none of the tests test for a particular API, and then your code will break. That being said, I'm not opposed to making this part of the public API and committing to supporting it for the foreseeable future; they're pretty self-contained classes, and unlikely to be misused, so keeping them BC shouldn't be a big deal. But we haven't really set up the process for making sure we don't accidentally break pieces like this, which is why @fmassa is nervous :) |
|
@ezyang I was actually thinking about some updates along those lines. I like the idea of some central code for creating descriptors, but it could use some extra options/tests/whatever. Personally, I'm only using TensorDescriptor; I just figured it would make sense to export the rest of that header while I was in there. It's not the end of the world if I go back to calling cudnnSetTensorNdDescriptor myself in my own helper, but TensorDescriptor is just so much cleaner if you already have a bunch of at::Tensors. Same with things like CUDNN_CHECK: easy to write yourself, but it makes much more sense to just import it from PyTorch. If the concern is that something will break, I was thinking about just adding a test: write a CPP file that uses all of the relevant functions, call cpp_extension, and assert that everything links correctly. It doesn't need to actually do anything in particular, though you could also add some functional checks. That would verify that the headers are in the right place and that the functions are exported correctly. |
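To make the boilerplate point concrete, here is a minimal sketch (not part of this PR) of what an extension has to hand-roll per tensor without the exported TensorDescriptor helper. The module name is made up, CUDNN_CHECK and the includes are borrowed from this PR's test extension (which exported header defines CUDNN_CHECK is an assumption), and -lcudnn may or may not be needed depending on the setup.

from torch.utils.cpp_extension import load_inline

# The C++ below shows the per-tensor descriptor boilerplate that the exported
# TensorDescriptor helper replaces. Treat it as a sketch, not a verified build.
cpp_source = r"""
#include <ATen/cudnn/Handles.h>
#include <torch/csrc/cuda/cuda_check.h>
#include <torch/torch.h>
#include <cudnn.h>

// Hand-rolled equivalent of what TensorDescriptor does for a 4-D float tensor.
void describe(const at::Tensor& t) {
  cudnnTensorDescriptor_t desc;
  CUDNN_CHECK(cudnnCreateTensorDescriptor(&desc));
  CUDNN_CHECK(cudnnSetTensor4dDescriptor(
      desc, CUDNN_TENSOR_NCHW, CUDNN_DATA_FLOAT,
      t.size(0), t.size(1), t.size(2), t.size(3)));
  // ... pass desc to a cuDNN call here ...
  CUDNN_CHECK(cudnnDestroyTensorDescriptor(desc));
}
"""

# Building this doubles as the link test described above.
ext = load_inline(
    name="descriptor_boilerplate_demo",   # illustrative name
    cpp_sources=cpp_source,
    functions=["describe"],               # generates the Python binding
    with_cuda=True,                       # no .cu file, but CUDA/cuDNN paths are needed
    extra_ldflags=["-lcudnn"],            # may be required to resolve the cuDNN symbols
)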
|
You might have a point about the mutex. I'll have to check if I can see any noticeable performance difference between inlining everything and using the exports, but I don't expect much. |
|
Yes, I think some tests would be just the ticket! |
torch/utils/cpp_extension.py
Outdated
bindings. If a dictionary is given, it should map function names to
docstrings (which are otherwise just the function names).
with_cuda: If ``cuda_sources`` is provided, CUDA will always be
enabled. Use this flag for pure CPP extensions that require CUDA.
ezyang
left a comment
Fix the executable bit on the files and this is OK to merge.
goldsborough
left a comment
I need some time to look at this; this change is new to me.
goldsborough
left a comment
Please take another look. I think there are some necessary changes
#include <torch/torch.h>
#include <iostream>

using namespace at;
const char *cudnn_relu_name = "cudnn_relu";

// Check arguments to cudnn_relu
int cudnn_relu_check(const Tensor &inputs, const Tensor &outputs) {
  return 0;
}

int cudnn_relu(const Tensor& inputs, const Tensor& outputs) {
checkScalarType(cudnn_relu_name, arg_inputs, ScalarType::Float); \
checkBackend(cudnn_relu_name, arg_inputs.tensor, Backend::CUDA);
checkContiguous(cudnn_relu_name, arg_outputs);
checkScalarType(cudnn_relu_name, arg_outputs, ScalarType::Float); \
}

int cudnn_relu(const Tensor& inputs, const Tensor& outputs) {
  // Check inputs
  // Destroy
  CUDNN_CHECK(cudnnDestroyActivationDescriptor(activationDesc));
  delete input_tensor_desc;
  // Return
#include <ATen/cudnn/Handles.h>
#include <torch/csrc/cuda/cuda_check.h>
#include <torch/torch.h>
#include <iostream>
@@ -0,0 +1,73 @@
#include <torch/torch.h>
extra_include_paths=None,
build_directory=None,
verbose=False):
verbose=False,
test/test_cpp_extensions.py
Outdated
from torch.utils.cpp_extension import CUDA_HOME
TEST_CUDA = torch.cuda.is_available() and CUDA_HOME is not None
TEST_CUDNN = TEST_CUDA and torch._C.has_cudnn
|
@goldsborough going through the changes now. I think some sort of extra flag is required if you want the CUDA libraries but don't happen to be writing a new kernel. An example use case is if you compiled your kernels separately or something. What is your take on the actual semantics/name/documentation of that option? An extra option like |
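As a sketch of that use case (file and module names are illustrative; with_cuda is the flag name from the docstring hunk quoted above):

from torch.utils.cpp_extension import load

# A pure .cpp extension with no new kernels: nothing here would trigger CUDA
# automatically, so the flag forces the CUDA include and library paths.
ext = load(
    name="my_cudnn_extension",            # illustrative name
    sources=["my_cudnn_extension.cpp"],   # no .cu files
    with_cuda=True,                       # force CUDA even for a CPP-only build
    extra_ldflags=["-lcudnn"],            # may be needed if the code calls cuDNN directly
    verbose=True,
)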
|
@bstriner I just replied inline to say that we should keep … I did see the extra cudnn library includes in the test, and it looks a bit fragile. I'd be happy to add … |
|
If it is in the same location as CUDA, then it isn't a problem. Some people have multiple cudnn installs in separate directories, which is a complete pain. One possibility is to pull the paths found by cmake and use them as extra include and library paths if they exist. That would mean cudnn would work if it is in CUDA_HOME or if it is in the same place as when PyTorch was built, so whatever libraries you built with would be used for extensions. That should cover most situations. I'm not sure there is a situation where you would want to compile with one cudnn in one directory and build extensions against a different one in a different directory. https://github.com/pytorch/pytorch/blob/master/cmake/Modules/FindCuDNN.cmake |
|
Not all users of C++ extensions will be compiling PyTorch from source, however, so there may not be any cmake output to pull from. I think it's ok to add |
|
The cmake paths would just be a hint, so if there isn't anything there, it isn't going to do any worse than it did before. The current test just assumes cudnn is in one of the known paths (including CUDA_HOME), which should work for most situations. Maybe the simplest workaround would be just respecting some sort of CUDNN_HOME environment variable and getting includes and libraries from there (if provided). I will have to think about it some more and maybe start another issue later. |
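A sketch of the CUDNN_HOME idea being floated here; note this is a hypothetical user-side helper, not behaviour that cpp_extension provides in this PR, and the names are illustrative.

import os

from torch.utils.cpp_extension import load

# If CUDNN_HOME points at a standalone cuDNN install, forward its include/ and
# lib64/ directories through the existing extra_* arguments.
cudnn_home = os.environ.get("CUDNN_HOME")
extra_include_paths = []
extra_ldflags = ["-lcudnn"]
if cudnn_home:
    extra_include_paths.append(os.path.join(cudnn_home, "include"))
    extra_ldflags.insert(0, "-L" + os.path.join(cudnn_home, "lib64"))

ext = load(
    name="cudnn_home_demo",            # illustrative name
    sources=["cudnn_home_demo.cpp"],   # illustrative source file
    with_cuda=True,
    extra_include_paths=extra_include_paths,
    extra_ldflags=extra_ldflags,
)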
|
@goldsborough I think I got all of the comments addressed now. Thanks for all the help! The biggest change is just the parameter logic and documentation. Existing logic: … My first version: … New logic based on this discussion: … Most people would use |
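The quoted variants are not reproduced above, but the behaviour with_cuda ended up with (as documented for torch.utils.cpp_extension) is roughly the following; this is a paraphrase, not the actual implementation.

def resolve_with_cuda(with_cuda, sources, cuda_sources=None):
    # None (the default) means "infer": load() looks for .cu/.cuh files in
    # sources, load_inline() checks whether cuda_sources was given.
    if with_cuda is None:
        if cuda_sources is not None:
            return True
        return any(str(s).endswith((".cu", ".cuh")) for s in sources)
    # An explicit True/False overrides the inference, covering the
    # "pure C++ extension that still needs the CUDA libraries" case.
    return with_cuda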
|
@pytorchbot retest this please. |
|
Currently some third_party issues are blocking the tests (onnx sqrt issue). Needs retest after those are resolved. |
|
Looks like this commit changed sqrt in onnx: onnx/onnx@068f1a4 |
goldsborough
left a comment
Looks good! Thanks a lot for implementing this
|
I just pushed to onnx something that should fix the issue: |
|
@pytorchbot retest this please. |
|
@goldsborough @ezyang @fmassa Thanks for all the comments! Looks like that onnx PR got CI working again. I think this is now good to go but please let me know if there is anything else missing. Possible follow-up tasks:
|
Hi Guys!
This exports descriptors (TensorDescriptor, ConvolutionDescriptor, etc.). It also re-implements a previous PR to include cudnn/*.h in the install, which was knocked out by recent refactoring: #7749
Descriptors.h, Handles.h and Exceptions.h eliminate a lot of boilerplate in cudnn extensions, so this PR makes them available.
Cheers