Add docs about prod ecosystem features #23010
Conversation
| @@ -0,0 +1,131 @@
| Integration in production ecosystem
ecosystem -> environment
Integration in production environment
-> Features for large-scale deployments of PyTorch
| It doesn't cover topics of deploying models to production. Check
| :mod:`torch.jit` or one of the corresponding tutorials.
|
| Assumption below is that you either build PyTorch from source in your |
"Assumption below is" -> "The note assumes"
| Assumption below is that you either build PyTorch from source in your
| organization or have an ability to statically link additional code to be loaded
| when PyTorch is used. Therefore, many of the hooks are exposed as C++ APIs that
| can be triggered once in some centralized place, e.g. in static initialization
"in some" -> "in a"
| PyTorch comes with :mod:`torch.autograd.profiler` capable of measuring time
| taken by individual operators on demand. One can use the same mechanism to do
| always on measurements for any process running PyTorch. It might be useful for
always on -> "Always ON"
| PyTorch comes with :mod:`torch.autograd.profiler` capable of measuring time
| taken by individual operators on demand. One can use the same mechanism to do
| always on measurements for any process running PyTorch. It might be useful for
| gathering information about PyTorch workload running in a given process or
workload -> workloads
| global sampling rate specified by
| `torch::autograd::profiler::setSamplingProbability`.
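The sampling behavior the quoted passage describes can be illustrated without PyTorch. Below is a minimal pure-Python sketch of the idea behind `pushCallback` and `setSamplingProbability`: a callback registered once at startup fires for only a sampled fraction of operator invocations. All names in the sketch (`SampledProfiler`, `push_callback`, `record`) are invented for illustration and are not the actual C++ API.

```python
import random

random.seed(0)  # deterministic for the example

class SampledProfiler:
    """Toy stand-in for the C++ profiler hooks; all names are illustrative."""

    def __init__(self, sampling_probability):
        # Plays the role of torch::autograd::profiler::setSamplingProbability.
        self.sampling_probability = sampling_probability
        self.callbacks = []

    def push_callback(self, on_enter):
        # Register once, from a centralized place at process initialization --
        # the real pushCallback is not thread-safe and must not race with
        # running operators.
        self.callbacks.append(on_enter)

    def record(self, op_name):
        # Only a sampled fraction of invocations pays the profiling cost,
        # which is what makes always-on measurement cheap.
        if random.random() < self.sampling_probability:
            for callback in self.callbacks:
                callback(op_name)

events = []
profiler = SampledProfiler(sampling_probability=0.01)
profiler.push_callback(events.append)
for _ in range(100_000):
    profiler.record("aten::add")
# Roughly 1% of the 100k calls end up in `events`.
```

With a 1% sampling probability the steady-state overhead stays negligible while still giving a statistically useful picture of the fleet's operator mix.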
| Potential example might look like:
Potential example might look like
-> Here's an example
| often convenient to bundle additional information together with the model, for
| example, description of model producer or auxiliary artifacts.
|
| It can be achieved by passing ``_extra_files`` argument to
by passing -> by passing the
| @@ -0,0 +1,131 @@
| Integration in production ecosystem
| ===================================
Add a table of contents here
There's already a table of contents rendered on the right, but sure
| });
|
| API usage logging
move this section above "Attaching metadata to saved TorchScript models" as it flows much better after the operator profiling section
ilia-cher left a comment
Checked the part on operator callbacks - LGTM. Please also add that pushCallback and setSamplingProbability are not thread-safe: they should be called only during initialization, and never while any operator is running.
| void onFunctionEnter(const RecordFunction& fn) {
|   std::cerr << "Before function " << fn.name()
|             << " with " << fn.inputs().size() << "inputs" << std::endl;
nit: space before inputs
| When running in a broader ecosystem, for example in managed job scheduler, it's
| often useful to track which binaries invoke particular PyTorch APIs. There
| exists simple intrumentation injected at several important API points that
typo: instrumentation
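The "log an API point once per process" pattern the quoted passage describes can be sketched in a few lines of Python. The function names below (`set_api_usage_handler`, `log_api_usage_once`) are invented for the sketch; the actual instrumentation lives in C++ and is wired up by the hosting binary.

```python
# Process-wide state: which API points have already been reported, and the
# handler the hosting environment installed (e.g. a job scheduler's logger).
_seen_events = set()
_handler = None

def set_api_usage_handler(handler):
    # Like the C++ hooks, meant to be installed once, from a centralized
    # place such as static initialization in the hosting binary.
    global _handler
    _handler = handler

def log_api_usage_once(event_name):
    # Each instrumented API point reports at most once per process, so a
    # fleet-wide collector learns which binaries touch which PyTorch APIs
    # without any per-call overhead.
    if _handler is None or event_name in _seen_events:
        return
    _seen_events.add(event_name)
    _handler(event_name)

collected = []
set_api_usage_handler(collected.append)
log_api_usage_once("torch.jit.load")
log_api_usage_once("torch.jit.load")          # deduplicated
log_api_usage_once("torch.autograd.backward")
```

The deduplication is the key design choice: the signal of interest is "this binary uses this API at all", not a call count, so one report per process is enough.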
facebook-github-bot left a comment
@dzhulgakov is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@dzhulgakov merged this pull request in d6dcec3.
| often convenient to bundle additional information together with the model, for
| example, description of model producer or auxiliary artifacts.
|
| It can be achieved by passing the ``_extra_files`` argument to
Well we should never document arguments which have an underscore prefix... It's there to signify it's private. This has been put as a temporary measure after a discussion I had with @zdevito, but we never intended this functionality to be publicized and to have a lot of people depend on it...
Sorry, I missed this comment. I think passing metadata through PT files might be useful for higher-level workflows built on top of PT. How do you feel about dropping the underscore altogether and making it official functionality? I don't see too many drawbacks, as the name->blob interface is pretty generic and our container is a zip file already.
Of course, one can argue that additional files can be appended with just the ZipFile python API. While that's ok for reading, there might be unintended effects for writing, as ZipFile doesn't guarantee data block alignment (something our writer does). For reading, suggesting to use ZipFile should be sufficient.
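The read-side ZipFile workflow mentioned above can be shown concretely. The sketch below builds a stand-in archive in memory; the `model/extra/` layout is an assumption made for the illustration, not a documented format of TorchScript files.

```python
import io
import zipfile

# Build a stand-in archive in memory. A TorchScript .pt file is also a zip
# container, but the internal paths used here are assumed, not specified.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("model/code.py", "# serialized module code")
    archive.writestr("model/extra/producer.json", '{"team": "example"}')

# Reading extra files back with plain ZipFile is fine. Writing into an
# existing model archive this way is risky: ZipFile does not guarantee the
# data block alignment that PyTorch's own writer provides.
with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as archive:
    extra_files = {
        name: archive.read(name).decode()
        for name in archive.namelist()
        if "/extra/" in name
    }
```

This mirrors the asymmetry raised in the thread: third-party tools can safely consume bundled metadata with stock zip tooling, while producing it should go through PyTorch's writer.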
I would not be super happy about it, but it would at least resolve this issue 😕 I guess it's ok...
Covering fleet-wide profiling, API logging, etc.
It's my first time writing rst, so suggestions are definitely welcome.