
Conversation

@dzhulgakov (Collaborator):

Covering fleet-wide profiling, api logging, etc.

It's my first time writing rst, so suggestions are definitely welcome.

@pytorchbot added the `module: docs` label (Related to our documentation, both in docs/ and docblocks) on Jul 18, 2019
@@ -0,0 +1,131 @@
Integration in production ecosystem
Contributor:

ecosystem -> environment

Contributor:

Integration in production environment
-> Features for large-scale deployments of PyTorch

It doesn't cover topics of deploying models to production. Check
:mod:`torch.jit` or one of the corresponding tutorials.

Assumption below is that you either build PyTorch from source in your
Contributor:

"Assumption below is" -> "The note assumes"

Assumption below is that you either build PyTorch from source in your
organization or have an ability to statically link additional code to be loaded
when PyTorch is used. Therefore, many of the hooks are exposed as C++ APIs that
can be triggered once in some centralized place, e.g. in static initialization
Contributor:

"in some" -> "in a"


PyTorch comes with :mod:`torch.autograd.profiler` capable of measuring time
taken by individual operators on demand. One can use the same mechanism to do
always on measurements for any process running PyTorch. It might be useful for
Contributor:

always on -> "Always ON"

PyTorch comes with :mod:`torch.autograd.profiler` capable of measuring time
taken by individual operators on demand. One can use the same mechanism to do
always on measurements for any process running PyTorch. It might be useful for
gathering information about PyTorch workload running in a given process or
Contributor:

workload -> workloads

global sampling rate specified by
`torch::autograd::profiler::setSamplingProbability`.
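The sampling idea behind a global sampling rate can be illustrated with a small stdlib-only Python sketch. To be clear, the wrapper below is hypothetical and exists only to show the concept; the actual hook discussed here is the C++ `torch::autograd::profiler::setSamplingProbability` API, not a Python function:

```python
import random
import time

# Hypothetical illustration of probability-based sampling: measure only a
# random fraction of calls so always-on profiling stays cheap fleet-wide.
_sampling_probability = 0.0

def set_sampling_probability(p):
    # Mirrors the spirit of torch::autograd::profiler::setSamplingProbability;
    # per the review note above, the real API should only be called during
    # initialization, never while operators are running.
    global _sampling_probability
    _sampling_probability = p

def maybe_profile(fn, *args, **kwargs):
    """Run fn; with the configured probability, also record its duration."""
    if random.random() < _sampling_probability:
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{fn.__name__} took {elapsed:.6f}s")
        return result
    return fn(*args, **kwargs)
```

With the probability set to a small value (say 0.01), only about one call in a hundred pays the measurement cost, which is what makes always-on fleet-wide collection viable.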

Potential example might look like:
Contributor:

Potential example might look like
-> Here's an example

often convenient to bundle additional information together with the model, for
example, description of model producer or auxiliary artifacts.

It can be achieved by passing ``_extra_files`` argument to
Contributor:

by passing -> by passing the

@@ -0,0 +1,131 @@
Integration in production ecosystem
===================================

Contributor:

Add a table of contents here

Collaborator Author:

There's already a table of contents rendered on the right, but sure

});


API usage logging
Contributor:

move this section above "Attaching metadata to saved TorchScript models" as it flows much better after the operator profiling section

@ilia-cher (Contributor) left a comment:

Checked the part on operator callbacks - LGTM. Please also add that pushCallback and setSamplingProbability are not thread-safe, should be called during initialization only, and must never be called while any operator is running.


void onFunctionEnter(const RecordFunction& fn) {
std::cerr << "Before function " << fn.name()
<< " with " << fn.inputs().size() << "inputs" << std::endl;
Contributor:

nit: space before inputs


When running in a broader ecosystem, for example in managed job scheduler, it's
often useful to track which binaries invoke particular PyTorch APIs. There
exists simple intrumentation injected at several important API points that
Contributor:

typo: instrumentation

@facebook-github-bot (Contributor) left a comment:

@dzhulgakov is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor):

@dzhulgakov merged this pull request in d6dcec3.

often convenient to bundle additional information together with the model, for
example, description of model producer or auxiliary artifacts.

It can be achieved by passing the ``_extra_files`` argument to
Contributor:

Well we should never document arguments which have an underscore prefix... It's there to signify it's private. This has been put as a temporary measure after a discussion I had with @zdevito, but we never intended this functionality to be publicized and to have a lot of people depend on it...

Collaborator Author:

Sorry, I missed this comment. I think passing metadata through PT files might be useful for higher-level workflows built on top of PT. How do you feel about dropping the underscore altogether and making it official functionality? I don't see too many drawbacks, as the name->blob interface is pretty generic and our container is a zip file already.

Of course, one can argue that additional files can be appended with just the ZipFile Python API. While that's OK for reading, there might be unintended effects for writing, as ZipFile doesn't guarantee data block alignment (something our writer does). For reading, suggesting ZipFile should be sufficient.
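The read path described above can be sketched with the standard zipfile module alone. The archive layout and the `producer.json` name below are stand-ins for illustration, not the exact TorchScript container format:

```python
import io
import zipfile

# Build an in-memory zip that stands in for a saved TorchScript archive
# with one extra metadata file bundled alongside the model data.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("model/data.pkl", b"\x00")  # placeholder model payload
    zf.writestr("model/extra/producer.json", '{"producer": "trainer-v2"}')

# Reading the extra file back needs only ZipFile -- the data-block-alignment
# concern mentioned above applies only to the write path.
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    producer = zf.read("model/extra/producer.json").decode()

print(producer)  # {"producer": "trainer-v2"}
```

Appending to an existing archive with ZipFile is where alignment guarantees can be lost, so writing extra files should go through the framework's own writer instead.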

Contributor:

I would not be super happy about it, but it would at least resolve this issue 😕 I guess it's ok...


Labels: Merged, module: docs (Related to our documentation, both in docs/ and docblocks)

7 participants