quantization: autogenerate quantization backend configs for documentation #75126
Conversation
quantization: autogenerate quantization backend configs for documentation
Summary:
Quantization has a high volume of configurations specifying how to quantize an
op for a reference model representation, which is useful for the lowering
step of a backend. An example of such a config is
```
{'dtype_configs': [{'input_dtype': torch.quint8,
'output_dtype': torch.quint8}],
'observation_type': <ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT: 0>,
'pattern': <class 'torch.nn.modules.conv.ConvTranspose1d'>},
```
These configs are checked into master, but they are created by Python functions.
Therefore, there is no easy way for a user to see what the configs actually
are without running some Python code.
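For context, a minimal, illustrative sketch of how such an entry is generated in Python; the helper name and the module list here are assumptions for illustration, not the exact code in master:
```
import torch

# hypothetical helper, for illustration only; the real generation code in
# torch.ao.quantization.fx.backend_config is more involved
def _get_conv_transpose_configs(observation_type, dtype_configs):
    # emit one config entry per ConvTranspose module, all sharing the same
    # observation type and dtype settings
    return [
        {
            "pattern": pattern,
            "observation_type": observation_type,
            "dtype_configs": dtype_configs,
        }
        for pattern in (torch.nn.ConvTranspose1d, torch.nn.ConvTranspose2d)
    ]
```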
This PR is one approach to documenting these configs. Here is what it does:
1. During the documentation build, write a text file of the configs (a sketch of this step follows the list)
2. Render that text file on a quantization page, with some additional context
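A minimal sketch of step 1, assuming a standalone helper script invoked during the docs build; the script name and output path are illustrative, not the exact ones in this PR:
```
# illustrative script, e.g. docs/source/scripts/build_quantization_configs.py
from pprint import pformat

from torch.ao.quantization.fx.backend_config import get_native_backend_config_dict

def write_backend_config(output_path):
    # dump the native backend configs into a text file that the
    # quantization docs page can include verbatim
    with open(output_path, "w") as f:
        f.write(pformat(get_native_backend_config_dict()))

if __name__ == "__main__":
    write_backend_config("quantization-backend-configuration.txt")
```
The quantization page could then pull the generated file in with, for example, Sphinx's literalinclude directive.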
In the future, this could be extended to autogenerate better-looking tables,
such as op support per backend and dtype, op support per valid quantization
settings per backend, etc.
Currently this is an RFC to ensure there are no security concerns with
rendering a text file. If there are none, this could move to review by the
quantization team.
Note: the content being rendered is a placeholder for now. We need a
better-looking printout, with more context, before we can ship something like
this to users.
Test plan:
```
cd docs
make html
cd html
python -m http.server 8000
# open http://[::]:8000/quantization-backend-configuration.html
# verify that the page renders correctly
```
[ghstack-poisoned]
Dr. CI: As of commit 169bdb5, all checks look good so far; no failures.
andrewor14 left a comment:
By the way, I just landed a bunch of new configs. Could you rebase and see what it looks like now?
Review comment on these lines in the diff:
```
from torch.ao.quantization.fx.backend_config import get_native_backend_config_dict
from pprint import pprint
pprint(get_native_backend_config_dict())
```
I think this would look better:
```
import json
print(json.dumps(get_native_backend_config_dict(), indent=4))
```
json.dumps can't serialize values like torch dtypes or classes directly, but something like this would work:
```
import json
from torch.utils._pytree import tree_map

config = get_native_backend_config_dict()
# convert Python types to strings, so the json module can understand them
config_str = tree_map(str, config)
# f is the output file handle opened by the doc build script
f.write(json.dumps(config_str, indent=2))
```
However, the results would look like:
```
{
  "pattern": "<function leaky_relu at 0x7fd4083abb80>",
  "observation_type": "ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT",
  "dtype_configs": [
    {
      "input_dtype": "torch.quint8",
      "output_dtype": "torch.quint8"
    }
  ]
},
```
It's easier to read, but everything is printed as a string, which may be confusing to the user since these objects are not actually strings. I think a custom printing function will have to be written to make the output both easy on the eyes and free of incorrect types.
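For reference, a minimal sketch of such a custom printer; it special-cases only the value types that appear in these configs (torch dtypes, enums, classes, and functions), and both function names are hypothetical:
```
import enum

import torch

def _value_to_str(value):
    # torch dtypes already print as e.g. torch.quint8
    if isinstance(value, torch.dtype):
        return str(value)
    # enums, e.g. ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT
    if isinstance(value, enum.Enum):
        return f"{type(value).__name__}.{value.name}"
    # classes and functions: qualified name instead of <function ... at 0x...>
    if isinstance(value, type):
        return f"{value.__module__}.{value.__qualname__}"
    if callable(value):
        return f"{value.__module__}.{value.__name__}"
    return repr(value)

def config_to_str(obj, indent=0):
    # recursively render dicts and lists, one entry per line
    pad = " " * indent
    if isinstance(obj, dict):
        return "\n".join(
            f"{pad}{key}:\n{config_to_str(value, indent + 2)}"
            if isinstance(value, (dict, list))
            else f"{pad}{key}: {_value_to_str(value)}"
            for key, value in obj.items()
        )
    if isinstance(obj, list):
        return "\n".join(config_to_str(item, indent) for item in obj)
    return pad + _value_to_str(obj)
```
The doc build script could then call f.write(config_to_str(config)) in place of the json.dumps version above.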
I see. Maybe we can do that in the future, then.
…r documentation"
Summary:
Quantization has a high volume of configurations of how to quantize an
op for a reference model representation which is useful for a lowering
step for a backend. An example of this is
```
{'dtype_configs': [{'input_dtype': torch.quint8,
'output_dtype': torch.quint8}],
'observation_type': <ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT: 0>,
'pattern': <class 'torch.nn.modules.conv.ConvTranspose1d'>},
```
These configs are checked into master, and they are created with Python functions.
Therefore, there is no easy way for the user to see what the configs actually
are without running some Python code.
This PR is one approach to document these configs. Here is what this is doing:
1. during documentation build, write a text file of the configs
2. render that text file on a quantization page, with some additional context
In the future, this could be extended to autogenerate better looking tables
such as: op support per backend and dtype, op support per valid quantization settings per backend,
etc.
Test plan:
```
cd docs
make html
cd html
python -m http.server 8000
// render http://[::]:8000/quantization-backend-configuration.html
// it renders correctly
```
[ghstack-poisoned]
…umentation
Summary:
Quantization has a high volume of configurations of how to quantize an
op for a reference model representation which is useful for a lowering
step for a backend. An example of this is
```
{'dtype_configs': [{'input_dtype': torch.quint8,
'output_dtype': torch.quint8}],
'observation_type': <ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT: 0>,
'pattern': <class 'torch.nn.modules.conv.ConvTranspose1d'>},
```
These configs are checked into master, and they are created with Python functions.
Therefore, there is no easy way for the user to see what the configs actually
are without running some Python code.
This PR is one approach to document these configs. Here is what this is doing:
1. during documentation build, write a text file of the configs
2. render that text file on a quantization page, with some additional context
In the future, this could be extended to autogenerate better looking tables
such as: op support per backend and dtype, op support per valid quantization settings per backend,
etc.
Currently this is an RFC to ensure there aren't any security concerns with
rendering a text file. If there aren't, this could move to review by
quantization team.
Note: the content being rendered is a placeholder, for now. We need a better
looking print-out, with more context, before we can ship something like
this to users.
Test plan:
```
cd docs
make html
cd html
python -m http.server 8000
// render http://[::]:8000/quantization-backend-configuration.html
// it renders correctly
```
ghstack-source-id: f69b752
Pull Request resolved: #75126
|
@vkuzo has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
quantization: autogenerate quantization backend configs for documentation (#75126)
Reviewed By: ejguan
Differential Revision: D35365461
Pulled By: vkuzo
fbshipit-source-id: d60f776ccb57da9db3d09550e4b27bd5e725635a