Add binary to benchmark model load speed #74700
CI failures summary and remediations (Dr. CI): as of commit 1e13a90, there are no failures yet.
@SS-JIA has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Differential Revision: [D35124881](https://our.internmc.facebook.com/intern/diff/D35124881)
#include <vector>

#include <ATen/ATen.h>
#include "caffe2/core/timer.h"
Can we re-organize the order of includes?
Summary:
Pull Request resolved: #74700

Test Plan: Imported from OSS

Some results running this benchmark for a quantized CPU xirp14b model on a Pixel 5:

```
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "46749"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "19261"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "19235"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "19396"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "19486"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "19562"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "19566"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "19559"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "19632"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "19938"}
```

Some results running this benchmark for the Vulkan xirp20a model on Pixel 5, after pre-loading the Context:

```
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "38664"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "19921"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "20316"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "20255"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "20219"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "20329"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "20463"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "21072"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "20668"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "20889"}
```

Without pre-loading Context:

```
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "70850"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "19867"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "20211"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "20039"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "20082"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "20268"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "20363"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "21103"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "20511"}
PyTorchObserver {"type": "NET", "unit": "us", "metric": "latency", "value": "20528"}
```

Reviewed By: mrshenli

Differential Revision: D35124881

Pulled By: SS-JIA

fbshipit-source-id: 0f093e4aa45d69c538a4fe2003e0d5617d72b97a
The same summary and results appear again on the cherry-picked commit (cherry picked from commit 96f9914).
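For readers who want to see how numbers like the ones above might be produced, here is a minimal sketch of a load-timing loop. It is not the code from this PR: the diff includes `caffe2/core/timer.h`, whereas this sketch swaps in `std::chrono::steady_clock` so that it builds without PyTorch. The names `format_observer_line` and `time_loads` are hypothetical, and the loader is passed in as a callable because the actual model-loading call is not shown in this conversation. The output format is copied from the `PyTorchObserver` lines in the test plan.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: formats one latency sample (in microseconds) in the
// same shape as the PyTorchObserver lines quoted in the test plan.
std::string format_observer_line(std::int64_t latency_us) {
  std::ostringstream out;
  out << "PyTorchObserver {\"type\": \"NET\", \"unit\": \"us\", "
      << "\"metric\": \"latency\", \"value\": \"" << latency_us << "\"}";
  return out.str();
}

// Hypothetical helper: calls `load_fn` `iterations` times and returns the
// wall-clock duration of each call in microseconds. The first sample
// includes any one-time setup cost; later samples are "warm" loads.
std::vector<std::int64_t> time_loads(const std::function<void()>& load_fn,
                                     int iterations) {
  std::vector<std::int64_t> samples;
  if (iterations <= 0) {
    return samples;  // nothing to measure
  }
  samples.reserve(static_cast<std::size_t>(iterations));
  for (int i = 0; i < iterations; ++i) {
    const auto start = std::chrono::steady_clock::now();
    load_fn();  // assumption: this wraps whatever model-loading call is measured
    const auto end = std::chrono::steady_clock::now();
    samples.push_back(
        std::chrono::duration_cast<std::chrono::microseconds>(end - start)
            .count());
  }
  return samples;
}
```

A caller would wrap the real loader in a lambda, run `time_loads(loader, 10)`, and print `format_observer_line(sample)` for each returned sample. Keeping the first sample separate from the rest matters when reading the results: in the runs above, the first load is roughly two to three and a half times slower than the later ones, and pre-loading the Vulkan Context lowers only that first value (38664 us versus 70850 us).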
ghstack-source-id: c34962a Pull Request resolved: pytorch/pytorch#74700