
Conversation

@titaiwangms
Collaborator

@titaiwangms titaiwangms commented Aug 10, 2022

Stack from ghstack (oldest at bottom):

Fix #82589
Why:

  1. **full_check** works in the `onnx::checker::check_model` function: it turns on **strict_mode** in `onnx::shape_inference::InferShapes()`, which I believe was the intention of this part of the code.
  2. **strict_mode** catches failed shape/type inference (a model that is invalid from the ONNX perspective), and ONNX Runtime can't run these invalid models, since it relies on ONNX shape/type inference to optimize the ONNX graph. Why don't we set it to True by default? Some existing users run ONNX models on other platforms, such as Caffe2, which don't require a fully valid ONNX model.
  3. This PR doesn't change the original behavior of `check_onnx_proto`; it adds a warning message for models that can't pass strict shape/type inference, saying those models would fail on ONNX Runtime (see the sketch below).
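
To make this concrete, here is a minimal sketch of the two checking modes using the public `onnx` Python API; the model and file name are illustrative, and the C++ checker that `torch.onnx.export` calls internally is assumed to mirror this `full_check` behavior:

```python
import torch
import onnx

# Export a toy model (illustrative; any exportable module works).
model = torch.nn.Linear(3, 4)
torch.onnx.export(model, torch.randn(2, 3), "linear.onnx")

onnx_model = onnx.load("linear.onnx")

# Structural check only: validates the proto against the ONNX spec.
onnx.checker.check_model(onnx_model)

# full_check=True additionally runs shape/type inference in strict mode,
# rejecting models whose shapes/types cannot be inferred; these are the
# same models that ONNX Runtime would later refuse to run.
onnx.checker.check_model(onnx_model, full_check=True)
```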

cc @EikanWang @jgong5 @wenzhe-nrv @sanchitintel @ezyang @gchanan

[ghstack-poisoned]
@facebook-github-bot
Contributor

facebook-github-bot commented Aug 10, 2022


✅ No Failures (0 Pending)

As of commit a5ceffa (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.
Please report bugs/suggestions to the (internal) Dr. CI Users group.

@facebook-github-bot facebook-github-bot added the oncall: jit label (Add this issue/PR to JIT oncall triage queue) Aug 10, 2022
@titaiwangms titaiwangms marked this pull request as draft August 10, 2022 16:57
@titaiwangms titaiwangms changed the title from "add full checker mode" to "[ONNX] Add full checker mode in torch.onnx.export" Aug 10, 2022
Why:
1. Currently, the full_check mode in `_C._check_onnx_proto` does nothing, as `onnx::shape_inference::InferShapes(model)` is already included in `onnx::checker::check_model(model)`.
2. **full_check** works in the `onnx::checker::check_model` function: it turns on **strict_mode** in `onnx::shape_inference::InferShapes()`, which I believe was the intention of this part of the code.
3. **strict_mode** catches failed shape/type inference (a model that is invalid from the ONNX perspective), and ONNX Runtime can't run these invalid models, since it relies on ONNX shape/type inference to optimize the ONNX graph. Why don't we set it to True by default? Some existing users run ONNX models on other platforms, such as Caffe2, which don't require a fully valid ONNX model.

[ghstack-poisoned]
titaiwangms added a commit that referenced this pull request Aug 10, 2022
ghstack-source-id: 076750e
Pull Request resolved: #83186
CI should pass after #83201 

[ghstack-poisoned]
@titaiwangms titaiwangms added the module: onnx (Related to torch.onnx), release notes: onnx, and topic: improvements labels Aug 11, 2022
titaiwangms added a commit that referenced this pull request Aug 11, 2022
Why:

ONNX had mismatched checker behavior between the C++ and Python APIs, which was later fixed by onnx/onnx#4386. Since `torch.onnx.export` uses the C++ checker for its graph-level check, this improvement should be picked up. This version bump also enables #83186 (see the sketch below).

[ghstack-poisoned]
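
For reference, a rough sketch of the two checker entry points involved; the module paths are assumed from the public `onnx` package (not taken from this PR), and the file name is illustrative:

```python
import onnx

model = onnx.load("model.onnx")

# Python API: full_check already forwarded to strict shape inference here.
onnx.checker.check_model(model, full_check=True)

# C++-backed path, which torch.onnx.export's graph-level check goes through;
# onnx/onnx#4386 aligned its full_check behavior with the Python API.
from onnx.onnx_cpp2py_export import checker as c_checker
c_checker.check_model(model.SerializeToString(), True)
```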
titaiwangms added a commit that referenced this pull request Dec 6, 2022
Why:

ONNX had mismatched checker behavior between the C++ and Python APIs, which was later fixed by onnx/onnx#4386. Since `torch.onnx.export` was using the C++ checker for its graph-level check with an older version of ONNX, this improvement should be picked up. This version bump also enables #83186.

Updated 12/5/2022:
This PR includes ONNX 1.13.0 pre-release (https://github.com/onnx/onnx/tree/rel-1.13.0)

cc VitalyFedyunin jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 gujinghui PenghuiCheng jianyuh min-jean-cho yanbing-j Guobing-Chen Xia-Weiwen mlazos soumith voznesenskym yanboliang penguinwu anijain2305 EikanWang chunyuan-w zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 desertfire

[ghstack-poisoned]
Fix #82589
Depends: #83201
Why:
1. **full_check** works in the `onnx::checker::check_model` function: it turns on **strict_mode** in `onnx::shape_inference::InferShapes()`, which I believe was the intention of this part of the code.
2. **strict_mode** catches failed shape/type inference (a model that is invalid from the ONNX perspective), and ONNX Runtime can't run these invalid models, since it relies on ONNX shape/type inference to optimize the ONNX graph. Why don't we set it to True by default? Some existing users run ONNX models on other platforms, such as Caffe2, which don't require a fully valid ONNX model.
3. This PR doesn't change the original behavior of `check_onnx_proto`; it adds a warning message for models that can't pass strict shape/type inference, saying those models would fail on ONNX Runtime (a sketch of observing this warning follows below).

[ghstack-poisoned]
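
A hypothetical way a user might observe the new warning (the names are illustrative, the exact warning text is not specified here, and a well-formed model like this one would not actually trigger it):

```python
import warnings

import torch

model = torch.nn.Linear(3, 4)   # illustrative module
dummy_input = torch.randn(2, 3)

# Export still succeeds even when strict shape/type inference fails;
# the failure is reported as a warning instead of an error.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    torch.onnx.export(model, dummy_input, "model.onnx")

for w in caught:
    print(w.category.__name__, w.message)
```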
@titaiwangms titaiwangms mentioned this pull request Dec 6, 2022
@github-actions
Contributor

github-actions bot commented Feb 5, 2023

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the Stale label Feb 5, 2023
@titaiwangms titaiwangms added no-stale and removed Stale labels Feb 5, 2023
pytorchmergebot pushed a commit that referenced this pull request Feb 8, 2023
ONNX had mismatched checker behavior between the C++ and Python APIs, which was later fixed by onnx/onnx#4386. Since `torch.onnx.export` was using the C++ checker for its graph-level check with an older version of ONNX, this improvement should be picked up. This version bump also enables #83186.

Updated 12/5/2022:
This PR includes ONNX 1.13.0 release (https://github.com/onnx/onnx/tree/rel-1.13.0)

For [CVE-2022-25882](https://nvd.nist.gov/vuln/detail/CVE-2022-25882)
Pull Request resolved: #90332
Approved by: https://github.com/kit1980, https://github.com/malfet
@titaiwangms
Collaborator Author

@pytorchbot merge -g

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Feb 8, 2023
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot
Collaborator

Merge failed

Reason: GraphQL query
fragment PRCheckSuites on CheckSuiteConnection {
edges {
node {
app {
name
databaseId
}
workflowRun {
workflow {
name
}
url
}
checkRuns(first: 50) {
nodes {
name
conclusion
detailsUrl
}
pageInfo {
endCursor
hasNextPage
}
}
conclusion
}
cursor
}
pageInfo {
hasNextPage
}
}

query ($owner: String!, $name: String!, $number: Int!, $cursor: String!) {
repository(name: $name, owner: $owner) {
pullRequest(number: $number) {
commits(last: 1) {
nodes {
commit {
oid
checkSuites(first: 10, after: $cursor) {
...PRCheckSuites
}
}
}
}
}
}
}
, args {'name': 'pytorch', 'owner': 'pytorch', 'number': 83186, 'cursor': 'Y3Vyc29yOnYyOpHPAAAAAoddcug='} failed: [{'message': 'Something went wrong while executing your query. Please include 07C3:7CCA:43256AB:88A6B31:63E4169C when reporting this issue.'}]

Details for Dev Infra team (raised by workflow job)

@titaiwangms
Collaborator Author

Merge failed

Reason: GraphQL query failed (same error as above).

Details for Dev Infra team

Hi @kit1980,

Do you know what issue this might be?

@kit1980
Contributor

kit1980 commented Feb 8, 2023

Looks like a GitHub hiccup?
Let's try again.

@kit1980
Contributor

kit1980 commented Feb 8, 2023

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

}

- void check_onnx_proto(const std::string& proto_string, bool full_check) {
+ void check_onnx_proto(const std::string& proto_string) {
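
For context, a hypothetical before/after for Python-side callers, inferred from the signature change above; the binding name `torch._C._check_onnx_proto` comes from the PR description, and the model path is a placeholder:

```python
import torch

# Placeholder: bytes of a serialized ONNX ModelProto.
proto = open("model.onnx", "rb").read()

# Before this PR, the caller opted in to the full (strict) check:
#   torch._C._check_onnx_proto(proto, True)

# After this PR, the parameter is gone: strict shape/type inference always
# runs inside the checker, and failures surface as a warning.
torch._C._check_onnx_proto(proto)
```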
Contributor

This changes public API, no?

Collaborator Author

I think it does. It's one of the TORCH_API functions. Is there anything else I should do?

Contributor

You need to follow https://github.com/pytorch/pytorch/wiki/PyTorch's-Python-Frontend-Backward-and-Forward-Compatibility-Policy
It basically says to first emit a warning for two releases, and only then delete.
But for this function specifically, I'm not sure it's even used by anyone.
We may need to revert if there are complaints.


@titaiwangms titaiwangms (Collaborator Author) commented Feb 8, 2023

Thanks! Users should most likely be using the onnx API from onnx repo.

Contributor

I've added bc breaking labels.

@kit1980 kit1980 added the module: bc-breaking and topic: bc breaking labels, then removed the module: bc-breaking label Feb 8, 2023
@facebook-github-bot facebook-github-bot deleted the gh/AllenTiTaiWang/4/head branch June 8, 2023 14:22

Labels

ciflow/trunk
cla signed
Merged
module: onnx
no-stale
oncall: jit
open source
release notes: onnx
topic: bc breaking
topic: improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[ONNX] Exporter successfully exports an ONNX model, but it is unexecutable on ONNXRUNTIME due to failed shape inference.