-
Notifications
You must be signed in to change notification settings - Fork 26.3k
Dynamo+LTC: handle inplace ops #75359
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
[ghstack-poisoned]
🔗 Helpful links
💊 CI failures summary and remediationsAs of commit 171ea7a (more details on the Dr. CI page): 💚 💚 Looks good so far! There are no failures yet. 💚 💚 This comment was automatically generated by Dr. CI (expand for details).Please report bugs/suggestions to the (internal) Dr. CI Users group. |
| assert len(res) == len(args_and_out) | ||
| for i, arg in enumerate(args): | ||
| # only copy those tensors that get inplace updated | ||
| if arg is not res[i]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does this 'is' check actually check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does this 'is' check actually check?
So we compiled a graph for union(argument lazy tensors, returned lazy tensors). That graph will return (after duplication in case some tensors are duplicated) all the argument tensors at the beginning of the list. There are 2 cases
- some of those argument tensors may get inplace updated. In that case, the graph will return a new copy of the tensor. We need copy the content back to the argument
- other arguments may not get any inplace update. In that case, we can skip the copying.
Removing the 'is' check should also work but make things a bit slower.
For some models in torchbench (e.g. pyhpc_isoneutral_mixing), dynamo will generate Fx graphs that has side effects . Those graphs may - return an empty tuple - change tensors passed in as forward method arguments in-place This makes the Dynamo+LTC integration fail since we extract compiled graph based on the lazy tensors returned from the forward method. From an empty tuple, we extract nothing. To solve this problem, we extract compile graph from `union(argument lazy tensors, returned lazy tensors)` instead. The inplace mutations applied to argument lazy tensors will be captured this way. Test plan: ``` pytest test/lazy/test_extract_compiled_graph.py ``` ``` LTC_TS_CUDA=1 gpui time python torchbench.py --speedup-ltc -dcuda --nvfuser --randomize-input --only pyhpc_isoneutral_mixing ``` [ghstack-poisoned]
|
@shunting314 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Summary: Pull Request resolved: #75359 For some models in torchbench (e.g. pyhpc_isoneutral_mixing), dynamo will generate Fx graphs that has side effects . Those graphs may - return an empty tuple - change tensors passed in as forward method arguments in-place This makes the Dynamo+LTC integration fail since we extract compiled graph based on the lazy tensors returned from the forward method. From an empty tuple, we extract nothing. To solve this problem, we extract compile graph from `union(argument lazy tensors, returned lazy tensors)` instead. The inplace mutations applied to argument lazy tensors will be captured this way. Test Plan: ``` pytest test/lazy/test_extract_compiled_graph.py ``` ``` LTC_TS_CUDA=1 gpui time python torchbench.py --speedup-ltc -dcuda --nvfuser --randomize-input --only pyhpc_isoneutral_mixing ``` Reviewed By: ZolotukhinM Differential Revision: D35478799 Pulled By: shunting314 fbshipit-source-id: 8116768fc50fe7630e481e6039319ddf5c6a9416
|
Hey @shunting314. |
Stack from ghstack (oldest at bottom):
For some models in torchbench (e.g. pyhpc_isoneutral_mixing), dynamo will generate Fx graphs that has side effects . Those graphs may
This makes the Dynamo+LTC integration fail since we extract compiled graph based on the lazy tensors returned from the forward method. From an empty tuple, we extract nothing.
To solve this problem, we extract compile graph from
union(argument lazy tensors, returned lazy tensors)instead. The inplace mutations applied to argument lazy tensors will be captured this way.Test plan:
Differential Revision: D35478799