[inductor][ez] add overridable env var for disabling fx graph cache #166138
shunting314 wants to merge 1 commit into gh/shunting314/245/base
Conversation
I set TORCHINDUCTOR_FX_GRAPH_CACHE=0 a lot to make sure compilation actually happens by disabling fx graph caching. I even put this in my .bashrc. But this causes a simple vllm script to fail: https://gist.github.com/shunting314/4253b2b5ab5e7d1b0fc9516c84054904 Error log: https://gist.github.com/shunting314/1d04bbeb58bc486f975684f56d65615d

The root cause:
1. vllm patches inductor_config.fx_graph_cache to True here: https://github.com/vllm-project/vllm/blob/e255d929902dcf8968541d2cbf0d18f0fe3f9c49/vllm/compilation/compiler_interface.py#L308 The code in vllm relies on the fx graph cache being on (unless VLLM_DISABLE_COMPILE_CACHE is overridden).
2. Setting TORCHINDUCTOR_FX_GRAPH_CACHE=0 makes inductor_config.fx_graph_cache non-overridable.

I add TORCHINDUCTOR_FX_GRAPH_CACHE_DEFAULT so that we can still use it to skip the fx graph cache while still allowing projects like vllm to override it.
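The precedence change can be sketched with a tiny helper. This is a hypothetical illustration of the two env vars' semantics, not Inductor's actual config code:

```python
def resolve_fx_graph_cache(env, patched=None):
    # Hypothetical sketch of the precedence rules, not PyTorch's real code.
    # TORCHINDUCTOR_FX_GRAPH_CACHE, when set, wins over any config patch --
    # this is what silently defeated vllm's patch to True.
    hard = env.get("TORCHINDUCTOR_FX_GRAPH_CACHE")
    if hard is not None:
        return hard == "1"
    # TORCHINDUCTOR_FX_GRAPH_CACHE_DEFAULT only seeds the default, so a
    # later config patch (like vllm's) still takes effect.
    default = env.get("TORCHINDUCTOR_FX_GRAPH_CACHE_DEFAULT", "1") == "1"
    return default if patched is None else patched

# Old env var: vllm's patch to True is ignored, the cache stays off.
print(resolve_fx_graph_cache({"TORCHINDUCTOR_FX_GRAPH_CACHE": "0"}, patched=True))
# New env var: the cache is off by default, but vllm's patch turns it back on.
print(resolve_fx_graph_cache({"TORCHINDUCTOR_FX_GRAPH_CACHE_DEFAULT": "0"}, patched=True))
```

With the `_DEFAULT` variant, an unset patch still yields the disabled-by-default behavior the env var asks for, so the .bashrc use case keeps working.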
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/166138
Note: Links to docs will display an error until the docs builds have been completed. ✅ You can merge normally! (1 Unrelated Failure) As of commit edd15b3 with merge base e20c9bf. UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
In addition to 'VLLM_DISABLE_COMPILE_CACHE=0', we can also remove the cached code under '.cache/vllm/'.
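For reference, clearing that directory can be scripted. The path below assumes vllm's default cache location under the home directory, as the comment suggests; adjust it if your setup relocates the cache:

```python
import shutil
from pathlib import Path

# Assumed default vllm cache location (from the comment above);
# adjust if your environment relocates the cache.
cache_dir = Path.home() / ".cache" / "vllm"

# Remove the compiled-code cache so the next run recompiles from scratch.
# ignore_errors=True makes this a no-op when the cache does not exist.
shutil.rmtree(cache_dir, ignore_errors=True)
```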
|
@BoyuanFeng that triggers compilation (rather than loading from saved artifacts), but this line will still fail since the compiled result is not saved to the cache: https://github.com/vllm-project/vllm/blob/e255d929902dcf8968541d2cbf0d18f0fe3f9c49/vllm/compilation/compiler_interface.py#L476
|
A setting like TORCHINDUCTOR_FX_GRAPH_CACHE=0 forces the fx graph cache off in a way that downstream code cannot override. The new env var avoids that and is safe to put in .bashrc.
|
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
@pytorchbot merge -i

Merge started. Your change will be merged while ignoring the following 1 check: trunk / linux-jammy-py3-clang12-executorch / test (executorch, 1, 1, lf.linux.2xlarge, unstable). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
@pytorchbot merge -i

Merge started. Your change will be merged while ignoring the following 1 check: trunk / linux-jammy-py3-clang12-executorch / test (executorch, 1, 1, lf.linux.2xlarge, unstable). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
@pytorchbot merge -f 'force merge since it fails for so many times..." |
❌ 🤖 pytorchbot command failed:
@pytorchbot merge -f 'force merge since it fails for so many times...' |
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben