Correctly set max_numwarps in coordinate_descent_tuner #159146
jataylo wants to merge 8 commits into pytorch:main
Conversation
🔗 Helpful Links 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/159146
Note: Links to docs will display an error until the docs builds have been completed. ✅ No Failures as of commit 9182ff2 with merge base cf6d089. This comment was automatically generated by Dr. CI and updates every 15 minutes.
…ing (#2416) pytorch#159146 (cherry picked from commit be95f40)
…#2421) Relands ROCm#2416 with caching fix Upstream equivalent pytorch#159146 --------- Co-authored-by: Jithun Nair <37884920+jithunnair-amd@users.noreply.github.com> (cherry picked from commit f0aebdc)
Hello @shunting314 @davidberard98 @jeffdaily! We would like to have this merged prior to the release/2.9 cut. The main issue is that the linter is wary of lru_cache usage because of potential memory leaks. We've seen it used in other places; is there a way to merge this with lru_cache? Or maybe you know another convenient approach, since the decorated function will be called a lot. Regards, Iurii.
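For context, a minimal illustration of the pattern such linters flag (the names here are hypothetical, not the tuner's actual code): applying lru_cache directly to a method caches `self` in the keys, so instances stay reachable from the cache for as long as it lives.

```python
import functools


class Tuner:
    # Hypothetical example of the flagged pattern: because the lru_cache keys
    # include `self`, every Tuner instance that calls this method remains
    # reachable from the class-level cache, which can look like a memory leak.
    @functools.lru_cache(maxsize=None)
    def max_num_warps(self, device_index: int) -> int:
        return 1024 // 32  # placeholder computation
```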
@iupaikov-amd can you separate the function from the class (i.e. make it a free function instead of a method) and apply lru_cache on that function instead?
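A rough sketch of that suggestion, with hypothetical names (the real helper and its signature in coordinate_descent_tuner may differ): the cached function lives at module level and is keyed only on the device index, so no tuner instance is held by the cache.

```python
import functools

import torch


@functools.lru_cache(maxsize=None)
def _cached_max_num_warps(device_index: int = 0) -> int:
    # Free function: the cache key is just the device index, not `self`.
    # Assumes a CUDA/ROCm device is present.
    props = torch.cuda.get_device_properties(device_index)
    warp_size = getattr(props, "warp_size", 32)  # 64 on most ROCm GPUs
    return 1024 // warp_size


class CoordescTuner:
    # The method keeps its public shape and simply delegates to the cached helper.
    def get_max_num_warps(self, device_index: int = 0) -> int:
        return _cached_max_num_warps(device_index)
```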
@pytorchbot rebase |
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.

Successfully rebased; force-pushed from 3c04d8d to 8758e6b.
@pytorchbot rebase |
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.

Successfully rebased; force-pushed from e676a49 to 368950e.
Cleaned this up, opening for review.
@pytorchbot rebase |
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.

Rebase failed due to Command. Raised by https://github.com/pytorch/pytorch/actions/runs/19819289174
@pytorchbot merge |
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Current max_numwarps is incorrect on ROCm as warp_size is not taken into account. This PR resolves this and handles it in a non-hardcoded way using device props when available. Pull Request resolved: #159146 Approved by: https://github.com/jansel, https://github.com/shunting314
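As a rough illustration of the description above (the constant names and exact cap are assumptions, not the actual inductor code), the cap follows from dividing the per-block thread limit by the warp size reported in the device properties:

```python
import torch

# Assumed for illustration; the real tuner obtains these through inductor's
# device-properties plumbing rather than hardcoding them here.
MAX_THREADS_PER_BLOCK = 1024
DEFAULT_WARP_SIZE = 32


def max_num_warps(device: torch.device) -> int:
    # On CUDA GPUs warp_size is 32, so this gives 1024 // 32 = 32 warps.
    # On most ROCm GPUs warp_size is 64, so the correct cap is 16; a
    # hardcoded 32 would imply 2048 threads per block, which is invalid.
    warp_size = DEFAULT_WARP_SIZE
    if device.type == "cuda" and torch.cuda.is_available():
        props = torch.cuda.get_device_properties(device)
        warp_size = getattr(props, "warp_size", DEFAULT_WARP_SIZE)
    return MAX_THREADS_PER_BLOCK // warp_size
```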
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben