PyTorch on ROCm v6.5.0rc (gfx1151 / AMD Strix Halo / Ryzen AI Max+ 395) Detecting Only 15.49GB VRAM Despite 96GB Usable #5152
Replies: 3 comments
-
I'm having the exact same issue. Can anyone please offer some advice on what to try, what info to grab, etc.? It would be fantastic not to have NVIDIA as the only working option.
-
My rocminfo shows 4 pools and all are full sized:
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: FINE GRAINED
Size: 131159480(0x7d155b8) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
This is using ROCm 6.4.3, where it shows 4 pools, and TheRock/ROCm 7.0 nightly, where it shows 3 pools. I don't know about your exact setup, but some recommendations:
* Use the latest kernel. I'm currently using 6.17.0-rc1-1-mainline, but 6.15+ should be fine; newer kernels have more fixes, though.
* Use the latest linux-firmware - there are some major fixes that were only recently upstreamed; really, the more up-to-date the better.
* ROCm 6.4.1, I believe, was the first release with minimal gfx1151 support, and I believe 6.4.2 or 6.4.3 was the first to introduce hipBLASLt for gfx1151. If your distro doesn't have up-to-date packages, I've found TheRock/ROCm nightlies (either via tarball or the pip helper) to be the best way to get gfx1151 ROCm support: https://github.com/ROCm/TheRock/blob/main/RELEASES.md
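For scale, the `Size:` value in the pool listing above is in KiB, so that pool works out to roughly 125 GiB, which looks like nearly all of a 128 GiB system (my assumption about the machine size). A one-line sanity check in Python:

```python
# rocminfo reports "Size:" in KiB; convert the pool above to GiB.
size_kib = 131159480
print(f"{size_kib / 1024**2:.2f} GiB")  # 125.08 GiB, vs ~15.49 GiB on the affected setup
```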
I've been writing up some docs here: https://strixhalo-homelab.d7.wtf/AI/AI-Capabilities-Overview but for advanced usage/WIP notes, you can check some of my original working notes when I was poking around directly: https://llm-tracker.info/_TOORG/Strix-Halo |
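To confirm what a given PyTorch build actually sees on gfx1151, a minimal sketch along these lines can help (torch.version.hip is only populated on ROCm builds, and gcnArchName may be absent on older wheels, hence the getattr):

```python
# Minimal sketch: report the HIP runtime, GPU arch, and memory PyTorch sees.
import torch

print("torch:", torch.__version__)
print("HIP:", torch.version.hip)  # None on CUDA builds

props = torch.cuda.get_device_properties(0)
print("device:", props.name)
print("arch:", getattr(props, "gcnArchName", "n/a"))  # expect gfx1151
print(f"total_memory: {props.total_memory / 1024**3:.2f} GiB")

free, total = torch.cuda.mem_get_info(0)
print(f"free/total: {free / 1024**3:.2f} / {total / 1024**3:.2f} GiB")
```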
-
Leonard,
Thank you for this information; we will try the suggestions and report back.
At this point, we have PyTorch and Python working without issue, and we have managed to get Qwen 2.5 VL 32B running and fine-tuning.
Our last hurdle is getting MMDetection running properly.
We will report back soon, and if you've heard of anyone getting MMDetection running, please let us know.
Thanks again,
Trevor Chandler
-
Hi ROCm Team,
I'm running into an issue where PyTorch built for ROCm (v6.5.0rc from [scottt/rocm-TheRock](https://github.com/scottt/rocm-TheRock/releases/tag/v6.5.0rc-pytorch)) on an AMD Strix Halo machine (gfx1151) is only detecting 15.49 GB of VRAM, even though ROCm and rocm-smi report 96 GB of VRAM available.
❯ System Setup:
(checked with rocm-smi, rocminfo, and glxinfo)
❯ rocm-smi VRAM Report:
command:
output:
❯ rocminfo Output Summary:
GPU Agent (gfx1151) reports two global memory pools:
So from ROCm's HSA agent side, only about 15.49 GB is visible for each global segment. But rocm-smi and glxinfo show 96 GB as accessible.
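For anyone gathering the same logs, one compact way to pull the reported sizes out of rocminfo; a sketch, assuming the `Size: <n>(0x...) KB` line format shown in this thread:

```python
# Sketch: list every "Size:" entry rocminfo reports, converted to GiB.
# Note: this also picks up cache sizes; filter to the "Pool Info"
# sections if you only want memory pools.
import re
import subprocess

out = subprocess.run(["rocminfo"], capture_output=True, text=True).stdout
for kib in (int(m) for m in re.findall(r"Size:\s*(\d+)\(", out)):
    print(f"size: {kib / 1024**2:.2f} GiB")
```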
❯ glxinfo:
command:
output:
❯ PyTorch VRAM Check (via torch.cuda.get_device_properties(0).total_memory):
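A minimal version of that check:

```python
import torch

props = torch.cuda.get_device_properties(0)
print(f"{props.name}: {props.total_memory / 1024**3:.2f} GiB")
# Reports ~15.49 GiB here instead of the expected 96 GiB.
```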
❯ Full Python Test Output:
❯ Questions / Clarifications:
Why do the HSA agent's pools (and therefore PyTorch) expose only ~15.49 GB when rocm-smi and glxinfo clearly indicate that 96 GB is present and usable?
Happy to provide any additional logs or test specific builds if needed. This GPU is highly promising for a wide range of applications, and I plan to use it to train models.
Thanks for the great work on ROCm so far!