Description
It would be useful to have the ability in WebGPU to query the available/recommended maximum GPU memory, e.g., to determine whether (and how much of) a local LLM can be offloaded to the GPU. Apologies if this has been discussed before; I couldn't find any existing issues. I also realize there are security considerations around this (https://www.w3.org/TR/webgpu/#security-memory-resources), but I think the use cases are important enough that it's worth figuring out whether it's possible. For example, WebGPU could follow the convention of JavaScript's Device Memory API and only expose very rough thresholds, which would still be useful for applications.
Right now, the value I am using is maxBufferSize, but I don't think it is always representative of the memory actually available. I'd also appreciate pointers to another method if there is an accepted one.
There are ways to expose this information on the existing backends:
Metal
- recommendedMaxWorkingSetSize for the total memory.
- optionally, currentAllocatedSize to avoid multiple applications clobbering each other (sketch below).
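
To illustrate, here is a minimal sketch of those two queries. I'm using Apple's metal-cpp C++ bindings purely for illustration (the Objective-C properties have the same names), and assuming a macOS-class device where recommendedMaxWorkingSetSize is available:

```cpp
// Minimal sketch using metal-cpp. In exactly one translation unit, define
// NS_PRIVATE_IMPLEMENTATION and MTL_PRIVATE_IMPLEMENTATION before this include.
#include <Metal/Metal.hpp>
#include <cstdint>
#include <cstdio>

int main() {
    MTL::Device* device = MTL::CreateSystemDefaultDevice();
    if (!device) return 1;

    // Memory budget Metal recommends staying under for this device, in bytes.
    uint64_t recommended = device->recommendedMaxWorkingSetSize();
    // Memory this device object has currently allocated for its resources, in bytes.
    uint64_t allocated = device->currentAllocatedSize();

    std::printf("recommended working set: %llu bytes, currently allocated: %llu bytes\n",
                (unsigned long long)recommended, (unsigned long long)allocated);

    device->release();
    return 0;
}
```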
Vulkan
- vkGetPhysicalDeviceMemoryProperties and examining the returned heaps for the total memory.
- optionally, using VkPhysicalDeviceMemoryBudgetPropertiesEXT to take into account existing allocations.
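
On Vulkan, the two bullets above combine into a single vkGetPhysicalDeviceMemoryProperties2 call with the budget struct chained in. A rough sketch, assuming the driver reports VK_EXT_memory_budget (drop the pNext chain otherwise):

```cpp
#include <vulkan/vulkan.h>
#include <cstdio>

// Assumes `physicalDevice` was already selected and that the driver supports
// VK_EXT_memory_budget (check vkEnumerateDeviceExtensionProperties first).
void printHeapBudgets(VkPhysicalDevice physicalDevice) {
    VkPhysicalDeviceMemoryBudgetPropertiesEXT budget{};
    budget.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MEMORY_BUDGET_PROPERTIES_EXT;

    VkPhysicalDeviceMemoryProperties2 props{};
    props.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MEMORY_PROPERTIES_2;
    props.pNext = &budget;

    vkGetPhysicalDeviceMemoryProperties2(physicalDevice, &props);

    for (uint32_t i = 0; i < props.memoryProperties.memoryHeapCount; ++i) {
        const VkMemoryHeap& heap = props.memoryProperties.memoryHeaps[i];
        // heap.size is the total size; heapBudget/heapUsage reflect current conditions.
        std::printf("heap %u: size=%llu budget=%llu usage=%llu device-local=%d\n", i,
                    (unsigned long long)heap.size,
                    (unsigned long long)budget.heapBudget[i],
                    (unsigned long long)budget.heapUsage[i],
                    (heap.flags & VK_MEMORY_HEAP_DEVICE_LOCAL_BIT) ? 1 : 0);
    }
}
```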
DirectX
I'm not as familiar with DirectX, but I think something like DXGI_ADAPTER_DESC3 would expose enough information to get a rough idea.
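
For completeness, a rough sketch of the DXGI side: DXGI_ADAPTER_DESC3 has the static sizes, and IDXGIAdapter3::QueryVideoMemoryInfo additionally reports the OS-managed budget and current usage, which would cover the "existing allocations" case. This assumes Windows 10 1803+ for IDXGIFactory6:

```cpp
#include <dxgi1_6.h>
#include <wrl/client.h>
#include <cstdio>

#pragma comment(lib, "dxgi.lib")

using Microsoft::WRL::ComPtr;

int main() {
    ComPtr<IDXGIFactory6> factory;
    if (FAILED(CreateDXGIFactory1(IID_PPV_ARGS(&factory)))) return 1;

    ComPtr<IDXGIAdapter4> adapter;
    if (FAILED(factory->EnumAdapterByGpuPreference(
            0, DXGI_GPU_PREFERENCE_HIGH_PERFORMANCE, IID_PPV_ARGS(&adapter))))
        return 1;

    // Static adapter memory sizes (dedicated VRAM vs. shared system memory).
    DXGI_ADAPTER_DESC3 desc{};
    adapter->GetDesc3(&desc);
    std::printf("dedicated VRAM: %llu bytes, shared system memory: %llu bytes\n",
                (unsigned long long)desc.DedicatedVideoMemory,
                (unsigned long long)desc.SharedSystemMemory);

    // OS-provided budget/usage for the local (device-attached) segment,
    // roughly analogous to VK_EXT_memory_budget.
    DXGI_QUERY_VIDEO_MEMORY_INFO info{};
    if (SUCCEEDED(adapter->QueryVideoMemoryInfo(0, DXGI_MEMORY_SEGMENT_GROUP_LOCAL, &info)))
        std::printf("budget: %llu bytes, current usage: %llu bytes\n",
                    (unsigned long long)info.Budget,
                    (unsigned long long)info.CurrentUsage);
    return 0;
}
```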