
Add device.simulateLoss(), and prevent mappedAtCreation on destroyed devices #5115

Draft
kainino0x wants to merge 1 commit into gpuweb:main from kainino0x:simulateloss

Conversation

kainino0x (Contributor) commented Mar 21, 2025

EDIT: I propose having both destroy() and simulateLoss() because they are useful for different things: destroy() to clean up resources easily during shutdown, and simulateLoss() to test application behavior on device loss.

Issue: fixes #5102 (see there for discussion and past minutes), fixes #4177
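To make the intended division of labor concrete, here is a minimal TypeScript sketch (my own illustration, assuming simulateLoss() ships roughly as proposed here; it is not an existing API, and the recovery callback is hypothetical):

```ts
// destroy() for teardown: release all of the device's resources now.
function shutdownRenderer(device: GPUDevice) {
  device.destroy(); // frees buffers/textures/etc.; device.lost still resolves
}

// simulateLoss() for testing: exercise the app's recovery path as if the
// device had been lost by the system (driver reset, GPU process crash, ...).
async function testLossRecovery(
  device: GPUDevice,
  recreate: () => Promise<GPUDevice>, // hypothetical app-provided recovery hook
) {
  const lossPromise = device.lost;
  (device as any).simulateLoss?.(); // proposed method; cast because it isn't in shipped types
  const info = await lossPromise;
  console.log('device lost:', info.reason, info.message);
  return recreate();
}
```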

mwyrzykowski left a comment

I think this attempts to work around limitations or inconsistencies in the API by introducing another API function on the GPUDevice. Instead we should address the limitations or inconsistencies without adding new API.

Specifically for the getMappedRange case on a lost device, should we just align the behavior to mapAsync when it is known the device is lost on the content timeline?

kainino0x (Contributor, Author) replied:

> I think this attempts to work around limitations or inconsistencies in the API by introducing another API function on the GPUDevice. Instead we should address the limitations or inconsistencies without adding new API.

No, my claim is that destroy() and simulateLoss() are both useful, for different things: destroy() to clean up resources easily during shutdown, and simulateLoss() to test application behavior on device loss.

> Specifically for the getMappedRange case on a lost device, should we just align the behavior to mapAsync when it is known the device is lost on the content timeline?

This would go against #1629. Of course, mapAsync already does, but I think that's more OK because it's async. That said, we did leave open the possibility of making mapAsync work on lost devices, too: #1629 (comment)
I think we mainly avoided it to simplify implementations, though I don't think the implementation is actually complicated.
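To spell out the contrast being discussed, a sketch (my own illustration; the error behavior on a lost device reflects my reading of the current spec and issue #5102, not text from this PR):

```ts
async function demo(device: GPUDevice, data: Float32Array) {
  // mappedAtCreation + getMappedRange(): content-timeline operations which,
  // per the linked issue, the current spec requires to keep working even after
  // the device is destroyed (this PR changes that for destroyed devices).
  const upload = device.createBuffer({
    size: data.byteLength,
    usage: GPUBufferUsage.COPY_SRC,
    mappedAtCreation: true,
  });
  new Float32Array(upload.getMappedRange()).set(data);
  upload.unmap();

  // mapAsync(): involves the device timeline, so the promise rejects once the
  // device is lost; loss becomes observable here, though only asynchronously.
  const readback = device.createBuffer({
    size: data.byteLength,
    usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST,
  });
  try {
    await readback.mapAsync(GPUMapMode.READ);
  } catch {
    // Rejected; the app can treat this the same as any other loss signal.
  }
}
```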

mwyrzykowski left a comment

I may not be able to join the meeting today, but API calls which exist only for testing application behavior don't seem appropriate for inclusion in the specification. destroy() already results in device loss, so that seems sufficient for testing device loss if needed.

kainino0x (Contributor, Author):

I don't think it's a problem to provide things that are mainly for testing. Examples:

  • The Web platform requires all platform exception types to be user-constructible. AFAIK the only reason this is globally required is for testing purposes.
  • We probably wouldn't need the ability to catch validation errors using error scopes if not for testing. (See this doc)
  • WebGL has WEBGL_lose_context.

Applications need to be able to test their code in the standard web platform. I don't think it would be reasonable if any or all of these capabilities were hidden behind some special browser flags: code couldn't be tested under the same platform that runs in production, plus everyone who writes software for the web would need to know about this.
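For illustration, here is a hedged sketch of those existing precedents (my own example code, not from the discussion):

```ts
// WebGL: WEBGL_lose_context lets a page force and then restore context loss.
const gl = document.createElement('canvas').getContext('webgl')!;
const loseExt = gl.getExtension('WEBGL_lose_context');
loseExt?.loseContext();    // fires 'webglcontextlost' on the canvas
loseExt?.restoreContext();

// Platform exception types are user-constructible, so tests can synthesize them.
const fakeAbort = new DOMException('simulated failure', 'AbortError');

// WebGPU: error scopes let a test assert that invalid usage is actually reported.
async function expectValidationError(device: GPUDevice, fn: () => void) {
  device.pushErrorScope('validation');
  fn();
  const error = await device.popErrorScope();
  if (!error) throw new Error('expected a validation error, got none');
}
```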

kainino0x (Contributor, Author):

Re: @kdashg's proposal that we change device.destroy()'s behavior so that it doesn't unmap buffers (instead of adding a new thing).

This is possible, and it shouldn't have much direct impact on application behavior. However, a lot of WebGPU applications push the resource limits of the system, so they may be implicitly relying on device.destroy() to clean up memory used by mappings.

I don't think this will be a common problem, because mappings generally shouldn't live that long anyway. But patterns like the "queue of mapped buffers" used for data upload (sketched below) could keep several large mappings alive that would no longer be cleaned up promptly until applications update their code to release them explicitly.
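For concreteness, a rough sketch of that upload pattern (my own illustration of a common approach, not code from any particular application; the class and method names are made up):

```ts
// A pool of staging buffers that are kept mapped between uses, so writes can
// start immediately without waiting on mapAsync().
class StagingPool {
  private ready: GPUBuffer[] = [];
  constructor(private device: GPUDevice, private size: number) {}

  // Returns a buffer that is already mapped and ready to be written.
  acquire(): GPUBuffer {
    return (
      this.ready.pop() ??
      this.device.createBuffer({
        size: this.size,
        usage: GPUBufferUsage.MAP_WRITE | GPUBufferUsage.COPY_SRC,
        mappedAtCreation: true, // starts out mapped
      })
    );
  }

  // After the copy using this buffer is submitted, re-map it and return it to
  // the pool once the mapping resolves.
  recycle(buf: GPUBuffer) {
    buf.mapAsync(GPUMapMode.WRITE).then(() => this.ready.push(buf));
  }
}
// Today, device.destroy() unmaps these buffers and frees their memory. If
// destroy() stopped unmapping, such pools would keep their mappings alive
// until the application unmaps or drops them itself.
```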

kainino0x (Contributor, Author):

Another side note: I was wondering whether there's any conflict with triple-mapping, i.e. whether keeping the mapping alive could have costs beyond the raw memory allocation. I think there is no direct problem, since it always has to be safe for the entire GPU process to crash anyway. But triple-mapped buffers could be allocated in physical memory spaces that are more constrained than regular mappings, which would make prompt cleanup a bit more important.

Kangz (Contributor) commented Mar 27, 2025

GPU Web WG 2025-03-25/26 Pacific-time
  • KN: addresses last week's discussion.
  • GT: think simulateLoss might be good, but not sure I know enough about how loss works to know how to use it. In WebGL loss can happen at any command. Adding mappedAtCreation failing will throw everything off.
  • KN: no. Only prevents it on devices that've been destroyed. simulateLoss does the same thing as a natural device loss. Wouldn't stop you from creating buffers mappedAtCreation. Both stop you from mapping stuff asynchronously. Bit unfortunate. Talked about faking the mapping if the device is lost; didn't do that because it'd be more work for impls to fake mappings.
  • KN: don't think it's very hard to do so might want to consider doing it at some point.
  • KG: my concerns are half-similar to Mike's on the PR. Would rather have simulateLoss and not destroy() - that's what we have in WebGL. Think destroy is less important than device loss.
  • KN: only difference is in buffer mapping, which doesn't exist in WebGL.
  • KG: if you wanted that to not happen, try harder in the impl, I'd say. More important for impls to figure out whether device loss will cause problems with the app than to make it slightly simpler for them to make a mapped buffer when the other side's destroyed the context. Bunch of ways you can monkey-patch and implement destroy yourself. Most important to give you the things you can't do yourself.
  • KN: would you then propose we change the current behavior of destroy() so it doesn't unmap buffers?
  • KG: I think that'd be great.
  • KR: need to talk with partners, make sure that they aren't surprised that we aren't cleaning up their memory. Google Meet has already raised this issue at the Wasm level.
  • KN: I'll write something up.

kainino0x added the needs-cts-issue label on Mar 28, 2025
Kangz modified the milestones: Milestone 1, Milestone 2 on Oct 2, 2025

Labels

  • api (WebGPU API)
  • needs-cts-issue (This change requires tests (or would need tests if accepted), but may not have a CTS issue filed yet)
  • proposal

Projects

None yet

Development

Successfully merging this pull request may close these issues.

  • Spec requires mappedAtCreation to still work after device is destroyed
  • device.destroy() cannot simulate real device loss

3 participants