GPU: Ignore VK_ERROR_SURFACE_LOST_KHR returned by vkQueuePresentKHR() #14683

999pingGG · 2025-12-20T20:00:55Z

sigh Here we go again. This is a follow-up to my previous PR. The comment in the code explains it all. Because I had been testing in debug mode with a simple clear screen app, I didn't notice this happened. This time I've thoroughly tested and I can switch apps, use picture-in-picture mode, resize the window and everything is good.

thatcosmonaut · 2025-12-20T20:38:21Z

Surely there's a better way that apps are supposed to handle this? Ignoring surface errors and hoping the swapchain gets recreated later smells really fishy to me.

999pingGG · 2025-12-20T21:13:05Z

I thought about that, but it would have to happen somehow after acquiring the swapchain texture and before submitting the command buffer. When you successfully acquire an image you reasonably assume you will be able to present it, but it's not the case here because Android destroys the surface asynchronously. The way to do it like you describe would probably be to asynchronously signal a flag when the activity generates a surface destroyed event and checking it before presenting, but I don't think it's practical to do that. EDIT: At least not with the current SDL GPU API.

999pingGG · 2025-12-20T21:23:20Z

Another possibly "more robust" way to work around this would be to ignore the error just once when it happens; two surface lost errors in a row is a hard error. Tho I agree all of this is hacky and I wish there was a better option.

thatcosmonaut · 2025-12-20T21:32:09Z

Could we use AddEventWatch to detect the surface destroy event? We're using that to flag when the swapchain should be recreated on resize, so maybe we could detect it and cancel presenting. Mostly I'd like to avoid treating errors as expected occurrences, that could lead to all kinds of problems.

999pingGG · 2025-12-20T23:10:47Z

I just did some tests. Adding an event watch doesn't help because the app's thread acquires the swapchain texture > records command buffer > submits. No chance for event handlers to run. But I see messages from SDL in the Android logcat, presumably from another thread, showing a sequence roughly like this when the app goes into the background: App acquires swapchain texture > SDL logs onPause() > surfaceDestroyed() > nativePause() > onTrimMemory() > onStop() > vkQueuePresentKHR VK_ERROR_SURFACE_LOST_KHR. So maybe there's a good way after all, but requires bigger changes which I'm not sure how to best implement. Can someone give some pointers?
EDIT: To be clear, those messages come from the SDL Java code. SDLSurface.java in particular has the function surfaceDestroyed() that calls onNativeSurfaceDestroyed() in SDL_android.c. I think we should signal SDL GPU from here somehow, using a mutex. How exactly could this be done?

thatcosmonaut · 2025-12-21T01:59:22Z

Android's awfulness continues to amaze me.

OK, since we've established that Android can destroy our presentation surface literally any time it wants, how about this: if we get SURFACE_LOST_KHR, we clean the command buffer and do not submit it, and mark the swapchain and surface as needing to be recreated. That way the app will eventually proceed to acquiring the swapchain where either the swapchain will be properly recreated or an error can be properly detected.

slouken · 2025-12-21T16:00:53Z

OK, since we've established that Android can destroy our presentation surface literally any time it wants, how about this: if we get SURFACE_LOST_KHR, we clean the command buffer and do not submit it, and mark the swapchain and surface as needing to be recreated. That way the app will eventually proceed to acquiring the swapchain where either the swapchain will be properly recreated or an error can be properly detected.

This seems like a good approach to me. @999pingGG, are you able to implement and test this?

999pingGG · 2025-12-21T18:58:29Z

Sure, I'm absolutely interested in fixing this! But I'm afraid something is not entirely clear to me. Isn't this what is kind of already happening? In my PR we treat VK_SUCCESS and VK_ERROR_SURFACE_LOST_KHR equally, which means VULKAN_INTERNAL_ReleaseCommandBuffer() will be called at the end of the function (is that what you mean by "clean the command buffer and not submit it" since I see the call to vkQueueSubmit() before vkQueuePresentKHR(), and the only thing throwing an error is the latter, no way to know beforehand if it will fail?). Then, assuming the app gives the chance for event handlers to run, this code here will run and mark the surface and swapchain for recreation: https://github.com/999pingGG/SDL/blob/d202a7ba79ead0435deb530d55e52085d14cab2a/src/gpu/vulkan/SDL_gpu_vulkan.c#L9663-L9669 (the surface recreation bits were previously contributed by me because Android needs it)
So they are recreated the next time we acquire an image, after returning from the background, and presentation succeeds having failed just once.
If there's a misunderstanding in my part, please let me know!

thatcosmonaut · 2025-12-21T19:50:32Z

You're right, I forgot that Present happens separately after Submit. (I wonder what happens to the draw commands that write to the swapchain if the surface has been destroyed...) I still think we should explicitly mark the swapchain for recreate if the surface is lost on Present and not wait for the event to do it.

999pingGG · 2025-12-21T22:46:06Z

Sorry, I did a mistake and the PR closed, I'll open another one. EDIT: Nevermind, this thing fixed itself. How about my latest changes? Your idea is much better.

thatcosmonaut · 2025-12-22T01:14:41Z

Looks good, thanks.

slouken · 2025-12-22T01:17:13Z

Merged, thanks!

999pingGG closed this Dec 21, 2025

999pingGG force-pushed the main branch from d202a7b to 37fca1f Compare December 21, 2025 21:46

GPU: Better detection of surface destruction

c95350f

999pingGG reopened this Dec 21, 2025

thatcosmonaut approved these changes Dec 22, 2025

View reviewed changes

slouken merged commit 5a25720 into libsdl-org:main Dec 22, 2025
46 checks passed

GPU: Ignore VK_ERROR_SURFACE_LOST_KHR returned by vkQueuePresentKHR() #14683

GPU: Ignore VK_ERROR_SURFACE_LOST_KHR returned by vkQueuePresentKHR() #14683

Conversation

999pingGG commented Dec 20, 2025

Uh oh!

thatcosmonaut commented Dec 20, 2025

Uh oh!

999pingGG commented Dec 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

999pingGG commented Dec 20, 2025

Uh oh!

thatcosmonaut commented Dec 20, 2025

Uh oh!

999pingGG commented Dec 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

thatcosmonaut commented Dec 21, 2025

Uh oh!

slouken commented Dec 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

999pingGG commented Dec 21, 2025

Uh oh!

thatcosmonaut commented Dec 21, 2025

Uh oh!

999pingGG commented Dec 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

thatcosmonaut commented Dec 22, 2025

Uh oh!

Uh oh!

slouken commented Dec 22, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

999pingGG commented Dec 20, 2025 •

edited

Loading

999pingGG commented Dec 20, 2025 •

edited

Loading

slouken commented Dec 21, 2025 •

edited

Loading

999pingGG commented Dec 21, 2025 •

edited

Loading