SDL_SetRenderTarget crash in Android

I’m occasionally seeing this crash in Android, seemingly in SDL_SetRenderTarget() (OpenGLES), when restoring my app to the foreground:

12-10 14:05:09.160 18786 18786 F DEBUG   : Cmdline: com.rtrussell.inverter
12-10 14:05:09.160 18786 18786 F DEBUG   : pid: 18552, tid: 18605, name: SDLThread  >>> com.rtrussell.inverter <<<
12-10 14:05:09.160 18786 18786 F DEBUG   :       #00 pc 000000000006e548  /data/app/~~-psVMLFnoa2C3y64iRg7Eg==/com.rtrussell.inverter-6UDBqMIWb19kbwI0BuLzLw==/lib/arm64/libSDL2.so (BuildId: 09e8b2784f83eeea073e4bb0dafffa74b7e5807e)
12-10 14:05:09.160 18786 18786 F DEBUG   :       #01 pc 0000000000065f1c  /data/app/~~-psVMLFnoa2C3y64iRg7Eg==/com.rtrussell.inverter-6UDBqMIWb19kbwI0BuLzLw==/lib/arm64/libSDL2.so (BuildId: 09e8b2784f83eeea073e4bb0dafffa74b7e5807e)
12-10 14:05:09.160 18786 18786 F DEBUG   :       #02 pc 0000000000067e60  /data/app/~~-psVMLFnoa2C3y64iRg7Eg==/com.rtrussell.inverter-6UDBqMIWb19kbwI0BuLzLw==/lib/arm64/libSDL2.so (SDL_SetRenderTarget_REAL+60) (BuildId: 09e8b2784f83eeea073e4bb0dafffa74b7e5807e)
12-10 14:05:09.160 18786 18786 F DEBUG   :       #03 pc 0000000000050b58  /data/app/~~-psVMLFnoa2C3y64iRg7Eg==/com.rtrussell.inverter-6UDBqMIWb19kbwI0BuLzLw==/lib/arm64/libSDL2.so (SDL_SetRenderTarget+16) (BuildId: 09e8b2784f83eeea073e4bb0dafffa74b7e5807e)
12-10 14:05:09.160 18786 18786 F DEBUG   :       #04 pc 0000000000030b80  /data/app/~~-psVMLFnoa2C3y64iRg7Eg==/com.rtrussell.inverter-6UDBqMIWb19kbwI0BuLzLw==/lib/arm64/libmain.so (mainloop+760) (BuildId: 8d4cd7fbb163d62ddb031dbdfe0d5c05c04df07c)
12-10 14:05:09.161 18786 18786 F DEBUG   :       #05 pc 0000000000032b6c  /data/app/~~-psVMLFnoa2C3y64iRg7Eg==/com.rtrussell.inverter-6UDBqMIWb19kbwI0BuLzLw==/lib/arm64/libmain.so (SDL_main+3476) (BuildId: 8d4cd7fbb163d62ddb031dbdfe0d5c05c04df07c)
12-10 14:05:09.161 18786 18786 F DEBUG   :       #06 pc 000000000004029c  /data/app/~~-psVMLFnoa2C3y64iRg7Eg==/com.rtrussell.inverter-6UDBqMIWb19kbwI0BuLzLw==/lib/arm64/libSDL2.so (Java_org_libsdl_app_SDLActivity_nativeRunMain+500) (BuildId: 09e8b2784f83eeea073e4bb0dafffa74b7e5807e)
12-10 14:05:09.161 18786 18786 F DEBUG   :       #13 pc 0000000000014a80  [anon:dalvik-classes.dex extracted in memory from /data/app/~~-psVMLFnoa2C3y64iRg7Eg==/com.rtrussell.inverter-6UDBqMIWb19kbwI0BuLzLw==/base.apk] (org.libsdl.app.SDLMain.run+156)
12-10 14:05:09.206  5242 18794 W ActivityManager: crash : com.rtrussell.inverter,10396

I’m handling the SDL_RENDER_DEVICE_RESET event, but (depending on exactly when this event is issued) I can’t be sure that I’m not calling SDL_SetRenderTarget() with a texture that has been destroyed but not yet recreated. Any ideas?

Did you call SDL_SetRenderTarget(renderer, NULL), as described in docs/README-android.md?

SDL_APP_WILLENTERBACKGROUND: after you read this message, GL context gets backed-up and you should not use the SDL renderer API.

When this event is received, you have to set the render target to NULL, if you're using it. (eg call SDL_SetRenderTarget(renderer, NULL))

I handle SDL_APP_WILLENTERBACKGROUND in an SDL_AddEventWatch() filter, rather than my main event handler, (which I thought was recommended). I therefore assumed I should not be making calls to the renderer from there, because it might be in a different thread. Is that not right?

I tried it anyway, both in my SDL_AddEventWatch() filter and in my main event handler; it made no difference. Still getting the crash perhaps 10% of the time when restoring from the background on a Samsung tablet. I’ve never seen it happen on my phone.

On Android, I think using SDL_AddEventWatch() is even worse, because the watch may be called from the Activity thread or from the C SDL thread, depending on who makes the SDL_PushEvent() call.

If you handle the event from the event loop, you at least know it is always handled on the C thread.

SDL_APP_WILLENTERBACKGROUND is always sent from the C thread, so that shouldn’t matter for this one (I mean using EventWatch vs. EventLoop mode).

I don’t remember exactly why it crashes, but I know SDL_Render also uses an EventWatch, and it may be called from the Activity thread. So the solution was to set the target to NULL.

I tried it both ways, with no effect.

What’s the sequence of events if an SDL_RENDER_DEVICE_RESET occurs: does it happen before or after SDL_APP_DIDENTERFOREGROUND?

Not sure if I expressed myself correctly:
“On Android, I think using SDL_AddEventWatch() is even worse, because the watch may be called from the Activity thread or from the C SDL thread, depending on who makes the SDL_PushEvent() call.”

This applies not only to the enter-background event but to any event on Android, because your code could be called from either the Activity thread or the C thread without you knowing which.

About SDL_APP_WILLENTERBACKGROUND: it is always sent from the C thread, so that shouldn’t matter for this one (I mean using EventWatch vs. EventLoop mode). It is always called from C code.

About the sequence: see the code. The reset is sent in android_egl_context_restore(),
so when you poll your events you would receive, in order:
will_enter
did_enter
reset
By the time you read will_enter, the reset is already queued (if it occurred).

Maybe the order is wrong and RESET should come first; otherwise you resume even though the device has been reset.
=> Maybe this is a bug, and you should try modifying the SDL code to change that.

If you use an event watch, then you get something different:
will_enter
watch called
did_enter
watch called
reset
watch called
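The difference between the two orderings above can be illustrated with a toy model. This is a minimal, SDL-free sketch: the Model type, the EV_* names and the model_* functions are all hypothetical, and it assumes (as described above) that a watcher runs synchronously at push time while a polling loop only sees events later, after they have been queued.

```c
#include <stddef.h>

#define QUEUE_CAP 16
#define LOG_CAP 64

/* Toy model, not SDL code: event kinds named after the shorthand above. */
typedef enum { EV_WILL_ENTER, EV_DID_ENTER, EV_RESET } Ev;

typedef struct {
    Ev queue[QUEUE_CAP];  /* ring buffer of pending events */
    size_t head, tail;    /* read and write positions */
    char log[LOG_CAP];    /* 'W' = watcher ran, 'P' = polled, then event digit */
    size_t log_len;
} Model;

/* Append a two-character record such as "W0" or "P2" to the log. */
static void model_log(Model *m, char who, Ev ev)
{
    if (m->log_len + 2 < LOG_CAP) {
        m->log[m->log_len++] = who;
        m->log[m->log_len++] = (char)('0' + ev);
        m->log[m->log_len] = '\0';
    }
}

/* Push: the watcher runs immediately, then the event is queued.
   Returns 0 on success, -1 if the queue is full. */
int model_push(Model *m, Ev ev)
{
    if (m->tail - m->head >= QUEUE_CAP) {
        return -1;
    }
    model_log(m, 'W', ev);
    m->queue[m->tail++ % QUEUE_CAP] = ev;
    return 0;
}

/* Poll: returns 1 and stores the oldest queued event, or 0 if empty. */
int model_poll(Model *m, Ev *out)
{
    if (m->head == m->tail) {
        return 0;
    }
    *out = m->queue[m->head++ % QUEUE_CAP];
    model_log(m, 'P', *out);
    return 1;
}
```

Pushing will_enter, did_enter and reset and then draining the queue leaves every 'W' record ahead of every 'P' record in the log, which is the point made above: by the time the polling loop reads will_enter, the watcher has already seen all three events.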

Did you try setting the target to NULL before entering the background?

I would change the code as below,
so that RESET appears before DID ENTER FOREGROUND:

    /* Android_ResumeSem was signaled */
    SDL_SendAppEvent(SDL_APP_WILLENTERFOREGROUND);

    /* Moved up: restore the context before SDL_APP_DIDENTERFOREGROUND is sent */
#if SDL_VIDEO_OPENGL_EGL
    /* Restore the GL Context from here, as this operation is thread dependent */
    if (!isContextExternal && !SDL_HasEvent(SDL_QUIT)) {
        SDL_LockMutex(Android_ActivityMutex);
        android_egl_context_restore(Android_Window);
        SDL_UnlockMutex(Android_ActivityMutex);
    }
#endif

    SDL_SendAppEvent(SDL_APP_DIDENTERFOREGROUND);
    SDL_SendWindowEvent(Android_Window, SDL_WINDOWEVENT_RESTORED, 0, 0);

    if (videodata->pauseAudio) {
        ANDROIDAUDIO_ResumeDevices();
        openslES_ResumeDevices();
        aaudio_ResumeDevices();
    }

My handlers for the SDL_APP_WILLENTERBACKGROUND and SDL_APP_DIDENTERFOREGROUND events only set/clear a global flag, so it doesn’t matter from which thread they are called. I’m sure I’ve read that these events should be processed in an SDL_AddEventWatch() filter to ensure they are handled promptly. By the time they are seen by the main event handler it may be ‘too late’.
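For reference, the flag-only watch pattern described here can be sketched as below. This is a minimal, SDL-free sketch, not the poster's actual code: AppEventKind, app_event_watch and app_may_render are hypothetical names, and in a real app the callback would have the SDL_EventFilter signature, be registered with SDL_AddEventWatch(), and switch on event->type. The flag is a C11 atomic on the assumption that the callback may run on a different thread from the render loop.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event kinds standing in for SDL_APP_WILLENTERBACKGROUND and
   SDL_APP_DIDENTERFOREGROUND; a real filter would switch on event->type. */
typedef enum {
    APP_OTHER,
    APP_WILL_ENTER_BACKGROUND,
    APP_DID_ENTER_FOREGROUND
} AppEventKind;

/* Written by the watch callback (any thread), read by the render loop.
   Static storage, so it starts out false (foreground). */
static atomic_bool app_in_background;

/* Body of the watch callback: touches only the atomic flag, never the
   renderer, so it does not matter which thread pushed the event. */
int app_event_watch(void *userdata, AppEventKind kind)
{
    (void)userdata;
    if (kind == APP_WILL_ENTER_BACKGROUND) {
        atomic_store(&app_in_background, true);
    } else if (kind == APP_DID_ENTER_FOREGROUND) {
        atomic_store(&app_in_background, false);
    }
    return 1; /* mirrors the int return type of an SDL event filter */
}

/* Called from the render loop before any renderer call. */
bool app_may_render(void)
{
    return !atomic_load(&app_in_background);
}
```

The render loop would check app_may_render() before each frame, so all renderer calls stay on the thread that owns the renderer and the watch callback itself never touches it.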

I’ve confirmed that the SDL_RENDER_DEVICE_RESET event isn’t happening at all, so that’s not related to the crash.

Then try to get the full stack trace, to see where it crashes inside SDL_SetRenderTarget().

By the way, are you using the latest SDL2?

I’ll try.

No, but I’ve compared the code of GLES_SetRenderTarget() between what I’m using and the latest version and there are no changes.

It seems to be crashing in SetCopyState() (a call to SetDrawState() is highlighted in the code window). Does this help?

art_sigsegv_fault 0x0000007de5a4b58c
art::FaultManager::HandleFault(int, siginfo*, void*) 0x0000007de5a4b3f8
art::SignalChain::Handler(int, siginfo*, void*) 0x0000007e89d69328
<unknown> 0x0000007e9525263c
SetCopyState SDL_render_gles.c:801
GLES_RunCommandQueue SDL_render_gles.c:918
FlushRenderCommands SDL_render.c:218
SDL_SetRenderTarget_REAL SDL_render.c:1843
SDL_SetRenderTarget SDL_dynapi_procs.h:351
maintick bbcsdl.c:1072
mainloop bbcsdl.c:473
SDL_main bbcsdl.c:1009
Java_org_libsdl_app_SDLActivity_nativeRunMain SDL_android.c:674
art_quick_generic_jni_trampoline 0x0000007de58d4048
art_quick_invoke_static_stub 0x0000007de58ca9ec
art::interpreter::ArtInterpreterToCompiledCodeBridge(art::Thread*, art::ArtMethod*, art::ShadowFrame*, unsigned short, art::JValue*) 0x0000007de58ee6bc
bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*) 0x0000007de5a0ade8
void art::interpreter::ExecuteSwitchImplCpp<false, false>(art::interpreter::SwitchImplContext*) 0x0000007de5810704
ExecuteSwitchImplAsm 0x0000007de58d69dc
art::interpreter::ExecuteSwitch(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool) (.llvm.3351068054637636664) 0x0000007de5b3d098
art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.llvm.3351068054637636664) 0x0000007de587ddbc
art::interpreter::ArtInterpreterToInterpreterBridge(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame*, art::JValue*) 0x0000007de595a9e8
bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*) 0x0000007de5a0b060
void art::interpreter::ExecuteSwitchImplCpp<false, false>(art::interpreter::SwitchImplContext*) 0x0000007de58166c0
ExecuteSwitchImplAsm 0x0000007de58d69dc
art::interpreter::ExecuteSwitch(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool) (.llvm.3351068054637636664) 0x0000007de5b3d098
art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.llvm.3351068054637636664) 0x0000007de587ddbc
artQuickToInterpreterBridge 0x0000007de587c9ec
art_quick_to_interpreter_bridge 0x0000007de58d417c
art_quick_invoke_stub 0x0000007de58ca768
art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*) 0x0000007de590e984
art::JValue art::InvokeVirtualOrInterfaceWithJValues<art::ArtMethod*>(art::ScopedObjectAccessAlreadyRunnable const&, _jobject*, art::ArtMethod*, jvalue const*) 0x0000007de59c1db8
art::Thread::CreateCallback(void*) 0x0000007de5a578f0
__pthread_start(void*) 0x0000007e705396b4
__start_thread 0x0000007e704d8f50

I’ve worked around the issue by adding a delay (currently 100 ms), after receipt of the SDL_APP_DIDENTERFOREGROUND event, before starting to render again; it hasn’t crashed since. However it does seem that there’s a bug in SDL2 whereby it’s not safe to call SDL_SetRenderTarget() immediately after receiving that event.
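A minimal sketch of such a delay gate is shown below, with the SDL calls left out so the logic stands on its own. RenderGate and the gate_* functions are hypothetical names; in an SDL2 app the now_ms arguments would come from SDL_GetTicks() and the two notifications from the SDL_APP_WILLENTERBACKGROUND and SDL_APP_DIDENTERFOREGROUND events. The 100 ms value is simply the one reported above, not a documented SDL requirement.

```c
#include <stdbool.h>
#include <stdint.h>

/* Settle time after returning to the foreground (assumed value from the
   workaround described above). */
#define RESUME_DELAY_MS 100u

/* Hypothetical helper state, not part of SDL. */
typedef struct {
    bool in_background;  /* set when the app is about to be backgrounded */
    bool settling;       /* true between foreground entry and the deadline */
    uint32_t resume_at;  /* tick count (ms) at which rendering may resume */
} RenderGate;

/* Call on the will-enter-background notification. */
void gate_will_enter_background(RenderGate *g)
{
    g->in_background = true;
    g->settling = false;
}

/* Call on the did-enter-foreground notification; now_ms is the current tick count. */
void gate_did_enter_foreground(RenderGate *g, uint32_t now_ms)
{
    g->in_background = false;
    g->settling = true;
    g->resume_at = now_ms + RESUME_DELAY_MS;
}

/* Returns true once the gate allows renderer calls again. */
bool gate_can_render(RenderGate *g, uint32_t now_ms)
{
    if (g->in_background) {
        return false;
    }
    if (g->settling) {
        /* Signed difference keeps the comparison correct if the 32-bit
           tick counter wraps around. */
        if ((int32_t)(now_ms - g->resume_at) < 0) {
            return false;
        }
        g->settling = false;
    }
    return true;
}
```

The main loop would call gate_can_render() at the top of each frame and skip SDL_SetRenderTarget() and all drawing while it returns false. This only hides the timing window; it does not explain why the early call crashes.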

It would be nice to fix this in SDL3 (even though I can’t use SDL3 because support for the OpenGLES backend, which my app needs for the glLogicOp() function, has been removed).

Indeed, it says it’s crashing in SetCopyState(), but this doesn’t explain why …

You didn’t answer my previous question: did you call SDL_SetRenderTarget(renderer, NULL) before entering the background (e.g. immediately after polling WILL_ENTER_BG)?

Yes, I tried that, it made no difference. I didn’t really expect it to, because the ‘crashing’ SDL_SetRenderTarget() (which happens after the DIDENTERFOREGROUND event) restores it to a non-NULL target anyway.

I think there is a race condition that can appear in the SDL_Renderer event watch when a target texture is in use. That’s why it’s better if the target is NULL before entering the background.
And you should not use an event watch to set the target texture back,
because an event watch runs immediately, so it will happen before the next resize event occurs. And the resize event triggers the SDL renderer’s own event watch (SDL_render.c / SDL_RendererEventWatch / SIZE_CHANGED).

→ You could test this by commenting out this block (SDL/SDL_render.c at main · libsdl-org/SDL · GitHub)

… or maybe this is slightly different, but already fixed by: Fixed bug #5850: Android EGL_BAD_ACCESS because of viewport command w… · libsdl-org/SDL@314bb5a · GitHub

I didn’t. I told you that my watch event handlers simply set and reset a global flag. To test your suggestion, I set the target to NULL in my normal (polled) event handler; it made no difference.

My delay workaround completely solves the problem for me, I do not intend to do any more work on it. But I think SDL should be fixed so the workaround isn’t necessary.

How should SDL fix the issue if you don’t do more work to help debug it? As far as I understand, no one else has been able to reproduce it.

Has anybody tried to reproduce it? I’m not aware that anybody has tried and failed.

I hoped that the debug output I listed would provide enough information to identify the culprit. We know that it happens if you call SDL_SetRenderTarget() very shortly after receiving the SDL_APP_DIDENTERFOREGROUND event; if you wait 100ms it never seems to fail. We know that it crashes in an <unknown> routine called from SetCopyState(). Doesn’t that tie it down? Is there something required by SetCopyState which isn’t guaranteed to have been restored when the event is issued?

I might have been more motivated to help debug it if the rug hadn’t been pulled from under my feet by SDL3 dropping support for OpenGLES, which I need!

Where does it say that SDL3 is dropping support for OpenGL ES? Looking at the source code right now, and it has both GL ES and GL ES 2 renderers still.