Android app stability

I have no reason to think the issues I am encountering are linked. As far as I can tell, the cropping isn’t associated with either the SIGSEGV crashes or the black screen on restoring from the background. They happen in different circumstances.

I recreate textures when I get an SDL_RENDER_TARGETS_RESET or SDL_RENDER_DEVICE_RESET event, but I never recreate the renderer (actually that’s not entirely true: I do recreate the renderer in some specific circumstances, but not routinely).

No, I’m using SDL_Renderer.

I think maybe my memory is bad; the last time I tried SDL on Android must have been 6 years ago by now. I might have been using WindowSurface instead of a Renderer at that time, and I was likely depending on undefined behavior.

If you do try destroying/recreating the renderer, then you are correct in destroying/reloading all textures as well.

I’m still guessing without actual test code, as I haven’t touched Android code in a long time. Sorry for that.
Even if I am correct and this work-around does actually work, this whole situation should still be reported as a bug on their GitHub, as it is certainly not preferred behavior.

I’m going by what it says here. It talks about SDL2 backing up and restoring the GL_context, but it doesn’t say anything about the app needing to recreate the renderer that I can see.

It would make sense that if the GL_context is destroyed, resulting in a SDL_RENDER_DEVICE_RESET event, one might have to recreate the renderer, but it doesn’t say that.

That was a good link. OK, so here’s what I’m seeing, are you processing all of these:

Pause / Resume behaviour

If SDL_HINT_ANDROID_BLOCK_ON_PAUSE hint is set (the default), the event loop will block itself when the app is paused (ie, when the user returns to the main Android dashboard). Blocking is better in terms of battery use, and it allows your app to spring back to life instantaneously after resume (versus polling for a resume message).

Upon resume, SDL will attempt to restore the GL context automatically. In modern devices (Android 3.0 and up) this will most likely succeed and your app can continue to operate as it was.

However, there’s a chance (on older hardware, or on systems under heavy load), where the GL context can not be restored. In that case you have to listen for a specific message (SDL_RENDER_DEVICE_RESET) and restore your textures manually or quit the app.

You should not use the SDL renderer API while the app going in background:

SDL_APP_WILLENTERBACKGROUND: after you read this message, GL context gets backed-up and you should not use the SDL renderer API.

When this event is received, you have to set the render target to NULL, if you're using it. (eg call SDL_SetRenderTarget(renderer, NULL))

SDL_APP_DIDENTERFOREGROUND: GL context is restored, and the SDL renderer API is available (unless you receive SDL_RENDER_DEVICE_RESET).

That last bit seems to fit the kind of situation you are describing.

I’m processing all the events, but as I stated previously in the thread the instruction that you “should not use the SDL renderer API” after receiving the SDL_APP_WILLENTERBACKGROUND message is easier said than done!

Since the event has to be handled asynchronously in an Event Watch handler, there’s inevitably a possibility that the SDL renderer API will be used afterwards, if the main thread code is already ‘past the point of no return’.

I am taking at face value that “In modern devices (Android 3.0 and up) this will most likely succeed and your app can continue to operate as it was”, so although I do handle the SDL_RENDER_DEVICE_RESET event I’m not expecting to receive one.

My apologies, I skimmed some of the previous posts before making mine but I did not take it all in.
I did not realize I was asking you to repeat yourself.

I see you set the render target to a texture and it remains there for most of the maintick loop, this seems a likely suspect.

It’s also further evidence that there is some expectation to jump operations to the main thread at some point since SDL_SetRenderTarget is supposed to only be called there.
(Maybe by sending a user-event and SDL_Delay-ing the return from the filter? Perhaps a semaphore? I’m not sure how else to block a filter until a confirmation is received, or if blocking the filter’s return helps in any way.)
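
As a rough illustration of the "block the filter until a confirmation is received" idea, here is a minimal sketch in plain C11. It is only a model: the state names and the functions `watcher_request_pause`, `main_loop_may_render` and `watcher_resume` are hypothetical, C11 atomics and `nanosleep` stand in for SDL's atomics and `SDL_Delay`, and nothing in this thread confirms that delaying the filter's return actually helps on Android.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical app-side state; none of these names come from SDL. */
enum { APP_RUNNING, APP_PAUSE_REQUESTED, APP_PAUSE_ACKED };

static atomic_int g_app_state = APP_RUNNING;

/* Sleep helper standing in for SDL_Delay(ms). */
static void sleep_ms(long ms)
{
    struct timespec ts = { .tv_sec = ms / 1000,
                           .tv_nsec = (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Watcher side (assumed to run on the activity thread when the
 * "will enter background" event is seen). Requests a pause, then polls
 * for roughly timeout_ms for the main loop to confirm that it has
 * stopped rendering. Returns true if the confirmation arrived. */
bool watcher_request_pause(long timeout_ms)
{
    atomic_store(&g_app_state, APP_PAUSE_REQUESTED);
    for (long waited = 0; waited < timeout_ms; waited++) {
        if (atomic_load(&g_app_state) == APP_PAUSE_ACKED) {
            return true;
        }
        sleep_ms(1);
    }
    return atomic_load(&g_app_state) == APP_PAUSE_ACKED;
}

/* Main-loop side, called at a safe point between frames (after the
 * render target has been reset). Acknowledges a pending pause request
 * and returns true only while the app is in the running state. */
bool main_loop_may_render(void)
{
    int expected = APP_PAUSE_REQUESTED;
    if (atomic_compare_exchange_strong(&g_app_state, &expected,
                                       APP_PAUSE_ACKED)) {
        return false; /* pause just acknowledged */
    }
    return expected == APP_RUNNING;
}

/* Watcher side, on the "did enter foreground" event. */
void watcher_resume(void)
{
    atomic_store(&g_app_state, APP_RUNNING);
}
```

In an SDL2 app the first function would presumably be called from the event watch on SDL_APP_WILLENTERBACKGROUND and the last on SDL_APP_DIDENTERFOREGROUND, with the main loop checking `main_loop_may_render()` once per frame. The timeout is there so that a main thread that never reaches its safe point cannot stall the watcher's thread indefinitely; it does not remove the "past the point of no return" window, it only bounds how long the watcher waits for it to close.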

Indeed, it’s very confusing. It shouldn’t need me, or you, to try to make sense of incomplete and contradictory documentation! The unwanted behavior happens too infrequently and unpredictably for trial-and-error changes to the code to be a sensible approach.

Have you tried temporarily removing your event watching code? I got exactly what you reported when I added it, and no problems when I removed it.

This SDL_Renderer WatchEvent part is not thread safe:

SDLActivity can manipulate the SDL render target (see code below), while your own code in the main C thread may be doing the same.

@rtrussell

If you are able to test some SDL code and can reproduce the issue (even only on rare occasions), you can try this SDL3 patch, which changes the EventWatch behavior only slightly but makes it thread safe:

diff --git a/src/events/SDL_events.c b/src/events/SDL_events.c
index c909a2a58..02c171794 100644
--- a/src/events/SDL_events.c
+++ b/src/events/SDL_events.c
@@ -1546,10 +1546,20 @@ void SDL_PumpEvents(void)
 }
 
 // Public functions
+static bool SDL_CallEventWatchers(SDL_Event *event);
 
 bool SDL_PollEvent(SDL_Event *event)
 {
+#ifdef SDL_PLATFORM_ANDROID
+    // Call event watch, from SDL C Thread.
+    bool ret = SDL_WaitEventTimeoutNS(event, 0);
+    if (ret) {
+        SDL_CallEventWatchers(event);
+    }
+    return ret;
+#else
     return SDL_WaitEventTimeoutNS(event, 0);
+#endif
 }
 
 #ifndef SDL_PLATFORM_ANDROID
@@ -1805,10 +1815,14 @@ bool SDL_PushEvent(SDL_Event *event)
         event->common.timestamp = SDL_GetTicksNS();
     }
 
+#ifdef SDL_PLATFORM_ANDROID
+    // No event Watchers, called from Activity.
+#else
     if (!SDL_CallEventWatchers(event)) {
         SDL_ClearError();
         return false;
     }
+#endif
 
     if (SDL_PeepEvents(event, 1, SDL_ADDEVENT, 0, 0) <= 0) {
         return false;




I don’t keep a copy of the target texture pointer in my own code, so I rely on discovering it, when required, using SDL_GetRenderTarget(). If there are situations when SDL2 can change the target texture ‘behind the scenes’ that could easily break my app.

In what circumstances is the code you highlighted in SDL_render.c called?

But wouldn’t this defeat the whole purpose of the Event Watch seeing the event early (when it is added to the event queue)? By the time SDL_PollEvent() is called it may be too late to block the main loop in Android.

In any case I never call SDL_PollEvent() so any changes there won’t affect my app at all. I call SDL_PumpEvents() and SDL_PeepEvents().

This very code changes the target behind the scenes: between lines 702 and 752 the target is set to NULL for internal purposes.

It is called when the SDLActivity is about to send one of these two events:

SDL_WINDOWEVENT_SIZE_CHANGED,
SDL_WINDOWEVENT_DISPLAY_CHANGED

This code is called as an EventWatch in the SDLActivity thread, and I guess it is most likely called when going to the foreground.

Your code runs in the main C thread, so on rare occasions it may discover the target as “null”.

(Maybe just loop until the target is not null… but that’s flimsy.)

If you want to fake the bad behavior, add an SDL_Delay(1000) between lines 702 and 752, so that your code gets scheduled in the middle of the renderer WatchEvent.

Indeed, that makes the EventWatch run later, upon the PollEvent() call.

I don’t think the early timing is really important; you could move this to PumpEvents() instead of PollEvent().

On what platform(s)? There’s no evidence of it happening on the majority of platforms I support, because if it did my app would immediately fail and I would have spotted that long ago. Unless this is something specific to Android, it doesn’t seem to be an issue that I need to be concerned about.

From the SDL2 migration guide: “we’ve added new SDL events for some Android and iOS specific details, but you should set up an SDL event filter to catch them as soon as the OS reports them, because waiting until your next SDL_PollEvent() loop will be too late”.

This is Android we’re talking about. The SDLActivity thread is Android-specific.

It is not only the target being null; all the other lines may also have issues under concurrent access.

Thread safety is more important to me than handling the events faster.
And I am pretty sure the Android pause/resume mechanism is resilient to a 500 ms delay. Test it.

Though it seems to me there are other flaws with those lifecycle events, because the Java pause method (nativePause), which signals the pause, now returns immediately without handshaking with (i.e. waiting for) the native C thread.

I guess that when it returns, the activity considers the pause to have been consumed.

So I believe that may be an issue if you are rendering and, in the middle of that, the Android pause is called and returns before you have received the lifecycle events.

Both are important, I expect. I notice that in SDL_render.c there’s a mutex Android_ActivityMutex_Lock_Running() and I wonder whether the purpose of that is to protect against the thread safety issue you believe you have identified.

If not, and there’s a genuine bug, it’s in SDL2’s code so it needs to be reported at GitHub and fixed by the developers (hopefully urgently, if it’s as serious as you suggest). As you understand it better than I do, perhaps you would report it.

Wait, you’re quoting the SDL2 migration guide for SDL3.
I looked at the SDL2 code, and I was confusing SDL2 and SDL3.

SDL2 still uses the semaphore mechanism, and this is stable, so this is OK. (I am already trying something for SDL3.)
For SDL2, it certainly works with PollEvent(): once you’ve polled the will-enter-background event, it blocks until the SDLActivity resumes. PollEvent() gets the events one by one.

I’ve never tested PeepEvents(), but looking at the code, PollEvent() is mostly PumpEvents() + PeepEvents(), so that should be OK too.

Unless, perhaps, you poll a lot of events at the same time, both foreground and background events for instance. But I don’t know your code, whether you can reproduce the issue, whether you have logs, etc.

Anyway, I am adding the issue for SDL2 with Renderer target: