I’ve no idea whether this is related to SDL or not, but an Android update today has caused my app to stop working, and it appears that calls to OpenGLES (1.0) functions are now resulting in a SEGV_ACCERR fault. SDL_GetRendererInfo confirms that I have successfully created an OpenGLES 1.0 context, so I’m mystified.
It’s looking as though (for example) SDL_GL_GetProcAddress(“glEnable”) is returning a different address from dlsym (RTLD_DEFAULT, “glEnable”). What could cause that?
Despite my app successfully opening an OpenGLES 1.0 context, SDL_GL_GetProcAddress is seemingly returning NULL for any function that is not in GLES 2.0! So for example SDL_GL_GetProcAddress(“glLogicOp”) is returning NULL and similarly for “glLightfv”; but non-zero values are returned for “glEnable” and “glClearColor” etc.
dlsym (RTLD_DEFAULT) is returning non-zero values for “glLogicOp” and “glLightfv” etc. but calling those addresses results in an immediate SEGV_ACCERR. So something very strange is happening with address resolution. Where is SDL getting the handle it passes to dlsym?
The architecture is armv7 and the device is a OnePlus 5 phone running OxygenOS 5.0.1 and Android 8.0.0; the issue started only with the latest OS update which arrived yesterday.
You’re a genius! There’s a section of code in SDL_egl.c which is disabled on Android because “eglGetProcAddress is busted on Android”. But the link given there shows this bug as ‘fixed’ so I’ve removed the conditional test and allowed it to call eglGetProcAddress. Voilà, everything works again!
So it looks as though SDL_egl.c needs attention. For a start the test for Android either needs to be removed or to be made conditional on the version of Android. Also, the fallback code is evidently broken.
I saw this disabled section of code. It solves your issue but maybe it is an issue elsewhere (at least it was).
It says it is fixed, but in which version? Is the fix part of the Android version or of the vendor driver?
Maybe you could double-check your issue to see whether SDL_GL_GetProcAddress() returned NULL because dlopen() fails to find the library or because it fails to find the symbol. You can also call “dlerror()” to get an error message string.
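As a rough illustration of that check, here is a minimal sketch in plain C (not SDL's code) that separates the two failure modes: dlopen() failing to find the library versus dlsym() failing to find the symbol, printing the dlerror() text in each case. The names probe_symbol and probe_result are made up for this sketch.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Outcome of one lookup; these names are hypothetical, not part of SDL. */
typedef enum {
    PROBE_FOUND,
    PROBE_LIBRARY_MISSING,
    PROBE_SYMBOL_MISSING
} probe_result;

/* Open `library`, look up `symbol` in it, and report which step failed. */
probe_result probe_symbol(const char *library, const char *symbol)
{
    void *handle = dlopen(library, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        const char *open_error = dlerror();
        fprintf(stderr, "dlopen(%s) failed: %s\n", library,
                open_error ? open_error : "(no message)");
        return PROBE_LIBRARY_MISSING;
    }

    dlerror(); /* clear any stale error before calling dlsym */
    void *address = dlsym(handle, symbol);
    const char *symbol_error = dlerror();

    probe_result result = PROBE_FOUND;
    if (address == NULL) {
        fprintf(stderr, "dlsym(%s, %s) failed: %s\n", library, symbol,
                symbol_error ? symbol_error : "(symbol resolved to NULL)");
        result = PROBE_SYMBOL_MISSING;
    }

    dlclose(handle);
    return result;
}
```

For example, probe_symbol("libGLESv2.so", "glLogicOp") would be expected to report a missing symbol rather than a missing library, if the library itself loads fine on the device.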
Because some symbols returned NULL and others returned an address - and assuming it’s the same library for all GLES functions - then I would conclude that it’s the latter. I’ll do the test if you still think it’s important, but in which source file is the ‘dlopen’ call?
Yes, maybe you’re right; only one library seems to be loaded directly.
Maybe it is worth double-checking, because there are some fallbacks.
For OpenGL ES 1.0, it loads DEFAULT_OGL_ES or DEFAULT_OGL_ES_PVR.
Still in src/video/SDL_egl.c, there is SDL_EGL_LoadLibrary().
It calls SDL_LoadObject(), which is a “dlopen()”.
Because there are many calls, you can put a printf in SDL_LoadObject() (src/loadso/dlopen/SDL_sysloadso.c),
so that you’ll see what gets loaded on your device.
Another thing: you can grab the .so file from the device and run “objdump -T” on it to see which symbols it contains.
I don’t directly know which of these corresponds to SDL_GL_GetProcAddress returning NULL, but should SDL be looking in libGLESv2 at all given that I have created a v1 context?
So SDL_GL_GetProcAddress(“glLogicOp”) is calling SDL_LoadFunction with the handle for ‘libGLESv2.so’, but glLogicOp is a GLES 1 function only! Returning NULL in this case is presumably correct, but unhelpful.
I just tried with an OpenGLES 2.0 app, and it also loads two libraries, libGLESv2 and libEGL.so (but 2 times, not 4).
First with “path = DEFAULT_OGL_ES2”, then at the comment “Try loading a EGL symbol, if it does not work try the default library paths”. That seems to be expected: it has two handles, egl_dll_handle and dll_handle …
It loads 4 times on your side because there might be a re-creation of the SDL_Window underneath. Maybe it fails to get the 1.0 context? I agree it would make sense to load libGLESv1.
(maybe double-check with another device where it worked)
For me, it’s loaded twice:
with "path =DEFAULT_OGL_ES2"
at comment “Try loading a EGL symbol, if it does not work try the default library paths”
OK, I can do that. I looked at the docs for eglGetProcAddress and it specifically states that “Function pointers returned by eglGetProcAddress are independent of the display and the currently bound context and may be used by any context” so that looks to be safe.
OK, on a ‘working device’ SDL is still passing the handle of libGLESv2! The only reason it ‘works’ is that my own app has a fallback of calling dlsym(RTLD_DEFAULT) when SDL_GL_GetProcAddress returns NULL.
What I conclude from this is that SDL has a bug which is causing it to pass the handle of ‘libGLESv2.so’ even when the app has opened a GLES V1 context. This bug went unnoticed by me because my fallback of calling dlsym(RTLD_DEFAULT) has previously always returned the correct address. As the result of yesterday’s OS update, dlsym is no longer doing that, so the SDL bug now hits me.
But it shouldn’t matter how the 1.0 context is established, SDL_GL_GetProcAddress should use the correct library for the actual rendering context in use. It can’t ever make sense to get function addresses from ‘libGLESv2.so’ when a 1.0 context is in use, or ‘libGLESv1.so’ when a 2.0 context is in use.
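To make the point concrete, here is a minimal sketch of choosing the library from the context's major version. It is not SDL's code. The function name is hypothetical, and the file names are an assumption: as far as I know, the GLES 1.x library on Android is actually called libGLESv1_CM.so rather than ‘libGLESv1.so’.

```c
#include <stddef.h>

/* Hypothetical helper: pick the GLES library that matches the context's
 * major version. The file names are assumed Android defaults. */
const char *gles_library_for_major_version(int major)
{
    switch (major) {
    case 1:
        return "libGLESv1_CM.so"; /* fixed-function API: glLogicOp, glLightfv, ... */
    case 2:
        return "libGLESv2.so";    /* shader-based API */
    default:
        return NULL;              /* not covered by this sketch */
    }
}
```

With something like this, a lookup for “glLogicOp” in a 1.0 context would go to the v1 library and never be attempted against the v2 handle.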
Edit: Doesn’t SDL_GL_CONTEXT_MAJOR_VERSION apply to OpenGL rather than to OpenGLES?
It could reload libraries in SDL_RecreateWindow(), but it doesn’t.
To decide whether to reload libraries, it only checks for a difference in the window flags, not in the attribute flags
(“if ((window->flags & SDL_WINDOW_OPENGL) != (flags & SDL_WINDOW_OPENGL))” ).
So you still have the v2 libraries loaded with an opengles renderer.
Maybe that’s a bug (or a feature: it allows using the opengles renderer with a different context … )
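A small sketch of the difference, for illustration only. The first function mirrors the quoted condition; the second is a hypothetical variant that would also reload when the requested GLES major version changes. SKETCH_WINDOW_OPENGL is a stand-in for SDL_WINDOW_OPENGL, and none of these names exist in SDL.

```c
#include <stdint.h>

/* Stand-in for SDL_WINDOW_OPENGL so the sketch compiles without SDL. */
#define SKETCH_WINDOW_OPENGL 0x00000002u

/* Mirrors the quoted check: only the OpenGL window flag is compared. */
int reload_needed_flags_only(uint32_t old_flags, uint32_t new_flags)
{
    return (old_flags & SKETCH_WINDOW_OPENGL) !=
           (new_flags & SKETCH_WINDOW_OPENGL);
}

/* Hypothetical variant: also reload when an OpenGL window is requested
 * with a different GLES major version than the one currently loaded. */
int reload_needed_with_version(uint32_t old_flags, uint32_t new_flags,
                               int old_major, int new_major)
{
    if (reload_needed_flags_only(old_flags, new_flags)) {
        return 1;
    }
    return (new_flags & SKETCH_WINDOW_OPENGL) != 0 && old_major != new_major;
}
```

With the flags-only check, going from a 2.0 to a 1.0 request on an OpenGL window reports no reload, which matches what you are seeing.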
I think we’re losing sight of a key factor: OpenGLES provides an API - eglGetProcAddress - which is guaranteed to return the appropriate function address for the current renderer. If SDL were to call this function, the complications you describe would be unnecessary. It currently does not do so because, apparently, “eglGetProcAddress is busted on Android”.
Checking the link given in the source, this issue was marked as “fixed” in August 2010, more than seven years ago! I would argue that it is inappropriate to continue to block the function’s use on Android unconditionally. At the very least, the version of Android should be tested and the fallback code (which seems to be broken) used only when really necessary.
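What I have in mind is roughly the following. This is only a sketch under stated assumptions, not a patch for SDL_egl.c: the lookups are passed in as function pointers so it compiles without EGL or SDL, every name here is hypothetical, and the API level at which eglGetProcAddress becomes trustworthy is left as a parameter because I don't know which release contains the fix.

```c
#include <stddef.h>
#include <string.h>

/* Common signature for both lookup strategies in this sketch. */
typedef void *(*lookup_fn)(const char *name);

/* Hypothetical gate: trust the primary lookup from `first_fixed_api` onwards.
 * The real threshold is unknown to me, so the caller must supply it. */
int primary_lookup_trusted(int device_api, int first_fixed_api)
{
    return device_api >= first_fixed_api;
}

/* Ask the primary lookup first (if trusted) and fall back only on NULL.
 * On a device, `primary` might wrap eglGetProcAddress and `fallback` a
 * dlsym-style lookup on the loaded library handle. */
void *resolve_gl_function(const char *name, int trust_primary,
                          lookup_fn primary, lookup_fn fallback)
{
    void *address = NULL;
    if (trust_primary && primary != NULL) {
        address = primary(name);
    }
    if (address == NULL && fallback != NULL) {
        address = fallback(name);
    }
    return address;
}

/* Stand-in lookups so the sketch can be exercised off-device. */
static int dummy_target;

void *stub_knows_glLogicOp(const char *name)
{
    return strcmp(name, "glLogicOp") == 0 ? (void *)&dummy_target : NULL;
}

void *stub_knows_nothing(const char *name)
{
    (void)name;
    return NULL;
}
```

The fallback is only consulted when the primary lookup is untrusted or returns NULL, so a broken fallback would no longer matter on devices where eglGetProcAddress works.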