Multithreaded memory crash

I have a reproducible crash when I quit my app. It doesn’t happen every time, but maybe one in ten runs. It seems to be related to using multiple threads - I’ve stripped almost everything out and made an example app that reproduces the crash - code below. I’ve reproduced the crash on two different machines, both Windows 7 64-bit, building SDL from the SDL2-2.0.3 source .zip linked on the site (the crash happens with the redistributable too, but there doesn’t seem to be any source for symbols so I’m building it myself to get a proper callstack). I’m building a Win32 console app in Debug.

The crashes can happen in various places, but it’s almost always in SDL’s dlmalloc code. Sometimes it comes from SDL_Quit, sometimes SDL_RunThread, and sometimes in other event handling code. Sometimes it’s freeing a bad pointer, but other times it’s while trying to get a free block. It’s fairly random.

The code is pretty straightforward - it starts a few threads that simply wait on a semaphore, which the main thread signals after the escape key is pressed. Can anyone see something I’m doing wrong? Or might this be an issue on SDL’s side? This is my first time using SDL so I wouldn’t be surprised if I’m doing something I shouldn’t.

Code:

SDL_sem* killSignal;
SDL_sem* deadSignal;
SDL_atomic_t numAliveThreads;

int testcrashThread(void* data)
{
    SDL_AtomicAdd(&numAliveThreads, 1);

    // Just wait to be killed
    SDL_SemWait(killSignal);

    // Last alive thread sends the dead signal
    int prevNumAliveThreads = SDL_AtomicAdd(&numAliveThreads, -1);
    if(prevNumAliveThreads == 1)
    {
        SDL_SemPost(deadSignal);
    }
    return 0;
}

int main(int argc, char **argv)
{
    //SDL_LogSetAllPriority(SDL_LOG_PRIORITY_VERBOSE);

    // Set up SDL
    SDL_Init(SDL_INIT_TIMER | SDL_INIT_VIDEO | SDL_INIT_EVENTS);
    SDL_Window* sdlWindow = SDL_CreateWindow("testcrash", 200, 200, 100, 100, SDL_WINDOW_SHOWN);
    SDL_Renderer* sdlRenderer = SDL_CreateRenderer(sdlWindow, -1, SDL_RENDERER_ACCELERATED | SDL_RENDERER_PRESENTVSYNC);

    // Create sync prims
    killSignal = SDL_CreateSemaphore(0);
    deadSignal = SDL_CreateSemaphore(0);
    SDL_AtomicSet(&numAliveThreads, 0);

    // Kick off threads
    unsigned numThreads = SDL_GetCPUCount();
    for (unsigned t = 0; t < numThreads; t++)
    {
        SDL_Thread* thread = SDL_CreateThread(testcrashThread, NULL, NULL);
        SDL_DetachThread(thread);
    }

    // Wait for escape to be hit
    while(true)
    {
        SDL_Event e;
        if(SDL_PollEvent(&e) && e.type == SDL_KEYDOWN && e.key.keysym.sym == SDLK_ESCAPE)
        {
            break;
        }
    }

    // Kill threads
    for(unsigned t = 0; t < numThreads; t++)
    {
        SDL_SemPost(killSignal);
    }

    // Wait for them to die
    SDL_SemWait(deadSignal);

    // Clean up & shut down
    SDL_DestroySemaphore(killSignal);
    SDL_DestroySemaphore(deadSignal);
    SDL_DestroyRenderer(sdlRenderer);
    SDL_DestroyWindow(sdlWindow);
    SDL_Quit();

    return 0;
}

And an example callstack:

Code:

SDL2.dll!SDL_free_REAL(void * mem) Line 4379 C
SDL2.dll!SDL_RunThread(void * data) Line 294 C
SDL2.dll!RunThread(void * data) Line 85 C
SDL2.dll!RunThreadViaBeginThreadEx(void * data) Line 100 C
msvcr120d.dll!_callthreadstartex() Line 376 C
msvcr120d.dll!_threadstartex(void * ptd) Line 359 C
kernel32.dll!@BaseThreadInitThunk@12() Unknown
ntdll.dll!__RtlUserThreadStart() Unknown
ntdll.dll!__RtlUserThreadStart@8() Unknown

The bug appears to be in your code.

numAliveThreads is initialized to zero, then incremented by each new thread, but you signal that all threads are dead when numAliveThreads has only been decremented down to 1. This means there is still one thread running, possibly even still waiting in the kernel on a semaphore which the main thread is about to destroy.

------------------------
Nate Fries

I’m pretty sure that code is OK. SDL_AtomicAdd() returns the value of the variable from before the add (hence the variable name prevNumAliveThreads) - when the previous value is 1, the counter has just been decremented to 0, so this is the last thread alive.

I believe I’ve actually identified the issue - it’s a bug in SDL’s thread management, which does memory allocation in a non-thread-safe manner, since SDL_RunThread() itself runs in a separate thread. SDL’s memory allocator isn’t thread-safe (the dlmalloc implementation doesn’t have USE_LOCKS defined), and when I kill the threads manually with a semaphore just before calling SDL_Quit(), they end up dying at around the same time SDL is cleaning itself up. SDL_RunThread() calls SDL_free() when the thread function exits, and this probably happens at the same time as other frees in various SDL subsystems. Another common crash I get is in SDL_free_REAL() when called from SDL_VideoQuit_REAL(), which backs up this theory. Also, if I #define USE_LOCKS 1 in SDL_malloc.c, everything works correctly.

Putting locks around every allocation is a very heavy-handed approach to making memory management thread-safe (although some very high-profile game engines unfortunately do exactly that), but in the case of SDL it may be warranted as the alternative is an overhaul of how thread memory management is handled, and in general SDL’s memory allocation code is called relatively infrequently.

satchmo wrote:

I’m pretty sure that code is ok. SDL_AtomicAdd() returns the previous value of the variable before the add (hence the variable name prevNumAliveThreads) - when the previous value is 1 it means it was just decremented to 0, so this is the last thread alive.

I believe I’ve actually identified the issue - it’s a bug with SDL’s thread management which does memory allocation in a non-thread-safe manner, as SDL_RunThread() itself runs in a separate thread. SDL’s memory allocator isn’t thread-safe (the dlmalloc implementation doesn’t have USE_LOCKS defined), and when I kill the threads manually using a semaphore just before calling the SDL_Quit(), they end up dying at around the same time SDL is cleaning itself up. SDL_RunThread() calls SDL_free() when the thread function exits, and this is probably happening at the same time as some other frees in various SDL subsystems. Another common crash I get is in SDL_free_REAL() when called by SDL_VideoQuit_REAL(), which backs up this theory. Also, if I #define USE_LOCKS 1 in SDL_malloc.c, everything works correctly.

Putting locks around every allocation is a very heavy-handed approach to making memory management thread-safe (although some very high-profile game engines unfortunately do exactly that), but in the case of SDL it may be warranted as the alternative is an overhaul of how thread memory management is handled, and in general SDL’s memory allocation code is called relatively infrequently.

Ah, you’re right about AtomicAdd behavior, sorry.
Good bug catch.
Shouldn’t be hard to tweak dlmalloc to use much less locking. The way Hoard does this should work despite any internal allocator differences.

------------------------
Nate Fries

You wait for the dead signal, and once the thread sends it you destroy the system, but the thread still needs to clean up after itself - that mere "return 0" could still kick off some pool job.

Usually the approach is to wait for the thread’s end EXPLICITLY, say with WaitThread(), and that guarantees its destruction.

I’m not ruling out other mistakes, but the theory is like this ;-).
On 07/05/2014 18:47, "satchmo" <keith.oconor at gmail.com> wrote:

Juan Manuel Borges Caño wrote:

You wait for the dead signal, and once the thread sends it you destroy the system, but the thread still needs to clean up after itself - that mere "return 0" could still kick off some pool job.
Usually the approach is to wait for the thread’s end EXPLICITLY, say with WaitThread(), and that guarantees its destruction.
That’s one way to do threading with SDL, but the other is to call SDL_DetachThread() and let the thread clean up after itself, like my sample code does.

That’s mostly irrelevant though; the thread local storage implementation also makes thread-unsafe realloc() calls. The real problem here is the memory model used for the threading implementation, not the method of managing the threads themselves.

I just went to log a bug about this, but it looks like there’s one already (https://bugzilla.libsdl.org/show_bug.cgi?id=2494) that was logged just last month. I’ll add a comment.