Hi,
Over the past couple of days, I’ve been battling with SDL, SDL_Mixer and
SMPEG to try to
find an audio hang bug. I believe I’ve found the problem, which I think
is a race
condition inside SDL’s semaphore implementation (at least the Windows
implementation).
The semaphore code uses Windows’ built in semaphore functions, but it
also maintains
a separate count value. This count value is updated with bare increment
and decrement
operations in SemPost and SemWaitTimeout - no locking primitives to
protect them.
In tracking down the apparent audio bug, I found that at some point a
semaphore’s count
value was being decremented to -1, which is clearly not a valid value
for it to take.
I’m still not certain exactly what sequence of operations is occuring
for this to happen,
but I believe that overall it’s a race condition between a thread
calling SemPost (which
increments the count) and the thread on the other end calling SemWait
(which decrements it).
I will try to make a test case to verify this, but I’m not sure if I’ll
be able to
(threading errors being difficult to reproduce even in the best
circumstances).
However, assuming this is the cause of my problems, there is a very
simple fix:
Windows provides InterlockedIncrement() and InterlockedDecrement()
functions to perform
increments and decrements which are guaranteed to be atomic.
So the fix is in thread/win32/SDL_syssem.c:
replace occurrences of --sem->count with InterlockedDecrement(&sem->count);
and replace occurrences of ++sem->count with
InterlockedIncrement(&sem->count);
If anyone can take a look and confirm that there is a race condition in
the code, then
I’d appreciate it (since the audio bug is not 100% reliably
reproducable, I can’t be
certain that I’ve nailed it with this fix, although I am fairly sure).
This is using SDL v1.2.12, built with VC++ 2008 Express, running on a
Core 2 duo processor.
I admit I haven’t checked subversion to see if this has already been
fixed, so I
apologise if this is old news for everyone.
Thanks,
John B