Possible SDL Timer Bug?

Scott_W_Bradley · May 6, 2005, 6:55pm

I am not sure if this is a bug, or a “feature” that I happen to find
undesireable but that others rely on…

My problem is that SDL_RemoveTimer() does not guarantee that once it
returns the timer you have removed will not longer be called – due to
the fact that SDL_ThreadedTimerCheck releases the SDL_timer_mutex
around the callback call. The cvsid of the SDL_timer.c that I am using
is 1.3, and I see that two commits have been this year in attempt to
fix some race conditions related to this code, but I still don’t think
the problem is fixed. My suggestion is that the SDL_timer_mutex is not
released during the callback call. (I’ll get back to that later).
Consider the following code:

CMyParam *myParam = new CMyParam();
SDL_TimerID myTimer = SDL_AddTimer(100, myCallback(), myParam);

// main program run loop, do stuff, etc...

// now about to exit -- need to cleanup
SDL_RemoveTimer(myTimer);
delete myParam;
// BUG: it is possible that callback(myParam) gets called once

more here, with
// and invalid myParam pointer.

I see that SDL_ThreadedTimerCheck() has some code using a
’list_changed’ variable to workaround the fact that the list could
change while calling the callback. This wouldn’t be the case if it
didn’t release the lock though. (That, plus the documentation
forbidding you from calling timer functions in a callback – from the
doc wiki: “The timer callback function may run in a different thread
than your main program, and so shouldn’t call any functions from
within itself.”).

I also see that the last two commits to SDL_timer.c (cvsid 1.4 and
1.5) seem to be aimed at fixing the problem that the SDL_TimerID
pointer retrieved from the list (the ‘t’ variable) could be freed
between the release of SDL_timer_mutex and the call to t->cb(). This
also wouldn’t happen if it didn’t release the lock.

However, we still have the problem that the t->param could be
invalidated between the release of the SDL_timer_mutex and the call to
t->cb(). In my particular circumstance, it would be a burden to try to
make all my ‘param’ cleanup happen from the timer callback, or to add
reference counting to this particular param object. In my experience,
it doesn’t seem unreasonable to expect that the “kill timer” function
guarantees that when it returns, the callback will no longer be
called.

If we make the rule about timer callback functions not calling any
funtions a little more specific, e.g. From your timer callback:

Don’t call any timer functions
Don’t take any of your own locks if you also call timer functions
from inside those locks.
Don’t do work that takes a long time (other timer callbacks are serialized)

Then, I think we can safely remove the calls to release and re-acquire
the SDL_timer_mutex around the call to the callback.

So, to the authors…I am wondering if this change is something you’ll
consider, or is there a feeling that existing code will break because
it doesn’t adhere to the rules above, or what? If I make this change
myself on my local copy, is there a general feeling that other
people’s code will have problems with these rules?

Thanks,
-Scott

slouken · May 16, 2005, 8:26am

My problem is that SDL_RemoveTimer() does not guarantee that once it
returns the timer you have removed will not longer be called – due to
the fact that SDL_ThreadedTimerCheck releases the SDL_timer_mutex
around the callback call. The cvsid of the SDL_timer.c that I am using
is 1.3, and I see that two commits have been this year in attempt to
fix some race conditions related to this code, but I still don’t think
the problem is fixed. My suggestion is that the SDL_timer_mutex is not
released during the callback call. (I’ll get back to that later).

Unfortunately, supporting other timer calls from within the timer callback
is allowed by the API, and we can’t guarantee that we’ll have a recursive
locking mechanism available.

Fixes not involving holding the lock are definitely welcome!

See ya,
-Sam Lantinga, Software Engineer, Blizzard Entertainment