Syncing with the monitor retrace

Multithreading is not a solution. Seriously, if you want accurate
timings, the last thing you want is threads. Random context switches
kill accurate timings. Don’t think that a multi-core CPU will protect
you; the threads still need to use mutexes to synchronize. Ideally, you
wouldn’t even use a multitasking operating system.

You can get reasonably accurate vsync timings like this:
Code:

SDL_RenderPresent(…);
base_time = SDL_GetTicks();
SDL_RenderPresent(…);
interval = SDL_GetTicks() - base_time;

Future vsyncs will happen predictably at base_time + interval * N. For
added accuracy, make several measurements of the interval, discarding
any very high values or very low values, and rebase base_time after each SDL_RenderPresent.
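
By way of illustration, here is a minimal sketch of that calibration in SDL2. The sample count and the 5 to 50 ms outlier window are arbitrary choices for this example, not anything SDL prescribes:

Code:

#include <SDL.h>

#define SAMPLES 16

/* Estimate the vsync interval by timing several SDL_RenderPresent calls,
 * discarding implausible samples, and rebasing base_time on the last
 * present. Assumes vsync is enabled (swap interval 1). */
static Uint32 estimate_vsync_interval(SDL_Renderer *renderer, Uint32 *base_time)
{
    Uint32 samples[SAMPLES];
    Uint32 prev, now, sum = 0, count = 0;
    int i;

    SDL_RenderPresent(renderer);        /* sync up with the retrace */
    prev = SDL_GetTicks();
    for (i = 0; i < SAMPLES; i++) {
        SDL_RenderClear(renderer);
        SDL_RenderPresent(renderer);    /* blocks until the next vsync */
        now = SDL_GetTicks();
        samples[i] = now - prev;
        prev = now;
    }
    *base_time = prev;                  /* rebase on the last present */

    /* Average only plausible intervals; 5 to 50 ms covers 20-200 Hz and
     * drops missed frames and spuriously short samples. */
    for (i = 0; i < SAMPLES; i++) {
        if (samples[i] >= 5 && samples[i] <= 50) {
            sum += samples[i];
            count++;
        }
    }
    return count ? sum / count : 0;     /* estimated ms per refresh */
}
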
On 2013-05-06 14:27, jeroen clarysse wrote:

allright, here’s a wrap-up of my findings so far :

  • SDL does not have a way to detect the vertical blank, so we cannot use that to coordinate things
  • multithreading is a solution, but it is impossible to accurately coordinate everything inside my own framework in such a way that it always works


Rainer Deyke (rainerd at eldwood.com)

@Nathaniel:

as a side note, I’ve peeked inside the SDL sources, and found

Code:

#define MAXEVENTS 128

so your 256 was even optimistic :)

and yes: it is a “running” array with head/tail/next pointers that walk through the array, overwriting anything older than the last 128 events!

but I’m not so worried about this: on a 100Hz monitor, I would be calling PumpEvents every 10 msec… If so many things are happening that this 128-entry queue fills up within 10 msec, any timing accuracy will be worthless anyway! Most likely, refreshes will be missed, which the main thread detects by checking that the current clock never exceeds previous_clock + refresh_time.
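
For instance, the missed-refresh check described above might look roughly like this; previous_clock and refresh_time are his own names, and the tolerance value is an assumption for this sketch:

Code:

#include <SDL.h>

static Uint32 previous_clock;            /* time of the previous refresh */
static const Uint32 refresh_time = 10;   /* ms per refresh, e.g. 100 Hz  */
static const Uint32 tolerance_ms = 2;    /* assumed scheduling slack     */

/* Returns 1 if the refresh deadline was missed since the last call. */
static int check_missed_refresh(void)
{
    Uint32 now = SDL_GetTicks();
    int missed = (now > previous_clock + refresh_time + tolerance_ms);
    previous_clock = now;
    return missed;
}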

@Rainer: thx for the feedback

yeah… sigh… I know… mutexes make things a lot more complicated… I really have to decide what to do. I could force my users to use a multi-core CPU, and thread scheduling would allow me to ensure that both threads are NOT running on the same core. So context switches here would be minimal… I think (correct me if I’m wrong!!)

It is really a choice between two evils:

  • use threading:

PRO: we can use SetSwapInterval(1) so we KNOW for SURE that images are displayed in sync. By using the “flag” system I described earlier, I can start my timer IMMEDIATELY after the RenderPresent has completed, so any user-related device input is synced to the end of the swap.
CON: risk of complications due to threads
CON: mutex code needed, which might slow things down a lot
CON: SDL also uses its own thread to handle events. (Again: correct me if I’m wrong.) It will take a lot of fine-tuning to make sure these three threads don’t interfere.

  • use your own loop

PRO: threading risks avoided, mutex bottleneck solved
CON: SDL is threaded anyway, so we are STILL based on threads!
CON: the refresh rate is not a simple constant. You can’t just calculate it from a few swap() calls, I have noticed. It seems to vary a bit: not much, but a 100Hz display implies 12000 refreshes in 2 minutes (a reasonable trial length in our experiments), and an error of just 0.05 msec per refresh accumulates to a 5 msec deviation in only 100 frames. So we’d have to recalibrate periodically… but that would imply switching between SwapInterval(0) and SwapInterval(1) periodically… making things quite complicated, since I have to ensure this does NOT happen at a critical time in the experiment!

Ideas?

I have to sacrifice either sync accuracy or simplicity.

the only really proper solution would be to have a GetScanLine() routine like DirectDraw has…

Late to the discussion, but I really doubt SDL_GetTicks is going to be even remotely useful for something like this, purely on accuracy grounds. In fact I think the minimum guaranteed accuracy is just 10ms, that’s 1/100th of a second, which goes to show how inaccurate it can be.

are you sure about that? According to the documentation, SDL_GetTicks is in msec… Looking in the SDL sources, it is a wrapper around gettimeofday(), which is msec also…

but if you are right, there is always SDL_GetPerformanceCounter(), which should be more accurate, right?

interesting nonetheless !

SDL_GetTicks returns values in 1ms units, but the minimum granularity you can expect from the OS is 10ms. Note that this depends on the OS; on some systems you will indeed get 1ms accuracy, you just can’t rely on it.

And yes, SDL_GetPerformance*() is definitely much more accurate; it’s designed to use the high-precision timers (SDL_GetTicks() uses the low-precision ones).
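
As an illustration of the difference, here is a small sketch of timing a block of work with the high-resolution counter; the work being timed is just a placeholder:

Code:

#include <SDL.h>
#include <stdio.h>

static void time_something(void)
{
    Uint64 start = SDL_GetPerformanceCounter();

    /* ... work to be timed goes here ... */

    Uint64 end = SDL_GetPerformanceCounter();
    double ms = (double)(end - start) * 1000.0
              / (double)SDL_GetPerformanceFrequency();
    printf("elapsed: %.3f ms\n", ms);
}

SDL_GetPerformanceFrequency() converts the counter’s platform-specific units into seconds, so the same code works whatever backend SDL picked.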

I double-checked the SDL2 sources, and found this for GetPerformanceCounter():

Code:

Uint64
SDL_GetPerformanceCounter(void)
{
#if HAVE_CLOCK_GETTIME
    Uint64 ticks;
    struct timespec now;

    clock_gettime(CLOCK_MONOTONIC, &now);
    ticks = now.tv_sec;
    ticks *= 1000000000;
    ticks += now.tv_nsec;
    return (ticks);
#else
    Uint64 ticks;
    struct timeval now;

    gettimeofday(&now, NULL);
    ticks = now.tv_sec;
    ticks *= 1000000;
    ticks += now.tv_usec;
    return (ticks);
#endif
}

and this for GetTicks():

Code:

Uint32
SDL_GetTicks(void)
{
#if HAVE_CLOCK_GETTIME
    Uint32 ticks;
    struct timespec now;

    clock_gettime(CLOCK_MONOTONIC, &now);
    ticks = (now.tv_sec - start.tv_sec) * 1000
          + (now.tv_nsec - start.tv_nsec) / 1000000;
    return (ticks);
#else
    Uint32 ticks;
    struct timeval now;

    gettimeofday(&now, NULL);
    ticks = (now.tv_sec - start.tv_sec) * 1000
          + (now.tv_usec - start.tv_usec) / 1000;
    return (ticks);
#endif
}

so basically, if you DON’T have HAVE_CLOCK_GETTIME enabled, GetTicks and the performance counter derive from the same function, gettimeofday(), which according to the BSD man pages is microsecond accurate…

You’re only taking into account *nix systems…

sorry, you’re right about that indeed… I assume that on Windows, SDL will use QueryPerformanceCounter(), which is also very accurate. On other systems, I really have no idea… But Mac/Unix & Windows are probably 99% of the SDL target?

OSX and Android too.

Just looked up Windows. It does have the option of using QueryPerformanceCounter, but it can also be made to use GetTickCount, which is extremely inaccurate (in fact MSDN says it may be even more inaccurate than 10ms), so you can’t rely on it unless you control the build process.

Of course at this point I’d have to wonder what Windows system doesn’t
support QueryPerformanceCounter, given it was around back in the
Windows 9x era already… Maybe removing GetTickCount support should
be considered in the future? In fact, SDL_GetTicks() in itself
probably could be built entirely on top of SDL functions only (no
system-specific code). I know it predates the high precision timers
though so that may be why it’s done that way still.
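
For illustration, a sketch of what that suggestion might look like: millisecond ticks derived purely from SDL’s own counter API. The function names here are hypothetical, and this is not how SDL is actually implemented:

Code:

#include <SDL.h>

static Uint64 start_counter;   /* captured once at startup */

static void my_ticks_init(void)
{
    start_counter = SDL_GetPerformanceCounter();
}

/* Hypothetical SDL_GetTicks replacement built on SDL calls only. */
static Uint32 my_get_ticks(void)
{
    Uint64 now = SDL_GetPerformanceCounter();
    return (Uint32)((now - start_counter) * 1000
                    / SDL_GetPerformanceFrequency());
}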

I don’t think that SDL2 is still dependent on GetTickCount!! I just opened the project folder and searched all files for “GetTickCount”. Only SDL_systimer.c, inside src/timer/windows, still references it, and that reference sits inside #ifdef USE_GETTICKCOUNT preprocessor directives, and USE_GETTICKCOUNT is not defined in any makefile.

so I’m fairly sure that all timing code on all major platforms is now microsecond accurate, or at least millisecond… (notwithstanding scheduler interrupts, of course!)

The fact that the USE_GETTICKCOUNT code is still there does imply, though, that it may still get used under some conditions, and programs using SDL shouldn’t rely on SDL being built in a particular way (only that the same source code is used).

Again, it’s debatable why that code is still present. As I said, SDL_GetTicks doesn’t even need system-specific functions; it could be done with other SDL calls only (at that point becoming just a convenience function).

True!

In both solutions (threaded and threadless), event processing is gated by calls to SDL_PumpEvents, which in turn is paced by vsync.
Unless you’re drawing fairly complex scenes, you certainly don’t need 10ms to render.
You should have plenty of time to also handle events properly in the main thread on each frame.
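
For illustration, such a single-threaded frame loop might look roughly like this; it assumes a renderer created with the SDL_RENDERER_PRESENTVSYNC flag:

Code:

#include <SDL.h>

/* Threadless approach: handle events and render in one loop, paced by
 * the vsync-blocking present call. */
static void run_main_loop(SDL_Renderer *renderer)
{
    SDL_Event ev;
    int running = 1;

    while (running) {
        while (SDL_PollEvent(&ev)) {   /* pumps and drains the queue */
            if (ev.type == SDL_QUIT)
                running = 0;
            /* timestamp subject responses here */
        }
        SDL_RenderClear(renderer);
        /* draw the current stimulus ... */
        SDL_RenderPresent(renderer);   /* blocks until vsync */
    }
}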

And the SDL event thread:

  • Windows: no
  • Linux/X11: no
  • Mac OS X: no

In fact, I think BeOS might be the only system to use it (but don’t quote me on that).

Sik wrote:

Late to the discussion, but I really doubt SDL_GetTicks is going to be even remotely useful for something like this, purely on accuracy grounds. In fact I think the minimum guaranteed accuracy is just 10ms, that’s 1/100th of a second, which goes to show how inaccurate it can be.

Probably correct. It was just an idea, which it seems he has proven wrong.

jeroen clarysse wrote:

are you sure about that? According to the documentation, SDL_GetTicks is in msec… Looking in the SDL sources, it is a wrapper around gettimeofday(), which is msec also…

but if you are right, there is always SDL_GetPerformanceCounter(), which should be more accurate, right?

gettimeofday is actually microsecond resolution.

And on Unix, SDL_GetPerformanceCounter also uses gettimeofday or clock_gettime, same as SDL_GetTicks, making their accuracy comparable. On PSP and BeOS, it is literally just a proxy to SDL_GetTicks. In fact, SDL_GetPerformanceCounter is only truly useful on Windows, which is the only supported system that provides actual performance counters.

Anyway, I think the 10ms guarantee is there for portability reasons. Some platforms might not actually have any timing mechanism with precision greater than 10ms.

------------------------
Nate Fries