Syncing with the monitor retrace

a renderer can be created with the SDL_RENDERER_PRESENTVSYNC flag, which (according to the docs) means that "present is synchronized with the refresh rate".

my question is : suppose my monitor is a 100Hz monitor that refreshes at times 10 msec, 20 msec, 30 msec, and so on.

what happens when I call SDL_RenderPresent() at time 25 ? Will my whole application be locked until time 30 is reached, upon which the display is updated ? Or is there some internal queueing in SDL that returns from the Present() call immediately but internally calls another Present routine when time 30 is reached ?

For my code, it is important to be able to sync with the retrace, but I do NOT want to block my app… I need to poll external devices every msec…

in DirectX, there exists a call, "Get Retrace Scan Position()", which returns the position of the beam. I used to simply check when that beam crossed over to 0, and call a FLIP() at that point… is such a thing possible with SDL ? How would you recommend solving it ?

as a side note : I’ve been digging through the NSOpenGL documentation in Xcode, and found this note :

NSOpenGLCPSwapInterval
Sets or gets the swap interval.
The swap interval is represented as one long. If the swap interval is set to 0 (the default), the flushBuffer method executes as soon as possible, without regard to the vertical refresh rate of the monitor. If the swap interval is set to 1, the buffers are swapped only during the vertical retrace of the monitor.
Available in OS X v10.0 and later.
Declared in NSOpenGL.h.

again, it isn’t entirely clear what happens when you call an OpenGL context swap at time 25. The documentation states "swapped only during retrace"… does that mean that if you call swap outside that interval, NOTHING IS SWAPPED, or is the swap simply delayed ? If it is delayed, is the app blocked, or does some internal mechanism handle this in a separate thread ? What happens if stuff is drawn to the backbuffer between the call at 25 and the effective retrace sync at 30 ? What happens if TWO swap calls are made within one retrace ?

questions, questions, questions…

:)

as one last question (I should have grouped my questions in one reply… )

is there a way in SDL to “hook into the vertical blank” ? I.e. : can I attach a function to the OpenGL engine that gets called EVERY time a retrace is completed ? That would be a nice way to do proper stimulus presentation in “frame counts”…

I could be wrong, but here’s my understanding from the OpenGL side of
things. WGL and GLX don’t have a method to poll for the refresh rate or the
vertical retrace status; instead they have an extension
(GLX_EXT_swap_control) to set the swap interval, exposed in SDL as
SDL_GL_SetSwapInterval and SDL_GL_GetSwapInterval.

A swap interval of 0 means buffers are swapped as fast as possible with no
regard for vsync; 1 means the buffer-swap call will block by sleeping until
the vertical retrace finishes; 2 means the same but for every 2nd retrace;
and so forth. Therefore, in the 100Hz case a swap call made at 25 ms would
sleep until 30 ms, then return.
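
In SDL 2 terms, the blocking behaviour can be seen directly by timing the swap call. A minimal sketch (assuming you already have an SDL_Window with a current OpenGL context; timeOneSwap is just an illustrative name):

Code:
#include "SDL.h"
#include <cstdio>

// Turn vsync on and measure how long one buffer swap actually blocks.
// `window` must have a current OpenGL context on the calling thread.
void timeOneSwap(SDL_Window *window)
{
    if (SDL_GL_SetSwapInterval(1) != 0)       // 1 = swap waits for the retrace
        std::printf("vsync not available: %s\n", SDL_GetError());

    Uint32 before = SDL_GetTicks();
    SDL_GL_SwapWindow(window);                // typically sleeps until the retrace
    std::printf("swap blocked for ~%u ms\n", SDL_GetTicks() - before);
}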

Also there’s a further extension (GLX_EXT_swap_control_tear) for
"Xbox-style" vsync handling, where an errant swap call that misses the
retrace triggers a buffer swap ASAP, and afterwards reverts back to
vsync. But that doesn’t really help in this case.
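
For completeness: SDL 2 exposes this late-swap-tearing behaviour as a swap interval of -1, so if you did want it, the usual pattern is to ask for it and fall back to plain vsync:

Code:
// Ask for adaptive vsync (late swap tearing); fall back to regular
// vsync if the driver doesn't support it. Assumes a current GL context.
if (SDL_GL_SetSwapInterval(-1) != 0)
    SDL_GL_SetSwapInterval(1);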

The OpenGL platform APIs don’t support callbacks, so you’re probably out of
luck. If you really need vertical sync, you will most likely have to poll
your external devices from a separate thread, or perhaps change your
strategy (e.g. set the swap interval to 1, determine the refresh rate, set
the swap interval to 0, start polling and trigger buffer swaps by checking
a timer).

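A rough sketch of that calibrate-then-pace strategy (pollDevices is a hypothetical stand-in for the 1 ms device polling; averaging over many swaps smooths out scheduler noise, and accumulating the deadline in floating point limits drift):

Code:
#include "SDL.h"

void pollDevices();   // hypothetical: read and timestamp external devices

// Estimate the refresh period by timing vsynced swaps, then switch to
// timer-paced swaps so the main loop never blocks for a whole frame.
void timerPacedLoop(SDL_Window *window)
{
    SDL_GL_SetSwapInterval(1);                 // vsync on for calibration
    SDL_GL_SwapWindow(window);                 // align with a retrace first
    Uint64 t0 = SDL_GetPerformanceCounter();
    const int N = 60;
    for (int i = 0; i < N; ++i)
        SDL_GL_SwapWindow(window);             // each swap blocks one frame
    double periodMs = 1000.0 * (double)(SDL_GetPerformanceCounter() - t0)
                      / (double)SDL_GetPerformanceFrequency() / N;

    SDL_GL_SetSwapInterval(0);                 // from here on we pace ourselves
    double nextSwap = SDL_GetTicks() + periodMs;
    for (;;)
    {
        pollDevices();
        if (SDL_GetTicks() >= nextSwap)
        {
            SDL_GL_SwapWindow(window);
            nextSwap += periodMs;              // accumulate in double to limit drift
        }
        SDL_Delay(1);                          // don't burn 100% CPU
    }
}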

damn… that’s a painful situation…

how do high-performance games handle this ? I mean : you sometimes see fps > 100 in some games… I guess they use a separate thread to poll the keyboard & mouse and do collision calculations ? If you look at games in the process manager/activity monitor, they are for sure taxing the CPU, so they are not sitting idle waiting for the VSync…

your last suggestion might work, but it is a bit dangerous, since any rounding errors will accumulate over time, causing redraws to occur mid-screen

my biggest fear here is the following : my app is not a game, but an engine for psychology experiments. We need to be able to poll devices all the time to see if a subject has responded to visual stimuli on the screen. Even a false positive (the subject pressing too early to be a proper reaction time, or even BEFORE the stimulus is shown) needs to be detected. With a sleep inside the OpenGL_Swap, I lose the ability to detect these responses

I could go multithreaded, but as people have pointed out to me in another forum topic, the odds of introducing more problems than I’m solving are substantial. Especially timing-wise, threads are a “bag of hurt”

With the exception of fancy >100Hz monitors, FPS > 100 would imply that the
game is running with vsync turned off. Also, most games only need to pump
the event queue (which deals with input) once per draw call, which
simplifies things.

Can you describe the external device that you’re polling? If it just
appears to the PC as an ordinary input device (e.g. keyboard, mouse,
joystick), and you need sub-frame input accuracy (which seems like slight
overkill IMHO), then you might be able to rig something by looping
SDL_PollEvent, and using a timer to record measurements and swap the
buffers as described before.

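Assuming the device does show up as an ordinary input device, that rig might look something like this (a sketch only; SDL stamps each event with SDL_GetTicks-based time on arrival, which is as close as you get without a driver-level timestamp):

Code:
#include "SDL.h"
#include <cstdio>

// Tight input loop: drain the event queue, timestamp responses, and
// leave room for timer-driven buffer swaps in the same loop.
void inputLoop()
{
    SDL_Event e;
    bool running = true;
    while (running)
    {
        while (SDL_PollEvent(&e))
        {
            if (e.type == SDL_KEYDOWN)
                std::printf("key %d at %u ms\n",
                            (int)e.key.keysym.sym,
                            e.key.timestamp);     // stamped when SDL saw the event
            else if (e.type == SDL_QUIT)
                running = false;
        }
        // ...check your timer here and swap buffers when a refresh is due...
    }
}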


I don’t think a human being is capable of reaction times under 1/60 of a
second, so I’m quite sure that any choice you make will not invalidate
your simulation purposes.

Anyway, I think the only way to be sure you process as many frames as
possible and never wait is to disable VSYNC; if the FPS is high you
will barely notice any tearing.


Going multithreaded will not solve your problem anyway: if you detect something on
another thread, you still have to push the results to the main thread
and wait for the vsync to display anything.

Ing. Gabriele Greco, DARTS Engineering

Most modern games use multithreading, which can get complicated.

I handle this by letting the main loop render as fast as possible and then
adjusting objects according to how long the render took. Vsync can also be used
to limit the frame rate, and it still works since the logic is tied to how much time
has passed, not how many frames have rendered.
I use this method and it works well: I get the same logical game speed
across slow and fast systems, with just a fluctuation in frame rate.

Here’s a more detailed write-up on the method:
http://www.koonsolo.com/news/dewitters-gameloop/
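
The core of the approach described above is just scaling updates by measured frame time. A minimal sketch (update and render are hypothetical stand-ins for your own logic and drawing; deWitter’s article refines this into a fixed-timestep loop with interpolation):

Code:
#include "SDL.h"

void update(float dt);   // hypothetical: move objects by velocity * dt, etc.
void render();           // hypothetical: draw everything and swap buffers

// Variable-timestep loop: logic advances by elapsed time, not frame
// count, so game speed is the same with or without a vsync frame cap.
void gameLoop()
{
    Uint32 last = SDL_GetTicks();
    for (;;)
    {
        Uint32 now = SDL_GetTicks();
        float dt = (now - last) / 1000.0f;   // seconds since the previous frame
        last = now;
        update(dt);
        render();
    }
}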

Gabriele Greco wrote:

I don’t think a human being is capable of reaction times under 1/60 of a second, so I’m quite sure that any choice you make will not invalidate your simulation purposes.

ha, you’re underestimating psychologists :)
no, in all seriousness : we sometimes display a series of images (an animation if you want) and need to record input from the subject. The actual response time to the image is not relevant, but a few seconds later, a variation of that animation might be shown, and we need to see if the subject responds slower or faster to this new animation. Sometimes the examined effect is on the order of 5 msec ! If each frame of the animation can induce a sleep time, it is impossible to measure this

Gabriele Greco wrote:

Going multithreaded will not solve your problem anyway: if you detect something on another thread, you still have to push the results to the main thread and wait for the vsync to display anything.

Well, in our experiments the visual feedback on the screen rarely needs accurate timing. It’s only the subject’s response to the first visual stimulus that needs to be recorded accurately.

A simple, almost predictable way to measure reaction time:

  • Draw your image.
  • Present()
  • Measure the time after Present() returns.
  • Measure the time when you receive user input.
  • Take the difference between these two times.

It doesn’t matter if Present() returns immediately or blocks for 20 ms
or blocks for five minutes, so long as it returns at the exact time when
the image is sent to the screen. The only problem is that the OS might
choose to schedule another task in the exact instant between when
Present() returns and when you measure the time, but there’s no way to
avoid this without a real-time OS.

The same principle applies to animation, so long as you can pump out
frames faster than the screen refresh rate. Just measure the time after
the first frame.



Rainer Deyke (rainerd at eldwood.com)
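
A minimal sketch of that measurement scheme (assuming a vsynced SDL_Renderer, and that SDL_RenderPresent really does return at the retrace; measureReaction and the stimulus texture are illustrative names):

Code:
#include "SDL.h"

// Reaction time = (input event timestamp) - (time Present() returned).
Uint32 measureReaction(SDL_Renderer *renderer, SDL_Texture *stimulus)
{
    SDL_RenderCopy(renderer, stimulus, NULL, NULL);
    SDL_RenderPresent(renderer);          // with vsync, blocks until scan-out
    Uint32 shownAt = SDL_GetTicks();      // sample the clock immediately after

    SDL_Event e;
    for (;;)
    {
        if (SDL_WaitEvent(&e) && e.type == SDL_KEYDOWN)
            return e.key.timestamp - shownAt;   // same SDL_GetTicks timebase
    }
}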

Actually GLX at least has glXGetVideoSyncSGI to read out the vsync frame counter. I don’t know if there is an equivalent on Windows.

How do you want to read out reactions as short as 5 milliseconds?

60 Hz screen = full frame update in 16.7 milliseconds
120 Hz screen = full frame update in 8.3 milliseconds
200 Hz screen = full frame update in 5 milliseconds

You could limit yourself to only use the upper or lower part of the screen and thereby make the time for a “full frame update” proportionally shorter.

Also, flat-screen pixels that change across a wide color range may need “a long time” to change. Gray-to-gray is often 1-3 ms, but black-white changes need more time. (And manufacturer information on response timings is garbage.)

And then there is the so-called “input lag” : a fixed time delay between the sending of data to the screen and when the screen starts to change the first pixels. Some flat screens lag more than 25 ms. But because it is fixed, you could measure it and just set a variable in your program accordingly. Note: CRTs have input lag, too!
The graphics card and driver also cause some “input lag”, but not that noticeably.

Still the easiest way is to not use vsync at all and render as fast as possible. Then you only have to compensate for input lag.
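
For reference, the GLX_SGI_video_sync entry points look roughly like this. They are X11-only, require a direct GL context, and have no WGL or CGL equivalent, so they don’t satisfy a Mac + Windows requirement (sketch only; the typedef names here are ad hoc):

Code:
#include <GL/glx.h>

// The SGI_video_sync functions are extension entry points, fetched at runtime.
typedef int (*GetVideoSyncSGIFn)(unsigned int *);
typedef int (*WaitVideoSyncSGIFn)(int, int, unsigned int *);

void demoVideoSync()
{
    GetVideoSyncSGIFn getSync = (GetVideoSyncSGIFn)
        glXGetProcAddress((const GLubyte *)"glXGetVideoSyncSGI");
    WaitVideoSyncSGIFn waitSync = (WaitVideoSyncSGIFn)
        glXGetProcAddress((const GLubyte *)"glXWaitVideoSyncSGI");
    if (!getSync || !waitSync)
        return;                           // extension not available

    unsigned int count = 0;
    getSync(&count);                      // read the retrace counter
    waitSync(1, 0, &count);               // block until the next retrace
}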


I’m aware of these limitations, and I myself also think that the obsession with display timing in psychology experiments is sometimes exaggerated ! But there are a ton of articles on this subject, and (some of) my colleagues can be rather persistent about this issue… it is not so much about displaying fast, but about accurate timing :

  • we want to know as precisely as possible WHEN a stimulus is presented on the screen.
  • we also want to control HOW LONG it is visible with as much precision as possible
  • the faster it is presented, the better (aka : higher refresh rate is better)
  • we need to be able to inspect devices ALL the time
  • all of this should be feasible in a “framework” where the experimenter has as much freedom as (s)he wants.

ideal would be if we stopped using ‘time’ as the measure of stimulus onset, and used ‘frames’ instead. So instead of saying “present the stimulus at time 100”, we would say “present it after 10 frames” (on a 100Hz monitor). The problem is that I cannot reliably count frames : if the OS takes the processor away for just long enough to miss one refresh, everything goes bananas. A “hook” or “interrupt” in the OpenGL core that triggers a custom function at every vertical blank would have been the solution, but the hardware probably doesn’t support this, unless this glXGetVideoSyncSGI is exactly that ? My app needs to be Mac + Win

as you see : there is nothing really complicated, but there are a lot of issues involved which CAN lead to complex situations… If this were just a simple display-image-then-wait-for-device, things would be simple. But unfortunately I’m trying to write a FRAMEWORK in which the researchers can create their experiments, and thus I have to foresee execution paths that lead to bizarre results.

thanks for replying, all of you !

In this case I would use vsync and multithreading. (Multithreading is not THAT scary.)
You can also give your application a higher priority so it is unlikely that other processes stall it.
And you can build in detectors for missed vsyncs, or for time stalls in the event thread.

Then only input lag and frame queues can ruin your measurements.

Some imaginary code:

Code:
#include "SDL.h"
#include <iostream>

// Hypothetical drawing helpers (this is still imaginary code):
void clearScreen();   // clear the backbuffer
void Swap();          // swap buffers; blocks until the retrace with vsync on

SDL_atomic_t syncCounter;   // incremented once per completed buffer swap

const float screenRefreshRate = 60.0f;   // could be auto-detected
const float maxEventTimeStall = 0.05f;   // seconds

int eventThread(void *ptr)
{
    (void)ptr;
    int lastSyncCounter = 0;
    float currentTime = 0;
    float currentTimeLastLoop = 0;
    float lastSyncCounterTime = static_cast<float>(SDL_GetTicks()) / 1000.0f;
    while (true)
    {
        currentTimeLastLoop = currentTime;
        currentTime = static_cast<float>(SDL_GetTicks()) / 1000.0f;
        if (currentTime - currentTimeLastLoop > maxEventTimeStall)
        {
            std::cout << "ERROR: event thread stalled too long" << std::endl;
        }

        int currentSyncCounter = SDL_AtomicGet(&syncCounter);
        if (lastSyncCounter != currentSyncCounter)
        {
            lastSyncCounter = currentSyncCounter;
            if ((currentTime - lastSyncCounterTime) > (1.0f / screenRefreshRate * 1.5f))
            {
                // if the time since the last frame is more than 1.5 times
                // what it should be, a vsync was missed
                std::cout << "ERROR: frame was too late to make its vsync" << std::endl;
            }
            lastSyncCounterTime = currentTime;
        }
        // work with some input
    }
}

int main()
{
    // enable vsync, create the window/renderer, etc.
    SDL_CreateThread(eventThread, "events", NULL);
    while (true)
    {
        clearScreen();
        if ((SDL_AtomicGet(&syncCounter) >= 1100) &&
            (SDL_AtomicGet(&syncCounter) < 1150))
        {
            // draw some cat images at frames 1100-1149
        }
        Swap();
        SDL_AtomicIncRef(&syncCounter);
    }
}

@ Frederik vom Hofe :

wow ! This is awesome !!!

I still have to read it more thoroughly (just got home after a long hard workday so my brain is a bit fuzzy) but it helps me a lot !

thank you ! thank you !

as a silly side-note : I know how multithreading works, but I wonder HOW fast thread switches happen on a single-core CPU… what is the average time slice assigned to each thread ? 5 msecs ? 1 msec ? less than a msec ?

Thread switches happen hundreds of times a second anyway, in the form of hardware IRQs.

The only thing that may cause issues on a single-core CPU is the fact that a polling thread would use 100% CPU time. Such a thread has higher chances of getting its CPU time taken away by the OS for a longer time frame. A simple SDL_Delay(1) in the event thread could fix that without introducing too much lag.

@ Frederik vom Hofe :

am I right if I summarize your code as follows :

work with 2 threads :

  • the main thread, which ONLY does screen updates, and does each of these with VSYNC turned on. This means that on a 100Hz monitor, the thread will sit idle for 9.9 msecs and then blit some stuff on screen. After each swap, a (mutexed) counter is incremented to keep track of the number of frames that have passed. Comparing this value with the previous value can detect OS-generated lag greater than one refresh.
  • a secondary worker thread which polls the clock and checks for device input every millisecond. By looking at the clock, and comparing that time with the previous looked-at-clock-time, the thread can determine when the OS has caused lag. This thread can also use the mutexed counter to launch actions at specific frame counts.

one question that pops up in my head is : can the worker thread modify textures that need to be blitted in the main thread ? My two threads would simply share an array of textures + coordinates (I call this the “blit queue”). But sometimes my textures change. Is SDL thread-safe in such a way ?
And further down the thread-safety train of thought : can the worker thread call SDL_PumpEvents to process key and mouse events as well ?

the only thing I have to work out is how to handle “static events”. These are events that occur at specific times (for instance at time 205, which is NOT a multiple of the refresh period). If such an event happens, some changes in the blit queue will be made, and the next Swap() call will display these changes. However, my framework can also start internal timers… I have to figure out a way to start these timers when the image is displayed… or I have to delay the static event until the next frame has passed… but neither of these is all that complicated.

thanks again ! You gave me whole new insights. The only thing I’m afraid of is that SDL is not thread safe…
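
On the thread-safety worry: SDL 2’s rendering calls are generally only safe on the thread that created the renderer, so the safer pattern is to share nothing but plain data. A minimal sketch of a mutex-protected blit queue (BlitCommand and the functions are hypothetical; all SDL_Texture work stays on the main thread):

Code:
#include "SDL.h"
#include <vector>

struct BlitCommand { int imageId; SDL_Rect dst; };   // plain data only

SDL_mutex *queueLock;                  // create with SDL_CreateMutex() at startup
std::vector<BlitCommand> blitQueue;
bool mustRedraw = false;

// Worker thread: enqueue a change, never touch SDL rendering itself.
void pushBlit(const BlitCommand &cmd)
{
    SDL_LockMutex(queueLock);
    blitQueue.push_back(cmd);
    mustRedraw = true;
    SDL_UnlockMutex(queueLock);
}

// Main thread, right before the swap: take the queued commands.
bool takeBlits(std::vector<BlitCommand> &out)
{
    SDL_LockMutex(queueLock);
    bool redraw = mustRedraw;
    out.swap(blitQueue);
    blitQueue.clear();
    mustRedraw = false;
    SDL_UnlockMutex(queueLock);
    return redraw;
}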

Hi Frederik,

I’ve been thinking about your sample code, but I’m still a bit puzzled : I made a little image to illustrate. The first line is the main thread, which does nothing except swapping images synced to the VRS. The 2nd line is the event thread, which polls devices and statically timed events. Whenever something is triggered in the event thread, it will simply update a (mutex-protected) array of bitmaps_to_be_drawn. The main thread will, right before the synced swap, check if that array was modified (a simple mutexed boolean must_redraw) and if so, redraw the backbuffer before swapping.

so far so good.

I also added some vertical lines to indicate a 100Hz refresh rate. The red curves indicate how the main thread “jumps” every time to the next swap.

Now imagine an event occurs in the event thread at time 13, so between the 2nd and 3rd swap. This event will update the bitmaps_to_be_drawn array and raise the must_redraw boolean. My problem is : WHEN WILL THE NEW SCREEN BE VISIBLE ?

Obviously, I want the update to become visible at the next retrace : at time 20. However, if I understand correctly, this will NOT BE THE CASE : at time 20, the main thread will wake up from the sleep it went into after the swap at time 10 !!! So at time 20, it will update the backbuffer and call Swap(vsync=true), which means that it will go back to sleep and update the monitor only at time 30 !!!

am I correct ?

what do you propose as a solution ?

SDL_Delay has no understanding of frames.

It delays in milliseconds (1 ms is 1/10th of a frame on a 100Hz display, 1/17th of a frame on a 60Hz display, etc.).

Nate Fries

jeroen clarysse wrote:

WHEN WILL THE NEW SCREEN BE VISIBLE ? […] what do you propose as a solution ?

That’s the cost of syncing to the screen. On a 100Hz screen, a single frame only has 10 ms to draw or it will miss the vsync. The default strategy is to draw as fast as possible and then idle until the vsync occurs. This means any visual change you make will take between 10 and 20 ms (on 100Hz) until you see it on screen.

Theoretically you could use SDL_Delay after the blocking swap function, but this just makes it very likely that you miss the next vsync. It also would not help a lot: e.g. by cutting the draw time in half you only have 5 ms to draw, and there are still 5-15 ms before the change appears on the screen.

But this is only a problem if you need direct screen changes after input. Otherwise, just use scripts that the render thread knows in advance (like 3 frames ahead), so it can show stuff at exact predefined frames/times.

As for SDL_Delay(1): the only point in using it was to make the event thread idle for the smallest possible amount of time, so it doesn’t use up 100% CPU but can still measure input timings exactly.
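
i.e. the polling thread body would look something like this (pollDevicesOnce being a hypothetical stand-in for reading and timestamping your devices):

Code:
#include "SDL.h"

void pollDevicesOnce();   // hypothetical: read inputs, timestamp with SDL_GetTicks()

// Event-thread body: poll, then yield ~1 ms so we don't burn 100% CPU.
// Actual sleep granularity is OS-dependent, often 1-2 ms.
int eventThread(void *ptr)
{
    (void)ptr;
    for (;;)
    {
        pollDevicesOnce();
        SDL_Delay(1);
    }
}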