SDL_GL_SwapBuffer() + vsync randomly causes high CPU usage--or not

A single-threaded program I’m testing, whose main loop calls SDL_GL_SwapBuffer(), causes around 25-35% CPU usage according to the Windows Task Manager’s Performance tab.

If I force vsync in the NVIDIA driver, somewhat more than half the times I run the program the CPU usage is near 1%, as I expect and want it to be (throughout the run). The other times, though, it’s around 27% (one core near maxed out, throughout the run), and it’s this latter case that I can’t make any sense of; it’s a problem since I will be adding other threads and such waste is going to be an issue. There’s no difference in input between runs. Removing all other SDL_…() calls in the loop also makes no difference. Running other threads created with SDL makes no difference. The fact that I don’t get a consistent result has me baffled…

The only idea I had is that it could be a sampling artifact of the Task Manager’s measurement, but I don’t think that’s likely. I don’t even know how to begin troubleshooting here :(

Creature, as I wrote, “Removing all other SDL_…() calls in the loop also makes no difference.” I get the same behavior whether there’s an SDL_PollEvent() in the loop or not, so this cannot be the issue. Additionally, you seem to suggest that SDL_GL_SwapBuffer() is not blocking, but I’m quite sure it is, at least when vsync is on: when I turn vsync on, in at least 60-65% of runs it cuts CPU usage to near zero (the other 35-40% of runs are the pathological case I described of ~27% CPU usage, mostly on one core).

I think you’re experiencing a well-known ‘problem’ in SDL. I say
‘problem’ since it may or may not be considered a problem. To check for
events in SDL, you call SDL_PollEvent for example. SDL_GL_SwapBuffers
sends a command to OpenGL saying ‘hey, you can flip the buffers and
render now, I’ve finished issuing commands’. In the meantime your
program continues, and SDL_PollEvent hogs up every available
millisecond checking for events; the more actions you put in your main
loop, the less time the PollEvent function will have to check.

Some consider it a ‘problem’, while others say “You’re most likely
running a game anyway, the user isn’t doing anything in the meantime,
so why not direct the processor to the game?”. Sometimes putting a
small SDL_Delay at the end of the game loop (2-5 milliseconds or so)
causes the processor usage to remain stable, but this might create an
undesired effect.
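
For illustration, a minimal sketch of that kind of loop (not code from this thread; Render() is a hypothetical placeholder):

```c
#include "SDL.h"

/* Minimal sketch: a classic SDL 1.2 loop that yields a little CPU each
 * frame with SDL_Delay(). Render() is a hypothetical placeholder. */
int main(int argc, char *argv[])
{
    int running = 1;
    (void)argc; (void)argv;

    SDL_Init(SDL_INIT_VIDEO);
    SDL_SetVideoMode(640, 480, 0, SDL_OPENGL);

    while (running) {
        SDL_Event ev;
        while (SDL_PollEvent(&ev))
            if (ev.type == SDL_QUIT)
                running = 0;

        /* Render();  -- draw the frame here */
        SDL_GL_SwapBuffers();

        SDL_Delay(3);   /* give up the rest of the timeslice (2-5 ms) */
    }
    SDL_Quit();
    return 0;
}
```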

I’m not sure what behaviour you are expecting.

Most games are glorified infinite loops. In the absence of other
tasks, such a loop will take up close to 100% of the processor. Why
wouldn’t it? The OS has no idea what calculations you are doing - you
could be raytracing for all it knows. Unless your program indicates
that it has no use for its timeslice (by way of something like
SDL_Delay()) or blocks (e.g. on an I/O operation, or in your case when
SwapBuffer() is made a blocking call) it will run as fast as it can.

If you can’t find something interesting to do with the time, use
SDL_Delay(). Note that for users with less powerful machines than
yourself this delay could cause problems, if they need every last
cycle to play the game at “normal” speed.

Try using a timed framerate loop. Decide what your target game framerate should be (this is not necessarily exactly the same as your display framerate, which is controlled by vsync since you have it on) and divide that into 1000 to find how many milliseconds each “game frame” should take.

You need a variable that stores the current time in milliseconds. At the end of each frame’s logic, get the time in milliseconds, compare it against the time when the next frame should start (the time that you started working on the frame plus the length of one frame), and sleep for the difference if the number is positive.
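
A rough sketch of such a timed loop (not code from this thread; TARGET_FPS and Render() are hypothetical):

```c
#include "SDL.h"

/* Rough sketch of a timed framerate loop: sleep away whatever is left
 * of each frame's time budget. TARGET_FPS and Render() are made up. */
#define TARGET_FPS 60
#define FRAME_MS   (1000 / TARGET_FPS)

void run_loop(void)
{
    Uint32 next_frame = SDL_GetTicks() + FRAME_MS;
    int running = 1;

    while (running) {
        SDL_Event ev;
        Uint32 now;

        while (SDL_PollEvent(&ev))
            if (ev.type == SDL_QUIT)
                running = 0;

        /* Render();  -- game logic and drawing for this frame */
        SDL_GL_SwapBuffers();

        now = SDL_GetTicks();
        if (next_frame > now)
            SDL_Delay(next_frame - now);  /* sleep away the leftover time */
        else
            next_frame = now;             /* fell behind: resynchronize */
        next_frame += FRAME_MS;
    }
}
```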

I’m sorry but I’m getting replies that have nothing to do with the issue. The issue is that a program that should behave in deterministic ways gives two completely different runtime behaviors with a certain probability of each occurring, without any change in input. One of the behaviors is correct, as expected, and the other one indicates that there is something wrong. In regards to using SDL_Delay(), that only applies to my multithreaded version and I am using that there, but it has nothing to do with the fact that the problem shows up in the single threaded version!

Again, the single-threaded version I run does very little computation and then blocks on SDL_GL_SwapBuffer() (waiting for vsync forced on in the driver). In about 60% of the runs it will stay (throughout the run) at near-zero CPU usage, according to Task Manager, and in the other roughly 40% of the runs it will stay (throughout the run) at around 27% CPU usage, with one of the cores near maxed out. Adding more threads that are not too busy (they have a small SDL_Delay()) makes no difference. Calling or not calling any other SDL functions (such as SDL_PollEvent()) in the main loop makes no difference. The expected behavior is obviously that it will be the first case all of the time. But it’s not. How can that be?

Hi,

I’ve seen cases where SwapBuffer does busy waiting.

Regards,

Uli


There are a number of possibilities. I’m going to give you my best
guess based on experience. I could be completely wrong.

My guess is that you have a chaotic process going on with two
attractors. The most probable attractor is what you call the expected
behavior. The less probable attractor is the bad behavior. The fact
that you get into one or the other with a consistent and measurable
probability tells me that that is what you are seeing, a chaotic
process with two distinct attractors.

So… what causes it? You most likely have a race condition in your
code. If one thread wins the race you get the correct behavior. If the
other thread wins you get a lot of waiting for locks and some sort of
thrashing that causes the bad behavior. My guess is that if the race
goes the wrong way something winds up using uninitialized memory
and/or using a bad pointer. The problem doesn’t happen if the race
goes the “correct” way. I saw the same kind of problem in some of my
early multi-threaded code back in the '70s.

How do you fix it? First run the code with a memory checker turned on
to see if you have the same pattern of memory usage in both behaviors.
If you do then either I am wrong or the memory checker is changing the
behavior. If the bad behavior goes away when you use the memory
checker then the checker is removing the race condition.

Second, hand check to make sure that everything is initialized before
it is used by any thread. Don’t do this check alone. You already know
you did it right (all programmers know they did it right, which is why
debugging is so hard :)) so doing it by yourself will not work.
Explain the code to someone else. Even if the someone else is a dog, a
cat, or a bird, you will get better results from a code inspection if
it is not done alone. Look carefully for race conditions. Get rid of
static variables and be very careful about all global values. If you
are using C++ be aware that global classes are initialized before
main() runs. That last one is just designed to lead to race
conditions. The cool singleton pattern can be a killer when used in a
multi-threaded program.

The following is not directed at you or anyone specifically, it is
directed at this weird idea that adding delays to code is a good idea:
If you have to add explicit SDL_Delay() calls to a piece of
multi-threaded code to get it to work correctly your code does not
work correctly and needs to be redesigned. Explicit delay calls are
only to be used to slow down a loop and/or control the frequency of a
process. If you have more than one delay in your whole program you are
doing something wrong. You never need to use delay and should be
suspicious of your design if you do find yourself using it.

Oh well, working with threads is a lot of fun once you get it figured
out. When I was first learning to work with threads I learned that
everything I thought I knew about threads was wrong. I have found
that to be true of everyone I have mentored through the process of
learning threads. It really helps to accept that you probably have no
understanding of threads. Once you do that then the main barrier to
learning threads is gone.

Bob Pendleton


Be careful making blanket statements like that. I was building a game
engine last year with a scripting API, and every script ran on its own
thread. There was a “wait” command in the script, plus a handful of special
effects that executed over a certain amount of time and had an option to
wait to move on to the next command until the current one was finished
executing. I implemented the first one directly with a call to Sleep(), and
the second by sending initialization information to the graphics engine
(on the main thread) and then sleeping. I don’t see how I could have
done it effectively without sleeping the script threads.

I am very careful about making blanket statements like that. It is a
weird thing: you can ask and ask and ask and never get an answer, but
claim absolute knowledge and people will work for days or even weeks
to prove you wrong. So, sometimes I make statements like that because
I want to find out if I am right. Other times I do it because I think
I know something and I can help other people learn.

If I am right, then someone else has a chance to learn something from
me. If they prove me wrong, then I learn something from them. In
either case, someone gets something they didn’t have before and I am
happy.

Having a wait/sleep command in scripts is handy, I have to admit that.
Just like using a single sleep in the main loop of an SDL program is
handy for limiting the frame rate. But, I allowed for that in my
blanket statement :) The trouble is that sleeping does not force the
threads you want to run to actually run. It only forces one thread to
sleep(). I can see that if you have a small number of scripts your way
works most of the time.

OTOH, I would use a very different method. I would use a thread pool
to store threads allocated to running scripts. You do not need a
thread for each script. No matter how many threads you have, you will
never have more threads running than you have processors. So, you can
limit yourself to one thread per processor and save some overhead.
(You might need more if you have other blocking operations in your
scripting system.) Put your threads in a thread pool and share them
between scripts. You can implement wait() by having it place a script
in a priority queue (a sorted queue, aka a heap). Its priority is the
time when it is supposed to wake up. A wait(0) should (if your queue
is correctly implemented) put a script at the end of the group of
scripts already scheduled for the same time, so it will just give up
control to the next ready-to-run script.

If no script is ready to run, use a timer to wake up a thread at the
time a script will be ready to run. If more than one script is ready
to run, let more threads out of the pool and have them run the other
ready scripts. When a thread finishes it should check to see if there
are other ready scripts and run them. If there are no ready scripts it
should go back to the pool. If it is the last active scripting thread
it should wait on a timer for the next waiting script. Or, if the
queue is empty, wait on the queue so it will awaken when a script is
placed in the queue. If you want to get fancy you can have the threads
do fine-grained scheduling (time slicing) on the ready scripts, which
is something you won’t necessarily get out of the OS.

A good OS does much the same thing for processes and threads that I
just described. Not all OSes are good… And most OSes have limits on
the number of threads you can have. The only limit on what I just
described is the amount of memory available to your program. Oh yeah,
the engine code winds up with no sleep() calls.
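
A rough sketch of the scheme described above (not code from this thread; Script, run_step(), MAX_SCRIPTS, and init_scheduler() are hypothetical names), using SDL 1.2’s mutex and condition primitives:

```c
#include <string.h>
#include "SDL.h"
#include "SDL_thread.h"

/* A script parked in the queue until its wake_time comes around. */
typedef struct Script {
    int (*run_step)(struct Script *self); /* returns ms to wait, or -1 when finished */
    Uint32 wake_time;                     /* SDL_GetTicks() value when it may run again */
} Script;

#define MAX_SCRIPTS 64
static Script *queue[MAX_SCRIPTS];
static int queue_len = 0;
static SDL_mutex *lock;
static SDL_cond *ready;

/* Insert a script, keeping the array sorted by wake_time (a simple priority queue). */
static void schedule(Script *s, Uint32 wake_time)
{
    int i;
    s->wake_time = wake_time;
    SDL_LockMutex(lock);
    i = queue_len++;
    while (i > 0 && queue[i - 1]->wake_time > wake_time) {
        queue[i] = queue[i - 1];
        --i;
    }
    queue[i] = s;
    SDL_CondSignal(ready);
    SDL_UnlockMutex(lock);
}

/* Pool worker: pop the earliest-due script, block (don't spin) until it is
 * due, run one step, and re-schedule it if it asked to wait again. */
static int worker(void *unused)
{
    (void)unused;
    for (;;) {
        Script *s;
        Uint32 now;
        int wait_ms;

        SDL_LockMutex(lock);
        while (queue_len == 0)
            SDL_CondWait(ready, lock);        /* nothing queued: sleep, no busy wait */
        s = queue[0];
        now = SDL_GetTicks();
        if (s->wake_time > now) {
            /* Earliest script not due yet: wait with a timeout so a newly
             * scheduled, earlier script can wake us sooner. */
            SDL_CondWaitTimeout(ready, lock, s->wake_time - now);
            SDL_UnlockMutex(lock);
            continue;
        }
        --queue_len;                          /* pop the head of the queue */
        memmove(&queue[0], &queue[1], queue_len * sizeof queue[0]);
        SDL_UnlockMutex(lock);

        wait_ms = s->run_step(s);             /* run until the script waits again */
        if (wait_ms >= 0)
            schedule(s, SDL_GetTicks() + (Uint32)wait_ms);
    }
    return 0;
}

/* One worker per processor, as suggested above. */
static void init_scheduler(int nworkers)
{
    lock = SDL_CreateMutex();
    ready = SDL_CreateCond();
    while (nworkers-- > 0)
        SDL_CreateThread(worker, NULL);
}
```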


Lets take on the problem of sending commands to the rendering thread.
This is a common problem in threaded code. You need to ask another
thread to do some work for you, and you can’t continue until it is
done. The easiest, and safest, way that I know of to solve the problem
is based on a queue and a semaphore. The requesting threads put their
requests on the “request” queue and then lock a semaphore. The
rendering thread waits on the “request” queue for rendering requests
and when it finishes the request it sends a reply by unlocking the
semaphore. This technique works if each requesting thread has its own
semaphore and includes it as part of the request.

You have to use a counting mutex like a semaphore to avoid race
conditions where the request gets finished before the requesting
thread waits for the results. By using queues you force the requesting
threads to wait until their work is done. If you sleep you don’t know
if the work is finished before you move on. If you use the sleep(0)
trick the rendering thread may never run. If you actually force your
threads to sleep for longer than 0 you may be wasting processor time
that you could have used.

Also, as more and more worker threads wait for results, the odds of
the rendering thread actually getting to do something get higher and
higher. When all the requesting threads are blocked, the rendering
thread is the only one left that can run, and so it will run. Using
sleep there is no way to be sure the rendering thread will ever run.
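
A rough sketch of that request-queue-plus-semaphore approach (again, not code from this thread; RenderRequest, render_execute(), and init_renderer() are hypothetical names), using SDL 1.2’s semaphores:

```c
#include "SDL.h"
#include "SDL_thread.h"

typedef struct RenderRequest {
    int what;                   /* hypothetical command id / payload */
    SDL_sem *done;              /* per-request semaphore, posted when finished */
    struct RenderRequest *next;
} RenderRequest;

static RenderRequest *head = NULL, *tail = NULL;
static SDL_mutex *qlock;
static SDL_sem *pending;        /* counts queued requests */

/* Called from any worker thread: enqueue the request, then block until the
 * rendering thread posts our semaphore. If the render thread finishes before
 * we reach SDL_SemWait(), the wait returns immediately -- no race, no sleep. */
void render_request_blocking(RenderRequest *r)
{
    r->done = SDL_CreateSemaphore(0);
    r->next = NULL;
    SDL_LockMutex(qlock);
    if (tail) tail->next = r; else head = r;
    tail = r;
    SDL_UnlockMutex(qlock);
    SDL_SemPost(pending);       /* wake the rendering thread */
    SDL_SemWait(r->done);       /* sleep until the work is actually done */
    SDL_DestroySemaphore(r->done);
}

/* Rendering thread main loop: wait for a request, execute it, reply.
 * (The OpenGL context has to be current on this thread.) */
int render_thread(void *unused)
{
    (void)unused;
    for (;;) {
        RenderRequest *r;
        SDL_SemWait(pending);   /* blocks until something is queued */
        SDL_LockMutex(qlock);
        r = head;
        head = r->next;
        if (!head) tail = NULL;
        SDL_UnlockMutex(qlock);

        /* render_execute(r);      do the actual rendering work here */
        SDL_SemPost(r->done);   /* tell the requester it is finished */
    }
    return 0;
}

void init_renderer(void)
{
    qlock = SDL_CreateMutex();
    pending = SDL_CreateSemaphore(0);
    SDL_CreateThread(render_thread, NULL);
}
```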

Bob Pendleton


Hi,

Just want to chime in that we have this issue with XBMC as well. It
has occurred both on Windows (ATI/NVidia cards) and on Linux (only
NVidia cards as far as I know). The only way to reduce the CPU usage
is doing manual sleeps before swapping buffers.

A separate issue: on my ATI card there is no vsyncing at all :/.
Anybody know if there is something odd with laptops when it comes to
vsync?

Joakim

Because LCDs don’t really have a retrace?


Then what do they have?

I’m pretty sure it’s a driver issue. Many (most?) of the Intel onboard
graphics cards don’t support vsync either in SDL in Linux, but do in
Windows. I suspect ATI is similar. In fact, I don’t think NVidia does
it from SDL either, but it can be enforced at the OS level.

LCDs in general certainly do support sync to vblank, as they still have a
refresh rate (ie, 60Hz). Otherwise, one wouldn’t be able to prevent
tearing in OpenGL.

Steve

Intel does support it with some trick, and ATI seems to work without
issues for us at least. NVidia has an ugly habit of busy waiting too,
but yeah, you can’t control it on NVidia using the swap interval
setting.
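
For reference, requesting a swap interval through SDL 1.2 itself looks roughly like this (a sketch, not code from this thread; SDL_GL_SWAP_CONTROL needs SDL 1.2.10 or newer, has to be set before SDL_SetVideoMode, and as noted here some drivers override or ignore it):

```c
#include "SDL.h"

/* Request a swap interval of 1 (vsync) through SDL 1.2. Whether the
 * driver honours it, or busy waits while honouring it, is another matter. */
int init_video(void)
{
    if (SDL_Init(SDL_INIT_VIDEO) < 0)
        return -1;
    SDL_GL_SetAttribute(SDL_GL_DOUBLEBUFFER, 1);
    SDL_GL_SetAttribute(SDL_GL_SWAP_CONTROL, 1);   /* swap interval = 1 */
    if (!SDL_SetVideoMode(640, 480, 0, SDL_OPENGL))
        return -1;
    return 0;
}
```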

Well, in my case OpenGL does sync to vblank in fullscreen, but not at
all in windowed mode. So it’s at least sort of workable, but it’s very
annoying.

The high CPU usage bug is quite annoying though. It seems to busy wait
when waiting for vsync to occur.

Joakim

Also usually an OS and/or driver issue. For Nvidia, fullscreen and
windowed do vsync in Linux KDE 3.x, Windows XP, and OSX. But Linux KDE
4.x and Windows Vista don’t do it in windowed mode. I suspect it’s
related to the compositing nature of those latter environments (although
that doesn’t explain OSX, which is also composited).

My only experience is with Nvidia, and I’ve never seen a busy-wait vsync
in Linux, OSX or Windows. I can’t comment on Intel or ATI in this
regard.

Steve

Right, forgot to mention. With Aero running, vsync in windowed mode
does seem to work. But I suspect that is not a feature of OpenGL;
rather, the compositor just doesn’t allow OpenGL to draw into its
display until vblank.

Well, the problem has been seen on ATI on both Windows and Linux, I
just noted. Here’s a report on the issue for Linux:
http://ati.cchtml.com/show_bug.cgi?id=1223

A technical note: LCDs do have a vblank of sorts - they receive a standard signal (whether VGA or DVI or HDMI) which has enforced hblank and vblank periods, often matching the timings used by CRTs, so
at the signal level there is a substantial vblank period, even if the LCD doesn’t directly care about it - additionally, the LCD does fade the pixels toward their new color as the data comes in, so
there is a refresh strobe passing over the screen, it just has a different meaning (triggering a fade to the new colors, rather than emitting the new colors). And some LCDs have an option to blank
between refreshes to improve the appearance of animations (by re-stimulating the eye so it notices the new frame, just like the refresh of a CRT). So overall, vblank is very much alive in video
cards, video signals, and LCD monitors.

As for busy waiting - I think this is highly driver dependent, and may depend on the tickrate of the kernel (the rate at which it considers switching to another process) - if the tickrate is
fine-grained, the libGL might detect that it can safely wait for nearly the right time and then just busywait briefly until the vblank begins.

Ideally, however, a device driver would use an interrupt that reawakens the sleeping thread(s) blocking on it, or something like that.

Of course it’s hard to tell how these things are done in the closed source drivers, and I haven’t looked at the open ones…

Busywaiting does not surprise me, overall, but I have not observed any busy waiting on vsync in the nvidia drivers.


Forest ‘LordHavoc’ Hale
Author of DarkPlaces Quake1 engine and mod
http://icculus.org/twilight/darkplaces/