How to get constant framerates without busywaits

This has been a nagging problem in a project I’m working on for quite some
time. First, let me give some background.

The program is an Atari 2600 emulator called Stella. The problem is that the
main game loop is set up as follows:

while(!quit)
{
    handleEvents()
    updateDisplay()

    Busy-wait until next frame should start
}

I’d like to change that last part to wait for the next frame to start, but
not use a busy-wait. I’ve tried using sleep, usleep, select, etc. but since
the resolution of the timer is 10ms under Linux, a wait is always “too long”.

For example, assume that I want to run at 60 fps (default). Assuming that
the frame and event handling takes 8 ms to complete, then I want to wait
another 8-9 ms (since 60 fps implies 16 ms/frame). Problem is that when I
try to wait 8 ms, I will actually have to wait 10 ms (at least). So the
effective framerate now becomes 50 fps, since the whole loop actually took
closer to 20 ms.
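To make the numbers concrete, here is roughly what the timing part looks like (simplified, with made-up names; emulateFrame() stands in for handleEvents() + updateDisplay(), and SDL_Delay() is just one of the waits I tried):

#include "SDL.h"

extern void emulateFrame(void);   /* handleEvents() + updateDisplay() */

void mainLoop(void)
{
    const Uint32 FRAME_MS = 1000 / 60;   /* 16 ms per frame wanted */
    int quit = 0;

    while (!quit) {
        Uint32 start = SDL_GetTicks();
        emulateFrame();                   /* takes maybe 1-8 ms */

        Uint32 used = SDL_GetTicks() - start;
        if (used < FRAME_MS)
            SDL_Delay(FRAME_MS - used);   /* asks for 8-15 ms, but the 10 ms
                                             timer rounds it up, so the whole
                                             loop ends up nearer 20 ms = 50 fps */
    }
}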

Is there any system-independent / timeslice-independent way to restructure
this game loop so that this won’t happen? I’m not looking for actual code
here, just an idea on how to proceed.

And before anyone makes the suggestion, yes, I must run at the specified rate
(60 vs 50). This is because the program is emulating a hardware device that
actually ran at NTSC rates (60 fps) :slight_smile:

If I asked this question before, I apologize. It’s just that I understand the
problem better now and can explain it more clearly. If this isn’t the
appropriate group, then suggestions on where to look are appreciated.

Thanks,
Steve Anthony

I forgot to mention in the previous email. I also tried nanosleep, but that
didn’t work either.

Also, the actual code takes much less than 8 ms to execute (more like 1-2
ms). The problem is that I can’t wait on 1 ms intervals, only 10 ms
intervals. So the emulator runs at 50 fps. If I try a higher fps, then it
stays running at 50fps, until I reach some crossover point in the time it
takes to render a frame, in which case it jumps to 100 fps.

The main problem is like I stated above: I need 1ms resolution, not 10 ms, or
alternatively, a better algorithm :slight_smile:

Thanks,
Steve

This has been a nagging problem in a project I’m working on for quite some
time. First, let me give some background.

The program is an Atari 2600 emulator called Stella. The problem is that the
main game loop is set up as follows:

while(!quit)
{
    handleEvents()
    updateDisplay()

    Busy-wait until next frame should start
}

I’d like to change that last part to wait for the next frame to start, but
not use a busy-wait. I’ve tried using sleep, usleep, select, etc. but since
the resolution of the timer is 10ms under Linux, a wait is always “too long”.

For example, assume that I want to run at 60 fps (default). Assuming that
the frame and event handling takes 8 ms to complete, then I want to wait
another 8-9 ms (since 60 fps implies 16 ms/frame). Problem is that when I
try to wait 8 ms, I will actually have to wait 10 ms (at least). So the
effective framerate now becomes 50 fps, since the whole loop actually took
closer to 20 ms.

Assuming the client OS cannot guarantee you a realtime 60 fps screen mode, how do you know your updateDisplay() call itself will not wait for
the next realtime frame? Because if it does, you’d get irregular realtime framerates. Of course, this only happens if your handleEvents() takes
longer than 1/60th of a second…

Is there any system-independent / timeslice-independent way to restructure
this game loop so that this won’t happen? I’m not looking for actual code
here, just an idea on how to proceed.

And before anyone makes the suggestion, yes, I must run at the specified rate
(60 vs 50). This is because the program is emulating a hardware device that
actually ran at NTSC rates (60 fps) :slight_smile:

I may be using a strange solution here, and it would be interesting to see anyone point out faults in it, but I used an SDL_Thread solution,
using semaphores to synchronize. Here’s my solution:

while (!quit)
{
    if (60fpsTimeElapsed)
    {
        HandleEvents()
        DrawGraphics()
    }
    if (!WaitingSemaphoreActive) ActivateSemaphore()
}

Then in my flipthread:
{
    WaitForSemaphore()
    SDL_Flip()
}

or something like that. It’s pseudo-code, of course. So, the waiting for the flip() of the screen is done in the separate flip thread, untying the main
loop from that wait. Of course, this still forces you to draw the whole screen an unnecessary number of times; perhaps you could work around
that, but it will allow your “60fpsTimeElapsed” routine to run at a more consistent rate than before.
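For reference, the same idea spelled out with SDL 1.2’s thread and semaphore primitives might look roughly like this (an untested sketch; time_for_next_frame(), HandleEvents() and DrawGraphics() are placeholders for your own code):

#include "SDL.h"
#include "SDL_thread.h"
#include "SDL_mutex.h"

extern int  time_for_next_frame(void);   /* the "60fpsTimeElapsed" check */
extern void HandleEvents(void);
extern void DrawGraphics(void);

static SDL_Surface *screen;              /* set up elsewhere with SDL_SetVideoMode() */
static SDL_sem *flip_sem;
static volatile int quit = 0;

static int flip_thread(void *unused)
{
    (void)unused;
    while (!quit) {
        SDL_SemWait(flip_sem);           /* sleep until a frame is ready    */
        if (!quit)
            SDL_Flip(screen);            /* this call may block on the flip */
    }
    return 0;
}

void main_loop(void)
{
    flip_sem = SDL_CreateSemaphore(0);
    SDL_Thread *flipper = SDL_CreateThread(flip_thread, NULL);

    while (!quit) {
        if (time_for_next_frame()) {
            HandleEvents();
            DrawGraphics();
            if (SDL_SemValue(flip_sem) == 0)   /* "if (!WaitingSemaphoreActive)" */
                SDL_SemPost(flip_sem);         /* "ActivateSemaphore()"          */
        }
    }
    SDL_SemPost(flip_sem);               /* let the flip thread wake up and exit */
    SDL_WaitThread(flipper, NULL);
    SDL_DestroySemaphore(flip_sem);
}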

[…]

Is there any system-independent / timeslice-independent way to
restructure this game loop so that this won’t happen. I’m not looking
for actual code here, just an idea on how to proceed.

No totally portable way, but many OSes have some form of “multimedia
timers”, and some (like Win32) will actually run a h/w timer at the
specified rate, so you can have your application sleep on the timers.
Win32 multimedia timers have a resolution of 1 ms at best.
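For example, a minimal sketch of sleeping on such a timer, using the documented winmm calls (a hypothetical example, not code from any of the programs discussed here):

#include <windows.h>
#include <mmsystem.h>                    /* link with winmm.lib */

int main(void)
{
    HANDLE tick = CreateEvent(NULL, FALSE, FALSE, NULL);   /* auto-reset event */

    timeBeginPeriod(1);                  /* ask for 1 ms timer resolution  */
    MMRESULT timer = timeSetEvent(
        17, 1,                           /* ~17 ms period, 1 ms resolution */
        (LPTIMECALLBACK)tick, 0,
        TIME_PERIODIC | TIME_CALLBACK_EVENT_SET);

    for (int i = 0; i < 600; i++) {      /* ten seconds worth of "frames"  */
        WaitForSingleObject(tick, INFINITE);   /* sleep until the timer fires */
        /* run one emulated frame here */
    }

    timeKillEvent(timer);
    timeEndPeriod(1);
    CloseHandle(tick);
    return 0;
}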

On Linux (not sure about other Un*x-like systems) there is a Real Time
Clock device driver that you may use. It can be programmed to power-of-2
rates, and you can block on reads from the device, to have your thread
"clocked" by the timer.

Either way, I still think you’re on the wrong track… :slight_smile:

And before anyone makes the suggestion, yes, I must run at the
specified rate (60 vs 50). This is because the program is emulating a
hardware device that actually ran at NTSC rates (60 fps) :slight_smile:

Well, Kobo Deluxe is in fact running at exactly 33.333 fps. The fact that
the graphics engine may be running at a significantly higher (or lower)
fps is another story.

There are no timers in there, just one thread, and no busy-waiting.

I’m just checking SDL_GetTicks() once per rendered frame, to see how many
"logic" frames I should run to catch up with the time corresponding to
the next frame to render. (This may be anything from 0 times and up.)

The fact that the logic code doesn’t really run once every 30 ms doesn’t
matter, as it communicates with the outside world only once per rendered
frame. (On low end machines, one could consider increasing the user input
testing rate, but the same principles still apply.)
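In outline, it is something like this (my own simplified sketch, not the actual Kobo Deluxe code; run_logic_frame() and render_frame() stand in for the real calls):

#include "SDL.h"

extern void run_logic_frame(void);   /* one fixed 30 ms logic step */
extern void render_frame(void);      /* draw + flip                */

void game_loop(void)
{
    const Uint32 LOGIC_MS = 30;              /* 33.333 fps logic rate    */
    Uint32 logic_time = SDL_GetTicks();

    for (;;) {
        Uint32 now = SDL_GetTicks();
        while ((Sint32)(now - logic_time) >= (Sint32)LOGIC_MS) {
            run_logic_frame();               /* runs 0..N times per      */
            logic_time += LOGIC_MS;          /* rendered frame           */
        }
        render_frame();                      /* paced by the flip        */
    }
}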

//David Olofson — Programmer, Reologica Instruments AB

On Thursday 14 March 2002 16:51, Stephen Anthony wrote:

I forgot to mention in the previous email. I also tried nanosleep, but
that didn’t work either.

Also, the actual code takes much less than 8 ms to execute (more like
1-2 ms). The problem is that I can’t wait on 1 ms intervals, only 10
ms intervals. So the emulator runs at 50 fps. If I try a higher fps,
then it stays running at 50fps, until I reach some crossover point in
the time it takes to render a frame, in which case it jumps to 100 fps.

How about “busy-waiting” with a sched_yield() in the loop? It’s not as
efficient as blocking on a retrace IRQ (you can always dream, can’t you?)
or something, but it kind of allows you to busy-wait without hogging the
CPU, in the eyes of the scheduler. (That is, the scheduler is less likely
to consider you a CPU hog and lower your priority, as it will if you’re
just busy-waiting.)

BTW, sched_yield() is a pthreads call. Don’t know what it’s called on
non-POSIX platforms, but I’d be surprised to see a multitasking OS that
doesn’t have a corresponding call. (What it does is basically “call” the
scheduler to see if any other thread has some work to do. If you’re the
only runnable thread, the call just returns.)
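As a sketch, the waiting part could look like this (sched_yield() is declared in <sched.h>; SDL_GetTicks() is used here just for the deadline):

#include <sched.h>
#include "SDL.h"

void wait_until(Uint32 deadline)
{
    while ((Sint32)(SDL_GetTicks() - deadline) < 0)
        sched_yield();   /* returns immediately if nothing else is runnable */
}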

The main problem is like I stated above: I need 1ms resolution, not 10
ms, or alternatively, a better algorithm :slight_smile:

I think you need a different design… :wink:

//David Olofson — Programmer, Reologica Instruments AB

On Thursday 14 March 2002 17:49, Stephen Anthony wrote:

BTW, sched_yield() is a pthreads call. Don’t know what it’s called on
non-POSIX platforms, but I’d be surprised to see a multitasking OS that
doesn’t have a corresponding call. (What it does is basically “call” the
scheduler to see if any other thread has some work to do. If you’re the
only runnable thread, the call just returns.)

I think people use Sleep(0) on Win32 to achieve a similar effect.

From: david.olofson@reologica.se (David Olofson)
Sent: Thursday, March 14, 2002 7:13 PM


Kylotan

BTW, sched_yield() is a pthreads call. Don’t know what it’s called on
non-POSIX platforms, but I’d be surprised to see a multitasking OS
that doesn’t have a corresponding call. (What it does is basically “call”
the scheduler to see if any other thread has some work to do. If you’re
the only runnable thread, the call just returns.)

I think people use Sleep(0) on Win32 to achieve a similar effect.

Ah! Now that you say it… :slight_smile:

BTW, yes, I did discover that SDL_Delay() does not work this way - at
least not on Linux. Maybe it does on Windows…?

*looks at the source*

Looking at the source, it should, provided you’re right about Sleep(0).

*looks at the Linux version*

Well, there is some #ifdef’ed yield call in there, but if that isn’t
compiled in by default, it seems to me that SDL_Delay(0) will effectively
do nothing - which is very different from yielding!

How about throwing in

if(!ms)
{
	pthread_yield();
	return;
}

before the while loop, to make it behave like on Win32 - or do we still
have Linux systems without pthreads (or rather, pthread_yield()) around?

//David Olofson — Programmer, Reologica Instruments AB

On Thursday 14 March 2002 20:34, Kylotan wrote:

From: David Olofson <david.olofson at reologica.se>
Sent: Thursday, March 14, 2002 7:13 PM

[snipped]

Either way, I still think you’re on the wrong track… :slight_smile:

Yes, this is what I figured. What I’m trying to find out is how to do it
correctly :slight_smile:

And before anyone makes the suggestion, yes, I must run at the
specified rate (60 vs 50). This is because the program is emulating
a hardware device that actually ran at NTSC rates (60 fps) :slight_smile:

Well, Kobo Deluxe is in fact running at exactly 33.333 fps. The fact
that the graphics engine may be running at a significantly higher (or
lower) fps is another story.

There are no timers in there, just one thread, and no busy-waiting.

But does it hog the CPU? If it does, then that solution is no better
than what I have. Note that my problem isn’t that the emulator can’t
render the frames, just that it’s wasting CPU during the idle times.

I’m just checking SDL_GetTicks() once per rendered frame, to see how
many “logic” frames I should run to catch up with the time
corresponding to the next frame to render. (This may be anything from 0
times and up.)

The fact that the logic code doesn’t really run once every 30 ms
doesn’t matter, as it communicates with the outside world only once per
rendered frame. (On low end machines, one could consider increasing the
user input testing rate, but the same principles still apply.)

Yes, but in the emulator, you do a call to the emulator core every frame
(mediaSource->update()) and then put that to the screen. The real Atari
ran at 60 fps, or more to the point, it did this update 60 times per
second. So the reason I want to run at 60 fps is because the logic has
to be updated 60 times per second.

Maybe I should have been more clear. The fact that the framerate is 60
fps is indirectly related to the fact that the logic has to be updated
60 times per second.

If you can show me a way of doing both of these and not hog the CPU, then
I will be extremely happy :slight_smile:

So I can change the question slightly. How do I call logic code 60 times
a second and not make it busywait until the next time the logic code
should be called? :slight_smile:

Thanks,
Steve

On March 14, 2002 03:20 pm, you wrote:

[…]

There are no timers in there, just one thread, and no busy-waiting.

But does it hog the CPU?

It does, unless the video driver implements retrace sync’ed flips, as
they do on DirectX/Win32. On Linux (except in OpenGL mode with some
drivers), it will just run as fast as it can.

If it does, then that solution is no better
than what I have. Note that my problem isn’t that the emulator can’t
render the frames, just that it’s wasting CPU during the idle times.

There’s not much to do about this. I can only think of two ways to reduce
the amount of CPU hogging:

1) Keep track of your average effective rendering + "flipping"
   time, and subtract that from the total 60 Hz period, to get
   an approximate "delay time". If that's greater than 10 ms,
   SDL_Delay(10) - if not, busy-wait... Of course, it's quite
   likely that you'll actually be sleeping for more than 10 ms
   every time you call SDL_Delay(10). :-/

2) Throw a pthread_yield() into the busy-waiting loop, to give
   the scheduler the impression that you're not a *real* CPU
   hog. This should improve your chances of actually being
   allowed to use the CPU when you need it. (Hogging lowers
   your priority, so that *any* other thread that decides to
   do something can take the CPU away from you for several
   jiffies!)
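For illustration, a sketch of a wait that combines the two (a hypothetical helper, just to show the idea):

#include <sched.h>
#include "SDL.h"

void wait_for_frame(Uint32 deadline)
{
    for (;;) {
        Sint32 left = (Sint32)(deadline - SDL_GetTicks());
        if (left <= 0)
            break;
        if (left > 12)
            SDL_Delay(10);      /* coarse sleep; may overshoot by ~10 ms */
        else
            sched_yield();      /* fine-grained "polite" busy-wait       */
    }
}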

(BTW, IMHO, it would be nice if SDL_Delay(0) would consistently perform
the operation corresponding to pthread_yield() on all platforms where
it’s relevant and possible.)

The good news is that I think I’ve figured out a relatively nice
(non-busy-waiting) way of implementing retrace sync on Linux, and other
Un*x like systems that might need it. You’ll get a daemon that runs a PLL
in sync with the retrace (a very low priority thread that should
basically replace the kernel idle thread), and a shared library with
calls that implement “where is the raster beam now?” and blocking retrace
sync. There will be no polling, and it can be implemented on anything
that supports testing for retrace - no interrupts needed.

Hopefully, this will improve the chances of Linux eventually being able
to achieve the same animation quality as DirectX has provided ever since
it was introduced.

I’m just checking SDL_GetTicks() once per rendered frame, to see how
many “logic” frames I should run to catch up with the time
corresponding to the next frame to render. (This may be anything from
0 times and up.)

The fact that the logic code doesn’t really run once every 30 ms
doesn’t matter, as it communicates with the outside world only once
per rendered frame. (On low end machines, one could consider
increasing the user input testing rate, but the same principles still
apply.)

Yes, but in the emulator, you do a call to the emulator core every
frame (mediaSource->update()) and then put that to the screen. The
real Atari ran at 60 fps, or more to the point, it did this update 60
times per second. So the reason I want to run at 60 fps is because the
logic has to be updated 60 times per second.

Maybe I should have been more clear. The fact that the framerate is 60
fps is indirectly related to the fact that the logic has to be
updated 60 times per second.

Of course, but that’s not what I’m talking about. There’s a very big
difference between running something exactly once every 60th of a
second, and running it at an average rate of 60 Hz.

The point is that you only need to hit the right video frame - not the
right microsecond, or even the right millisecond. Big difference when
you’re on a normal “general purpose” OS without high resolution timers.

If you can show me a way of doing both of these and not hog the CPU,
then I will be extremely happy :slight_smile:

So I can change the question slightly. How do I call logic code 60
times a second and not make it busywait until the next time the logic
code should be called? :slight_smile:

Either retrace sync or “yield looping” will do the trick, or it’s simply
not possible without high resolution timers - and Linux does not have any.

All right, there is the RTC device. Unfortunately, you’ll have to be
root to set it any higher than 64 Hz or so - and it supports only
power-of-two frequencies, so you’d have to use something higher for
sufficient accuracy… (Another problem that my retrace sync daemon would
"hide", as you won’t have to touch the RTC directly.)

//David Olofson — Programmer, Reologica Instruments AB

On Thursday 14 March 2002 23:09, Stephen Anthony wrote:

[snipped]

There’s not much to do about this. I can only think of two ways to
reduce the amount of CPU hogging:

  1. Keep track of your average effective rendering + "flipping"
    time, and subtract that from the total 60 Hz period, to get
    an approximate “delay time”. If that’s greater than 10 ms,
    SDL_Delay(10) - if not, busy-wait… Of course, it’s quite
    likely that you’ll actually be sleeping for more than 10 ms
    every time you call SDL_Delay(10). :-/

  2. Throw a pthread_yield() into the busy-waiting loop, to give
    the scheduler the impression that you’re not a real CPU
    hog. This should improve your chances of actually being
    allowed to use the CPU when you need it. (Hogging lowers
    your priority, so that any other thread that decides to
    do something can take the CPU away from you for several
    jiffies!)

(BTW, IMHO, it would be nice if SDL_Delay(0) would consistently perform
the operation corresponding to pthread_yield() on all platforms where
it’s relevant and possible.)

OK, I will look into this.

Maybe I should have been more clear. The fact that the framerate is
60 fps is indirectly related to the fact that the logic has to be
updated 60 times per second.

Of course, but that’s not what I’m talking about. There’s a very big
difference between running something exactly once every 60th of a
second, and running it at an average rate of 60 Hz.

The point is that you only need to hit the right video frame - not
the right microsecond, or even the right millisecond. Big difference
when you’re on a normal “general purpose” OS without high resolution
timers.

Yes, this is very true. I will have to think about it some more, since I
obviously haven’t got a firm grasp on it all yet.

So I can change the question slightly. How do I call logic code 60
times a second and not make it busywait until the next time the logic
code should be called? :slight_smile:

Either retrace sync or “yield looping” will do the trick, or it’s
simply not possible without high resolution timers - and Linux does not
have any.

All right, there is the RTC device. Unfortunately, you’ll have to be
root to set it any higher than 64 Hz or so - and it supports only
power-of-two frequencies, so you’d have to use something higher for
sufficient accuracy… (Another problem that my retrace sync daemon
would “hide”, as you won’t have to touch the RTC directly.)

I look forward to the changes you are proposing. It sounds like they
will solve my (and a lot of other people’s) problems.

Also, I’d like to make a point of saying thank you for all the help
you’ve given, both on this subject and on your glSDL stuff. It’s nice to
find someone who is knowledgeable in an area, but it’s even better when
they are willing to share that knowledge and help other people understand
how it all works.

Thanks for the help,
Steve

On March 14, 2002 07:32 pm, you wrote:

I think people use Sleep(0) on Win32 to achieve a similar effect.

Ah! Now that you say it… :slight_smile:

BTW, yes, I did discover that SDL_Delay() does not work this way - at
least not on Linux. Maybe it does on Windows…?

looks at the source

Looking at the source, it should, provided you’re right about
Sleep(0).

Mr MSDN says:

VOID Sleep(
    DWORD dwMilliseconds   // sleep time in milliseconds
);

Parameters:
dwMilliseconds
Specifies the time, in milliseconds, for which to suspend execution. A
value of zero causes the thread to relinquish the remainder of its time
slice to any other thread of equal priority that is ready to run. If
there are no other threads of equal priority ready to run, the function
returns immediately, and the thread continues execution. A value of
INFINITE causes an infinite delay.

So yeah, if there’s nothing else waiting, Sleep(0) costs you virtually
nothing, and sounds equivalent to sched_yield().

From: David Olofson <david.olofson at reologica.se>
Sent: Thursday, March 14, 2002 8:09 PM
Subject: Re: [SDL] How to get constant framerates without busywaits


Kylotan

Hi!
What about this way:
when starting the game
long started = SDL_GetTicks();
float timethen = (float)(started +
how_many_milliseconds_until_the_first_frame_display);

while (in_the_main_loop) {
    doEventHandling();
    doDrawingToBuffer();
    while ((float)SDL_GetTicks() < timethen)
        ;   /* busy-wait */
    doSwapBuffers();
    timethen += 16.0f + 2/3.0f; // 60 fps!
}

Ok, this is not what you asked. This is a busywait. But this one probably
is the most accurate one (I said probably; on my computer I normally get a
delay of about 2 ms between two calls of SDL_GetTicks(), so everything
looks pretty much smoother than with delays)
St0fF.

Thanks for the response. But this is what I have already. Actually, in
the code I use, I have substituted SDL_GetTicks with gettimeofday, which
leads to no delay.

But my problem is that I’d like to avoid the busy-wait altogether, while
also keeping smooth framerates. I will have to experiment and
see if delays will make the framerate noticeably less smooth. If it
does, then I’ll stick with a busywait. How it looks onscreen is more
important than CPU usage. I just thought I could have the best of both
worlds :slight_smile:

Thanks,
Steve

On March 15, 2002 09:45 pm, you wrote:

Hi!
What about this way:
when starting the game
long started = SDL_GetTicks();
float timethen = (float)(started +
how_many_milliseconds_until_the_first_frame_display);

while (in_the_main_loop) {
    doEventHandling();
    doDrawingToBuffer();
    while ((float)SDL_GetTicks() < timethen)
        ;   /* busy-wait */
    doSwapBuffers();
    timethen += 16.0f + 2/3.0f; // 60 fps!
}

Ok, this is not what you asked. This is a busywait. But this one
probably is the most accurate one (I said probably, on my computer I
normally get a Delay of somewhat at 2ms between two calls of
SDL_GetTicks(). so everything looks pretty much smoother than with
delays)
St0fF.

Why don’t you want a busy wait? Why is everybody concerned with their
CPU usage while running games?

I don’t think it is possible to get constant framerates without
busywaits. If you try any other method you are blocking on some
operating system controlled event. When you do that you are leaving it
up to the operating system to wake your thread again. Even a sleep for
2 ms is not guaranteed to return in 2 ms; all it guarantees is that
it will be at least 2 ms before it returns. You need a real-time
operating system to get guarantees on how long your thread will sleep;
that’s why real-time operating systems exist.

Stephen Anthony wrote:

This has been a nagging problem in a project I’m working on for quite some
time. First, let me give some background.

The program is an Atari 2600 emulator called Stella. The problem is that the
main game loop is set up as follows:

while(!quit)
{
    handleEvents()
    updateDisplay()

    Busy-wait until next frame should start
}

I’d like to change that last part to wait for the next frame to start, but
not use a busy-wait. I’ve tried using sleep, usleep, select, etc. but since
the resolution of the timer is 10ms under Linux, a wait is always “too long”.

For example, assume that I want to run at 60 fps (default). Assuming that
the frame and event handling takes 8 ms to complete, then I want to wait
another 8-9 ms (since 60 fps implies 16 ms/frame). Problem is that when I
try to wait 8 ms, I will actually have to wait 10 ms (at least). So the
effective framerate now becomes 50 fps, since the whole loop actually took
closer to 20 ms.

Is there any system-independent / timeslice-independent way to restructure
this game loop so that this won’t happen? I’m not looking for actual code
here, just an idea on how to proceed.

And before anyone makes the suggestion, yes, I must run at the specified rate
(60 vs 50). This is because the program is emulating a hardware device that
actually ran at NTSC rates (60 fps) :slight_smile:

If I asked this question before, I apologize. It’s just that I understand the
problem better now and can explain it more clearly. If this isn’t the
appropriate group, then suggestions on where to look are appreciated.

Thanks,
Steve Anthony

Why don’t you want a busy wait? Why is everybody concerned with their
CPU usage while running games?

I don’t think it is possible to get constant framerates without
busywaits. If you try any other method you are blocking on some
operating system controlled event. When you do that you are leaving it
up to the operating system to wake your thread again. Even a sleep for
2 ms is not guaranteed to return in 2 ms; all it guarantees is that
it will be at least 2 ms before it returns. You need a real-time
operating system to get guarantees on how long your thread will sleep;
that’s why real-time operating systems exist.

Well, there are a few reasons. One is that this is an emulator that needs
very little processing power. Why use up the entire CPU when less than 1% would
have been enough, even on a Pentium 100?

Also, what if it is ported to some portable device or something? More
CPU usage means more power consumption. Actually, this is also related
to current CPUs. More processing translates to more heat being
generated.

I guess the main reason is that I learned in CS that busywaits are
sloppy. They are, by definition, a waste of processor time. Besides,
I’ve seen other emulators that do it, so I know it can be done. Problem
is that by examining their code, I can’t figure out how they did it :frowning:
That’s why I was looking for a general algorithm, maybe something that
could help me understand how other people did it.

It’s a matter of pride, I guess. The non-busy-wait version would be much
more ‘elegant’. It may not be required, but it would be icing on the
cake. I come from an Amiga background, where you had to conserve every
resource you had. I can’t break free from that mentality, and honestly,
I’m not sure that I want to :slight_smile:

Steve

On March 17, 2002 07:04 pm, you wrote:

If you’re looking for an average of 30 fps with no busy-wait, how about a
calibrated time loop? SDL_Delay will probably have a resolution of anywhere
from 5 to 15 ms on a system, which means that at 60 Hz you can get some
jitter, but your framerate will stay 60 Hz overall.

long t0 = SDL_GetTicks();    // reference time
long t_ideal = t0;           // "virtual", ideal current time

while (1) {
    /* ... */
    /* do stuff */
    /* ... */

    /* Synchronize virtual and real times */

    t_ideal += 17;           // or 20 for 50Hz
    long t_now = SDL_GetTicks();

    if (t_now < t_ideal) {
        SDL_Delay(t_ideal - t_now); // we are ahead, so go to sleep
    } else {
        // do nothing!
        // we are lagging, so do not sleep and go on to the next
        // frame right away
    }
}

This will get you a very accurate average 17ms between frames (i.e. 100
frames will render over exactly 1.7 s). If you want to get exactly
16.6666666…ms as a period (60Hz) you can use an additional "fractional"
term as follows:

long t0 = SDL_GetTicks();    // reference time
long t_ideal = t0;           // "virtual", ideal current time
int f = 0;                   // fractional term

while (1) {
    /* ... */
    /* do stuff */
    /* ... */

    /* Synchronize virtual and real times */

    t_ideal += 16;

    if (++f >= 3) {          // fractional term: add the extra 2 ms every
        f = 0;               // 3rd frame, for a 16.667 ms average period
        t_ideal += 2;
    }

    long t_now = SDL_GetTicks();

    if (t_now < t_ideal) {
        SDL_Delay(t_ideal - t_now); // we are ahead, so go to sleep
    } else {
        // do nothing!
        // we are lagging, so do not sleep and go on to the next
        // frame right away
    }
}

Right, something like this is definitely the preferred method if you want
accurate timing and an accurate average frame rate. Not using some form of
"running" timer will cause per-frame errors to add up and affect the average
frame rate.

If accuracy is less important, but average frame rate still has to be
"exact", you could just throw some conditional SDL_Delay() into the inner
loop, but of course, you have to realize that on most targets, that delay
won’t be exactly 10 ms! Most probably, it will be somewhere in between 10
and 20 ms, depending on when you call SDL_Delay(). (If it could be
"exactly" 10 ms, life would be much easier - but if that was possible, so
would delays of any duration.)

That’s the point with SDL_Delay(0) (as it works on Win32, and IMHO,
should work on other platforms): you yield immediately, and either get
the CPU back right away, at the next IRQ-driven reschedule (every 10 ms
on most OSes), or when the last of the other runnable threads blocks
or yields.

That is, the delay would be “usually 10 ms at most” rather than “at least
10 ms”. Should make a big difference if your frame rate is higher than
some 50 Hz…

//David Olofson — Programmer, Reologica Instruments AB

On Saturday 16 March 2002 02:15, St0fF 64 wrote:

Hi!
What about this way:
when starting the game
long started = SDL_GetTicks();
float timethen = (float)(started +
how_many_milliseconds_until_the_first_frame_display);

while (in_the_main_loop) {
    doEventHandling();
    doDrawingToBuffer();
    while ((float)SDL_GetTicks() < timethen)
        ;   /* busy-wait */
    doSwapBuffers();
    timethen += 16.0f + 2/3.0f; // 60 fps!
}

Ok, this is not what you asked. This is a busywait. But this one
probably is the most accurate one (I said probably, on my computer I
normally get a Delay of somewhat at 2ms between two calls of
SDL_GetTicks(). so everything looks pretty much smoother than with
delays)
St0fF.

Why don’t you want a busy wait? Why is everybody concerned with their
CPU usage while running games?

Not the CPU usage per se, but it’s quite likely that the OS scheduler
will get “pissed off” and allow some other processes run for a while
every now and then. Not good at all, if you want to maintain a steady
framerate… :slight_smile:

I don’t think it is possible to get constant framerates without
busywaits.

Yes it is - if you’re using double buffering on a retrace synced target.
You can’t pick your frame rate, but you just have to deal with that.

If you try any other method you are blocking on some
operating system controlled event. When you do that you are leaving it
up to the operating system to wake your thread again. Even a sleep for
2 ms is not guaranteed to return in 2 ms; all it guarantees is that
it will be at least 2 ms before it returns. You need a real-time
operating system to get guarantees on how long your thread will sleep;
that’s why real-time operating systems exist.

Right, but IMHO, it’s much more interesting to sleep until there is
another buffer to render into. Maintaining a steady “internal” frame rate
is simply pointless, as you can’t force the screen refreshes anyway.

//David Olofson — Programmer, Reologica Instruments AB

On Sunday 17 March 2002 23:34, Adam Gates wrote:

Why don’t you want a busy wait? Why is everybody concerned with their
CPU usage while running games?

Not the CPU usage per se, but it’s quite likely that the OS scheduler
will get “pissed off” and allow some other processes to run for a while
every now and then. Not good at all, if you want to maintain a steady
framerate… :slight_smile:

I actually do not think this would be a problem on a Windows platform, since
I am even able to hang the whole OS when constantly reconnecting to IRC
from one of my self-made IRC clients… I guess you are talking about Linux
or something like that, since a ‘pissed off’ scheduler is obviously something
Windows does not support, or something like that.

I don’t think it is possible to get constant framerates without
busywaits.

Yes it is - if you’re using double buffering on a retrace synced target.
You can’t pick your frame rate, but you just have to deal with that.

He’s not saying he wants to pick his framerate, I think? Anyway, the constant
framerates you get by waiting for a synced target are not always as accurate as you
might think, since the implementation of this is very hardware/OS specific. Some
could use an IRQ and still have a delay of several milliseconds, which would
still keep your game from being 100% perfect.

If you try any other method you are blocking on some
operating system controlled event. When you do that you are leaving it
up to the operating system to wake your thread again. Even a sleep for
2 ms is not guaranteed to return in 2 ms; all it guarantees is that
it will be at least 2 ms before it returns. You need a real-time
operating system to get guarantees on how long your thread will sleep;
that’s why real-time operating systems exist.

Right, but IMHO, it’s much more interesting to sleep until there is
another buffer to render into. Maintaining a steady “internal” frame rate
is simply pointless, as you can’t force the screen refreshes anyway.

Ah, but you do not always want to draw a new frame for all you have
updated in your internal game-sprite positions, or whatever you want to do
at a fixed frame rate, do you? Perhaps you want to update the positions
of the enemies, for example, at an EXACT 60 fps, not at whatever the OS
allows you to do per sync. So, if your enemy positions are updated and
there is not a new buffer ready, just update them again at the next interval,
until there is a buffer ready. (Hmm… reminds me of a discussion some
days ago :)

On Sunday 17 March 2002 23:34, Adam Gates wrote:

[…]

I guess the main reason is that I learned in CS that busywaits are
sloppy. They are, by definition, a waste of processor time.

Yes indeed - but if you don’t have an OS that can do what you want, you
don’t really have a choice…

Besides,
I’ve seen other emulators that do it, so I know it can be done.
Problem is that by examining their code, I can’t figure out how they
did it :frowning:

Did you check how they’re doing audio and video output? I bet the answer
is in there…

Either they’re using double buffering with h/w pageflipping (on Windows)
and blocking on the retrace, or they’re “abusing” the audio card as a
timer, pretty much like many audio/MIDI applications do.

Note, however, that you probably have to use “shared DMA buffer mode” to
use audio for anything like the times you need, as just setting the
buffer size low enough would probably not even give you sound on Windows.
It works on Linux, though…
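Just to illustrate the audio-as-timer idea, here is a sketch of one way to do it with the SDL 1.2 audio callback (not necessarily how those emulators do it; the driver may round the buffer size, so check the obtained spec and expect some drift):

#include <string.h>
#include "SDL.h"
#include "SDL_mutex.h"

static SDL_sem *tick_sem;

static void audio_tick(void *userdata, Uint8 *stream, int len)
{
    memset(stream, 0, len);        /* silence -- we only want the timing */
    SDL_SemPost(tick_sem);         /* one "tick" per audio buffer        */
    (void)userdata;
}

int init_audio_clock(void)
{
    SDL_AudioSpec want, got;

    tick_sem = SDL_CreateSemaphore(0);

    memset(&want, 0, sizeof(want));
    want.freq     = 44100;
    want.format   = AUDIO_S16;
    want.channels = 1;
    want.samples  = 735;           /* 44100 / 60 -> one buffer per ~16.7 ms */
    want.callback = audio_tick;

    if (SDL_OpenAudio(&want, &got) < 0)
        return -1;                 /* no audio -> fall back to something else */
    SDL_PauseAudio(0);             /* start the callbacks */
    return 0;
}

/* Main loop: SDL_SemWait(tick_sem); then run one emulated frame. */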

Of course, they could also be using some timers. On Windows you can use
multimedia timers (up to 1 kHz; can wake up threads), and on Linux you’d
use /dev/rtc.

Thats why I was looking for a general algorithm, maybe
something that could help me understand how other people did it.

There isn’t one. If there isn’t anything that you can block on (timer,
retrace, whatever) that will wake you up at the right time, busy-waiting
is your only option. You could possibly improve the situation slightly
by throwing a sched_yield()/Sleep(0) into the loop, but that’s as nice as
it gets.

Its a matter of pride I guess. The non busy-wait version would be much
more ‘elegant’. It may not be required, but it would be icing on the
cake.

Yeah… :slight_smile:

I come from an Amiga background, where you had to conserve every
resource you had. I can’t break free from that mentality, and
honestly, I’m not sure that I want to :slight_smile:

Same here.:slight_smile:

Wasting resources to compensate for poor operating systems or drivers is
rarely a good idea, and usually gives worse results than would a proper
solution.

//David Olofson — Programmer, Reologica Instruments AB
