How to get constant framerates without busywaits

Adam_Gates · March 18, 2002, 7:16pm

I agree with most of your points here.

I think the problem is that you want a guaranteed constant framerate. If
you relax this condition to an average framerate, it will be a lot
easier to code, and no one will know the difference anyway.

The timing loops people have posted here will achieve an average
framerate. The yielding or sleeping is only used to keep the loop in
approximate sync. How long your program actually sleeps is not controlled
by your program; it is controlled by the OS. You are only giving the OS a
hint that your program doesn’t need to do anything for a while. Sometimes
the OS will take too much time from your program, so you may not want to
sleep at all. In the end you will get your desired framerate, but only
as an average, not as a constant.

In your example:

For example, assume that I want to run at 60 fps (default). Assuming
that the frame and event handling takes 8 ms to complete, then I want to
wait another 8-9 ms (since 60 fps implies 16 ms/frame). Problem is that
when I try to wait 8 ms, I will actually have to wait 10 ms (at least).
So the effective framerate now becomes 50 fps, since the whole loop
actually took closer to 20 ms.

But say the next frame again takes 8 ms to render. Now you only need to
sleep for 6 ms. The idea is that you track your framerate against
real time and constantly adjust the amount of time you sleep for. At the
end of a frame you may also decide that the calculated sleep period is
too small to be worth it, and not sleep at all.
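
As a rough sketch of that bookkeeping (not code from anyone in this thread), the helper below works out how long to sleep after each frame by comparing the current time against an ideal schedule. The function name, its parameters and the `min_sleep_ms` cutoff are made up for illustration; in an SDL program the timestamps would presumably come from SDL_GetTicks() and the result would be passed to SDL_Delay().

```c
#include <assert.h>

/* Hypothetical helper: how long to sleep after finishing frame number
 * `frames_done` (1-based), given the time the loop started, the current
 * time and the target frame period, all in milliseconds.
 * Returns 0 when we are behind schedule, or when the remaining time is
 * below `min_sleep_ms` and a sleep would probably overshoot. */
long sleep_for_ms(long start_ms, long now_ms, long frames_done,
                  long frame_ms, long min_sleep_ms)
{
    assert(frame_ms > 0 && frames_done >= 0);

    /* Ideal point in time at which this frame should end. */
    long deadline = start_ms + frames_done * frame_ms;
    long remaining = deadline - now_ms;

    if (remaining < min_sleep_ms)
        return 0;   /* late, or not worth sleeping */
    return remaining;
}
```

With a 16 ms frame and an 8 ms render, the first call asks for 8 ms. If the OS then sleeps 10 ms instead, the second frame finishes at 26 ms against a 32 ms deadline, so the helper asks for 6 ms, which is the figure used in the example above.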

Stephen Anthony wrote:

On March 17, 2002 07:04 pm, you wrote:

Why don’t you want a busy wait? Why is everybody concerned with their
CPU usage while running games?

I don’t think it is possible to get constant framerates without
busywaits. If you try any other method you are blocking on some
operating system controlled event. When you do that you are leaving it
up to the operating system to wake your thread again. Even a call to
sleep for 2ms is not guaranteed to return in 2ms; all it guarantees is
that it will be at least 2ms before it returns. You need a real time
operating system to get guarantees on how long your thread will sleep;
that’s why real time operating systems exist.

Well, there are a few reasons. One is that this is an emulator that is
extremely low powered. Why use up the entire CPU when less than 1% would
have been enough, even on a Pentium 100?

Also, what if it is ported to some portable device or something? More
CPU usage means more power consumption. Actually, this is also related
to current CPUs. More processing translates to more heat being
generated.

I guess the main reason is that I learned in CS that busywaits are
sloppy. They are, by definition, a waste of processor time. Besides,
I’ve seen other emulators that do it, so I know it can be done. Problem
is that by examining their code, I can’t figure out how they did it.
That’s why I was looking for a general algorithm, maybe something that
could help me understand how other people did it.

It’s a matter of pride I guess. The non busy-wait version would be much
more ‘elegant’. It may not be required, but it would be icing on the
cake. I come from an Amiga background, where you had to conserve every
resource you had. I can’t break free from that mentality, and honestly,
I’m not sure that I want to.

Steve

David_Olofson · March 18, 2002, 7:45pm

Why don’t you want a busy wait? Why is everybody concerned with
their CPU usage while running games?

Not the CPU usage per se, but it’s quite likely that the OS scheduler
will get “pissed off” and let some other processes run for a while
every now and then. Not good at all, if you want to maintain a steady
framerate…

I actually do not think this would be a problem on a Windows platform,
since I am even able to hang the whole OS when constantly reconnecting
to IRC from one of my self-made IRC clients… I guess you are talking
about Linux or something like that, since a ‘pissed off’ scheduler is
obviously something Windows does not support, or something like that.

hehe Well, the real term would probably be “dynamic priority”, and it’s
a feature found in most timesharing operating systems. Win95/98/ME do not
implement this in any remotely useful way, AFAIK, but NT/2k/XP seem to
try at least. (They have to, if they’re supposed to work in servers…)

I don’t think it is possible to get constant framerates without
busywaits.

Yes it is - if you’re using double buffering on a retrace synced
target. You can’t pick your frame rate, but you just have to deal
with that.

He’s not saying he wants to pick his framerate, I think?

IIRC, the initial question was about an emulator that needs to run at an
exact rate - 60 fps, I think.

Here, he asks for a fixed rate, and sure, retrace sync will give you
that, if available - but of course, the rate you get is whatever refresh
rate the current mode happens to use…

Anyway, constant framerates obtained by waiting for a synced target are
not always as accurate as you might think, since the implementation is
very hardware/OS specific. Some could use an IRQ and still have a delay
of several milliseconds, so your game would still not be exactly 100%
perfect.

That’s to be expected - and it shouldn’t matter either, as long as you
have a chance to get your buffer ready in time for the next frame.

If you try any other method you are blocking on some
operating system controlled event. When you do that you are leaving
it up to the operating system to wake your thread again. Even a
call to sleep for 2ms is not guaranteed to return in 2ms; all it
guarantees is that it will be at least 2ms before it returns. You
need a real time operating system to get guarantees on how long your
thread will sleep; that’s why real time operating systems exist.

Right, but IMHO, it’s much more interesting to sleep until there is
another buffer to render into. Maintaining a steady “internal” frame
rate is simply pointless, as you can’t force the screen refreshes
anyway.

Ah, but you do not always want to draw a new frame for all you have
updated in your internal game-sprite-positions, or whatever you want to
do at a fixed frame rate, do you?

Right - but that only makes it more interesting to turn this fixed
rate into a “virtual” frame rate, and actually just “run the clock” to
the correct time right before rendering each output frame.

Perhaps you want to update the
positions of enemies, for example, at an EXACT 60 fps, not at
whatever the OS allows you to do per sync.

Sure, no problem. Kobo Deluxe does it exactly every 30 ms - but only in
the “virtual time space”. However, that doesn’t matter, as there is no
I/O between video frames anyway.

So, if your enemy positions
are updated and there is not a new buffer ready, just update them again
at the next interval, until there is a buffer ready. (hmm… reminds me
of a discussion some days ago)

Yep. I thought variants of the correct solution for this had been
discussed enough around here…

//David Olofson — Programmer, Reologica Instruments AB

.- M A I A -------------------------------------------------.
|      Multimedia Application Integration Architecture      |
| A Free/Open Source Plugin API for Professional Multimedia |
`----------------------------> http://www.linuxdj.com/maia -'
.- David Olofson -------------------------------------------.
| Audio Hacker - Open Source Advocate - Singer - Songwriter |
`-------------------------------------> http://olofson.net -'

On Tuesday 19 March 2002 02:51, Martijn Melenhorst wrote:

On Sunday 17 March 2002 23:34, Adam Gates wrote:

Vlad_Romascanu · March 19, 2002, 6:14am

Not the CPU usage per se, but it’s quite likely that the OS scheduler
will get “pissed off” and let some other processes run for a while
every now and then. Not good at all, if you want to maintain a steady
framerate…

Good point. On Win2k, if you abuse the scheduler (try to get fine
granularity via a lot of small-interval/0-interval sleeps), the scheduler
does indeed tend to get pissed off (initially it gives you 5ms granularity,
but after a few seconds of abuse it “demotes” your thread, even if it is set
to the “highest priority”, to 10ms granularity, which is perfectly
reasonable since you’re hogging the CPU with context switches).

I posted a message the other day containing an algorithm with a calibrated
sleep loop that will do the trick no matter what your scheduler thinks it’s
doing.

V.

Stephen_Anthony · March 19, 2002, 8:16am

I agree with most of your points here.

I think the problem is that you want a guaranteed constant framerate. If
you relax this condition to an average framerate, it will be a lot
easier to code, and no one will know the difference anyway.

Yes, after much head-scratching and talking back and forth with David
Olofson, this finally makes sense to me. Average framerate will suffice as
long as that average is taken over a small enough timeframe (say 0.5 - 1
second). Then any changes in instantaneous framerate would (hopefully) never
be noticed.

Thanks,
Steve

David_Olofson · March 19, 2002, 9:33am

Not the CPU usage per se, but it’s quite likely that the OS scheduler
will get “pissed off” and let some other processes run for a while
every now and then. Not good at all, if you want to maintain a steady
framerate…

Good point. On Win2k, if you abuse the scheduler (try to get fine
granularity via a lot of small-interval/0-interval sleeps), the
scheduler does indeed tend to get pissed off (initially it gives you
5ms granularity, but after a few seconds of abuse it “demotes” your
thread, even if it is set to the “highest priority”, to 10ms
granularity, which is perfectly reasonable since you’re hogging the CPU
with context switches).

Hmm… Sounds like a bug in the Win2k scheduler. The OS should not
penalize an application for trying to be “nice” rather than just
busy-waiting.

I posted a message the other day containing an algorithm with a
calibrated sleep loop that will do the trick no matter what your
scheduler thinks it’s doing.

Does it actually work for frame rates above 50 Hz?

On a system with correctly implemented timing, SDL_Delay() should wait
at least the time you specify - which in fact means it will round
upwards to the next “jiffy” (100 Hz).

So, if the frame rate is above 50 Hz, you can never sleep without
waking up too late more often than not.
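
To put numbers on that, here is a small model of the rounding. It is only an assumed model of a tick-driven sleep, not how SDL_Delay() or any particular kernel is actually implemented, and `wake_time_ms` is a made-up name.

```c
#include <assert.h>

/* Assumed model: a sleep that starts at now_ms can only end on a timer
 * tick ("jiffy") boundary, and never earlier than requested.
 * Returns the wake-up time in milliseconds. */
long wake_time_ms(long now_ms, long request_ms, long jiffy_ms)
{
    assert(jiffy_ms > 0 && now_ms >= 0 && request_ms >= 0);

    long earliest = now_ms + request_ms;

    /* Round up to the next multiple of jiffy_ms. */
    return (earliest + jiffy_ms - 1) / jiffy_ms * jiffy_ms;
}
```

Under this model, a frame that finishes its work at 8 ms and asks for 8 ms of sleep (to hit a 16 ms deadline) wakes at 20 ms with a 10 ms jiffy, which is 4 ms late.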

//David Olofson — Programmer, Reologica Instruments AB


David_Olofson · March 19, 2002, 9:57am

On Tuesday 19 March 2002 17:14, Stephen Anthony wrote:

Yes, after much head-scratching and talking back and forth with David
Olofson, this finally makes sense to me. Average framerate will
suffice as long as that average is taken over a small enough timeframe
(say 0.5 - 1 second). Then any changes in instantaneous framerate
would (hopefully) never be noticed.

Possible, but it’s actually even simpler than that.

1) You could, at least theoretically, measure the "exact"
   time of each flip. (If you can't get good timestamps
   for some reason, try filtering some...)

2) You can *calculate* the exact time of every
   logic/engine frame - using the same unit and reference
   time as used for the "flip timestamping" in 1).

Now, just compare the two streams of timestamps, and you’ll see that
there will be zero or more logic/engine frames between each two flip
timestamps. For example:

Flips   Frames
--------------
  0       0
 10
         17
 20
 30
         33
 40
 50      50
 60
         67

Here, we have a retrace sync’ed flip setup, running at a refresh rate of
100 Hz. Your engine is running at a “virtual” frame rate of 60 Hz, which
gives you 16.666… ms per frame.

The correct way to render this example would be:

Flip	Time	Engine frames
-----------------------------
1	0	1
2	10	1
3	20	0
4	30	1
5	40	0
6	50	1
7	60	0
8	70	1

//David Olofson — Programmer, Reologica Instruments AB


David_Olofson · March 19, 2002, 10:18am

Oh, BTW, has anyone actually tried using a “smart” filter for this, or
simply a traditional PLL, coarse tuned to the expected refresh rate?

The problem I’m thinking about avoiding here is the mess that a normal
filter makes when it faces a system that alternates between full and half
frame rate in some cycle. (It’s most commonly seen in triple buffering
setups, but it could happen anywhere, depending on what the game engine
is up to.)

//David Olofson — Programmer, Reologica Instruments AB


David_Olofson · March 19, 2002, 10:27am

[…lots of stuff…]

Simpler…?

Well, what I meant to explain was that it’s not an average you need, but:

The number of logic/engine frames per rendered frame.

//David Olofson — Programmer, Reologica Instruments AB


Stephen_Anthony · March 19, 2002, 10:38am

Possible, but it’s actually even simpler than that.

1) You could, at least theoretically, measure the “exact”
   time of each flip. (If you can’t get good timestamps
   for some reason, try filtering some…)

2) You can calculate the exact time of every
   logic/engine frame - using the same unit and reference
   time as used for the “flip timestamping” in 1).

OK, you lost me again here. What are you considering a flip to be?
Are you assuming double-buffering? Because right now I don’t use
double-buffering. Are you saying that this method will only work if the
hardware supports retrace sync’ed flips?

Now, just compare the two streams of timestamps, and you’ll see that
there will be zero or more logic/engine frames between each two flip
timestamps. For example:

Flips   Frames
--------------
  0       0
 10
         17
 20
 30
         33
 40
 50      50
 60
         67

Here, we have a retrace sync’ed flip setup, running at a refresh rate
of 100 Hz. Your engine is running at a “virtual” frame rate of 60 Hz,
which gives you 16.666… ms per frame.

The correct way to render this example would be:

Flip    Time    Engine frames
-----------------------------
1       0       1
2       10      1
3       20      0
4       30      1
5       40      0
6       50      1
7       60      0
8       70      1

How do I interpret this last table? What do the 0 and 1 in the “Engine
frames” column represent? The fact that you should use buffer 0 or
buffer 1? Or that you should not / should do the next logic frame?

Also, does this solve the problem of busy-waiting? Where / how would you
use a sleep call?

I’m sorry if I seem a bit dense here. I really want to understand how
this works. Maybe you could provide a simple example in pseudocode? I
sort of understand what you’re saying, I just don’t know how to go about
writing it.

Thanks,
Steve

Vlad_Romascanu · March 19, 2002, 11:10am

Good point. On Win2k, if you abuse the scheduler (try to get fine
granularity via a lot of small-interval/0-interval sleeps), the
scheduler does indeed tend to get pissed off (initially it gives you
5ms granularity, but after a few seconds of abuse it “demotes” your
thread, even if it is set to the “highest priority”, to 10ms
granularity, which is perfectly reasonable since you’re hogging the CPU
with context switches).

Hmm… Sounds like a bug in the Win2k scheduler. The OS should not
penalize an application for trying to be “nice” rather than just
busy-waiting.

It’s nice, but not nice enough because it’s trying to be too “interactive”
(i.e. do a few microseconds worth of work and then ask for a context switch
when the poor scheduler probably allocated a quantum ten or a hundred times
larger for it, so the scheduler pretty much just wasted two context
switches). Busy-waiting is definitely worse, but the scheduler doesn’t know
the difference between busy-wait and “legitimate” work, so busy-wait will
probably be treated pretty well by the scheduler. It all depends on the
assumptions and tradeoffs that the scheduler designer(s) had in mind.

I posted a message the other day containing an algorithm with a
calibrated sleep loop that will do the trick no matter what your
scheduler thinks it’s doing.

Does it actually work for frame rates above 50 Hz?

Of course. “On average” >50Hz, granted.

On a system with a correctly implemented timing, SDL_Delay() should wait
at least the time you specify - which in fact means it will round
upwards to the next “jiffy” (100 Hz).

So, if the frame rate is above 50 Hz, you can never sleep without
waking up too late more often than not.

Scenario: assuming that the scheduler can’t wake me up before 20ms no matter
what and I need a 60Hz rate, my pseudo-code will render one frame, sleep,
render two frames one after the other (this looks like a waste, but keeps the
average framerate perfect; one might want to optimize this to do only one
flip(), since the user won’t see the difference, while still calling the
frame-rate specific emulation code twice), sleep, render one frame, etc.

Here is the pseudo-code again:

long t0 = SDL_GetTicks(); // reference time
long t_ideal = t0; // "virtual", ideal current time

while (1) {
    /* ... */
    /* do stuff */
    /* ... */

    /* Synchronize virtual and real times */

    t_ideal += 17; // ~60 Hz (17 ms is about 58.8 fps); or 20 for 50 Hz
    long t_now = SDL_GetTicks();

    if (t_now < t_ideal) {
        SDL_Delay(t_ideal - t_now); // we are ahead, so go to sleep
    } else {
        // continue while() loop!
        // we are lagging, so do not sleep but go on to the next
        // frame right away
    }
}
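
For anyone who wants to convince themselves that the average holds up even with a coarse scheduler, the loop can be run against a fake clock. This is only a sketch under stated assumptions: each frame is assumed to cost a fixed `work_ms`, every sleep is assumed to last at least `sleep_gran_ms`, and `simulate_frames` is a made-up name that stands in for the SDL_GetTicks()/SDL_Delay() calls.

```c
#include <assert.h>

/* Simulate the calibrated sleep loop against a fake clock.
 * Assumptions: every frame costs work_ms of CPU time, and any sleep
 * lasts at least sleep_gran_ms (a crude model of a coarse scheduler).
 * Returns the number of frames started within run_ms milliseconds. */
long simulate_frames(long run_ms, long frame_ms, long work_ms,
                     long sleep_gran_ms)
{
    assert(frame_ms > 0 && work_ms > 0 && sleep_gran_ms >= 0);

    long now = 0;      /* fake real time */
    long t_ideal = 0;  /* virtual, ideal time */
    long frames = 0;

    while (now < run_ms) {
        now += work_ms;  /* "do stuff" */
        frames++;

        t_ideal += frame_ms;
        if (now < t_ideal) {
            /* Ahead of schedule: sleep, but the OS may oversleep. */
            long want = t_ideal - now;
            now += (want > sleep_gran_ms) ? want : sleep_gran_ms;
        }
        /* else: lagging, go straight on to the next frame */
    }
    return frames;
}
```

With 17 ms frames, 2 ms of work per frame and sleeps that never return in under 20 ms, this model still lands within a frame of the 58.8 frames you would expect in one second; it just gets there in the uneven "one frame, sleep, two frames" pattern described in the scenario.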

V.

David_Olofson · March 19, 2002, 11:27am

[…]

OK, you lost me again here. What are you considering a flip to be?
Are you assuming double-buffering? Because right now I don’t use
double-buffering. Are you saying that this method will only work if
the hardware supports retrace sync’ed flips?

A “flip” here can be whatever method you use to make a frame visible.
It will work better if you’re running at full frame rate with retrace
sync, but it’s not critical.

[…]

How do I interpret this last table? What does the 0 and 1 in the
engine frame represent? The fact that you should use buffer 0 or
buffer 1? Or that you should not / should do the next logic frame??

Sorry; it’s “Engine frames” - simply the number of logic/engine frames to
“run” before you render the output frame. You calculate this once per
output frame, after checking the current time.

Also, does this solve the problem of busy-waiting? Where / how would
you use a sleep call?

No. There’s still no way to do that, short of retrace sync properly
implemented in the driver.

I’m sorry if I seem a bit dense here. I really want to understand how
this works.

It seems trivial to me, but it’s still hard to explain - and I’m not
particularly good at describing anything right now, it seems… heh

Maybe you could provide a simple example in pseudocode? I
sort of understand what you’re saying, I just don’t know how to go
about writing it.

There are many ways to write it… Here’s some code from the Spitfire
Engine, used in Kobo Deluxe:

---8<-----------------------------------------------------------
void cs_engine_advance(cs_engine_t *e, float to_frame)
{
	if(to_frame > 0)
	{
		/* Number of whole logic frames between the old and new time */
		int frames = floor(to_frame) - floor(e->time);
		if(frames > 0)
		{
			while(frames--)
			{
				__run_all(e);
				e->on_frame(e);
			}
		}
	}
	e->time = to_frame;
	if(e->wx || e->wy)
		__wrap_all(e);
	/* Interpolate using the fractional part of the frame time */
	__update_points(e, to_frame - floor(to_frame));
}
----------------------------------------------------------->8---

__run_all() updates all “object” (sprites and “points” for scrolling etc)
positions, based on their velocity and acceleration values, evaluates
collisions and stuff. (None of this is used in Kobo Deluxe.)

on_frame() is your logic/engine callback - this is where you hook your
emulator frame() function in. (Actually, this is wrapped by the C++ API,
so you “hook the callback in” by deriving from gfxengine_t and throwing
in your own frame() method in your new class.)

__wrap_all() “fixes” object coordinates for wrapping levels, like those
in Kobo Deluxe.

__update_points() implements the interpolation; this is where the actual
graphics coordinates for all objects are calculated for each rendered
frame.
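
The body of __update_points() is not shown in this thread. One common way to do this kind of interpolation, given here purely as an assumed sketch with made-up names (`object_state`, `interpolate`), is a linear blend between an object's position at the previous logic frame and at the current one, weighted by the fractional frame time:

```c
#include <assert.h>

/* Hypothetical object state: position at the previous and at the
 * current logic frame. */
typedef struct {
    double prev_x, prev_y;
    double cur_x, cur_y;
} object_state;

/* Linear interpolation between the last two logic positions.
 * frac is the fractional part of the engine time, from 0 to 1.
 * The graphics coordinates are written to *gx and *gy. */
void interpolate(const object_state *o, double frac,
                 double *gx, double *gy)
{
    assert(frac >= 0.0 && frac <= 1.0);

    *gx = o->prev_x + (o->cur_x - o->prev_x) * frac;
    *gy = o->prev_y + (o->cur_y - o->prev_y) * frac;
}
```

An object that moved from (0, 0) to (10, 20) during the last logic frame would be drawn at (5, 10) when the render happens halfway to the next logic frame.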

---8<-----------------------------------------------------------
void gfxengine_t::run()
{
	open();
	show();
	start_engine();
	is_running = 1;
	while(is_running)
	{
		int tick = SDL_GetTicks() - start_tick;
		float toframe = (float)tick / ticks_per_frame;
		cs_engine_advance(csengine, toframe);
		pre_render();
		window->select();
		cs_engine_render(csengine);
		post_render();
		if(autoinvalidate)
			window->invalidate();
		flip();
	}
	stop_engine();
}
----------------------------------------------------------->8---

The first two lines in the loop are where I check the time, and translate
that into a fractional time, expressed in logic frames. That is, the
integer part is which frame I want to render, and the fractional part
says how close we are to the next logic frame.

Next, I call cs_engine_advance() (above), to run the whole game logic
until the right frame.
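
Stripped of the engine specifics, the time-to-frames step can be sketched in plain C with integer arithmetic. The names here (`logic_clock`, `frames_due`) are invented for the sketch and are not part of the Spitfire Engine; the fraction is returned in thousandths of a logic frame rather than as a float.

```c
#include <assert.h>

/* Hypothetical bookkeeping: logic frames executed so far. */
typedef struct {
    long frames_run;
} logic_clock;

/* Returns how many logic frames to run before rendering at tick_ms
 * (milliseconds since start) for a logic rate of logic_hz. Also stores
 * in *frac_permille how far we are into the current logic frame
 * (0..999), which a renderer could use for interpolation. */
long frames_due(logic_clock *c, long tick_ms, long logic_hz,
                long *frac_permille)
{
    assert(tick_ms >= 0 && logic_hz > 0);

    long long scaled = (long long)tick_ms * logic_hz; /* frames * 1000 */
    long target = (long)(scaled / 1000);  /* whole frames elapsed */
    long due = target - c->frames_run;

    if (due < 0)
        due = 0;  /* never run the clock backwards */
    c->frames_run += due;
    *frac_permille = (long)(scaled % 1000);
    return due;
}
```

Called once per output frame at 0, 10, 20, ... 70 ms with a 60 Hz logic rate, this version returns 0, 0, 1, 0, 1, 1, 0, 1. Exactly which flips get zero or one logic frame depends on the rounding convention and on whether a frame is run at time zero, but the long-run ratio is 60 logic frames per 100 flips either way.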

cs_engine_render() renders all objects (sprites) into the output
“window”, whereas pre_render() and post_render() are hooks for the game
to render stuff before and after the objects are rendered, respectively.
(Kobo Deluxe uses the first one for the background, and the second for
overlay text, any debugging stuff and finally, the frame with the rounded
corners.)

flip() can perform SDL_UpdateRects(), SDL_Flip() and other stuff,
depending on the selected engine “buffer mode”.

//David Olofson — Programmer, Reologica Instruments AB


David_Olofson · March 19, 2002, 11:38am

[…]

Hmm… Sounds like a bug in the Win2k scheduler. The OS should not
penalize an application for trying to be “nice” rather than just
busy-waiting.

It’s nice, but not nice enough because it’s trying to be too
“interactive” (i.e. do a few microseconds worth of work and then ask
for a context switch when the poor scheduler probably allocated a
quantum ten or a hundred times larger for it, so the scheduler pretty
much just wasted two context switches).

Haaah! That explains why you can’t do serious low-latency audio work on
NT/2k without serious hacks and kludges… (And not really even then,
actually.)

Busy-waiting is definitely worse,
but the scheduler doesn’t know the difference between busy-wait and
“legitimate” work, so busy-wait will probably be treated pretty well by
the scheduler. It all depends on the assumptions and tradeoffs that
the scheduler designer(s) had in mind.

Sometimes one wonders what they did have in mind…

Is this a special case, or are you always penalized for calling any
syscalls that could potentially block? (My experiences with audio on NT
indicate the latter…)

I posted a message the other day containing an algorithm with a
calibrated sleep loop that will do the trick no matter what your
scheduler thinks it’s doing.

Does it actually work for frame rates above 50 Hz?

Of course. “On average” >50Hz, granted.

With a stall of 10-20 ms every time it thinks it’s ok to sleep…

On a system with a correctly implemented timing, SDL_Delay() should wait
at least the time you specify - which in fact means it will round
upwards to the next “jiffy” (100 Hz).

So, if the frame rate is above 50 Hz, you can never sleep without
waking up too late more often than not.

Scenario: assuming that the scheduler can’t wake me up before 20ms no
matter what and I need 60Hz rate then my pseudo-code will render one
frame, sleep, render two frames one after the other (looks like a
waste, but keeps the average framerate perfect, although one might want
to optimize this to do only one flip() since the user won’t see the
difference yet call the frame-rate specific emulation code twice as
well), sleeps, renders one frame, etc.

Yeah, I get the idea.

//David Olofson — Programmer, Reologica Instruments AB


Vlad_Romascanu · March 19, 2002, 11:58am

Haaah! That explains why you can’t do serious low-latency audio work on
NT/2k without serious hacks and kludges… (And not really even then,
actually.)

Yeah. You’re either stuck with multi-media timers (very good ms resolution,
but there is stuff you’re not allowed to do from within the callback so you
may still have to offload work to another thread, which means the other
thread must wait on an event/queue, => back to the initial problem) or you
are stuck with writing kernel mode stuff (brrrr…)

Sometimes one wonders what they did have in mind…

Termites?!

Is this a special case, or are you always penalized for calling any
syscalls that could potentially block? (My experiences with audio on NT
indicate the latter…)

I guess there is some internal accounting of time you spend in your
timeslice vs. the allocated time and/or the overhead of context-switches.
So the scheduler probably says one of the following things: (1) you are
killing me with 100ns of context switch for 10ns of work so I’ll put you
aside [sensible intention, though this should only be looked at by the
scheduler if the CPU is >90% busy], or (2) you gave up your time-slice so
I’ll make you wait a couple of rounds since you did not need the rest of
your time-slice anyway [which is fine if you give it up once every now and
then, but is not OK if you always give it up instead of letting the
scheduler take it away from you].
But I’m not a M$ employee, don’t have access to their code, so can’t say for
sure what the actual reason behind this behaviour is. Number (2) would
explain your experience with audio on NT, though there may be other factors
involved in your experiments as well.

With a stall of 10-20 ms every time it thinks it’s ok to sleep…

Yep.

Cheers,
V.

Stephen_Anthony · March 19, 2002, 4:11pm

I’d like to give a big THANK YOU to both David Olofson and Vlad Romascanu
for their help on the above topic.

Vlad, the pseudocode you gave me for a calibrated time loop worked
perfectly. It is exactly what I needed here.

David, although I didn’t use your approach right now, I will keep it in
mind when I port the emulator to OpenGL, and hopefully, when the OpenGL
drivers finally implement retrace sync. The retrace sync is the best
option for smooth framerates, and I look forward to the time when it is
implemented in all drivers (or even X for that matter).

Again, thanks for all the help.
Steve