Smooth scrolling - There Is Hope :-)

Hi SDLers!

I just played around with a new retrace sync + flip trick, and I’m
getting very promising results, even though I’m currently running a
stock 2.4.1 kernel (no lowlatency), and there probably is more tuning
to do.

Smooth scrolling at full frame rate (320x240 @ 100 Hz), with no
tearing or flickering! :-)

Only occasional dropped frames due to someone hogging the CPU for
tens of ms every now and then… I have a two-component cure for that:

1. Apply lowlatency patch. (Yeah, even though I didn't use
   SCHED_FIFO or SCHED_RR with the lowlatency patch before,
   missed retrace syncs are *much* more common and extended
   without lowlatency!)

2. Use RTC, UTIME (patch), audio card or other *blocking*
   timing source instead of busy-waiting, to keep the
   scheduler from considering us CPU hogs.

Will try this later; RTC blocking timing first, as it requires only
stuff that’s been in mainstream Linux kernels for some time.

Anyway, what’s this trick!?

I set up a double buffered display without hardware pageflipping
(well, I can’t get h/w page flipping on all targets; that’s basically
the problem I’m addressing), but with a back buffer. That is, a
display where calling SDL_Flip() makes SDL do an SDL_UpdateRect(),
which does some blitting (h/w or s/w - doesn’t matter much as long as
there are no loooong command queues in the way).

Next, I throw in a modified version of my retrace sync patch for
Utah-GLX (the new one that uses RDTSC to deal with not hitting right
on every single retrace period - not yet released); modified so that
I can wait for any part of the screen. (Not that it’s very accurate;
only 128 points per frame, and then there’s the scheduler fckng
with us all the time…)

When this initializes, it’ll time a few video frames, and take the
shortest result, to get a very accurate measure of the actual
screen refresh rate. (Accurate enough that I couldn’t see any drift
in minutes when disabling the actual retrace sync.)
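
The calibration step described above could be sketched like this. It
is only an illustration, not the code from the patch: it assumes the
frame periods have already been measured (for instance as RDTSC deltas
between polled retraces) and converted to microseconds, and the names
shortest_period_us() and refresh_hz() are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Return the shortest of n measured frame periods (microseconds).
 * Hypothetical helper: the samples are assumed to come from timing
 * consecutive retraces; following the post, the shortest one is
 * taken as the estimate of the real frame period. */
double shortest_period_us(const double *samples, size_t n)
{
	size_t i;
	double best;

	assert(samples != NULL && n > 0);
	best = samples[0];
	for (i = 1; i < n; ++i)
		if (samples[i] < best)
			best = samples[i];
	return best;
}

/* Convert a frame period in microseconds to a refresh rate in Hz. */
double refresh_hz(double period_us)
{
	assert(period_us > 0.0);
	return 1000000.0 / period_us;
}
```

With samples such as 10450, 10002, 13100 and 10001 us, the two long
ones (presumably stretched by someone else grabbing the CPU) are
simply ignored and the estimate lands just under 100 Hz.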

Now, for the actual trick;

while(1)
{
	Run control systems and other stuff;

	Render the screen;

	Sync at the middle of the screen;
	SDL_UpdateRect(upper half of screen);

	Sync at the retrace;
	SDL_UpdateRect(lower half of screen);
}

Alternatively (to even out the CPU rendering load);

while(1)
{
	Run control systems and other stuff;

	Render the upper half of the screen;
	Sync at the middle of the screen;
	SDL_UpdateRect(upper half of screen);

	Render the lower half of the screen;
	Sync at the retrace;
	SDL_UpdateRect(lower half of screen);
}

There; the tearing is gone, and we don’t need h/w page flipping or ms
accurate hard real time scheduling for the game and driver. Just get
the back->front blits to finish within their respective half-screen,
and you’re safe. That’s 5 ms at 100 Hz, so even an OS without
"multimedia class firm real time" should make it most of the time,
provided we can keep the scheduler from penalizing us for CPU
hogging. (See above on how to do that.)

NOTE: This does NOT work as you would expect in windowed mode! To
do it in windowed mode, you have to consider the size and
position of your window in relation to the screen. I’ll probably
experiment some with that as well some time…

Of course, one could split the screen updating in more than two
strips to increase the jitter tolerance to more than half a frame.
The point is to blit when the retrace is outside the area you’re
blitting to, which gets easier the smaller that area is. For example,
splitting into four bands means that you have 75% of a frame (more
actually; retrace + border time is on your side) to do 25% of the
screen.
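
The arithmetic behind those numbers, as a small sketch. It assumes
equally sized horizontal bands and ignores retrace + border time
(which only adds margin); band_slack_ms() is a name invented for the
example.

```c
#include <assert.h>

/* Time (in ms) per frame during which the beam is outside one of
 * `bands` equally sized horizontal bands, i.e. the window available
 * for blitting that band without tearing. */
double band_slack_ms(double refresh_hz, int bands)
{
	double frame_ms;

	assert(refresh_hz > 0.0 && bands >= 1);
	frame_ms = 1000.0 / refresh_hz;
	return frame_ms * (bands - 1) / bands;
}
```

At 100 Hz this gives 5 ms for two bands and 7.5 ms for four, matching
the half-frame and 75% figures above.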

A third alternative to the main loops above would be to hack the
rendering up in a function that can render a fraction of a screen
(say, 16-32 strips), and then spend the non-retrace (ie “somewhere in
the middle of the screen”) polling time doing one strip at a time,
checking the raster counter emulation timer after each one.

Whenever the raster gets too close to an area that hasn’t yet been
updated, or you have too many strips “buffered up”, do the
SDL_UpdateRect()!

That way, you can render constantly at full speed until the whole
screen is done, and spend the rest of the time (if there is any!),
just blitting the strips to the screen in sufficiently small chunks
to avoid hitting the raster beam. Should improve stability a great
deal in cases where the CPU has to spend a significant amount of time
on the rendering.
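
A rough sketch of the "flush now?" decision in this third variant. All
names and parameters here are hypothetical, and the raster estimate
assumes the visible lines are spread evenly over the whole frame
period, which is a simplification (real modes have retrace and border
time).

```c
#include <assert.h>

/* Estimate which scan line the beam is on from the time elapsed
 * since the last observed retrace. */
int virtual_raster_line(double us_since_retrace, double period_us,
                        int lines)
{
	double phase;

	assert(period_us > 0.0 && lines > 0 && us_since_retrace >= 0.0);
	phase = us_since_retrace / period_us;
	phase -= (double)(long)phase;	/* wrap if a retrace was missed */
	return (int)(phase * lines);
}

/* Decide whether the rendered but not yet blitted strips should be
 * pushed to the screen now. first_pending_y is the top line of the
 * oldest pending strip. Returns 1 if too many strips are buffered up,
 * or if the beam is above the pending area and within margin_lines
 * of it; 0 otherwise. */
int should_flush(int raster_line, int first_pending_y,
                 int pending_strips, int margin_lines, int max_pending)
{
	int distance;

	if (pending_strips <= 0)
		return 0;
	if (pending_strips >= max_pending)
		return 1;
	distance = first_pending_y - raster_line;
	return distance >= 0 && distance <= margin_lines;
}
```

The render loop would call should_flush() after each strip and do the
SDL_UpdateRect() for the pending strips whenever it returns 1.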

As a fourth version, I’m getting really ambitious, and allowing all
this to be hidden behind the normal SDL API…;

Take the third approach above, and move the timing + back->front
blitting into a separate, high priority/real time thread, and use
RTC, UTIME, audio card (some have generic timers for sequencers, BTW
- don’t know if drivers support them nowadays), Win32 multimedia
timers or whatever you got to drive it in a constant rate, periodic
fashion.

Set up a FIFO or ring buffer style arrangement between the
application thread and this video update thread, so that SDL_Flip()
can just push buffers over. Preferably make room for two entire
frames or more, but not so much that you get noticeable control system
-> video latency. Two frames would correspond to triple buffering
with a normal h/w page flipping display setup. Have SDL_Flip() sleep
when this buffer gets full.
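
The hand-over between the two threads could be a small ring buffer
along these lines. This is a single-threaded sketch of the data
structure only: the frame_queue type and fq_*() functions are invented
for the example, and a real version shared between the application
thread and the video update thread would need a mutex/condition
variable (or atomics) around it.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_FRAMES 2	/* two queued frames, roughly triple buffering */

typedef struct {
	int slots[QUEUE_FRAMES];	/* ids of queued back buffers */
	int head;			/* index of the oldest queued frame */
	int count;			/* number of frames currently queued */
} frame_queue;

void fq_init(frame_queue *q)
{
	assert(q != NULL);
	q->head = 0;
	q->count = 0;
}

/* Queue a finished frame. Returns 1 on success, or 0 if the queue is
 * full, which is the point where SDL_Flip() would go to sleep. */
int fq_push(frame_queue *q, int frame_id)
{
	assert(q != NULL);
	if (q->count == QUEUE_FRAMES)
		return 0;
	q->slots[(q->head + q->count) % QUEUE_FRAMES] = frame_id;
	q->count++;
	return 1;
}

/* Fetch the oldest queued frame for the video update thread.
 * Returns 1 and stores its id, or 0 if nothing is queued. */
int fq_pop(frame_queue *q, int *frame_id)
{
	assert(q != NULL && frame_id != NULL);
	if (q->count == 0)
		return 0;
	*frame_id = q->slots[q->head];
	q->head = (q->head + 1) % QUEUE_FRAMES;
	q->count--;
	return 1;
}
```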

Every time the video update thread wakes up, it checks the current
time against the “virtual” raster counter to find out where we are,
flushes a suitable amount of scan lines from the buffer to the
screen, checks the time again (just in case someone got in and stole
some CPU time), and goes back to sleep.

Whenever it’s close enough to the actual retrace, it polls for it
until it sees it - or times out as a result of someone doing
something “more important” at the wrong time. When it does hit the
retrace, it (obviously) does what my retrace sync code does already;
resynchronizes the internal “virtual” raster counter - or better,
only adjusts it a little, just in case someone should steal some CPU
time right between the “retrace sync” and the "adjust raster counter"
operations.
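
"Only adjusts it a little" could be as simple as the following sketch.
The function name and the gain parameter are assumptions made for the
example; the post does not say how the adjustment is actually done.

```c
#include <assert.h>

/* Nudge the virtual raster counter's idea of when the last retrace
 * happened toward a newly observed retrace time (both in us).
 * gain = 1.0 snaps straight to the observation (a hard resync);
 * smaller values only move part of the way, so a single late
 * observation cannot throw the counter far off. */
double adjust_retrace_time(double predicted_us, double observed_us,
                           double gain)
{
	assert(gain >= 0.0 && gain <= 1.0);
	return predicted_us + (observed_us - predicted_us) * gain;
}
```

For example, with a gain of 0.25, an observation that is 100 us later
than predicted moves the counter by only 25 us.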

Do we have a usable solution for “crappy” video targets in sight here?

//David

PS. I was supposed to work on MAIA, but I ended up hacking the good
part of a 2D game engine w/ control system, and coming up with the
above… Damn you folks! “Scrolling example”, eh… ;-)))

.- M A I A -------------------------------------------------.
| Multimedia Application Integration Architecture |
| A Free/Open Source Plugin API for Professional Multimedia |
----------------------> http://www.linuxaudiodev.com/maia -'
.- David Olofson -------------------------------------------.
| Audio Hacker - Open Source Advocate - Singer - Songwriter |
--------------------------------------> david at linuxdj.com -'

[…loooooong explanation of various variants of double buffering…]

Do we have a usable solution for “crappy” video targets in sight
here?

BTW, this should also work with OpenGL, but probably not without at
least some minor driver hacks. You need a way to tell if the driver
is using h/w pageflipping, and if not, a way to tell it to blit a
slice of the back buffer into the front buffer, rather than just
glXSwapBuffers().

(Oh, it would work with standard glXSwapBuffers(), but it would be
terribly inefficient, copying the entire screen for every update
strip. It would also require the application to render the screen a
strip at a time without touching the rest of the back buffer, which
can be a problem with 3D applications, as they’d have to manage
multiple view frustums to do this efficiently…)

//David

.- M A I A -------------------------------------------------.
| Multimedia Application Integration Architecture |
| A Free/Open Source Plugin API for Professional Multimedia |
----------------------> http://www.linuxaudiodev.com/maia -'
.- David Olofson -------------------------------------------.
| Audio Hacker - Open Source Advocate - Singer - Songwriter |
--------------------------------------> david at linuxdj.com -'

On Monday 19 February 2001 11:16, David Olofson wrote: