Mattias Engdegård wrote:
Polling for the retrace is really out of the question on a multitasking
system.
Not if the alternative is blasting frames to the frame buffer as fast as you
can. If you are spending 100% CPU already, polling won’t make matters worse.
Except that blasting the frames in question usually means some
computation, whether it be MPEG decompression or 3D rasterizing. Now
you can start doing these computations while the video board is waiting
(with radically better precision, I might add) for the vertical
retrace.
The net result is higher framerates: who doesn’t want that?
The right thing is this: those modern boards usually support a “wait for
vertical retrace” flag on their commands, which will let the board do
all the waiting.
Unless your game is of the type “blit an entire screenful each frame”,
perhaps playing a movie.
No problem with that. Just tag the blit command with the “wait for
retrace” flag, issue it and start uncompressing the next frame.
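To make the overlap concrete, here is a minimal sketch of that loop. Everything in it is a stand-in: `FakeAccelerator`, `WAIT_FOR_RETRACE` and `decode` are hypothetical names, not any real driver's API. The point is only the ordering: queue the tagged blit, then immediately start on the next frame while the board does the waiting.

```python
WAIT_FOR_RETRACE = 0x1  # hypothetical flag bit, not taken from any real driver


class FakeAccelerator:
    """Stand-in for a board's command queue; it only records what was queued."""

    def __init__(self):
        self.queue = []

    def blit(self, frame, flags=0):
        # A real board would return at once and hold the blit until the retrace.
        self.queue.append((frame, flags))


def decode(n):
    """Placeholder for MPEG decompression or 3D rasterizing of frame n."""
    return f"frame-{n}"


def play(accel, frame_count):
    """Queue each frame with the wait flag, then decode the next one right away."""
    if frame_count <= 0:
        return
    frame = decode(0)
    for n in range(frame_count):
        accel.blit(frame, WAIT_FOR_RETRACE)  # the board waits, not the CPU
        if n + 1 < frame_count:
            frame = decode(n + 1)            # CPU works during the wait


if __name__ == "__main__":
    accel = FakeAccelerator()
    play(accel, 3)
    print(accel.queue)
```

Nothing in the loop polls or blocks on the retrace; the CPU time that polling would burn goes to the next frame.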
If you don’t want to grab 100% CPU, you have to wait for something.
In that case a device that blocks on a read() or ioctl() until the
next refresh would be handy, even if imprecise.
There is a big problem with any kind of userland "wait for retrace"
scheme. If it is polling, the app might get preempted and miss the exact
moment of the retrace, at best getting it late (perhaps too late) and
at worst losing it completely. If it is a blocking system call, the
unblocking is not immediate; it only means that the kernel scheduler
puts the process back in the run queue, to be scheduled “when
appropriate”. Note that my system has a vertical refresh of over 100 Hz,
which means there are less than 10 milliseconds between retraces. The
Linux kernel preempts processes at a 100 Hz rate (except on Alpha, where
it is 1024 Hz I think), so in order not to miss the vertical retrace
completely, there would need to be NO other process in the run queue when the
blocked process is unblocked.
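The arithmetic behind that can be sketched in a few lines. This is a deliberately crude model, assuming every process ahead of us in the run queue uses one full scheduler tick before we get the CPU; the function names are made up for illustration.

```python
def period_ms(rate_hz):
    """Length of one cycle, in milliseconds, for a rate given in Hz."""
    return 1000.0 / rate_hz


def retraces_missed(refresh_hz, tick_hz, runnable_ahead):
    """Whole refresh periods that pass while we wait in the run queue.

    Assumption: each of the `runnable_ahead` processes in front of us
    runs for exactly one scheduler tick before we are scheduled.
    """
    delay_ms = runnable_ahead * period_ms(tick_hz)
    return int(delay_ms // period_ms(refresh_hz))


if __name__ == "__main__":
    for tick_hz in (100, 1024):
        missed = retraces_missed(100, tick_hz, runnable_ahead=1)
        print(f"tick {tick_hz} Hz: {missed} retrace(s) missed")
```

With a 100 Hz refresh and a 100 Hz tick, a single process ahead in the queue already costs a whole refresh period; with a 1024 Hz tick the same delay stays under one millisecond.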
All of the “software” wait for vertical retrace solutions (those that
use the main CPU) are doomed in high end situations where the retrace
comes around too quickly (except maybe on a realtime OS, which Linux
currently is not).
Remember that the vertical retrace idea is to take advantage of a window
in time where the ray is not actually drawing on the screen and where
changing the framebuffer will not cause visible distortion and shearing.
When that window is half of the scheduler resolution, you’re in deep
shit if you are a process controlled by the scheduler, because in the
average case you’ll be missing the window.
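A toy model gives a feel for the odds. It assumes the delay between the retrace and the moment the process actually runs is uniformly distributed over one scheduler tick, which is a simplification and not a measurement of any real kernel; `hit_probability` and `simulate` are illustrative names.

```python
import random


def hit_probability(window_ms, tick_ms):
    """Chance of running inside the window, for a uniform delay in [0, tick)."""
    return min(1.0, window_ms / tick_ms)


def simulate(window_ms, tick_ms, trials=20000, seed=1):
    """Monte Carlo check of hit_probability under the same uniform assumption."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials)
               if rng.uniform(0.0, tick_ms) < window_ms)
    return hits / trials


if __name__ == "__main__":
    # Window of half a tick, as in the case described above.
    print(hit_probability(5.0, 10.0), simulate(5.0, 10.0))
```

Under this model a window of half a tick is caught only about half the time, and that is before counting any other process sitting in the run queue.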
The other way, relying on the video accelerator, is comparatively
perfect. The very hardware that knows when the vertical retrace happens
is the one that is triggered by it; it cannot get any better. Why isn’t
it used more? I think the reason is that using the 2D accelerator isn’t
all that popular, even today, and the video accelerator can only do the
waiting for the commands it is asked to do, not those that you do
yourself. Considering that hardware acceleration is as a rule of thumb
at least 20% faster (and often 700%-1000%), we’d be really better off by
switching to it!
Maybe it could be simulated by short sleeps (using RTC) and polls,
I haven’t looked into this. But you would still need a way to find out
whether a vertical retrace has taken place.
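A rough sketch of what such a sleep-then-poll loop might look like follows. It assumes two things that are not given here: a predicted time for the next retrace, and some callable `in_retrace()` that reports whether the board is currently in retrace. Both are hypothetical, as are the `guard` and `timeout` values.

```python
import time


def wait_for_retrace(in_retrace, next_retrace_at, guard=0.002, timeout=0.05,
                     now=time.monotonic, sleep=time.sleep):
    """Sleep until `guard` seconds before the predicted retrace, then poll.

    Returns True if in_retrace() reported a retrace before `timeout`
    seconds past the prediction, False otherwise.
    """
    remaining = next_retrace_at - guard - now()
    if remaining > 0:
        sleep(remaining)  # may overshoot: the scheduler decides when we resume
    deadline = next_retrace_at + timeout
    while now() < deadline:
        if in_retrace():  # hypothetical status check supplied by the caller
            return True
    return False
```

The `now` and `sleep` parameters exist only so the loop can be exercised with a fake clock. Note that nothing here bounds how late `sleep` returns, so the poll may well start after the retrace is already over.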
Really, all this simulation will get you is missing the vertical
retrace, or if you’re lucky, getting it when the ray is halfway up the
screen… If you are lucky!
--
Pierre Phaneuf
http://ludusdesign.com/