How to optimize for blitting performance

Okay, it seems there have been threads on performance in the past, but I have to ask.

Suppose you wanted to use SDL to build a simple movie player (like xanim), but using uncompressed images in RAM. So all you want to do is blit data as fast as possible (preferably in double-buffered mode).

Suppose the graphics card is unremarkable (e.g. Matrox G200 or NVIDIA TNT2) and the server is XFree86 3.3.5 in 24-bit truecolor mode. The images to blit are 24-bit color images at 720x480, 1024x768, or even larger if possible.

Is the best bet to use a fullscreen mode, a hardware surface, and then cross your fingers and hope? Would XFree 3.9 change matters? Xi Graphics?

–== Sent via Deja.com http://www.deja.com/ ==–
Share what you know. Learn what you don’t.

Suppose the graphics card is unremarkable (e.g. Matrox G200 or NVIDIA TNT2) and the server is XFree86 3.3.5 in 24-bit truecolor mode. The images to blit are 24-bit color images at 720x480, 1024x768, or even larger if possible.

Heheh, good luck.
With 1024x768 and the setup you’ve described, you are likely to get 10 FPS
or less. A better setup would be to dither to 16-bit mode, and you get
nearly twice the speed. I don’t know how hardware acceleration affects
the equation.
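
A minimal sketch of the 24-bit to 16-bit conversion suggested above. It assumes tightly packed R,G,B source bytes and an RGB565 target layout, neither of which the thread specifies, and it uses plain truncation of the low bits rather than true dithering. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack one 8-bit-per-channel pixel into 16-bit RGB565 by dropping the
 * low bits of each channel (truncation, not error-diffusion dithering). */
static uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Convert a tightly packed R,G,B byte buffer (assumed layout) into an
 * RGB565 buffer. dst must have room for pixel_count entries. */
static void convert_rgb888_to_rgb565(const uint8_t *src, uint16_t *dst,
                                     size_t pixel_count)
{
    size_t i;
    for (i = 0; i < pixel_count; i++) {
        dst[i] = pack_rgb565(src[3 * i], src[3 * i + 1], src[3 * i + 2]);
    }
}
```

Converting each frame once, ahead of playback, would keep this cost out of the blit loop; the 16-bit frames also take two thirds of the RAM of packed 24-bit frames.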

Is the best bet to use a fullscreen mode, a hardware surface, and then cross your fingers and hope? Would XFree 3.9 change matters? Xi Graphics?

Possibly, I haven’t benchmarked them.

Your mileage may vary. :)

-Sam Lantinga				(slouken at devolution.com)

Lead Programmer, Loki Entertainment Software
--
“Any sufficiently advanced bug is indistinguishable from a feature”
– Rich Kulawiec

Sam wrote:

Heheh, good luck.
With 1024x768 and the setup you’ve described, you are likely to get 10 FPS
or less. A better setup would be to dither to 16-bit mode, and you get
nearly twice the speed. I don’t know how hardware acceleration affects
the equation.

Right, in 24 bit mode I get almost 11 fps.
In 16 bit mode it’s 21 fps.

A few caveats:
Images are really about 900x550, for various
reasons, but I’m blitting this rectangle
into a 1024x768 screen. The machine I’m using
has XFree 3.3.3.1 (not 3.3.5 as I had
thought) and a Matrox G200.
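
The figures above can be turned into a rough transfer rate. This is a sketch only: the helper names are hypothetical, and the number of bytes per pixel the X server actually uses in 24-bit mode (3 or 4) is an assumption, since the thread does not say.

```c
#include <stddef.h>

/* Bytes needed to hold one frame at the given pixel size. */
static size_t frame_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

/* Sustained transfer rate, in bytes per second, for a given frame rate. */
static double bytes_per_second(size_t width, size_t height,
                               size_t bytes_per_pixel, double fps)
{
    return (double)frame_bytes(width, height, bytes_per_pixel) * fps;
}
```

For a 900x550 rectangle, 21 fps at 2 bytes per pixel is about 20.8 MB/s. At 11 fps, 3 bytes per pixel gives about 16.3 MB/s and 4 bytes per pixel gives about 21.8 MB/s. If the 24-bit mode really stores 4 bytes per pixel, the two measurements work out to nearly the same rate, which would be consistent with the blit being limited by transfer bandwidth rather than by frame count.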

In any case, it doesn’t seem to matter if
we’re fullscreen or not. X grabs the majority of
the CPU cycles (80% or more) and my player
gets the rest.

I’ll try XFree 3.9, which according to Mark
Vojkovich runs “noticeably faster” on the G200
in real world performance and in benchmarks.

n


Suppose you wanted to use SDL to build a simple movie
player (like xanim), but using uncompressed images in RAM.

I’ve done this. You really want to use hardware OpenGL… Render a single
textured polygon and call glTexSubImage2D() to update the frames.

With a TNT2 on Windows, I get > 30fps playback of 720x480x24bit images.

I also wrote a raw Win32 (BitBlt()) player, which attained similar
performance.

I haven’t tried any of this on Linux… A G400 with hardware GL would
probably run fairly well; X might be too slow…
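
One practical detail of the textured-polygon approach: OpenGL 1.x requires power-of-two texture dimensions, so a 720x480 frame has to live inside a larger texture. The sketch below only computes that padded size and the texture coordinates covering the frame; the struct and function names are hypothetical. The texture would be allocated once at the padded size with glTexImage2D(), and each frame uploaded into its corner with glTexSubImage2D().

```c
#include <stddef.h>

/* Smallest power of two that is >= n (for n >= 1). */
static size_t next_pow2(size_t n)
{
    size_t p = 1;
    while (p < n) {
        p *= 2;
    }
    return p;
}

/* Padded texture size plus the texture coordinates that cover only
 * the part of the texture holding the frame. */
typedef struct {
    size_t tex_w, tex_h;   /* power-of-two texture dimensions */
    double u_max, v_max;   /* texture coordinates of the frame's far corner */
} TextureFit;

static TextureFit fit_frame_to_texture(size_t frame_w, size_t frame_h)
{
    TextureFit fit;
    fit.tex_w = next_pow2(frame_w);
    fit.tex_h = next_pow2(frame_h);
    fit.u_max = (double)frame_w / (double)fit.tex_w;
    fit.v_max = (double)frame_h / (double)fit.tex_h;
    return fit;
}
```

For a 720x480 frame this yields a 1024x512 texture, with the quad's texture coordinates running from (0, 0) to (0.703125, 0.9375).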

Dan

Suppose you wanted to use SDL to build a simple movie
player (like xanim), but using uncompressed images in RAM.

I’ve done this. You really want to use hardware OpenGL… Render a single
textured polygon and call glTexSubImage2D() to update the frames.

or it seems glDrawPixels should work also?

With a TNT2 on Windows, I get > 30fps playback of 720x480x24bit images.

I think I’ve seen similar numbers with other Windows players, like the one that ships with Nothing Real’s “Shake” compositing software.

Anyway, this wasn’t meant to be a “SDL v. OpenGL” thing, I only wanted to know what’s the best I can expect from SDL under the circumstances. Looks like I’m getting what others expect.

thanks
Neil

On Tue, 8 Feb 2000 16:52:07 Dan Maas wrote:

Neil Okamoto wrote:

I’ve done this. You really want to use hardware OpenGL… Render a single
textured polygon and call glTexSubImage2D() to update the frames.

or it seems glDrawPixels should work also?

No, glDrawPixels is awfully slow with most OpenGL driver/card combos…

Note that glTexSubImage2D() is used by Quake 3 for its animations (I
think) and is also awfully slow with my Mesa/Voodoo2 combo. I think
this is a problem in Mesa. Or maybe in my Pentium 225 (rather slow
according to the specs in Quake 3’s readme!)… :)

With a TNT2 on Windows, I get > 30fps playback of 720x480x24bit images.

I think I’ve seen similar numbers with other Windows players, like the
one that ships with Nothing Real’s “Shake” compositing software.

Anyway, this wasn’t meant to be a “SDL v. OpenGL” thing, I only wanted
to know what’s the best I can expect from SDL under the circumstances.
Looks like I’m getting what others expect.

With most modern video cards (like the TNT and Matrox G200), system memory
to video memory blits are accelerated by some kind of block transfer
system, which can be as much as 3-4 times faster than raw programmed I/O
blitting (though sometimes only twice as fast). I think XFree86 3.9.x uses
this, but XFree86 3.3.x definitely doesn’t.
--
Pierre Phaneuf
Ludus Design, http://ludusdesign.com/