Bizarre(?) per-pixel alpha vs. no alpha benchmark results (0/1)

I benchmarked (code attached) blitting a 128x128 sprite
(attached) with and without per pixel alpha information
under PyGame. On each frame, I clear the 640x480 screen
with the background color via fill(), then blit the
sprite 4 times. The results are quite interesting:
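
The attached benchmark code is not reproduced in this thread. As a rough illustration of the measurement described above, here is a minimal pure-Python sketch of the same loop shape (clear the screen, copy the sprite 4 times, time the frames). It does not use PyGame; `measure_fps`, `draw_frame` and the byte buffers are hypothetical stand-ins, so the numbers it prints say nothing about SDL's blitters.

```python
import time

WIDTH, HEIGHT, SPRITE = 640, 480, 128

# Hypothetical stand-ins: one byte per pixel, one bytearray per screen row.
screen = [bytearray(WIDTH) for _ in range(HEIGHT)]
blank_row = bytes(WIDTH)
sprite_row = bytes([255]) * SPRITE


def draw_frame():
    """One frame: clear the whole screen, then copy the sprite 4 times."""
    for row in screen:                       # plays the role of fill()
        row[:] = blank_row
    for i in range(4):                       # 4 blits at different x offsets
        x = i * SPRITE
        for y in range(SPRITE):
            screen[y][x:x + SPRITE] = sprite_row


def measure_fps(frame_func, frames=50):
    """Call frame_func() `frames` times and return the average frames per second."""
    start = time.perf_counter()
    for _ in range(frames):
        frame_func()
    elapsed = time.perf_counter() - start
    return frames / elapsed


fps = measure_fps(draw_frame)
print(f"{fps:.0f} fps")
```

Averaging over many frames, as here, keeps timer resolution from dominating the result.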

640x480 windowed              16-bit        32-bit
===============================================
plain opaque blit:            70 fps        47 fps
per pixel alpha:              80 fps (!)    49 fps
per pixel alpha + RLEACCEL:   93 fps (!)    55 fps

Comment: Why is the per-pixel alpha blit faster than
the plain opaque blit in windowed modes? Does this
mean that there will not be much benefit gained from
optimizing the alpha blending routines (say, by using
MMX)?
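
One piece of arithmetic that may help frame the question: counting only the pixels written per frame, the full-screen fill() touches far more pixels than the four sprite blits. The sketch below uses the sizes given in the post. It counts pixels only and assumes nothing about the real cost per pixel, so it does not by itself explain why the alpha blit came out faster.

```python
SCREEN_W, SCREEN_H = 640, 480
SPRITE_W, SPRITE_H = 128, 128
BLITS_PER_FRAME = 4

fill_pixels = SCREEN_W * SCREEN_H                       # pixels cleared by fill()
blit_pixels = BLITS_PER_FRAME * SPRITE_W * SPRITE_H     # pixels covered by the blits
total_pixels = fill_pixels + blit_pixels

for bits_per_pixel in (16, 32):
    bytes_per_frame = total_pixels * bits_per_pixel // 8
    print(f"{bits_per_pixel}-bit: {bytes_per_frame} bytes written per frame")

blit_share = blit_pixels / total_pixels
print(f"blits account for {blit_share:.1%} of the pixels written")
```

If the fill really dominates, even a much faster blend routine would move the total frame rate only modestly in this particular test.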

120Hz 640x480 16-bit full screen, SWSURFACE
                              DOUBLEBUF     no DOUBLEBUF

plain opaque blit:            120 fps       71 fps (!)
per pixel alpha:              24 fps (!)    80 fps
per pixel alpha + RLEACCEL:   30 fps (!)    95 fps

Comment: Why does using double-buffering impose
such a huge performance hit on the alpha blending and
at the same time INCREASE the performance of the
plain opaque blit?!? Could this be an artifact of
using PyGame? 640x480 32-bit mode exhibited analogous
behavior. No tearing was visible in any of the tests.

120Hz 640x480 16-bit full screen, HWSURFACE
                              DOUBLEBUF     no DOUBLEBUF

plain opaque blit:            120 fps       146 fps
per pixel alpha:              23 fps        25 fps
per pixel alpha + RLEACCEL:   35 fps        35 fps

Comment: Severe tearing when not using DOUBLEBUF, even
when the fps was way below the refresh rate as in the
case of the alpha blits! Why would using HWSURFACE
create tearing? 32-bit results analogous to 16-bit
ones.

Notes

  1. Enabling RLEACCEL made no difference for any of the
    plain opaque blits.
  2. In windowed mode, HWSURFACE doesn’t seem to have
    an effect even if I use a low-res mode like 800x600.
    Considering I’m using a 16MB Voodoo3, shouldn’t
    there be plenty of video memory to store the sprite
    in?

Details on the sprite image I used

Both sprites come from the same image generated under
Photoshop. One was saved as a Targa (with alpha
channel) and the other as a PNG (Photoshop 6 does
not export the alpha channel for PNGs); otherwise they
are identical. I’ve included the PNG version to
show the ratio of background to non-background
pixels, as well as the alpha channel as a grayscale
PNG.

My setup

DirectX 8
Win2K SP1
16MB Voodoo3 AGP
120Hz refresh at 640x480 fullscreen

640x480 windowed              16-bit        32-bit

plain opaque blit:            70 fps        47 fps
per pixel alpha:              80 fps (!)    49 fps
per pixel alpha + RLEACCEL:   93 fps (!)    55 fps

This means nothing to me unless you provide code for what you are
doing (in C, please). Also provide information about the SDL video
driver you are using (DirectX or DIB), and a link to your images.

Comment: Severe tearing when not using DOUBLEBUF, even
when the fps was way below the refresh rate as in the
case of the alpha blits! Why would using HWSURFACE
create tearing? 32-bit results analogous to 16-bit
ones.

Tearing is caused by the interaction of the CRT refresh with your drawing.
DOUBLEBUF means “use hardware page-flipping” and shouldn’t have any
significance with software surfaces.
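
To make the tearing point concrete, here is a toy simulation (not SDL code; all names and timings are made up for illustration). The display reads one row per tick from top to bottom. With a single buffer the renderer overwrites the rows the display is reading from; with page flipping it draws into a back buffer that is swapped in only between refreshes.

```python
def scan_single(rows=8, draw_start=3, rows_per_tick=2):
    """Rows the display shows when the renderer draws straight into the visible buffer."""
    buf = ["A"] * rows                 # old frame 'A' is on screen
    seen = []
    drawn = 0
    for tick in range(rows):           # the display reads row `tick` at this tick
        if tick >= draw_start:         # renderer starts writing new frame 'B'
            for _ in range(rows_per_tick):
                if drawn < rows:
                    buf[drawn] = "B"
                    drawn += 1
        seen.append(buf[tick])
    return "".join(seen)


def scan_double(rows=8, draw_start=3, rows_per_tick=2):
    """Same timing, but drawing goes to a back buffer that is flipped after the scan."""
    front = ["A"] * rows
    back = ["A"] * rows
    seen = []
    drawn = 0
    for tick in range(rows):
        if tick >= draw_start:
            for _ in range(rows_per_tick):
                if drawn < rows:
                    back[drawn] = "B"
                    drawn += 1
        seen.append(front[tick])       # the display only ever reads the front buffer
    if drawn == rows:                  # page flip happens between refreshes
        front, back = back, front
    return "".join(seen), "".join(front)


torn = scan_single()
shown, next_frame = scan_double()
print("single buffer  :", torn)
print("double buffered:", shown, "then", next_frame)
```

In the single-buffer case the scanned-out picture is part old frame and part new frame. Note that this depends only on when the drawing overlaps the scan, not on the frame rate, which fits tearing showing up even at low fps.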

640x480 windowed              16-bit        32-bit

plain opaque blit:            70 fps        47 fps
per pixel alpha:              80 fps (!)    49 fps
per pixel alpha + RLEACCEL:   93 fps (!)    55 fps

120Hz 640x480 16-bit full screen, SWSURFACE
                              DOUBLEBUF     no DOUBLEBUF

plain opaque blit:            120 fps       71 fps (!)
per pixel alpha:              24 fps (!)    80 fps
per pixel alpha + RLEACCEL:   30 fps (!)    95 fps

My first reflection: “Someone is reading VRAM here…”
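
To spell out that suspicion: an opaque blit only writes destination pixels, while an alpha blend has to read each destination pixel back before it can write the mixed result. If the destination lives in video memory, those reads are typically much slower than the writes. The sketch below illustrates the access pattern for 16-bit (RGB565) pixels; `CountingSurface` is a hypothetical buffer that just counts accesses, and the blend formula is a plain textbook one, not necessarily what SDL uses.

```python
def unpack565(pixel):
    """Split a 16-bit RGB565 pixel into (r, g, b) components."""
    return (pixel >> 11) & 0x1F, (pixel >> 5) & 0x3F, pixel & 0x1F


def pack565(r, g, b):
    """Combine 5/6/5-bit components into one 16-bit pixel."""
    return (r << 11) | (g << 5) | b


def blend(src, dst, alpha):
    """Mix one colour component; alpha is 0 (transparent) to 255 (opaque)."""
    return (src * alpha + dst * (255 - alpha)) // 255


class CountingSurface:
    """Hypothetical destination buffer that counts pixel reads and writes."""

    def __init__(self, size, fill=0):
        self.pixels = [fill] * size
        self.reads = 0
        self.writes = 0

    def get(self, i):
        self.reads += 1
        return self.pixels[i]

    def put(self, i, value):
        self.writes += 1
        self.pixels[i] = value


def blit_opaque(src, dst):
    for i, pixel in enumerate(src):
        dst.put(i, pixel)                       # write only


def blit_alpha(src, alphas, dst):
    for i, (pixel, a) in enumerate(zip(src, alphas)):
        sr, sg, sb = unpack565(pixel)
        dr, dg, db = unpack565(dst.get(i))      # destination must be read back
        dst.put(i, pack565(blend(sr, dr, a), blend(sg, dg, a), blend(sb, db, a)))


white = [0xFFFF] * 4
opaque_dst = CountingSurface(4)
blit_opaque(white, opaque_dst)
alpha_dst = CountingSurface(4)
blit_alpha(white, [255, 0, 128, 255], alpha_dst)
print("opaque:", opaque_dst.reads, "reads,", opaque_dst.writes, "writes")
print("alpha :", alpha_dst.reads, "reads,", alpha_dst.writes, "writes")
```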

120Hz 640x480 16-bit full screen, HWSURFACE
                              DOUBLEBUF     no DOUBLEBUF

plain opaque blit:            120 fps       146 fps
per pixel alpha:              23 fps        25 fps
per pixel alpha + RLEACCEL:   35 fps        35 fps

Comment: Severe tearing when not using DOUBLEBUF, even
when the fps was way below the refresh rate as in the
case of the alpha blits! Why would using HWSURFACE
create tearing? 32-bit results analogous to 16-bit
ones.

HWSURFACE + no DOUBLEBUF means that you want a single buffer, and that it’s
supposed to be the actual screen buffer. And you get exactly that; that’s why
you get tearing.

Notes

  1. Enabling RLEACCEL made no difference for all the
    plain opaque blits.

Well, you can hardly get it any faster than copying a number of fixed size
blocks, can you? :-)
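
A sketch of why that is, assuming RLE acceleration works roughly by pre-splitting each sprite row into runs that can be skipped, copied, or blended (the encoding below is a simplified illustration, not SDL's actual format, and `rle_runs` is a made-up name). A fully opaque row collapses into a single copy run, which is just the plain block copy again, whereas a row with a transparent background gives the blitter pixels it can skip outright.

```python
def rle_runs(alpha_row):
    """Split one row of alpha values into (kind, start, length) runs.

    kind is 'skip' for alpha 0, 'copy' for alpha 255 and 'blend' otherwise.
    """
    def kind(alpha):
        if alpha == 0:
            return "skip"
        if alpha == 255:
            return "copy"
        return "blend"

    runs = []
    start = 0
    for i in range(1, len(alpha_row) + 1):
        # Close the current run at the end of the row or when the kind changes.
        if i == len(alpha_row) or kind(alpha_row[i]) != kind(alpha_row[start]):
            runs.append((kind(alpha_row[start]), start, i - start))
            start = i
    return runs


opaque_row = [255] * 8
sprite_row = [0, 0, 0, 128, 255, 255, 128, 0]
print(rle_runs(opaque_row))
print(rle_runs(sprite_row))
```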

  2. In windowed mode, HWSURFACE doesn’t seem to have
    an effect even if I use a low-res mode like 800x600.

No, because you can’t access the framebuffer directly in windowed mode.
(Applies to most targets and platforms, AFAIK. There are a few technical
reasons for that, mostly related to limitations of non high-end hardware.)

Considering I’m using a 16MB Voodoo3, shouldn’t
there be plenty of video memory to store the sprite
in?

Yes, but it doesn’t matter if the card doesn’t accelerate blits in windowed
mode. (Some cards purely designed for 3D gaming hardly seem to know what “2D”
and “windowing” mean, so this is not too unusual, it seems…)
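
For what it’s worth, the “plenty of video memory” part is easy to check with the sizes from the original post. A quick sketch, assuming a full-screen 640x480 mode with two screen buffers (front and back) and ignoring anything else the driver keeps in video memory:

```python
VRAM_BYTES = 16 * 1024 * 1024           # 16MB Voodoo3, from the original post


def surface_bytes(width, height, bits_per_pixel):
    """Memory needed for one uncompressed surface."""
    return width * height * bits_per_pixel // 8


remaining = {}
for bpp in (16, 32):
    sprite = surface_bytes(128, 128, bpp)
    screen_buffers = 2 * surface_bytes(640, 480, bpp)   # front + back buffer
    remaining[bpp] = VRAM_BYTES - screen_buffers - sprite
    print(f"{bpp}-bit: sprite {sprite} bytes, "
          f"screen buffers {screen_buffers} bytes, "
          f"{remaining[bpp]} bytes left")
```

So capacity is not the limit here; as said above, what matters is whether the blit to that memory is accelerated at all.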

//David

.- M A I A -------------------------------------------------.
| Multimedia Application Integration Architecture |
| A Free/Open Source Plugin API for Professional Multimedia |
----------------------> http://www.linuxaudiodev.com/maia -'
.- David Olofson -------------------------------------------.
| Audio Hacker - Open Source Advocate - Singer - Songwriter |
--------------------------------------> david at linuxdj.com -'

On Friday 23 March 2001 11:58, Andy Sy wrote:

David Olofson wrote:

Andy Sy wrote:

120Hz 640x480 16-bit full screen, SWSURFACE
                              DOUBLEBUF     no DOUBLEBUF

plain opaque blit:            120 fps       71 fps (!)
per pixel alpha:              24 fps (!)    80 fps
per pixel alpha + RLEACCEL:   30 fps (!)    95 fps

My first reflection: “Someone is reading VRAM here…”

DOUBLEBUF forces HWSURFACE …

  • Randi

Regimental Command
Generic Armored Combat System
http://regcom.sourceforge.net

Well, not on all targets (unfortunately) but it definitely looks like that’s
the case here.

//David

On Friday 23 March 2001 17:47, Randi J. Relander wrote:

David Olofson wrote:

Andy Sy wrote:

120Hz 640x480 16-bit full screen, SWSURFACE
                              DOUBLEBUF     no DOUBLEBUF

plain opaque blit:            120 fps       71 fps (!)
per pixel alpha:              24 fps (!)    80 fps
per pixel alpha + RLEACCEL:   30 fps (!)    95 fps

My first reflection: “Someone is reading VRAM here…”

DOUBLEBUF forces HWSURFACE …