I benchmarked blitting a 128x128 sprite with and without
per-pixel alpha information under PyGame (code and sprite
attached). On each frame, I clear the 640x480 screen
with the background color via fill(), then blit the
sprite 4 times. The results are quite interesting:
640x480 windowed              16-bit        32-bit
==================================================
plain opaque blit:            70 fps        47 fps
per pixel alpha:              80 fps (!)    49 fps
per pixel alpha + RLEACCEL:   93 fps (!)    55 fps
Comment: Why is the per-pixel alpha blit faster than the
plain opaque blit in windowed modes? Does this mean that
there would not be much benefit from optimizing the alpha
blending routines (say, by using MMX)?
120Hz 640x480 16-bit full screen, SWSURFACE
                              DOUBLEBUF     no DOUBLEBUF
plain opaque blit:            120 fps       71 fps (!)
per pixel alpha:              24 fps (!)    80 fps
per pixel alpha + RLEACCEL:   30 fps (!)    95 fps
Comment: Why does using double-buffering impose
such a huge performance hit on the alpha blending and
at the same time INCREASE the performance of the
plain opaque blit?!? Could this be an artifact of
using PyGame? 640x480 32-bit mode exhibited analogous
behavior. No tearing was visible in any of the tests.
120Hz 640x480 16-bit full screen, HWSURFACE
                              DOUBLEBUF     no DOUBLEBUF
plain opaque blit:            120 fps       146 fps
per pixel alpha:              23 fps        25 fps
per pixel alpha + RLEACCEL:   35 fps        35 fps
Comment: Severe tearing when not using DOUBLEBUF, even
when the fps was way below the refresh rate as in the
case of the alpha blits! Why would using HWSURFACE
create tearing? 32-bit results analogous to 16-bit
ones.
Notes
- Enabling RLEACCEL made no difference for any of the
plain opaque blits.
- In windowed mode, HWSURFACE doesn’t seem to have
an effect even if I use a low-res mode like 800x600.
Considering I’m using a 16MB Voodoo3, shouldn’t
there be plenty of video memory to store the sprite
in?
Details on the sprite image I used
The two sprite files come from the same image generated
in Photoshop. One was saved as a Targa (with alpha
channel) and the other as a PNG (Photoshop 6 does
not export an alpha channel for PNGs); they are otherwise
identical. I’ve included the PNG version to show the
ratio of background pixels to non-background pixels,
as well as the alpha channel as a grayscale PNG.
My setup
DirectX 8
Win2K SP1
16MB Voodoo3 AGP
120Hz refresh at 640x480 fullscreen
640x480 windowed              16-bit        32-bit
plain opaque blit:            70 fps        47 fps
per pixel alpha:              80 fps (!)    49 fps
per pixel alpha + RLEACCEL:   93 fps (!)    55 fps
This means nothing to me unless you provide code for what you are
doing (in C, please). Also provide information about the SDL video
driver you are using (DirectX or DIB) and a link to your images.
Comment: Severe tearing when not using DOUBLEBUF, even
when the fps was way below the refresh rate as in the
case of the alpha blits! Why would using HWSURFACE
create tearing? 32-bit results analogous to 16-bit
ones.
Tearing is caused by the interaction of the CRT refresh with your drawing.
DOUBLEBUF means “use hardware page-flipping” and shouldn’t have any
significance with software surfaces.
640x480 windowed              16-bit        32-bit
plain opaque blit:            70 fps        47 fps
per pixel alpha:              80 fps (!)    49 fps
per pixel alpha + RLEACCEL:   93 fps (!)    55 fps
120Hz 640x480 16-bit full screen, SWSURFACE
                              DOUBLEBUF     no DOUBLEBUF
plain opaque blit:            120 fps       71 fps (!)
per pixel alpha:              24 fps (!)    80 fps
per pixel alpha + RLEACCEL:   30 fps (!)    95 fps
My first reflection: “Someone is reading VRAM here…”
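To illustrate that reflection: an alpha blit has to read every destination pixel before writing the blended result back, whereas an opaque blit only writes. If the destination ends up in video memory, those reads are typically far slower than writes. A minimal Python sketch, assuming RGB565 pixels and a plain list standing in for the surface; the names blend_565, blit_opaque and blit_alpha are made up for illustration and are not the actual SDL or PyGame routines:

```python
def unpack_565(p):
    """Split a 16-bit RGB565 pixel into (r, g, b) channel values."""
    return (p >> 11) & 0x1F, (p >> 5) & 0x3F, p & 0x1F


def pack_565(r, g, b):
    """Pack 5/6/5-bit channel values back into one 16-bit pixel."""
    return (r << 11) | (g << 5) | b


def blend_565(src, alpha, dst):
    """Blend src over dst with an 8-bit alpha (0 = transparent, 255 = opaque)."""
    s, d = unpack_565(src), unpack_565(dst)
    return pack_565(*((sc * alpha + dc * (255 - alpha)) // 255
                      for sc, dc in zip(s, d)))


def blit_opaque(dst, src):
    """Opaque blit: the destination is only written. Returns the read count."""
    for i, p in enumerate(src):
        dst[i] = p
    return 0


def blit_alpha(dst, src, alphas):
    """Alpha blit: one destination read per pixel. Returns the read count."""
    reads = 0
    for i, (p, a) in enumerate(zip(src, alphas)):
        old = dst[i]  # read back from the destination surface
        reads += 1
        dst[i] = blend_565(p, a, old)
    return reads
```

For a 128x128 sprite blitted 4 times per frame, this model implies 65536 destination reads per frame for the alpha case and none for the opaque case.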
120Hz 640x480 16-bit full screen, HWSURFACE
                              DOUBLEBUF     no DOUBLEBUF
plain opaque blit:            120 fps       146 fps
per pixel alpha:              23 fps        25 fps
per pixel alpha + RLEACCEL:   35 fps        35 fps
Comment: Severe tearing when not using DOUBLEBUF, even
when the fps was way below the refresh rate as in the
case of the alpha blits! Why would using HWSURFACE
create tearing? 32-bit results analogous to 16-bit
ones.
HWSURFACE + no DOUBLEBUF means that you want a single buffer, and that it’s
supposed to be the actual screen buffer. And you get exactly that; that’s why
you get tearing.
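A toy model of the difference, in Python, assuming the display scans out the visible buffer at some moment while the next frame is still being drawn. The function name and the list-of-rows framebuffer are illustrative assumptions, not SDL code:

```python
def scanout_during_draw(rows, double_buffered):
    """Draw frame 1 over frame 0 row by row, with one scanout mid-draw.

    Returns (seen, front): what the display showed during the draw, and
    what is on screen after the frame has been completed.
    """
    front = [0] * rows                              # frame 0 is on screen
    back = [0] * rows if double_buffered else front  # single buffer: draw on screen
    seen = None
    for y in range(rows):
        back[y] = 1                                 # draw one row of frame 1
        if y == rows // 2 - 1:                      # refresh hits mid-draw
            seen = list(front)
    if double_buffered:
        front, back = back, front                   # page flip after drawing
    return seen, front
```

With a single buffer the scanout sees a mix of old and new rows, which is the tear. With a back buffer it sees a complete old frame, and the new one appears only after the flip. The frame rate does not enter into it, which matches tearing showing up even at 25 fps.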
Notes
- Enabling RLEACCEL made no difference for all the
plain opaque blits.
Well, you can hardly get it any faster than copying a number of fixed size
blocks, can you?
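For context, run-length encoding pays off by grouping each row into runs of transparent, opaque and partially transparent pixels, so the blitter can skip or bulk-copy whole runs. A fully opaque sprite is already one run per row, which is just a block copy. A rough Python sketch of that grouping (rle_row and classify are hypothetical helpers for illustration, not SDL's actual encoder):

```python
from itertools import groupby


def classify(alpha):
    """Map an 8-bit alpha value to the work the blitter must do for it."""
    if alpha == 0:
        return "skip"    # fully transparent: nothing to draw
    if alpha == 255:
        return "copy"    # fully opaque: plain copy, no blending
    return "blend"       # partial alpha: needs a real blend


def rle_row(alphas):
    """Group one row of alpha values into (kind, run_length) pairs."""
    return [(kind, len(list(run)))
            for kind, run in groupby(alphas, key=classify)]
```

A row of 128 opaque pixels encodes to a single ("copy", 128) run, so there is nothing left for RLEACCEL to save on a plain opaque blit.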
- In windowed mode, HWSURFACE doesn’t seem to have
an effect even if I use a low-res mode like 800x600.
No, because you can’t access the framebuffer directly in windowed mode.
(Applies to most targets and platforms, AFAIK. There are a few technical
reasons for that, mostly related to limitations of non high-end hardware.)
Considering I’m using a 16MB Voodoo3, shouldn’t
there be plenty of video memory to store the sprite
in?
Yes, but it doesn’t matter if the card doesn’t accelerate blits in windowed
mode. (Some cards purely designed for 3D gaming hardly seem to know what "2D"
and “windowing” means, so this is not too unusual, it seems…)
//David
.- M A I A -------------------------------------------------.
| Multimedia Application Integration Architecture |
| A Free/Open Source Plugin API for Professional Multimedia |
----------------------> http://www.linuxaudiodev.com/maia -'
.- David Olofson -------------------------------------------.
| Audio Hacker - Open Source Advocate - Singer - Songwriter |
--------------------------------------> david at linuxdj.com -'
On Friday 23 March 2001 11:58, Andy Sy wrote:
David Olofson wrote:
Andy Sy wrote:
120Hz 640x480 16-bit full screen, SWSURFACE
                              DOUBLEBUF     no DOUBLEBUF
plain opaque blit:            120 fps       71 fps (!)
per pixel alpha:              24 fps (!)    80 fps
per pixel alpha + RLEACCEL:   30 fps (!)    95 fps
My first reflection: “Someone is reading VRAM here…”
DOUBLEBUF forces HWSURFACE …
Regimental Command
Generic Armored Combat System
http://regcom.sourceforge.net
Well, not on all targets (unfortunately) but it definitely looks like that’s
the case here.
//David
On Friday 23 March 2001 17:47, Randi J. Relander wrote:
David Olofson wrote:
Andy Sy wrote:
120Hz 640x480 16-bit full screen, SWSURFACE
                              DOUBLEBUF     no DOUBLEBUF
plain opaque blit:            120 fps       71 fps (!)
per pixel alpha:              24 fps (!)    80 fps
per pixel alpha + RLEACCEL:   30 fps (!)    95 fps
My first reflection: “Someone is reading VRAM here…”
DOUBLEBUF forces HWSURFACE …