Blitting from RGBA surface is THAT slow?

Hello all,

I've encountered another problem. My artist brought me some map
objects: trees, stones, walls, etc. He saved them all as PNGs
with an alpha channel.
I loaded these pictures with IMG_Load, then converted them with
SDL_DisplayFormatAlpha. Now, with even 5-6 trees on the screen,
performance is terribly low.
I re-render every frame, drawing the terrain, map objects, GUI
and cursor. Objects that have no partially transparent areas
are drawn rather fast (0-1 ms). Objects such as a circle with gradient
transparency are drawn very slowly (~33 ms). Does anyone know
the possible reasons?

I would be very grateful for help; this problem totally paralyses
my work…

Best regards,
Flashback

Flashback wrote:
[…]

and cursor. Objects that have no partially transparent areas
are drawn rather fast (0-1 ms). Objects such as a circle with gradient
transparency are drawn very slowly (~33 ms). Does anyone know
the possible reasons?

Reading from video memory is very slow, so if you are using a display
surface created with SDL_HWSURFACE, try to use SDL_SWSURFACE if you want
to blit RGBA surfaces (transparency == you have to read the destination).
You may also use the SDL_RLEACCEL flag with SDL_SetAlpha to gain some
additional speed (no test for each pixel).

Gautier.

Alpha blending (rendering with a source A channel) is a rather
expensive operation, compared to plain opaque blitting. More
severely, it involves both reading and writing the destination
surface - and if the destination surface is in VRAM (a hardware
surface), the reads become insanely slow on most modern hardware.
(I measured the read bandwidth on some P-II class machine and found
it to be similar to that of reading from an old IDE hard drive…
heh)
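
To make the read-modify-write cost concrete, here is a minimal sketch of what a software blitter has to do for a single pixel. It assumes 32-bit ARGB8888 pixels and an opaque destination; `blend_argb` is a hypothetical helper written for illustration, not SDL's actual blitter code.

```c
#include <stdint.h>

/* Blend one ARGB8888 source pixel over an opaque destination pixel.
 * For any alpha strictly between 0 and 255 the destination has to be
 * READ before the result can be written back. That read is the part
 * that gets slow when the destination lives in video memory. */
static uint32_t blend_argb(uint32_t src, uint32_t dst)
{
    uint32_t a = src >> 24;

    if (a == 0)
        return dst;              /* fully transparent: nothing to write */
    if (a == 255)
        return src;              /* fully opaque: plain copy, no read   */

    uint32_t out = 0xFF000000u;  /* assumption: result stays opaque     */
    for (int shift = 0; shift <= 16; shift += 8) {
        uint32_t s = (src >> shift) & 0xFFu;
        uint32_t d = (dst >> shift) & 0xFFu;   /* the destination read  */
        uint32_t c = (s * a + d * (255u - a) + 127u) / 255u;
        out |= c << shift;
    }
    return out;
}
```

Only the middle case needs the destination value, which is why the suggestions below focus on keeping the number of partially transparent pixels small and on blending into system memory.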

Three suggestions:

  • Use a backend that hardware accelerates alpha.
    Practically impossible, unless you can live with
    accelerated OpenGL as a system requirement - but
    if you can, and really need massive amounts of
    blending and/or ultra smooth full screen scrolling,
    you can either use glSDL (which allows your game
    to “run” without OpenGL as well), or use OpenGL
    directly. If you really require the power of
    full h/w acceleration, the latter is probably the
    way to go, but I think that would be hard to
    motivate for most games.

  • Avoid doing the alpha blending directly to hardware
    surfaces. It’s often faster to do all rendering in
    an off-screen software surface and then do only
    opaque rectangular blits to the display surface.

  • Keep your alpha blended areas minimal and use SDL’s
    RLE acceleration. “Clean” the alpha channels so that
    nearly opaque areas become opaque and nearly
    transparent areas become fully transparent. (You may
    have your artist do that, or you can add a filter in
    your loader.)

Using the latter two “tricks”, rendering with reasonable amounts of
alpha blending (mainly around the edges of objects, for antialiasing)
can be practically as fast as plain opaque/colorkey rendering.

//David Olofson - Programmer, Composer, Open Source Advocate

.- Audiality -----------------------------------------------.
| Free/Open Source audio engine for games and multimedia. |
| MIDI, modular synthesis, real time effects, scripting,… |
`-----------------------------------> http://audiality.org -’
http://olofson.net http://www.reologica.se

(In reply to Flashback's message of Sunday 26 December 2004, 18.48.)

Note: If you just use SDL_SWSURFACE, you can no longer make use of
hardware page flipping and/or retrace sync for tearing free
animation.

To get the best of both worlds, you can ask for a double buffered
hardware surface, and if you get one (not always possible; check
after init), set up a third software surface for rendering. If you
don’t get a hardware surface, just use the (shadow) surface that SDL
provides; it’s the best you can get under such circumstances.

//David Olofson - Programmer, Composer, Open Source Advocate

(In reply to Gautier Portet's message of Sunday 26 December 2004, 20.48.)

Hello all,

Gautier Portet wrote:

Reading from video memory is very slow, so if you are using a display
surface created with SDL_HWSURFACE, try to use SDL_SWSURFACE if you want
to blit RGBA surfaces (transparency == you have to read the destination).
You may also use the SDL_RLEACCEL flag with SDL_SetAlpha to gain some
additional speed (no test for each pixel).

Hmm… Does this mean I need to call SDL_SetAlpha for each loaded
image? It’s not a big deal, though. Thanks, I’ll try it.

David Olofson wrote:

Three suggestions:

  • Use a backend that hardware accelerates alpha.
    Practically impossible, unless you can live with
    accelerated OpenGL as a system requirement - but
    if you can, and really need massive amounts of
    blending and/or ultra smooth full screen scrolling,
    you can either use glSDL (which allows your game
    to “run” without OpenGL as well), or use OpenGL
    directly. If you really require the power of
    full h/w acceleration, the latter is probably the
    way to go, but I think that would be hard to
    motivate for most games.

Unfortunately, I can’t use OpenGL in my game for various reasons.
One of them is that I don’t know OpenGL well enough yet (what a
shame! :) ). It’s not hard to learn, I suppose, but my time is
very limited, and I can’t rebuild the whole project. So I think I
must stay with pure 2D SDL.

  • Avoid doing the alpha blending directly to hardware
    surfaces. It’s often faster to do all rendering in
    an off-screen software surface and then do only
    opaque rectangular blits to the display surface.

That may be a real way out. Gautier Portet also mentioned
using SW surfaces, and that should work. I can’t
overestimate your valuable ideas, thanks a lot!

  • Keep your alpha blended areas minimal and use SDL’s
    RLE acceleration. “Clean” the alpha channels so that
    nearly opaque areas become opaque and nearly
    transparent areas become fully transparent. (You may
    have your artist do that, or you can add a filter in
    your loader.)

Sure, I can tell the artist to do that, but the poor guy has had too
much work lately :) You said something about a filter; what did
you mean? Are there any small examples of how to achieve that result?
(Though, if you don’t have time to write examples, I think
I’ll go with your second suggestion.)

Using the latter two “tricks”, rendering with reasonable amounts of
alpha blending (mainly around the edges of objects, for antialiasing)
can be practically as fast as plain opaque/colorkey rendering.

Thanks, that should work!

To get the best of both worlds, you can ask for a double buffered
hardware surface, and if you get one (not always possible; check
after init)

Sure, I do this after SDL_SetVideoMode:

   "if (0 == screen->flags & SDL_HWSURFACE)"

It has always worked, but I want to be sure: is this a correct
way to check that?

…set up a third software surface for rendering.

Looks like a kind of triple buffering, doesn’t it?

Best regards,
Flashback

[…]

Unfortunately, I can’t use OpenGL in my game for various
reasons. One of them is that I don’t know OpenGL well enough yet
(what a shame! :) ). It’s not hard to learn, I suppose, but my
time is very limited, and I can’t rebuild the whole project. So I
think I must stay with pure 2D SDL.

Well, you can avoid that problem by using glSDL (the wrapper or
preferably the just released backend patch) - but if you rely on it
for playable performance (rather than just using it for some extra
speed where OpenGL is available), you still have the OpenGL
requirement problem. You can snap it in and give it a try, but I
suspect that it’s more important to get the best out of the other
backends first, since OpenGL is not available on every machine.

[…]

Sure, I can tell the artist to do that, but the poor guy has had too
much work lately :) You said something about a filter; what did
you mean?

Something that scans the alpha channel pixel by pixel, setting nearly
transparent pixels to transparent and setting nearly opaque pixels to
opaque. You can also add some scaling and offset, to get an “alpha
contrast” filter.

Are there any small examples of how to achieve that result?
(Though, if you don’t have time to write examples, I think
I’ll go with your second suggestion.)

There is code for that, and some other filters, in Kobo Deluxe. That’s
very far from a small example, though! :D
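
For readers who want something much smaller, here is a minimal sketch of such a filter. It is not the Kobo Deluxe code. It assumes the pixels are already available as a plain array of 32-bit ARGB8888 values with alpha in the top byte; with a real surface you would go through the surface's own pixel format masks instead. The names `clean_alpha` and `alpha_contrast` and the threshold values are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Snap nearly transparent pixels to fully transparent and nearly opaque
 * pixels to fully opaque. Everything strictly between lo and hi keeps
 * its original alpha. Assumes ARGB8888 with alpha in bits 24..31. */
static void clean_alpha(uint32_t *pixels, size_t count,
                        uint8_t lo, uint8_t hi)
{
    assert(lo < hi);
    for (size_t i = 0; i < count; i++) {
        uint32_t a = pixels[i] >> 24;
        if (a <= lo)
            a = 0;
        else if (a >= hi)
            a = 255;
        pixels[i] = (pixels[i] & 0x00FFFFFFu) | (a << 24);
    }
}

/* "Alpha contrast": scale the alpha value, add an offset, and clamp the
 * result to 0..255. A scale above 100% with a negative offset pushes
 * more pixels to the fully transparent or fully opaque ends. */
static uint8_t alpha_contrast(uint8_t a, int scale_percent, int offset)
{
    int v = (int)a * scale_percent / 100 + offset;
    if (v < 0)
        v = 0;
    if (v > 255)
        v = 255;
    return (uint8_t)v;
}
```

A loader could run `clean_alpha(pixels, w * h, 16, 240)` once per image, before converting it and enabling RLE acceleration. Those thresholds are only a starting point and would need tuning against the actual artwork.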

I’ve optimized and cleaned up some things since the last release, but
I just don’t have time to finish it. (The radar screen is broken -
but at least the rest should work with double buffered page flipping
OpenGL displays now… And there are some new sounds, bug fixes and
stuff.) I guess I should release it as is for now.

[…]

To get the best of both worlds, you can ask for a double buffered
hardware surface, and if you get one (not always possible; check
after init)

   Sure, I do this after SDL_SetVideoMode:

   "if (0 == screen->flags & SDL_HWSURFACE)"

   It has always worked, but I want to be sure: is this a
   correct way to check that?

Well, you can leave out the “0 ==” part, since if() takes any non-zero
value as “true”…

Note that when checking for the true condition, it’s recommended to do
it like this:

if((screen->flags & SDL_SOMEFLAG) == SDL_SOMEFLAG)

This is because a #define (or enum) like that could be a set of bits
and not just one bit. (I usually avoid that when I design APIs,
though. Using bit flags is messy enough as it is. :D )
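
One more detail in the quoted condition may be worth checking: in C, `==` binds tighter than `&`, so `0 == flags & FLAG` is parsed as `(0 == flags) & FLAG`. The sketch below spells out both readings. `FLAG_HW` and `FLAG_COMBO` are hypothetical constants chosen for illustration, not SDL's real flag values.

```c
#include <stdint.h>

#define FLAG_HW    0x00000001u  /* hypothetical single-bit flag */
#define FLAG_COMBO 0x00000006u  /* hypothetical multi-bit flag  */

/* What the compiler sees for `0 == flags & FLAG_HW`: the comparison
 * happens first, then its 0/1 result is masked with the flag. */
static int missing_as_parsed(uint32_t flags)
{
    return (int)((0 == flags) & FLAG_HW);
}

/* What was probably intended: mask first, then compare. */
static int missing_intended(uint32_t flags)
{
    return (flags & FLAG_HW) == 0;
}

/* The recommended form for the "true" case: it also works when the
 * flag is made up of several bits that must all be set. */
static int has_all(uint32_t flags, uint32_t mask)
{
    return (flags & mask) == mask;
}
```

With these definitions, a flags word that has other bits set but lacks `FLAG_HW` is reported as missing only by `missing_intended`. Likewise `has_all(0x2, FLAG_COMBO)` is false even though `0x2 & FLAG_COMBO` is non-zero, which is the multi-bit case described above.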

…set up a third software surface for rendering.

   Looks like a kind of triple buffering, doesn't it?

Almost, but since the extra buffer is different from the page flipping
pair, I prefer to refer to it as “semi-triple buffering”, to
distinguish it from “normal” triple buffering, which uses three
actual hardware pages to cut the application and/or rendering
hardware some extra slack.

//David Olofson - Programmer, Composer, Open Source Advocate

(In reply to Flashback's message of Monday 27 December 2004, 09.51.)

[…]

   "if (0 == screen->flags & SDL_HWSURFACE)"

   It has always worked, but I want to be sure: is this a
   correct way to check that?

Well, you can leave out the “0 ==” part, since if() takes any
non-zero value as “true”…

Sorry; “leave out” isn’t the whole answer. You’ll obviously have to
invert the if() cases, or use the ‘!’ operator. :)

Either way, it’s just a matter of taste. I prefer to keep the code as
short as possible without making it incomprehensible, simply because
less code means less reading and less code to misunderstand. I
especially dislike if()s with hairy, multiline conditions, which is
one reason to do away with everything that isn’t strictly required.

//David Olofson - Programmer, Composer, Open Source Advocate

(In reply to David Olofson's message of Monday 27 December 2004, 13.00.)

a) Turn off the alpha channel ( eg with GIMP - unify all channels ), if
you don’t need them.

b) Convert surfaces to display format

c) Enable RLE ColorKey for surfaces with a lot of transparent pixels (
if you don’t need to change them )

d) Enable RLE Alpha for surfaces with lot of semi-transparent pixels (
if you don’t need to change them )

e) Create a buffer for background and surfaces that are rarely modified

f) Try hardware surface ( this depends on the hardware support - check
first, before setting video-mode, if hardware has accelerated alpha
surface blitting support, otherwise software surface may be faster )

g) Or forget all that, and render images with OpenGL API ( but I have
little experience with it … )

Hope this helps you.

Regards;
Pedro Amaral Couto

(In reply to Flashback's message of Sunday 2004-12-26, 17:48.)

Hello David Olofson, hello all.

  David Olofson wrote:

Well, you can avoid that problem by using glSDL (the wrapper or
preferably the just released backend patch) - but if you rely on it
for playable performance (rather than just using it for some extra
speed where OpenGL is available), you still have the OpenGL
requirement problem. You can snap it in and give it a try, but I
suspect that it’s more important to get the best out of the other
backends first, since OpenGL is not available on every machine.

Yep, I’ll try glSDL. Sounds promising. Is there a direct link,
or do I just need to google for it?

Something that scans the alpha channel pixel by pixel, setting nearly
transparent pixels to transparent and setting nearly opaque pixels to
opaque. You can also add some scaling and offset, to get an “alpha
contrast” filter.

Very good idea. I’ll try it today :)

There is code for that, and some other filters, in Kobo Deluxe. That’s
very far from a small example, though! :D

No doubt :)

Note that when checking for the true condition, it’s recommended to do
it like this:

if((screen->flags & SDL_SOMEFLAG) == SDL_SOMEFLAG)

Knowledge base updated. Thanks!

This is because a #define (or enum) like that could be a set of bits
and not just one bit. (I usually avoid that when I design APIs,
though. Using bit flags is messy enough as it is. :D )

Agreed. I’d rather use several bools :)

Almost, but since the extra buffer is different from the page flipping
pair, I prefer to refer to it as “semi-triple buffering”, to
distinguish it from “normal” triple buffering, which uses three
actual hardware pages to cut the application and/or rendering
hardware some extra slack.

I did some tests; here are the results, for anyone who is
interested:

Description of variables and processes:

SDL_Surface *screen = SDL_SetVideoMode(SCREEN_W, SCREEN_H, SCREEN_BPP,
                          SDL_HWSURFACE | SDL_FULLSCREEN | SDL_DOUBLEBUF);
SDL_Surface *terra, *scene;
objects: array of SDL_Surface pointers

Rendering scene (terra):
    SDL_BlitSurface(terra, &viewport, scene, NULL);

Rendering scene (objects):
    for (int i = 0; i < objects_count; i++)
        SDL_BlitSurface(objects[i], NULL, scene, &dest_rect);

Drawing scene:
    SDL_BlitSurface(scene, NULL, screen, NULL);

Test results (all times in ms; HW = HW_SURFACE, SW = SW_SURFACE):

terra  scene  objects | render terra  render objects  draw scene | total
------------------------------------------------------------------------
HW     HW     HW      |            0               8           0 |     8
HW     HW     SW      |            0              96           0 |    96
HW     SW     HW      |          366              13          16 |   395
HW     SW     SW      |          363              15          16 |   394
SW     HW     HW      |           17              10           0 |    27
SW     HW     SW      |           16              75           0 |    91
SW     SW     HW      |           15              18          17 |    50
SW     SW     SW      |           15              14          16 |    45


Best regards,
Flashback

The glSDL backend patch for SDL:
http://icps.u-strasbg.fr/~marchesin/sdl/glsdl-final.patch

The latest snapshot of the wrapper version:
http://www.olofson.net/download/glSDL-20040602.tar.gz

(Older versions and other stuff in the same directory:)
http://www.olofson.net/download

You should use the first one, which we hope will be included in SDL
eventually - but if you don’t feel like compiling your own SDL lib or
don’t want to mess with SDL for other reasons, you can give the
wrapper version a try.

//David Olofson - Programmer, Composer, Open Source Advocate

(In reply to Flashback's message of Monday 27 December 2004, 15.24.)

Hello all,

  PedroAC wrote:

a) Turn off the alpha channel ( eg with GIMP - unify all channels ), if
you don’t need them.

I need them :) Also, I use Photoshop 7: when I flatten the image
(unify all channels), I get a white background, and semi-transparent
pixels are mixed with white. That’s not what I need.

b) Convert surfaces to display format

I always do that.

c) Enable RLE ColorKey for surfaces with a lot of transparent pixels (
if you don’t need to change them )
d) Enable RLE Alpha for surfaces with lot of semi-transparent pixels (
if you don’t need to change them )

What do you mean by “if you don’t need to change them”? How does
SDL_RLEACCEL affect writing to surface pixels?

f) Try hardware surface ( this depends on the hardware support - check
first, before setting video-mode, if hardware has accelerated alpha
surface blitting support, otherwise software surface may be faster )

Sounds strange, but SDL_GetVideoInfo says:

Possible to create hardware surfaces: 1
Window manager available: 1
Hardware to hardware blits accelerated: 1
Hardware to hardware colorkey blits accelerated: 1
Hardware to hardware alpha blits accelerated: 0
Software to hardware blits accelerated: 1
Software to hardware colorkey blits accelerated: 1
Software to hardware alpha blits accelerated: 0
Color fills accelerated: 1
Total amount of video memory in Kilobytes: 126950

Though I have a GeForce FX 5200, and it's not that old... Why doesn't it
support alpha blits then?

g) Or forget all that, and render images with OpenGL API ( but I have
little experience with it … )

I’ll try glSDL soon. Hope it will work fine.

Hope this helps you.

Thanks!

Best regards,
Flashback

Hello all,

  David Olofson wrote:

The glSDL backend patch for SDL:
http://icps.u-strasbg.fr/~marchesin/sdl/glsdl-final.patch

The latest snapshot of the wrapper version:
http://www.olofson.net/download/glSDL-20040602.tar.gz

(Older versions and other stuff in the same directory:)
http://www.olofson.net/download

You should use the first one, which we hope will be included in SDL
eventually - but if you don’t feel like compiling your own SDL lib or
don’t want to mess with SDL for other reasons, you can give the
wrapper version a try.

I think I’ll use the compiled version. I tried playing with the SDL
sources some time ago, but that didn’t go well :) I’d rather use
stable stuff.
Thanks for the links!

Best regards,
Flashback

Flashback wrote:

d) Enable RLE Alpha for surfaces with lot of semi-transparent pixels
( if you don’t need to change them )

What do you mean by “if you don’t need to change them”? How does
SDL_RLEACCEL affect writing to surface pixels?

think about it. as pixel data is stored in a compressed format, you can
not address them directly, you first have to decompress them. by the
way, does SDL_LockSurface() decompress RLE surfaces?

clemens

[…]

Though I have a GeForce FX 5200, and it's not that old... Why doesn't
it support alpha blits then?

It doesn’t matter what card you have. It’s a driver and/or API
limitation. Accelerated blending is only supported by 3D APIs
(OpenGL, Direct3D) and some of the latest generation 2D APIs. AFAIK,
with SDL, only the DirectFB and glSDL backends can provide
accelerated alpha blending.

//David Olofson - Programmer, Composer, Open Source Advocate

(In reply to Flashback's message of Tuesday 28 December 2004, 15.29.)

On Tue, 2004-12-28 at 14:29, Flashback wrote:

  PedroAC wrote:

a) Turn off the alpha channel ( eg with GIMP - unify all channels ), if
you don’t need them.

   What do you mean by "if you don't need to change them"? How does
   SDL_RLEACCEL affect writing to surface pixels?

I mean: if you don’t need to change the surface pixels ( modify the image
). SDL_RLE* compresses the pixel data - if you try to modify it, SDL needs
to unpack this data ( lock(unpack) -> change pixels -> unlock(pack) ).
www.prepressure.com/techno/compressionrle.htm
www.nondot.org/~sabre/graphpro/sprite3.html
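
To illustrate why packed pixel data is fast to blit but awkward to modify, here is a simplified sketch of colorkey-style run-length encoding for one row. It is not SDL's internal RLE format; the `Span` type and both function names are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One span: skip this many transparent pixels, then copy `run` pixels. */
typedef struct {
    size_t skip;
    size_t run;
} Span;

/* Encode one row. Pixels equal to `key` are treated as transparent and
 * are not stored at all; the visible pixels are packed back to back.
 * `spans` and `packed` must each have room for `width` entries.
 * Returns the number of spans written. */
static size_t rle_encode_row(const uint32_t *row, size_t width,
                             uint32_t key, Span *spans, uint32_t *packed)
{
    size_t n = 0, x = 0, p = 0;

    while (x < width) {
        size_t start = x;
        while (x < width && row[x] == key)
            x++;
        spans[n].skip = x - start;

        start = x;
        while (x < width && row[x] != key)
            packed[p++] = row[x++];
        spans[n].run = x - start;
        n++;
    }
    return n;
}

/* Blit an encoded row: no per-pixel transparency test is needed, only
 * pointer skips and block copies. */
static void rle_blit_row(const Span *spans, size_t n,
                         const uint32_t *packed, uint32_t *dst)
{
    for (size_t i = 0; i < n; i++) {
        dst += spans[i].skip;
        memcpy(dst, packed, spans[i].run * sizeof *dst);
        dst += spans[i].run;
        packed += spans[i].run;
    }
}
```

Once a row is stored this way there is no simple `row[x]` to write to: changing one pixel can split or merge spans. That is the reason a surface has to be unpacked before it is modified and packed again afterwards.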

When using OpenGL you shouldn’t use colorkey - use the alpha channel
instead ( I read that in an article about glSDL ).

Hope you all had a Merry Christmas, and have a Happy New Year!

Hello all,

  Clemens Kirchgatterer wrote:

think about it. as pixel data is stored in a compressed format, you can
not address them directly, you first have to decompress them.

I just didn’t know what kind of acceleration SDL_RLEACCEL gives
me (I’ve had no chance to study that subject yet).

by the way, does SDL_LockSurface() decompress RLE surfaces?

PedroAC says it does, as I understand it :)

Though I have a GeForce FX 5200, and it's not that old... Why doesn't
it support alpha blits then?

David Olofson wrote:

It doesn’t matter what card you have. It’s a driver and/or API
limitation. Accelerated blending is only supported by 3D APIs
(OpenGL, Direct3D) and some of the latest generation 2D APIs. AFAIK,
with SDL, only the DirectFB and glSDL backends can provide
accelerated alpha blending.

Ah, I see. Thanks for clarifying that for me.
I’m beginning to like glSDL already, though I haven’t tried it yet.

  PedroAC wrote:

I mean: if you don’t need to change the surface pixels ( modify the image
). SDL_RLE* compresses the pixel data - if you try to modify it, SDL needs
to unpack this data ( lock(unpack) -> change pixels -> unlock(pack) ).
www.prepressure.com/techno/compressionrle.htm
www.nondot.org/~sabre/graphpro/sprite3.html

I’ll definitely check that out, thanks.

Hope you all had a Merry Christmas, and have a Happy New Year!

Ukrainian Christmas is on the 7th of January, so I’ll have some
sweet (and much desired!) rest later.
Have a good new year 2005, developers! :)
I have seen lots of skillful people here (who have a big future, I
think), so I wish you all many successful projects!

Best regards,
Flashback