Alpha blitting

“Mattias Engdegård” wrote

no you are correct here. after looking through SDL’s blitting,
it appears pixel alphas will override any surface alpha.
personally, i would prefer pixel alpha being combined with
surface alpha.

It was a design decision, but I don’t think it would break anything if
we changed it to combine the alphas. The problem is that it would
require a whole new blitter, and one that I wasn’t able to implement
efficiently. Basically an extra multiplication for each pixel would be
required, and I had my doubts whether it could ever be hardware
accelerated.

I could write a slow catch-all routine of course, but would that
be useful? I don’t like implementing things that can’t be done well.
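
For reference, combining the two alphas comes down to roughly the following per-pixel work. This is only a sketch and not code from SDL: combine_alpha is a hypothetical name, and both alphas are assumed to be 8-bit values in 0..255. It shows the extra multiply, plus the usual shift trick to avoid dividing by 255.

```c
#include <stdint.h>

/* Hypothetical helper, not part of SDL.
 * Combines a per-pixel alpha with a per-surface alpha (both 0..255)
 * and returns round(pix_alpha * surf_alpha / 255). */
static uint8_t combine_alpha(uint8_t pix_alpha, uint8_t surf_alpha)
{
    /* The one extra multiplication per pixel. */
    uint32_t t = (uint32_t)pix_alpha * surf_alpha + 128;

    /* Rounded division by 255 using shifts and an add only. */
    return (uint8_t)((t + (t >> 8)) >> 8);
}
```

The combined value would then be used as the blend factor in place of the pixel alpha alone.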

i think something like this would be useful. maybe there are
alternatives? it would be great if there was some way to combine
these two (even if it did end up as a slow, catchall blitter)

instead of combining the alphas during the blit, there could be some
routine to premultiply the entire alpha plane for an image? maybe
this could be an option in SDL_SetAlpha?

instead of combining the alphas during the blit, you cap the
maximum alpha value to the surface alpha? this would get rid of
the multiply, but you’d need some sort of
"pix_alpha > surf_alpha ? surf_alpha : pix_alpha" logic, which
might end up slower.

in any event, i’m not sure what directdraw does when applying
surface and pixel alphas? i would assume videocards can handle
it, because they can do something similar with direct3d/opengl.
does the latest stuff in Xfree4 support something like this?

instead of combining the alphas during the blit, there could be some
routine to premultiply the entire alpha plane for an image? maybe
this could be an option in SDL_SetAlpha?

Any reason why this cannot be done by client code?

instead of combining the alphas during the blit, you cap the
maximum alpha value to the surface alpha? this would get rid of
the multiply, but you’d need some sort of
"pix_alpha > surf_alpha ? surf_alpha : pix_alpha" logic, which
might end up slower.

That would probably be slower, but it would be interesting to compare it.
On big stretches of relatively uniform pixel alpha the branch prediction might
make it fast enough. You could also try something like

delta = pix_alpha - surf_alpha;
mask = delta >> 8;
result_alpha = surf_alpha + (delta & mask);

That’s 4 ALU ops; if you don’t have signed (arithmetic) right shift
you need to mask the result. Some CPUs have fast integer multiplies,
but the number of multiplies that can be done at a time is often less
than the number of ALUs that can do simple operations (add/sub,
logical operations, shifts). Only benchmarking can tell.
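
The capping idea can be sketched in C as below, next to the plain conditional so the two can be compared or benchmarked. The function names are made up for illustration, and 8-bit alphas are assumed. The mask is built from the sign bit instead of a signed right shift, so it does not depend on the compiler doing an arithmetic shift, at the cost of one more operation.

```c
#include <stdint.h>

/* Hypothetical names; both alphas are assumed to be 0..255. */

/* Reference version using a conditional. */
static uint8_t cap_alpha_branchy(uint8_t pix_alpha, uint8_t surf_alpha)
{
    return pix_alpha > surf_alpha ? surf_alpha : pix_alpha;
}

/* Branch-free version: min(pix_alpha, surf_alpha). */
static uint8_t cap_alpha_branchless(uint8_t pix_alpha, uint8_t surf_alpha)
{
    int32_t delta = (int32_t)pix_alpha - (int32_t)surf_alpha; /* -255..255 */

    /* All ones when delta is negative, zero otherwise. */
    int32_t mask = -(int32_t)((uint32_t)delta >> 31);

    /* delta < 0: surf + delta = pix;  delta >= 0: surf + 0 = surf. */
    return (uint8_t)(surf_alpha + (delta & mask));
}

/* Counts the inputs on which the two versions disagree. */
static int count_mismatches(void)
{
    int bad = 0;
    for (int p = 0; p < 256; p++)
        for (int s = 0; s < 256; s++)
            if (cap_alpha_branchy((uint8_t)p, (uint8_t)s) !=
                cap_alpha_branchless((uint8_t)p, (uint8_t)s))
                bad++;
    return bad;
}
```

Whether this beats the branch or the multiply on a given CPU is still something only a benchmark can answer.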

“Mattias Engdegård” wrote

instead of combining the alphas during the blit, there could be some
routine to premultiply the entire alpha plane for an image? maybe
this could be an option in SDL_SetAlpha?

Any reason why this cannot be done by client code?

does "the clients are lazy" count? :]
actually it does seem like one of the cleanest solutions.
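
A client-side version might look something like this. It is a rough sketch under stated assumptions, not an SDL routine: scale_alpha_plane is a hypothetical name, and the pixels are assumed to be 32-bit with alpha in the top byte (0xAARRGGBB). A real surface would need its pixel format checked, and locking where required, before touching the pixel data.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of SDL.
 * Scales the alpha channel of every pixel by surf_alpha / 255,
 * leaving the colour channels untouched.
 * Assumes 0xAARRGGBB pixels stored as uint32_t. */
static void scale_alpha_plane(uint32_t *pixels, size_t count,
                              uint8_t surf_alpha)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t a = pixels[i] >> 24;

        /* Rounded a * surf_alpha / 255 without a division. */
        uint32_t t = a * surf_alpha + 128;
        a = (t + (t >> 8)) >> 8;

        pixels[i] = (pixels[i] & 0x00FFFFFFu) | (a << 24);
    }
}
```

After this one-time pass the image can go through the existing per-pixel alpha blitter unchanged, so the per-blit cost stays the same.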

if this train of logic was used too heavily there would be
no software blitters included with sdl at all, heh

i think i’m correct in remembering that the alpha blitters
support premultiplied RGB’s for a small optimization?
perchance i will write a routine to multiply the RGB and/or A
for an image and set the correct flags. would there be any
interest/support for an SDL_MultAlpha() ?

if this train of logic was used too heavily there would be
no software blitters included with sdl at all, heh

We want SDL to make use of the hardware blitting capabilities of the
display target, so we need software blitters to fall back on when that
hardware isn’t available.

i think i’m correct in remembering that the alpha blitters
support premultiplied RGB’s for a small optimization?

Not in SDL they don’t. They might do that in the future, for several reasons.

perchance i will write a routine to multiply the RGB and/or A
for an image and set the correct flags. would there be any
interest/support for an SDL_MultAlpha() ?

I’m not saying it would be a bad idea, but the design of the API should
be carefully thought out. Let’s try to do well-architected designs instead
of a jumble of ad-hoc function calls.