Alpha blending with hardware surfaces

Alpha blending onto hardware surfaces is, from what I’ve found, been told,
and read, hideously slow, unless you are lucky enough to have hardware
accelerated alpha blending (dunno if I’ve ever seen that).

I’m trying to work out the best way to alpha blend at a reasonable speed
in an SDL app that uses a hardware display surface.

My best effort so far is to create two software surfaces: one with alpha
blending enabled, and the other with the standard software flags.

I then blit the alpha surface onto the plain software surface (alpha to
software surface is not too slow), and then blit the software surface to the
display surface (a hardware surface, and so too slow to alpha blend onto
directly).
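To make the idea concrete, here is a rough sketch in plain C, not SDL code. It assumes the usual explanation for the slowdown: blending has to read the destination pixel back, and reads from video memory tend to be slow, so the read-modify-write is kept in system memory and the display only ever receives a straight copy. The buffers and the names blend_channel, blend_sprite and present are made up for illustration. In SDL 1.2 terms, blend_sprite stands in for an SDL_BlitSurface from a surface with SDL_SRCALPHA set, and present stands in for the final blit to the display surface.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Blend one 8-bit channel: src over dst, alpha in 0..255 (rounded). */
uint8_t blend_channel(uint8_t src, uint8_t dst, uint8_t alpha)
{
    return (uint8_t)((src * alpha + dst * (255 - alpha) + 127) / 255);
}

/* Step 1: blend an RGBA sprite onto an RGB back buffer.
 * Both buffers live in system memory, so reading the destination is cheap. */
void blend_sprite(const uint8_t *sprite_rgba, uint8_t *back_rgb, size_t pixels)
{
    assert(sprite_rgba != NULL && back_rgb != NULL);
    for (size_t i = 0; i < pixels; ++i) {
        uint8_t alpha = sprite_rgba[i * 4 + 3];
        for (int c = 0; c < 3; ++c) {
            back_rgb[i * 3 + c] = blend_channel(sprite_rgba[i * 4 + c],
                                                back_rgb[i * 3 + c],
                                                alpha);
        }
    }
}

/* Step 2: copy the finished frame to the (pretend) display buffer.
 * This is write-only with no blending, so nothing is read back from it. */
void present(const uint8_t *back_rgb, uint8_t *display_rgb, size_t pixels)
{
    memcpy(display_rgb, back_rgb, pixels * 3);
}
```

Per frame you would call blend_sprite for everything that needs alpha and then present once at the end, so the cost of touching the display buffer does not grow with the number of blended sprites.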

I dunno, maybe this is useful to other alpha blending newbies like me,
and if not, someone can tell me why I’m really stupid and it’s not a good
