Faster blitting?

According to the documentation, SDL_DisplayFormat() converts
a surface for the fastest possible blitting. On the other hand,
we have RLE blitting for color-keyed images.

Would it be possible to make an even faster format while
supporting [more than 1 bit] alpha?

Ideally, this format would encode each scanline as runs of 1)
empty pixels, 2) opaque pixels, 3) pixels with intermediate
alpha. For cases 2) and 3) the format would have either an 8 bit
index and an 8 bit alpha or 24 color bits and an 8 bit alpha.

Would this be faster than plain alpha blitting?

About a paletted image, isn’t a palette lookup and a write faster
than a full 24/32 bit read and a 24/32 bit write? The 768 byte
palette should fit in the cache and using 8bpp values should mean
that image data is read 3x/4x faster. I’m sorry if this is
completely wrong, I don’t have much experience with optimizing
cache usage. Please clarify.

I know very few formats, if any (some TGA format?) support an 8bpp
indexed image and 8 bit alpha. AFAIK GIF and PNG support paletted
images only with 1 bit alpha. However, one could take a 32 bit PNG,
split it into a 24 bit BMP (colormap) and an 8 bit BMP (alpha),
quantize the colormap BMP and store both. At runtime, you could load
both images normally and then combine them into the proposed format
(automagically, of course - name them as image. and
image.alpha. and encapsulate it in your image loading
functions/classes)
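
The combining step could look roughly like the following. This is only a minimal sketch in C, assuming both images are already loaded as flat 8-bit arrays of the same size; `IndexedAlphaPixel` and `combine_index_alpha` are made-up names for illustration, not part of SDL or SDL_image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pixel layout: one palette index plus one alpha byte. */
typedef struct {
    uint8_t index;  /* entry in the quantized colormap */
    uint8_t alpha;  /* 0 = fully transparent, 255 = fully opaque */
} IndexedAlphaPixel;

/* Interleave an 8-bit indexed image and an 8-bit alpha image of the
   same dimensions into one array of (index, alpha) pairs.
   The caller provides an output buffer of pixel_count entries. */
void combine_index_alpha(const uint8_t *indices, const uint8_t *alpha,
                         size_t pixel_count, IndexedAlphaPixel *out)
{
    assert(indices && alpha && out);
    for (size_t i = 0; i < pixel_count; i++) {
        out[i].index = indices[i];
        out[i].alpha = alpha[i];
    }
}
```

The run-length encoding into empty, opaque and translucent runs would then be a second pass over this array.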

If you think this could deliver a performance improvement, I’m
willing to implement all of this in C. I’d rather leave the MMX/SSE
optimization to someone with experience in these areas.

Thanks,
–Gabriel

Ing. Gabriel Gambetta
ARTech - GeneXus Development Team
ggambett at artech.com.uy

My normal reply to this would be to tell you to use OpenGL and use
hardware acceleration to make it fast. But, you have asked an
interesting question and I know you well enough to know that you already
know my standard answer :)

I’m going to be a bit verbose in the following because I want everyone
to understand what I said.

According to the documentation, SDL_DisplayFormat() converts
a surface for the fastest possible blitting. On the other hand,
we have RLE blitting for color-keyed images.

Would it be possible to make an even faster format while
supporting [more than 1 bit] alpha?

Ideally, this format would encode each scanline as runs of 1)
empty pixels, 2) opaque pixels, 3) pixels with intermediate
alpha. For cases 2) and 3) the format would have either an 8 bit
index and an 8 bit alpha or 24 color bits and an 8 bit alpha.

Would this be faster than plain alpha blitting?

RLE stands for run length encoding. The idea of RLE is to get rid of
runs of pixels that have the same color. In a “normal” RLE encoded image
there are two kinds of blocks of data. Each block starts with a tag that
says whether the block is RLE compressed or is just a block of pixels.
RLE encoded blocks have the form tag/length/pixel while unencoded blocks
have the form tag/length/pixel/pixel/…/pixel. When decoding these
blocks you either put “length” copies of the single pixel value in the
image or you copy all the saved pixels into the image.
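
As a rough illustration of the decoding loop just described, here is a minimal sketch in C. The exact byte layout is an assumption made for this example (one header byte per block, high bit as the tag, 8-bit pixels), and `rle_decode` is a made-up name, not an SDL function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout for this sketch only: each block starts with one header
   byte. High bit set = run block (header, pixel); high bit clear = literal
   block (header, pixel, pixel, ...). The low 7 bits hold the length. */
#define RLE_RUN_FLAG 0x80u
#define RLE_LEN_MASK 0x7Fu

/* Decodes 8-bit pixels from src into dst.
   Returns the number of pixels written, or -1 if the input is truncated
   or dst is too small. */
long rle_decode(const uint8_t *src, size_t src_len,
                uint8_t *dst, size_t dst_cap)
{
    size_t in = 0, out = 0;

    assert(src && dst);
    while (in < src_len) {
        uint8_t header = src[in++];
        size_t len = header & RLE_LEN_MASK;

        if (len > dst_cap - out)
            return -1;                  /* output buffer too small */
        if (header & RLE_RUN_FLAG) {
            if (in >= src_len)
                return -1;              /* run block without its pixel */
            uint8_t pixel = src[in++];
            for (size_t i = 0; i < len; i++)
                dst[out++] = pixel;     /* "length" copies of one pixel */
        } else {
            if (len > src_len - in)
                return -1;              /* literal block cut short */
            for (size_t i = 0; i < len; i++)
                dst[out++] = src[in++]; /* copy the saved pixels */
        }
    }
    return (long)out;
}
```

For example, the bytes `0x83, 7, 0x02, 1, 2` would decode to the five pixels 7, 7, 7, 1, 2.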

There is no reason why the color values in an RLE encoded image can’t
contain an alpha value as part of the color. By doing a few extra steps
in the decompression code or by adding extra tags you could special case
compressed blocks with and without alpha and uncompressed blocks with
and without alpha. That would let you have alpha in an RLE image and you
would only pay for using alpha on blocks of data that contain alpha.
That seems like a pretty good idea.

One gotcha though. The tag/length part of a block is usually encoded as
a single byte with the high order bit used as the tag. That means that a
block can’t be more than 127 pixels long. If you have four tag values
then you need 2 bits for tags and you cut the maximum length of a block
to 63 pixels. Nowadays I can’t see any good reason not to use longer
tag/length values so this is not a big deal.
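
To show the arithmetic, here is a small sketch in C of a header with 2 tag bits, first in one byte and then in 16 bits. The names (`pack_header`, `MAX_LEN` and so on) and the choice of putting the tag in the top bits are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 8-bit header: 2 tag bits on top, 6 length bits below. */
enum { TAG_BITS = 2, LEN_BITS = 8 - TAG_BITS, MAX_LEN = (1 << LEN_BITS) - 1 };

uint8_t pack_header(unsigned tag, unsigned len)
{
    assert(tag < (1u << TAG_BITS));
    assert(len <= MAX_LEN);
    return (uint8_t)((tag << LEN_BITS) | len);
}

unsigned header_tag(uint8_t h) { return (unsigned)h >> LEN_BITS; }
unsigned header_len(uint8_t h) { return (unsigned)h & MAX_LEN; }

/* Hypothetical 16-bit header: same 2 tag bits, 14 length bits. */
enum { LEN_BITS16 = 16 - TAG_BITS, MAX_LEN16 = (1 << LEN_BITS16) - 1 };

uint16_t pack_header16(unsigned tag, unsigned len)
{
    assert(tag < (1u << TAG_BITS));
    assert(len <= MAX_LEN16);
    return (uint16_t)((tag << LEN_BITS16) | len);
}
```

With one byte the longest block is 63 pixels; with two bytes it grows to 16383, which is longer than any realistic scanline.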

All in all, there is no good reason not to do what you propose. It would
be very nice to have for antialiased sprites and fonts.

About a paletted image, isn’t a palette lookup and a write faster
than a full 24/32 bit read and a 24/32 bit write? The 768 byte
palette should fit in the cache and using 8bpp values should mean
that image data is read 3x/4x faster. I’m sorry if this is
completely wrong, I don’t have much experience with optimizing
cache usage. Please clarify.

Not so much a cache issue as an extra work issue. If I have a pixel
value in the format I need I can just store it where it needs to go. In
the case of compressed data blocks the pixel winds up in a register and
just gets stored several times into memory. For uncompressed blocks you
have to read each pixel from one location in memory and then store it
into another location. If you have to look up the pixel in a palette
then for each pixel you have to read the pixel, index into the
palette and get the real pixel value, and then store the pixel. That
means you have to do an extra read for each pixel. You also have to
compute an index for each palette access. The effect of the extra
operations will affect different CPUs in different ways. There might be
a slow down. There might not.
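
The difference in per-pixel work can be seen in a minimal C sketch. The function names and the 32-bit destination format are assumptions for illustration; this is not how SDL’s blitters are actually written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pixels already in the destination format: one read, one write each. */
void copy_direct(uint32_t *dst, const uint32_t *src, size_t n)
{
    assert(dst && src);
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Paletted source: read the 8-bit index, compute the palette address,
   read the real pixel value, then write it. That is one extra read and
   one extra address computation per pixel. */
void copy_paletted(uint32_t *dst, const uint8_t *src,
                   const uint32_t palette[256], size_t n)
{
    assert(dst && src && palette);
    for (size_t i = 0; i < n; i++)
        dst[i] = palette[src[i]];
}
```

Which loop wins on a given CPU would have to be measured; the sketch only shows where the extra operations come from.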

I know very few formats, if any (some TGA format?) support an 8bpp
indexed image and 8 bit alpha. AFAIK GIF and PNG support paletted
images only with 1 bit alpha. However, one could take a 32 bit PNG,
split it into a 24 bit BMP (colormap) and an 8 bit BMP (alpha),
quantize the colormap BMP and store both. At runtime, you could load
both images normally and then combine them into the proposed format
(automagically, of course - name them as image. and
image.alpha. and encapsulate it in your image loading
functions/classes)

If you think this could deliver a performance improvement, I’m
willing to implement all of this in C. I’d rather leave the MMX/SSE
optimization to someone with experience in these areas.

It seems to be worth testing. And having RLE encoded anti-aliased
sprites and fonts would be pretty nice to have.

	Bob Pendleton

On Wed, 2004-03-31 at 09:29, Gabriel Gambetta wrote:

Thanks,
–Gabriel

Ing. Gabriel Gambetta
ARTech - GeneXus Development Team
ggambett at artech.com.uy



According to the documentation, SDL_DisplayFormat() converts
a surface for the fastest possible blitting. On the other hand,
we have RLE blitting for color-keyed images.

Would it be possible to make an even faster format while
supporting [more than 1 bit] alpha?

Ideally, this format would encode each scanline as runs of 1)
empty pixels, 2) opaque pixels, 3) pixels with intermediate
alpha. For cases 2) and 3) the format would have either an 8 bit
index and an 8 bit alpha or 24 color bits and an 8 bit alpha.

Would this be faster than plain alpha blitting?

Yes - and looking at the performance improvement when using RLE on
alpha surfaces (antialiased sprites and stuff), I’m quite certain
that SDL is already doing what you suggest.

For example, the rounded framework with shadows and glass highlights
in Kobo Deluxe is a single RGBA surface that’s blitted over the
playfield after rendering each frame. Doing that without RLE results
in a terrible frame rate even on P-II+ CPUs (tried it), but as it is,
it’s actually the fastest way to do it. Fully transparent pixels are
skipped and opaque pixels are just copied.
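
To make the idea concrete, here is a minimal sketch of such a scanline blit in C. It is not SDL’s actual RLE code; the `Run` structure, the separate pixel stream and the ARGB8888 layout are all assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical run kinds for one scanline. */
typedef enum { RUN_SKIP, RUN_OPAQUE, RUN_BLEND } RunKind;
typedef struct { RunKind kind; uint16_t len; } Run;

/* Blend one ARGB8888 source pixel over an opaque destination pixel. */
static uint32_t blend_pixel(uint32_t src, uint32_t dst)
{
    uint32_t a = src >> 24;
    uint32_t out = 0xFF000000u;
    for (int shift = 0; shift < 24; shift += 8) {
        uint32_t s = (src >> shift) & 0xFFu;
        uint32_t d = (dst >> shift) & 0xFFu;
        out |= ((s * a + d * (255u - a) + 127u) / 255u) << shift;
    }
    return out;
}

/* Blit one encoded scanline onto dst. SKIP runs carry no pixel data;
   OPAQUE and BLEND runs consume len pixels each from the pixel stream.
   The caller must make sure dst has room for the sum of all run lengths. */
void blit_rle_row(uint32_t *dst, const Run *runs, size_t nruns,
                  const uint32_t *pixels)
{
    assert(dst && runs && pixels);
    for (size_t r = 0; r < nruns; r++) {
        size_t len = runs[r].len;
        switch (runs[r].kind) {
        case RUN_SKIP:              /* fully transparent: touch nothing */
            break;
        case RUN_OPAQUE:            /* fully opaque: plain copy */
            memcpy(dst, pixels, len * sizeof *dst);
            pixels += len;
            break;
        case RUN_BLEND:             /* intermediate alpha: blend each pixel */
            for (size_t i = 0; i < len; i++)
                dst[i] = blend_pixel(pixels[i], dst[i]);
            pixels += len;
            break;
        }
        dst += len;
    }
}
```

Only the RUN_BLEND runs pay for the per-pixel multiply, which is where the speedup over a plain alpha blit would come from on images that are mostly transparent or mostly opaque.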

About a paletted image, isn’t a palette lookup and a write faster
than a full 24/32 bit read and a 24/32 bit write? The 768 byte
palette should fit in the cache and using 8bpp values should mean
that image data is read 3x/4x faster.

Maybe on reasonably modern CPUs… (Fast cores + relatively slow
memory.)

The problem is that each palette entry will have to be 4 bytes for
24/32 bpp, which means a 256-entry palette takes 1024 bytes, about as
much as a 32x32 surface at 8 bpp (or a 16x16 surface at 32 bpp). So,
unless you do this only for large surfaces or have lots
of surfaces share the same physical palette, you’re going to lose
even if the code is memory bound.

I’m sorry if this is
completely wrong, I don’t have much experience with optimizing
cache usage. Please clarify.

Well, it’s pretty simple in this case actually; for the cache to do
any good, you have to use the cached palette data several times -
which means the total size of the palette and the image data must
be smaller than the corresponding 24/32 bpp image.
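
A quick way to check that condition is to count the bytes touched in each case. The sketch below is in C and assumes a 256-entry palette with 4 bytes per entry and a palette that is not shared between surfaces; the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes touched by an 8 bpp image plus its own palette.
   Assumes 256 palette entries of 4 bytes each. */
size_t paletted_bytes(size_t w, size_t h)
{
    return 256u * 4u + w * h;
}

/* Bytes touched by the same image stored at 3 or 4 bytes per pixel. */
size_t direct_bytes(size_t w, size_t h, size_t bytes_per_pixel)
{
    assert(bytes_per_pixel == 3 || bytes_per_pixel == 4);
    return w * h * bytes_per_pixel;
}
```

With these assumptions a 16x16 sprite touches 1280 bytes paletted versus 1024 bytes at 32 bpp, so the palette loses, while a 32x32 sprite touches 2048 versus 4096 bytes. The break-even point against 32 bpp is a little over 341 pixels.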

Further, unless you explicitly tell the cache controller to fetch the
entire palette before you use it, the palette access pattern (which
depends on the image data) may screw things up, making things slower
than a plain 24/32 bpp blit even if the latter touches more memory.

I know very few formats, if any (some TGA format?) support an 8bpp
indexed image and 8 bit alpha. AFAIK GIF and PNG support paletted
images only with 1 bit alpha.

GIF doesn’t support alpha at all AFAIK; only colorkey transparency.
Dunno about PNG… It seems like GIMP won’t even let you create a
palettized image with an alpha channel in its internal format, so I
can’t try it either.

However, one could take a 32 bit PNG,
split it into a 24 bit BMP (colormap) and an 8 bit BMP (alpha),
quantize the colormap BMP and store both.

Had to hack a tool to do that for some editing tool for The Sims,
BTW… That was just because the editing tool lacked this obvious
feature, but it suggests that The Sims uses some kind of 8 bit
palettized + 8 bit alpha format internally.

//David Olofson - Programmer, Composer, Open Source Advocate

.- Audiality -----------------------------------------------.
| Free/Open Source audio engine for games and multimedia. |
| MIDI, modular synthesis, real time effects, scripting,… |
`-----------------------------------> http://audiality.org -’
http://olofson.net http://www.reologica.se

(In reply to Gabriel Gambetta’s message of Wednesday 31 March 2004, 17.29.)

PNG supports alpha channel for RGB images (e.g. plain RGBA). I believe
it supports alpha channel for paletted images, but I’m not entirely sure
how that works right now (the documentation for the tRNS chunk is poor),
but I know that it and SDL_image will at least handle simple color keying
properly.

The last time I tried Gimp (a year or two ago?), loading then saving a PNG
with alpha information threw it away, which was horribly broken, so I went
back (having been rather disgusted with Gimp :) to Photoshop and haven’t tried
it since.

On Wed, Mar 31, 2004 at 08:00:04PM +0200, David Olofson wrote:

I know very few formats, if any (some TGA format?) support an 8bpp
indexed image and 8 bit alpha. AFAIK GIF and PNG support paletted
images only with 1 bit alpha.

GIF doesn’t support alpha at all AFAIK; only colorkey transparency.
Dunno about PNG… It seems like GIMP won’t even let you create a
palettized image with an alpha channel in its internal format, so I
can’t try it either.


Glenn Maynard

Hrm, I’ve never had trouble with PNGs in Gimp. Perhaps didn’t try PNGs
two years ago, though. ;^)

And yeah, PNG can do alpha either with RGBA or in indexed mode.
(The PNG website lists what different ways it can…)

-bill!
(chattering far too much today, sorry!)

On Thu, Apr 01, 2004 at 04:40:07PM -0500, Glenn Maynard wrote:

The last time I tried Gimp (a year or two ago?), loading then saving a PNG
with alpha information threw it away, which was horribly broken, so I went
back (having been rather disgusted with Gimp :) to Photoshop and haven’t tried
it since.

PNG supports alpha channel for RGB images (e.g. plain RGBA). I believe
it supports alpha channel for paletted images, but I’m not entirely sure
how that works right now (the documentation for the tRNS chunk is poor),
but I know that it and SDL_image will at least handle simple color keying
properly.

The last time I tried Gimp (a year or two ago?), loading then saving a PNG
with alpha information threw it away, which was horribly broken, so I went
back (having been rather disgusted with Gimp :) to Photoshop and haven’t tried
it since.

PNGs definitely save right in GIMP now. It’s what I use on a regular basis.

-TomT64

Hrm, I’ve never had trouble with PNGs in Gimp. Perhaps didn’t try
PNGs two years ago, though. ;^)
And yeah, PNG can do alpha either with RGBA or in indexed mode. (The
PNG website lists what different
ways it can…)

When I convert a PNG with alpha to indexed in The GIMP, it also converts
the alpha channel to 1 bpp :(

Ing. Gabriel Gambetta
ARTech - GeneXus Development Team
ggambett at artech.com.uy

Yes, that’s what I concluded the other day… (This was with GIMP
2.0-pre3 on Linux/x86.)

GIMP doesn’t seem to support palette + “real” alpha at all, unless
possibly if some of the file output plugins can convert from RGB to
indexed on the fly. The PNG plugin doesn’t, at least.

//David Olofson - Programmer, Composer, Open Source Advocate

http://olofson.net http://www.reologica.se

On Friday 02 April 2004 15.04, Gabriel Gambetta wrote:

Hrm, I’ve never had trouble with PNGs in Gimp. Perhaps didn’t
try PNGs two years ago, though. ;^)
And yeah, PNG can do alpha either with RGBA or in indexed mode.
(The PNG website lists what different ways it can…)

When I convert a PNG with alpha to indexed in The GIMP, it also
converts the alpha channel to 1 bpp :(

Photoshop doesn’t support alpha in the palette (color key only), either;
I don’t know of any tools that do. I don’t think even pngcrush will do
this for you, and SDL surfaces can’t represent them (unless you use the
"unused" palette entry to hold alpha, which won’t work with any blits).

On Fri, Apr 02, 2004 at 04:05:14PM +0200, David Olofson wrote:

GIMP doesn’t seem to support palette + “real” alpha at all, unless
possibly if some of the file output plugins can convert from RGB to
indexed on the fly. The PNG plugin doesn’t, at least.


Glenn Maynard