I’ve added runtime mmx cpu & prefetch detection to Rafal Bursig’s and
Rene Dudfield’s patches (those two patches add MMX to alpha blitting
functions).
The resulting patch is here : http://icps.u-strasbg.fr/~marchesin/sdl_mmxalphablit.patch
I’ve added runtime mmx cpu & prefetch detection to Rafal Bursig’s and
Rene Dudfield’s patches (those two patches add MMX to alpha blitting
functions).
The resulting patch is here : http://icps.u-strasbg.fr/~marchesin/sdl_mmxalphablit.patch
why don’t you put this in cvs?> ----- Original Message -----
From: illumen@yahoo.com (Rene Dudfield)
To:
Sent: Tuesday, August 12, 2003 8:59 AM
Subject: Re: [SDL] [patch] MMX alpha blit patches with MMX detection
Stephane Marchesin wrote:
Hi,
I’ve added runtime mmx cpu & prefetch detection to Rafal Bursig’s and
Rene Dudfield’s patches (those two patches add MMX to alpha blitting
functions).
The resulting patch is here : http://icps.u-strasbg.fr/~marchesin/sdl_mmxalphablit.patch
SDL_RLEaccel.c: In function SDL_RLEAlphaBlit': SDL_RLEaccel.c:1321: invalid lvalue in unary&’
…that’s why.
I also seem to remember that the patch changed the semantics of the destination
alpha channel.
See ya,
-Sam Lantinga, Software Engineer, Blizzard Entertainment
Yeah it did. I’m not sure why? Is there a good reason to change it?
Or is any change unacceptable?
My one function didn’t change the semantics. Which is the
BlitRGBtoRGBPixelAlphaMMXPrefetch one in the patch. But it doesn’t use
those nice macros only gcc inline asm.
I also seem to remember that the patch changed the semantics of the
destination
alpha channel.
See ya,
-Sam Lantinga, Software Engineer, Blizzard Entertainment
Yeah it did. I’m not sure why? Is there a good reason to change it?
Or is any change unacceptable?
My one function didn’t change the semantics. Which is the
BlitRGBtoRGBPixelAlphaMMXPrefetch one in the patch. But it doesn’t
use those nice macros only gcc inline asm.
Well, SDL behaviour shouldn’t change because programs rely on it.
I also noticed some mistakes in the code. I’ll fix those and check that
every asm function behaviour matches its C equivalent.
If you do byte-by-byte comparison of the output between C and MMX
functions, you’ll notice that the results for 555 and 565 RGB alpha
blits aren’t exactly the same. This is because MMX functions for 555 and
565 RGB have an higher accuracy. If you want the exact same behaviour
that’s possible by masking the three lower alpha bits in the MMX
functions. Just ask !
I removed one MMX function because after I fixed it to match its C
equivalent, it revealed to be slower than the C version on a PIII
(although a bit faster on an Athlon XP).
I’ve also added MMX and PIII replacements for SDL_memcpy. Those provide
some speed up in testvidinfo -benchmark (at least for me, under linux &
X11).
Of course, I had to cut & paste the functions on the source code of the
benchmark (yes, dirty).
I didn’t find any benchmark testing the alpha blitting functions (and
testing them all would mean having to switch video depth anyway) so I
used little hacks like this. Maybe an extension to testvidinfo could
test alpha blits…
Of course, I had to cut & paste the functions on the source code of
the benchmark (yes, dirty).
I didn’t find any benchmark testing the alpha blitting functions (and
testing them all would mean having to switch video depth anyway) so I
used little hacks like this. Maybe an extension to testvidinfo could
test alpha blits…
Stephane
Perhaps everyone on the list could test their projects(or others games)
with the latest cvs and compare the fps to the last release? Report
your results here
These are my results from before:
50.6342458682 / 37.2473964489 * 100 - 100 = 35.94% increase
34.4127871498 / 21.7745967561 * 100 - 100 = 58.04% increase
That was on a duron 850, running linux with 32 bit, per pixel alpha images.
But I have a question regarding the alpha blending code… If your
source image pixel has a color=255(bright) and alpha=255(opaque), you
would expect the output color to be 255, right?
After briefly looking at the MMX math, it appears to multiply color by
alpha, then divide by 256. This would yield an output color of 254 in
the above case:
truncate((255 * 255) / 256) = 254
Am I correct?
-SeanOn Thursday, August 21, 2003, at 11:52 PM, Sam Lantinga wrote:
Probably doesnt matter if you are right. The new code is faster enough that it
isnt worth it to care. SDL isnt known for its blitting accuracy.On 22-Aug-2003, Sean Gies wrote:
(I’m new here)
But I have a question regarding the alpha blending code… If your
source image pixel has a color=255(bright) and alpha=255(opaque), you
would expect the output color to be 255, right?
After briefly looking at the MMX math, it appears to multiply color by
alpha, then divide by 256. This would yield an output color of 254 in
the above case:
truncate((255 * 255) / 256) = 254
–
Patrick “Diablo-D3” McFarland || unknown at panax.com
"Computer games don’t affect kids; I mean if Pac-Man affected us as kids, we’d
all be running around in darkened rooms, munching magic pills and listening to
repetitive electronic music." – Kristian Wilson, Nintendo, Inc, 1989
But I have a question regarding the alpha blending code… If your
source image pixel has a color=255(bright) and alpha=255(opaque), you
would expect the output color to be 255, right?
After briefly looking at the MMX math, it appears to multiply color by
alpha, then divide by 256. This would yield an output color of 254 in
the above case:
truncate((255 * 255) / 256) = 254
Am I correct?
Yes.
Now look at the C code surrouding it, not only at the patch. It has the
same behaviour (divides by 256, notice the ">> 8"s here and there). This
issue has always been there, you just didn’t notice it before.
In most of the functions, for example in BlitRGBtoRGBPixelAlpha(), there
is a special case for (alpha == SDL_ALPHA_OPAQUE) that the MMX functions
have too. So the MMX functions reproduct the same behaviour as the C
ones and the result of blitting is 255 in both cases. The only place
where there is no such special case is BlitRGBtoRGBSurfaceAlpha() (don’t
ask me why, because I don’t know
Anyway, we aren’t able to visually tell the difference between
(255,255,255) and (254,254,254). That means the only case I can think of
where this would be a problem is when you blit a surface with surface
alpha to another surface and then add colorkeying to the resulting surface.