Alpha blending bug - fixed?

It turns out mm5 never got the correct value. This failed :

[...]
"movd %1, %%mm5\n\t"
: : "m" (amask), "m" (sf->Ashift) );

mm5 got 0xFF000018 instead of 0x00000018. However I did this :

Uint32 ashift = sf->Ashift;
[...]
"movd %1, %%mm5\n\t"
: : "m" (amask), "m" (ashift) );

and everything worked fine.

The only thing I could find out was that ashift is 32-bit aligned but
sf->Ashift isn’t:

printf(“ashift %8X [%d]\n”, &ashift, (int)(&ashift) % 4);
printf(“sf->Ashift %8X [%d]\n”, &sf->Ashift, (int)(&sf->Ashift) % 4);

ashift at BFF0E0C4 [0]
sf->Ashift at 08AFB9CD [1]

Could this be the problem? If it is, the fix is extremely trivial.
Should I submit a patch?

--Gabriel________________________________________________________________________

Gabriel Gambetta
Mystery Studio - http://www.mysterystudio.com
Gabriel on Graphics - http://gabrielongraphics.blogspot.com

The only thing I could find out was that ashift is 32-bit aligned but
sf->Ashift isn’t:

I don’t think it’s an alignment issue, since GCC should be smart enough
to get it into a register even if unaligned…however, sf->AShift is 8
bits, and ashift is 32, which is probably why you get different
results…it probably landed in (say) %al instead of %eax, leaving junk
in the rest of the register’s bits.

Try the attached patch and see if it works, too.

–ryan.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed…
Name: blitbug2.diff
URL: http://lists.libsdl.org/pipermail/sdl-libsdl.org/attachments/20061121/1ba1cefe/attachment.txt

Thanks for listening :slight_smile:

The patch you sent doesn’t even compile -

./src/video/SDL_blit_A.c: In function ‘BlitRGBtoRGBPixelAlphaMMX3DNOW’:
./src/video/SDL_blit_A.c:1571: error: memory input 1 is not directly
addressable

In any case I guess GCC should be smart enough to make this work as
expected, whether it’s an alignment issue or a data size issue. GCC bug
maybe?

--Gabriel

El mar, 21-11-2006 a las 16:30 -0500, Ryan C. Gordon escribi?:> > The only thing I could find out was that ashift is 32-bit aligned but

sf->Ashift isn’t:

I don’t think it’s an alignment issue, since GCC should be smart enough
to get it into a register even if unaligned…however, sf->AShift is 8
bits, and ashift is 32, which is probably why you get different
results…it probably landed in (say) %al instead of %eax, leaving junk
in the rest of the register’s bits.

Try the attached patch and see if it works, too.

Hi,

the original code takes the address of the sf->Ashift variable and passes
this address to the assembler, because the ‘m’ constraint is used! Therefore
the movd instruction will not only load the content of sf->Ashift into
the mm5 register but also the following bits of the SDL_Pixelformat
structure (the lower three bytes of Rmask). The original propose of using the
local ashift variable (which is 32 bits wide) as input for the assembler
statement was correct.

To force gcc to load sf->Amask into a register you could also use

  •     : /* nothing */ : "m" (sf->Amask), "m" (sf->Ashift) );
    
  •     : /* nothing */ : "r" (sf->Amask), "r" ((Uint32) sf->Ashift) );
    

Note the “r” constraint.

Kind regards,

FrankOn Tuesday 21 November 2006 22:51, Gabriel Gambetta wrote:

Thanks for listening :slight_smile:

The patch you sent doesn’t even compile -

./src/video/SDL_blit_A.c: In function ‘BlitRGBtoRGBPixelAlphaMMX3DNOW’:
./src/video/SDL_blit_A.c:1571: error: memory input 1 is not directly
addressable

In any case I guess GCC should be smart enough to make this work as
expected, whether it’s an alignment issue or a data size issue. GCC bug
maybe?

–Gabriel

El mar, 21-11-2006 a las 16:30 -0500, Ryan C. Gordon escribi?:

The only thing I could find out was that ashift is 32-bit aligned but
sf->Ashift isn’t:

I don’t think it’s an alignment issue, since GCC should be smart enough
to get it into a register even if unaligned…however, sf->AShift is 8
bits, and ashift is 32, which is probably why you get different
results…it probably landed in (say) %al instead of %eax, leaving junk
in the rest of the register’s bits.

Try the attached patch and see if it works, too.


SDL mailing list
SDL at libsdl.org
http://www.libsdl.org/mailman/listinfo/sdl

Dept. of Computer Science, Dresden University of Technology, Germany

http://os.inf.tu-dresden.de/~fm3

-------------- next part --------------
A non-text attachment was scrubbed…
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: http://lists.libsdl.org/pipermail/sdl-libsdl.org/attachments/20061121/7457d71b/attachment.pgp

  •     : /* nothing */ : "m" (sf->Amask), "m" (sf->Ashift) );
    
  •     : /* nothing */ : "r" (sf->Amask), "r" ((Uint32) sf->Ashift) );
    

Note the “r” constraint.

Ah, okay, I’m an idiot…I saw “m” and thought “r” (and on an iBook
where I couldn’t build x86 code…).

We’ll go with Gabriel’s patch.

–ryan.

We’ll go with Gabriel’s patch.

…which is now Subversion revision #2914 for the 1.2 branch, and #2915
for the 1.3 branch.

Thanks!

–ryan.

This bug is actually my fault, and I just stumbled upon it myself a couple
days ago. I apologize for that, as I should have known better to
double-check the data type of Ashift.
IMHO, Frank’s and Ryan’s version of the fix is better with some minor
modifications to allows the compiler a bit more freedom. Also another
function has to be patched in a similar way – the pure MMX (non-3DNow!)
one, gcc version of BlitRGBtoRGBPixelAlphaMMX.
In light of all this I propose the following patch (also attached):

Index: SDL_blit_A.c===================================================================
— SDL_blit_A.c (revision 2923)
+++ SDL_blit_A.c (working copy)
@@ -369,7 +369,9 @@
packsswb_r2r(mm6, mm3); /* 0000FFFF -> mm3 /
pxor_r2r(mm0, mm3); /
0000F000 -> mm3 (~channel mask) /
/
get alpha channel shift */

  • movd_m2r(sf->Ashift, mm5); /* Ashift -> mm5 */
  • asm volatile (

  •   "movd %0, %%mm5"
    
  •   : : "rm" ((Uint32) sf->Ashift) ); /* Ashift -> mm5 */
    

    while(height–) {
    DUFFS_LOOP4({
    @@ -1566,7 +1568,6 @@
    int dstskip = info->d_skip >> 2;
    SDL_PixelFormat* sf = info->src;
    Uint32 amask = sf->Amask;

  • Uint32 ashift = sf->Ashift;

    asm (
    /* make mm6 all zeros. /
    @@ -1588,7 +1589,7 @@
    /
    get alpha channel shift /
    “movd %1, %%mm5\n\t” /
    Ashift -> mm5 */

  • : /* nothing */ : "m" (amask), "m" (ashift) );
    
  • : /* nothing */ : "m" (amask), "rm" ((Uint32) sf->Ashift) );
    

    while(height–) {

–Alex.

----- Original Message -----
From: sdl-bounces+avcp-sdlmail=usa.net@libsdl.org
[mailto:sdl-bounces+avcp-sdlmail=usa.net at libsdl.org] On Behalf Of Ryan C.
Gordon
Sent: Tuesday, November 21, 2006 6:35 PM
To: A list for developers using the SDL library. (includes SDL-announce)
Subject: Re: [SDL] Alpha blending bug - fixed?

We’ll go with Gabriel’s patch.

…which is now Subversion revision #2914 for the 1.2 branch, and #2915
for the 1.3 branch.

Thanks!

–ryan.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed…
Name: SDL_blit_A.patch.txt
URL: http://lists.libsdl.org/pipermail/sdl-libsdl.org/attachments/20061201/43e1a0dd/attachment.txt

Hi Alex,On Saturday 02 December 2006 00:32, Alex Volkov wrote:

This bug is actually my fault, and I just stumbled upon it myself a couple
days ago. I apologize for that, as I should have known better to
double-check the data type of Ashift.
IMHO, Frank’s and Ryan’s version of the fix is better with some minor
modifications to allows the compiler a bit more freedom. Also another
function has to be patched in a similar way – the pure MMX (non-3DNow!)
one, gcc version of BlitRGBtoRGBPixelAlphaMMX.
In light of all this I propose the following patch (also attached):

Index: SDL_blit_A.c

— SDL_blit_A.c (revision 2923)
+++ SDL_blit_A.c (working copy)
@@ -369,7 +369,9 @@
packsswb_r2r(mm6, mm3); /* 0000FFFF -> mm3 /
pxor_r2r(mm0, mm3); /
0000F000 -> mm3 (~channel mask) /
/
get alpha channel shift */

  • movd_m2r(sf->Ashift, mm5); /* Ashift -> mm5 */
  • asm volatile (

  • "movd %0, %%mm5"
    
  • : : "rm" ((Uint32) sf->Ashift) ); /* Ashift -> mm5 */
    

    while(height–) {
    DUFFS_LOOP4({
    @@ -1566,7 +1568,6 @@
    int dstskip = info->d_skip >> 2;
    SDL_PixelFormat* sf = info->src;
    Uint32 amask = sf->Amask;

  • Uint32 ashift = sf->Ashift;

    asm (
    /* make mm6 all zeros. /
    @@ -1588,7 +1589,7 @@
    /
    get alpha channel shift /
    “movd %1, %%mm5\n\t” /
    Ashift -> mm5 */

  • : /* nothing */ : “m” (amask), “m” (ashift) );

  • : /* nothing */ : “m” (amask), “rm” ((Uint32) sf->Ashift) );

while(height–) {

“rm” is senseless as the compiler can never choose “m”. It must choose
"r" since using “m” he would complain

memory input 1 is not directly addressable

Kind regards,

Frank

Dept. of Computer Science, Dresden University of Technology, Germany

http://os.inf.tu-dresden.de/~fm3

-------------- next part --------------
A non-text attachment was scrubbed…
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: http://lists.libsdl.org/pipermail/sdl-libsdl.org/attachments/20061203/a46f1b99/attachment.pgp

Frank Mehnert wrote:

  • : /* nothing */ : “m” (amask), “rm” ((Uint32) sf->Ashift) );

“rm” is senseless as the compiler can never choose “m”.
It must choose “r” since using “m” he would complain
memory input 1 is not directly addressable

That is not entirely correct, Frank. The compiler may elect to pre-calculate
((Uint32) sf->Ashift) and put it in a temp stack var to be used later in the
MOVD. That is if the compiler decides it would make code faster, for
example, not that it would make any difference in this case.

–Alex.

Yes, you are right, it seems that compiler is able to do that. Just tested
with an appropriate test case. However, it is interesting that gcc > 3.4
refuses to compile the same statement without ‘r’ . gcc-3.3 and 3.4 would
warn in this case

warning: use of memory input without lvalue in asm operand 0 is deprecated

while gcc-4.0 and gcc-4.1 would refuse to compile.

Kind regards,

FrankOn Monday 04 December 2006 18:51, Alex Volkov wrote:

Frank Mehnert wrote:

  • : /* nothing */ : “m” (amask), “rm” ((Uint32) sf->Ashift) );

“rm” is senseless as the compiler can never choose “m”.
It must choose “r” since using “m” he would complain
memory input 1 is not directly addressable

That is not entirely correct, Frank. The compiler may elect to
pre-calculate ((Uint32) sf->Ashift) and put it in a temp stack var to be
used later in the MOVD. That is if the compiler decides it would make code
faster, for example, not that it would make any difference in this case.

Dept. of Computer Science, Dresden University of Technology, Germany

http://os.inf.tu-dresden.de/~fm3

-------------- next part --------------
A non-text attachment was scrubbed…
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: http://lists.libsdl.org/pipermail/sdl-libsdl.org/attachments/20061205/ace05026/attachment.pgp