VC5 and SDL_RLEACCEL

First, a little history. It appears that the SDL-1.1.3 Win32 binaries
compiled with VC5 crash when you try to use RLE accelerated blits. If the
library is compiled with VC6, then RLE blitting works fine.

Well, I dusted off the old VC5 and reinstalled it. First I compiled a
release version of SDL-1.1.3 and the RLE blits failed just like the binaries
on the web site did. (consistency is a good thing :slight_smile: Then I compiled a
debug version and the RLE blits started working again. Finally, I went back
and generated a release version with the VC5 optimizer turned off and,
voila, everything worked fine.

So after all of this, it looks like the VC5 optimizer is messing up the RLE
blitting code. Any ideas on where to proceed from here? Is anyone familiar
with known “quirks” of the VC5 optimizer? What kind of priority do we place
on this issue? It’s always possible that my logic is flawed here, but this
really looks suspicious.

  • Randi

Regimental Command
Generic Armored Combat System
http://regcom.sourceforge.net

Hi,

I also had problems with the optimizer of VC5, but the most failures
came from wrong or missing typecasts. You should typecast correctly
everywhere, don’t assume anything.

Proff–

Florian ‘Proff’ Schulze - @Florian_Schulze
Member: TeamTNT - http://www.teamtnt.com
Homepage: - http://proff.fly.to
ICQ#: - 40510245

So after all of this, it looks like the VC5 optimizer is messing up the RLE
blitting code. Any ideas on where to proceed from here? Is anyone familiar
with known “quirks” of the VC5 optimizer? What kind of priority do we place
on this issue? It’s always possible that my logic is flawed here, but this
really looks suspicious.

I’m willing to help fixing it), if it is a bug in my code that only
shows up with the VC5 optimizer. I’m less inclined (= not at all) to
include more or less dirty workarounds just to compensate for bugs in
VC5, but that might not be the case.

Fire up your debugger and check where it crashes, analyse core dumps (or
what win32 call them) and so on. There is a little macrology that might
have to be unfolded by hand to find the exact location.

I also had problems with the optimizer of VC5, but the most failures
came from wrong or missing typecasts. You should typecast correctly
everywhere, don’t assume anything.

If you get warnings, please post all of them. The C Standard allows
spurious diagnostics, so a standard-conforming compiler is in its full
rights to say

frotz.c:75: Warning: ‘foo’ is not a very good variable name. Please be
more imaginative next time.
frotz.c:85: Warning: It’s almost midnight. If you go home and get some
sleep I promise you will get less warnings tomorrow.

but they might also indicate potential problems.

Mattias Engdeg?rd wrote:

Randi J. Relander wrote:

So after all of this, it looks like the VC5 optimizer
is messing up the RLE blitting code. Any ideas on where
to proceed from here?

I’m willing to help fixing it), if it is a bug in my code
that only shows up with the VC5 optimizer. I’m less inclined
(= not at all) to include more or less dirty workarounds just
to compensate for bugs in VC5, but that might not be the case.

I have the RLE assembly/source output from both compilers and will be going
through the code line-by-line. It crashes in the decompresser but the
problem could just as easily be in the compressor. The quick solution is to
either turn off optimization for the RLE module or just use VC6 to compile
the binaries. I have worked with compilers for embedded systems that you did
not dare to turn on optimization because you were practically guaranteed to
have something fail :slight_smile:

  • Randi

Regimental Command
Generic Armored Combat System
http://regcom.sourceforge.net

Randi J. Relander wrote

Mattias Engdeg?rd wrote:

I’m willing to help fixing it), if it is a bug in my
code that only shows up with the VC5 optimizer. I’m
less inclined (= not at all) to include more or less
dirty workarounds just to compensate for bugs in VC5,
but that might not be the case.

I have the RLE assembly/source output from both compilers
and will be going through the code line-by-line. It crashes
in the decompresser but the problem could just as easily
be in the compressor.

Well, I found the problem. It is in the RLE compression “ADD_SEGMENT” macro
in the “SDL_RLESurface” function. The VC5 optimizer sees a “skip = 0"
statement getting executed over and over again inside a loop and happily
moves it outside the loop. Unfortunately, it also ignores the fact that
"skip” is being used the first time through the loop before it is assigned
:slight_smile:

The following loop at the end “ADD_SEGMENT” …

do {
    int len = MIN(run, maxn);
    ADD_COUNT(skip);
    ADD_COUNT(len);
    skip = 0;
    memcpy(dst, srcbuf, len * bpp);
    srcbuf += len * bpp;
    dst += len * bpp;
    run -= len;
} while(run);

gets “optimized” into this …

skip = 0;
do {
    int len = MIN(run, maxn);
    ADD_COUNT(0);
    ADD_COUNT(len);
    memcpy(dst, srcbuf, len * bpp);
    srcbuf += len * bpp;
    dst += len * bpp;
    run -= len;
} while(run);

The loop is the one responsible for breaking up long opaque runs. The result
of the failed “optimization” is that all opaque skip values in the RLE
buffer are set to zero. Normally, only the ones after the first one in a
long run would be. The code crashes when the “SDL_RLEBlit” is executed on
the corrupted buffer.

While it is obviously the fault of the VC5 optimizer, it conceptually boils
down to a loop doing “one thing” the first time through and "another thing"
the rest of the times through. Definitely not a bug in the RLE code, but
something that could be modified slightly without being considered a “dirty
workaround” :slight_smile:

My bigger worry is that there might be other VC5 optimizer issues elsewhere
in the code, possibly leading to the requirement of using either VC6 or a
cross-compiler for the binary releases. These problems are a bear to track
down and tend to leave you a little gun shy. You start questioning every
line of code that you write.

  • Randi

Regimental Command
Generic Armored Combat System
http://regcom.sourceforge.net

Well, I found the problem. It is in the RLE compression “ADD_SEGMENT” macro
in the “SDL_RLESurface” function. The VC5 optimizer sees a “skip = 0"
statement getting executed over and over again inside a loop and happily
moves it outside the loop. Unfortunately, it also ignores the fact that
"skip” is being used the first time through the loop before it is assigned
:slight_smile:

Good work!

Now this gives me shivers. If you can’t trust your compiler to get the most
elementary stuff right (the code did nothing fancy at all), how can you
use it at all? I must have written thousands of loops with a similar logic,
naively thinking that the rules of the language should apply to them as well.

While it is obviously the fault of the VC5 optimizer, it conceptually boils
down to a loop doing “one thing” the first time through and "another thing"
the rest of the times through. Definitely not a bug in the RLE code, but
something that could be modified slightly without being considered a “dirty
workaround” :slight_smile:

I’ve been thinking about a way to simplify the RLE encoder in other
ways, mainly in order to reduce the amount of code by getting rid of
the ADD_SEGMENT macro.

(I would also like to call the Transparent() function through a function
pointer, so the transparency criteria can be changed, but I’m worried about
pipeline stalls. Does anyone know if modern processors can predict indirect
function calls? Otherwise a branch may be cheaper.)

My bigger worry is that there might be other VC5 optimizer issues elsewhere
in the code, possibly leading to the requirement of using either VC6 or a
cross-compiler for the binary releases. These problems are a bear to track
down and tend to leave you a little gun shy. You start questioning every
line of code that you write.

Exactly. Is this yet another Microsoft trick to make people keep upgrading
each time? It’s very clever. If the alternative would be to live in fear,
uncertainty and doubt over the fate of my own code after its passage
through Visual C’s digestive system(*), then I wouldn’t hesitate to upgrade
no matter the cost.

(*) A rather appropriate metaphor given the final product, don’t you think?

Mattias Engdeg?rd wrote:

I’ve been thinking about a way to simplify the
RLE encoder in other ways, mainly in order to
reduce the amount of code by getting rid of the
ADD_SEGMENT macro.

(I would also like to call the Transparent()
function through a function pointer, so the
transparency criteria can be changed, but I’m
worried about pipeline stalls.)

Might be worth writing four seperate encoders and call them based on the
pixel depth. Harder to maintain but easier to streamline. I think that the
core of the encoder is complicated enough that there may not be a good macro
solution.

My bigger worry is that there might be other
VC5 optimizer issues elsewhere in the code,
possibly leading to the requirement of using
either VC6 or a cross-compiler for the binary
releases.

Exactly. Is this yet another Microsoft trick to
make people keep upgrading each time?

Don’t look now, but I think that I just saw a black helicopter hovering
behind that ridge over there :slight_smile:

It would be nice if we could just make changes to the optimizer in visualc-5.0.0.tar.gz instead of shelling out more money for yet another development tool.

In any event, what do we do in the short/long term? Sam?

  • Randi

Regimental Command
Generic Armored Combat System
http://regcom.sourceforge.net

Well, I found the problem. It is in the RLE compression “ADD_SEGMENT” macro
in the “SDL_RLESurface” function. The VC5 optimizer sees a “skip = 0"
statement getting executed over and over again inside a loop and happily
moves it outside the loop. Unfortunately, it also ignores the fact that
"skip” is being used the first time through the loop before it is assigned
:slight_smile:

Wow. Okay Randi, I just fixed this, and I’ll have it in CVS by the end of
today when I commit other Win32 fixes I’m working on.

Great detective work! :slight_smile:

See ya,
-Sam Lantinga, Lead Programmer, Loki Entertainment Software