Problem with SDL_DisplayFormatAlpha

Hi,

I’m the lead developer of a game - Battle for Wesnoth (http://www.wesnoth.org) that uses SDL. While we’ve mostly found SDL easy to use and reliable, we’ve recently come across a problem, and we’re trying to work out if it’s due to our own misunderstanding of the SDL API, or a bug in SDL.

It seems that calling the function SDL_DisplayFormatAlpha() on a surface will sometimes cause the source surface to be corrupted, particularly after the surface returned from SDL_DisplayFormatAlpha() is freed.

We have managed to boil this down to a minimal case. There is a source file at http://www.whitevine.net/sdlproblem/main.c and a data file at http://www.whitevine.net/sdlproblem/img.bmp – to reproduce the problem, compile main.c, linking to SDL, and then run the program with img.bmp in the directory you’re running the program from.

The expected result of the program is to display img.bmp twice on the screen. The result we’ve been getting is the first time img.bmp displays correctly, and the second time it is ‘corrupted’ (it resembles yellow static).

Why the image is being corrupted, I’m not sure. However, removing the call to SDL_DisplayFormatAlpha that assigns the result to ‘dummy_tmp’ causes the program to run as expected. Even removing the call SDL_FreeSurface(dummy_tmp) causes the program to run as expected.

I am working under the assumption that SDL_DisplayFormatAlpha() can be called on any valid surface, and that it returns a completely new surface, and doesn’t mutate its argument. Is this assumption correct? It seems in the given test program, it sometimes does mutate its argument.

I, and users of my program, have tested this under Windows XP using SDL 1.2.6, and under SDL 1.2.5, 1.2.6, and SDL CVS using GNU/Linux. All these platforms had the problem. We’ve also tested under SDL 1.2.5 using FreeBSD, and it didn’t have the problem.

Any help would be greatly appreciated.

David White
Lead Developer, Battle for Wesnoth (http://www.wesnoth.org)

I’m the lead developer of a game - Battle for Wesnoth (http://www.wesnoth.org) that uses SDL. While we’ve mostly found SDL easy to use and reliable, we’ve recently come across a problem, and we’re trying to work out if it’s due to our own misunderstanding of the SDL API, or a bug in SDL.

Wow, this is a really good bug. What’s happening is that SDL keeps cached
information about the blit in the surface. If the surface is being blitted
to a different destination, then it invalidates the cache and recalculates
the fastest blit method to the new surface. The way it can tell whether or
not it is blitting to a new surface is whether the target surface pointer
has changed. In this case, the memory manager reuses the pointer that was
just freed for the new surface, and SDL thinks the blit target hasn’t changed.

So… I can’t think of a way to fix this without changing the format of the
SDL surface structure, which I’d hate to do in a stable release. This will
definitely be fixed in SDL 1.3, but in the meantime see if you can allocate
your target surfaces up front so you aren’t reusing the same memory pointers
over and over again?

See ya,
-Sam Lantinga, Software Engineer, Blizzard Entertainment

This will
definitely be fixed in SDL 1.3, but in the meantime see if you can
allocate
your target surfaces up front so you aren’t reusing the same memory
pointers
over and over again?

I guess I can try this, but as you might imagine, it is likely to be a
difficult and error-prone approach to implement in a non-trivial program.

An alternative approach I could try would be to invalidate the cache myself
after blitting, effectively disabling the cache, but making things work
correctly.

i.e. in the sample program I gave, adding

struct SDL_BlitMap { SDL_Surface* dst; }

at the top of the file, and,

img->map->dst = NULL;

after the second call to SDL_DisplayFormatAlpha().

Obviously this approach will have a cost in terms of speed. Any idea how
much slower this is likely to make things? Is SDL_MapSurface() a costly
operation?

The reason I like this approach more, is that I can implement it simply by
wrapping some SDL functions, and then removing the cache invalidation once
an SDL version that doesn’t have this bug becomes widespread.

Thanks for your help!

David White
Lead Developer, Battle for Wesnoth (http://www.wesnoth.org)

Sam Lantinga wrote:

I’m the lead developer of a game - Battle for Wesnoth (http://www.wesnoth.org) that uses SDL. While we’ve mostly found SDL easy to use and reliable, we’ve recently come across a problem, and we’re trying to work out if it’s due to our own misunderstanding of the SDL API, or a bug in SDL.

Wow, this is a really good bug. What’s happening is that SDL keeps cached
information about the blit in the surface. If the surface is being blitted
to a different destination, then it invalidates the cache and recalculates
the fastest blit method to the new surface. The way it can tell whether or
not it is blitting to a new surface is whether the target surface pointer
has changed. In this case, the memory manager reuses the pointer that was
just freed for the new surface, and SDL thinks the blit target hasn’t changed.

So… I can’t think of a way to fix this without changing the format of the
SDL surface structure, which I’d hate to do in a stable release.

Isn’t it possible to create a flag (SDL_NEWSURFACE for ex) that means
"this surface is newly created", and set it during surface creation.
Then when a blit happens, if that flag is set, invalidate the map,
recalculate the map and remove the flag ?

Sure, that’s not a clean fix.

Stephane

We tracked this down on IRC. The problem is that the logic for testing
the validity of the blit mappings is flawed.

If you:
Create Surface A
Create Surface B
Copy A to B
Delete Surface B
Create Surface C
Copy A to C

If surface B and C are allocated at the same addresses then the
SDL_BlitSurface logic will assume that they are the same surface with
the same format and won’t generate a new blit mapping.

Possible solution would be rather than incrementing format_version when
the format of a surface changes SDL should set format_version to the
value of a global format_version variable that is incremented each
time.On Feb 19, 2004, at 10:47 AM, David White wrote:

Hi,
?
I’m the lead developer of a game - Battle for Wesnoth
(http://www.wesnoth.org) that uses SDL. While we’ve mostly found SDL
easy to use and reliable, we’ve recently come across a problem, and
we’re trying to work out if it’s due to our own misunderstanding of
the SDL API, or a bug in SDL.
?
It seems that calling the function SDL_DisplayFormatAlpha() on a
surface will sometimes cause the source surface to be corrupted,
particularly after the surface returned from SDL_DisplayFormatAlpha()
is freed.
?
We have managed to boil this down to a minimal case. There is a source
file at http://www.whitevine.net/sdlproblem/main.c and a data file at
http://www.whitevine.net/sdlproblem/img.bmp – to reproduce the
problem, compile main.c, linking to SDL, and then run the program with
img.bmp in the directory you’re running the program from.
?
The expected result of the program is to display img.bmp twice on the
screen. The result we’ve been getting is the first time img.bmp
displays correctly, and the second time it is ‘corrupted’ (it
resembles yellow static).
?
Why the image is being corrupted, I’m not sure. However, removing the
call to SDL_DisplayFormatAlpha that assigns the result to 'dummy_tmp’
causes the program to run as expected. Even removing the call
SDL_FreeSurface(dummy_tmp) causes the program to run as expected.
?
I am working under the assumption that SDL_DisplayFormatAlpha() can be
called on any valid surface, and that it returns a completely new
surface, and doesn’t mutate its argument. Is this assumption correct?
It seems in the given test program, it sometimes does mutate its
argument.
?
I, and users of my program, have tested this under Windows XP using
SDL 1.2.6, and under SDL 1.2.5, 1.2.6, and SDL CVS using GNU/Linux.
All these platforms had the problem. We’ve also tested under SDL 1.2.5
using FreeBSD, and it didn’t have the problem.
?
Any help would be greatly appreciated.
?
David White
Lead Developer, Battle for Wesnoth (http://www.wesnoth.org)


Martin
http://akawaka.csn.ul.ie/blog.php
http://www.treyarch.com

I’m the lead developer of a game - Battle for Wesnoth (http://www.wesnoth.org) that uses SDL. While we’ve mostly found SDL easy to use and reliable, we’ve recently come across a problem, and we’re trying to work out if it’s due to our own misunderstanding of the SDL API, or a bug in SDL.

Martin suggested a great fix which I’ve implemented. It’s binary compatible
and fixes the problem. Please update from CVS and see if that works for you.

Thanks!
-Sam Lantinga, Software Engineer, Blizzard Entertainment

So… I can’t think of a way to fix this without changing the format of
the

SDL surface structure, which I’d hate to do in a stable release.

Isn’t it possible to create a flag (SDL_NEWSURFACE for ex) that means
"this surface is newly created", and set it during surface creation.
Then when a blit happens, if that flag is set, invalidate the map,
recalculate the map and remove the flag ?

Consider the following sequence:

  • surface A is blitted to surface B, has its map set to surface B’s memory
    address
  • surface B is freed
  • surface C is created, and allocated at the same memory address as surface
    B was. Has SDL_NEWSURFACE flag set.
  • surface X is blitted to surface C, causing C’s SDL_NEWSURFACE flag to be
    removed
  • surface A is blitted to surface C, and since surface C is in the memory
    address of surface B, and surface C does not have its SDL_NEWSURFACE flag
    set any longer, the invalid blit occurs.

David White
Lead Developer, Battle for Wesnoth (http://www.wesnoth.org)

I’m the lead developer of a game - Battle for Wesnoth
(http://www.wesnoth.org) that uses SDL. While we’ve mostly found SDL easy to
use and reliable, we’ve recently come across a problem, and we’re trying to
work out if it’s due to our own misunderstanding of the SDL API, or a bug in
SDL.

Martin suggested a great fix which I’ve implemented. It’s binary
compatible
and fixes the problem. Please update from CVS and see if that works for
you.

Preliminary testing shows that it seems to solve the problem. Thank you to
you and Martin for fixing this issue so quickly!

When is it likely that a release of SDL will be made that incorporates these
changes?

David White
Lead Developer, Battle for Wesnoth (http://www.wesnoth.org)

Preliminary testing shows that it seems to solve the problem. Thank you to
you and Martin for fixing this issue so quickly!

It’s always helpful to have a simple test case to figure things out. :slight_smile:

When is it likely that a release of SDL will be made that incorporates these
changes?

This fix will be in the SDL 1.2.7 release which will be any day now.
I’m just resolving some build issues with some Linux distributions,
and waiting on some Win32 resolution switching testing and it’ll be
all set.

See ya!
-Sam Lantinga, Software Engineer, Blizzard Entertainment