john skaller <@john_skaller>:
Hi, I seem to have a problem: setting a clipping rectangle
on the software renderer is corrupting something and causing a crash.
[…]
Regarding that and your follow-ups on that one, can you please provide
a stripped down example that could cause the issue?
I don’t know if I can or not. My code is written in Felix, which generates
C++ corresponding to ordinary C use of SDL. However the program
consists of a standard mainline, four shared libraries loaded under program
control, and about 10 other shared libraries autoloaded by the linker,
(including SDL and freetype) and includes a garbage collector,
asynchronous I/O support and other stuff.
The crash invariably occurs during garbage collection, because that’s
a time when a lot of memory is scanned and unused memory freed.
The SDL data structures involved are NOT garbage collected,
but if an SDL function corrupted some Felix GC memory the
GC’s operation (including free()ing garbage) could trigger
an access violation.
So if I just write a simple cut-down C program that does a single
blit to a surface and sets a clipping rectangle, it probably won’t
crash. If it did, the problem would have been found
by others ages ago.
I experienced
some memory corruptions in the past with the software clipping, but was
never able to reproduce them.
It’d help to create (or let the developers create) a fix, though.
I have looked at the code, and I have to repeat that in my actual
use case there is NO issue of non-intersection, even though I am
suspicious of that code. My clipping region is strictly interior
to the surface.
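A claim like "the clipping region is interior to the surface" can be asserted just before the clip rectangle is set. The following is only a sketch in plain C: `Rect` is a hypothetical stand-in with the same x, y, w, h fields as SDL_Rect, and `rect_inside_surface` is an illustrative helper, not an SDL function. It accepts a rectangle that touches the surface edges, so it checks containment rather than strict interiority.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for SDL_Rect (fields x, y, w, h). */
typedef struct { int x, y, w, h; } Rect;

/* Returns true when clip is non-empty and lies entirely inside a
   surface of surf_w x surf_h pixels whose origin is (0, 0).
   Illustrative helper only; not part of SDL. */
static bool rect_inside_surface(const Rect *clip, int surf_w, int surf_h)
{
    assert(clip != NULL);
    if (surf_w <= 0 || surf_h <= 0) return false;   /* degenerate surface */
    if (clip->w <= 0 || clip->h <= 0) return false; /* empty rectangle */
    if (clip->x < 0 || clip->y < 0) return false;   /* starts outside */
    /* Compare against the remaining space instead of computing x + w,
       which could overflow a signed int. */
    if (clip->w > surf_w - clip->x) return false;
    if (clip->h > surf_h - clip->y) return false;
    return true;
}
```

Calling this with the rectangle and the surface dimensions right before the clip is set would rule the non-intersection path in or out for a given run.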
Tracing through the actual sequence of calls indicated in another post
as leading to an access violation, as shown by Valgrind, I cannot see anything
at all wrong in the SDL C source code. If that was my only data I’d say that I have
screwed up one of the pointers, e.g. to the renderer or window surface.
The problem is that if I remove the setting of the clipping region, everything
actually works, and because of my very lame, inefficient code, that
includes MANY calls to the garbage collector.
Heck, the bug could be in my C compiler (I’m using Clang 3.3 from SVN,
but SDL is built with Apple’s gcc 4.2).
[Hmm … actually MY code uses Clang’s C/C++ libraries but SDL is built
against GNU’s C libraries … interesting … arrggh … :]
I had a problem in Felix itself before, because Felix uses a lot of "reinterpretations"
of store which breaks strict aliasing rules.
Given the nature of SDL, it’s very likely SDL breaks the C Standard too,
but the build script neglects to turn strict aliasing optimisations off.
This is mandatory in Clang and gcc 4.2 with even -O1. These compilers
really do perform type based optimisations that assume your code does
not do any type punning. Type punning is not allowed in ISO C.
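To make the aliasing point concrete, here is a minimal sketch of the kind of "reinterpretation of store" in question. The pointer-cast form appears only in a comment because it reads an object through an lvalue of an incompatible type, which the optimiser is allowed to assume never happens. The `memcpy` form is the portable way to get the same bits. The helper name `float_bits` is made up for illustration, and the example assumes a 32-bit IEEE 754 `float`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The problematic form, shown only as a comment:
 *
 *     uint32_t u = *(uint32_t *)&f;
 *
 * A compiler doing type-based alias analysis may reorder or drop
 * such an access. */

/* Portable alternative: copy the object representation with memcpy.
   Hypothetical helper; assumes float is 32 bits wide. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    assert(sizeof u == sizeof f);
    memcpy(&u, &f, sizeof u);
    return u;
}
```

When rewriting every such cast is impractical, GCC and Clang both accept `-fno-strict-aliasing`, which turns the type-based assumption off for the whole build.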
So the bottom line here is: I don’t think I can provide any cut down code
demonstrating the fault until I have a better understanding of what the problem
actually is. And in that case I can probably demonstrate the problem by
reference to the source and fix it. Since my own code is quite complex,
and is written in another language which is translated to C++, making
the generated code also hard to follow, I cannot rule out an error in my
own code.
I’m just suspicious because SDL_RenderSetClipRect is the
ONLY function that causes a problem. Setting the scale doesn’t.
Removing the clipping fixes the problem.
Valgrind says this:
==327== Invalid read of size 4
==327== at 0x100924FD9: SDL_IntersectRect (in /usr/local/lib/libSDL2-2.0.0.dylib)
==327== by 0x100926BA0: SDL_SetClipRect (in /usr/local/lib/libSDL2-2.0.0.dylib)
==327== by 0x1008F426D: SW_UpdateClipRect (in /usr/local/lib/libSDL2-2.0.0.dylib)
==327== by 0x11F24CFB9: flxusr::edit_display::draw::resume() (in /Users/johnskaller/felix/demos/sdl/edit_display.dylib)
and I’ve seen an invalid write too.
The thing is, Valgrind isn’t reliable (it reports these things in code I know is correct).
However under valgrind my code does NOT crash.
Under gdb it crashes in many different ways, always in the garbage collector,
but for many different reasons (indicating corruption).
On 30/07/2013, at 8:53 PM, mva at sysfault.org wrote:
–
john skaller
@john_skaller
http://felix-lang.org