SDL_RenderGeometry and strict aliasing

Hello everyone,

I have some concerns about the new SDL geometry rendering API and potential violation of C/C++ strict aliasing rules.

As a thought experiment: if I had an array of SDL_Vertex structures and wanted to use the SDL_RenderGeometryRaw function, I would need to obtain pointers to the position, colour and texture coordinate data within the SDL_Vertex structure.

#include <cstddef> // for offsetof

const SDL_Vertex* vertices = ...; // array of vertex_count vertices

auto xy = reinterpret_cast<const float*>(
    reinterpret_cast<const char*>(vertices) + offsetof(SDL_Vertex, position));

auto rgba = reinterpret_cast<const int*>(
    reinterpret_cast<const char*>(vertices) + offsetof(SDL_Vertex, color));

auto uv = reinterpret_cast<const float*>(
    reinterpret_cast<const char*>(vertices) + offsetof(SDL_Vertex, tex_coord));

SDL_RenderGeometryRaw(
    renderer, texture, 
    xy,   sizeof(SDL_Vertex),
    rgba, sizeof(SDL_Vertex),
    uv,   sizeof(SDL_Vertex),
    vertex_count,
    indices, num_indices, sizeof(int));

I believe that strict aliasing rules allow me to obtain raw float and int pointers into this structure, but that dereferencing them will result in undefined behaviour. I am not sure whether SDL actually dereferences these pointers, but I did notice that SDL_RenderGeometry is implemented in terms of SDL_RenderGeometryRaw, so this exact logic is part of the library.

Ultimately I wanted to ask whether this potentially violates strict aliasing rules, either within my own code or within SDL’s code? I can imagine that SDL has compilation flags that make this behaviour safe, or maybe my understanding of the topic is wrong altogether.

:heart:

AFAIK this doesn’t violate strict aliasing, since you aren’t accessing floats as though they were ints or anything like that, so you aren’t going to run into issues with the optimizer stashing something in a register instead of writing it to memory, etc.

In fact this is basically the way all modern graphics APIs work (even OpenGL). While they allow vertex attributes to be stored in separate buffers, in practice they’re usually stored interleaved (as in your example) for better cache performance. There are usually alignment requirements, but those are easy enough to adhere to since most programs are storing vertices in some kind of struct.
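
To illustrate, here’s a minimal sketch of mine (the names and values are made up, not from the thread) of the non-interleaved layout the same call also accepts - one tightly packed array per attribute, so each stride is just the size of one element:

// One tightly packed array per attribute, drawing a single triangle.
const float xy[] = { 0.0f, 0.0f,   100.0f, 0.0f,   50.0f, 100.0f }; // 3 x (x, y)
const int rgba[] = { static_cast<int>(0xFFFFFFFFu),  // 3 x packed opaque white
                     static_cast<int>(0xFFFFFFFFu),
                     static_cast<int>(0xFFFFFFFFu) };
const float uv[] = { 0.0f, 0.0f,   1.0f, 0.0f,   0.5f, 1.0f }; // 3 x (u, v)
const int indices[] = { 0, 1, 2 };

SDL_RenderGeometryRaw(renderer, texture,
                      xy,   2 * sizeof(float), // stride: one (x, y) pair
                      rgba, sizeof(int),       // stride: one packed color
                      uv,   2 * sizeof(float), // stride: one (u, v) pair
                      3, indices, 3, sizeof(int));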

Not 100% sure if your code violates strict aliasing, probably not (as you said, you’re not dereferencing anything), but I think it could be simpler and at the same time definitely mostly not violate strict aliasing:

SDL_Vertex* vertices = ...; // vertices array - like before
const float* xy = &vertices[0].position.x;
// ok, the following one still sucks:
const int* rgba = reinterpret_cast<const int*>(&vertices[0].color);
const float* uv = &vertices[0].tex_coord.x;
SDL_RenderGeometryRaw( /* ... as before ... */ );

Not sure why SDL’s SDL_RenderGeometry() implementation doesn’t do this - maybe because this code might crash if vertices == NULL, while their (and your) code shouldn’t, at least as long as vertex_count == 0 (so the calculated pointers are never dereferenced). IMHO the better solution would be to just check for vertices == NULL and handle it explicitly.
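
For what it’s worth, that explicit check could look like this (SDL_InvalidParamError is SDL’s real error macro; where exactly the check would go is my assumption):

if (vertices == NULL) {
    return SDL_InvalidParamError("vertices"); // sets the SDL error and returns -1
}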

Totally possible that SDL’s internal code violates strict aliasing, but they’ll just build with -fno-strict-aliasing.
OTOH, it’s also possible that it doesn’t violate strict aliasing, because it only dereferences parts of the memory as float that actually are float (same for int)?

UPDATE: Small changes to code example so it might actually work

Reading both responses, it seems like the position and texture coordinate data might be safe from strict aliasing violations as the graphics API is expecting float data. The color data is questionable though, as it is reinterpreting an SDL_Color struct as an int.

Ultimately I would like to create my own vertex, point and color structures to be compatible with the SDL_RenderGeometryRaw function (I imagine this will be a common use-case). It would be nice to create a high level data structure similar to the SDL_Vertex:

struct vertex {
    point<float> xy;  // 2 x float
    point<float> uv;  // 2 x float
    color rgba;       // 4 x byte
};

But if this has the potential to violate strict aliasing rules, then the data structure would need to be lower-level:

struct vertex {
    float xy[2];
    float uv[2];
    int rgba;
};
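
With that layout the pointers fall out without any casts - a sketch reusing the call parameters from my first post:

const vertex* v = ...; // vertex_count elements

SDL_RenderGeometryRaw(renderer, texture,
                      v[0].xy,    sizeof(vertex), // float[2] decays to const float*
                      &v[0].rgba, sizeof(vertex), // already a genuine int, no cast
                      v[0].uv,    sizeof(vertex),
                      vertex_count,
                      indices, num_indices, sizeof(int));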

The color being represented as an int is particularly unfortunate as it would require shifts and masks to copy the RGBA components into it.
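
Something like this sketch of mine (assuming little-endian byte order; more on that below):

// Pack r, g, b, a into the low-to-high bytes of an int. On Little Endian
// this gives the same memory layout as SDL_Color{r, g, b, a}; on Big Endian
// it does not.
int pack_rgba(Uint8 r, Uint8 g, Uint8 b, Uint8 a) {
    return (int)((Uint32)r | ((Uint32)g << 8) | ((Uint32)b << 16) | ((Uint32)a << 24));
}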

I guess the obvious solution is to stick to using the SDL_Vertex type, because SDL can guarantee it. But it would be ideal if we can safely use custom structures.

Oh, you’re right, I somehow missed that the color member is an SDL_Color struct (with 4 Uint8) and not a 32bit integer - that should definitely violate strict aliasing.
TBH this is pretty weird design, the doc comment says
\param color Vertex colors (as SDL_Color)
If that color must be in SDL_Color format anyway (=> no way to specify a different color format that fits in an int), then why is that argument const int* and not const SDL_Color*?

Ultimately I would like to create my own vertex, point and color structures to be compatible with the SDL_RenderGeometryRaw function (I imagine this will be a common use-case). It would be nice to create a high level data structure similar to the SDL_Vertex:

I think using custom formats for xy and uv is safe enough, as long as they’re just two consecutive floats (so you can pass the address of the first float of the first array element) - if any strict aliasing rule gets violated, then only in SDL itself, not in your code.

For the color you could maybe use a union, like

struct vertex {
  whatever xy;
  whatever uv;
  union {
    int sdlintColor;
    YourColor color; // must still be 4 bytes in same order as SDL_Color
  };
};

and then use const int* rgba = &yourCustomVertices[0].sdlintColor;
That might still violate strict aliasing when interpreted very strictly, but should be much more compatible: the Stack Overflow question “Unions and type-punning” claims that type punning via union is valid in C99 and newer but not in C++, though AFAIK the common compilers support it anyway.
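
In C++, the fully sanctioned alternative is memcpy, which compilers fold into a plain load; note that this yields a value rather than an aliasing pointer into the array, so it fits a per-vertex conversion loop, not the raw call directly (a sketch of mine, not from the thread):

#include <cstring>

int color_as_int(SDL_Color c) {
    static_assert(sizeof(int) == sizeof(SDL_Color), "sizes must match");
    int out;
    std::memcpy(&out, &c, sizeof(out)); // well-defined type punning in C++
    return out;
}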

It seems that strict aliasing breaks when you access the same memory through two pointers of different types (e.g. int / float), because the compiler is allowed to re-order instructions.

  • but I don’t think we do that.
  • it seems sdl2 is built with -fno-strict-aliasing
  • people are used to building without this … (there’s a lot of discussion on the internet)

I’ve removed the one I got:

I didn’t use

const float* xy = &vertices[0].position.x;
...

because it also triggers a strict aliasing warning :slight_smile:
(with gcc -Wstrict-aliasing=1)

src/render/SDL_render.c:3736:39: warning: dereferencing type-punned pointer might break strict-aliasing rules [-Wstrict-aliasing]
 3736 |     const float *xy = (const float *)((const Uint8 *)&vertices[0].position);

That was some quick work there @Sylvain_Becker, thank you for the response!

It looks as though strict aliasing is definitely a non-issue within SDL itself, but does this also extend to anyone using the SDL_RenderGeometryRaw function? Obviously the user needs to be careful about padding and alignment if they are casting vector/color structs into float/int pointers, but the aliasing is not something they need to be concerned with?

I understand that the safest way to go is to store the vertex data in the expected form from the start, but this is probably a good thought experiment while I have your attention :innocent:

The code in the warning is different from the line above though - do you also get the warning when not casting to const Uint8 * first and using &vertices[0].position.x ?
(...position.x is a float, so dereferencing it to const float* should be absolutely valid - if anything is fishy about it at all, then treating that xy pointer as an array with more than two elements later)

Also, you’re using int color = (renderer->color.r << 0) | (renderer->color.g << 8) | (renderer->color.b << 16) | ((Uint32)renderer->color.a << 24); now - are you sure this is equivalent to the old code on both Big Endian and Little Endian machines?

Sorry, you’re definitely right:

That’s probably a problem with big endian indeed. I suspect other places as well …
I’ll create an issue so that we don’t forget to try it.
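
To spell out the problem (a worked example of my own, not from the patch):

// SDL_Color{0x11, 0x22, 0x33, 0x44} is the bytes 11 22 33 44 in memory on
// every platform. The shifted integer below has the value 0x44332211, which
// is stored as bytes 11 22 33 44 on Little Endian (matching SDL_Color), but
// as 44 33 22 11 on Big Endian (a, b, g, r instead of r, g, b, a).
Uint32 c = 0x11u | (0x22u << 8) | (0x33u << 16) | (0x44u << 24);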

But I don’t think strict aliasing is an issue here… There is SDL_RenderGeometry(), which should not need casting.

I think you can ignore strict aliasing in SDL2 itself, as long as it’s explicitly built with -fno-strict-aliasing. Casting SDL_Color to int at least behaved correctly on all platforms.

However it’s unfortunate that the public API of SDL2 (SDL_RenderGeometryRaw()) requires the user to either violate strict aliasing or do the kinda cumbersome (endian-specific) conversion from SDL_Color (or compatible) to int themselves - it would’ve been better to let SDL_RenderGeometryRaw() take SDL_Color as well and do the potentially strict-aliasing-violating conversion in the SDL2 code, where the user’s compiler flags don’t matter.
Or alternatively, allow the user to use any color format that fits in an int and let them specify the corresponding SDL_PixelFormatEnum in an additional function parameter.

(I’m sorry I didn’t notice this before the 2.0.18 release, I don’t use the SDL_Render API myself)

The problem with making SDL_RenderGeometryRaw() take an array of SDL_Color is that it’s supposed to take the raw geometry that hasn’t been stuffed into SDL-specific types, like what you get out of Dear ImGui. Having to convert an array of vertices to use an SDL_Color struct at runtime is unnecessary extra work and slow.

I would suggest just altering the SDL_RenderGeometryRaw() docs to eliminate mention of SDL_Color and say that color needs to be RGBA32. Perhaps change the type of color to uint32_t.

Does anyone actually use SDL_Color for storing color, anyway?

Maybe I’m missing something, but I don’t really understand that design goal of avoiding struct types. The color is four separate byte components; treating it as a single integer is sort of lying about it, which doesn’t seem great for a public API.

If you look at how the underlying graphics APIs behave and how other high level APIs behave, casting between two equivalent structs that have 4 byte components which come from separate libraries (eg SDL_Color and whatever the caller uses for its color struct) makes a lot more sense to me than temporarily treating it as a single large integer, and AFAIK is pretty commonly done among C graphics APIs.
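
For example (MyColor is a made-up stand-in for whatever type the caller uses):

struct MyColor { Uint8 r, g, b, a; }; // same members, same order as SDL_Color
static_assert(sizeof(MyColor) == sizeof(SDL_Color), "layouts must match");

// The convention described above: reinterpret one library's color struct as
// the other's. Common in C graphics code, though still formally aliasing
// territory in strict C++.
const SDL_Color* as_sdl_color(const MyColor* c) {
    return reinterpret_cast<const SDL_Color*>(c);
}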

In general I wish the SDL_RenderGeometry APIs had been designed starting from the base fundamentals of the underlying GPU APIs rather than trying to fit to specific higher level things like Dear ImGui, because as-is they’re overflexible to the point of not being hardware accelerated for every part that should be (they promise to handle different situations than what the hardware actually handles), while also mismatching some common conventions that other high level APIs already do account for.


How often are you manipulating the individual channels that storing them as separate elements of a struct is necessary? We aren’t talking about a function that takes a single color as a function argument, but a whole buffer of them.

As to your last paragraph, I have no idea what you mean. SDL_RenderGeometryRaw() seems like as close to the way underlying GPU APIs draw things as you can get for something like SDL. You pass it a vertex buffer, what is essentially a vertex descriptor, and an optional index buffer, and SDL copies them to an internal buffer and then sends that to the GPU for you. What do you mean by “overflexible” and not all of it being hardware accelerated?

People had asked for a long time for the ability to give SDL_Renderer 2D polygons to draw, so they wouldn’t have to resort to OpenGL/DirectX/Metal and/or Dear ImGui etc. just to draw things besides rectangles, and that’s what SDL_RenderGeometry() and SDL_RenderGeometryRaw() give us.

edit: the one thing I would ask for in some sort of future SDL_RenderGeometryRawEx() or whatever is the ability to pass the color data as floats.

How often are you manipulating the individual channels that storing them as separate elements of a struct is necessary? We aren’t talking about a function that takes a single color as a function argument, but a whole buffer of them.

Individual channels are manipulated by users sometimes, but that’s not really what I’m getting at. The point isn’t about how frequently they’re changed, but whether the API is correctly informing users what the data is. A 32 bit RGBA color is not a signed integer; it’s four color components, each an 8-bit unsigned normalized value. You can encode it into a signed integer (and decode it from one), but there’s no real reason SDL’s API should accept that specific encoding instead of the actual representation. It’s not like the backends benefit from the external API taking encoded integers, either.

As to your last paragraph, I have no idea what you mean. SDL_RenderGeometryRaw() seems like as close to the way underlying GPU APIs draw things as you can get for something like SDL. You pass it a vertex buffer, what is essentially a vertex descriptor, and an optional index buffer, and SDL copies them to an internal buffer and then sends that to the GPU for you. What do you mean by “overflexible” and not all of it being hardware accelerated?

That’s definitely what I wish the function did, but unfortunately if you look at the implementation it does a lot of work on each vertex and index in the CPU long before passing the data to the GPU. If the external API’s design was a bit tighter the implementation could be shrunk to (effectively) a couple or a few memcpy’s.

Part of the work it does that the lower level graphics APIs don’t do natively includes de-duplication and quad detection heuristics, and re-routing to SDL_RenderCopy/SDL_RenderFillRect when it thinks it’s a good idea. It also doesn’t pass the index list through to the GPU, and if the internal code were changed to do that, the API allows 1-byte indices, which aren’t supported by some modern graphics APIs. The arbitrary strides it accepts for vertex data (including 0 stride) are also a little awkward to handle if it changed to copy the data more directly to the GPU.

SDL_RenderGeometry() does not do any de-duplication or quad detection, or call SDL_RenderCopy() or SDL_RenderFillRect() or any of those. In fact, just the opposite: both those functions can call SDL_RenderGeometry() to do their actual work. (the software renderer is another story, since being able to blit rectangles is faster than rasterizing and texturing polygons in software)

The thing with not being able to just call memcpy() is because SDL_RenderGeometryRaw() supports both interleaved and separate vertex attribute buffers, as well as optional texture coordinates. For simplicity it just interleaves them, whether you had them interleaved initially or not. This probably also makes it play nicer with SDL_RenderCopy() and friends using SDL_RenderGeometry() internally, since it leaves the door open for putting everything using the same texture, state, etc., into one big buffer and submitting the whole thing at once.
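
Roughly like this sketch (my illustration of the idea using the function’s parameter names, not SDL’s actual code, and ignoring the optional-texture-coordinates case; it also shows why a stride of 0 simply repeats one value):

#include <cstring>
#include <vector>

// Walk each attribute pointer by its own stride and pack everything into one
// interleaved SDL_Vertex stream. A stride of 0 re-reads the same element for
// every vertex.
std::vector<SDL_Vertex> packed(num_vertices);
for (int i = 0; i < num_vertices; ++i) {
    std::memcpy(&packed[i].position,
                (const char*)xy + (size_t)i * xy_stride, 2 * sizeof(float));
    std::memcpy(&packed[i].color,
                (const char*)color + (size_t)i * color_stride, sizeof(SDL_Color));
    std::memcpy(&packed[i].tex_coord,
                (const char*)uv + (size_t)i * uv_stride, 2 * sizeof(float));
}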

The only way to avoid manually interleaving the vertex data would be to require the vertex buffer to be interleaved, and possibly an entirely separate function for textured vs untextured. Instead they opted to externally mirror the way GPU APIs work. (and you know that someone would complain if they couldn’t use separate vertex attribute buffers)

The arbitrary strides are to allow applications to use their own vertex structs, which may include data that SDL doesn’t care about.

Anyway, this is a dumb derail.

I completely support this because SDL already provides functions for mapping RGBA values to a Uint32 value. While strict aliasing rules permit accessing memory through the signed and unsigned variants of the same type, it would make more sense for SDL_RenderGeometryRaw to accept color as a Uint32 pointer, for consistency with the get/map RGBA functions.

But it isn’t just a case of changing the documentation. The conversion between an SDL_Color structure and a 32-bit integer depends on the endianness of the CPU. On a little-endian CPU it’s 0xaabbggrr but on a big-endian CPU it’s 0xrrggbbaa.

Put another way, SDL_Color is a platform-independent representation of the color but SDL_PIXELFORMAT_RGBA32 is platform-dependent.
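
Concretely, the mapping route would look something like this (SDL_AllocFormat, SDL_MapRGBA and SDL_FreeFormat are real SDL2 functions; the values are illustrative):

Uint8 r = 255, g = 128, b = 0, a = 255;
SDL_PixelFormat* fmt = SDL_AllocFormat(SDL_PIXELFORMAT_RGBA32);
Uint32 packed = SDL_MapRGBA(fmt, r, g, b, a); // packs for the RGBA32 layout
SDL_FreeFormat(fmt);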

SDL_RenderGeometry() does not do any de-duplication or quad detection, or call SDL_RenderCopy() or SDL_RenderFillRect() or any of those.

Sorry that was my bad, you’re absolutely right. The software rendering code for RenderGeometry is right in the middle of a lot of its other code so I missed what function I was looking at.

Nevertheless, the rest of what I said is true – the implementations never pass the indices along to the GPU and do manual expansion and manipulation of the vertex data instead. It could have fast paths to do that for some common scenarios (which it doesn’t right now), but because the external API favours flexibility for somewhat arbitrary use cases more than simplicity, those hypothetical fast paths would need to be conditional paths rather than happening all the time, which would make the implementation more complicated and the external API less predictable in terms of performance.


Nope, in contrast to SDL_PIXELFORMAT_RGBA8888, SDL_PIXELFORMAT_RGBA32 is byte-wise RGBA, no matter the platform (on Little Endian it’s an alias to SDL_PIXELFORMAT_ABGR8888, on Big Endian it’s an alias to SDL_PIXELFORMAT_RGBA8888).
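
You can even verify that in code (SDL_BYTEORDER and these pixel format enum values are real SDL2 definitions):

#include <SDL.h>

#if SDL_BYTEORDER == SDL_BIG_ENDIAN
static_assert(SDL_PIXELFORMAT_RGBA32 == SDL_PIXELFORMAT_RGBA8888, "RGBA32 aliases RGBA8888 on BE");
#else
static_assert(SDL_PIXELFORMAT_RGBA32 == SDL_PIXELFORMAT_ABGR8888, "RGBA32 aliases ABGR8888 on LE");
#endif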

However, potentially calling SDL_MapRGBA() for every vertex seems a bit cumbersome and inefficient…