How can I speed up SDL_Surface?


static void UpdateRgbSurface(SdlResources* sdl_resources,
                             unsigned char* rgb_buffer) {
  unsigned char const kAlpha = 255U;
  int y = 0;
  int x = 0;
  SDL_assert(sdl_resources->surface != NULL);
  SDL_assert(sdl_resources->window_surface != NULL);


  /* Copy pixels from the source buffer (rgb_buffer) into SDL's RGBA surface. */
  for (y = 0; y < kSurfaceHeight; ++y) {
    for (x = 0; x < kSurfaceWidth; ++x) {
      Uint32* pixel = NULL;
      int const kBytesPerPixel = sdl_resources->surface->format->BytesPerPixel;
      int const kRowSize = sdl_resources->surface->pitch;
      int const kOffset = x * kBytesPerPixel + y * kRowSize;
      Uint8* const kPixels = (Uint8*)sdl_resources->surface->pixels;
      SDL_PixelFormat const* kPixelFormat = sdl_resources->surface->format;
      int const kSourceOffset = kSourceColorChannels * (x + kSurfaceWidth * y);
      unsigned char const kRed = rgb_buffer[kSourceOffset];
      unsigned char const kGreen = rgb_buffer[kSourceOffset + 1];
      unsigned char const kBlue = rgb_buffer[kSourceOffset + 2];
      Uint32 const kColor =
        SDL_MapRGBA(kPixelFormat, kRed, kGreen, kBlue, kAlpha);
      pixel = (Uint32*)(kPixels + kOffset);
      *pixel = kColor;
    }
  }
}

If I'm reading callgrind correctly, this is the slowest function I have.
What can I do to speed it up?

Note: struct PixelBuffer is just an array of RGB pixels (24-bit, 3 unsigned chars).
I use C89 and SDL 2.0.

None of the stuff calculating the bytes per pixel, fetching a pointer to the surface’s pixel data, etc., needs to be done inside the loop.
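
A minimal sketch of that hoisting, written without SDL so it compiles on its own. `ConvertRgbToRgba32` and its parameters are hypothetical names, not anything from the original post; in the function above, `dst` would be `surface->pixels` and `dst_pitch` would be `surface->pitch`. It also assumes the surface is known to be 32-bit with the bytes R, G, B, A in memory order (what SDL calls `SDL_PIXELFORMAT_RGBA32`), so the bytes can be stored directly instead of going through `SDL_MapRGBA` for every pixel. If the format is not known up front, the same hoisting still applies, with `SDL_MapRGBA` remaining as the only call inside the loop.

```c
#include <stddef.h>

/* Hypothetical helper: expands tightly packed 24-bit RGB into 32-bit pixels
   stored as the bytes R, G, B, A. dst_pitch is the destination row size in
   bytes and may be larger than width * 4. */
static void ConvertRgbToRgba32(unsigned char const* src,
                               unsigned char* dst,
                               int width, int height, int dst_pitch) {
  unsigned char const kAlpha = 255U;
  int y = 0;
  int x = 0;
  for (y = 0; y < height; ++y) {
    /* Row pointers are computed once per row, not once per pixel. */
    unsigned char const* in = src + (size_t)y * (size_t)width * 3U;
    unsigned char* out = dst + (size_t)y * (size_t)dst_pitch;
    for (x = 0; x < width; ++x) {
      out[0] = in[0];
      out[1] = in[1];
      out[2] = in[2];
      out[3] = kAlpha;
      in += 3;
      out += 4;
    }
  }
}
```

The inner loop is now only four byte stores and two pointer increments, which gives the compiler a much easier job than the original per-pixel offset math and function call.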

If you’re doing software rendering, things are just gonna be slower. If you’re doing software rendering to an RGB buffer, consider rendering to an RGBA one and skipping this whole step instead.
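
To illustrate the second suggestion: if the renderer already writes 32-bit pixels in the same layout as the surface, the per-pixel conversion collapses into one `memcpy` per row. This is only a sketch; `CopyRgbaRows` and its parameters are made-up names, and it assumes both buffers use the same 4-byte pixel layout. Rows are copied one at a time because the surface pitch may include padding.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies height rows of width 32-bit pixels.
   Both pitches are row sizes in bytes and may include padding. */
static void CopyRgbaRows(unsigned char const* src, int src_pitch,
                         unsigned char* dst, int dst_pitch,
                         int width, int height) {
  size_t const kRowBytes = (size_t)width * 4U;
  int y = 0;
  for (y = 0; y < height; ++y) {
    memcpy(dst + (size_t)y * (size_t)dst_pitch,
           src + (size_t)y * (size_t)src_pitch,
           kRowBytes);
  }
}
```

With SDL 2.0.5 or newer it may be possible to skip even this copy by wrapping the buffer with `SDL_CreateRGBSurfaceWithFormatFrom`, but whether that fits depends on who owns the buffer and how long it lives.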

Also, why C89 and an ancient version of SDL?


C89: other than it being fully supported by the major C compilers (e.g., Clang, GCC, MSVC), no particular reason.
I expressed myself poorly about SDL.
I meant a recent version of SDL 2.

Considering switching to C17, now.
The lack of <stdint.h> in C89 alone is a burden.

I’ve set up the pixel buffer as an “immediate” target so that I can work not only with SDL, but also with other backends (e.g., PPM images, stb_image.h).
It’s an array N of unsigned char.
But yeah, I could also store everything as RGBA packed into a uint32_t, like SDL does, since most of the other libraries I use do not have real-time requirements.
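
A rough sketch of what that trade could look like: the buffer stays RGBA for SDL, and the conversion moves to the backends that are not time-critical, for example when producing the RGB payload of a binary PPM. `Rgba32ToRgb24` is a hypothetical name, and the R, G, B, A byte order is an assumption.

```c
#include <stddef.h>

/* Hypothetical helper: drops the alpha byte of each R, G, B, A pixel and
   writes tightly packed R, G, B triples. rgb must hold 3 * pixel_count bytes. */
static void Rgba32ToRgb24(unsigned char const* rgba,
                          unsigned char* rgb,
                          size_t pixel_count) {
  size_t i = 0;
  for (i = 0; i < pixel_count; ++i) {
    rgb[3U * i + 0U] = rgba[4U * i + 0U];
    rgb[3U * i + 1U] = rgba[4U * i + 1U];
    rgb[3U * i + 2U] = rgba[4U * i + 2U];
  }
}
```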

Thank you!

Just a side note, are you using Uint32 and Sint16, etc? They’re defined by SDL2.

I have all the SDL functions I use inside a wrapper translation unit (sdl_wrappers.c and sdl_wrappers.h).

I can use them inside sdl_wrappers.c, sure; however, I try to keep things independent of one another (modularity).

So, if I decide to stick to C89, I can use:

#include <limits.h>

#if (!defined(__STDC__))
#error No standard-conforming C implementation.
#endif

#if ((!defined(FIXED_WIDTH_INTEGERS)) && defined(__cplusplus) && (__cplusplus > 199711L))
#include <cstdint>
#define FIXED_WIDTH_INTEGERS
#elif ((!defined(FIXED_WIDTH_INTEGERS)) && defined(__cplusplus) && (__cplusplus == 199711L))
#include <stdint.h>
#define FIXED_WIDTH_INTEGERS
#endif


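#if ((!defined(FIXED_WIDTH_INTEGERS)) && (!defined(__cplusplus)) && (defined(__STDC__)) && \
     (defined(__STDC_VERSION__)) && (__STDC_VERSION__ >= 199901L))
#include <stdint.h>
#define FIXED_WIDTH_INTEGERS
#endif

#if ((!defined(FIXED_WIDTH_INTEGERS)) && (!defined(__cplusplus)) && (defined(__STDC__)) && \
     ((!defined(__STDC_VERSION__)) || (__STDC_VERSION__ < 199901L)))
/* C89 fallback. Caveat: long is wider than 32 bits on LP64 platforms. */
typedef signed char int8_t;
typedef signed short int16_t;
typedef signed long int32_t;
typedef unsigned char uint8_t;
typedef unsigned short uint16_t;
typedef unsigned long uint32_t;
#define FIXED_WIDTH_INTEGERS
#endif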

#if (CHAR_BIT > 8)
#error Data type char is bigger than 8 bits.
#endif

#if (!defined(BOOLEAN))
#define BOOLEAN int
#endif

#if (!defined(BOOLEAN_FALSE))
#define BOOLEAN_FALSE 0
#endif

#if (!defined(BOOLEAN_TRUE))
#define BOOLEAN_TRUE 1
#endif

#if (!defined(MAKE_BOOLEAN))
#define MAKE_BOOLEAN(x) (!!(x))
#endif
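
One thing to watch with the fallback typedefs: `long` is only guaranteed to be at least 32 bits, and on 64-bit Linux and macOS it is 64 bits wide, so `typedef unsigned long uint32_t;` would not be a 32-bit type there. A possible C89-only workaround, sketched here with made-up names (`my_int32`, `my_uint32`), is to pick the type from the limits in `<limits.h>`:

```c
#include <limits.h>

/* Pick whichever standard type has exactly 32 value bits.
   my_int32 and my_uint32 are hypothetical names for this sketch. */
#if (UINT_MAX == 0xFFFFFFFFUL)
typedef signed int my_int32;
typedef unsigned int my_uint32;
#elif (ULONG_MAX == 0xFFFFFFFFUL)
typedef signed long my_int32;
typedef unsigned long my_uint32;
#else
#error No 32-bit integer type found.
#endif
```

This does not cover every exotic platform, but it handles the common ILP32, LP64, and LLP64 data models.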

I know that there’s this idiom:

typedef enum {false, true} bool;

But I compile my code with the -Wc++-compat flag, so the above is not an option for me: with that flag on, the compiler complains.
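
For context, `bool`, `true`, and `false` are keywords in C++, which is why the enum idiom trips that flag. A small sketch of how the macro-based replacement might be used instead; the macros are repeated so the snippet stands on its own, and `HasFlag` is a hypothetical function for illustration only.

```c
#define BOOLEAN int
#define BOOLEAN_FALSE 0
#define BOOLEAN_TRUE 1
#define MAKE_BOOLEAN(x) (!!(x))

/* Hypothetical example: reports whether any bit of mask is set in flags.
   MAKE_BOOLEAN collapses any non-zero value to exactly 1. */
static BOOLEAN HasFlag(unsigned long flags, unsigned long mask) {
  return MAKE_BOOLEAN(flags & mask);
}
```

Because the result is normalized to 0 or 1, comparing it against `BOOLEAN_TRUE` is safe, which would not be the case for the raw `flags & mask` value.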

Why all the if-defs?
Well, a good chunk of my translation units are independent of one another, too.
That means duplicating some code while avoiding duplicate definitions, i.e., violations of the one-definition rule (ODR).

Why all that cumbersomeness?
So that you can pull a TU out and use it directly in another project without any additional files or changes.
Essentially, I have a good amount of 2-file libraries in my project.
My rasterizer.h and rasterizer.c are one of these pairs, for example.
pixel_buffer.h and pixel_buffer.c is yet another example.
“Plug-and-pray” or “plug-and-play”, if you like.

I also use a good chunk of these flags:

Alternatively, I can stick to char (>= 8 bits), short (>= 16 bits), and long (>= 32 bits) and skip all these typedefs, since the standard guarantees these minimum widths.
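
If you go that route, keep in mind that those are minimum widths only, so anything that relies on exactly 32 bits (wrap-around, packed pixels) needs an explicit mask. A minimal sketch, with `U32_MASK`, `AddWrap32`, and `PackRgba` as made-up names and the R-in-the-high-byte packing chosen arbitrarily:

```c
#define U32_MASK 0xFFFFFFFFUL

/* Adds two values and wraps at 32 bits, even where long is 64 bits wide. */
static unsigned long AddWrap32(unsigned long a, unsigned long b) {
  return (a + b) & U32_MASK;
}

/* Packs four 8-bit channels into the low 32 bits of an unsigned long,
   with red in the most significant byte. */
static unsigned long PackRgba(unsigned char r, unsigned char g,
                              unsigned char b, unsigned char a) {
  return (((unsigned long)r << 24) |
          ((unsigned long)g << 16) |
          ((unsigned long)b << 8) |
          (unsigned long)a) & U32_MASK;
}
```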