Most efficient way of getting render pixels

I’ve fiddled around with SDL2 and found that SDL_RenderReadPixels is significantly slower than directly accessing the pixels of a surface.
I’m assuming this is because the surface resides in CPU memory while the renderer’s target resides in GPU memory, so each call must copy from GPU => CPU.

In the migration documentation, it is specifically recommended to switch to the SDL_Renderer way of drawing, but in my case, I need extremely fast access to the pixels.

I’m binding the surface pixels from C++ to a NumPy array in Python. With the SDL_Surface method the whole operation takes approximately 180us, while SDL_RenderReadPixels adds 179us for that call on top of the binding operation of approximately 180us:

Approach 1 (SDL v1 style):

unsigned char* gui_sdl::capture() const{
    // no copy: hand out the surface's own pixel buffer
    return static_cast<unsigned char*>(_rootSurface->pixels);
}

Result: 0us + bindings (180us) = 180us

Approach 2 (SDL v2 style):

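unsigned char *gui_sdl::capture2() const{
    // assumed arguments: RGB888 format, reading into _captureSurface
    SDL_RenderReadPixels(_renderer, NULL, SDL_PIXELFORMAT_RGB888,
                         _captureSurface->pixels, _captureSurface->pitch);
    return static_cast<unsigned char*>(_captureSurface->pixels);
}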

Result: 179us + bindings (180us) = 359us
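
Figures like these can be reproduced with a small timing harness. The following is only a sketch, assuming std::chrono::steady_clock is precise enough for microsecond-scale calls; elapsedMicros is a hypothetical helper name, not part of SDL or of the code above.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: runs fn once and returns the elapsed wall time in microseconds.
template <typename Fn>
std::int64_t elapsedMicros(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();  // the operation being measured
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Something like elapsedMicros([&] { gui.capture2(); }) would time Approach 2 (gui being a hypothetical gui_sdl instance). A single call is noisy, so averaging over many calls gives a steadier figure.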

How would I approach this to improve the performance of reading pixels using the renderer?


Maybe @slouken can help with this :wink:

Welcome!

I would ensure that the pixel format that you are requesting from the renderer (RGB888 in your example) matches the format of the display surface.

If it doesn’t, time will be lost to conversion.
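
To see why a mismatch costs time, compare the two read-back paths in plain C++. This is only a sketch of the kind of work a conversion implies, not SDL's actual conversion code; the function names and the packed ARGB8888/ABGR8888 layouts are assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Formats match: the read-back is one bulk copy.
void copySameFormat(const std::uint32_t* src, std::uint32_t* dst, std::size_t count) {
    std::memcpy(dst, src, count * sizeof(std::uint32_t));
}

// Formats differ (here: packed ARGB8888 to ABGR8888): every pixel is touched
// and its red and blue channels are swapped.
void convertArgbToAbgr(const std::uint32_t* src, std::uint32_t* dst, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        const std::uint32_t p = src[i];
        dst[i] = (p & 0xFF00FF00u)            // alpha and green stay in place
               | ((p & 0x00FF0000u) >> 16)    // red moves to the low byte
               | ((p & 0x000000FFu) << 16);   // blue moves to bits 16-23
    }
}
```

The second path does per-pixel work on top of the transfer itself, which is the overhead a matching format avoids.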

Best strategy would be to avoid reading from hardware surfaces at all, but as I can see the word “gui”, you appear to be using the drawing facilities of the renderer to produce GUI elements and then modify them, which makes sense.

SDL_RenderReadPixels() is very slow and only suitable for screenshots and stuff like that. Don’t use it for this. If you must alter pixel data and be able to read it, keep it in RAM and then just copy it to a texture with SDL_LockTexture/UnlockTexture.

Create a texture with the texture streaming flag (SDL_TEXTUREACCESS_STREAMING), make sure the pixel format is something hardware friendly like BGRA8888, and use SDL_LockTexture() and SDL_UnlockTexture() to make changes.

One caveat is that SDL_LockTexture/UnlockTexture is essentially write-only: it can’t guarantee that the memory pointer SDL_LockTexture() gives you will contain the old texture data. So if you need to read pixels, keep them in RAM and just copy them to the texture. If you don’t need to read them, do all your writing inside SDL_LockTexture/UnlockTexture instead of writing to pixel data in RAM and copying it.

So, in pseudo C/C++:

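// needs <cstdlib> for malloc and <cstring> for memcpy
pixel_t *uiPixels = (pixel_t *)malloc(UI_WIDTH * UI_HEIGHT * sizeof(pixel_t));  // assume pixel_t is 32-bit
int pitch = 0;
void *lockedPixels = nullptr;
if(SDL_LockTexture(uiTexture, NULL, &lockedPixels, &pitch) == 0) {  // 0 means success
    // lockedPixels now points to driver-provided memory that we can write to; the driver will then copy it to uiTexture
    for(int y = 0; y < UI_HEIGHT; ++y) {    // copy row by row; can't ignore pitch, it might be wider than UI_WIDTH
        memcpy((Uint8 *)lockedPixels + y * pitch, uiPixels + y * UI_WIDTH, UI_WIDTH * sizeof(pixel_t));
    }
    SDL_UnlockTexture(uiTexture);   // hand the memory back so the driver can upload it
}
SDL_RenderCopy(MyRenderer, uiTexture, NULL, NULL);  // NULL rects are placeholders: whole texture to whole target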
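
The pitch caveat above can be isolated into a small helper that does not depend on SDL at all. This is a minimal sketch, assuming tightly packed 32-bit source pixels; copyRowsWithPitch is a hypothetical name, and dst stands in for the pointer that SDL_LockTexture() hands back.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies a tightly packed 32-bit image into a destination
// whose rows start dstPitch bytes apart (dstPitch may exceed width * 4).
void copyRowsWithPitch(const std::uint32_t* src, int width, int height,
                       void* dst, int dstPitch) {
    const std::size_t rowBytes = static_cast<std::size_t>(width) * sizeof(std::uint32_t);
    assert(static_cast<std::size_t>(dstPitch) >= rowBytes);  // rows must fit in the pitch
    auto* out = static_cast<std::uint8_t*>(dst);
    for (int y = 0; y < height; ++y) {
        // Advance the destination by the pitch and the source by the packed row width.
        std::memcpy(out + static_cast<std::size_t>(y) * dstPitch,
                    src + static_cast<std::size_t>(y) * width,
                    rowBytes);
    }
}
```

Because the RAM copy stays authoritative, a capture function can return it directly, which is the zero-cost read of Approach 1 again, while the renderer only ever receives uploads.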