Improving Texture drawing performance

I’m testing how many small textures I can draw in a single iteration of the game loop. In this case, I simply draw small random rectangles on the screen.

First I took the naive approach of computing the random colors and positions in every iteration. Drawing 2^18 rectangles this way ran at a stable 24 FPS, and I noticed that one of my CPU cores was pegged at 100%.

frame.c (3.3 KB)

Then I took a less naive approach, where the square positions and colors are recomputed only when needed (when the number of drawn rectangles changes). This improved things only marginally, to 29 FPS for the same number of rectangles. In this case, too, one of my CPU cores was pegged at 100%.
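For readers without the attachment, the "recompute only when the count changes" idea can be sketched roughly as below. This is not the code from frame_ext.c. `SquareCache`, `Rect`, `Color` and the fixed 4x4 square size are hypothetical names and values chosen for illustration, and the plain structs stand in for `SDL_Rect` and `SDL_Color` so the sketch compiles without SDL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Stand-ins for SDL_Rect / SDL_Color (assumption: same field layout idea).
struct Rect  { int x, y, w, h; };
struct Color { std::uint8_t r, g, b, a; };
struct Square { Rect rect; Color col; };

// Hypothetical cache: squares are regenerated only when the requested
// count differs from the number currently stored.
class SquareCache {
public:
    SquareCache(int screenW, int screenH, unsigned seed)
        : screenW_(screenW), screenH_(screenH), rng_(seed) {
        assert(screenW > 0 && screenH > 0);
    }

    // Returns the cached squares, regenerating them only if `count` changed.
    const std::vector<Square>& get(std::size_t count) {
        if (count != squares_.size()) {
            regenerate(count);
        }
        return squares_;
    }

    // Number of times the random data was actually rebuilt.
    int regenerations() const { return regenerations_; }

private:
    void regenerate(std::size_t count) {
        std::uniform_int_distribution<int> xs(0, screenW_ - 1);
        std::uniform_int_distribution<int> ys(0, screenH_ - 1);
        std::uniform_int_distribution<int> channel(0, 255);
        squares_.clear();
        squares_.reserve(count);
        for (std::size_t i = 0; i < count; ++i) {
            Square s;
            s.rect = {xs(rng_), ys(rng_), 4, 4};  // 4x4 is an arbitrary size
            s.col = {static_cast<std::uint8_t>(channel(rng_)),
                     static_cast<std::uint8_t>(channel(rng_)),
                     static_cast<std::uint8_t>(channel(rng_)),
                     255};
            squares_.push_back(s);
        }
        ++regenerations_;
    }

    int screenW_;
    int screenH_;
    std::mt19937 rng_;
    std::vector<Square> squares_;
    int regenerations_ = 0;
};
```

In the render loop, `cache.get(count)` would be called every frame, but the random generator only runs on the frames where `count` changed.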

frame_ext.c (4.7 KB)

Hence my initial hypothesis, that the CPU time was being spent on random number generation and math, was wrong. It seems most of the CPU time is probably spent copying data from host memory to GPU memory.

How can I improve rendering speed while drawing all squares?

Your CPU will always be at 100% when using a while loop.
How many frames do you get when drawing fewer rects?

The improvement is mostly linear. If I halve the number of rects, the performance doubles.

And to answer you: no, the CPU isn’t always at 100%. It reaches 100% when drawing 2^18 rectangles or more. When I draw 2^17 rectangles, the CPU is around 100% but the frame rate is near 60 FPS. With 2^16 rectangles, the CPU is around 60% and the frame rate is still 60 FPS. Below that point, reducing the number of rectangles does not increase the FPS (since I have added a function to cap it), but it directly reduces CPU usage (as already explained).
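As a rough sanity check of the linear-scaling claim, the measured figures can be turned into a per-rectangle cost. The sketch below assumes that frame time is strictly proportional to the number of rectangles, which is a simplification, and the function names are made up for illustration. With 29 FPS at 2^18 rectangles it estimates roughly 126,700 rectangles per frame at 60 FPS, slightly under 2^17 = 131072, which is in line with 2^17 running near 60 FPS with the CPU close to 100%.

```cpp
#include <cassert>

// Estimated time per rectangle in seconds, assuming the frame time
// grows linearly with the number of rectangles drawn.
double secondsPerRect(double measuredFps, double rectCount) {
    assert(measuredFps > 0.0 && rectCount > 0.0);
    return (1.0 / measuredFps) / rectCount;
}

// Estimated number of rectangles that fit into one frame at `targetFps`,
// extrapolated from a single measurement.
double rectsAtTargetFps(double measuredFps, double rectCount, double targetFps) {
    assert(targetFps > 0.0);
    return (1.0 / targetFps) / secondsPerRect(measuredFps, rectCount);
}
```

For example, `rectsAtTargetFps(29.0, 262144.0, 60.0)` gives the estimate quoted above. It is only an extrapolation from one data point, so it says nothing about where the time is actually spent.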

Ah, I see you’re using SDL_framerateDelay.

Using SDL_RenderFillRect, I use about 60-70% of the CPU rendering 131072 (2^17) rects.

Using SDL_RenderFillRects, I use about 30% of the CPU rendering 131072 (2^17) rects.
I’m not sure you can change colors per rect with this approach, though.

int SDL_RenderFillRects(SDL_Renderer*   renderer,
                        const SDL_Rect* rects,
                        int             count)
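One possible way around the color limitation is to group the rects by color first, then set the draw color once per group and pass each group to SDL_RenderFillRects in a single call. The sketch below only shows the grouping step. `Batch`, `groupByColor` and `colorKey` are hypothetical names, and the plain structs stand in for `SDL_Rect` and `SDL_Color` so it compiles without SDL. Note the assumptions: this only pays off when many rects share a color (with fully random 24-bit colors nearly every group would hold a single rect), and grouping changes the draw order, which matters where rects overlap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Rect  { int x, y, w, h; };           // stand-in for SDL_Rect
struct Color { std::uint8_t r, g, b, a; };  // stand-in for SDL_Color
struct Square { Rect rect; Color col; };

// All rects that share one color, ready to be drawn with a single call.
struct Batch {
    Color col;
    std::vector<Rect> rects;
};

// Packs a color into one 32-bit key so it can be used in a hash map.
std::uint32_t colorKey(const Color& c) {
    return (std::uint32_t{c.r} << 24) | (std::uint32_t{c.g} << 16) |
           (std::uint32_t{c.b} << 8)  |  std::uint32_t{c.a};
}

// Groups squares by color. Batches appear in the order in which each
// color is first seen, so the result is deterministic.
std::vector<Batch> groupByColor(const std::vector<Square>& squares) {
    std::vector<Batch> batches;
    std::unordered_map<std::uint32_t, std::size_t> indexOfColor;
    for (const Square& s : squares) {
        const std::uint32_t key = colorKey(s.col);
        auto it = indexOfColor.find(key);
        if (it == indexOfColor.end()) {
            it = indexOfColor.emplace(key, batches.size()).first;
            batches.push_back(Batch{s.col, {}});
        }
        assert(it->second < batches.size());
        batches[it->second].rects.push_back(s.rect);
    }
    return batches;
}
```

In a real program `Rect` would be `SDL_Rect`, and the draw loop would call `SDL_SetRenderDrawColor(renderer, b.col.r, b.col.g, b.col.b, b.col.a)` followed by `SDL_RenderFillRects(renderer, b.rects.data(), static_cast<int>(b.rects.size()))` for each batch `b`. If the squares are static, the grouping can be done once instead of every frame.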

Possible solutions

  1. Do not draw a rect if you know you won’t be able to see it on the screen.
  2. Use a texture to replicate the drawing of a rect. Newer versions of SDL2 batch render commands, which should improve rendering speed.
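Solution 1 could look roughly like the sketch below. `isOnScreen` and `cullOffScreen` are hypothetical helpers, and `Rect` is a stand-in for `SDL_Rect` so the example compiles without SDL (SDL itself offers SDL_HasIntersection for this kind of test). It assumes the screen origin is at (0, 0). If every rect is generated inside the window, as in the random-rectangle test, culling will not remove anything, so this mainly helps when part of the scene lies outside the view.

```cpp
#include <cassert>
#include <vector>

struct Rect { int x, y, w, h; };  // stand-in for SDL_Rect

// True if any part of `r` overlaps a screen of screenW x screenH pixels
// whose top-left corner is at (0, 0). Empty rects are never visible.
bool isOnScreen(const Rect& r, int screenW, int screenH) {
    assert(screenW > 0 && screenH > 0);
    if (r.w <= 0 || r.h <= 0) {
        return false;
    }
    return r.x < screenW && r.y < screenH &&
           r.x + r.w > 0 && r.y + r.h > 0;
}

// Returns only the rects that are at least partly visible.
std::vector<Rect> cullOffScreen(const std::vector<Rect>& rects,
                                int screenW, int screenH) {
    std::vector<Rect> visible;
    visible.reserve(rects.size());
    for (const Rect& r : rects) {
        if (isOnScreen(r, screenW, screenH)) {
            visible.push_back(r);
        }
    }
    return visible;
}
```

The filtered vector can then be handed to whichever draw call is in use, so fewer commands are issued per frame.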

Example of solution 2, which uses about 27-30% CPU:

#include <SDL.h>
#include <array>

struct Square {
    SDL_Rect rect;
    SDL_Color col;
};

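// Create a 1x1 white texture. Tinting it with SDL_SetTextureColorMod
// lets one texture stand in for a filled rect of any color.
SDL_Texture* create1x1Texture(SDL_Renderer* const renderer) {
    SDL_Surface* surface = SDL_CreateRGBSurface(0, 1, 1, 32, 0, 0, 0, 0);
    if (!surface) {
        return nullptr;
    }
    // Fill the single pixel with white explicitly instead of relying on defaults.
    SDL_FillRect(surface, nullptr, SDL_MapRGB(surface->format, 255, 255, 255));
    SDL_Texture* texture = SDL_CreateTextureFromSurface(renderer, surface);
    SDL_FreeSurface(surface);
    return texture;
}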



int main(int argc, char* argv[])  // SDL's main wrapper expects char* argv[]
{
    SDL_Init(SDL_INIT_EVERYTHING);
    SDL_Window* window     = SDL_CreateWindow("Frames", SDL_WINDOWPOS_UNDEFINED, SDL_WINDOWPOS_UNDEFINED, 800, 600, 0);
    SDL_Renderer* renderer = SDL_CreateRenderer(window, -1, SDL_RENDERER_ACCELERATED | SDL_RENDERER_PRESENTVSYNC);

    SDL_Texture* texture = create1x1Texture(renderer);

    // static storage: about 2.5 MB of squares could overflow a default-sized stack
    static std::array<Square, 131072> squares{ };

    // Change squares data
    squares[0].rect = {10, 10, 10, 10};
    squares[0].col  = {255, 0, 0};

    bool running = true;
    while (running) {
        SDL_Event e;
        while (SDL_PollEvent(&e)) {
            switch (e.type) {
                case SDL_QUIT:
                    running = false;
                    break;
            }
        }
        SDL_RenderClear(renderer);
        for (const auto& square : squares) {
            SDL_SetTextureColorMod(texture, square.col.r, square.col.g, square.col.b);
            SDL_RenderCopy(renderer, texture, nullptr, &square.rect);
        }
        SDL_RenderPresent(renderer);
    }
    SDL_DestroyTexture(texture);
    SDL_DestroyRenderer(renderer);
    SDL_DestroyWindow(window);
    SDL_Quit();
    return 0;
}