Is there a maximum number of threads, mutexes, or condition variables specified?

Does SDL3 have any limits on the maximum number of worker threads, conditions and mutexes supported, or can I create as many as I want?

For testing purposes, I checked and had no problems with distributing the calculations even across 2048 worker threads (my CPU has 6 logical cores). But I’d rather ask than provide an unsafe implementation.

There doesn’t seem to be a limit imposed by SDL.

Looking around, the major operating systems all support thousands of threads per process, with the main limit being how much physical RAM you have, since each thread gets some stack space of its own.

On Linux, there is a hard limit imposed by the operating system, but it varies from system to system depending on how much RAM the machine has. On the little Raspberry Pi 4 I use for PiHole, it’s over ten thousand threads (ulimit -u reports 14739, cat /proc/sys/kernel/threads-max reports 29478).

On macOS, you can find the limit with the command sysctl kern.num_taskthreads. On my 2018 Mac Mini running macOS Sequoia 15.3.1, this reports 16384.

On Windows: for 32-bit versions, applications can create around 2000 threads per process before running out of address space. On 64-bit Windows… :man_shrugging: probably tens of thousands of threads.

So you’re probably fine. Just be aware that creating 2048 threads is going to use a lot of RAM just for stack space alone.
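To put a rough number on that, here is a back-of-the-envelope sketch in C. The per-thread stack sizes are assumptions, not values SDL guarantees: 1 MiB is a common default reservation on Windows and 8 MiB a common default on Linux, and both can be changed. The helper names are made up for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: total stack reservation in MiB for a thread pool.
   stack_mib is an ASSUMED per-thread stack size; SDL does not report it. */
static uint64_t stack_reservation_mib(uint64_t thread_count, uint64_t stack_mib)
{
    return thread_count * stack_mib;
}

/* Print one estimate, e.g. for the 2048-thread experiment. */
static void print_stack_estimate(const char *label, uint64_t thread_count, uint64_t stack_mib)
{
    printf("%s: %llu threads x %llu MiB = %llu MiB reserved\n",
           label,
           (unsigned long long) thread_count,
           (unsigned long long) stack_mib,
           (unsigned long long) stack_reservation_mib(thread_count, stack_mib));
}
```

With an assumed 1 MiB stack, 2048 threads reserve 2048 MiB; with an assumed 8 MiB stack, the same pool reserves 16384 MiB. Much of that is typically reserved address space rather than committed RAM, because stacks are usually committed page by page as they grow.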

Thanks for the reply. Yes, I was wondering if SDL itself imposes any limitations, but if not, that’s great. I tried these 2048 threads just for fun, not because it has any practical use (especially on a CPU with only a few cores).

Ultimately I will use the number of worker threads from 1 to SDL_GetNumLogicalCPUCores, because more has no practical use and will not improve performance.
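A minimal sketch of that clamping in C, for illustration only. The function name is hypothetical, and the core count is passed in as a parameter so the snippet does not depend on SDL; in the real program it would come from SDL_GetNumLogicalCPUCores().

```c
/* Hypothetical helper: clamp a requested worker count to [1, logical_cores].
   logical_cores is an input here; the program discussed in this thread would
   obtain it from SDL_GetNumLogicalCPUCores(). */
static int clamp_worker_count(int requested, int logical_cores)
{
    /* Defensive: treat a nonsensical core count as a single core. */
    if (logical_cores < 1) {
        logical_cores = 1;
    }
    if (requested < 1) {
        return 1;
    }
    if (requested > logical_cores) {
        return logical_cores;
    }
    return requested;
}
```

On the 6-core CPU mentioned earlier, a request for 2048 workers would be reduced to 6, and a request for 0 would be raised to 1.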


However, I have a problem with threads and locking them using condition variables, which is supposed to work the same way as Windows’ Event objects, synced with WaitForSingleObject and WaitForMultipleObjects. Using only the SDL3 API, I can implement such an event object as a condition-mutex-flag triplet, as shown in the example in the SDL_CreateCondition function documentation.

The problem looks like this:

  1. Create specified number of worker threads and “events” for their syncing.
  2. In the main thread, inside the main game loop:
    1. Do some stuff like update logic.
    2. Signal to all worker threads the task to do.
    3. Wait until all worker threads finish.
    4. Do rendering and other stuff.
    5. Go back to step 2.1 (next game frame).
  3. Destroy all used resources and quit the game.

If I use a fixed number of worker threads, everything works fine. However, I wrote a demo where I can change the number of worker threads at runtime, and then problems appear. Adding more threads works fine, but reducing their number causes errors during the destruction of condition variables. Changing the number of workers is done in the main thread, always when all workers are waiting for the condition to be signaled.

If I test the example program under the debugger (I use Lazarus IDE and Free Pascal, not C), the debugger crashes:

The crash is reported in ntdll:RtlGetCurrentServiceSessionId. If I run the program outside the debugger, reducing the number of worker threads works without errors, but threads that were dynamically created and released are reported by the RTL as leaked.

In this test program, I change the number of workers by pressing the Up and Down keys. After each added or removed thread, I print information about it in the console. Sample console content:

E:\Applications\Tests\SDL Multithreaded CPU Renderer>multirender.exe
Thread #1 added.
Thread #2 added.
Thread #3 added.
Thread #3 removed.
Thread #2 removed.
Thread #1 removed.
Leaked thread (0000000014D2BA60)
Leaked thread (0000000014D2BEE0)
Leaked thread (0000000014D2BF60)

E:\Applications\Tests\SDL Multithreaded CPU Renderer>

How the heck do I destroy a thread that has returned if SDL3 has no function for that? Currently, to destroy the worker, I signal its condition (so that the worker wakes up and breaks out of the loop) and wait with SDL_WaitThread until it returns.

And here is the problem: after the worker returns, the program crashes while destroying the two conditions and mutexes dedicated to this worker. And of course there is no function to destroy the worker thread context, so I have no idea what to do with it.

I can show the sources of the test program, but I would like to point out that it is about 200 lines of code (two files), written in Free Pascal.

Wait, there is a function SDL_DetachThread. If I use this function, the worker threads are released correctly at runtime. Currently, terminating and deleting a dynamically created thread looks like this:

  1. Set a condition variable to wake the worker thread and let it return.
  2. Call SDL_WaitThread and wait until it returns.
  3. Call SDL_DetachThread to clean up the worker thread.
  4. Call SDL_DestroyCondition to destroy the condition variable used to wake this thread.
  5. Call SDL_DestroyMutex to destroy the mutex that guards that condition variable.

This way I can delete threads and no error will occur. Is this the correct and safe way to release worker threads?

Don’t use both DetachThread and WaitThread. WaitThread waits until the thread completes and then cleans up any remaining resources from that thread. DetachThread says “nothing is going to call WaitThread for this thread, so it should clean those resources up itself when it’s done (or, if it already finished, clean it up before returning from DetachThread)”.

You should call one of these two functions for every thread, but not both.

I think SDL is smart enough to handle this situation without crashing, and is probably unrelated to the crash you’re seeing, though.

If you can post the crashing code (even in Pascal), I’ll take a look at it.

Thanks @icculus for the answer, but my test program works flawlessly only when I use both functions. If I use one of them, deleting threads causes the previously mentioned error, and the threads are reported as leaked.

I thought my code was wrong and thread synchronization was implemented incorrectly, so I rewrote this tester and instead of using SDL threads, I used Win32 API threads and event objects for their syncing. Unfortunately, in this case, the SDL_PollEvent function crashed the tester (segmentation fault, I don’t know why yet), so I went back to using SDL threads.


Anyway, two attachments:

MultiRender — source.zip (1018.8 KB)
Full source code, project for Lazarus IDE, with SDL3 header sources.

MultiRender — x86-64.zip (962.0 KB)
Compiled exe + dll, ready to run (if you are brave enough to run strangers’ files).


Only two files are important when it comes to source code:

  1. MultiRender.lpr — main project file, contains almost all the tester code.
  2. MultiSignal.pp — a small wrapper for the condition+mutex+flag triplet, called “signal” (wrapper to the solution described at the bottom of the SDL_CreateCondition documentation).

The program while running looks like this:

MultiRender

You can change the number of threads by pressing Up, Down, PageUp and PageDown, and exit the program by pressing the Esc key. Currently, the code responsible for destroying excess threads when the key to decrease their number is pressed looks like this (focus on the second part, i.e. reducing the number of threads):

procedure ThreadNumUpdate (ANum: Integer);
var
  Index: Integer;
begin
  if ANum > THREAD_NUM_MAX then ANum := THREAD_NUM_MAX;
  if ANum < 1              then ANum := 1;
  if ANum = ThreadNum      then exit;

  if ANum > ThreadNum then
    // add more rendering threads
    for Index := ThreadNum to ANum - 1 do
    begin
      SignalInitialize(@ThreadData[Index].Enter);
      SignalInitialize(@ThreadData[Index].Leave);

      ThreadData[Index].Index := Index;
      ThreadData[Index].Ended := False;

      ThreadFunc[Index] := SDL_CreateThreadRuntime(TSDL_ThreadFunction(@ThreadWorker), '', @ThreadData[Index], nil, nil);
    end
  else
    // reduce the number of rendering threads
    for Index := ThreadNum - 1 downto ANum do
    begin
      ThreadData[Index].Ended := True;      // set the thread termination flag
      SignalEmit(@ThreadData[Index].Enter); // emit the signal to wake up the thread and let it terminate

      SDL_WaitThread(ThreadFunc[Index], nil); // wait until the thread terminates
      SDL_DetachThread(ThreadFunc[Index]);    // clean up the terminated thread

      SignalFinalize(@ThreadData[Index].Enter); // destroys the condition and mutex
      SignalFinalize(@ThreadData[Index].Leave); // the same but for the second signal
    end;

  ThreadNum        := ANum;
  ThreadNumChanged := True;
end;

To simplify the tester and reduce the number of lines of code, I did not add error checking. But I checked it under the debugger and SDL does not report any errors.

This function is called in the main thread, in the main loop and before rendering starts, so always at a point where all threads are blocked, waiting for a signal to wake them up and start rendering. “Rendering” in these threads means filling a regular array of pixels entirely on the CPU, so SDL’s 2D renderer is used only in the main thread, and only after all worker threads have finished filling the pixel array.

Focusing on thread removal in the above function, the tester only works if I first wait for the thread using SDL_WaitThread and then detach it using SDL_DetachThread. If one of these functions is missing, the program crashes when the signal is destroyed, i.e. when the condition and mutex are destroyed (segmentation fault).

You might want to check your workload. One of my larger programs maxed out at 1 thread per core (actually, a little before that). The logical CPU count may report 2 per core because of hyperthreading.

I have just tested your program.
It works perfectly with wine. But when I compile a Linux version, I get the following error.

./MultiRender
Runtime error 202 at $000000000041AD4F
$000000000041AD4F
$00007C598C8E8E29

But I have just found an error in my binding. SDL_snprintf still needs to be adapted to varargs.

// old
function SDL_snprintf(Text: pansichar; maxlen: Tsize_t; fmt: pansichar; args: array of const): longint; cdecl; external libSDL3;
function SDL_snprintf(Text: pansichar; maxlen: Tsize_t; fmt: pansichar): longint; cdecl; external libSDL3;

// new
function SDL_snprintf(Text: pansichar; maxlen: Tsize_t; fmt: pansichar): longint; varargs; cdecl; external libSDL3;

I’ll fix this today.

@Levo: yes, ultimately I intend to use as many threads as logical CPU cores, but also give the player the option to change the number of rendering threads — hence this demo, to play with it. For fun, I gave the option in this demo to set a maximum of 240 rendering threads, which is as many as the frame buffer height (one thread per scanline).
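The row distribution behind that 240-thread ceiling can be sketched in C. This assumes each worker starts at its own index and strides through the frame buffer by the thread count, which is how the demo's worker loop behaves; the helper names are invented here for illustration.

```c
#define FRAME_H 240

/* Count the scanlines that worker `index` fills when `thread_num` workers
   interleave rows: index, index + thread_num, index + 2 * thread_num, ... */
static int scanlines_for_worker(int index, int thread_num)
{
    int count = 0;
    for (int line = index; line < FRAME_H; line += thread_num) {
        count++;
    }
    return count;
}

/* Sum the rows over all workers; the result equals FRAME_H when every
   scanline is covered exactly once. */
static int total_scanlines(int thread_num)
{
    int total = 0;
    for (int index = 0; index < thread_num; index++) {
        total += scanlines_for_worker(index, thread_num);
    }
    return total;
}
```

With 6 workers each one fills 40 scanlines, and with 240 workers each one fills exactly 1, so a 241st worker would have nothing to do.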

Of course, using more than one worker thread per logical core will not increase performance, although nothing technically prevents it.

This is what the SDL_GetNumLogicalCPUCores function takes into account, as its name suggests. :wink:


@Mathias: thanks for testing!

In summary, if the correct release of a thread is to call either SDL_WaitThread or SDL_DetachThread, then why are there no runtime errors (segmentation faults) only if I use both functions?

So that I could run this with debugging tools like AddressSanitizer, I manually converted the program to C, and it has no problems when using only WaitThread:


#include <SDL3/SDL.h>
#include <SDL3/SDL_main.h>

typedef struct TSignal
{
    SDL_Condition *Condition;
    SDL_Mutex *Mutex;
    bool Flag;
} TSignal, *PSignal;

static void SignalInitialize (PSignal ASignal)
{
  ASignal->Condition = SDL_CreateCondition();
  ASignal->Mutex     = SDL_CreateMutex();
  ASignal->Flag      = false;
}


static void SignalFinalize (PSignal ASignal)
{
  SDL_DestroyCondition (ASignal->Condition);
  SDL_DestroyMutex     (ASignal->Mutex);
}


static void SignalEmit(PSignal ASignal)
{
  SDL_LockMutex(ASignal->Mutex);

  ASignal->Flag = true;

  SDL_SignalCondition(ASignal->Condition);
  SDL_UnlockMutex(ASignal->Mutex);
}


static void SignalWait (PSignal ASignal)
{
  SDL_LockMutex(ASignal->Mutex);

  while (!ASignal->Flag) {
    SDL_WaitCondition(ASignal->Condition, ASignal->Mutex);
  }

  ASignal->Flag = false;
  SDL_UnlockMutex(ASignal->Mutex);
}



#define FRAME_W 256
#define FRAME_H 240
#define THREAD_NUM_MAX FRAME_H

static const int
  SQUARE_CENTER_X = FRAME_W / 2,
  SQUARE_CENTER_Y = FRAME_H / 2,
  SQUARE_SIZE     = 32,
  SQUARE_RADIUS   = FRAME_H - SQUARE_CENTER_X - SQUARE_SIZE - 8;

typedef struct TThreadData
{
    TSignal Enter;
    TSignal Leave;
    int Index;
    bool Ended;
} TThreadData, *PThreadData;

static int ThreadNum;
static bool ThreadNumChanged;
static SDL_Thread *ThreadFunc[THREAD_NUM_MAX];
static TThreadData ThreadData[THREAD_NUM_MAX];

static float SquareAngle = 0.0;
static int SquareX;
static int SquareY;

static struct { Uint8 A, R, G, B; } FrameBuffer[FRAME_H][FRAME_W];

static int ThreadWorker (void *PVThreadData)
{
    PThreadData AThreadData = (PThreadData) PVThreadData;
    int IndexScanline;
    int IndexPixel;

    do {
      SignalWait(&AThreadData->Enter);

      if (!AThreadData->Ended) {

        IndexScanline = AThreadData->Index;

        while (IndexScanline < FRAME_H) {
          for (IndexPixel = 0; IndexPixel < FRAME_W; IndexPixel++) {
            FrameBuffer[IndexScanline][IndexPixel].R = IndexScanline ^ IndexPixel;
            FrameBuffer[IndexScanline][IndexPixel].G = IndexScanline ^ IndexPixel;
            FrameBuffer[IndexScanline][IndexPixel].B = IndexScanline ^ IndexPixel;
          }

          if ((IndexScanline > SquareY - SQUARE_SIZE) && (IndexScanline < SquareY + SQUARE_SIZE)) {
            for (IndexPixel = SquareX - SQUARE_SIZE + 1; IndexPixel < SquareX + SQUARE_SIZE - 1; IndexPixel++) {
              FrameBuffer[IndexScanline][IndexPixel].R = ~FrameBuffer[IndexScanline][IndexPixel].R;
              FrameBuffer[IndexScanline][IndexPixel].G = ~FrameBuffer[IndexScanline][IndexPixel].G;
              FrameBuffer[IndexScanline][IndexPixel].B = ~FrameBuffer[IndexScanline][IndexPixel].B;
            }
          }

          IndexScanline += ThreadNum;
        }

        SignalEmit(&AThreadData->Leave);
      }
    } while (!AThreadData->Ended);

    return 0;
}

static char Title[127];
static SDL_Window *Window;
static SDL_Renderer *Renderer;
static SDL_Texture *Texture;
static void *TextureData;
static int TexturePitch;

static void ThreadNumUpdate (int ANum)
{
    int Index;

    if (ANum > THREAD_NUM_MAX) { ANum = THREAD_NUM_MAX; }
    if (ANum < 1) { ANum = 1; }
    if (ANum == ThreadNum) { return; }

    if (ANum > ThreadNum) {
      for (Index = ThreadNum; Index < ANum; Index++) {
        SignalInitialize(&ThreadData[Index].Enter);
        SignalInitialize(&ThreadData[Index].Leave);

        ThreadData[Index].Index = Index;
        ThreadData[Index].Ended = false;

        ThreadFunc[Index] = SDL_CreateThreadRuntime(ThreadWorker, "", &ThreadData[Index], NULL, NULL);
      }
    } else {
      for (Index = ThreadNum - 1; Index >= ANum; Index--) {
        ThreadData[Index].Ended = true;
        SignalEmit(&ThreadData[Index].Enter);

        SDL_WaitThread(ThreadFunc[Index], NULL);
//      SDL_DetachThread(ThreadFunc[Index]);  // not needed: SDL_WaitThread already cleans up the thread

        SignalFinalize(&ThreadData[Index].Enter);
        SignalFinalize(&ThreadData[Index].Leave);
      }
    }

    ThreadNum        = ANum;
    ThreadNumChanged = true;
}

int main(int argc, char **argv)
{
  SDL_Event Event;
  int FrameRate = 0;
  int FrameRateNew = 0;
  float FrameTime = 0.0;
  Uint64 TimeSample;
  Uint64 Second;
  Uint64 SecondNew;
  int Index;

  SDL_Init(SDL_INIT_VIDEO);

  SDL_memset(FrameBuffer, 0xFF, sizeof (FrameBuffer));
  Second = SDL_GetTicks() / 1000;

  SDL_snprintf(Title, sizeof (Title), "Threads: %d/%d — Frame rate: %d — Frame time: %.2fms", ThreadNum, THREAD_NUM_MAX, FrameRate, FrameTime);

  Window    = SDL_CreateWindow   (Title, 640, 480, SDL_WINDOW_RESIZABLE);
  Renderer  = SDL_CreateRenderer (Window, "opengl");
  Texture   = SDL_CreateTexture  (Renderer, SDL_PIXELFORMAT_BGRA8888, SDL_TEXTUREACCESS_STREAMING, FRAME_W, FRAME_H);
  ThreadNum = SDL_GetNumLogicalCPUCores();

  for (Index = 0; Index < ThreadNum; Index++) {
    SignalInitialize(&ThreadData[Index].Enter);
    SignalInitialize(&ThreadData[Index].Leave);

    ThreadData[Index].Index = Index;
    ThreadData[Index].Ended = false;

    ThreadFunc[Index] = SDL_CreateThreadRuntime(ThreadWorker, "", &ThreadData[Index], NULL, NULL);
  }

  while (true) {
    ThreadNumChanged = false;

    while (SDL_PollEvent(&Event)) {
      switch (Event.type) {
        case SDL_EVENT_KEY_DOWN:
          switch (Event.key.scancode) {
            case SDL_SCANCODE_UP:       ThreadNumUpdate(ThreadNum + 1); break;
            case SDL_SCANCODE_DOWN:     ThreadNumUpdate(ThreadNum - 1); break;
            case SDL_SCANCODE_PAGEUP:   ThreadNumUpdate(ThreadNum + 4); break;
            case SDL_SCANCODE_PAGEDOWN: ThreadNumUpdate(ThreadNum - 4); break;
            case SDL_SCANCODE_ESCAPE:   goto l0001;
            default: break;
          }
          break;

        case SDL_EVENT_QUIT: goto l0001;
      }
    }

    SquareAngle += (2 * SDL_PI_F) / (60 * 2);
    SquareX     = SDL_roundf(SQUARE_CENTER_X + SDL_cos(SquareAngle) * SQUARE_RADIUS);
    SquareY     = SDL_roundf(SQUARE_CENTER_Y + SDL_sin(SquareAngle) * SQUARE_RADIUS);

    TimeSample  = SDL_GetTicksNS();
    //begin
      for (Index = 0; Index < ThreadNum; Index++) { SignalEmit(&ThreadData[Index].Enter); }
      for (Index = 0; Index < ThreadNum; Index++) { SignalWait(&ThreadData[Index].Leave); }
      SDL_SetRenderDrawColor(Renderer, 0, 0, 0, 255);
      SDL_RenderClear(Renderer);

      SDL_LockTexture(Texture, NULL, &TextureData, &TexturePitch);
      SDL_memcpy(TextureData, FrameBuffer, sizeof (FrameBuffer));
      SDL_UnlockTexture(Texture);

      SDL_RenderTexture(Renderer, Texture, NULL, NULL);
    //end;
    FrameTime = (SDL_GetTicksNS() - TimeSample) / 1000000.0f;  // ns to ms, keeping the fractional part
    SDL_RenderPresent(Renderer);

    FrameRateNew += 1;
    SecondNew    = SDL_GetTicks() / 1000;

    if ((ThreadNumChanged) || (SecondNew != Second)) {
      if (SecondNew != Second) {
        Second       = SecondNew;
        FrameRate    = FrameRateNew;
        FrameRateNew = 0;
      }

      SDL_snprintf(Title, sizeof (Title), "Threads: %d/%d — Frame rate: %d — Frame time: %.2fms", ThreadNum, THREAD_NUM_MAX, FrameRate, FrameTime);
      SDL_SetWindowTitle(Window, Title);

      ThreadNumChanged = false;
    }

    SDL_Delay(15);
  }

l0001:
  for (Index = 0; Index < ThreadNum; Index++) {
    ThreadData[Index].Ended = true;
    SignalEmit(&ThreadData[Index].Enter);

    SDL_WaitThread(ThreadFunc[Index], NULL);
//    SDL_DetachThread(ThreadFunc[Index]);

    SignalFinalize(&ThreadData[Index].Enter);
    SignalFinalize(&ThreadData[Index].Leave);
  }

  SDL_DestroyTexture(Texture);    // destroy the texture before the renderer that owns it
  SDL_DestroyRenderer(Renderer);
  SDL_DestroyWindow(Window);
  SDL_Quit();

  return 0;
}

…no incorrect memory accesses, let alone crashes.

So it could be a Pascal binding thing, or it could be a bug in SDL’s Windows code (I’m on Linux here), but I guess knowing where it crashed (or even better, why it crashed) would be extremely helpful, if we can determine that somehow.

It seems that my earlier demo has a bug somewhere, because I rewrote it without wrapping condition-mutex-flag in standalone structures and everything works fine. In fact, SDL_WaitThread is enough, there are no problems with waiting for the thread to return, so sorry for the confusion.

ThreadTest.zip (1.0 MB)

Attached is a new demo written in Free Pascal (Lazarus IDE, with those headers), which uses only SDL_WaitThread and it works correctly. Full source code below, with some comments:

uses
  SDL3;

const
  THREAD_NUM_MAX = 10;

var
  ThreadNum:        Integer = 0;
  ThreadFunc:       array [0 .. THREAD_NUM_MAX - 1] of PSDL_Thread;
  ThreadEnterMutex: array [0 .. THREAD_NUM_MAX - 1] of PSDL_Mutex;
  ThreadEnterCond:  array [0 .. THREAD_NUM_MAX - 1] of PSDL_Condition;
  ThreadEnterFlag:  array [0 .. THREAD_NUM_MAX - 1] of Boolean;
  ThreadLeaveMutex: array [0 .. THREAD_NUM_MAX - 1] of PSDL_Mutex;
  ThreadLeaveCond:  array [0 .. THREAD_NUM_MAX - 1] of PSDL_Condition;
  ThreadLeaveFlag:  array [0 .. THREAD_NUM_MAX - 1] of Boolean;
  ThreadTerminated: array [0 .. THREAD_NUM_MAX - 1] of Boolean;

  // A worker function waiting for the "Enter" signal, executing the task and sending the "Leave" signal.
  function ThreadWorker (AData: Pointer): Integer; cdecl;
  var
    I: PtrInt absolute AData; // Thread index is encoded in the parameter address. ;)
  begin
    repeat
      // Wait until the main thread emits the "Enter" signal.
      SDL_LockMutex(ThreadEnterMutex[I]);

      while not ThreadEnterFlag[I] do
        SDL_WaitCondition(ThreadEnterCond[I], ThreadEnterMutex[I]);

      ThreadEnterFlag[I] := False;
      SDL_UnlockMutex(ThreadEnterMutex[I]);

      // If the thread's termination flag is not set, do the task.
      if not ThreadTerminated[I] then
      begin
        // In this demo the threads do nothing and give the "Leave" signal immediately, because we are only
        // checking the correctness of the synchronization of the worker threads with the main thread.

        // Emit a "Leave" signal to inform the main thread that the task is completed.
        SDL_LockMutex(ThreadLeaveMutex[I]);
        ThreadLeaveFlag[I] := True;
        SDL_SignalCondition(ThreadLeaveCond[I]);
        SDL_UnlockMutex(ThreadLeaveMutex[I]);
      end;
    until ThreadTerminated[I];

    Result := 0;
  end;

  // Add a new worker thread to the end of the pool.
  procedure ThreadAdd();
  begin
    ThreadEnterMutex[ThreadNum] := SDL_CreateMutex();
    ThreadEnterCond[ThreadNum]  := SDL_CreateCondition();
    ThreadEnterFlag[ThreadNum]  := False;

    ThreadLeaveMutex[ThreadNum] := SDL_CreateMutex();
    ThreadLeaveCond[ThreadNum]  := SDL_CreateCondition();
    ThreadLeaveFlag[ThreadNum]  := False;

    ThreadTerminated[ThreadNum] := False;
    ThreadFunc[ThreadNum]       := SDL_CreateThreadRuntime(@ThreadWorker, '', Pointer(PtrInt(ThreadNum)), nil, nil);

    ThreadNum += 1;
  end;

  // Remove the last thread from the thread pool.
  procedure ThreadDelete();
  begin
    // Set the flag - when the thread wakes up, it will check this flag, break the loop and return.
    ThreadNum -= 1;
    ThreadTerminated[ThreadNum] := True;

    // Wake up the thread.
    SDL_LockMutex(ThreadEnterMutex[ThreadNum]);
    ThreadEnterFlag[ThreadNum] := True;
    SDL_SignalCondition(ThreadEnterCond[ThreadNum]);
    SDL_UnlockMutex(ThreadEnterMutex[ThreadNum]);

    // Wait for thread to wake up and return, then destroy it.
    SDL_WaitThread(ThreadFunc[ThreadNum], nil);

    // Destroy thread-related resources.
    SDL_DestroyCondition(ThreadEnterCond[ThreadNum]);
    SDL_DestroyMutex(ThreadEnterMutex[ThreadNum]);

    SDL_DestroyCondition(ThreadLeaveCond[ThreadNum]);
    SDL_DestroyMutex(ThreadLeaveMutex[ThreadNum]);
  end;

  // Emit the "Enter" signal to wake up the thread to do its task or return.
  procedure ThreadWakeUp(AIndex: Integer);
  begin
    SDL_LockMutex(ThreadEnterMutex[AIndex]);
    ThreadEnterFlag[AIndex] := True;
    SDL_SignalCondition(ThreadEnterCond[AIndex]);
    SDL_UnlockMutex(ThreadEnterMutex[AIndex]);
  end;

  // Wait until the thread does its task and emits the "Leave" signal to inform the main thread.
  procedure ThreadWaitFor(AIndex: Integer);
  begin
    SDL_LockMutex(ThreadLeaveMutex[AIndex]);

    while not ThreadLeaveFlag[AIndex] do
      SDL_WaitCondition(ThreadLeaveCond[AIndex], ThreadLeaveMutex[AIndex]);

    ThreadLeaveFlag[AIndex] := False;
    SDL_UnlockMutex(ThreadLeaveMutex[AIndex]);
  end;

  // Update the number of active worker threads in the window title.
  procedure WindowUpdateTitle(AWindow: PSDL_Window);
  var
    Title: array [0 .. 127] of Char;
  begin
    SDL_snprintf(@Title, Length(Title), 'Threads: %d/%d', ThreadNum, THREAD_NUM_MAX);
    SDL_SetWindowTitle(AWindow, @Title);
  end;

label
  CleanUp;
var
  Window: PSDL_Window;
  Event:  TSDL_Event;
  I:      Integer;
begin
  SDL_Init(SDL_INIT_VIDEO);

  Window := SDL_CreateWindow('', 640, 480, 0);
  Event  := Default(TSDL_Event);

  // Create all worker threads.
  for I := 0 to THREAD_NUM_MAX - 1 do
    ThreadAdd();

  WindowUpdateTitle(Window);

  // Main loop.
  while True do
  begin
    while SDL_PollEvent(@Event) do
    case Event._Type of

      SDL_EVENT_KEY_DOWN:
      case Event.Key.Scancode of
        SDL_SCANCODE_UP:
          if ThreadNum < THREAD_NUM_MAX then
          begin
            ThreadAdd();
            WindowUpdateTitle(Window);
          end;
        SDL_SCANCODE_DOWN:
          if ThreadNum > 1 then
          begin
            ThreadDelete();
            WindowUpdateTitle(Window);
          end;
        SDL_SCANCODE_ESCAPE: goto CleanUp;
      end;

      SDL_EVENT_QUIT: goto CleanUp;
    end;

    // Wake up all threads to do their tasks.
    for I := 0 to ThreadNum - 1 do
      ThreadWakeUp(I);

    // Wait until all threads finish the task.
    for I := 0 to ThreadNum - 1 do
      ThreadWaitFor(I);

    // Here all threads are asleep and are waiting for the next "Enter" signal, so do other main-loop-related things
    // (logic, rendering etc.). In this demo we do nothing more, so sleep for a while and go to the next frame.
    SDL_Delay(10);
  end;

CleanUp:
  // Terminate all threads and destroy thread-related resources.
  while ThreadNum > 0 do
    ThreadDelete();

  SDL_DestroyWindow(Window);
  SDL_Quit();
end.

So it looks like SDL is working fine. Thanks to everyone for the discussion. :wink:

I tested the above C example, it works for me on Linux.

I tried it again, only the Windows version in wine works for me.

This error occurs with Linux.

Runtime error 202202 at $000000000041954F
00000000041954F
Runtime error 202202 at $000000000041954F
00000000041954F
$000000000041954F

There must be an error in the Pascal binding. It’s just strange that it works under Windows.

PS: Is it expected that the window stays black while it is running?
ESC works to abort.

I removed all the debug stuff, now it works on Linux too. Even with the animation above. It looks like the debugger and SDL_Mutex don’t get along. I had something like that before when I was playing with multithreading, as far as I know it was with X11.

Yes, the window should be just black, as this demo only tests changing the number of worker threads (just a minimal example, to test threads synchronization). On Windows it works fine, without any errors (even under debugger).