I use SDL_Delay() to save power, actually, since my system is limited by electrical current: I have multiple large screens and computers in a unit that has only a single 15 amp line. With some content, some of the computers will use up all their cycles doing actual work, but on average I schedule things so that not all of them do so. It’s the only way I can get heavy-duty realtime rendering and simulation at some times on some of the machines and still stay within the power limit.
In my threaded version, the render thread waits on vsync, so that’s fine. The control thread and animation threads share data (such as vertices from character skinning code) with the render thread using SDL mutexes. I’m going to change the shared data into a ring buffer and use SDL_SemTryWait() in the readout (to load to a VBO or whatever) to check whether the newest buffer is locked, and point to the previous buffer if it is. My best guess is that I should use individual buffers for the various objects, even though some objects may then lag the others by a frame if their buffers happen to be locked; that’s not much of an issue. Later I’ll switch the buffer index to an atomic compare-and-swap based structure for lock-free synchronization. In my setup the control and animation threads do an SDL_Delay() if they’re going much faster than the framerate:
unsigned const int tNew(SDL_GetTicks());
unsigned int tDif(tNew - tLast);
tLast = tNew;
float const dt(0.001f * static_cast<float>(tDif));
// animate(dt) and/or simulate(dt)
// Lock shared buffer while dumping updated data
tDif = SDL_GetTicks() - tNew;
if (tDif < 18) SDL_Delay(18 - tDif);
Note that on Windows XP I always get 18 or 19 for tDif, and if I put, say, 11 instead of 18 I always get 11 or 12, which seems to suggest that Windows has a timer resolution of around 2 ms (I expected 10 ms from the SDL documentation).
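As a rough illustration of the try-lock readout described earlier in this message, here is a minimal sketch in standard C++ rather than SDL. A std::atomic<bool> flag per slot stands in for SDL_SemTryWait() on a semaphore with an initial value of 1, and the names VertexRing, publish, tryAcquireNewest and release are hypothetical. It assumes a single writer per ring and publishes the slot index only after the data has been written, which is one possible arrangement and not necessarily the one planned here.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kSlots = 3;   // assumed ring size

// One ring per shared object (all names are hypothetical).
class VertexRing {
public:
    struct Slot {
        std::atomic<bool> busy{false};   // try-lock flag: true while a thread uses the slot
        std::vector<float> vertices;
        unsigned frame = 0;              // 0 means "never written"
    };

    // Writer side (single writer assumed): fill a free slot after the newest
    // one, then publish its index. Returns false if no slot could be taken.
    bool publish(const std::vector<float>& vertices, unsigned frame) {
        const std::size_t newest = latest_.load();
        for (std::size_t step = 1; step < kSlots; ++step) {
            const std::size_t index = (newest + step) % kSlots;
            Slot& slot = slots_[index];
            if (slot.busy.exchange(true)) continue;   // held by a reader, try the next one
            slot.vertices = vertices;
            slot.frame = frame;
            slot.busy.store(false);
            latest_.store(index);
            return true;
        }
        return false;
    }

    // Reader side: try the newest slot first and fall back to older ones if it
    // is locked. Returns nullptr if every slot is locked. Never blocks.
    Slot* tryAcquireNewest() {
        const std::size_t newest = latest_.load();
        for (std::size_t back = 0; back < kSlots; ++back) {
            Slot& slot = slots_[(newest + kSlots - back) % kSlots];
            if (!slot.busy.exchange(true)) return &slot;
        }
        return nullptr;
    }

    // Unlock a slot obtained from tryAcquireNewest().
    static void release(Slot* slot) {
        assert(slot != nullptr);
        slot->busy.store(false);
    }

private:
    Slot slots_[kSlots];
    std::atomic<std::size_t> latest_{0};
};
```

A render thread would call tryAcquireNewest(), upload slot->vertices, and then call release(slot). If the newest slot happens to be held, it gets the data from the previous frame instead of blocking, which is the one-frame lag mentioned earlier.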
(Actually, the simulation thread would not allow dt > max_stable_timestep / k, and would force the others to slow down as well when it falls out of realtime; there is no point in wasting CPU time on other threads.)
From: Bob Pendleton
Subject: Re: [SDL] SDL_GL_SwapBuffer() + vsync randomly causes high CPU usage–or not…
If you have to add explicit SDL_Delay() calls to a piece of
multi-threaded code to get it to work correctly, your code does not
work correctly and needs to be redesigned. Explicit delays are
only to be used to slow down a loop and/or control the
frequency of a process. If you have more than one delay in your
whole program you are doing something wrong. You never need to use
delay and should be suspicious of your design if you do find
yourself using it.
Be careful making blanket statements like that. I was building a game
engine last year with a scripting API, and every script ran on its own
thread. There was a “wait” command in the script, plus a handful of
special effects that executed over a certain amount of time and made
the script wait to move on to the next command until the current one
was finished executing. I implemented the first one directly with a
call to Sleep(), and the second by sending initialization information
to the renderer (on the main thread) and then sleeping. I don’t see
how I could have done it effectively without sleeping the script
threads.
I am very careful about making blanket statements like that. It is a
weird thing: you can ask and ask and ask and never get an answer, but
claim absolute knowledge and people will work for days or even weeks
to prove you wrong. So, sometimes I make statements like that because
I want to find out if I am right. Other times I do it because I think
I know something and I can help other people learn.
If I am right, then someone else has a chance to learn something from
me. If they prove me wrong, then I learn something from them. In
either case, someone gets something they didn’t have before.
Having a wait/sleep command in scripts is handy, I have to admit that.
Just like using a single sleep in the main loop of an SDL program is
handy for limiting the frame rate. But I allowed for that in my
blanket statement. The trouble is that sleeping does not cause the
threads you want to run to actually run. It only forces one thread to
sleep(). I can see that if you have a small number of scripts it
works most of the time.
OTOH, I would use a very different method. I would use a thread pool
to store threads allocated to running scripts. You do not need a
thread for each script. No matter how many threads you have you will
never have more threads running than you have processors. So,
limit yourself to one thread per processor and save some overhead.
(You might need more if you have other blocking operations in your
scripting system.) Put your threads in a thread pool and share them
between scripts. You can implement wait() by having it place a script
in a priority queue (a sorted queue aka a heap). Its priority is the
time when it is supposed to wake up. A wait(0) should (if your queue
is correctly implemented) put a script at the end of the group of
threads already scheduled for the same time so it will just give up
control to the next ready to run script.
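The wait() queue described here can be sketched as a min-heap keyed on wake-up time. std::priority_queue is not stable, so this sketch adds an insertion sequence number as a tie-breaker to get the wait(0) ordering described. WaitQueue, SleepingScript and the millisecond timestamps supplied by the caller are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>
#include <string>
#include <vector>

// A parked script (hypothetical type): the priority is its wake-up time.
struct SleepingScript {
    std::uint64_t wakeTimeMs;   // when the script should run again
    std::uint64_t sequence;     // insertion order, breaks ties between equal times
    std::string name;
};

// Orders the heap so that the earliest (wakeTimeMs, sequence) pair is on top.
struct WakesLater {
    bool operator()(const SleepingScript& a, const SleepingScript& b) const {
        if (a.wakeTimeMs != b.wakeTimeMs) return a.wakeTimeMs > b.wakeTimeMs;
        return a.sequence > b.sequence;
    }
};

class WaitQueue {
public:
    // wait(delay): park the script until nowMs + delayMs.
    void wait(const std::string& name, std::uint64_t nowMs, std::uint64_t delayMs) {
        heap_.push(SleepingScript{nowMs + delayMs, nextSequence_++, name});
    }

    bool empty() const { return heap_.empty(); }

    // True if the earliest script is due at nowMs.
    bool hasReady(std::uint64_t nowMs) const {
        return !heap_.empty() && heap_.top().wakeTimeMs <= nowMs;
    }

    // Removes and returns the earliest due script; call only if hasReady(nowMs).
    std::string popReady(std::uint64_t nowMs) {
        assert(hasReady(nowMs));
        std::string name = heap_.top().name;
        heap_.pop();
        return name;
    }

    // Wake-up time of the earliest script, for arming a timer; queue must not be empty.
    std::uint64_t nextWakeTimeMs() const {
        assert(!heap_.empty());
        return heap_.top().wakeTimeMs;
    }

private:
    std::priority_queue<SleepingScript, std::vector<SleepingScript>, WakesLater> heap_;
    std::uint64_t nextSequence_ = 0;
};
```

With this ordering, a script that calls wait(0) lands behind every script already scheduled for the current time, so it simply yields to the next ready script.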
If no script is ready to run, use a timer to wake up a thread at the
time a script will be ready to run. If more than one script is ready
to run let more threads out of the pool and have them run the other
ready scripts. When a thread finishes it should check to see if there
are other ready scripts and run them. If there are no ready scripts it
should go back to the pool. If it is the last active scripting thread
it should wait on a timer for the next waiting thread. Or, if the
queue is empty wait on the queue so it will awaken when a script is
placed in the queue. If you want to get fancy you can have the threads
do fine grain scheduling (time slicing) on the ready scripts. That is
something you won’t necessarily get out of the OS.
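One possible shape for the pooled workers, as a sketch only: a fixed set of std::thread workers share the wake-time heap, and an idle worker blocks on a condition variable, either with a timeout at the next wake-up time or indefinitely when the queue is empty. A script is modelled as a hypothetical step function that returns the delay it wants in milliseconds, or a negative value when it has finished. Unlike the description, every idle worker waits on the timer here (not just the last active one), and there is no time slicing. ScriptScheduler and its members are made-up names.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical script model: each call runs the script up to its next wait()
// and returns the requested delay in ms, or a negative value when finished.
using ScriptStep = std::function<int()>;

class ScriptScheduler {
public:
    explicit ScriptScheduler(unsigned workerCount) {
        assert(workerCount > 0);
        for (unsigned i = 0; i < workerCount; ++i)
            workers_.emplace_back([this] { workerLoop(); });
    }

    ~ScriptScheduler() {   // scripts still queued at this point are dropped
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        changed_.notify_all();
        for (std::thread& worker : workers_) worker.join();
    }

    // Make a script ready to run immediately.
    void start(ScriptStep step) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push(Entry{Clock::now(), nextSeq_++, std::move(step)});
        changed_.notify_one();
    }

    // Block until no script is queued or running.
    void waitUntilIdle() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return queue_.empty() && running_ == 0; });
    }

private:
    struct Entry {
        Clock::time_point wake;
        std::uint64_t seq;
        ScriptStep step;
    };
    struct Later {
        bool operator()(const Entry& a, const Entry& b) const {
            return a.wake != b.wake ? a.wake > b.wake : a.seq > b.seq;
        }
    };

    void workerLoop() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (!stopping_) {
            // Nothing queued: wait until a script is placed in the queue.
            if (queue_.empty()) { changed_.wait(lock); continue; }
            // Earliest script not due yet: wait on a timer until it is.
            const Clock::time_point wake = queue_.top().wake;
            if (wake > Clock::now()) { changed_.wait_until(lock, wake); continue; }

            ScriptStep step = queue_.top().step;
            queue_.pop();
            ++running_;
            lock.unlock();
            const int delayMs = step();   // run the script without holding the lock
            lock.lock();
            --running_;
            if (delayMs >= 0) {           // the script called wait(delayMs): park it again
                queue_.push(Entry{Clock::now() + std::chrono::milliseconds(delayMs),
                                  nextSeq_++, std::move(step)});
                changed_.notify_one();
            } else if (queue_.empty() && running_ == 0) {
                idle_.notify_all();
            }
        }
    }

    std::mutex mutex_;
    std::condition_variable changed_;   // queue contents changed or shutdown requested
    std::condition_variable idle_;      // nothing queued and nothing running
    std::priority_queue<Entry, std::vector<Entry>, Later> queue_;
    std::vector<std::thread> workers_;
    std::uint64_t nextSeq_ = 0;
    unsigned running_ = 0;
    bool stopping_ = false;
};
```

The worker count would normally follow the one-thread-per-processor advice; std::thread::hardware_concurrency() can supply a hint, but it may return 0, so a fallback value is needed. Note that no thread in this sketch ever calls a sleep function.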
A good OS does much the same thing for processes and threads as I
just described. Not all OSes are good… And most OSes have a limit on
the number of threads you can have. The only limit on what I just
described is the amount of memory available to your program. Oh yeah,
the engine code winds up with no sleep() calls.
Let’s take on the problem of sending commands to the rendering thread.
This is a common problem in threaded code. You need to ask another
thread to do some work for you, and you can’t continue until it is
done. The easiest, and safest, way that I know of to solve the problem
is based on a queue and a semaphore. The requesting threads put their
requests on the “request” queue and then lock a semaphore. The
rendering thread waits on the “request” queue for rendering requests
and when it finishes the request it sends a reply by unlocking the
semaphore. This technique works if each requesting thread has its own
semaphore and includes it as part of the request.
You have to use a counting mutex like a semaphore to avoid race
conditions where the request gets finished before the requesting
thread waits for the results. By using queues you force the requesting
threads to wait until their work is done. If you sleep you don’t know
if the work is finished before you move on. If you use the sleep(0)
trick the rendering thread may never run. If you actually force your
threads to sleep for longer than 0 you may be wasting processor time
that you could have used.
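A minimal sketch of the request queue plus per-thread semaphore, written in standard C++11 instead of SDL primitives. Semaphore, RequestQueue, RenderRequest, renderLoop and requestAndWait are hypothetical names. The semaphore is built from a mutex and a condition variable so that a post() arriving before the matching wait() is counted rather than lost, which is the race mentioned here.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Counting semaphore built from standard parts (stand-in for an SDL semaphore).
class Semaphore {
public:
    void post() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++count_;
        available_.notify_one();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        available_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
private:
    std::mutex mutex_;
    std::condition_variable available_;
    unsigned count_ = 0;
};

// A request carries the work and the requester's own semaphore for the reply.
struct RenderRequest {
    std::function<void()> work;
    Semaphore* done;
};

class RequestQueue {
public:
    void push(RenderRequest request) {
        std::lock_guard<std::mutex> lock(mutex_);
        requests_.push(std::move(request));
        nonEmpty_.notify_one();
    }
    // Blocks until a request arrives; returns false once closed and drained.
    bool pop(RenderRequest& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        nonEmpty_.wait(lock, [this] { return closed_ || !requests_.empty(); });
        if (requests_.empty()) return false;
        out = std::move(requests_.front());
        requests_.pop();
        return true;
    }
    void close() {
        std::lock_guard<std::mutex> lock(mutex_);
        closed_ = true;
        nonEmpty_.notify_all();
    }
private:
    std::mutex mutex_;
    std::condition_variable nonEmpty_;
    std::queue<RenderRequest> requests_;
    bool closed_ = false;
};

// Rendering thread: wait on the queue, do the work, reply through the semaphore.
void renderLoop(RequestQueue& queue) {
    RenderRequest request{};
    while (queue.pop(request)) {
        assert(request.done != nullptr);
        request.work();
        request.done->post();
    }
}

// Requesting thread: enqueue the request, then block until it has been done.
void requestAndWait(RequestQueue& queue, Semaphore& mine, std::function<void()> work) {
    queue.push(RenderRequest{std::move(work), &mine});
    mine.wait();
}
```

When requestAndWait() returns, the work is known to have finished, which a sleep cannot guarantee. A script thread blocked in wait() uses no processor time, so the rendering thread gets to run.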
Also, as more and more worker threads wait for results, the odds of the
rendering thread actually getting to do something get higher and
higher. When all the requesting threads are blocked the rendering
thread is the only one left that can run and so it will run. Using sleep
there is no way to be sure the rendering thread will ever run.