Question about threading

jeroen_clarysse1 · April 24, 2013, 7:28am

Hi all,

I’m working out (or at least trying to) how to split up my app (a development studio for psychology experiments) into multiple threads. Right now, my app does (mainly) the following things:

poll static events. These are events at a fixed time, so a simple clock comparison is all that is needed
poll dynamic events. These are events that are caused by user input such as keypresses, mouseclicks, mouse movement
poll external equipment (via parallel port or Data Acquisition Cards)
poll streams (via Data Acquisition Cards). These are streams of numbers at 1 kHz (buffered by the hardware)
do some simple calculations on these streams (for instance detect if a stream exceeds a threshold, or calculate the sum of two streams into a new stream)
draw stuff on the screen (at low speed. Usually there are only 5 to 10 screen updates per ‘trial’, and one program execution has typically 50-100 trials. However, we want these updates to happen EXACTLY when we expect them)
draw a graph of the active streams on the 2nd monitor
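
A minimal sketch of the cheapest items in the list above (fixed-time events, threshold detection, summing two streams), written in Python purely to illustrate the logic. The names `StaticEvent`, `poll_static_events`, `exceeds_threshold` and `sum_streams` are hypothetical and not taken from the app; the point is only that each of these is a handful of comparisons or additions per millisecond.

```python
from dataclasses import dataclass


@dataclass
class StaticEvent:
    """An event that fires once at a fixed time (hypothetical structure)."""
    due_ms: int
    name: str
    fired: bool = False


def poll_static_events(events, now_ms):
    """Return the names of events whose time has come, marking them as fired."""
    due = []
    for event in events:
        if not event.fired and now_ms >= event.due_ms:
            event.fired = True
            due.append(event.name)
    return due


def exceeds_threshold(samples, threshold):
    """True if any sample in this batch is above the threshold."""
    return any(sample > threshold for sample in samples)


def sum_streams(stream_a, stream_b):
    """Combine two equally long batches, sample by sample, into a new stream."""
    return [a + b for a, b in zip(stream_a, stream_b)]
```

Called once per clock tick, `poll_static_events(events, now_ms)` returns whatever became due since the previous tick, and the two stream helpers operate on whatever batch the hardware buffer delivered.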

that more or less covers the important parts. Right now, version 4 of the software does all of this without multithreading, which is kind of a shame since all PCs nowadays have multiple cores. So I would like to optimize this. My question is : can you guys give me some tips on what to optimize, and what not ?

for instance, I could split the drawing of the streams from the rest of the app. But is that feasible ? Can one thread draw on one monitor, and another thread on a second monitor ?

Another possibility is to poll equipment on a 2nd thread, and simply set booleans on/off whenever equipment-related events are to be executed… would that be a speed benefit ? Or is the overhead of multithreading greater than the speed gain here ?

any advice would be very welcome !

thanks

Gabriele_Greco · April 24, 2013, 11:15am

that more or less covers the important parts. Right now, version 4 of the
software does all of this without multithreading, which is kind of a shame
since all PCs nowadays have multiple cores. So I would like to optimize
this. My question is : can you guys give me some tips on what to optimize,
and what not ?

If you don’t have performance problems, DO NOT optimize. You’ll need heavy
refactoring to parallelize your project, and if you are not familiar with
threads, mutexes, and which APIs are reentrant and which are not, you may run
into several debugging nightmares.

Also, if you refresh the screen at such a low frequency, you probably use only
a few percent of a single CPU, so there is no need at all to
parallelize. Parallelizing your app can even slow it down on single-core
machines (which are probably the ones where your app could have problems…)

If you have performance problems on modern multicore machines, move the
external event processing to parallel threads, do your event processing in
those threads, and feed the results to the main thread through
SDL_USEREVENT event(s).
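
The hand-off pattern suggested here can be sketched without SDL. The example below is a stand-in, not SDL code: a thread-safe `queue.Queue` plays the role that the SDL event queue (filled with `SDL_PushEvent`) would play in the real app. The `read_batch` callback and the message tuples are hypothetical. The worker does the number crunching and posts only small result messages, and the main thread drains them without blocking.

```python
import queue
import threading


def acquisition_worker(read_batch, threshold, out_queue, batches):
    """Runs on a worker thread: reduce raw samples to small result messages."""
    for index in range(batches):
        samples = read_batch(index)            # hypothetical device read
        peak = max(samples)
        if peak > threshold:
            out_queue.put(("threshold_exceeded", index, peak))
    out_queue.put(("done", batches, None))     # tell the main thread we finished


def drain_results(out_queue):
    """Runs on the main thread: collect pending messages without blocking."""
    messages = []
    while True:
        try:
            messages.append(out_queue.get_nowait())
        except queue.Empty:
            return messages
```

The main loop would call `drain_results` once per iteration, next to its normal event polling, so all state changes and all drawing still happen on one thread.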

for instance, I could split the drawing of the streams from the rest of
the app. But is that feasible ? Can one thread draw on one monitor, and
another thread on a second monitor ?

AFAIK rendering should be done in the thread that created the window, so
usually the main one. I’ve not yet tried a multithreaded app with 2 or
more SDL2 windows, so I cannot tell you for sure whether you can handle the
rendering in separate threads if each window was created on its own thread.

Another possibility is to poll equipment on a 2nd thread, and simply set
booleans on/off whenever equipment-related events are to be executed…
would that be a speed benefit ? Or is the overhead of multithreading
greater than the speed gain here ?

If your external inputs are simple booleans there is no gain at all, since
you’ll probably need to add mutex logic that flushes caches and slows down
your app more than what you gain.

Bye,
Gabry

jeroen_clarysse1 · April 24, 2013, 11:32am

you’re probably right… the only thing I can see as a useful parallelisation target is the streams: this is data that is fed into the PC via an analog-to-digital card at a rate of 1 sample per millisecond (so 1 kHz). Sometimes we have 4 or 5 devices, so that is up to 5,000 samples per second coming in. I think that the device drivers are highly threaded already and will use idle CPU cores if any are available. So the only case left for me is when I want to do real-time number crunching on that incoming stream of data.

my app does NOT have a wait_event() call: it only has a get_clock() call in which it checks whether one millisecond has passed since the previous call. If so, all responses are checked and incoming stream data is graphed offscreen (blitted onscreen after a refresh). The fact that my app calls get_clock() every millisecond gives the impression that the CPU is heavily taxed (in Windows, the task manager will show 100% for the core that the thread/app is running on). This makes it difficult for me to estimate whether the app would benefit from parallelisation.
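
One way to see how much CPU the per-millisecond work really needs is to sleep until the next tick instead of spinning on the clock. The sketch below assumes a hypothetical `tick` callback that does the per-millisecond work; the `clock` and `sleep` parameters exist only so the loop can be tested. Sleep granularity depends on the OS and may well be coarser than 1 ms, so this trades timing precision for an honest CPU reading and is meant as a measurement aid rather than a drop-in replacement.

```python
import time


def run_paced(tick, period_s, ticks, clock=time.monotonic, sleep=time.sleep):
    """Call tick(n) once per period, sleeping (not spinning) between calls."""
    start = clock()
    for n in range(ticks):
        tick(n)
        # Schedule against the start time so small delays do not accumulate.
        deadline = start + (n + 1) * period_s
        remaining = deadline - clock()
        if remaining > 0:
            sleep(remaining)
```

With `run_paced(do_work, 0.001, 1000)` the task manager figure reflects the work done in `do_work` rather than the busy loop around it.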

thank you for the interesting input !

icculus · April 24, 2013, 4:29pm

I’m working out (or at least trying to) the schematics on how to split
up my app (a development studio for psychology experiments) in multiple
threads. Right now, my app does (mainly) the following things :

You’re probably about to get a flood of replies that say this, but I’ll
throw one out there, too: unless you absolutely need threads, don’t add
them.

The app you describe sounds like it can handle all this on one core with
CPU time to spare, so if you multithread this all you’ll do is spread it
between more cores that’ll also be mostly idle, but you’ll introduce
subtle bugs and make the code harder to maintain. It really sounds like
this workload would spend most of its time waiting on mutexes between threads.

Threads are disproportionately hard to do well, and shouldn’t be used
unless you have a workload that not only maxes out a CPU, but would also
max out more cores if you split it up.

–ryan.

David_Olofson · April 24, 2013, 5:01pm

As others have pointed out, multithreading as a performance hack is a
nasty, complicated one, only to be used when you really need the raw
power of multiple cores.

You may want to use high priority threads for picking up and
timestamping input events, if you need better timing accuracy than
"one video frame," but if your event sources can provide timestamps
(from the drivers or even better, based on hardware timers), there’s
no need for that. Again, multithreading complicates things - and
unless you’re on a realtime OS, it doesn’t offer all that reliable
timing data anyway. (You should get down to milliseconds on average,
but there will most likely be occasional latency spikes in the order
of tens of milliseconds, at best.)

Also note that there’s no way, with standard video subsystems, to
update the display with exact timing. There’s a display refresh rate
(typically 60 Hz), and you have exactly one chance to update the
display with each refresh. That’s all the accuracy you will get
without custom hardware. That said, you may be able to raise the
refresh rate to somewhere between 100 and 200 Hz with a modern gaming
display - and these typically have much quicker response times from
input to actually displaying the changes as well.
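
To put numbers on the refresh-rate limit described in this reply: the sketch below computes the frame period and the earliest refresh at or after a requested onset time. It assumes an idealised display that refreshes at a perfectly regular cadence starting at a known time (`first_refresh_ms`, a hypothetical parameter); real hardware adds its own latency on top.

```python
import math


def frame_period_ms(refresh_hz):
    """Time between two display refreshes, in milliseconds."""
    return 1000.0 / refresh_hz


def next_refresh_ms(requested_ms, refresh_hz, first_refresh_ms=0.0):
    """Earliest refresh time at or after requested_ms."""
    period = frame_period_ms(refresh_hz)
    frames = math.ceil((requested_ms - first_refresh_ms) / period)
    return first_refresh_ms + max(frames, 0) * period
```

At 60 Hz the period is about 16.7 ms, so under these assumptions a stimulus requested at 20 ms cannot appear before roughly 33.3 ms; at 100 Hz the same request would land at 20 ms exactly.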

–
//David Olofson - Consultant, Developer, Artist, Open Source Advocate

.— Games, examples, libraries, scripting, sound, music, graphics —.
| http://consulting.olofson.net http://olofsonarcade.com |
’---------------------------------------------------------------------’

Sik_the_hedgehog · April 24, 2013, 5:03pm

2013/4/24, Ryan C. Gordon :

Threads are disproportionately hard to do well, and shouldn’t be used
unless you have a workload that not only maxes out a CPU, but would also
max out more cores if you split it up.

Or you don’t want one heavy long operation to make the program
unresponsive (which is what I do in my game with loading screens by
having one thread for loading and one for dealing with the loading
screen itself - even on single core this guarantees the interface to
remain responsive). SDL itself does something similar too with the
audio subsystem if I recall correctly (with playback happening in its
own thread for timing reasons).
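
The loading-screen arrangement described above can be sketched as follows. This is an illustration in plain Python, not the poster's game code: `load_steps` is a hypothetical list of slow, blocking callables and `draw_loading_screen` a hypothetical redraw callback. The loader runs on a worker thread while the calling thread keeps redrawing until the worker finishes.

```python
import threading


def load_with_progress(load_steps, draw_loading_screen, poll_s=0.01):
    """Run load_steps on a worker while the calling thread keeps redrawing."""
    results = []

    def worker():
        for step in load_steps:
            results.append(step())     # each step is a slow, blocking call

    loader = threading.Thread(target=worker)
    loader.start()
    frames = 0
    while loader.is_alive():
        draw_loading_screen(len(results), len(load_steps))
        frames += 1
        loader.join(timeout=poll_s)    # wait briefly, then redraw again
    loader.join()
    return results, frames
```

The gain here is responsiveness rather than throughput: the total loading time is unchanged, but the interface is redrawn while it happens.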

Of course this situation seems to be neither case, so yeah… Going
multithreaded could even make the program slower.

jeroen_clarysse1 · April 24, 2013, 7:47pm

thank you all for your feedback !

I have loads of other questions now, but as far as multithreading goes, I will stick to one thread for the moment!