Just to be clear: when I said that accessing hardware is sequential, I meant that accessing the interface to the hardware is sequential (order is VITAL when using OGL). Bob pointed this out, but didn’t say it explicitly. Again, this goes back to what I was saying about most people doing drawing from a single thread.
Okay. In order to implement fully parallelized access to a video surface:
1. Add support for a list of mutexes to the surface.
2. For every drawing operation:
   a. Test that the area being updated is NOT locked by checking EVERY mutex in the list of mutexes.
      i. If a lock is found, sleep, busy-wait or whatever and try again from 2.a.
      ii. If no lock is found, try to acquire the list-mutex.
         1. If you cannot get the list-mutex, sleep, busy-wait or whatever and return to 2.a (yes, you must recheck all mutexes).
         2. Now that the list-mutex is acquired, add a mutex for the region being updated, then release the list-mutex.
   b. Do the blit.
   c. Try to acquire the list-mutex.
      i. If unable to acquire the list-mutex, sleep, busy-wait or whatever and return to 2.c.
      ii. Now that you have the list-mutex, remove the region's mutex from the list, then release the list-mutex.
   d. Return from the drawing function.
Now, granted, this is a scheme I knocked up in 10 minutes, and there are a number of techniques available to improve average performance and reduce the worst case, but you’re still going to get a bottleneck around the list-mutex (you just can’t avoid it!).
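To make the scheme concrete, here is a minimal sketch in Python (chosen for brevity; it is not SDL code). The names `Surface`, `overlaps` and `blit` are made up for illustration, and the "surface" is just a list of lists. It also simplifies the steps above in two ways: the overlap check is done while holding the list lock, and a condition variable replaces the sleep/busy-wait retry loop.

```python
import threading


class Surface:
    """Hypothetical surface: a grid of pixel values plus a list of locked regions."""

    def __init__(self, width, height):
        self.pixels = [[0] * width for _ in range(height)]
        self.locked_regions = []                 # step 1: the list of region locks
        self.list_lock = threading.Condition()   # the list-mutex guarding that list


def overlaps(a, b):
    """Return True if rectangles a and b, each (x, y, w, h), share any pixel."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def blit(surface, rect, value):
    """Fill rect with value, excluding other threads from overlapping regions."""
    # Step 2.a: wait until no locked region overlaps ours, then register ours.
    with surface.list_lock:
        while any(overlaps(rect, r) for r in surface.locked_regions):
            surface.list_lock.wait()
        surface.locked_regions.append(rect)

    # Step 2.b: do the blit without holding the list lock, so that
    # non-overlapping blits can proceed at the same time.
    x, y, w, h = rect
    for row in range(y, y + h):
        for col in range(x, x + w):
            surface.pixels[row][col] = value

    # Step 2.c: take the list lock again, remove our region, wake any waiters.
    with surface.list_lock:
        surface.locked_regions.remove(rect)
        surface.list_lock.notify_all()
```

Each thread would call something like `blit(surface, (0, 0, 4, 4), 1)`. Every call passes through `list_lock` twice, which is exactly where the bottleneck shows up.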
Compare that to the sequential case where you’ve told the user that they MUST only draw from one thread (i.e., you’ve told them it’s not thread safe):
- Do blit.
The parallel scheme has two problems:
a. The amount of memory and processing overhead involved in implementing it (not to mention the code and debugging!).
b. You only get the most benefit out of it when your updates to the drawing surface are small, and therefore quite quick anyway.
This reinforces the point I made about “experienced developers” just doing drawing from a single thread.
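For comparison, one common way to keep all drawing in a single thread is to have the other threads hand their requests to it through a queue. This is a rough sketch in Python under that assumption; the queue, the command tuple layout and the names `drawing_thread` and `run_demo` are illustrative and not part of SDL.

```python
import queue
import threading


def drawing_thread(surface, commands):
    """The only thread that ever touches surface; stops when it receives None."""
    while True:
        cmd = commands.get()
        if cmd is None:
            break
        x, y, w, h, value = cmd
        for row in range(y, y + h):
            for col in range(x, x + w):
                surface[row][col] = value


def run_demo():
    """Two producer threads request fills; one drawing thread performs them."""
    surface = [[0] * 8 for _ in range(8)]
    commands = queue.Queue()  # thread-safe hand-off, no region locks needed

    drawer = threading.Thread(target=drawing_thread, args=(surface, commands))
    drawer.start()

    requests = [(0, 0, 4, 8, 1), (4, 0, 4, 8, 2)]  # left half, right half
    producers = [
        threading.Thread(target=commands.put, args=(req,)) for req in requests
    ]
    for p in producers:
        p.start()
    for p in producers:
        p.join()

    commands.put(None)  # tell the drawing thread to finish
    drawer.join()
    return surface
```

The surface itself needs no locking here because only `drawing_thread` writes to it; the only shared object is the queue.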
From: email@example.com [sdl-bounces at lists.libsdl.org] On Behalf Of Bob Pendleton [bob at pendleton.com]
Sent: 07 August 2008 14:16
To: A list for developers using the SDL library. (includes SDL-announce)
Subject: Re: [SDL] multithreading & video
On Wed, Aug 6, 2008 at 11:35 AM, Albert Zeyer <albert.zeyer at rwth-aachen.de> wrote:
On Wed, 2008-08-06 at 15:57 +0100, cullen e.a. (eac203) wrote:
I said “drawing in a single thread”.
Fundamentally, graphics hardware is still single-threaded - with
OpenGL, you’re adding drawing operations to a list and that list is
executed as quickly as possible. Without OpenGL, all you have is flat
memory. If you’re updating flat memory, the most efficient way to do
that is with a single DMA transfer. (Yes, I know this is a gross
over-simplification, but it is close enough to the truth…) So,
you’re not actually going to improve drawing performance by
threading, because the most efficient way to draw is by using a single
OK, that explains at least the case when OGL is used. Though for pure
software surfaces, I still don’t understand why it should not work.
Indeed, it might work for software surfaces. You would need to study the code to find out whether it is thread safe. If it is not, you could take on the project of making it thread safe. As you have implied, it could be possible to do soft blits in separate threads.
Btw., that still does not explain why the graphic stuff can only be done
from the main thread and not from another thread. Why is it important to
do it really from the main thread?
Access to hardware acceleration is provided by OpenGL. An OGL context is usually, or at least often (it depends on the OS), bound to a single thread, and that thread is the one that created the context. The SDL 1.x libraries create the context during initialization, so the context is bound to the initial thread. In 2.0 you will be able to create multiple OGL contexts and may be able (I cannot guarantee it) to do OGL calls in multiple threads. But they will likely have to be attached to separate windows.
Hardware acceleration is highly parallelized. But that does not mean you can process multiple calls using the same OGL context at once. It means that processing OGL primitives can be parallelized to an astonishing degree. For example, if you send a large triangle strip to OGL, the hardware can transform dozens (or hundreds) of vertices at the same time. And once at least three points are transformed, the hardware can rasterize dozens (or hundreds) of triangles at the same time. Primitive rendering can be highly parallelized.
OTOH, if two commands can be carried out at once, the hardware/OGL may try to carry them out in parallel. It all depends on the hardware and the driver. But not all OGL commands can be run in parallel. Many graphics effects require that a sequence of primitives be rendered in the order they were sent to OGL. Think about alpha blending and depth cueing: rendering primitives out of order can cause weird visual effects.
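A small worked example of why order matters for alpha blending, written in Python purely as arithmetic (no OGL calls). It assumes the usual "source over" blend, result = alpha * src + (1 - alpha) * dst; the helper name `over` and the colours are made up for illustration.

```python
def over(src, alpha, dst):
    """Blend colour src with opacity alpha over the existing colour dst."""
    return tuple(alpha * s + (1.0 - alpha) * d for s, d in zip(src, dst))


RED = (1.0, 0.0, 0.0)
BLUE = (0.0, 0.0, 1.0)
BLACK = (0.0, 0.0, 0.0)

# Draw half-transparent red first, then half-transparent blue on top.
red_then_blue = over(BLUE, 0.5, over(RED, 0.5, BLACK))

# Same two primitives, submitted in the opposite order.
blue_then_red = over(RED, 0.5, over(BLUE, 0.5, BLACK))

print(red_then_blue)  # (0.25, 0.0, 0.5)
print(blue_then_red)  # (0.5, 0.0, 0.25)
```

The two results differ, so a driver that reordered these two primitives would change the final pixel colour.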
It is wrong to lump all SDL graphics operations together. Hardware graphics has one set of problems, software graphics has another.
BTW, the easiest way to do two blits as fast as they can be done is to use hardware surfaces and call OGL to do the blits. Even though you send the blit commands sequentially, they may be done in parallel. They may be done in parallel on a large number of processors. Once the first blit is started the hardware will try to start the second blit. (You do have to flush the OGL command queue to make sure they get started.)