SDL_LockAudio and MacOS

david_at_ultramaster · April 19, 2001, 4:04pm

Hi List.

This is my first post to the list. Hi all.

My question is about the audio callback in MacOS. My understanding is
that this runs in a hardware interrupt context, or something just as
ridiculous.

I’ve looked at the SDL audio source code, and it seems as though the
SDL_LockAudio and SDL_UnlockAudio are no-ops on MacOS. Thus there’s no way
to safely manipulate any data structures that may be accessed both by the
‘event’ context and the ‘audio callback’ context.

I’ve read a bit about the MacOS sound interface, and it seems that there
are two channel commands that may suffice, pauseCmd and resumeCmd. Does
anyone out there know whether the pauseCmd makes the following
guarantees:

a) no callback is running (should be by definition since there are no
threads)

b) no callback will be run until the resumeCmd is sent.

If so, I could add this functionality to the SDL library, and test it of
course…

I also noticed a comment about implementing some PPC interrupt asm, in
SDL_OpenAudio, which I assume means some sort of ‘cli / sti’ (intelism)
type of thing. Is this in the works?

David

–
David Mansfield (718) 963-2020
@david_at_ultramaster
Ultramaster Group, LLC www.ultramaster.com

Darrell_Walisser · April 19, 2001, 4:46pm

Hi List.

This is my first post to the list. Hi all.

I’ve looked at the SDL audio source code, and it seems as though the
SDL_LockAudio and SDL_UnlockAudio are no-ops on MacOS. Thus there’s no way
to safely manipulate any data structures that may be accessed both by the
’event’ context and the ‘audio callback’ context.

Well, since the audio is copied (audio callback is invoked here) to the
sound chip in the interrupt (which cannot be interrupted by the main
thread), there is no need to lock the audio. It is impossible for your
code and the interrupt code to execute concurrently. It is possible,
however, that you will be interrupted before you finish processing the
buffers, in which case you will get distortion.

The problem is that once we enter the audio loop, we must keep pumping
data to the chip on every interrupt.

It may be possible to implement a pseudo-interrupt mechanism where we copy
silence to the chip if the audio is locked, which would be very easy to
implement.

On Thu, 19 Apr 2001 david at ultramaster.com wrote:

I’ve read a bit about the MacOS sound interface, and it seems that there
are two channel commands that may suffice, pauseCmd and resumeCmd. Does
anyone out there know whether the pauseCmd makes the following
guarantees:

a) no callback is running (should be by definition since there are no
threads)

b) no callback will be run until the resumeCmd is sent.

If so, I could add this functionality to the SDL library, and test it of
course…

I also noticed a comment about implementing some PPC interrupt asm, in
SDL_OpenAudio, which I assume means some sort of ‘cli / sti’ (intelism)
type of thing. Is this in the works?

David

–
David Mansfield (718) 963-2020
david at ultramaster.com
Ultramaster Group, LLC www.ultramaster.com

david_at_ultramaster · April 19, 2001, 7:08pm

I’ve looked at the SDL audio source code, and it seems as though the
SDL_LockAudio and SDL_UnlockAudio are no-ops on MacOS. Thus there’s no way
to safely manipulate any data structures that may be accessed both by the
’event’ context and the ‘audio callback’ context.

Well, since the audio is copied (audio callback is invoked here) to the
sound chip in the interrupt (which cannot be interrupted by the main
thread), there is no need to lock the audio. It is impossible for your
code and the interrupt code to execute concurrently. It is possible,
however, that you will be interrupted before you finish processing the
buffers, in which case you will get distortion.

No. What I mean is that the interrupt will ‘interrupt’ my code. Imagine
if there is some sort of linked list that needs to be walked to generate
the next buffer of sound.

In normal operation, the interrupt occurs, pre-empts the ‘main thread’,
and calls the callback. The callback walks the list of objects that need
to produce sound, and fills the buffer then returns.

Ok no problem. But now if the ‘main thread’ is in the middle of updating
that linked list (adding or removing a node, let’s say), the list can be
in an inconsistent state at the exact moment the interrupt occurs. Then
the callback could end up crashing on an invalid memory access, due to a
corrupted list.

It’s important to be able to guarantee an atomic state change from A to B.
That state change is made up of a bunch of non-atomic asm instructions.
In order to guarantee that the callback sees either state A or state B but
not some in-between mangled state, we need to be able to block the
callback once the state change begins and then re-enable once we reach B.

The problem is that once we enter the audio loop, we must keep pumping
data to the chip on every interrupt.

Understood. Of course one method is to produce audio buffers in the main
thread and have the audio callback just copy to the stream and return.
Again, it seems difficult to manage the audio pointers without some
locking primitives.

The other method is to have the ‘audio engine’ run in the interrupt
context, which is fine, but it’s hard to imagine anything other than a
trivial ‘audio engine’ that doesn’t have some sort of locking needs.

It may be possible to implement a pseudo-interrupt mechanism where we copy
silence to the chip if the audio is locked, which would be very easy to
implement.

And result in clicks/blank spaces. Unacceptable. I’m talking about
blocking the audio interrupt for microseconds at a time, while, for
example, a linked-list operation is completed.

David

–

David Mansfield (718) 963-2020
@david_at_ultramaster
Ultramaster Group, LLC www.ultramaster.com

Darrell_Walisser · April 19, 2001, 9:36pm

The other method is to have the ‘audio engine’ run in the interrupt
context, which is fine, but it’s hard to imagine anything other than a
trivial ‘audio engine’ that doesn’t have some sort of locking needs.

Somehow the Mac port of OpenAL manages to work without delaying the
interrupt.

It may be possible to implement a pseudo-interrupt mechanism where we copy
silence to the chip if the audio is locked, which would be very easy to
implement.

And result in clicks/blank spaces. Unacceptable. I’m talking about
blocking the audio interrupt for microseconds at a time, while, for
example, a link list operation is completed.

Unfortunately I don’t think there is any way to implement it. The
interrupt is generated by the audio hardware whenever one of its buffers
is finished playing, like clockwork (you could leverage this fact to
avoid race conditions perhaps). There is no easy way around this in
OS 9 or earlier systems; you cannot reconfigure the hardware interrupt.

BUT on MacOS X (thank god) you can most likely put mutex locks around
your critical sections of the audio callback and audio engine.

In order to deal with the locking problem (which I am sure many MacOS
games have done), you will have to change the way things are done
compared to a preemptively threaded audio system. You might find some
ideas in the “asynchronous sound” article at iDevGames.com.

David_Olofson · April 21, 2001, 5:41am

Why not? There are lock-free solutions for most kinds of sync constructs,
although most of them are rather messy, ugly and/or awkward to code. Single
reader-single writer constructs are rather simple, though, as long as you
have atomic reads and/or atomic writes for some usable word size, which is
the case with most architectures, even on SMP machines.

Try passing info (pointers to ready buffers) over a lock-free FIFO, for
example… If you want it rock solid and/or want to reduce the number of
buffers floating around, use another FIFO so the callback/ISR can return
buffers when they’re not needed any more.
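For anyone who wants to see the shape of the idea, here is a minimal sketch of a single-reader/single-writer FIFO that passes buffer pointers. It is not the sfifo code mentioned in this thread. The names (`ptr_fifo`, `ptr_fifo_push`, `ptr_fifo_pop`, `FIFO_SLOTS`) are made up for illustration, and it assumes a compiler with C11 `<stdatomic.h>`, which did not exist when this thread was written. One side only ever stores to `head`, the other only to `tail`, and each index is published after the slot has been written or read.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define FIFO_SLOTS 8u  /* hypothetical capacity; must be a power of two */
static_assert((FIFO_SLOTS & (FIFO_SLOTS - 1u)) == 0, "FIFO_SLOTS must be a power of two");

typedef struct {
    void *slot[FIFO_SLOTS];
    atomic_uint head;  /* next slot to write; only the producer stores to it */
    atomic_uint tail;  /* next slot to read; only the consumer stores to it */
} ptr_fifo;

static void ptr_fifo_init(ptr_fifo *f)
{
    atomic_init(&f->head, 0u);
    atomic_init(&f->tail, 0u);
}

/* Producer side (e.g. main loop). Returns 1 on success, 0 if the FIFO is full. */
static int ptr_fifo_push(ptr_fifo *f, void *buf)
{
    unsigned head = atomic_load_explicit(&f->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&f->tail, memory_order_acquire);
    if (head - tail == FIFO_SLOTS)
        return 0;
    f->slot[head & (FIFO_SLOTS - 1u)] = buf;
    /* Publish the new index only after the slot itself has been written. */
    atomic_store_explicit(&f->head, head + 1u, memory_order_release);
    return 1;
}

/* Consumer side (e.g. audio callback). Returns the oldest pointer, or NULL if empty. */
static void *ptr_fifo_pop(ptr_fifo *f)
{
    unsigned tail = atomic_load_explicit(&f->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&f->head, memory_order_acquire);
    if (head == tail)
        return NULL;
    void *buf = f->slot[tail & (FIFO_SLOTS - 1u)];
    /* Release the slot only after its contents have been read. */
    atomic_store_explicit(&f->tail, tail + 1u, memory_order_release);
    return buf;
}
```

A second `ptr_fifo` running in the opposite direction would let the callback hand finished buffers back, as suggested above.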

I have a FIFO that I use for all sorts of things on various platforms, and it
should work on PPC as well, I think. I just uploaded a stripped version (no
Linux kernel driver versions of sfifo_read/write()) to the usual place:

http://www.angelfire.com/ar/agc/download/

(Do run the test program [make test] first, just in case! It could
potentially break if the compiler does things differently, but the only
requirement is that buffer offsets are updated after data is read or
written, so I can’t see how…)

//David

.- M A I A -------------------------------------------------.
| Multimedia Application Integration Architecture |
| A Free/Open Source Plugin API for Professional Multimedia |
----------------------> http://www.linuxaudiodev.com/maia -'
.- David Olofson -------------------------------------------.
| Audio Hacker - Open Source Advocate - Singer - Songwriter |
--------------------------------------> david at linuxdj.com -'

On Thursday 19 April 2001 23:36, Darrell Walisser wrote:

And result in clicks/blank spaces. Unacceptable. I’m talking about
blocking the audio interrupt for microseconds at a time, while, for
example, a link list operation is completed.

Unfortunately I don’t think there is any way to implement it.

Darrell_Walisser · April 21, 2001, 5:14pm

And result in clicks/blank spaces. Unacceptable. I’m talking about
blocking the audio interrupt for microseconds at a time, while, for
example, a link list operation is completed.

Unfortunately I don’t think there is any way to implement it.

Why not? There are lock-free solutions for most kinds of sync constructs,
although most of them are rather messy, ugly and/or awkward to code. Single
reader-single writer constructs are rather simple, though, as long as you
have atomic reads and/or atomic writes for some usable word size, which is
the case with most architectures, even on SMP machines.

I’ve thought about this problem a bit, and I think I have a solution:

Modify the Carbon sound driver as follows:

make SDL_LockAudio set a flag audio_locked to true.

in the callback, if the audio is locked, don’t send any new data to the
chip, and stop the callback cycle. In other words, do nothing in this
case.

make SDL_UnlockAudio send data to the chip and also send the interrupt
command to start the callback cycle over again.

As long as you can process all of your data between SDL_LockAudio() and
SDL_UnlockAudio() faster than the current sound buffer takes to play, you
shouldn’t get any clicks or have synchronization problems.
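The three steps above can be sketched roughly as follows. This is a host-side simulation, not Sound Manager or SDL code. `audio_interrupt`, `lock_audio`, `unlock_audio`, `send_next_buffer`, and the extra `callback_skipped` flag are all hypothetical stand-ins, and `buffers_sent` merely counts what would have gone to the chip. Whether this holds up against real interrupt timing is not something the sketch proves.

```c
#include <assert.h>
#include <signal.h>

static volatile sig_atomic_t audio_locked;      /* set by the main code */
static volatile sig_atomic_t callback_skipped;  /* set by the callback when it does nothing */
static int buffers_sent;                        /* stand-in for "data sent to the chip" */

/* Hypothetical stand-in for queueing one buffer plus the next callback command. */
static void send_next_buffer(void)
{
    buffers_sent++;
}

/* What the interrupt-time callback would do under this proposal. */
static void audio_interrupt(void)
{
    if (audio_locked) {
        callback_skipped = 1;  /* do nothing; the callback cycle stops here */
        return;
    }
    send_next_buffer();
}

static void lock_audio(void)
{
    audio_locked = 1;
}

static void unlock_audio(void)
{
    assert(audio_locked);      /* unlocking without a prior lock is a caller bug */
    audio_locked = 0;
    if (callback_skipped) {    /* restart the cycle the callback abandoned */
        callback_skipped = 0;
        send_next_buffer();
    }
}
```

In this sketch an interrupt that arrives while locked sends nothing, and the matching `unlock_audio()` sends exactly one buffer to make up for it.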

What do you think?

Regards,
Darrell

On Sat, 21 Apr 2001, David Olofson wrote:

On Thursday 19 April 2001 23:36, Darrell Walisser wrote:

slouken · April 21, 2001, 7:25pm

I’ve thought about this problem a bit, and I think I have a solution:

Modify the Carbon sound driver as follows:

make SDL_LockAudio set a flag audio_locked to true.

in the callback, if the audio is locked, don’t send any new data to the
chip, and stop the callback cycle. In other words, do nothing in this
case.

make SDL_UnlockAudio send data to the chip and also send the interrupt
command to start the callback cycle over again.

Do you have a working patch?

See ya!
-Sam Lantinga, Lead Programmer, Loki Entertainment Software

david_at_ultramaster · April 23, 2001, 5:17am

And result in clicks/blank spaces. Unacceptable. I’m talking about
blocking the audio interrupt for microseconds at a time, while, for
example, a link list operation is completed.

Unfortunately I don’t think there is any way to implement it.

Why not? There are lock-free solutions for most kinds of sync constructs,
although most of them are rather messy, ugly and/or awkward to code. Single
reader-single writer constructs are rather simple, though, as long as you
have atomic reads and/or atomic writes for some usable word size, which is
the case with most architectures, even on SMP machines.

You only have to worry about the speculative reads and writes a CPU may
do. A CPU is allowed to re-order execution, and to speculatively read a
value (prefetch) before it’s used. I think this has been discussed a lot
on the Linux Kernel mailing list. I’m not really a CPU expert, but I
understand that reads and writes don’t always take place as expected
without the explicit placement of a ‘memory barrier’, which can be either
a read or a write barrier.

That said (FYI really), I don’t see it being an issue with your code
really (haven’t looked really hard). The real problem is that I’d really
like some of the other locking semantics that usually come along with
locks, such as wake-ups. Your FIFOs are always non-blocking. It would
be nice to not have to wait in a busy loop for the interrupt to occur.

After a full audio fragment has been processed (created) you want to wait
without spinning for the next audio interrupt, and pass the buffer to the
driver at that point. I suppose a 'while (!interrupted) usleep(1000);'
type of loop is OK, since a fragment will generally be on the order of a
few milliseconds long.

Try passing info (pointers to ready buffers) over a lock-free FIFO, for
example… If you want it rock solid and/or want to reduce the number of
buffers floating around, use another FIFO so the callback/ISR can return
buffers when they’re not needed any more.

It would be nice to block here waiting for the free buffer to come out of
the FIFO.

I have a FIFO that I use for all sorts of things on various platforms, and it
should work on PPC as well, I think. I just uploaded a stripped version (no
Linux kernel driver versions of sfifo_read/write()) to the usual place:

http://www.angelfire.com/ar/agc/download/

(Do run the test program [make test] first, just in case! It could
potentially break if the compiler does things differently, but the only
requirement is that buffer offsets are updated after data is read or
written, so I can’t see how…)

I looked briefly at your code. I think there’s a slight bug. In
sfifo_init you round f->size up to the next power of two at or above the
size passed in, using the following loop:

for(; f->size < size; f->size <<= 1);

And yet the actual malloc only allocates ‘size’ (not f->size) bytes. Then
you go on to use the entire f->size number of bytes throughout.
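A minimal sketch of the fix, assuming a struct layout along these lines. The type `fifo_t`, its fields, and the functions `fifo_init` and `fifo_close` are invented here for illustration and are not copied from the sfifo source. The point is only that the allocation uses the rounded-up size, since that is the size the rest of the code indexes against.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layout; the real sfifo struct may differ. */
typedef struct {
    char *buffer;
    int size;      /* capacity in bytes, always a power of two */
    int readpos;
    int writepos;
} fifo_t;

/* Returns 0 on success, -1 on a bad size or allocation failure. */
static int fifo_init(fifo_t *f, int size)
{
    assert(f != NULL);
    if (size <= 0 || size > (1 << 30))  /* upper bound keeps the shift from overflowing */
        return -1;

    /* Round up to a power of two so offsets can be wrapped with a mask. */
    f->size = 1;
    for (; f->size < size; f->size <<= 1)
        ;

    /* Allocate the rounded size, not the requested one. */
    f->buffer = malloc((size_t)f->size);
    if (f->buffer == NULL)
        return -1;

    f->readpos = f->writepos = 0;
    return 0;
}

static void fifo_close(fifo_t *f)
{
    free(f->buffer);
    f->buffer = NULL;
}
```

With this version a request for 1000 bytes yields a 1024-byte buffer, and the last byte at `f->size - 1` is inside the allocation.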

Looks nice though, generally. Too bad there’s no portable way to do
non-racy sleep/wakeup stuff (which of course would need locking

David (also)

On Thursday 19 April 2001 23:36, Darrell Walisser wrote:

–
David Mansfield (718) 963-2020
@david_at_ultramaster
Ultramaster Group, LLC www.ultramaster.com

David_Olofson · April 23, 2001, 1:31pm

Should work, but keep in mind that you need a totally reliable way of safely
skipping an interrupt without SDL_UnlockAudio() failing to compensate for it.
(Another flag might be enough to take care of that.)

BTW, both the audio callback and SDL_UnlockAudio() could perhaps use the same
code, unless of course, the audio API has to be used differently from outside
the callback. Don’t know enough about Mac OS or Mac OS X to tell, but it’s a
common way of doing it inside drivers for various other platforms.

(As to application code, the only system I’ve programmed that uses real
callbacks from the driver is Win16 - and there, you cannot queue new audio
buffers from within such callbacks! Now, is that useless, or is that
useless…?)

//David

.- M A I A -------------------------------------------------.
| Multimedia Application Integration Architecture |
| A Free/Open Source Plugin API for Professional Multimedia |
----------------------> http://www.linuxaudiodev.com/maia -'
.- David Olofson -------------------------------------------.
| Audio Hacker - Open Source Advocate - Singer - Songwriter |
--------------------------------------> david at linuxdj.com -'

On Saturday 21 April 2001 19:14, Darrell Walisser wrote:

On Sat, 21 Apr 2001, David Olofson wrote:

On Thursday 19 April 2001 23:36, Darrell Walisser wrote:

And result in clicks/blank spaces. Unacceptable. I’m talking about
blocking the audio interrupt for microseconds at a time, while, for
example, a link list operation is completed.

Unfortunately I don’t think there is any way to implement it.

Why not? There are lock-free solutions for most kinds of sync constructs,
although most of them are rather messy, ugly and/or awkward to code.
Single reader-single writer constructs are rather simple, though, as long
as you have atomic reads and/or atomic writes for some usable word size,
which is the case with most architectures, even on SMP machines.

I’ve thought about this problem a bit, and I think I have a solution:

Modify the Carbon sound driver as follows:

make SDL_LockAudio set a flag audio_locked to true.

in the callback, if the audio is locked, don’t send any new data to the
chip, and stop the callback cycle. In other words, do nothing in this
case.

make SDL_UnlockAudio send data to the chip and also send the interrupt
command to start the callback cycle over again.

As long as you can process all of your data between SDL_LockAudio() and
SDL_UnlockAudio() faster than the current sound buffer takes to play, you
shouldn’t get any clicks or have synchronization problems.

What do you think?

Darrell_Walisser · April 23, 2001, 2:48pm

I’ve thought about this problem a bit, and I think I have a solution:

Modify the Carbon sound driver as follows:

make SDL_LockAudio set a flag audio_locked to true.

in the callback, if the audio is locked, don’t send any new data to the
chip, and stop the callback cycle. In other words, do nothing in this
case.

make SDL_UnlockAudio send data to the chip and also send the interrupt
command to start the callback cycle over again.

As long as you can process all of your data between SDL_LockAudio() and
SDL_UnlockAudio() faster than the current sound buffer takes to play, you
shouldn’t get any clicks or have synchronization problems.

What do you think?

Should work, but keep in mind that you need a totally reliable way of safely
skipping an interrupt without SDL_UnlockAudio() failing to compensate for it.
(Another flag might be enough to take care of that.)

Shouldn’t be a problem.

BTW, both the audio callback and SDL_UnlockAudio() could perhaps use the same
code, unless of course, the audio API has to be used differently from outside
the callback. Don’t know enough about Mac OS or Mac OS X to tell, but it’s a
common way of doing it inside drivers for various other platforms.

Sure, I could use the same code in both places. But it’s only 10 lines, so
I don’t know how much difference that really makes (well, easier to
maintain at least).

(As to application code, the only system I’ve programmed that uses real
callbacks from the driver is Win16 - and there, you cannot queue new audio
buffers from within such callbacks! Now, is that useless, or is that
useless…?)

Not a problem. I’ve done it both ways. Actually the PLIB MacOS audio
driver doesn’t send any samples in the interrupt and seems to work OK.

On Mon, 23 Apr 2001, David Olofson wrote:

David_Olofson · April 23, 2001, 2:55pm

And result in clicks/blank spaces. Unacceptable. I’m talking about
blocking the audio interrupt for microseconds at a time, while, for
example, a link list operation is completed.

Unfortunately I don’t think there is any way to implement it.

Why not? There are lock-free solutions for most kinds of sync constructs,
although most of them are rather messy, ugly and/or awkward to code.
Single reader-single writer constructs are rather simple, though, as long
as you have atomic reads and/or atomic writes for some usable word size,
which is the case with most architectures, even on SMP machines.

You only have to worry about the speculative reads and writes a CPU may
do. A CPU is allowed to re-order execution, and to speculatively read a
value (prefetch) before it’s used. I think this has been discussed a lot
on the Linux Kernel mailing list. I’m not really a CPU expert, but I
understand that reads and writes don’t always take place as expected
without the explicit placement of a ‘memory barrier’, which can be either
a read or a write barrier.

That’s quite enough to worry about, actually.

That said (FYI really), I don’t see it being an issue with your code
really (haven’t looked really hard).

The only problem would be if the read/write offset write is somehow done
before the last memcpy() is completely finished. If the code is very
carefully optimized by the compiler, this could probably happen on a next
generation CPU with an extremely deep pipeline, but I don’t think it’s even
theoretically possible with P-IV, G4 and older CPUs.

The real problem is that I’d really
like some of the other locking semantics that usually come along with
locks, such as wake-ups.

Right, but that’s not portable, and generally not possible to implement in a
way that guarantees that you don’t lose the CPU when trying to wake up
another thread. (That’s why one shouldn’t use standard sync constructs from
within SCHED_FIFO real time threads in audio on Linux/lowlatency BTW; it
effectively cancels the advantages of SCHED_FIFO on current kernels.)

Your FIFO’s are always non-blocking. It would
be nice to not have to wait in a busy loop for the interrupt to occur.

Well, the FIFOs were originally designed for communication between threads
that manage their own timing, expecting the FIFOs never to interfere with
thread scheduling.

After a full audio fragment has been processed (created) you want to wait
without spinning for the next audio interrupt, and pass the buffer to the
driver at that point. I suppose a 'while (!interrupted) usleep(1000);'
type of loop is OK, since a fragment will generally be on the order of a
few milliseconds long.

Unless you’re using a dedicated thread only to process audio and then pass it
on to the callback, you normally wouldn’t want to sleep on the FIFO anyway.
If you are using a dedicated audio thread, why can’t the code be in the
callback, in the main loop of the application (provided it’s looping
continuously at a sufficient rate), or in a thread that schedules
"periodically" using SDL_Delay() or something?

An example of where the non-blocking FIFO fits in perfectly would be a game
that runs the audio engine inside the audio callback, passing control
commands (“soundsource::start”, “soundsource::volume” etc) from either the
main loop, or (in the case of a decoupled game loop), from within the control
system thread. The audio callback would just read and process all commands on
the input FIFO every time it’s invoked, writing any responses to the return
FIFO.
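A rough sketch of that arrangement, covering the input direction only. Everything here is hypothetical: the command set, the `command` struct, `post_command`, `drain_commands`, and the fixed table of sound sources are invented for illustration, and C11 atomics are assumed. The main loop posts commands; the audio callback would call `drain_commands()` at the top of each invocation before mixing.

```c
#include <assert.h>
#include <stdatomic.h>

enum cmd_type { CMD_START, CMD_STOP, CMD_VOLUME };

typedef struct {
    enum cmd_type type;
    int source;  /* index of the sound source the command targets */
    int value;   /* only used by CMD_VOLUME */
} command;

#define CMD_SLOTS 16u   /* hypothetical ring size; must be a power of two */
#define MAX_SOURCES 4   /* hypothetical number of sound sources */

static command cmd_ring[CMD_SLOTS];
static atomic_uint cmd_head;  /* stored only by the main loop */
static atomic_uint cmd_tail;  /* stored only by the audio callback */

static int source_playing[MAX_SOURCES];
static int source_volume[MAX_SOURCES];

/* Main loop side. Returns 1 if queued, 0 if the ring is full or the source is invalid. */
static int post_command(enum cmd_type type, int source, int value)
{
    unsigned head = atomic_load_explicit(&cmd_head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&cmd_tail, memory_order_acquire);
    if (source < 0 || source >= MAX_SOURCES || head - tail == CMD_SLOTS)
        return 0;
    cmd_ring[head & (CMD_SLOTS - 1u)] = (command){ type, source, value };
    atomic_store_explicit(&cmd_head, head + 1u, memory_order_release);
    return 1;
}

/* Audio callback side: apply every pending command. Returns how many were applied. */
static int drain_commands(void)
{
    unsigned tail = atomic_load_explicit(&cmd_tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&cmd_head, memory_order_acquire);
    int applied = 0;
    for (; tail != head; tail++, applied++) {
        command c = cmd_ring[tail & (CMD_SLOTS - 1u)];
        assert(c.source >= 0 && c.source < MAX_SOURCES);
        switch (c.type) {
        case CMD_START:  source_playing[c.source] = 1;       break;
        case CMD_STOP:   source_playing[c.source] = 0;       break;
        case CMD_VOLUME: source_volume[c.source] = c.value;  break;
        }
    }
    atomic_store_explicit(&cmd_tail, tail, memory_order_release);
    return applied;
}
```

Because only the callback ever touches the source table, the main loop never sees or produces a half-updated state. It just queues requests and carries on.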

No need for waking anything up, as both the game/control system thread and
the audio engine have different sources of timing that essentially define
their respective need for timing resolution. No need to wake the control
system thread up to sync with audio events, as the CS must stick to its fixed
"heartbeat", and no need to “wake up” the audio callback, as it can’t bypass
the audio buffering(*) anyway.

(*) Not quite true, if we’re using a shared memory audio API, rather than the
current SDL interface - but that’s another story. You don’t even need to
use a separate audio thread or callback at all with such an interface.

Try passing info (pointers to ready buffers) over a lock-free FIFO, for
example… If you want it rock solid and/or want to reduce the number of
buffers floating around, use another FIFO so the callback/ISR can return
buffers when they’re not needed any more.

It would be nice to block here waiting for the free buffer to come out of
the FIFO.

If you’re using a dedicated audio thread; yes, but that’s not the kind of
setup this FIFO is designed for. (See above.)

The simplest way would be to have the main loop (doing video updates at
whatever speed the machine can cope with) do the same thing as the audio
callback; ie read and process all data in the “input” FIFO once per loop/call.

Of course, you’ll need enough buffering to deal with the timing of the main
loop, but if that results in too high latency, you’re probably not going to
want to use an audio engine that’s too indeterministic to run inside the
audio callback anyway. It’s not very likely that you’ll do much better with a
separate audio thread than inside the main loop, on most operating systems,
especially not if the main loop can use up lots of CPU time. (Note: It’s
entirely different with Linux/lowlatency and real RTOSes like QNX.)

I have a FIFO that I use for all sorts of things on various platforms,
and it should work on PPC as well, I think. I just uploaded a stripped
version (no Linux kernel driver versions of sfifo_read/write()) to the
usual place:

http://www.angelfire.com/ar/agc/download/

(Do run the test program [make test] first, just in case! It could
potentially break if the compiler does things differently, but the only
requirement is that buffer offsets are updated after data is read or
written, so I can’t see how…)

I looked briefly at your code. I think there’s a slight bug. In
sfifo_init you set the f->size to the next greatest power of two to the
size passed in the following loop:

for(; f->size < size; f->size <<= 1);

And yet the actual malloc only allocates ‘size’ (not f->size) bytes. Then
you go on to use the entire f->size number of bytes throughout.

Oops. Never noticed, as I never request non-power-of-two buffer sizes.

Looks nice though, generally. Too bad there’s no portable way to do
non-racy sleep/wakeup stuff (which of course would need locking

Tricky stuff, indeed. Even when hacking the kernel, you eventually arrive at
the point where you have to make the scheduler aware of the fact that a new
thread has just been made runnable, and that’s where you hit a spinlock or
similar construct, protecting the runnable task list…

//David

.- M A I A -------------------------------------------------.
| Multimedia Application Integration Architecture |
| A Free/Open Source Plugin API for Professional Multimedia |
----------------------> http://www.linuxaudiodev.com/maia -'
.- David Olofson -------------------------------------------.
| Audio Hacker - Open Source Advocate - Singer - Songwriter |
--------------------------------------> david at linuxdj.com -'

On Monday 23 April 2001 07:17, david at ultramaster.com wrote:

On Thursday 19 April 2001 23:36, Darrell Walisser wrote:

david_at_ultramaster · April 23, 2001, 5:08pm

Should work, but keep in mind that you need a totally reliable way of safely
skipping an interrupt without SDL_UnlockAudio() failing to compensate for it.
(Another flag might be enough to take care of that.)

Shouldn’t be a problem.

How do you work around the following race:

SDL_LockAudio()
-> set ignore callback flag to true
do some list manipulations
SDL_UnlockAudio()
-> check if callback happened while we were locked, nope
<- callback happens here
-> set ignore flag to false

We’ve now dropped a callback. You can’t work around this without a lock
IMHO.
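To make the trace above concrete, here is a small single-threaded simulation of exactly that ordering (check first, clear the flag second). It is only a model. The names `audio_interrupt`, `unlock_check_then_clear`, `run_cycle`, and the flags are hypothetical, and the "interrupt" is an ordinary function call placed at the chosen point.

```c
#include <assert.h>

static int ignore_callback;  /* the "audio locked" flag */
static int callback_missed;  /* set when a callback arrives while locked */
static int buffers_sent;     /* stand-in for buffers handed to the hardware */

static void audio_interrupt(void)
{
    if (ignore_callback) {
        callback_missed = 1;
        return;
    }
    buffers_sent++;
}

/* Unlock in the order shown in the trace: check first, clear the flag second.
   A nonzero interrupt_in_window fires the interrupt between the two steps. */
static void unlock_check_then_clear(int interrupt_in_window)
{
    int missed = callback_missed;  /* step 1: did a callback happen while locked? */
    if (interrupt_in_window)
        audio_interrupt();         /* the interrupt lands in the window */
    ignore_callback = 0;           /* step 2: stop ignoring callbacks */
    if (missed) {
        callback_missed = 0;
        buffers_sent++;            /* compensate for the skipped callback */
    }
}

enum when { DURING_LOCK, IN_WINDOW };

/* Runs one lock/unlock cycle with a single interrupt at the chosen point.
   Returns how many buffers reached the hardware. */
static int run_cycle(enum when w)
{
    ignore_callback = callback_missed = buffers_sent = 0;
    ignore_callback = 1;           /* lock */
    if (w == DURING_LOCK)
        audio_interrupt();
    unlock_check_then_clear(w == IN_WINDOW);
    return buffers_sent;
}
```

In this model `run_cycle(DURING_LOCK)` sends one buffer, because the unlock notices the skipped callback, while `run_cycle(IN_WINDOW)` sends none: the callback was ignored after the check had already been made.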

However, there is the pauseCmd, resumeCmd with the SndDoImmediate, as I
posted before. I’d be willing to bet that these commands will function as
locking primitives around the callback. The closest I can get to
‘semantics’ for the calls is that they pause all processing of the audio
command queue. Now since the callback is issued in response to a queued
‘callBackCmd’ it should be delayed by the pauseCmd until the resumeCmd is
issued. Since we can issue these two commands using SndDoImmediate, we
bypass the command queue. It is also specified that pauseCmd DOES NOT
pause any currently playing sound, it only causes no further queued
commands to be processed.

I’ve never tried this however.

Since the Mac uses a single-thread-with-interrupts model, there is no chance
of setting the pause flag WHILE a callback is in progress.

Do you guys think this is a reasonable model? Unfortunately,
SDL_LockAudio and SDL_UnlockAudio are not “virtual” methods in SDL 1.2.
Maybe they need to be. In the meantime, I could hack together a test of
this technique pretty quickly by exposing functions in the SDL_romaudio.c
to do this.

David

–
David Mansfield (718) 963-2020
@david_at_ultramaster
Ultramaster Group, LLC www.ultramaster.com