Audio - understanding the absolute buffer size

Hey Ryan,

I’ve got a question regarding the audio API, and how the samples parameter of the SDL_AudioSpec structure is used.

When I worked on a patch I submitted a while back for the ALSA backend (https://bugzilla.libsdl.org/show_bug.cgi?id=4156), I noticed that most audio backends treat the samples parameter as a frame count, and they often try to create an audio buffer that’s 2x the requested number of frames.

So, if samples was 256 and the frame size was 4 bytes, it’d attempt to create a total buffer of 256 * 4 * 2 bytes, and if the hardware was capable, the obtained spec returned by SDL_OpenAudioDevice would contain samples = 256 frames, while the buffer itself is actually 512 frames.

In effect, the returned samples parameter is equal to the period size, and the number of periods can be assumed to be approximately 2 (but it’s not guaranteed). The problem is that it’s not possible to work out the total buffer size / audio latency without knowing both of these values.
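For illustration, here’s roughly how that looks from the API side (the values are just for the example, and the ~2x multiplier is backend behavior, not anything the API promises):

#include "SDL.h"

int main(void)
{
    SDL_AudioSpec want, have;
    SDL_AudioDeviceID dev;

    SDL_Init(SDL_INIT_AUDIO);

    SDL_zero(want);
    want.freq = 48000;
    want.format = AUDIO_S16SYS;  /* 2 bytes per sample */
    want.channels = 2;           /* frame size = 2 * 2 = 4 bytes */
    want.samples = 256;          /* requested period size, in frames */

    dev = SDL_OpenAudioDevice(NULL, 0, &want, &have, 0);

    /* have.samples reports the period (256 frames) and have.size that
     * period in bytes (256 * 4 = 1024), but the total buffer the backend
     * negotiated (often ~2 periods, i.e. 512 frames / 2048 bytes on ALSA)
     * isn't exposed anywhere. */
    SDL_Log("samples=%u size=%u", (unsigned) have.samples, (unsigned) have.size);

    SDL_CloseAudioDevice(dev);
    SDL_Quit();
    return 0;
}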

Would it make sense to modify the API to report both the period size and number of periods?

  • Anthony

In a similar vein, would it make sense to make the number of periods an input to the SDL_AudioSpec?

Also, with the new audio queuing API, is there any interest in supporting blocking behavior? Essentially, making SDL_QueueAudio block until there’s enough room in the audio buffer to write out the data. This is immensely helpful for my problem domain (emulation), where execution speed is often tied to the audio clock.
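To be concrete, this is the kind of thing I mean - a rough sketch built on the existing queue API just for illustration (the byte cap and the SDL_Delay poll are made up; a real implementation would block inside the backend instead):

#include "SDL.h"

/* Hypothetical helper: block until the queue has room, then enqueue.
 * max_queued_bytes is an application-chosen cap, not an SDL concept. */
static int QueueAudioBlocking(SDL_AudioDeviceID dev, const void *data,
                              Uint32 len, Uint32 max_queued_bytes)
{
    while (SDL_GetQueuedAudioSize(dev) + len > max_queued_bytes) {
        SDL_Delay(1);  /* crude polling; this is what a blocking API would avoid */
    }
    return SDL_QueueAudio(dev, data, len);
}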

In my ideal world, I’d be able to configure a buffer of, say, 1024 frames with 4 periods of 256 frames each, which I could enqueue to in a blocking manner. I’d like a larger overall buffer to help with long frames caused by excessive code compiling, etc., but smaller periods to provide more frequent interrupts telling the emulator to resume execution, in order to have more consistent frame pacing.

I currently simulate my ideal world on top of SDL’s callback API, but it’d be great if I could take advantage of having SDL handle the n-buffering, so I can enqueue multiple periods ahead of time in case the callback is latent.
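Roughly, that simulation has this shape (a simplified sketch rather than the actual code; the period size and period count are illustrative, and underrun handling is left out):

#include "SDL.h"

#define PERIOD_BYTES (256 * 4)   /* 256 frames * 4 bytes per frame */
#define NUM_PERIODS  4

static Uint8 ring[PERIOD_BYTES * NUM_PERIODS];
static int read_idx, write_idx;
static SDL_sem *free_periods;    /* created with SDL_CreateSemaphore(NUM_PERIODS) */

/* Audio callback: consume one period per call (device opened with
 * samples == 256, so len == PERIOD_BYTES). */
static void Callback(void *userdata, Uint8 *stream, int len)
{
    SDL_memcpy(stream, ring + read_idx * PERIOD_BYTES, len);
    read_idx = (read_idx + 1) % NUM_PERIODS;
    SDL_SemPost(free_periods);   /* wake the emulator thread */
}

/* Emulator thread: block until a period slot is free, then fill it. */
static void WritePeriodBlocking(const Uint8 *src)
{
    SDL_SemWait(free_periods);
    SDL_memcpy(ring + write_idx * PERIOD_BYTES, src, PERIOD_BYTES);
    write_idx = (write_idx + 1) % NUM_PERIODS;
}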

Sorry this turned into a bit of a long ramble - I’m just trying to get a feel for what your direction with the API is, and if there’s any overlap that I could help contribute to.

If this is turning into a ‘feature request’ thread, can I once again ask that SDL_ClearQueuedAudio() return how much was cleared (rather than void)? In my application I call SDL_GetQueuedAudioSize() immediately followed by SDL_ClearQueuedAudio(), but since they are not atomic, there’s a possibility that a block was queued in between, and then my code misbehaves.
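To illustrate the race (in C, just for the example - SDL_GetQueuedAudioSize and SDL_ClearQueuedAudio are the real calls, the wrapper is hypothetical):

/* Not atomic: more audio can be queued between these two calls, so
 * 'cleared' may under-count what SDL_ClearQueuedAudio() actually dropped. */
static Uint32 ClearAndReport(SDL_AudioDeviceID dev)
{
    Uint32 cleared = SDL_GetQueuedAudioSize(dev);
    SDL_ClearQueuedAudio(dev);
    return cleared;   /* wrong if a block was queued in between */
}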

This isn’t a feature request thread.

I’m trying to learn more about SDL’s goals with its Audio API. I think there’s some room to modify the API to enable users to have more granular control over the underlying audio buffer - which is desirable for my use case - but I want to see if this is something that fits in with SDL’s own goals. If there is some overlap we can figure out what that is and get some work and patches rolling.

I saw that your old post from November went without a reply. For what it’s worth, while I’m not familiar with SDL’s development, when I have a small request like that for an OSS project I find it’s generally best to let the code talk, even though there’s a risk some of your time may be wasted - make the change, post the patch with a justification for the change, and let the discussion go from there.

It’s a good idea, but unfortunately it would be a steep learning curve for me (never having used Mercurial, nor being familiar with SDL’s code, nor knowing how to create a patch, etc.). I don’t even know whether it’s legitimate to change a function from void to Uint32 or whether a new function would be required. I’m primarily a BASIC programmer!

Did an experiment tonight where I introduced a new function, SDL_WriteAudio.

This function writes audio directly to the device (after resampling if necessary) and blocks until the entire write is complete. To do this, a new WriteDevice routine was added to each audio backend, which is essentially a combination of GetDeviceBuf / PlayDevice / WaitDevice. I also modified the audio thread routine to use the new WriteDevice to show that I think it’s possible to add the new blocking functionality without having to add a ton of new code - it’s more a matter of merging together existing code in the backends.
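The actual details are in the patch linked below, but the general shape of the blocking path is roughly this (heavily simplified - no device lock, no shutdown checks, and it assumes len is a multiple of the device buffer size):

/* Rough shape only. GetDeviceBuf / PlayDevice / WaitDevice are the
 * existing backend hooks; error handling and locking are omitted. */
static void WriteDeviceGeneric(SDL_AudioDevice *device,
                               const Uint8 *data, Uint32 len)
{
    while (len > 0) {
        Uint32 chunk = SDL_min(len, device->spec.size);
        Uint8 *buf = current_audio.impl.GetDeviceBuf(device);

        SDL_memcpy(buf, data, chunk);
        current_audio.impl.PlayDevice(device);   /* submit the buffer */
        current_audio.impl.WaitDevice(device);   /* block until it's consumed */

        data += chunk;
        len -= chunk;
    }
}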

I had great results with this new path on the WASAPI backend and its dreadful 10 ms buffer. The write object for the shared streams is signaled at intervals much less than 10 ms apart, enabling me to block for less time when making smaller writes to the buffer. This lets me write audio / block / run emulation / repeat at a more consistent pace, since I’m not having to wait a full 10+ ms for the next callback once the buffer is “full.”

Since the real value here is not having to wait for the full 10 ms to drain before my app gets a notification, I wonder if there’s value in just modifying the main audio thread loop / the WASAPI backend to call the callback with fewer than the AudioSpec’s samples in cases like this, to help improve responsiveness.

Here’s a patch from tonight’s proof of concept: https://gist.github.com/inolen/35fa98f685e49a814ff3ce8b9a7b6b3e

Right now the patch just #if 0’s the main audio thread. I didn’t spend much time thinking about how the blocking push (SDL_WriteAudio), non-blocking push (SDL_QueueAudio) and pull APIs could be selected between at runtime.

If anyone has feedback, please let me know.
