Audio without callback

Apparently there is a plan to remove the callback audio API in SDL3.

I have a game that I have been working on for many years, on and off. It still uses SDL 1.2, but when it’s ready I will probably update it to use SDL 2 or 3.

I have created a little “sound language” which allows me to create simple sound effects and music. In the music many parts are repeated, so instead of generating all samples as one long array I have something more like an array of pointers to arrays of samples (these can be as short as a single musical note but could also be longer). On a higher level, background music often has an introduction followed by a part that is repeated infinitely. At any point I might want the music to fade away. I can even cross-fade the background music to a slightly different version that might use different “instruments”, a different octave, a different speed, etc., so that it continues seamlessly from the exact same position in the music.
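For a concrete picture, here is a minimal sketch in C of a song stored as an array of pointers to short sample arrays with a loop point. All names are invented for illustration; the game’s actual code may look nothing like this.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative sketch: a song is an ordered list of short segments
   (e.g. one per musical note), with a loop point so an introduction
   can be followed by a part that repeats forever. */
typedef struct {
    const int16_t *samples;  /* PCM samples for this segment */
    size_t length;           /* number of samples in the segment */
} Segment;

typedef struct {
    const Segment *segments; /* ordered list of segments */
    size_t count;            /* total number of segments */
    size_t loop_start;       /* segment index to jump back to at the end */
    size_t pos;              /* current segment index */
    size_t offset;           /* current sample offset within that segment */
} Song;

/* Copy up to `want` samples into `out`, jumping back to loop_start
   when the last segment ends. Assumes at least one non-empty segment.
   Returns the number of samples written. */
static size_t song_render(Song *s, int16_t *out, size_t want) {
    size_t written = 0;
    while (written < want) {
        const Segment *seg = &s->segments[s->pos];
        if (s->offset >= seg->length) {
            s->offset = 0;
            s->pos++;
            if (s->pos >= s->count)
                s->pos = s->loop_start;
            continue;
        }
        out[written++] = seg->samples[s->offset++];
    }
    return written;
}
```

A cross-fade as described above could then be rendered by keeping two `Song` instances that share `pos` and `offset` but point at different segment tables, mixing the two outputs with changing weights.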

The above works fine using the current callback API, but would it work without it (efficiently, without additional delay)? Would I perhaps be better off using another library for the audio once I upgrade to SDL 3?

As far as I understand, SDL 3 will use SDL_AudioStream. I have no experience with that but I see that it has a function named SDL_AudioStreamPut to add samples. So maybe I could use this to essentially implement my own callback, but wouldn’t that be less efficient? Note that I’m not interested in the conversion aspect of SDL_AudioStream. If the format changed at runtime I would rather prefer to regenerate the samples.

To be clear, this is still just an idea, and we haven’t actually removed the callback from SDL3 yet. But we might, and if we do, we’ll provide a single-header library that one can drop into their project that will spin a thread and call an app’s callback, managing the AudioStream for you.

I don’t expect latency to be worse in a measurable way in the new system; more or less, it works the same way behind the scenes. The difference is you just feed audio to the stream as you have it and we consume it as we need it, instead of the callback demanding you give us an exact amount more right now.

Assuming you don’t use the single-header library that continues to provide a callback, the change in your app would be to just generate a little more sound on each frame (or send a new block of audio every X milliseconds, or query the stream to see what’s queued and decide if you want to top it off with more data).
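That “query the stream and top it off” workflow can be sketched in pure C. The SDL2-era calls (SDL_AudioStreamAvailable, SDL_AudioStreamPut) are only referenced in comments so the snippet stands on its own:

```c
#include <stddef.h>

/* Decide how much new audio to generate this frame. In SDL2 terms,
   queued_bytes would come from SDL_AudioStreamAvailable(), and the
   freshly generated data would then be handed to SDL_AudioStreamPut();
   those calls are left out so this helper is self-contained. */
static size_t bytes_to_generate(size_t queued_bytes, size_t target_bytes) {
    /* Already at or above the target: add nothing, so a late burst of
       data cannot keep growing the latency. */
    if (queued_bytes >= target_bytes)
        return 0;
    return target_bytes - queued_bytes;
}
```

Called once per frame (or every X milliseconds), this caps how far ahead of the device the game runs: after a stall it refills only up to the target instead of piling extra data on top of what is already queued.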

Thanks for the reply. It sounds reassuring when you explain it like that.

Interesting. As long as it works well, this is probably the easiest way that will require the least amount of change.

I guess there is a lot more going on behind the scenes than I realized.

From my perspective it’s more like I make it available as it is needed. I don’t want to make it available too early because that would delay the start of new sounds, etc. I also don’t want to make it available too late and risk missing the deadline. With the current API we also make this trade-off when deciding a suitable size for the sample buffer.

So I could do this from the main thread? I think I can see some advantages with that approach and it’s nice not having to deal with different threads unnecessarily.

However, if I am responsible for doing this myself, I would worry about the pace at which I provide the sound samples and about staying in sync.

For example, if I add samples at a fixed rate but get delayed and end up adding many samples at once after missing the deadline, then I think the latency would increase (since the amount of data in the stream is now larger), unless SDL somehow accounts for this and drops samples that arrive too late. The same might also happen if SDL and/or the backend doesn’t manage to keep up the pace for whatever reason.

On the other hand, the fact that the latency is increased when the deadline is missed might actually be a good thing since it makes it less likely to happen again, but only up to a certain point.

This sounds like it might be the best approach (if not using the callback lib).

If the sample rate is 44100 and I decide that an acceptable latency is 100 ms, then I guess I could just call SDL_AudioStreamAvailable to figure out how much audio is already queued (it reports bytes rather than samples) and fill it up so that there are (at least) 4410 samples’ worth of data after each frame update.
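Spelling that arithmetic out as a sketch (the scaling from sample frames to bytes matters because SDL2’s SDL_AudioStreamAvailable reports bytes; the channel count and sample width here are example values):

```c
#include <stddef.h>

/* Target number of sample frames for a given latency:
   44100 Hz * 100 ms / 1000 = 4410 frames. */
static size_t target_frames(int sample_rate, int latency_ms) {
    return (size_t)sample_rate * latency_ms / 1000;
}

/* The same target expressed in bytes, which is the unit
   SDL_AudioStreamAvailable reports: frames * channels * bytes
   per sample (2 for 16-bit audio). */
static size_t target_bytes(int sample_rate, int latency_ms,
                           int channels, int bytes_per_sample) {
    return target_frames(sample_rate, latency_ms)
           * channels * bytes_per_sample;
}
```

So for 16-bit stereo at 44100 Hz, a 100 ms target means keeping roughly 17640 bytes queued, not 4410.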

I understand there will probably be some additional latency due to how SDL handles it behind the scenes.

On second thought, this might not be a good idea, because some rare actions (saving, loading, changing level, etc.) might take longer than the usual frame time, so I think I would still need to do this from a separate thread.
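A feeder thread along those lines might look like this. This is a sketch using POSIX threads; `top_off_stream` is a hypothetical placeholder for the “query how much is queued, then generate just enough to reach the target” step (in SDL2 terms, SDL_AudioStreamAvailable followed by SDL_AudioStreamPut), and the 10 ms period is an arbitrary choice well under a 100 ms latency target:

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* A dedicated feeder thread, so long pauses on the main thread
   (saving, loading a level, ...) cannot starve the audio. */
static atomic_bool keep_running = true;
static atomic_int feeder_ticks = 0;  /* loop counter; only here so the
                                        sketch's activity is observable */

static void sleep_ms(long ms) {
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

static void *audio_feeder(void *arg) {
    (void)arg;
    while (atomic_load(&keep_running)) {
        /* top_off_stream();  <- hypothetical: query the queued amount
                                 and refill up to the latency target */
        atomic_fetch_add(&feeder_ticks, 1);
        sleep_ms(10);  /* wake well before the queued audio runs dry */
    }
    return NULL;
}
```

Started once at init with `pthread_create` and shut down by clearing `keep_running` and joining, this thread keeps its 10 ms cadence even while the main thread blocks on a level load.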