Sync issue: real-time audio with SDL_QueueAudio


I am building an application that generates its own sound stream at 8 kHz. Once I have built a block of samples, I pass it to SDL_QueueAudio(). The problem is that my application and SDL appear to be slightly out of sync: I open the SDL audio device at an 8 kHz rate and I emit samples at 8 kHz, but there must be some small rate difference, because the sound gradually builds up a delay (after one minute, audio is lagging 2 s behind video).

I understand the cause of the problem, but I don’t know how to solve it. Currently, I use this:

if (SDL_GetQueuedAudioSize(devid) > 4096) SDL_ClearQueuedAudio(devid);

This kinda works, but it’s ugly and generates sound glitches every time it executes.

What would be the best way to solve this problem?

I was thinking along the lines of skipping a frame from time to time on my side, but for this I would need to know how much audio is lagging behind video (and here the problem is that SDL_GetQueuedAudioSize() is not very precise, to say the least). Any idea? I suppose this is a common problem as soon as anyone uses the SDL_QueueAudio() API, but sadly I fail to see any obvious solution.
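To reason about how far audio is lagging, it can help to express the queued byte count as a duration. The sketch below is only an illustration: the helper name queued_bytes_to_ms is made up, and the example figures assume mono 16-bit samples (2 bytes per frame), which the post does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert the byte count reported by
   SDL_GetQueuedAudioSize() into milliseconds of pending audio.
   bytes_per_frame = channels * bytes per sample
   (assumed here: 2 for mono 16-bit audio). */
double queued_bytes_to_ms(uint32_t queued_bytes, int sample_rate, int bytes_per_frame)
{
    assert(sample_rate > 0 && bytes_per_frame > 0);
    double frames = (double)queued_bytes / bytes_per_frame;
    return frames * 1000.0 / sample_rate;
}
```

Under those assumptions, a 4096-byte queue at 8 kHz corresponds to 256 ms of audio that has been queued but not yet played.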

After many attempts, I found no better way than relying on SDL_GetQueuedAudioSize() for timing. Instead of using a fixed trigger value, though, I had to define an “acceptable size range” for the queued SDL buffer, and slightly adapt my audio output speed (faster/slower) whenever the SDL audio buffer approaches either of the acceptable boundaries. Basically, it looks something like this:

#define LOWER_LIMIT 4096
#define UPPER_LIMIT 16384

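/* bytes still waiting in SDL's queue for this device */
Uint32 qlen = SDL_GetQueuedAudioSize(devid);
if (qlen < LOWER_LIMIT) audio_speed++;  /* queue running low: produce audio slightly faster */
if (qlen > UPPER_LIMIT) audio_speed--;  /* queue too full: produce audio slightly slower */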
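A smoother variant of the same idea is to make the correction proportional to how far the queue is from the middle of the acceptable range, instead of stepping the speed up or down at the boundaries. This is a minimal sketch, not the code used above: the function name production_rate_factor and the 0.5% cap (MAX_ADJUST) are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define LOWER_LIMIT 4096u
#define UPPER_LIMIT 16384u
#define MAX_ADJUST  0.005   /* assumed cap: at most +/-0.5% speed change */

/* Hypothetical helper: returns a factor to multiply the audio
   production rate by. 1.0 means "no change", above 1.0 means
   "produce slightly more audio", below 1.0 means "produce less". */
double production_rate_factor(uint32_t qlen)
{
    const double target     = (LOWER_LIMIT + UPPER_LIMIT) / 2.0;
    const double half_range = (UPPER_LIMIT - LOWER_LIMIT) / 2.0;

    /* Normalised error: +1 at the lower limit, -1 at the upper limit. */
    double error = (target - (double)qlen) / half_range;
    if (error > 1.0)  error = 1.0;
    if (error < -1.0) error = -1.0;

    return 1.0 + MAX_ADJUST * error;
}
```

Because the factor changes gradually and is capped, the correction should be much less audible than clearing the queue, although how the factor is applied (resampling, or occasionally dropping or repeating a sample) is left open here.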

This is a very clumsy approach, but it works. A much better way would be to use the SDL audio playback rate as a timer for the rest of my application (“sync framerate to audio”). Unfortunately, as of today I have been unable to find any SDL method that would let me do so (the value returned by SDL_GetQueuedAudioSize() changes in blocks that are too large to be usable as a high-resolution timer).
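One possible way around the coarse granularity, offered only as an untested sketch, is to derive an audio clock from "bytes queued so far minus bytes still queued" and to interpolate between updates with a high-resolution timer. Everything named below (AudioClock, audio_clock_update, audio_clock_seconds) is hypothetical; the caller is assumed to keep a running total of the bytes passed to SDL_QueueAudio() and to supply timestamps in microseconds, for example derived from SDL_GetPerformanceCounter().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical audio clock, interpolated between coarse queue updates. */
typedef struct {
    uint64_t played_frames; /* frames consumed at the last observed change */
    uint64_t stamp_us;      /* timestamp of that observation, microseconds */
    int      sample_rate;   /* e.g. 8000 */
} AudioClock;

/* Call regularly. total_queued_bytes is the caller's running total of
   bytes handed to SDL_QueueAudio(); still_queued_bytes is the value
   reported by SDL_GetQueuedAudioSize(). */
void audio_clock_update(AudioClock *c, uint64_t total_queued_bytes,
                        uint32_t still_queued_bytes, int bytes_per_frame,
                        uint64_t now_us)
{
    assert(bytes_per_frame > 0 && total_queued_bytes >= still_queued_bytes);
    uint64_t played = (total_queued_bytes - still_queued_bytes) / bytes_per_frame;
    if (played != c->played_frames) {   /* the device consumed a new block */
        c->played_frames = played;
        c->stamp_us = now_us;
    }
}

/* Estimated playback position in seconds: the last coarse position plus
   the wall-clock time elapsed since it was observed. */
double audio_clock_seconds(const AudioClock *c, uint64_t now_us)
{
    assert(c->sample_rate > 0 && now_us >= c->stamp_us);
    double base = (double)c->played_frames / c->sample_rate;
    return base + (double)(now_us - c->stamp_us) / 1e6;
}
```

The interpolated value can run slightly ahead of the real playback position between block boundaries, so a real implementation would probably want to cap the interpolation at one block's duration.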