Changing buffer size causes inconsistent audio output

I am using SDL 2.0.20 on Pop!_OS 22.04 (I really need to migrate to SDL3 when I get a chance)

I am trying to output a constant square wave to the audio device via SDL_QueueAudio, but I run into a very strange issue where instead of a constant stream, I get a series of short “blips” about a second apart.

#include <iostream>
#include <SDL2/SDL.h>
#include <SDL2/SDL_audio.h>

int32 main() {

	/* Initialization for window and input management */

	if (SDL_InitSubSystem(SDL_INIT_AUDIO | SDL_INIT_VIDEO) < 0) {
		std::cerr << "failed to initialize SDL audio, video, or events: " << SDL_GetError() << "\n";
		return -1;
	}

	/* SDL window creation */

	// Audio?
	int32 devNum = SDL_GetNumAudioDevices(0) - 1;
	const char* devName = SDL_GetAudioDeviceName(devNum, 0);

	SDL_AudioSpec specTarget;
	specTarget.format   = AUDIO_S32MSB; // signed, 32-bit, big-endian (?), integer format
	specTarget.freq     = 48000; // samples in Hz
	specTarget.channels =     2;

This is where I think the issue is originating from (described further below). I set the buffer size to 2 seconds, which to my understanding is a very large buffer. I have noticed that if I make this buffer smaller, the delay between the disconnected audio blips seems to also get smaller.

	specTarget.samples  = 48000; // buffer size in sample frames, 2 seconds?
	specTarget.callback =     0; // will not be using a callback function
	specTarget.userdata =     0; // ignored because no callback function

	SDL_AudioSpec spec;
	SDL_AudioDeviceID devID = SDL_OpenAudioDevice(devName, 0, &specTarget, &spec, 0);
	SDL_PauseAudioDevice(devID, 0);

	uint32 wavePhase = 0;
	uint32 waveFreqHz = 260;

	while (loopCont == true) {
		
		/* Input management and exit condition */

		/* Graphics output */

My approach to ensure that the audio is continuous was to check if there is any audio currently queued. If there is none, queue more square wave (arbitrarily set to queue 1/60th of a second’s worth of audio) then wait for it to drain again. However, when I run the program it seems that the queued audio isn’t actually draining to the device. SDL_GetQueuedAudioSize repeatedly reports 6400 bytes queued until about every second or so, where it suddenly reports 0 and then states that it is queueing audio. This happens at the same frequency of the audio blips, but not in sync with them.

		uint32 audioQueuedBytes = SDL_GetQueuedAudioSize(devID);
		std::cout << audioQueuedBytes << " bytes queued."; // DEBUG
		if (audioQueuedBytes == 0) {

			uint32 lenSamples = spec.freq * spec.channels / 60; // queue 1/60th seconds of audio?
			uint32 lenBytes = lenSamples * SDL_AUDIO_BITSIZE(spec.format) / 8;

			int32 audioData[lenSamples];
			for (uint32 i = 0; i < lenSamples; i++) {
				++wavePhase;
				if (wavePhase > waveFreqHz) wavePhase = 0;
				if (wavePhase <= waveFreqHz / 2) audioData[i] =  6000;
				if (wavePhase  > waveFreqHz / 2) audioData[i] = -6000;
			}
			std::cout << "Queueing audio!"; // DEBUG
			SDL_QueueAudio(devID, &audioData, lenBytes);
		}
		std::cout << "\n"; // DEBUG

	}

	/* Window destruction */
}

As mentioned above, when I reduce the buffer size, it also causes the audio blips to come more frequently, which I think means I have a fundamental misunderstanding of what the buffer means. I thought that the buffer is simply the block of memory where you can store audio while it is continuously drained to the device at a constant rate regardless of the buffer size. But it seems like the size of the buffer is actually delaying the output of audio, as if SDL is waiting for the duration of the buffer to pass so it can output the audio I queued.

I also checked the SDL3 SDL_AudioSpec wiki page and noticed that it doesn’t have a buffer member at all! I’m not sure what that implies about the significance of the buffer in SDL2.

Here is an example for SDL3 that produces a ~700 Hz sine tone (the phase advances 0.1 radians per sample frame at 44100 Hz), hope that helps.
You should use at least a small audio buffer.
if (audioQueuedBytes == 0) { … is also not optimal. By the time that condition is true, SDL has already run out of audio to play.
Warning: example with no event handling.

#include <SDL3/SDL.h>
#include <math.h>

int main(int argc, char *argv[]) {
    SDL_AudioSpec AudioSpec;
    SDL_AudioStream *pAudioStream;
    short value;
    short audio_buf[2000];  // 1000 sample frames * 2 channels
    float i = 0;
    int z;

    if (SDL_Init(SDL_INIT_AUDIO)) {
        SDL_Log("Hello SDL3");

        AudioSpec.freq = 44100;
        AudioSpec.format = SDL_AUDIO_S16LE;
        AudioSpec.channels = 2;
        pAudioStream = SDL_OpenAudioDeviceStream(SDL_AUDIO_DEVICE_DEFAULT_PLAYBACK,&AudioSpec,NULL,NULL);
        if (pAudioStream != NULL) {
            if (SDL_ResumeAudioStreamDevice(pAudioStream)) {
                while (1) {
                    SDL_Delay(10);
                    if (SDL_GetAudioStreamQueued(pAudioStream) < 1000) {
                        // Fill audio buffer
                        for (z = 0; z < 1000; z++) {
                            i = i + 0.1;
                            value = sin(i) * 32767;
                            audio_buf[z * 2] = value;  // left channel
                            audio_buf[z * 2 + 1] = value; // right channel
                        }
                        if (!SDL_PutAudioStreamData(pAudioStream,audio_buf,sizeof(audio_buf))) {
                            SDL_Log("%s: SDL_PutAudioStreamData() failed: %s",__FUNCTION__,SDL_GetError());
                        }
                    } else {
                        SDL_Delay(10);
                    }
                }
            } else {
                SDL_Log("%s: SDL_ResumeAudioStreamDevice() failed: %s",__FUNCTION__,SDL_GetError());
            }
        } else {
            SDL_Log("%s: SDL_OpenAudioDeviceStream() failed: %s",__FUNCTION__,SDL_GetError());
        }
    }
    return 0;
}

You can use Audacity (or a similar program) in record mode to view your waveform while your program is running.
Edit: It seems that my example produces a memory leak, but I don’t know why. The leak keeps growing while the program runs.

Okay, wait, there’s a lot of issues to talk about here.

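First, let’s just open whatever the default device on the system is: call SDL_OpenAudioDevice() with NULL as the device name. Using SDL_GetNumAudioDevices() - 1 is probably not a good approach, since it just picks the last device in the list, which is not necessarily the one you want.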

Next: you almost certainly don’t want specTarget.format = AUDIO_S32MSB; you’re probably on an Intel-based computer, which is little-endian, but AUDIO_S32MSB is big-endian. Since you’re generating your own audio (a square wave), it’s always going to be in the system’s byte order, so AUDIO_S32SYS will suffice here: it is defined as whichever of AUDIO_S32LSB or AUDIO_S32MSB matches your system. (Plain AUDIO_S32 is an alias for AUDIO_S32LSB, which is also correct on a little-endian machine.)
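To see why the wrong byte order matters, here is a small sketch (plain C++, no SDL calls) of what happens to the question's sample value 6000 when it is written in one byte order and read in the other. The helper names swapBytes32 and misreadSample are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Reverse the byte order of a 32-bit value (little-endian <-> big-endian).
std::uint32_t swapBytes32(std::uint32_t v) {
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}

// Hypothetical helper: the value a device would see if a sample stored in
// one byte order were interpreted in the opposite byte order.
std::int32_t misreadSample(std::int32_t sample) {
    const std::uint32_t u = static_cast<std::uint32_t>(sample);
    assert(swapBytes32(swapBytes32(u)) == u); // swapping twice is the identity
    return static_cast<std::int32_t>(swapBytes32(u));
}
```

A quiet sample of 6000 (0x00001770) turns into 0x70170000, which is close to full scale, so the output would be far louder and more distorted than intended.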

Samples:

specTarget.samples  = 48000; // buffer size in sample frames, 2 seconds?

Since we’re running with a freq of 48000Hz, this is 1 second. And yes, this is a massive buffer size to choose. Usually this is more like 1024. This is (more or less) the hardware buffer, and how much data it will try to consume at once, not how much audio is buffered to be played. You want this to be much much lower. Not just for latency, but I wouldn’t be surprised if the system misbehaves or outright fails with numbers that high. You can still queue more data than this value, so definitely lower it to 1024.
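The arithmetic behind those numbers, as a quick sketch (plain C++, no SDL calls; bufferMillis and bufferBytes are names invented for this example):

```cpp
#include <cassert>
#include <cstdint>

// Duration in milliseconds of one device buffer of `samples` sample frames
// played back at `freq` frames per second.
double bufferMillis(std::uint32_t samples, std::uint32_t freq) {
    assert(freq > 0);
    return 1000.0 * static_cast<double>(samples) / static_cast<double>(freq);
}

// Size in bytes of that buffer for interleaved audio.
std::uint32_t bufferBytes(std::uint32_t samples, std::uint32_t channels,
                          std::uint32_t bytesPerSample) {
    return samples * channels * bytesPerSample;
}
```

With the values from the question, bufferMillis(48000, 48000) is 1000 ms and the buffer is 384000 bytes of stereo 32-bit audio, while bufferMillis(1024, 48000) is roughly 21 ms and 8192 bytes.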

check if there is any audio currently queued. If there is none, queue more square wave

You don’t want to wait for the amount queued to get to zero. If it hits zero, it means the system has run out of audio to play and is now playing silence until you feed it more data. Even if you’re doing this quickly, you’ll still have gaps in your audio output.

I would say queue about 1/60th of a second (about 16 milliseconds) of audio every time there’s less than 1/60th of a second queued, so at most you’ve got about 32 milliseconds buffered, and you have a good chunk of time to fill in more when it gets low before it totally runs out. The specific amounts can be tweaked, but that’s the idea.
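A minimal sketch of that top-up rule, kept free of SDL calls so the logic is easy to check. The helper names bytesToQueue and chunkBytesFor are hypothetical; in the real loop the first argument would come from SDL_GetQueuedAudioSize(devID), and a non-zero result is how many bytes to generate and pass to SDL_QueueAudio.

```cpp
#include <cassert>
#include <cstdint>

// Bytes in 1/60th of a second of interleaved audio.
std::uint32_t chunkBytesFor(std::uint32_t freq, std::uint32_t channels,
                            std::uint32_t bytesPerSample) {
    return (freq / 60) * channels * bytesPerSample;
}

// Hypothetical helper: how many bytes to queue right now.
// Top up by one chunk whenever less than one chunk is still queued,
// instead of waiting for the queue to drain to zero.
std::uint32_t bytesToQueue(std::uint32_t queuedBytes, std::uint32_t chunkBytes) {
    assert(chunkBytes > 0);
    return (queuedBytes < chunkBytes) ? chunkBytes : 0;
}
```

For 48000 Hz stereo 32-bit audio one chunk is 6400 bytes, so the queue never holds more than just under 12800 bytes (about 32 milliseconds) right after a top-up.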

SDL_GetQueuedAudioSize repeatedly reports 6400 bytes queued until about every second or so, where it suddenly reports 0 and then states that it is queueing audio.

That’s because the specTarget.samples is making it come along every second to pull in another one full second worth of audio, finding 1/60th of a second queued, taking all of that and filling in silence for the other 59/60ths. Then your app finds the queue empty and adds another 1/60th of a second worth of data, which SDL will come pick up when it finishes playing the 59/60ths of a second worth of silence.
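Here is a toy model of a single device pull, using the numbers from the question. It is a simplification for illustration, not SDL's actual implementation, and pullOnce and PullResult are made-up names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct PullResult {
    std::uint32_t audioBytes;   // bytes actually taken from the queue
    std::uint32_t silenceBytes; // remainder of the request, padded with silence
};

// Simplified model (an assumption, not SDL's code): the device asks for
// `requestBytes` at once, takes whatever is queued, and fills the rest
// of its buffer with silence.
PullResult pullOnce(std::uint32_t& queuedBytes, std::uint32_t requestBytes) {
    assert(requestBytes > 0);
    const std::uint32_t taken = std::min(queuedBytes, requestBytes);
    queuedBytes -= taken;
    return PullResult{taken, requestBytes - taken};
}
```

With 6400 bytes queued and a one-second device buffer of 384000 bytes (48000 frames, 2 channels, 4 bytes per sample), one pull yields 6400 bytes of tone and 377600 bytes of silence, which is the short blip followed by a long gap.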

SDL3 works similarly: it just moves the SDL_QueueAudio function to an SDL_AudioStream object, but the exact same theory applies. (SDL3 also chooses a smaller samples value for you when opening the device.)

A nice thing about SDL_QueueAudio and SDL_AudioStream, though, is that you don’t have to be exact in things. Just buffer a bunch of audio and let SDL nibble on it as it needs to. If you want to give it 3/60ths of a second at a time, you can, and it’ll still do the right thing, at the cost of a little more buffered memory, or you can give 2/60ths here and 1/60th there, whatever, as long as you stay ahead of playback.
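If you do queue chunks of varying size, the generator has to remember where in the wave it stopped so the chunks join up without clicks. A rough sketch, assuming a hypothetical SquareWave state struct and appendSquare helper (neither is part of SDL), with the period measured in sample frames rather than in interleaved samples:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical generator state, carried from one chunk to the next.
struct SquareWave {
    std::uint32_t sampleRate; // frames per second, e.g. 48000
    std::uint32_t toneHz;     // tone frequency in Hz, e.g. 260
    std::int32_t  amplitude;  // peak value, e.g. 6000
    std::uint32_t phase;      // frames elapsed in the current period
};

// Append `frames` sample frames of interleaved audio to `out`, continuing
// from wherever the previous call stopped.
void appendSquare(SquareWave& w, std::uint32_t frames, std::uint32_t channels,
                  std::vector<std::int32_t>& out) {
    assert(w.toneHz > 0 && w.sampleRate >= 2 * w.toneHz);
    // Integer period in frames; this rounds the tone frequency slightly.
    const std::uint32_t period = w.sampleRate / w.toneHz;
    for (std::uint32_t f = 0; f < frames; ++f) {
        const std::int32_t v = (w.phase < period / 2) ? w.amplitude : -w.amplitude;
        for (std::uint32_t c = 0; c < channels; ++c) out.push_back(v); // same value on every channel
        w.phase = (w.phase + 1) % period;
    }
}
```

Because the phase lives in the struct, generating 1600 frames and then 800 frames produces exactly the same samples as generating 2400 frames in one call, so it does not matter how the audio is split across SDL_QueueAudio calls.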

Anyhow, you’re mostly there, just some small tweaks to fix small misunderstandings and you’ll be playing sound just fine!