Memory leak in audio thread w/ CoreAudio

chvolow24 · August 4, 2024, 12:21am

I’m seeing 1MB of memory allocated every 2-3 seconds in the SDL2 audio subsystem when no work should be happening. It is definitely correlated to opening a device and leaving it open, but unfortunately, I am having a very hard time figuring out the conditions in which the leak arises; my attempts to reproduce it in a tiny program have failed. The device is paused, nothing is happening on the main thread (just busy waiting), and I don’t have any of my own threads doing anything.

Here’s a full backtrace of (what I believe is) the problematic allocation:

(lldb) bt
* thread #7, name = 'AudioQueue thread', queue = 'AQServer', stop reason = breakpoint 6.148
  * frame #0: 0x0000000191114930 AudioToolbox`XAtomicAllocator::alloc()
    frame #1: 0x00000001911148f8 AudioToolbox`TCustomAllocated<XAtomicAllocator, AQCommand>::operator new(unsigned long, XAtomicAllocator&) + 44
    frame #2: 0x00000001911c9bdc AudioToolbox`AudioQueueObject::EnqueueBuffer(AudioQueueBuffer*, unsigned int, AudioStreamPacketDescription const*, int, int, unsigned int, AudioQueueParameterEvent*, XAudioTimeStamp const&, XAudioTimeStamp*) + 1336
    frame #3: 0x00000001911df18c AudioToolbox`AudioQueueXPC_Server::EnqueueBuffer(unsigned int, std::__1::span<AQBufferCreateDestroyEvent const, 18446744073709551615ul>, unsigned int, unsigned int, unsigned int, std::__1::span<AudioStreamPacketDescription const, 18446744073709551615ul>, unsigned int, unsigned int, std::__1::span<AudioQueueParameterEvent const, 18446744073709551615ul>, XAudioTimeStampBase, bool) + 464
    frame #4: 0x000000019123096c AudioToolbox`invocation function for block in AudioQueueXPC_Bridge::EnqueueBuffer(unsigned int, std::__1::span<AQBufferCreateDestroyEvent const, 18446744073709551615ul>, unsigned int, unsigned int, unsigned int, std::__1::span<AudioStreamPacketDescription const, 18446744073709551615ul>, unsigned int, unsigned int, std::__1::span<AudioQueueParameterEvent const, 18446744073709551615ul>, XAudioTimeStampBase, bool) + 132
    frame #5: 0x0000000180c363e8 libdispatch.dylib`_dispatch_client_callout + 20
    frame #6: 0x0000000180c45d0c libdispatch.dylib`_dispatch_sync_invoke_and_complete_recurse + 64
    frame #7: 0x00000001912308b4 AudioToolbox`AudioQueueXPC_Bridge::EnqueueBuffer(unsigned int, std::__1::span<AQBufferCreateDestroyEvent const, 18446744073709551615ul>, unsigned int, unsigned int, unsigned int, std::__1::span<AudioStreamPacketDescription const, 18446744073709551615ul>, unsigned int, unsigned int, std::__1::span<AudioQueueParameterEvent const, 18446744073709551615ul>, XAudioTimeStampBase, bool) + 240
    frame #8: 0x000000019118ea20 AudioToolbox`AQ::API::V2Impl::AudioQueueEnqueueBufferWithParameters(OpaqueAudioQueue*, AudioQueueBuffer*, unsigned int, AudioStreamPacketDescription const*, unsigned int, unsigned int, unsigned int, AudioQueueParameterEvent const*, AudioTimeStamp const*, AudioTimeStamp*) + 804
    frame #9: 0x00000001911ae4e0 AudioToolbox`AudioQueueEnqueueBuffer + 128
    frame #10: 0x0000000100a113b0 libSDL2-2.0d.0.dylib`outputCallback(inUserData=0x0000000104222be0, inAQ=0x0000000011a81000, inBuffer=0x00000001045580f0) at SDL_coreaudio.m:589:5
    frame #11: 0x000000019119cfe0 AudioToolbox`AQ::API::Queue::CallOutputCallback(AudioQueueBuffer*) + 344
    frame #12: 0x0000000191120bd4 AudioToolbox`AQClientCallbackMessageReader::DispatchCallbacks(void const*, unsigned long) + 288
    frame #13: 0x000000019119c708 AudioToolbox`AQ::API::Queue::FetchAndDeliverPendingCallbacks() + 436
    frame #14: 0x000000019119c514 AudioToolbox`(anonymous namespace)::RunLoopSourcePerform(void*) + 52
    frame #15: 0x0000000180ec5eb0 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 28
    frame #16: 0x0000000180ec5e44 CoreFoundation`__CFRunLoopDoSource0 + 176
    frame #17: 0x0000000180ec5bb4 CoreFoundation`__CFRunLoopDoSources0 + 244
    frame #18: 0x0000000180ec47a0 CoreFoundation`__CFRunLoopRun + 828
    frame #19: 0x0000000180ec3e0c CoreFoundation`CFRunLoopRunSpecific + 608
    frame #20: 0x0000000100a101d4 libSDL2-2.0d.0.dylib`audioqueue_thread(arg=0x0000000104222be0) at SDL_coreaudio.m:970:9
    frame #21: 0x000000010096c120 libSDL2-2.0d.0.dylib`SDL_RunThread(thread=0x0000000104557960) at SDL_thread.c:292:18
    frame #22: 0x0000000100a6aab0 libSDL2-2.0d.0.dylib`RunThread(data=0x0000000104557960) at SDL_systhread.c:76:5
    frame #23: 0x0000000180de6f94 libsystem_pthread.dylib`_pthread_start + 136

Does anyone know anything about this? Is there something I could be doing wrong that would cause this?

anon914446 · August 4, 2024, 1:46am

I have some questions, and a guess or two:

Can we get your call to SDL_OpenAudioDevice() and also the settings you have for the SDL_AudioSpec that you opened it with?
In your system monitor, is the CPU being used more than you might expect it to be while your program is running?

Did you SDL_zero that audiospec before setting its values?
Do you have the audioSpec.callback set to NULL, or is it set to call a function? (The callback gets called on another thread).

If you do have a callback function, please post its contents.

chvolow24 · August 4, 2024, 2:11am

Yes, the spec is SDL_zero’d. CPU usage is all reasonable. And I am using a callback function. Here’s the spec printed right before the call to SDL_OpenAudioDevice is made:

(SDL_AudioSpec) $0 = {
  freq = 48000
  format = 32784
  channels = '\x02'
  silence = '\0'
  samples = 64
  padding = 0
  size = 0
  callback = 0x000000010005533c (jackdaw`transport_playback_callback at transport.c:226)
  userdata = 0x0000000104426a40
}

And here’s the call to SDL_OpenAudioDevice with some context:

AudioDevice *device = &conn->c.device;
SDL_AudioSpec obtained;
SDL_zero(obtained);
SDL_zero(device->spec);

/* Project determines high-level audio settings */
device->spec.format = AUDIO_S16LSB;
device->spec.samples = proj->chunk_size_sframes;
device->spec.freq = proj->sample_rate;

device->spec.channels = proj->channels;
device->spec.callback = conn->iscapture ? transport_record_callback : transport_playback_callback;
device->spec.userdata = conn;
if ((device->id = SDL_OpenAudioDevice(conn->name, conn->iscapture, &(device->spec), &(obtained), 0)) > 0) {
    fprintf(stdout, "ID: %d\n", device->id);
    device->spec = obtained;
    conn->open = true;
    fprintf(stderr, "Successfully opened device %s, with id: %d, chunk size %d\n", conn->name, device->id, obtained.samples);
} else {
    conn->open = false;
    fprintf(stderr, "Error opening audio device %s : %s\n", conn->name, SDL_GetError());
    return 1;
}

The callback function is not relevant (or shouldn’t be) because the leak occurs even if the device is never unpaused. I verified that the callback is never called.

Here are some things I can say with (near) certainty:

The leak is not occurring on the main thread
Removing the call to SDL_OpenAudioDevice solves the leak
Closing the device immediately after it is opened stops the leak
The device is never unpaused
The leak is not the result of any allocators I use in my code (malloc, calloc, and realloc are not called while the leak is occurring)

chvolow24 · August 4, 2024, 2:16am

Seems very similar to this bug, but I’m on macos v14.4.1.

(and the leak I’m seeing is much faster)

anon914446 · August 4, 2024, 3:04am

I’m seeing several different sites with similar issues, and I’m pretty sure that I saw that build version of SDL (2.0.5) in at least two of them. Some of the others didn’t mention a specific build version but the posting dates would put them near that version.
What build version of SDL2 are you working with, and can you update to a newer one?

chvolow24 · August 4, 2024, 3:23am

SDL v2.31.0. Right now I’m building SDL from source, fresh from the github repo. I verified the version with the boilerplate code at SDL2/SDL_version - SDL Wiki

2024-08-03 23:19:38.211 jackdaw[74716:32502306] INFO: We compiled against SDL version 2.31.0 ...
2024-08-03 23:19:38.211 jackdaw[74716:32502306] INFO: But we are linking against SDL version 2.31.0.

I first encountered the leak in homebrew-installed version 2.30.4

anon914446 · August 4, 2024, 3:33am

Did you run that test from inside the problem code? (Just curious because it is possible to link to different versions of the library depending on your compiler arguments)

At this point I think it would be a good idea to create an issue report on the SDL GitHub. The issue in the link was fixed at one point by updating macOS, so it is quite possible that the same issue has resurfaced as Apple made other changes since then.

chvolow24 · August 4, 2024, 6:43pm

Thanks @anon914446. Yeah, that was inside the problem code. I’m gonna keep trying to reproduce the error in a smaller program, and will put up an issue report if/when I am able to do that. Will post here once I figure out what’s going on.

anon914446 · August 5, 2024, 12:12am

I am curious, do you get the same issue if you change the device driver using SDL_AudioInit(driverName)?

You can use this code to fetch the names of available drivers, I’m hoping your system has more than one:

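#include <SDL2/SDL.h>

/* SDL may redefine main, so use the full argc/argv signature. */
int main(int argc, char *argv[])
{
	(void)argc;
	(void)argv;

	if (SDL_Init(SDL_INIT_VIDEO | SDL_INIT_AUDIO) != 0) {
		SDL_Log("SDL_Init failed: %s", SDL_GetError());
		return 1;
	}

	/* List every audio driver compiled into this SDL build. */
	int numDrivers = SDL_GetNumAudioDrivers();
	SDL_Log("Number of drivers found: %d", numDrivers);
	for (int i = 0; i < numDrivers; i++) {
		SDL_Log(" Index value: %d = %s", i, SDL_GetAudioDriver(i));
	}

	SDL_Quit();
	return 0;
}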

One final thought:
I’m seeing suggestions that for some systems 64 is a very small number for the SDL_AudioSpec samples buffer size. You might try a value between 256 and 4096 and see if that changes the outcome as well.

chvolow24 · August 5, 2024, 4:34am

On macos, core audio is the law of the land. There are also available drivers called “disk” and “dummy” but they are not usable for anything, as far as I can tell.

Changing the chunk size was a good suggestion; it turns out that it’s correlated to the size of the leak; a larger chunk size results in a smaller leak. This is probably why I hadn’t noticed the problem before (because I only recently reduced my default chunk size to 64 to reduce latency). For experimentation’s sake, I tried changing the chunk size to 8 and the leak increased to >1MB/s.

What this tells me is that SDL is probably setting up a callback function that is called repeatedly in the background (reconfirmed that my callbacks are never called in the test case) and allocating net new memory on every call. Not good!
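A rough back-of-the-envelope for the numbers above, as a sketch only. It assumes the device really runs at the requested 48000 Hz with the requested chunk size (the obtained spec may differ) and that exactly one buffer is requeued per chunk; neither is confirmed in this thread. The helper names are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: how many times per second a buffer of `samples`
 * sample frames is requeued at `freq` Hz, assuming one requeue per chunk. */
double requeues_per_second(int freq, int samples)
{
    assert(freq > 0 && samples > 0);
    return (double)freq / (double)samples;
}

/* Hypothetical helper: memory growth per requeue implied by an observed
 * growth rate in bytes per second. */
double bytes_per_requeue(double bytes_per_second, int freq, int samples)
{
    return bytes_per_second / requeues_per_second(freq, samples);
}
```

At 48000 Hz, a chunk size of 64 gives 750 requeues per second and a chunk size of 8 gives 6000. Taking "1MB every 2-3 seconds" as roughly 420 KB/s, that works out to something on the order of 560 bytes per requeue at chunk size 64. The same per-requeue cost at chunk size 8 would be a bit over 3 MB/s, which is at least consistent with the ">1MB/s" observed.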

anon914446 · August 5, 2024, 4:56pm

Since you have said you are having trouble replicating the issue in smaller programs, it does draw some suspicion back to the code itself. Have you tested your code on another system?
(preferably on a non-Mac system, Linux or Windows or even a raspberry pi would work.)

chvolow24 · August 5, 2024, 5:56pm

Irony of ironies: the large allocations are solved by removing address sanitation (-fsanitize=address) in the build. When I add that flag to the small test program, it does indeed “leak” (not a true leak) in the same way.

The problem doesn’t happen on linux, with or without address sanitation.

This is great news in the sense that there’s not some glaring bug in SDL or CoreAudio that’s allocating 1MB every second. It’s still a little weird though. [Conjecture; idk much about how address sanitation works:] The address sanitizer is retaining information about allocated blocks of memory even after they are freed, and its own memory usage explodes because CoreAudio is running a callback in the background that allocates memory for a buffer every time it is called. Doesn’t it seem weird that CoreAudio would dynamically allocate memory for its behind-the-scenes callback? I’d expect it to allocate a buffer once, and reuse that buffer for subsequent calls.
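For what it's worth, the conjecture matches how AddressSanitizer is documented to behave: freed blocks go into a quarantine instead of being reused right away, so use-after-free can be detected. Below is a toy sketch of that general mechanism, not of anything CoreAudio actually does. The function name `churn` is hypothetical, and whether CoreAudio's allocator is subject to the quarantine in the same way is an assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate, touch, and free `iterations` blocks of `block_size` bytes.
 * Returns the number of completed cycles. Nothing is leaked: every block
 * is freed. Without ASan the allocator can hand the same block back each
 * time; with -fsanitize=address, freed blocks are held in the quarantine
 * for a while, so resident memory can grow anyway. */
size_t churn(size_t iterations, size_t block_size)
{
    size_t done = 0;
    assert(block_size > 0);
    for (size_t i = 0; i < iterations; i++) {
        volatile unsigned char *p = malloc(block_size);
        if (p == NULL) {
            break;
        }
        p[0] = (unsigned char)i; /* touch the block so it is really used */
        free((void *)p);
        done++;
    }
    return done;
}
```

Calling `churn` in a long loop from a program built with and without `-fsanitize=address` and comparing resident memory should show the difference. If the quarantine is the cause, running with `ASAN_OPTIONS=quarantine_size_mb=1` should cap the growth at a much lower level.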

chvolow24 · August 5, 2024, 7:30pm

@icculus I’m sorry if this is obtuse, but why does SDL’s outputCallback use AudioQueueEnqueueBuffer here? That function allocates memory every time it’s called. I would expect a buffer to be allocated once and then reused in each call to the callback. Does SDL3 handle this bit differently?

icculus · August 6, 2024, 5:35am

The buffer is allocated once during device setup, here:

https://github.com/libsdl-org/SDL/blob/8f5d3ca57d610707cbc76582ffb6eb2b6a8904fc/src/audio/coreaudio/SDL_coreaudio.m#L923

    if (this->hidden->audioBuffer == NULL) {
        SDL_OutOfMemory();
        return 0;
    }

#if DEBUG_COREAUDIO
    printf("COREAUDIO: numAudioBuffers == %d\n", numAudioBuffers);
#endif

    for (i = 0; i < numAudioBuffers; i++) {
        result = AudioQueueAllocateBuffer(this->hidden->audioQueue, this->spec.size, &this->hidden->audioBuffer[i]);
        CHECK_RESULT("AudioQueueAllocateBuffer");
        SDL_memset(this->hidden->audioBuffer[i]->mAudioData, this->spec.silence, this->hidden->audioBuffer[i]->mAudioDataBytesCapacity);
        this->hidden->audioBuffer[i]->mAudioDataByteSize = this->hidden->audioBuffer[i]->mAudioDataBytesCapacity;
        /* !!! FIXME: should we use AudioQueueEnqueueBufferWithParameters and specify all frames be "trimmed" so these are immediately ready to refill with SDL callback data? */
        result = AudioQueueEnqueueBuffer(this->hidden->audioQueue, this->hidden->audioBuffer[i], 0, NULL);
        CHECK_RESULT("AudioQueueEnqueueBuffer");
    }

    result = AudioQueueStart(this->hidden->audioQueue, NULL);
    CHECK_RESULT("AudioQueueStart");

…AudioQueueEnqueueBuffer merely puts the already-allocated buffer back in the CoreAudio queue to be played again. Once it plays, outputCallback fires again with the finished buffer, and we overwrite its data with the next chunk to play, and requeue it again, repeating until the device is closed.
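The allocate-once, refill-and-requeue pattern described above can be sketched without CoreAudio at all. This is a toy model only: `ToyQueue` and every function and constant here are hypothetical stand-ins, not SDL or AudioToolbox APIs. It just shows that the number of buffer allocations stays fixed no matter how many times the callback runs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NUM_BUFFERS 2     /* hypothetical; SDL computes numAudioBuffers */
#define BUFFER_BYTES 256  /* hypothetical stand-in for spec.size */

typedef struct {
    unsigned char *data[NUM_BUFFERS];
    int allocations; /* how many buffers were allocated */
    int refills;     /* how many times the "callback" ran */
} ToyQueue;

/* Free every buffer; safe to call on a partially opened queue. */
void toy_queue_close(ToyQueue *q)
{
    for (int i = 0; i < NUM_BUFFERS; i++) {
        free(q->data[i]);
        q->data[i] = NULL;
    }
}

/* Device setup: each buffer is allocated exactly once and filled with silence. */
int toy_queue_open(ToyQueue *q)
{
    memset(q, 0, sizeof(*q));
    for (int i = 0; i < NUM_BUFFERS; i++) {
        q->data[i] = malloc(BUFFER_BYTES);
        if (q->data[i] == NULL) {
            toy_queue_close(q);
            return -1;
        }
        memset(q->data[i], 0, BUFFER_BYTES);
        q->allocations++;
    }
    return 0;
}

/* Stand-in for outputCallback: overwrite the finished buffer in place,
 * then hand the same buffer back to the queue. No allocation happens here. */
void toy_output_callback(ToyQueue *q, int index, unsigned char fill)
{
    assert(index >= 0 && index < NUM_BUFFERS);
    memset(q->data[index], fill, BUFFER_BYTES);
    q->refills++;
}

/* Simulate `n` playback completions, cycling through the buffers in order. */
void toy_queue_run(ToyQueue *q, int n)
{
    for (int i = 0; i < n; i++) {
        toy_output_callback(q, i % NUM_BUFFERS, (unsigned char)i);
    }
}
```

After `toy_queue_open`, running any number of callbacks with `toy_queue_run` leaves `allocations` at `NUM_BUFFERS`. Whatever CoreAudio allocates inside its own enqueue path (the `AQCommand` in the backtrace) is separate from these buffers and outside SDL's control.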

I don’t know why CoreAudio is allocating memory here, or if it’s allocating memory and then freeing it correctly. I don’t know if it is actually allocating memory at all, or AddressSanitizer is in the wrong here.

chvolow24 · August 6, 2024, 1:18pm

@icculus Thanks for the response. I guess I’m surprised that SDL uses AudioQueue for its audio callbacks at all, since “Audio Queue Services is high level”. Audio-focused libraries that offer cross-platform low-level access in a similar way (e.g. PortAudio, JUCE, libsoundio) all seem to use the AudioUnit API.

icculus · August 6, 2024, 8:06pm

That document refers to it as “high-level” because all it does is provide a way to push raw PCM to hardware efficiently, which is exactly what SDL needs.

CoreAudio as a whole can do some extremely powerful and flexible things–plugins, mixers, DSPs–but none of them are needed for the SDL API.

(Also, AudioQueue was introduced in a later (but now-ancient) Mac OS X release, in response to developer feedback that AudioUnits were too complex when all you wanted to do was feed audio to the hardware directly; SDL used to use them, but replaced them with AudioQueue in 2016.)

chvolow24 · August 7, 2024, 5:41pm

Thanks, that’s helpful. I suspect (though can’t yet confirm experimentally) that some added latency is the price for Audio Queue’s relative simplicity and ease of use. IMHO it would kick the most ass if it was possible to achieve as close as possible to minimum system audio I/O latency with SDL, but I also acknowledge that this is irrelevant for most SDL use cases, unless you’re trying to do something crazy like write a DAW that depends only on SDL.

If I can get some more substantive information on the comparative low-latency performance of Audio Queue vs. Audio Units, I’ll post that here just as an FYI.

slouken · August 7, 2024, 5:57pm

Even better, if that is actually better, go ahead and create a PR for SDL3!

icculus · August 10, 2024, 7:18pm

I don’t see any reason why AudioQueues would have to be higher-latency, but I don’t have any data either.