SDL2 Audio Callback Buffer Change

Brian_Barnes · July 4, 2012, 4:53pm

Sam wrote:

Yes, SDL 2 doesn’t clear the audio buffer as an optimization, since it
assumes you’ll fill it entirely each time you get a callback.

CC’ing Ryan to confirm.

(I changed the subject to be better)

Hmmm, OK, that’s definitely a change from 1.2, and the OS X and iOS versions still seem to clear the buffer. If there’s a standard, let’s get it into the docs, and let me know which way you guys want to do it.

The core audio one (if I’m reading the code correctly) actually has this in it:

    while (remaining > 0) {
        if (this->hidden->bufferOffset >= this->hidden->bufferSize) {
            /* Generate the data */
            SDL_memset(this->hidden->buffer, this->spec.silence,
                       this->hidden->bufferSize);
            SDL_mutexP(this->mixer_lock);
            (*this->spec.callback)(this->spec.userdata,
                        this->hidden->buffer, this->hidden->bufferSize);
            SDL_mutexV(this->mixer_lock);
            this->hidden->bufferOffset = 0;
        }

Basically, whatever way it should be, you should probably get it in the docs and make all the versions do the same thing.
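
For anyone porting a 1.2-era callback in the meantime, here is a minimal sketch of a callback that never relies on the buffer being pre-cleared. It has the same shape as an SDL audio callback (userdata, stream, len), but it is written against plain `uint8_t` so it compiles without the SDL headers. The `AudioQueue` struct and the `fill_audio` name are made up for illustration, and the silence value is assumed to have been copied from the spec the application obtained when opening the device.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical application state handed to the callback as userdata. */
typedef struct {
    const uint8_t *data;  /* queued audio, already in the device format */
    size_t remaining;     /* bytes of queued audio not yet played */
    uint8_t silence;      /* silence value, assumed copied from the obtained spec */
} AudioQueue;

/* Same parameter shape as an SDL audio callback: (userdata, stream, len). */
void fill_audio(void *userdata, uint8_t *stream, int len)
{
    AudioQueue *q = (AudioQueue *)userdata;
    size_t want, have;

    assert(len >= 0);
    want = (size_t)len;
    have = q->remaining < want ? q->remaining : want;

    /* Copy whatever real audio is available. */
    if (have > 0) {
        memcpy(stream, q->data, have);
        q->data += have;
        q->remaining -= have;
    }

    /* Do not assume the buffer was cleared: pad the tail with silence. */
    memset(stream + have, q->silence, want - have);
}
```

Written this way, the callback should behave the same whether or not a given driver clears the buffer first.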

[>] Brian

slouken · July 5, 2012, 4:03pm

Yep, thanks for the heads up. I’ll pull the memset from those drivers.

Cheers!

slouken · July 5, 2012, 4:19pm

Done!

Forest_Hale · July 5, 2012, 4:26pm

I will note that in my game engines I regularly use memset before writing streaming data to intentionally fill the cache lines, with a small but measurable performance benefit on x86 despite the redundancy of the act itself.

Never really figured out why it helps, but it does in my experience. I can only assume that my stream writing code is failing to satisfy the write-combine logic in some way.

I have confirmed a performance gain from memset before writing on everything I’ve ever tried it on - sound data copies out of mixer buffers, skeletal animation of model vertices, and so on…
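
A rough sketch of the pattern being described, for readers who want to measure it themselves. The function name, the `pretouch` flag, and the float-to-16-bit conversion are illustrative assumptions, not code from any of the engines mentioned. The output is identical with or without the memset, so any difference would show up only in timing, and whether there is one depends on the CPU and compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Convert float samples to signed 16-bit, optionally clearing dst first. */
void float_to_s16(int16_t *dst, const float *src, size_t n, int pretouch)
{
    size_t i;

    assert(dst != NULL && src != NULL);

    if (pretouch) {
        /* Redundant clear: every element is overwritten by the loop below. */
        memset(dst, 0, n * sizeof *dst);
    }

    for (i = 0; i < n; i++) {
        float s = src[i];
        /* Clamp to [-1, 1] before scaling so the cast cannot overflow. */
        if (s > 1.0f)  s = 1.0f;
        if (s < -1.0f) s = -1.0f;
        dst[i] = (int16_t)(s * 32767.0f);
    }
}
```

Timing both variants over a realistically sized buffer is the only way to tell whether the extra pass pays off on a given machine.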

But I do not have sufficient experience with other architectures to say whether memsetting first is a win or loss on other CPU architectures, just that on x86 it seems to help.

–
LordHavoc
Author of DarkPlaces Quake1 engine - http://icculus.org/twilight/darkplaces
Co-designer of Nexuiz - http://alientrap.org/nexuiz
"War does not prove who is right, it proves who is left." - Unknown
"Any sufficiently advanced technology is indistinguishable from a rigged demo." - James Klass
"A game is a series of interesting choices." - Sid Meier

Vance_Michael · July 5, 2012, 4:38pm

On many architectures memset benefits from hardware support, such as the dcbz instruction on PPC (data cache block zero, which essentially claims a cache block and zeroes it without first reading it from memory). If you find optimizations like this interesting, you may also want to explore the restrict keyword.
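
As a small illustration of the restrict suggestion, here is a sketch of a mixing loop using the C99 `restrict` qualifier. The `mix_add` name is hypothetical. The qualifier is a promise from the caller that the two ranges do not overlap, which gives the compiler more freedom to reorder or vectorize the loads and stores. Whether that produces faster code depends on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>

/* Add n samples from src into dst.
 * restrict promises that dst and src do not overlap; calling this
 * with overlapping ranges is undefined behavior. */
void mix_add(float *restrict dst, const float *restrict src, size_t n)
{
    size_t i;

    assert(dst != NULL && src != NULL);

    for (i = 0; i < n; i++) {
        dst[i] += src[i];
    }
}
```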

m.

slouken · July 5, 2012, 4:41pm

The memset implementation is probably using SIMD with a prefetch
instruction. You might be able to get the same benefit by inlining that
asm yourself. There are some examples of this in the SDL blitter code. :)
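
A hedged sketch of the prefetch idea, using the GCC/Clang `__builtin_prefetch` builtin instead of hand-written inline asm. This is not taken from the SDL blitter code. The `copy_s16` name, the 64-element look-ahead, and the assumption of 64-byte cache lines are all guesses that would need tuning per target. On other compilers the macro expands to nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
/* Second argument 1 = prefetch in anticipation of a write. */
#define PREFETCH_FOR_WRITE(p) __builtin_prefetch((p), 1)
#else
#define PREFETCH_FOR_WRITE(p) ((void)0)
#endif

#define PREFETCH_AHEAD 64  /* elements to look ahead; an untuned guess */

/* Copy n 16-bit samples, hinting upcoming destination lines to the cache. */
void copy_s16(int16_t *dst, const int16_t *src, size_t n)
{
    size_t i;

    assert(dst != NULL && src != NULL);

    for (i = 0; i < n; i++) {
        /* 32 samples = 64 bytes, assumed here to be one cache line. */
        if ((i % 32) == 0 && i + PREFETCH_AHEAD < n) {
            PREFETCH_FOR_WRITE(&dst[i + PREFETCH_AHEAD]);
        }
        dst[i] = src[i];
    }
}
```

A prefetch is only a hint, so the result is the same with or without it, and it is worth benchmarking against the plain loop and against the memset approach before keeping it.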