SDL_Audio Improvement

David_Olofson · July 21, 2001, 1:15pm

Actually, my first engine will be a lot more retro. I’m thinking “GM
on an array of SID chips.”

//David

.- M A I A -------------------------------------------------.
| Multimedia Application Integration Architecture |
| A Free/Open Source Plugin API for Professional Multimedia |
----------------------> http://www.linuxaudiodev.com/maia -'
.- David Olofson -------------------------------------------.
| Audio Hacker - Open Source Advocate - Singer - Songwriter |
--------------------------------------> david at linuxdj.com -'

On Monday 16 July 2001 01:04, Patrick McFarland wrote:

Ahh, you mean like directx’s little software midi thing?
I was thinking about that anyhow.

David_Olofson · July 21, 2001, 1:27pm

What you’re talking about is shared memory style audio (as opposed to
streaming). That is, instead of just sending a sufficiently large
buffer off to the driver every now and then, you share a looping
buffer with the driver, and mix directly into it, starting just ahead
of the current play position whenever a new sound starts.

It makes global effect processing and other things more complicated,
but it does cut latency down significantly without the risk of
frequent, global drop-outs. It’s a nice feature for targets without
usable real time performance.
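
A minimal sketch of the looping-buffer idea in C. Everything here is an assumption made for illustration: the function name `ring_mix`, the float sample format, and the way the play position is passed in. A real driver would expose its own cursor query and usually an integer sample format.

```c
#include <stddef.h>

/* Mix `count` samples of `snd` into the looping buffer `ring` (length
 * `ring_len`), starting `lead` samples ahead of the driver's current play
 * position and wrapping at the end of the buffer. Returns the index just
 * past the last sample written, i.e. where mixing of this sound would
 * continue on the next call. All names are hypothetical. */
size_t ring_mix(float *ring, size_t ring_len, size_t play_pos, size_t lead,
                const float *snd, size_t count)
{
    size_t w = (play_pos + lead) % ring_len;
    for (size_t i = 0; i < count; ++i) {
        ring[w] += snd[i];          /* add to whatever is already queued */
        w = (w + 1) % ring_len;     /* wrap around the looping buffer */
    }
    return w;
}
```

A smaller `lead` means lower latency for the new sound, at the price of a higher risk that the driver reads the region before the mix is finished.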

//David

On Monday 16 July 2001 17:39, Anoq of the Sun wrote:

Patrick McFarland wrote:

I’m gonna improve it, but I want everyone’s thoughts first.

No-latency sound for Windows (and any other OSes which
might have this problem).

To put it plainly - what SDL needs is to be able to play
multiple sound effects at the same time (the number of
channels on the soundcard?) and so that they start
immediately after a “playsound” command is called.

Streaming audio is just not good enough for soundeffects
in a realtime game on an OS with a long delay because of
the stream-buffer…

ANOQ_of_the_Sun · July 22, 2001, 11:47pm

David Olofson wrote:

What you’re talking about is shared memory style audio (as opposed to
streaming). That is, instead of just sending a sufficiently large
buffer off to the driver every now and then, you share a looping
buffer with the driver, and mix directly into it, starting just ahead
of the current play position whenever a new sound starts.

Sounds like a good way to implement it - I didn’t think of that…
Thanks

Cheers–
http://www.HardcoreProcessing.com

slouken · July 23, 2001, 4:09am

David Olofson wrote:

What you’re talking about is shared memory style audio (as opposed to
streaming). That is, instead of just sending a sufficiently large
buffer off to the driver every now and then, you share a looping
buffer with the driver, and mix directly into it, starting just ahead
of the current play position whenever a new sound starts.

Sounds like a good way to implement it - I didn’t think of that…
Thanks

That’s essentially what SDL does with the DirectX audio driver,
except your callback is called when the driver is ready for you
to mix in new audio, instead of you having to query and mix ahead
yourself.

See ya,
-Sam Lantinga, Lead Programmer, Loki Software, Inc.

Christopher_Purnell · July 24, 2001, 3:36am

In article <Pine.LNX.4.33.0107142030350.11006-100000 at gemini.verizon.net>,
Ryan C. Gordon wrote:

Actually, I lied. I want someone to write a better sample rate converter.
But that wouldn’t require an API change.

One that works with multi channel audio data would be nice.
Not hard to do unless you want it to do sample interpolation.
I could probably knock one up tonight.

–
Christopher John Purnell | A friend in need’s a friend in deed
http://www.lost.org.uk/ | A friend with weed is better
--------------------------| A friend with breasts and all the rest
What gods do you pray to? | A friend who’s dressed in leather

icculus · July 24, 2001, 3:47am

Actually, I lied. I want someone to write a better sample rate converter.
But that wouldn’t require an API change.

One that works with multi channel audio data would be nice.
Not hard to do unless you want it to do sample interpolation.
I could probably knock one up tonight.

Sample interpolation would also be nice.

Mostly, I’m looking for something that converts between 8KHz and 11KHz
samples correctly.

–ryan.

David_Olofson · July 24, 2001, 5:02am

In article
<Pine.LNX.4.33.0107142030350.11006-100000 at gemini.verizon.net>,

Ryan C. Gordon wrote:

Actually, I lied. I want someone to write a better sample rate
converter. But that wouldn’t require an API change.

One that works with multi channel audio data would be nice.

I’m not sure what you mean here; stereo->mono and vice versa, or 4
channel output and stuff?

Not hard to do unless you want it to do sample interpolation.

Interpolation is easy. Doing it fast is hard. Doing something that sounds
really good as well (ie not 2 or 3 point interpolation) is real hard…

For a single, fixed ratio, one of the best solutions seems to be a
windowed sinc. I haven’t seen any optimized source code, though.

//David Olofson — Programmer, Reologica Instruments AB

On Tuesday 24 July 2001 12:42, Christopher Purnell wrote:

David_Olofson · July 24, 2001, 5:50am

It’s fairly easy to handle other integer input/output relations than 1:x
and x:1. For example, 2:3 would be something like:

out[0] = in[0];                              /* input position 0   */
out[1] = (in[0]*21845 + in[1]*43691) >> 16;  /* input position 2/3 */
out[2] = (in[1]*43691 + in[2]*21845) >> 16;  /* input position 4/3 */
in += 2;
out += 3;

Note that
* the input signal must be limited to a 16 bit
  integer range [-32768, 32767], or the filter
  will overflow

* every iteration will access the first sample
  of the next input frame

* incrementing pointers is slow on modern CPUs
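
For anyone who wants to try this, here is a sketch of the 2:3 case wrapped in a function. The name `resample_2_to_3` is made up for illustration. The sketch places output k at input position 2k/3, so the second output weights its neighbours 1/3 and 2/3 and the third weights them 2/3 and 1/3. It divides instead of shifting, because right-shifting a negative value is implementation-defined in C.

```c
#include <stddef.h>
#include <stdint.h>

/* Resample `groups` blocks of 2 input samples into blocks of 3 output
 * samples using linear interpolation with 16.16 fixed-point weights
 * (21845/65536 ~ 1/3, 43691/65536 ~ 2/3).
 * `in` must hold 2*groups + 1 samples, because each block reads the first
 * sample of the next one. `out` must have room for 3*groups samples. */
void resample_2_to_3(const int16_t *in, int16_t *out, size_t groups)
{
    for (size_t g = 0; g < groups; ++g) {
        const int16_t *i = in + 2 * g;
        int16_t *o = out + 3 * g;
        o[0] = i[0];                                      /* position 0   */
        o[1] = (int16_t)(((int32_t)i[0] * 21845 +
                          (int32_t)i[1] * 43691) / 65536); /* position 2/3 */
        o[2] = (int16_t)(((int32_t)i[1] * 43691 +
                          (int32_t)i[2] * 21845) / 65536); /* position 4/3 */
    }
}
```

Because the two weights sum to exactly 65536, a constant input comes out unchanged; other values can be off by one from the ideal result due to truncation.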

The filter can be generalized, so that the weight factors are
precalculated and stored in an array of size (input ratio + depth),
where depth in this case is 1 (“1 point”/linear interpolation).
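
One possible shape for that generalization, as a sketch only. The struct `ratio_table`, the bound `MAX_OUT`, and the choice to store one entry per output sample in a block are my assumptions, not something taken from an existing library.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_OUT 64   /* hypothetical upper bound on outputs per block */

typedef struct {
    size_t  idx[MAX_OUT];  /* left input sample for each output in a block */
    int32_t w1[MAX_OUT];   /* 16.16 weight of the right neighbour;
                              the left neighbour gets 65536 - w1 */
    size_t  in_n, out_n;
} ratio_table;

/* Precompute positions and weights for an in_n:out_n conversion.
 * Returns 0 on success, -1 if the ratio does not fit the table. */
int ratio_table_init(ratio_table *t, size_t in_n, size_t out_n)
{
    if (in_n == 0 || out_n == 0 || out_n > MAX_OUT) return -1;
    t->in_n = in_n;
    t->out_n = out_n;
    for (size_t k = 0; k < out_n; ++k) {
        size_t num = k * in_n;   /* output k sits at num/out_n input samples */
        t->idx[k] = num / out_n;
        t->w1[k]  = (int32_t)(((num % out_n) * 65536u + out_n / 2) / out_n);
    }
    return 0;
}

/* Convert `blocks` blocks. `in` must hold blocks*in_n + 1 samples, since
 * the last output of a block may read the first sample of the next. */
void ratio_resample(const ratio_table *t, const int16_t *in,
                    int16_t *out, size_t blocks)
{
    for (size_t b = 0; b < blocks; ++b) {
        for (size_t k = 0; k < t->out_n; ++k) {
            const int16_t *p = in + b * t->in_n + t->idx[k];
            int32_t w1 = t->w1[k];
            int64_t acc = (int64_t)p[0] * (65536 - w1) + (int64_t)p[1] * w1;
            out[b * t->out_n + k] = (int16_t)(acc / 65536);
        }
    }
}
```

For 2:3 the table works out to the constants 43691 and 21845, so this reduces to the hand-written case.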

A simpler, better sounding, more generic, but much slower method is to
derive a sensible integer ratio from the sample rate ratio (5:8, 8:9
etc), and first upsample (more samples out), then downsample (fewer
samples out). When upsampling, build linear slopes between the input
sample levels, and when downsampling, use linear interpolation.
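
Deriving the integer ratio can be sketched as below. This reduces the two rates exactly via their greatest common divisor; picking a smaller approximate ratio such as 5:8 would be a separate design decision. The names `rate_ratio` and `reduce_rates` are hypothetical, and 11025 Hz is my assumption for what "11 kHz" means in practice.

```c
typedef struct { unsigned in_n, out_n; } rate_ratio;

/* Euclid's algorithm for the greatest common divisor. */
static unsigned gcd_u(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Reduce in_hz:out_hz to the smallest equivalent integer ratio.
 * Returns {0, 0} if both rates are zero. */
rate_ratio reduce_rates(unsigned in_hz, unsigned out_hz)
{
    rate_ratio r = {0, 0};
    unsigned g = gcd_u(in_hz, out_hz);
    if (g != 0) {
        r.in_n  = in_hz / g;
        r.out_n = out_hz / g;
    }
    return r;
}
```

With these assumptions, 8000 Hz to 11025 Hz reduces to 320:441, which shows why the exact ratio can get expensive for rates that share few factors.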

A more sophisticated variant would involve a low pass filter before the
downsampling (the step that reduces the sample count). A good LP filter
can eliminate the need for interpolated upsampling; just fill in with
zeroes between the input samples, and note that you’ll have to adjust
the power level. (Scale all filter coefficients to increase the sum or
something; this avoids extra multiplications just to scale the data, and
avoids losing bits in an integer implementation.)
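
The zero-filling step followed by a filter could look roughly like this. It is a sketch in floating point for clarity, with a hypothetical function name; the filter coefficients are supplied by the caller, who is assumed to have scaled them so that they sum to the upsampling factor.

```c
#include <stddef.h>

/* Upsample `in` by the integer factor `up`: conceptually insert up-1
 * zeroes after every input sample, then run the FIR filter `h` (`taps`
 * coefficients) over the result. The zeroes are never stored; positions
 * that would hold one are simply skipped. `out` must hold n_in * up
 * samples. Scaling `h` so its coefficients sum to `up` restores the
 * level lost to the inserted zeroes. */
void upsample_fir(const float *in, size_t n_in, size_t up,
                  const float *h, size_t taps, float *out)
{
    size_t n_out = n_in * up;
    for (size_t n = 0; n < n_out; ++n) {
        float acc = 0.0f;
        for (size_t k = 0; k < taps && k <= n; ++k) {
            size_t m = n - k;       /* index into the zero-stuffed signal */
            if (m % up == 0)        /* all other positions are stuffed zeroes */
                acc += h[k] * in[m / up];
        }
        out[n] = acc;
    }
}
```

With `up = 2` and the three-tap kernel {0.5, 1.0, 0.5} (sum 2), this reproduces plain linear interpolation delayed by one output sample. A longer, steeper kernel is where the quality gain would come from.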

Oops… If I go on like this, I might as well code the converter myself!
Maybe I will.

//David Olofson — Programmer, Reologica Instruments AB

On Tuesday 24 July 2001 15:54, Ryan C. Gordon wrote:

Actually, I lied. I want someone to write a better sample rate
converter. But that wouldn’t require an API change.

One that works with multi channel audio data would be nice.
Not hard to do unless you want it to do sample interpolation.
I could probably knock one up tonight.

Sample interpolation would also be nice.

Mostly, I’m looking for something that converts between 8KHz and 11KHz
samples correctly.

Christopher_Purnell · July 24, 2001, 6:01am

In article <01072414090204.00786 at cutangle.admeo.se>, David Olofson wrote:
On Tuesday 24 July 2001 12:42, Christopher Purnell wrote:

One that works with multi channel audio data would be nice.

I’m not sure what you mean here; stereo->mono and vice versa, or 4
channel output and stuff?

The existing audio conversion routines are correct for mono only.
So if you are converting to stereo from stereo with frequency
change, for example, I’d expect cross contamination between
the left and right channels. Although I’ve not yet tested it.
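
To illustrate what a channel-aware converter has to do (this is not a description of SDL's actual routines), here is a sketch that doubles the rate of interleaved audio while interpolating only within each channel. The function name and the choice to repeat the last frame are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Double the sample rate of interleaved audio with `channels` channels.
 * Neighbours are looked up a whole frame apart, so left is only ever
 * averaged with left and right with right. `in` holds `frames` frames,
 * `out` must hold 2 * frames frames. The last frame has no successor,
 * so it is simply repeated. */
void double_rate_interleaved(const int16_t *in, size_t frames,
                             size_t channels, int16_t *out)
{
    for (size_t f = 0; f < frames; ++f) {
        size_t next = (f + 1 < frames) ? f + 1 : f;
        for (size_t c = 0; c < channels; ++c) {
            int32_t a = in[f * channels + c];
            int32_t b = in[next * channels + c];
            out[(2 * f) * channels + c]     = (int16_t)a;
            out[(2 * f + 1) * channels + c] = (int16_t)((a + b) / 2);
        }
    }
}
```

A converter that steps through the same buffer one sample at a time, as if it were mono, would average a left sample with the right sample next to it, which is the cross contamination described above.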


Christopher_Purnell · July 24, 2001, 6:16am

In article <mailman.995979006.9927.sdl at libsdl.org>, David Olofson wrote:
On Tuesday 24 July 2001 15:54, Ryan C. Gordon wrote:

Mostly, I’m looking for something that converts between 8KHz and 11KHz
samples correctly.

It’s fairly easy to handle other integer input/output relations than 1:x
and x:1.

The existing SDL_RateSLOW() should be doing 8KHz to 11KHz.


Jordan_Wilberding · July 24, 2001, 6:30am

How about support for VQF?

-Jordan Wilberding

David_Olofson · July 24, 2001, 6:40am

In article <01072414090204.00786 at cutangle.admeo.se>, David Olofson wrote:
On Tuesday 24 July 2001 15:06, Christopher Purnell wrote:

On Tuesday 24 July 2001 12:42, Christopher Purnell wrote:

One that works with multi channel audio data would be nice.

I’m not sure what you mean here; stereo->mono and vice versa, or 4
channel output and stuff?

The existing audio conversion routines are correct for mono only.
So if you are converting to stereo from stereo with frequency
change, for example, I’d expect cross contamination between
the left and right channels. Although I’ve not yet tested it.

Ok. (I guess it would drop the right channel, alternating the resulting
mono samples over the two output channels when resampling 2:1, but it
should work for 1:2 and the like if there’s no interpolation…)

//David Olofson — Programmer, Reologica Instruments AB


David_Olofson · July 24, 2001, 6:43am

In article <mailman.995979006.9927.sdl at libsdl.org>, David Olofson wrote:

Mostly, I’m looking for something that converts between 8KHz and
11KHz samples correctly.

It’s fairly easy to handle other integer input/output relations than
1:x and x:1.

The existing SDL_RateSLOW() should be doing 8KHz to 11KHz.

That’s all? No arbitrary sample rates? Anything better than pretending
that 8 kHz data is either 5 kHz or 11 kHz would be a significant
improvement, right…?

I’ll think about it… Must fix some boring bugs now, though.

//David Olofson — Programmer, Reologica Instruments AB

On Tuesday 24 July 2001 15:18, Christopher Purnell wrote:

On Tuesday 24 July 2001 15:54, Ryan C. Gordon wrote:

Christopher_Purnell · July 24, 2001, 7:21am

In article <01072415501407.00786 at cutangle.admeo.se>, David Olofson wrote:
On Tuesday 24 July 2001 15:18, Christopher Purnell wrote:

The existing SDL_RateSLOW() should be doing 8KHz to 11KHz.

That’s all? No arbitrary sample rates? Anything better than pretending
that 8 kHz data is either 5 kHz or 11 kHz would be a significant
improvement, right…?

SDL_RateSLOW() is an arbitrary sample rate filter.

It uses the multiply/divide by 2 filters to get as close as possible
and then uses the arbitrary sample rate filter if it is not within 100 Hz
of the desired sample rate.

Apart from the case where the sample rates are a single factor of two
apart, it would probably be faster to just use the arbitrary sample
rate filter.


Teunis_Peters · July 24, 2001, 2:24pm

Not hard to do unless you want it to do sample interpolation.

Interpolation is easy. Doing it fast is hard. Doing something that sounds
really good as well (ie not 2 or 3 point interpolation) is real hard…

For a single, fixed ratio, one of the best solutions seems to be a
windowed sinc. I haven’t seen any optimized source code, though.

I’m -good- at interpolation… Pray tell, how does windowed sinc work?
hrm… have to think about this… (I’ve mostly worked with graphics
interpolation though)

G’day, eh?
- Teunis

On Tue, 24 Jul 2001, David Olofson wrote:

On Tuesday 24 July 2001 12:42, Christopher Purnell wrote:

ANOQ_of_the_Sun · July 25, 2001, 8:17am

Sam Lantinga wrote:

That’s essentially what SDL does with the DirectX audio driver,
except your callback is called when the driver is ready for you
to mix in new audio, instead of you having to query and mix ahead
yourself.

But won’t it still give latency then? Or are you saying that I
can just use the DirectX audiodriver to get rid of the latency?
And how about multiple sounds at the same time?

Cheers–
http://www.HardcoreProcessing.com

David_Olofson · July 25, 2001, 8:33am

I’m not sure about the details, but it’s basically a variant of
bandlimited resampling, AFAIK. That is, rather than trying to perform a
high order interpolation, the signal is fed through a steep FIR filter
that removes any frequencies that won’t fit in the target sample rate.
(Otherwise those frequencies would be mirrored over the new Nyquist
frequency. This phenomenon is what’s responsible for most of the
distortion in simple interpolating resampling algorithms.) IIRC, a sort
of FIR filter style construct is also used for the actual resampling.
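
The mirroring can be made concrete with a few lines of C. This is only an illustration of where an unfiltered tone lands; the function name is made up and frequencies are taken as whole Hz to keep it simple.

```c
/* Frequency (in Hz) at which a pure tone of f_hz shows up after being
 * sampled at rate_hz without any band-limiting filter. Frequencies are
 * first folded into [0, rate_hz), then mirrored over the Nyquist
 * frequency rate_hz / 2. rate_hz must be non-zero. */
unsigned alias_frequency(unsigned f_hz, unsigned rate_hz)
{
    unsigned r = f_hz % rate_hz;
    return (r > rate_hz / 2) ? rate_hz - r : r;
}
```

For example, a 5000 Hz component that survives into an 8000 Hz stream turns into a 3000 Hz tone that was never in the original, which is why the filter has to remove it before the rate is reduced.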

As for the terminology; “windowed” refers to the bandlimiting filter, and
"sinc" refers to the maths behind the filter coefficients.

//David Olofson — Programmer, Reologica Instruments AB

On Tuesday 24 July 2001 23:16, winterlion wrote:

On Tue, 24 Jul 2001, David Olofson wrote:

On Tuesday 24 July 2001 12:42, Christopher Purnell wrote:

Not hard to do unless you want it to do sample interpolation.

Interpolation is easy. Doing it fast is hard. Doing something that
sounds really good as well (ie not 2 or 3 point interpolation) is
real hard…

For a single, fixed ratio, one of the best solutions seems to be a
windowed sinc. I haven’t seen any optimized source code, though.

I’m -good- at interpolation… Pray tell, how does windowed sinc work?
hrm… have to think about this… (I’ve mostly worked with graphics
interpolation though)

David_Olofson · July 25, 2001, 8:35am

BTW, BeOS uses a pretty fast windowed sinc resampling filter. Not (yet)
Open Source, AFAIK…

//David Olofson — Programmer, Reologica Instruments AB

On Tuesday 24 July 2001 23:16, winterlion wrote:

windowed sinc

David_Olofson · July 25, 2001, 9:17am

Sam Lantinga wrote:

That’s essentially what SDL does with the DirectX audio driver,
except your callback is called when the driver is ready for you
to mix in new audio, instead of you having to query and mix ahead
yourself.

But won’t it still give latency then?

Well, yes, because SDL writes some time ahead of the output position to
avoid drop-outs. Essentially, it’s just a different way of communicating
with the driver. (BTW, ALSA uses that method only nowadays; the library
is actually acting as a wrapper when you’re doing read/write style I/O.)

However, if you deal with every sound event directly, mixing them into
the output buffer individually as soon as they occur, the situation
changes. The problem is that it’s not possible to do so without an
entirely different API.

As to latency, the method I’ve mentioned earlier basically works by
starting the mixing of a new sound right away, right at the output
position (virtually zero latency), and then mixing on, running away from
the pointer, until there’s a sufficient distance in time between the
output position and the mix position for that sound. Then mixing resumes
to normal operation, mixing ahead by some time (10-100 ms depending on OS
and driver performance).

Or are you saying that I
can just use the DirectX audiodriver to get rid of the latency?

You wish! Life would have been very easy for pro audio software coders if
that was even remotely true. (Guess why I’ve excluded Windows as a
viable target for anything like that…)

And how about multiple sounds at the same time?

No problem. Just run the algorithm described above in multiple instances;
one per sound. (You could optimize it some by gathering all voices that
are out of the “zero latency start” phase in a single, traditional
mixer.) The “normal” mixer must clear or overwrite the buffer as it goes,
while the low latency event mixers have to mix into the buffer,
preferably doing saturation clipping as they go.
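
The two kinds of mixer can be sketched like this, assuming 16 bit samples in a looping buffer. All function names are hypothetical; this is not an existing SDL interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp a 32 bit intermediate result to the 16 bit sample range. */
static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* "Normal" mixer: overwrites the region it is responsible for, which
 * also clears out whatever was played on the previous lap. */
void mix_overwrite(int16_t *ring, size_t ring_len, size_t pos,
                   const int16_t *src, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        ring[(pos + i) % ring_len] = src[i];
}

/* Low latency event mixer: adds into data that is already in the
 * buffer, with saturation instead of integer wrap-around. */
void mix_add_saturate(int16_t *ring, size_t ring_len, size_t pos,
                      const int16_t *src, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        size_t w = (pos + i) % ring_len;
        ring[w] = sat16((int32_t)ring[w] + src[i]);
    }
}
```

The order matters: an event mixer may only add into a region after the normal mixer has written it, otherwise the overwrite would wipe out the freshly started sound.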

//David Olofson — Programmer, Reologica Instruments AB

On Wednesday 25 July 2001 17:32, Anoq of the Sun wrote:

David_Olofson · July 25, 2001, 5:27pm

[…]

As for the terminology; “windowed” refers to the bandlimiting
filter, and “sinc” refers to the maths behind the filter
coefficients.

heh Must have been tired when I wrote this.

A “window” is a way of applying weights to an array of samples,
usually in order to avoid the nasty transient phenomena you get with
a rectangular window. (A rectangular window is just an array - all
samples have the same weight. Nice windows for this kind of stuff
are Hanning, Hamming and similar shapes, which basically look like
cos(x) + 1.0; x in [-π, π], or like a normal distribution curve.)
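
For reference, the usual textbook forms of these windows and of a windowed sinc low pass filter, written over the same range of x. The symbols f_c (cutoff as a fraction of the sample rate) and M (half the filter length) are introduced here for the formula and do not come from any particular implementation.

```latex
w_{\mathrm{Hann}}(x) = \tfrac{1}{2}\bigl(1 + \cos x\bigr), \qquad
w_{\mathrm{Hamming}}(x) = 0.54 + 0.46\cos x, \qquad x \in [-\pi, \pi]

h[n] = 2 f_c \,\operatorname{sinc}(2 f_c n)\; w\!\left(\frac{\pi n}{M}\right),
\qquad \operatorname{sinc}(t) = \frac{\sin(\pi t)}{\pi t}, \quad
\operatorname{sinc}(0) = 1, \quad n = -M, \dots, M
```

The sinc term is the impulse response of an ideal low pass filter, which is infinitely long; multiplying by the window is what cuts it down to 2M + 1 coefficients without the abrupt edges of a rectangular cut.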

//David

On Wednesday 25 July 2001 17:40, David Olofson wrote: