OT: How do you transpose the pitch of a WAV/sample?

Hi, I am using SDL for my audio output. What I’m struggling to find out is how to transpose the pitch of sample data. Being new to audio processing, I may be searching using the wrong words.

Suppose I have a buffer of samples of an instrument recorded playing the note C (e.g. 523.25 Hz). How would I transpose that sound up to D (587.33 Hz)?

Sorry I posted about this here, but I do know that many of the people who frequent this forum know how to do such things.

Can anybody recommend a good forum for discussing audio processing algorithms such as this too, please?

Thanks
Sparky

You’re looking for pitch shifting. Try google.

This looks promising:
http://www.dspdimension.com/admin/time-pitch-overview/

Kenneth Bull wrote:

You’re looking for pitch shifting. Try google.

Thanks Ken. I spent a good couple of hours Googling last night and this morning.

I checked out the site you referred to before posting here, and it states that pitch shifting (i.e. stretching) is not the same as pitch transposition. If I have 2 seconds’ worth of sample data for the instrument playing a C, what I want is to play it so it sounds like a D, and it doesn’t matter that the duration of the sound drops below 2 seconds.

I’m not after a ‘true’ transposition as such. What I’m after is what the old Amiga MOD trackers used to do. I think they played samples back at different rates by making the audio chip itself play the sample data at a higher frequency. That isn’t an option for me, as I have a fixed playback rate of 44.1 kHz and I want to mix the resulting sample data into my main output buffer.

The only way I can think of doing this myself (i.e. reinventing the wheel) is to take my sample data and build a new buffer by reading it at intervals of the new frequency divided by the frequency it claims to be (e.g. 587.33 Hz for D over 523.25 Hz for C).

Does this sound feasible to anyone (no pun intended)?

That’s audio resampling. The simplest way is to take samples at integer
positions, so you fill your new table with sample[0], sample[1], sample[3], …

The problem with this is that you get aliasing. A simple interpolation will
help to get better results (on the Amiga this was done by a hardware
filter).
It goes as follows:

  • Compute the resampling ratio = (587.33 / 523.25)

for (int i = 0; i <= sampleLength; i+= ratio)
{
newsample[k] = interpolate(sample[int(i)],sample[int(i)+1],i%1);
k++;
}

interpolate(a,b,r) does something like return a*(1-r) + b*r.
There are better ways to interpolate over more samples to get
better quality, but this should be OK in most cases.

Adrien / PulkoMandy
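
A minimal sketch of the interpolating loop described above, written as plain C. It assumes float samples and a step of ratio = new frequency / original frequency. The names resample_linear and out_cap, and the use of a double position counter, are illustrative choices and not something from this thread or from SDL.

```c
#include <assert.h>
#include <stddef.h>

/* Linear interpolation: returns a when r == 0 and b when r == 1. */
static float interpolate(float a, float b, float r)
{
    return a * (1.0f - r) + b * r;
}

/* Resamples `in` (in_len samples) into `out`, reading the input at steps of
   `ratio` samples. Returns the number of samples written (at most out_cap).
   A ratio above 1 raises the pitch and shortens the sound. */
size_t resample_linear(const float *in, size_t in_len, double ratio,
                       float *out, size_t out_cap)
{
    assert(ratio > 0.0);
    size_t k = 0;
    for (double pos = 0.0; k < out_cap; pos += ratio) {
        size_t i = (size_t)pos;                 /* integer part of the position */
        if (i + 1 >= in_len)                    /* both in[i] and in[i + 1] are needed */
            break;
        float frac = (float)(pos - (double)i);  /* fractional part, 0 <= frac < 1 */
        out[k++] = interpolate(in[i], in[i + 1], frac);
    }
    return k;
}
```

With ratio = 587.33 / 523.25 the output comes out about 11% shorter than the input. The sketch stops as soon as in[i + 1] would fall outside the buffer, so the very last input sample may be dropped.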

You could resample it as you mentioned, interpolate it as Adrien
mentions, or use pitch shifting and time stretching together to
increase both frequency and speed.

I’d try resampling first, then move on if it sounds like crap.

Thanks Adrien! Where’s your source of knowledge? Do you know all of this from experience, or do you have a good book or website you could recommend? It’s great to know I’m on the start of the right track! :)

I had to write an interpolating resampler like this for the Haiku
operating system. I spent some time in school studying this kind of
thing, but my overall impression is that looking at C code gets you to
understand it much faster than looking for the mathematical explanation.

Adrien.

newsample[k] = interpolate(sample[int(i)],sample[int(i)+1],i%1);

Hmm? C’s not really my language, but that last parameter doesn’t quite
look right. Wouldn’t that always evaluate to 0?

That may not actually be C. i is a floating point number and this is
meant to get the fractional part of it. Kind of like a floating
modulus :)

OK, that makes a lot more sense if i is a floating point number, but then
why did you declare it as an int?

Oops. It’s a float, obviously, or this won’t give the expected result.
I was writing too fast.

Sorry :)

Mason Wheeler wrote:

newsample[k] = interpolate(sample[int(i)],sample[int(i)+1],i%1);

Hmm? C’s not really my language, but that last parameter doesn’t quite
look right. Wouldn’t that always evaluate to 0?

Yes, it would be zero every time :) I’m not able to experiment with this at the moment (at work), so if you’re able to validate that, Adrien, that would be great. Thanks.

It would be like this, wouldn’t it?

Code:

for (float i = 0; i <= sampleLength; i += ratio)
{
    newsample[k] = interpolate(sample[int(i)], sample[int(i) + 1], i - (int) i);
    k++;
}

or

unsigned k = 0;
for (float i = 0; i < sampleLength; i += ratio) {
    newsample[k++] = interpolate(
        sample[(int) i], sample[(int) i + 1],
        i - floor(i));
}

Wouldn’t it make more sense to iterate over the indexes to newsample?

for (int i = 0; i < newsampleLength; ++i) {
    float p = i * ratio;
    newsample[i] = interpolate(
        sample[(int) p], sample[(int) p + 1],
        p - floor(p));
}

Of course that’s only valid if 0 < ratio <= 2, and you may need to watch
out for a buffer overrun.


Rainer Deyke - rainerd at eldwood.com

Rainer Deyke wrote:

Wouldn’t it make more sense to iterate over the indexes to newsample?

I wouldn’t have thought so. The resulting sample would be shorter than the original in some cases and longer in others, depending on whether the sample is transposed up or down. Using the original length makes sense to me. Yeah?

libsamplerate does sample rate conversion, if you don’t mind the GPL license.

Patrick

SDL mailing list
SDL at lists.libsdl.org
http://lists.libsdl.org/listinfo.cgi/sdl-libsdl.org

No, your reasoning makes no sense. You want to generate a new chunk
consisting of N samples (where N = size of original chunk / frequency
ratio, with frequency ratio = new frequency / original frequency), so you
make a loop that iterates N times, generating one sample for each
iteration. That’s what the code to which I was responding does; I just
made it more explicit.

The opposite - iterating over the original samples - is also possible,
but it would look completely different from the code to which I was
responding. Something like this:

/* new_samples must be initialized to zeros. */
float ratio = (float) org_samples_length / new_samples_length;
for (int i = 0; i < org_samples_length; ++i) {
    float new_pos = i / ratio;
    int j = (int) new_pos;
    float fractional = new_pos - j;
    /* Divide by ratio so the overall gain stays at 1: about `ratio`
       input samples land on each output sample. */
    new_samples[j] += org_samples[i] * (1.0f - fractional) / ratio;
    if (j + 1 < new_samples_length)  /* avoid writing past the end */
        new_samples[j + 1] += org_samples[i] * fractional / ratio;
}

Rainer Deyke - rainerd at eldwood.com
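
For comparison, the first variant (one loop iteration per output sample) can be sketched in C as below. The assumptions are float samples, ratio = new frequency / original frequency, and an output length that counts every position i * ratio that still lands inside the input. The function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Number of output samples when the input is read at steps of `ratio`.
   The last usable position is in_len - 1, so N = floor((in_len - 1) / ratio) + 1. */
size_t resampled_length(size_t in_len, double ratio)
{
    assert(ratio > 0.0);
    if (in_len == 0)
        return 0;
    return (size_t)((double)(in_len - 1) / ratio) + 1;
}

/* Fills out[0 .. N-1] and returns N. `out` must have room for
   resampled_length(in_len, ratio) samples. */
size_t resample_by_output_index(const float *in, size_t in_len,
                                double ratio, float *out)
{
    size_t n = resampled_length(in_len, ratio);
    for (size_t i = 0; i < n; ++i) {
        double p = (double)i * ratio;        /* position in the input */
        size_t j = (size_t)p;                /* left-hand neighbour */
        float frac = (float)(p - (double)j);
        /* Clamp the right-hand neighbour so the last sample never reads past the end. */
        size_t j1 = (j + 1 < in_len) ? j + 1 : j;
        out[i] = in[j] * (1.0f - frac) + in[j1] * frac;
    }
    return n;
}
```

Computing each position as i * ratio, instead of repeatedly adding ratio to a float, keeps rounding errors from accumulating over long samples, and knowing N up front makes it easy to allocate the output buffer.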

How’s this?

struct SampleSet {
    sample* data;       ///< The samples.
    unsigned count;     ///< The number of samples in \a data.
    unsigned capacity;  ///< The maximum number of samples \a data can hold.
    double rate;        ///< The sample rate in samples / second.
};

template <typename T> T min(T x, T y) { return (x>y)? y: x; }
template <typename T> T max(T x, T y) { return (y>x)? y: x; }

int resample(
    SampleSet* out,
    const SampleSet* in
) {
    if (out->rate == in->rate) {
        // Same rate: copy as much as fits.
        out->count = min(out->capacity, in->count);
        memcpy(out->data, in->data, out->count*sizeof(sample));
        return 0;
    }

    unsigned oc = 0;    // output counter
    unsigned ic = 0;    // input counter
    double oe = 0.0;    // output error
    double ie = 0.0;    // input error
    double od = out->rate / in->rate;
    double id = in->rate / out->rate;

    out->data[0] = 0;
    while (1) {
        if ((1.0 - oe) * in->rate > (1.0 - ie) * out->rate) {
            out->data[oc] = in->data[ic] * (1.0 - oe) * ie * od;
            ie += (1 - oe) * id;
            oe = 1.0;
        }
        else {
            out->data[oc] = in->data[ic] * (1.0 - ie) * oe * id;
            oe += (1 - ie) * od;
            ie = 0.0;
        }

        if (oe == 1.0) {
            ++oc;
            if (oc == out->capacity) {
                out->count = oc;
                return 1;
            }
            out->data[oc] = 0;
        }

        if (ie == 1.0) {
            ++ic;
            if (ic == in->count) {
                out->count = oc + 1;
                return 0;
            }
        }
    }
}

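Forgot a couple of things:

#include <cstring>  // memcpy

typedef float sample;  // assumption: the sample type is never defined in this thread

struct SampleSet {
    sample* data;       ///< The samples.
    unsigned count;     ///< The number of samples in \a data.
    unsigned capacity;  ///< The maximum number of samples \a data can hold.
    double rate;        ///< The sample rate in samples / second.
};

template <typename T> T min(T x, T y) { return (x > y) ? y : x; }
template <typename T> T max(T x, T y) { return (y > x) ? y : x; }

/// Returns 1 if \a out filled up before the input ran out, 0 otherwise.
int resample(
    SampleSet* out,
    const SampleSet* in
) {
    if (out->capacity == 0 || in->count == 0) {
        out->count = 0;
        return in->count != 0;
    }
    if (out->rate == in->rate) {
        // Same rate: copy as much as fits.
        out->count = min(out->capacity, in->count);
        memcpy(out->data, in->data, out->count * sizeof(sample));
        return out->count < in->count;
    }

    unsigned oc = 0;   // output counter
    unsigned ic = 0;   // input counter
    double oe = 0.0;   // fraction of the current output sample already filled
    double ie = 0.0;   // fraction of the current input sample already used
    double od = out->rate / in->rate;  // output samples per input sample
    double id = in->rate / out->rate;  // input samples per output sample

    out->data[0] = 0;
    while (1) {
        if ((1.0 - oe) * in->rate < (1.0 - ie) * out->rate) {
            // The output sample fills up first: take what it still needs.
            out->data[oc] += in->data[ic] * (1.0 - oe);
            ie += (1.0 - oe) * id;
            oe = 1.0;
        }
        else {
            // The input sample runs out first: add all that is left of it.
            out->data[oc] += in->data[ic] * (1.0 - ie) * od;
            oe += (1.0 - ie) * od;
            ie = 1.0;
        }

        if (oe >= 1.0) {
            ++oc;
            if (oc == out->capacity) {
                out->count = oc;
                return 1;
            }
            oe = 0.0;
            out->data[oc] = 0;
        }

        if (ie >= 1.0) {
            ++ic;
            if (ic == in->count) {
                // Count the last output sample only if something went into it.
                out->count = oc + (oe > 0.0 ? 1 : 0);
                return 0;
            }
            ie = 0.0;
        }
    }
}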