Just can't get it done: resample sound

Johannes_Bauer · April 3, 2004, 11:48am

Hi folks,

A while ago I posted a message here asking for some help with
resampling a sound sample in order to increase/decrease the pitch. I
tried it back then and just couldn’t get it done. I tried again
just now and, once again, the result is all messed up. This is
totally driving me nuts. Here’s what I have:

// Function expects a 16-bit Stereo Sample (4 Byte/Sample)
Mix_Chunk* CSound::Change_Pitch(Mix_Chunk *Input_Sample, float Factor) {
Mix_Chunk *Sample_Modified;

if (Input_Sample->allocated!=1) {
    throw CGenericException(std::string("CSound::Change_Pitch"),std::string("Input sample not allocated."));
}
Sample_Modified=(Mix_Chunk*)malloc(sizeof(Mix_Chunk));
Sample_Modified->allocated=1; // Not yet, but will soon be
Sample_Modified->alen=(Uint32)((Input_Sample->alen/4*Factor)*4);
// alen must be divisible by 4

Sample_Modified->abuf=(Uint8*)malloc(Sample_Modified->alen*sizeof(Uint8));
Sample_Modified->volume=Input_Sample->volume;

for (Uint32 i=0;i<Sample_Modified->alen;i++) {    // Zero out destination sample just to be sure
    Sample_Modified->abuf[i]=0;
};

Uint32 Resample;
Uint16 Links, Rechts;
for (Uint32 i=0;i<Input_Sample->alen/4;i+=4) {
    Links=(Input_Sample->abuf[i+0]*256)+Input_Sample->abuf[i+1];
    Rechts=(Input_Sample->abuf[i+2]*256)+Input_Sample->abuf[i+3];

    // printf("L/R: %d %d [%d %d %d %d]\n",Links,Rechts,Input_Sample->abuf[i+0],Input_Sample->abuf[i+1],Input_Sample->abuf[i+2],Input_Sample->abuf[i+3]);
    Resample=(Uint32)((float)i*Factor);
    Resample/=4; // To get the right offset
    Resample*=4;
Sample_Modified->abuf[Resample+0]=Input_Sample->abuf[i+0];
Sample_Modified->abuf[Resample+1]=Input_Sample->abuf[i+1];
Sample_Modified->abuf[Resample+2]=Input_Sample->abuf[i+2];
Sample_Modified->abuf[Resample+3]=Input_Sample->abuf[i+3];
}

// Debug output the modified sample
for (Uint32 i=0;i<Sample_Modified->alen/4;i+=4) {
Links=(Sample_Modified->abuf[i+0]*256)+Sample_Modified->abuf[i+1];
Rechts=(Sample_Modified->abuf[i+2]*256)+Sample_Modified->abuf[i+3];
printf("L/R: %d %d [%d %d %d %d]\n",Links,Rechts,Sample_Modified->abuf[i+0],Sample_Modified->abuf[i+1],Sample_Modified->abuf[i+2],Sample_Modified->abuf[i+3]);
}

return Sample_Modified;

}

It’s working a little bit (currently only supports Factor<1 and
16Bit/Stereo samples). This is what it does: I created an input sample
which has the left channel completely silenced (all 0s). When I activate
any of the printfs, it’s correctly displaying

L/R: 0 something (0 0 something something)

It seems as if everything was okay. But when I play the sample (say it
was called with 0.9):
Play faster (upsampled) sound on right channel
Play white noise of same length on right channel
Play faster (upsampled) sound on left channel
Play white noise of same length on left channel

How can that be? Left/Right channel data is interleaved, isn’t it? As I
mentioned, it’s totally driving me crazy, things like that make game
programming so little fun :-(((

I hope somebody can point out what I did wrong.

Greetings
Joe

William_Petiot · April 3, 2004, 3:28pm

You should try to think about your problem in another way…
try to fill destination array from the src array, not the opposite

this would be something like this :

// WARNING : this code is untested, just written on the fly
// not optimized at all, certainly full of bugs
// btw: resampling without interpolation or filtering will sound weird

#include <stdlib.h> // malloc, free

typedef unsigned long sample; // assuming 16 bits stereo, so one sample must be 32 bits wide (use Uint32 if long is 64 bits)

void resample(sample **ptr_dest,int *ptr_ndest, sample *src,int nsrc, float factor)
{
int ndest,i ;
sample *dest;
ndest = (int) (nsrc / factor); // new size
dest = (sample *)malloc(ndest * sizeof(sample)); // size in bytes, not in samples
for (i = 0; i < ndest; ++i) {
// filling all destination samples from source
dest[i] = src[(int)(i*factor)];
// note that we work in "sample" unit (1 sample is 4 bytes)
}
// filling caller parameters
*ptr_dest = dest;
*ptr_ndest = ndest;
}

then, call this with :

sample *my_sample;
int my_sample_size;

sample *my_resampled;
int my_resampled_size;
float factor = 0.9f; // pitch factor (example value)

// code to fill my_sample, load from disk etc
... 
resample(&my_resampled,&my_resampled_size,my_sample,my_sample_size,factor);
...
// use the sample
... 
free(my_resampled);

Hope it helps. This code is only for explanations, it’s suboptimal.
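
A self-contained variant of the same idea, as a minimal sketch rather than a tested implementation. `Frame` and `resample_nearest` are hypothetical names (not part of SDL or SDL_mixer), the samples are assumed to be signed 16-bit stereo, and the convention is the one used above: the destination has nsrc / factor frames, so a factor above 1 shortens the sound. No interpolation or filtering is done, so it will still sound rough.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One stereo frame: a left and a right signed 16-bit sample (assumed format).
struct Frame {
    std::int16_t left;
    std::int16_t right;
};

// Nearest-neighbour (zero order hold) resampling: every destination frame is
// copied from the source frame at index i * factor.
std::vector<Frame> resample_nearest(const std::vector<Frame>& src, float factor) {
    assert(factor > 0.0f);
    if (src.empty()) return {};
    const std::size_t ndest = static_cast<std::size_t>(src.size() / factor);
    std::vector<Frame> dest(ndest);
    for (std::size_t i = 0; i < ndest; ++i) {
        std::size_t j = static_cast<std::size_t>(i * factor);
        if (j >= src.size()) j = src.size() - 1;  // guard against float rounding
        dest[i] = src[j];                          // whole frames, never single bytes
    }
    return dest;
}
```

Because the loop copies whole frames, the left and right channels cannot get mixed up, whatever the factor is.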

William.

Johannes_Bauer · April 4, 2004, 8:02am

William Petiot wrote:

You should try to think about your problem in another way…
try to fill destination array from the src array, not the opposite

Hi William,

okay I tried what you said (the other way around) and there’s a definite improvement in my code. I also tried to implement what you said: linear approximation. Yet I do not seem to have a good enough understanding of how exactly to do this.

I’m currently plainly averaging the sample values and then writing them back.

The improvement: the channels are okay, the sample speed is just as expected. Yet, there’s a huge amount of noise coming with the sample (not so much that you couldn’t hear the original sample any more, but yet too much to use for gameplay).

Because of the progress that the code has made I think I’m close to a good solution. Would you be so kind and take another peek at the snippet below so I can make it sound good?

If this thing works I’m definitely going to send in an example to the libsdl-Website - this problem is really harder than I thought (and than it looks).

// Broken code follows. Works a little bit.
Mix_Chunk* CSound::Change_Pitch(Mix_Chunk *Input_Sample, float Factor) {

Mix_Chunk *Sample_Modified;

if (Input_Sample->allocated!=1) {
    throw CGenericException(std::string("CSound::Change_Pitch"),std::string("Input sample not allocated."));
}

Uint32 Number_Input_Samples=Input_Sample->alen/4;
Uint32 Number_Output_Samples=Uint32((float)Number_Input_Samples*Factor);
float Real_Factor = (float)Number_Output_Samples/(float)Number_Input_Samples;

Sample_Modified=(Mix_Chunk*)malloc(sizeof(Mix_Chunk));
Sample_Modified->allocated=1;            // Not yet, but will soon be
Sample_Modified->alen=Number_Output_Samples*4;

Sample_Modified->abuf=(Uint8*)malloc(Sample_Modified->alen*sizeof(Uint8));
Sample_Modified->volume=Input_Sample->volume;

Uint32 Begin, End;
Uint16 Left, Right;
Uint64 AverageL, AverageR;
Uint16 OutLeft, OutRight;
for (Uint32 i=0;i<Number_Output_Samples;i++) {
    Begin=(Uint32)((float)i/Real_Factor);
    End=(Uint32)((float)(i+1)/Real_Factor);
    AverageL=0;
    AverageR=0;
    // Destination i is the average of Begin-End of Input sample
    for (Uint32 j=Begin;j<=End;j++) {
        Left=(256*Input_Sample->abuf[(4*j)+1])+(Input_Sample->abuf[(4*j)+0]);
        Right=(256*Input_Sample->abuf[(4*j)+3])+(Input_Sample->abuf[(4*j)+2]);
        AverageL+=Left;
        AverageR+=Right;
    }
    OutLeft=Uint16((float)AverageL/(float)(End-Begin+1));
    OutRight=Uint16((float)AverageR/(float)(End-Begin+1));
    Sample_Modified->abuf[(4*i)+1]=OutLeft/256;
    Sample_Modified->abuf[(4*i)+0]=OutLeft%256;
    Sample_Modified->abuf[(4*i)+3]=OutRight/256;
    Sample_Modified->abuf[(4*i)+2]=OutRight%256;
}

return Sample_Modified;

}

Greetings
Joe

Jay_Cornwall · April 4, 2004, 9:25am

Johannes Bauer wrote:

The improvement: the channels are okay, the sample speed is just as
expected. Yet, there’s a huge amount of noise coming with the sample
(not so much that you couldn’t hear the original sample any more, but
yet too much to use for gameplay).

You might be interested to read this discussion I had on comp.dsp recently:
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&group=comp.dsp&selm=c3iuso%2430q%241%248302bc10%40news.demon.co.uk&rnum=1

And the associated resampling code I wrote to interface with the library:
http://cvs.evilprojects.net/cgi-bin/viewcvs.cgi/behold/src/rateconverter.cpp?rev=1.1&content-type=text/vnd.viewcvs-markup

The quality is far better than I could achieve with linear interpolation
or zero order hold.

--
Cheers,
Jay

http://www.evilrealms.net/ - Systems Administrator & Developer
http://www.imperial.ac.uk/ - 3rd year CS student
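
For comparison, the linear interpolation mentioned in this post can be sketched as follows. This is only an illustration under stated assumptions: interleaved signed 16-bit samples, a hypothetical function name `resample_linear`, and the convention that the output has nsrc / factor frames. It does no low-pass filtering, so raising the pitch can still alias; it is not the approach used in the code linked above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Linear-interpolation resampling of interleaved signed 16-bit audio.
// `channels` is 2 for stereo; a `factor` above 1 gives a shorter output.
std::vector<std::int16_t> resample_linear(const std::vector<std::int16_t>& src,
                                          int channels, double factor) {
    assert(channels > 0 && factor > 0.0);
    const std::size_t nsrc = src.size() / channels;          // frames in
    if (nsrc == 0) return {};
    const std::size_t ndest = static_cast<std::size_t>(nsrc / factor);
    std::vector<std::int16_t> dest(ndest * channels);
    for (std::size_t i = 0; i < ndest; ++i) {
        const double pos = i * factor;                        // fractional source position
        std::size_t j0 = static_cast<std::size_t>(pos);
        if (j0 >= nsrc) j0 = nsrc - 1;                        // guard against rounding
        const std::size_t j1 = (j0 + 1 < nsrc) ? j0 + 1 : j0; // clamp at the last frame
        const double t = pos - static_cast<double>(j0);       // weight of the next frame
        for (int c = 0; c < channels; ++c) {
            const double a = src[j0 * channels + c];
            const double b = src[j1 * channels + c];
            dest[i * channels + c] = static_cast<std::int16_t>(a + (b - a) * t);
        }
    }
    return dest;
}
```

Each output sample is a weighted mix of the two nearest input samples of the same channel, which removes the stair-step shape of zero order hold but leaves the spectrum unfiltered.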

William_Petiot · April 4, 2004, 9:39am

Hi Joe,

About the noise that is added to your sample: I’m not sure, but it sounds like a
"signed/unsigned" problem. You could try using signed arithmetic.

I guess your source sample format is signed 16 bits, so try using a
signed long int as the AverageL or AverageR accumulating buffer, and a
signed short for the Left and Right temporary vars, then do
something like:

// signed 32 bits (should be enough to calculate this average )
int AverageL,AverageR;
// calculate sums in AverageL and R from 16 bits signed values
//
OutLeft = (Uint16)(short)((float)AverageL/(End-Begin+1));
etc
Or preferably use signed value everywhere (in OutLeft also)
OutLeft = (short) (AverageL/(End-Begin+1));
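
To see why the unsigned version produces noise, here is a small sketch that averages two 16-bit samples both ways. The function names are made up for illustration. The unsigned variant treats the bit patterns as 0..65535, the way the Uint16/Uint64 code does; the signed variant follows the suggestion above.

```cpp
#include <cassert>
#include <cstdint>

// Average two samples as if they were unsigned 16-bit values (the buggy way):
// -1 is seen as 65535, so samples on either side of zero end up far apart.
std::int16_t average_unsigned(std::int16_t a, std::int16_t b) {
    const std::uint32_t sum = static_cast<std::uint16_t>(a) + static_cast<std::uint16_t>(b);
    return static_cast<std::int16_t>(static_cast<std::uint16_t>(sum / 2));
}

// Average two samples with signed arithmetic in a wider accumulator.
std::int16_t average_signed(std::int16_t a, std::int16_t b) {
    const std::int32_t sum = static_cast<std::int32_t>(a) + static_cast<std::int32_t>(b);
    return static_cast<std::int16_t>(sum / 2);
}
```

With the inputs -1 and 1 the signed average is 0, while the unsigned average is 32768, which reads back as -32768 in a signed 16-bit sample. Both variants agree as long as the two samples have the same sign, so the error shows up as full-scale spikes wherever the averaged samples straddle zero.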

If your architecture is x86 (or otherwise little endian) you could benefit a lot from
using buffers of short instead of calculating each 16-bit signed sample with
a formula like (sample = src[i] + src[i+1]*256).
If src is a short*, the i-th sample for left is src[2*i], and the right sample is
src[2*i+1], and that’s all.

If your architecture is not little endian, then convert all samples on loading
by swapping their bytes, and then forget about it.
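
One way to do that conversion once, sketched under the assumption that the byte buffer holds little-endian signed 16-bit samples (what SDL calls AUDIO_S16LSB; check the format your audio device was actually opened with). `decode_s16le` is a hypothetical helper name. It builds each sample from its two bytes explicitly, so it gives the same result on little-endian and big-endian hosts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Decode a byte buffer of little-endian signed 16-bit samples into int16_t values.
std::vector<std::int16_t> decode_s16le(const std::uint8_t* bytes, std::size_t nbytes) {
    assert(nbytes % 2 == 0);
    std::vector<std::int16_t> samples(nbytes / 2);
    for (std::size_t i = 0; i < samples.size(); ++i) {
        // Low byte first, high byte second.
        const std::uint16_t u =
            static_cast<std::uint16_t>(bytes[2 * i] | (bytes[2 * i + 1] << 8));
        samples[i] = static_cast<std::int16_t>(u);
    }
    return samples;
}
```

After decoding a stereo buffer, the left sample of frame i is samples[2*i] and the right one is samples[2*i+1].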

I believe audio programming is simpler if you always work with samples, not
bytes.

William.

Michel_Bardiaux · April 5, 2004, 2:37am

Jay Cornwall wrote:

Johannes Bauer wrote:

The improvement: the channels are okay, the sample speed is just as
expected. Yet, there’s a huge amount of noise coming with the sample
(not so much that you couldn’t hear the original sample any more, but
yet too much to use for gameplay).

You might be interested to read this discussion I had on comp.dsp recently:
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&group=comp.dsp&selm=c3iuso%2430q%241%248302bc10%40news.demon.co.uk&rnum=1

And the associated resampling code I wrote to interface with the library:
http://cvs.evilprojects.net/cgi-bin/viewcvs.cgi/behold/src/rateconverter.cpp?rev=1.1&content-type=text/vnd.viewcvs-markup

The quality is far better than I could achieve with linear interpolation
or zero order hold.

I don’t think it is possible to achieve resampling at acceptable quality
without some form of filtering. See how it is done in SOX and RATECONV.

--
Michel Bardiaux
Peaktime Belgium S.A. Bd. du Souverain, 191 B-1160 Bruxelles
Tel : +32 2 790.29.41

Christophe_Pallier · April 5, 2004, 3:18am

I don’t think it is possible to achieve resampling at acceptable
quality without some form of filtering. See how it is done in SOX and
RATECONV.

or see:

http://ccrma-www.stanford.edu/~jos/resample/

for the theory + a program

If your sounds are static (i.e. not generated while the game is playing),
it is a good idea to resample all the sound files once and for all.
Life is also easier if a given audio application uses a single sampling
rate for all the sounds.

Christophe Pallier

Christophe_Pallier · April 5, 2004, 3:41am

Oops,

Forget my previous email.

I missed that you wanted to modify the pitch.

What type of sound is it? Simple sinewaves, complex music, speech?
There are algorithms to modify the pitch of speech (e.g. PSOLA) but they
are complex…

Christophe Pallier


Johannes_Bauer · April 7, 2004, 6:19am

Hi Christophe, Jay, Michel and William,

thank you for all your help! I finally got it done.

As I mentioned, the first tip by William improved my code a lot, his
second tip then (signed/unsigned integers) completely solved it.
Therefore my poor man’s resampling is working now.

But I now also use the Secret Rabbit Code library for resampling in SDL,
which works perfectly and with surprisingly little CPU usage.

As I promised, I will soon release a demo which incorporates everything
necessary; I’ll then also leave a note at the libsdl website so
maybe other SDL newbies (like me) have their problems solved as quickly
and competently as I had.

Thank you again, very much!
Greetings
Joe

slouken · May 16, 2004, 5:48pm

But I now also use the Secret Rabbit Code library for resampling in SDL,
which works perfectly and with surprisingly little CPU usage.

As I promised, I will soon release a demo which incorporates everything
necessary; I’ll then also leave a note at the libsdl website so
maybe other SDL newbies (like me) have their problems solved as quickly
and competently as I had.

That would be great, please add this to the website, if you haven’t already.

Thanks!
-Sam Lantinga, Software Engineer, Blizzard Entertainment