SDL_voice API proposal: please comment

icculus · September 13, 2001, 11:58pm

…on a different note…

Here is a proposed API for that library I mentioned before. I’m
tentatively calling it “SDL_voice”. It’s an abstraction over decoding
various audio file formats. Think of it as the audio equivalent of
SDL_Image.

Once discussion on the API (er…what little of it there is) settles down,
I’ll write it, and get the initial decoders worked out.

The goal for a 1.0 release is to support decoding from at least all of the
file formats that SDL_mixer currently does.

Once SDL_voice is stable, I’ll be removing all the decoders from
SDL_mixer, and replacing them with a single dependency on SDL_voice. Part
of the benefit is the blurring of SDL_mixer’s current distinction between
"music" and “sound”, so that you can mix multiple MP3s as regular
channels, or use a VOC file for the music channel. Actually, the "music"
channel will mostly become a matter of backward compatibility, since all
the channels will be treated equally at that point.
(Actually…hhm…native midi music will probably have to be kept separate
on that channel, unless you want to use Timidity through SDL_voice to mix
it into a wave buffer…anyhow…that’s future stuff.)

Also, as a side tangent, SDL_voice gets its data through SDL_RWops. It
would be interesting to see a support library for more than just
file and memory RWops. I’m envisioning things like an SDL_RWops
implementation that reads from an http connection or (really wacky),
interfaces to XMMS plugins to expand the number of supported decoders.
It’s one to think about, but it’s only vaguely related to this API at the
moment.

Finally, this header file hasn’t been run through ANY compiler, so it
might have minor syntax errors at the moment. Bear with me.

Here it is. Please comment.

–ryan.

/*
 * SDL_voice -- An abstract sound format decoding API.
 * Copyright (C) 2001 Ryan C. Gordon.
 *
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this library; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 */

/**
 * The basic gist of SDL_Voice is that you use an SDL_RWops to get sound data
 * into this library, and SDL_Voice will take that data, in one of several
 * popular formats, and decode it into raw waveform data in the format of
 * your choice. This gives you a nice abstraction for getting sound into your
 * game or application; just feed it to SDL_Voice, and it will handle
 * decoding and converting, so you can just pass it to your SDL audio
 * callback (or whatever). Since it gets data from an SDL_RWops, you can get
 * the initial sound data from any number of sources: file, memory buffer,
 * network connection, etc.
 *
 * As the name implies, this library depends on SDL: Simple Directmedia Layer,
 * which is a powerful, free, and cross-platform multimedia library. It can
 * be found at http://www.libsdl.org/
 *
 * Support is in place or planned for the following sound formats:
 *   - .WAV     (Microsoft WAVfile RIFF data, internal.)
 *   - .VOC     (Creative Labs' Voice format, internal.)
 *   - .MP3     (MPEG-1 layer 3 support, via the SMPEG library.)
 *   - .MID     (MIDI music converted to Waveform data, via Timidity.)
 *   - .MOD     (MOD files, via MikMod.)
 *   - .OGG     (Ogg files, via Ogg Vorbis libraries.)
 *   - .RAWDATA (Raw sound data in any format, internal.)
 *   - .CDA     (CD audio read into a sound buffer, internal.)
 *
 *   (...and more to come...)
 *
 * Please see the file LICENSE in the source's root directory.
 *
 * This file written by Ryan C. Gordon. (icculus at clutteredmind.org)
 */

#ifndef INCLUDE_SDL_VOICE_H
#define INCLUDE_SDL_VOICE_H

#include "SDL.h"

#ifdef __cplusplus
extern "C" {
#endif

/* Stupid DLL stuff… */
#if (defined _MSC_VER)
#define EXPORT __declspec(dllexport)
#else
#define EXPORT
#endif

/*
 * These are flags that are used in a Voice_Sample (below) to show various
 * states.
 *
 * To use: "if (sample->flags & VOICE_SAMPLEFLAG_ERROR) { dosomething(); }"
 */
typedef enum VOICE_SAMPLEFLAGS
{
    VOICE_SAMPLEFLAG_NONE      = 0,        /* Null flag. */

    /* these are set at sample creation time... */
    VOICE_SAMPLEFLAG_NEEDSEEK  = 1,        /* SDL_RWops must be able to seek. */
    VOICE_SAMPLEFLAG_STREAMING = 1 << 1,   /* source is streaming (no EOF). */

    /* these are set during decoding... */
    VOICE_SAMPLEFLAG_EOF       = 1 << 29,  /* end of input stream. */
    VOICE_SAMPLEFLAG_ERROR     = 1 << 30,  /* unrecoverable error. */
    VOICE_SAMPLEFLAG_AGAIN     = 1 << 31   /* couldn't read without blocking. */
} Voice_SampleFlags;

/*
 * Each decoder sets up one of these structs, which can be retrieved via
 * the Voice_AvailableDecoders() function.
 */
typedef struct VOICE_DECODERINFO
{
    const char *extension;    /* Case sensitive standard file extension. */
    const char *description;  /* Human readable description of decoder. */
    const char *author;       /* "Name Of Author (email at emailhost.dom)" */
    const char *url;          /* URL specific to this decoder. */
} Voice_DecoderInfo;

/*
 * The Voice_Sample structure is the heart of SDL_Voice. This holds
 * information about a source of sound data as it is being decoded.
 * EVERY FIELD IN THIS IS READ-ONLY. Please use the API functions to
 * change them.
 */
typedef struct VOICE_SAMPLE
{
    void *opaque;                /* Internal use only. */
    Voice_DecoderInfo *decoder;  /* Decoder used for this sample. */
    SDL_AudioSpec desired;       /* Desired audio format for conversion. */
    SDL_AudioSpec actual;        /* Actual audio format of sample. */
    void *buffer;                /* Decoded sound data lands in here. */
    Uint32 buffer_size;          /* Current size of (buffer), in bytes. */
    Voice_SampleFlags flags;     /* Flags relating to this sample. */
} Voice_Sample;

/*
 * Just what it says: a x.y.z style version number...
 */
typedef struct VOICE_VERSION
{
    int major;
    int minor;
    int patch;
} Voice_Version;

/* functions and macros… */

#define VOICE_VER_MAJOR 0
#define VOICE_VER_MINOR 0
#define VOICE_VER_PATCH 1

#define VOICE_VERSION(x) { \
    (x)->major = VOICE_VER_MAJOR; \
    (x)->minor = VOICE_VER_MINOR; \
    (x)->patch = VOICE_VER_PATCH; \
}

/**
 * Get the version of SDL_Voice that is linked against your program. If you
 * are using a shared library (DLL) version of SDL_Voice, then it is possible
 * that it will be different than the version you compiled against.
 *
 * This is a real function; the macro VOICE_VERSION tells you what version
 * of SDL_Voice you compiled against:
 *
 *   Voice_Version compiled;
 *   Voice_Version linked;
 *
 *   VOICE_VERSION(&compiled);
 *   Voice_GetLinkedVersion(&linked);
 *   printf("We compiled against SDL_Voice version %d.%d.%d ...\n",
 *           compiled.major, compiled.minor, compiled.patch);
 *   printf("But we linked against SDL_Voice version %d.%d.%d.\n",
 *           linked.major, linked.minor, linked.patch);
 *
 * This function may be called safely at any time, even before Voice_Init().
 */
EXPORT void Voice_GetLinkedVersion(Voice_Version *ver);

/**
 * Initialize SDL_Voice. This must be called before any other SDL_Voice
 * function (except perhaps Voice_GetLinkedVersion()). You should call
 * SDL_Init() before calling this. Voice_Init() will attempt to call
 * SDL_Init(SDL_INIT_AUDIO), just in case. This is a safe behaviour, but it
 * may not configure SDL to your liking by itself.
 *
 *  @return nonzero on success, zero on error. Specifics of the error can be
 *          gleaned from Voice_GetLastError().
 */
EXPORT int Voice_Init(void);

/**
 * Shutdown SDL_Voice. This closes any SDL_RWops that were being used as
 * sound sources, and frees any resources in use by SDL_Voice.
 *
 * All Voice_Sample pointers you had prior to this call are INVALIDATED.
 *
 * Once successfully deinitialized, Voice_Init() can be called again to
 * restart the subsystem. All default API states are restored at this
 * point.
 *
 * You should call this BEFORE SDL_Quit(). This will NOT call SDL_Quit()
 * for you!
 *
 *  @return nonzero on success, zero on error. Specifics of the error can be
 *          gleaned from Voice_GetLastError(). If failure, state of SDL_Voice
 *          is undefined, and probably badly screwed up.
 */
EXPORT int Voice_Quit(void);

/**
 * Get a list of sound formats supported by this implementation of SDL_Voice.
 * This is for informational purposes only. Note that the extension listed is
 * merely convention: if we list "MP3", you can open an MPEG Audio layer 3
 * file with an extension of "XYZ", if you like. The file extensions are
 * informational, and only required as a hint to choosing the correct
 * decoder, since the sound data may not be coming from a file at all, thanks
 * to the abstraction that an SDL_RWops provides.
 *
 * The returned value is an array of pointers to Voice_DecoderInfo structures,
 * with a NULL entry to signify the end of the list:
 *
 *   const Voice_DecoderInfo **i;
 *
 *   for (i = Voice_AvailableDecoders(); *i != NULL; i++)
 *   {
 *       printf("Supported sound format: [%s], which is [%s].\n",
 *               (*i)->extension, (*i)->description);
 *       // ...and other fields...
 *   }
 *
 * The return values are pointers to static internal memory, and should
 * be considered READ ONLY, and never freed.
 *
 *  @return READ ONLY Null-terminated array of READ ONLY structures.
 */
EXPORT const Voice_DecoderInfo **Voice_AvailableDecoders(void);

/**
 * Get the last SDL_Voice error message as a null-terminated string.
 * This will be NULL if there's been no error since the last call to this
 * function. The pointer returned by this call points to an internal buffer.
 * Each thread has a unique error state associated with it, but each time
 * a new error message is set, it will overwrite the previous one associated
 * with that thread. It is safe to call this function at any time, even
 * before Voice_Init().
 *
 *  @return READ ONLY string of last error message.
 */
EXPORT const char *Voice_GetLastError(void);

/**
 * Start decoding a new sound sample. The data is read via an SDL_RWops
 * structure (see SDL_rwops.h in the SDL include directory), so it may be
 * coming from memory, disk, network stream, etc. The (ext) parameter is
 * merely a hint to determining the correct decoder; if you specify, for
 * example, "mp3" for an extension, and one of the decoders lists that
 * (case sensitive) as a handled extension, then that decoder is given
 * first shot at trying to claim the data for decoding. If none of the
 * extensions match (or the extension is NULL), then every decoder examines
 * the data to determine if it can handle it, until one accepts it.
 *
 * If no decoders can handle the data, a NULL value is returned, and a human
 * readable error message can be fetched from Voice_GetLastError().
 *
 * Optionally, a desired audio format can be specified. If the incoming data
 * is in a different format, SDL_Voice will convert it to the desired format
 * on the fly. Note that this can be an expensive operation, so it may be
 * wise to convert data before you need to play it back, if possible, or
 * make sure your data is initially in the format that you need it in.
 * If you don't want to convert the data, you can specify NULL for a desired
 * format. The incoming format of the data, preconversion, can be found
 * in the Voice_Sample structure.
 *
 * Note that the raw sound data "decoder" needs you to specify both the
 * extension "RAWDATA" and a "desired" format, or it will refuse to handle
 * the data.
 *
 * Finally, specify an initial buffer size; this is the number of bytes that
 * will be allocated to store each read from the sound buffer. The more you
 * can safely allocate, the more decoding can be done in one block, but the
 * more resources you have to use up, and the longer each decoding call will
 * take. Note that different data formats require more or less space to
 * store. This buffer can be resized via Voice_SetBufferSize() ...
 *
 * When you are done with this Voice_Sample pointer, you can dispose of it
 * via Voice_FreeSample().
 *
 *  @param rw SDL_RWops with sound data.
 *  @param ext File extension normally associated with a data format.
 *             Can usually be NULL.
 *  @param desired Format to convert sound data into. Can usually be NULL,
 *                 if you don't need conversion.
 *  @param bufferSize Initial size, in bytes, of the sample's decode buffer.
 *  @return Voice_Sample pointer, which is used as a handle to several other
 *          SDL_Voice APIs. NULL on error. If error, use
 *          Voice_GetLastError() to see what went wrong.
 */
EXPORT Voice_Sample *Voice_NewSample(SDL_RWops *rw, const char *ext,
                                     SDL_AudioSpec *desired,
                                     Uint32 bufferSize);

/**
 * Dispose of a Voice_Sample pointer that was returned from Voice_NewSample().
 * This will also close/dispose of the SDL_RWops that was used at creation
 * time, so there's no need to keep a reference to that around.
 * The Voice_Sample pointer is invalid after this call, and will almost
 * certainly result in a crash if you attempt to keep using it.
 *
 *  @param sample The Voice_Sample to delete.
 */
EXPORT void Voice_FreeSample(Voice_Sample *sample);

/**
 * Decode more of the sound data in a Voice_Sample. It will decode at most
 * sample->buffer_size bytes into sample->buffer in the desired format, and
 * return the number of decoded bytes.
 * If sample->buffer_size bytes could not be decoded, then please refer to
 * sample->flags to determine if this was an End-of-stream or error condition.
 *
 *  @param sample Do more decoding to this Voice_Sample.
 *  @return number of bytes decoded into sample->buffer. If it is less than
 *          sample->buffer_size, then you should check sample->flags to see
 *          what the current state of the sample is (EOF, error, read again).
 */
EXPORT Uint32 Voice_Decode(Voice_Sample *sample);

#ifdef __cplusplus
}
#endif

#endif /* !defined INCLUDE_SDL_VOICE_H */

/* end of SDL_voice.h … */
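
To make the Voice_Decode() contract a little more concrete, here is a minimal, self-contained sketch of what a caller's loop might look like. None of this is the real library: Mock_Sample, mock_decode(), drain_sample() and the MOCK_FLAG_* values are hypothetical stand-ins that only imitate the documented behaviour (return the number of bytes decoded, and on a short read look at the flags to tell EOF, error and "try again" apart).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the proposed flags (unsigned, so bit 31 is safe). */
#define MOCK_FLAG_EOF   (1u << 29)
#define MOCK_FLAG_ERROR (1u << 30)
#define MOCK_FLAG_AGAIN (1u << 31)

/* Hypothetical stand-in for Voice_Sample, fed from a block of memory. */
typedef struct {
    const unsigned char *src;   /* pretend "encoded" input */
    size_t src_len;
    size_t src_pos;
    unsigned char buffer[8];    /* decoded data lands in here */
    size_t buffer_size;         /* bytes of buffer[] in use per call */
    unsigned int flags;
} Mock_Sample;

/* Imitates Voice_Decode(): copy up to buffer_size bytes, flag EOF on a short read. */
static size_t mock_decode(Mock_Sample *s)
{
    size_t left = s->src_len - s->src_pos;
    size_t n = (left < s->buffer_size) ? left : s->buffer_size;

    memcpy(s->buffer, s->src + s->src_pos, n);
    s->src_pos += n;
    if (n < s->buffer_size)
        s->flags |= MOCK_FLAG_EOF;
    return n;
}

/* Decode everything into out[]; returns total bytes, or (size_t)-1 on failure. */
static size_t drain_sample(Mock_Sample *s, unsigned char *out, size_t out_cap)
{
    size_t total = 0;

    assert(s->buffer_size > 0 && s->buffer_size <= sizeof s->buffer);
    for (;;) {
        size_t n = mock_decode(s);

        if (n > out_cap - total)
            return (size_t)-1;          /* caller's buffer is too small */
        memcpy(out + total, s->buffer, n);
        total += n;

        if (n == s->buffer_size)
            continue;                   /* full buffer: nothing to check */
        if (s->flags & MOCK_FLAG_ERROR)
            return (size_t)-1;          /* unrecoverable */
        if (s->flags & MOCK_FLAG_EOF)
            return total;               /* clean end of stream */
        s->flags &= ~MOCK_FLAG_AGAIN;   /* would block: a real caller retries later */
    }
}
```

The point of the sketch is only the order of the checks: a full buffer needs no inspection, and a short one is ambiguous until the flags have been read.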

Gerry_Jo_Jellestad · September 14, 2001, 6:10am

/* Stupid DLL stuff… */
#if (defined _MSC_VER)
#define EXPORT __declspec(dllexport)
#else
#define EXPORT
#endif

It would be nice if this kinda stuff were in a SDL header, so that
SDL progs could just use the predefined EXPORT for libs… Just a
thought =)

const char *author; /* "Name Of Author (email at emailhost.dom)" */

Shouldn’t this be…
const char *author; /* "Name Of Author " */
--
Trick

Linux User #229006 * http://counter.li.org

Torbjorn_Andersson · September 14, 2001, 7:21am

This may be a premature - or even stupid - question, but would it then
be SDL_voice that converts all sound data into a common format (same
sample size, frequency, etc.) or would that job still fall on SDL’s
"audiocvt" filters?

The reason I’m asking is that while trying to add/fix AIFF support in
SDL_mixer (the parts of the AIFF specification I could easily
understand anyway), I found that SDL seems to convert sample frequency
by multiplying or dividing current frequency by two until it gets
close enough to the desired frequency. Clearly, there are cases where
this is not adequate.
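
A toy model of the behaviour described above shows where it falls short. This is not SDL's actual converter code; nearest_by_doubling() is a hypothetical helper that may only double or halve the source rate, and it returns whichever reachable rate lands closest to the target.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical model: starting from `have`, use only x2 or /2 steps and
   return the reachable sample rate that is closest to `want`. */
static long nearest_by_doubling(long have, long want)
{
    long best = have;
    long rate = have;

    assert(have > 0 && want > 0);

    /* climb by doubling until we pass the target (guarding against overflow) */
    while (rate < want && rate <= LONG_MAX / 2) {
        rate *= 2;
        if (labs(rate - want) < labs(best - want))
            best = rate;
    }

    /* descend by halving until we drop below the target */
    rate = have;
    while (rate > want) {
        rate /= 2;
        if (labs(rate - want) < labs(best - want))
            best = rate;
    }
    return best;
}
```

In this model 11025 Hz reaches 44100 Hz exactly, but 22050 Hz aimed at 48000 Hz gets no closer than 44100 Hz. If that output were then played back at 48000 Hz, it would run roughly 9% fast (48000 / 44100 ≈ 1.088).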

There is a filter in SDL for doing a more exact conversion, but it has
been commented out with a big scary comment about having to keep the
length of the buffer a power of 2. If that is true, perhaps it would
be easier to get it right at a higher level of abstraction?

(Some of the other filters look like they could possibly be improved,
at the cost of speed and simplicity, as well but I don’t really know
enough about the subject to be able to tell if the changes would be
noticeably for the better, or if they would in fact be for the worse.)

Torbjörn Andersson

Dominique_Louis · September 14, 2001, 7:32am

Is SDL_Voice in competition with HawkVoice (
http://www.hawksoft.com/hawkvoice/ ) or are these totally different beasts?

thanks,

Dominique
http://www.DelphiGamer/com := for all your Delphi/Kylix game development
needs;

Sam_Hart · September 14, 2001, 8:07am

Not speaking for Ryan Gordon, but just from looking at the hawkvoice site, I
would say that SDL_Voice would be different because of its ties with SDL.

Tying it to SDL means that the intent is to have it run on as many different
platforms as SDL does (HawkVoice only looks like Linux & Win32 right now) and
that it will integrate well with other SDL apps (speaking as someone who has
recently been trying to mix SDL with another massive library [for speech
synth], SDL integration is very important! ;-)

On Friday 14 September 2001 7:27am, Dominique Louis wrote:

Is SDL_Voice in competition with HawkVoice (
http://www.hawksoft.com/hawkvoice/ ) or are these totally different beasts?

–
Sam “Criswell” Hart <@Sam_Hart> AIM, Yahoo!:
Homepage: < http://www.geekcomix.com/snh/ >
PGP Info: < http://www.geekcomix.com/snh/contact/ >
Tux4Kids: < http://www.geekcomix.com/tux4kids/ >

slouken · September 14, 2001, 8:50am

/* Stupid DLL stuff… */
#if (defined _MSC_VER)
#define EXPORT __declspec(dllexport)
#else
#define EXPORT
#endif

It would be nice if this kinda stuff were in a SDL header, so that
SDL progs could just use the predefined EXPORT for libs… Just a
thought =)

It is, it’s in begin_code.h

See ya,
-Sam Lantinga, Software Engineer, Blizzard Entertainment

slouken · September 14, 2001, 8:54am

I suggest using SDL_sound instead of SDL_voice, since the library doesn’t
primarily handle voice communication.

Otherwise, looks good!

See ya,
-Sam Lantinga, Software Engineer, Blizzard Entertainment

Gerry_Jo_Jellestad · September 14, 2001, 12:15pm

It would be nice if this kinda stuff were in a SDL header, so
that SDL progs could just use the predefined EXPORT for
libs… Just a thought =)

It is, it’s in begin_code.h

Oh. I’ll go slap my head now.

Sorry, next time I’ll check things before I speak =) Now for coding
that game for the game contest… Hm.
--
Trick

Linux User #229006 * http://counter.li.org

David_Olofson · September 14, 2001, 1:13pm

[…]

(Actually…hhm…native midi music will probably have to be kept
separate on that channel, unless you want to use Timidity through
SDL_voice to mix it into a wave buffer…anyhow…that’s future stuff.)

Actually, you can mix MIDI tracks on the MIDI level, although it’s a
bit messy and unpredictable, at least if more than 15 different
instruments are needed.

However, even if it’s not very useful for “normal” MIDI files, it could
be interesting for games with MIDI tracks explicitly written to be mixed
this way.

A more interesting scenario would be if the MIDI sequencer and MIDI synth
functionality were separated, so that a game could use a custom synth
with the “SDL_music” (or whatever) MIDI sequencer - complex sound effects
could be programmed as MIDI sequences, and mixed by the sequencer. (Or
rather, played by individual instances of the sequencer, and mixed by a
MIDI mixer in the core.)

BTW, I “stole” some MIDI file code (.MID, SMI, .XMI) from the Exult
project (http://exult.sourceforge.net/). I intend to use it as a part of
something along the lines of what I described above, to be used in
Project Spitfire, and probably SKobo/“Kobo Deluxe”.

[…]

Looking good!

However - I could be missing something - but there’s one issue; real time
streaming. For example, if I’m going to play something from disk, I need
to know how and when I can safely call Voice_Decode() to get more data.
(If it reads directly from the file without a background "disk butler"
thread, using Voice_Decode() from within the audio callback would be
fatal, even if the current platform theoretically supports it.)

One detail I’m not sure about is whether it is to be expected that
Voice_Decode() returns non-full buffers from time to time, or if that’s
only in the case of errors or EOF. (It would make sense if it didn’t try
to decode half frames for some file formats, for example, or preferably,
if that was selectable.)
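
To make the partial-buffer question concrete, here is a minimal sketch of the loop an application would need if short returns turn out to be normal. Everything named here (DecodeFn, decode_fully, the toy decoder) is hypothetical and is not the proposed Voice_Decode() signature; it only assumes a decoder that returns the number of bytes it produced, with 0 meaning EOF or error.

```c
#include <stddef.h>

/* Hypothetical decoder callback (NOT the SDL_voice API): writes up to
   `len` bytes into `dst` and returns the number of bytes produced.
   A return of 0 means EOF or error. */
typedef size_t (*DecodeFn)(void *ctx, unsigned char *dst, size_t len);

/* Keep calling the decoder until `len` bytes are available or it stops.
   Returns the number of bytes actually written to `dst`. */
static size_t decode_fully(DecodeFn decode, void *ctx,
                           unsigned char *dst, size_t len)
{
    size_t total = 0;
    while (total < len) {
        size_t got = decode(ctx, dst + total, len - total);
        if (got == 0)
            break;              /* EOF or error: caller sees a short count */
        total += got;
    }
    return total;
}

/* Toy decoder for demonstration only: emits at most 3 bytes per call,
   standing in for a format that decodes whole frames at a time. */
typedef struct { size_t left; } ToyState;

static size_t toy_decode(void *ctx, unsigned char *dst, size_t len)
{
    ToyState *s = (ToyState *)ctx;
    size_t n = len < 3 ? len : 3;
    size_t i;
    if (n > s->left)
        n = s->left;
    for (i = 0; i < n; i++)
        dst[i] = 0x2A;
    s->left -= n;
    return n;
}
```

With a loop like this, a short count from decode_fully() can only mean EOF or error, which is the simpler contract for callers. It does nothing about blocking, so it would still not be safe inside the audio callback.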

An implementation that allows transparent use of Voice_Decode() from
within the audio callback would be more complex, but OTOH, that
complexity cannot be avoided - it’ll move into applications if the
library doesn’t handle it… (And it would be done wrong most of the
time, guaranteed - it’s not trivial stuff.)

If streams are assumed to be played at “normal” speed, it would be
possible to implement this transparently. The only issue would be that
preparing a file for playback takes some time. (Buffering must build up
before you start playing.) That could be handled by simply blocking in
the Voice_NewSample() call until we’re really ready for playback.

Implementation:
The audio thread is timing critical, whereas disk I/O is
very non-deterministic.

Disk I/O could be driven by a thread that's blocked most
of the time, either waiting for I/O completion, or sleeping
for some 50-100 ms.

To avoid the risk of timing interference between the disk
thread and audio thread/callback, lock-free FIFOs can be
used for passing pointers to buffers. The audio thread/
callback *mustn't* block anyway, and the disk thread isn't
expected to respond very quickly, so polling overhead
(ie when the disk thread wakes up only to find there's
nothing to do) should be insignificant, compared to the
overhead involved with using OS sync constructs for every
single buffer transaction.
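
As a rough illustration of the lock-free FIFO idea, a single-producer/single-consumer ring of buffer pointers might look like the following. It is a sketch under assumptions: it uses C11 `<stdatomic.h>` (which postdates this thread), the names BufferFifo, fifo_push and fifo_pop are made up for the example, and exactly one thread pushes while exactly one thread pops.

```c
#include <stdatomic.h>
#include <stddef.h>

#define FIFO_SLOTS 8   /* must be a power of two */

/* Single-producer/single-consumer FIFO of buffer pointers.
   Hypothetical names; not part of any proposed API. */
typedef struct {
    void *slot[FIFO_SLOTS];
    atomic_size_t head;   /* next slot to read; written only by the consumer */
    atomic_size_t tail;   /* next slot to write; written only by the producer */
} BufferFifo;

static void fifo_init(BufferFifo *f)
{
    atomic_init(&f->head, 0);
    atomic_init(&f->tail, 0);
}

/* Producer side (e.g. the disk thread). Returns 0 if the FIFO is full. */
static int fifo_push(BufferFifo *f, void *buf)
{
    size_t tail = atomic_load_explicit(&f->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&f->head, memory_order_acquire);
    if (tail - head == FIFO_SLOTS)
        return 0;
    f->slot[tail & (FIFO_SLOTS - 1)] = buf;
    /* Release: the slot write above becomes visible before the new tail. */
    atomic_store_explicit(&f->tail, tail + 1, memory_order_release);
    return 1;
}

/* Consumer side (e.g. the audio callback). Returns NULL if the FIFO is
   empty; never blocks and never calls into the OS. */
static void *fifo_pop(BufferFifo *f)
{
    size_t head = atomic_load_explicit(&f->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&f->tail, memory_order_acquire);
    void *buf;
    if (head == tail)
        return NULL;
    buf = f->slot[head & (FIFO_SLOTS - 1)];
    /* Release: the slot read above completes before the slot is freed. */
    atomic_store_explicit(&f->head, head + 1, memory_order_release);
    return buf;
}
```

In the scheme described above, the disk thread would push filled buffers and the audio callback would pop them; a second FIFO running in the opposite direction could hand empty buffers back for refilling. If fifo_pop() returns NULL in the callback, that is an underrun and the callback would have to output silence.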

There’s just one major issue: How to implement this on a platform without
threads?

//David Olofson — Programmer, Reologica Instruments AB

.- M A I A -------------------------------------------------.
| Multimedia Application Integration Architecture |
| A Free/Open Source Plugin API for Professional Multimedia |
`----------------------------> http://www.linuxdj.com/maia -'
.- David Olofson -------------------------------------------.
| Audio Hacker - Open Source Advocate - Singer - Songwriter |
`--------------------------------------> david at linuxdj.com -'

On Friday 14 September 2001 08:54, Ryan C. Gordon wrote:

icculus · September 14, 2001, 3:10pm

It would be nice if this kinda stuff were in a SDL header, so that
SDL progs could just use the predefined EXPORT for libs… Just a
thought =)

…plus, I’m sure that my lame EXPORT macro doesn’t cover all the
platforms that need something like that.
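
For illustration only, a slightly broader version of such a macro could look like the sketch below. The name VOICE_EXPORT and the example function are hypothetical, and the GCC visibility branch assumes a GCC 4 or later toolchain, which postdates this thread; SDL's own definition lives in begin_code.h and should be preferred where available.

```c
/* Hypothetical export macro; SDL provides its own in begin_code.h. */
#if defined(_WIN32) || defined(__CYGWIN__)
#  define VOICE_EXPORT __declspec(dllexport)
#elif defined(__GNUC__) && __GNUC__ >= 4
#  define VOICE_EXPORT __attribute__((visibility("default")))
#else
#  define VOICE_EXPORT   /* unknown toolchain: rely on default linkage */
#endif

/* Example of a symbol that should be visible to users of the library. */
VOICE_EXPORT int voice_example_version(void)
{
    return 1;
}
```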

Alternately, as far as Win32 goes, you can have a .DEF file that lists
symbols to export. Is there some sort of utility that can parse a source
file and create this DEF file by figuring out which symbols are static and
which ones are global? Obviously this limits the usefulness of the .DEF
(the point being that you can have global, file scope, and project scope
symbols), but it makes Linux compatibility a little cleaner.

I do have a perl script that does something similar by examining a Linux
binary for global symbols, but it needs to run (obviously) on Linux, not
Win32. Perhaps run it on a Linux build, and stick the output into CVS for
Win32 users to use.

Okay, I’m rambling.

Shouldn’t this be…
const char *author; /* "Name Of Author " */

Good point. Yes.

–ryan.

slouken · September 14, 2001, 3:14pm

It would be nice if this kinda stuff were in a SDL header, so that
SDL progs could just use the predefined EXPORT for libs… Just a
thought =)

…plus, I’m sure that my lame EXPORT macro doesn’t cover all the
platforms that need something like that.

Alternately, as far as Win32 goes, you can have a .DEF file that lists
symbols to export. Is there some sort of utility that can parse a source
file and create this DEF file by figuring out which symbols are static and
which ones are global? Obviously this limits the usefulness of the .DEF
(the point being that you can have global, file scope, and project scope
symbols), but it makes Linux compatibility a little cleaner.

Take a look at the SDL headers, begin_code.h, and the perl scripts which
generate the exports lists (src/main/*/exports)

See ya,
-Sam Lantinga, Software Engineer, Blizzard Entertainment

icculus · September 14, 2001, 3:15pm

This may be a premature - or even stupid - question, but would it then
be SDL_voice that converts all sound data into a common format (same
sample size, frequency, etc.) or would that job still fall on SDL’s
"audiocvt" filters?

This comes up a lot, actually.

In SDL_voice’s case, I was going to just use SDL’s audiocvt. Partially
because I’m lazy, and partially because it would be better to improve
SDL’s converter than to put it in an external library.

The quality of the SDL converter is usable, since most samples are stored
at sample rates that can be cleanly divided and multiplied by two. For
example, most samples are at 11KHz, or 22KHz, or 44KHz, which causes few
problems.

Try converting from 8KHz to 11KHz with SDL, though.

Other things, like interpolation, are missing from the sample rate
conversion, too.
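
To show what interpolation would add, here is a minimal sketch of a linear-interpolating resampler for mono signed 16-bit samples. It is not SDL's converter and not a proposal for how SDL should do it; the function name is made up, and linear interpolation is only the simplest option (it does no filtering, so it will still alias when downsampling).

```c
#include <stddef.h>

/* Resample mono signed 16-bit audio by linear interpolation.
   Hypothetical helper, not part of SDL. Returns the number of output
   samples written, which is at most out_cap. */
static size_t resample_linear(const short *in, size_t in_len, int in_rate,
                              short *out, size_t out_cap, int out_rate)
{
    size_t n;
    if (in_len == 0 || in_rate <= 0 || out_rate <= 0)
        return 0;
    for (n = 0; n < out_cap; n++) {
        /* Position of output sample n, measured in input samples. */
        double pos = (double)n * in_rate / out_rate;
        size_t i = (size_t)pos;
        double frac = pos - (double)i;
        short a, b;
        if (i >= in_len)
            break;                               /* ran out of input */
        a = in[i];
        b = (i + 1 < in_len) ? in[i + 1] : in[i]; /* hold the last sample */
        out[n] = (short)(a + frac * (b - a));
    }
    return n;
}
```

Because the read position is fractional, a ratio like 8000 to 11025 is handled the same way as a clean doubling; each output sample is a weighted mix of its two nearest input samples rather than a repeated or dropped one.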

I have been told that there are several good ways to fix this, but all of
those ways are patented.

Alternately, no one’s tried to fix it. But if someone’s got the know-how,
they really should.

–ryan.

icculus · September 14, 2001, 3:20pm

The basic gist of SDL_Voice is that you use an SDL_RWops to get sound data
into this library, and SDL_Voice will take that data, in one of several
popular formats, and decode it into raw waveform data in the format of
your choice. This gives you a nice abstraction for getting sound into your
game or application; just feed it to SDL_Voice, and it will handle

Is SDL_Voice in competition with HawkVoice (
http://www.hawksoft.com/hawkvoice/ ) or are these totally different beasts?

HHmmmm…maybe I should change the name to SDL_audiofile.

–ryan.

Torbjorn_Andersson · September 15, 2001, 4:36am

“Ryan C. Gordon” wrote:

Try converting from 8KHz to 11KHz with SDL, though.

Other things, like interpolation, are missing from the sample rate
conversion, too.

I have been told that there are several good ways to fix this, but all of
those ways are patented.

Alternately, no one’s tried to fix it. But if someone’s got the know-how,
they really should.

Would it be feasible to use SoX in any way, just like we use MikMod
and Timidity? I haven’t had the time to look at the code, but
according to their SourceForge page [1] it’s released under the LGPL
so at least there shouldn’t be any license incompatibilities.

I still worry about the "buffer length has to be a power of 2"
restriction though. Obviously that one would have to be worked around
somewhere…

Torbjörn Andersson

[1] http://sourceforge.net/projects/sox

icculus · September 15, 2001, 8:24am

HHmmmm…maybe I should change the name to SDL_audiofile.

SDL_sound is a good complement to SDL_image, IMHO…

I agree:

http://icculus.org/SDL_sound/

–ryan.