[slightly OT] Portable float packing/unpacking

a. compiler option like -fno-strict-aliasing
b. use a union
c. compiler-specific hack: __attribute__((__may_alias__))
d. use (char*) to access the memory

Also remember that none of these will help when the variable is
optimized away into the FPU stack (though volatile may help there,
I’m not sure). :)

-Sam Lantinga, Senior Software Engineer, Blizzard Entertainment

a. compiler option like -fno-strict-aliasing
b. use a union
c. compiler-specific hack: __attribute__((__may_alias__))
d. use (char*) to access the memory

Also remember that none of these will help when the variable is
optimized away into the FPU stack (though volatile may help there,
I’m not sure). :)

My understanding is that any of the four choices above will squash
aliasing problems and force the variable off the FPU on GCC. Casting to
char was a concession to all the piles of legacy code…of the four, the
GCC developers seem to encourage unions.

In the case of a serialize function, this might just make a copy of the
float onto the stack when shuffling it into the union and leave it on
the FPU too for further processing.

The volatile keyword should, I would think, disable any significant FPU
work on that variable, or force it to store out after each
operation…but I’m really not sure either, now that I think about it. :)

For what it’s worth, I’ve used serialize functions with considerably
less thought put into them across VS.NET 2002/2003/2005, Metrowerks C++,
and GCC on countless CPU architectures and OSes, and not had problems,
so it’s probably not worth losing too much sleep over it.

(Until your program is inexplicably crashing, of course!)

–ryan.

We can only hope there, and by the looks of it, there are plenty
of people to consult here.

;)

On 2/14/06, Ryan C. Gordon wrote:

(Until your program is inexplicably crashing, of course!)

and it may fail if the size and alignment of float32 and
uint32 are not the same.

By definition uint32 and float32 are the same size. That is the reason
for the 32 following the uint and float parts :-).

I guessed that too :)

Is the alignment of the two types even a question?

It is possible a processor such as x86 can support byte
aligned integers but only word aligned floats. In that case
aliasing via a cast from an int32* to a float32* might fail
when the result is dereferenced, if the C compiler packs
structs etc in a way optimised for space.

I generally try to write strictly conforming ISO C/C++ if
possible, and document when breaking that rule. This tends to
support better portability.

There is no guarantee this will work, for example:

struct X {
    char pad;
    int32 i;
};

void f() {
    struct X x;
    float32 *p = (float32*)(void*)&x.i;
    *p = 1.1; // BUS ERROR
}

It all depends, in this case, where the C compiler aligns i.
If you do it like this:

struct X {
char pad;
unsigned char i[4];
};

it will almost certainly fail on all current processors
because almost all C compilers will not add any padding after
’pad’, and almost all current processors require floats
to be properly aligned, with alignment > 1.
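A byte buffer like the one above can still carry a float32 if the value is copied in and out instead of being addressed through a float pointer. The following is only a sketch, assuming float is exactly 4 bytes; the helper names store_float and load_float are made up for illustration and are not from the thread:

```c
#include <string.h>

struct X {
    char pad;
    unsigned char i[4];   /* byte storage, alignment 1 */
};

/* Copy the bytes of f into the (possibly misaligned) buffer.
   Assumption: sizeof(float) == sizeof x->i == 4. */
void store_float(struct X *x, float f)
{
    memcpy(x->i, &f, sizeof x->i);
}

/* Copy the bytes back out into a properly aligned float. */
float load_float(const struct X *x)
{
    float f;
    memcpy(&f, x->i, sizeof x->i);
    return f;
}
```

Because the float object itself is always a normal, aligned local, the alignment of the byte array never matters.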

If you want to have a variable that is either a float32
or an int32 there is only one portable way to do it:

union num32 {
	float32 f;
	int32 i;
};

Using the union like this is still aliasing, however
use of the union ensures f and i are aligned to

maxalign(float32,int32)

Note you CAN assume int/unsigned int, long/unsigned long, etc
have the same alignment. Also all data pointers have the
same alignment.

Unfortunately this technique only works in C.
In C++, you cannot make unions of constructable types.
This is an inexcusable fault in ISO C++, because use of
a union as above is the ONLY portable way to alias store
in general, precisely because of alignment issues.

On Mon, 2006-02-13 at 12:20 -0600, Bob Pendleton wrote:


John Skaller
Felix, successor to C++: http://felix.sf.net

In that case, you can try to check the alignment of both is the same!

Alignment can be tested by the trick, where T is the type
you’re finding the alignment of:

struct X { unsigned char dummy; T x; };
printf("%d",(int)offsetof(X,x));

with appropriate header files and mainline. This trick assumes
that x is properly aligned without the compiler introducing
any gratuitous padding.
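Spelled out with the "appropriate header files", the trick might look like the sketch below. The struct and function names are hypothetical, and as noted above the result is only meaningful if the compiler adds no gratuitous padding:

```c
#include <stddef.h>

/* Probe structs: one leading byte, then the type under test. */
struct probe_float { unsigned char dummy; float x; };
struct probe_uint  { unsigned char dummy; unsigned int x; };

/* The offset of x is the padding the compiler needed to align it. */
size_t alignment_of_float(void) { return offsetof(struct probe_float, x); }
size_t alignment_of_uint(void)  { return offsetof(struct probe_uint, x); }

/* Nonzero when both types report the same alignment by this trick. */
int same_alignment(void)
{
    return alignment_of_float() == alignment_of_uint();
}
```

A build could refuse to continue when same_alignment() returns 0, instead of discovering the mismatch as a bus error later.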

Just one comment – the choice of the name ‘uint32’ is not so good,
because it is so obvious, and may clash with someone else’s
definition of that name (almost certainly for the same purpose).
Most people use a prefix to try to reduce this possibility, eg:

SDL_uint32   // SDL
MYAPP_uint32 // MYAPP

Pretty messy. C99 provides

uint32_t

optionally in inttypes.h (the type itself is declared in stdint.h, which inttypes.h includes).

On Mon, 2006-02-13 at 20:58 +0000, mal content wrote:

On 2/13/06, Bob Pendleton wrote:

By definition uint32 and float32 are the same size. That is the reason
for the 32 following the uint and float parts :-). These types are not
built in to C/C++; they are typedef’ed somewhere in the code, and hopefully
a test was included to make sure that sizeof(uint32) == 4 and

Yes, generated with this:

#include <stdio.h>

int go(const char* s)
{


John Skaller
Felix, successor to C++: http://felix.sf.net

It’s more than a matter of keeping values in a floating-point
stack. The C standard allows the compiler to assume that
such pointers don’t point to the same memory.

Good point! It will only hold for individual variables,
not components of a union.

This means Sam’s original claim could be correct.

“Be really careful with this. Optimizing compilers will sometimes keep
floating point values in the floating point stack, so *(uint32*)&f will
be garbage in this case.”

However I am ALSO correct :)) That is because the rules are
entirely characterized in terms of alignment requirements.
If the alignments of float32 and uint32 are the same, the above
must work.

Alignments are not numbers, they’re arbitrary things with
a partial order. However the order has both a known assured
minimum (that of unsigned char) and maximum (required
or malloc couldn’t work).

On Mon, 2006-02-13 at 01:20 -0500, Albert Cahalan wrote:


John Skaller
Felix, successor to C++: http://felix.sf.net

The volatile keyword should, I would think, disable any significant FPU
work on that variable, or force it to store out after each
operation…but I’m really not sure either, now that I think about it. :)

You are right. Volatile has no portable semantics. It is entirely
implementation defined what it means.

It is useful for non-portable programs, eg kernel development,
or, if you’re writing a game for a particular device and
need low level hacks.

The intent was to force invalidation of a cached copy of
a hardware port at each sequence point, to force
a subsequent reference to reload a new value from the
hardware port. Particularly useful on Motorola chips
which use memory mapped I/O latched on addressing.

On Mon, 2006-02-13 at 16:46 -0800, Ryan C. Gordon wrote:


John Skaller
Felix, successor to C++: http://felix.sf.net

If you want to have a variable that is either a float32
or an int32 there is only one portable way to do it:

union num32 {
float32 f;
int32 i;
};

Although technically this isn’t portable either, as the ISO C standard
specifically indicates that writing to a union through one member and
then reading it through a different member invokes
implementation-defined behavior.

In short, there is simply no strictly portable way to do “type
punning”, except in a few limited cases (e.g. to array of unsigned
char). You either work around it, or else you have to make some
assumptions about your platform.

b

Yes, but it is portable IF you obey the rules, and only
take out of the union what you put in. The same applies
to pointer based aliasing – the difference is that
the union ensures proper alignment for both types,
whereas the pointer based thing does not.

As Sam noted, parameter passing is type based.

You don’t want to read the rules for
AMD64 for the GNU/Linux API … they’re really complicated,
and require passing floats in SSE registers.

On Mon, 2006-02-13 at 22:27 -0800, Brian Raiter wrote:

If you want to have a variable that is either a float32
or an int32 there is only one portable way to do it:

union num32 {
  float32 f;
  int32 i;
};

Although technically this isn’t portable either, as the ISO C standard
specifically indicates that writing to a union through one member and
then reading it through a different member invokes
implementation-defined behavior.


John Skaller
Felix, successor to C++: http://felix.sf.net

In short, there is simply no strictly portable way to do “type
punning”, except in a few limited cases (e.g. to array of unsigned
char). You either work around it, or else you have to make some
assumptions about your platform.

Btw, this is a really good example of what I was saying about a cut off
point for portable software. The solutions we’ve discussed here work on
all the popular compilers (Metrowerks, Intel C++, IBM’s xlc, GCC, MSVC),
and all the popular OSes (Mac OS 9/X, Windows, Linux, etc).

There comes a point where you really have to give the C standard the
finger.

An “int” on the original gameboy is 8-bits, and while this would surely
break all sorts of code inside SDL, we don’t tapdance to support it,
and frankly, if you want to get any work done, you won’t either. Same
with this float/int aliasing nonsense. Cast the thing or use a union and
if it breaks on your black-and-white Palm Pilot, write some inline asm
to move it from a register to the right place in memory…but don’t get
too hung up on it until you have to.

Take reasonable efforts for portability, but emphasis on the word
"reasonable," everyone.

–ryan.

a. compiler option like -fno-strict-aliasing
b. use a union
c. compiler-specific hack: __attribute__((__may_alias__))
d. use (char*) to access the memory

Also remember that none of these will help when the variable is
optimized away into the FPU stack (though volatile may help there,
I’m not sure). :)

Those four work, but volatile does not!

If you need to support non-gcc compilers with heavy-duty
optimization, -fno-strict-aliasing and __attribute__((__may_alias__))
will be unavailable. There may be alternatives, such as #pragma.

The (char*) cast and union should work on any system
that uses IEEE floats and 8-bit bytes.
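Those two assumptions can at least be checked at build time. This is a rough sketch, not a proof of full IEEE 754 conformance: it only checks byte width, object sizes, and the float.h parameters that single precision is expected to have. All typedef and function names here are invented for the example:

```c
#include <float.h>
#include <limits.h>

/* Each typedef gets a negative array size, and so fails to compile,
   when its condition is false (a pre-C11 static assertion idiom). */
typedef char assume_8bit_bytes[(CHAR_BIT == 8) ? 1 : -1];
typedef char assume_float_is_32_bits[(sizeof(float) * CHAR_BIT == 32) ? 1 : -1];
typedef char assume_unsigned_is_32_bits[(sizeof(unsigned) * CHAR_BIT == 32) ? 1 : -1];

/* IEEE 754 single precision: radix 2, 24-bit significand (counting
   the implicit bit), FLT_MAX_EXP of 128. */
typedef char assume_ieee_like_float[(FLT_RADIX == 2 && FLT_MANT_DIG == 24 &&
                                     FLT_MAX_EXP == 128) ? 1 : -1];

/* Runtime mirror of the same checks, handy for logging or tests. */
int float32_assumptions_hold(void)
{
    return CHAR_BIT == 8 && sizeof(float) == 4 && sizeof(unsigned) == 4 &&
           FLT_RADIX == 2 && FLT_MANT_DIG == 24 && FLT_MAX_EXP == 128;
}
```

On a platform where any of these fail, the build stops at the offending typedef, which is easier to debug than a packing routine that silently produces garbage.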

I forgot one option… an inline assembly statement without
any assembly can use asm constraints to force stuff out to
memory. So that’s a fifth method that works:

#define FORCE_TO_MEM(x) asm volatile(""::"r"(&(x)))

You could mark it as clobbering memory too if you are
really paranoid, like this:

#define FORCE_TO_MEM(x) asm volatile(""::"r"(&(x)):"memory")

The volatile keyword should, I would think, disable any significant FPU
work on that variable, or force it to store out after each
operation…but I’m really not sure either, now that I think about it. :)

(Until your program is inexplicably crashing, of course!)

I’ve seen the crashes. Adding “volatile” doesn’t fix things.
I had code like the following:

int getexp(double d){
    short *sp = (short*)&d;
    return (sp[EXPLOCATION] >> EXPSHIFT) & EXPMASK;
}

No matter where I add volatile, the above can fail. I fixed
the code using assembly constraints.

On 2/13/06, Ryan C. Gordon wrote:

… and so will the compiler be. However the above code
is broken. You must do this:

    *(uint32*)(void*)&f

Sorry, (void*) does not help you with the aliasing rules.
Though (char*) does, I think you still lose because you
then cast to a type that is not permitted to alias with the
first type.

and it may fail if the size and alignment of float32 and
uint32 are not the same.

Sure. In practice, both float and unsigned are 32-bits on
all 32-bit and 64-bit platforms. Because of the need to
avoid adding holes in the selection of types available,
they won’t be switching to a larger size decades from now.

Thanks for all the help so far. The situation is far more fragile than
I had previously thought. Is there ANY safe assignment in C? :)

Ok, so, can these functions be made any ‘safer’ or is this as good
as it’s going to get?

void float32_packl(char n[4], float32 f)
{
uint32 ui = *(uint32 *)(void *) &f;

That’s wrong.

The correct way with pointers:

void float32_pack(char *n, float32 f)
{
    // flip the digits on one side as needed for endianness
    n[0] = ((char*)&f)[0];
    n[1] = ((char*)&f)[1];
    n[2] = ((char*)&f)[2];
    n[3] = ((char*)&f)[3];
}

The correct way with a union:

unsigned float32_pack(float32 f)
{
union {
unsigned u;
float f;
}uf;
uf.f = f;
return htonl(uf.u); // htonl does byte swapping if needed
}

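Using __attribute__((__may_alias__)) is something like:

unsigned float32_pack(float32 f)
{
    /* GCC documents the attribute on a typedef'd type */
    typedef unsigned __attribute__((__may_alias__)) unsigned_a;
    unsigned_a *up = (unsigned_a*)&f;
    return htonl(*up); // htonl does byte swapping if needed
}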

Using assembly constraints is:

unsigned float32_pack(float32 f)
{
    asm volatile(""::"r"(&(f)):"memory");
    unsigned *up = (unsigned*)&f;
    return htonl(*up); // htonl does byte swapping if needed
}

Compile with -Wstrict-aliasing=2 to get more warnings about
this sort of problem. You will get both false positives and
false negatives though.

On 2/13/06, mal content <artifact.one at googlemail.com> wrote:

On 2/13/06, skaller wrote:

It’s more than a matter of keeping values in a floating-point
stack. The C standard allows the compiler to assume that
such pointers don’t point to the same memory.

Good point! It will only hold for individual variables,
not components of a union.

This means Sam’s original claim could be correct.

“Be really careful with this. Optimizing compilers will sometimes keep
floating point values in the floating point stack, so *(uint32*)&f will
be garbage in this case.”

Well, sort of, but even worse. Just as you can’t mix int and
float as you’d like, you can’t mix float and double. You also
can’t mix short and long, or long and int, etc.

However I am ALSO correct :)) That is because the rules are
entirely characterized in terms of alignment requirements.
If the alignments of float32 and uint32 are the same, the above
must work.

No. C is a cruel language that few people really understand.
We can discuss the “restrict” keyword if you’d like your brain
to melt and pour out of your nose. :) Other good ones are
"const" (which isn’t) and “volatile”. We need a C-- language.

Anyway…

I’m not kidding on this. Let me translate a function from C
to assembly-like pseudo-code, much as gcc sometimes does.
I’ll assume that the float is passed in a register and that the
integer return value is returned in a register. First, the C code:

unsigned getsign(float f){
    unsigned *up = (unsigned*)&f;
    return *up >> 31;
}

Now, the “assembly”:

getsign:
subtract 4 from stack pointer
move32bit stack[0] into r1 # load value into an int register
logicalshift r1 right by 31
move32bit f1 into stack[0] # store float passed in an fp register
add 4 to stack pointer
return

OK, so it looks a tad like COBOL, but most people don’t
know PowerPC assembly all that well.

I actually reported this as a bug, and was told it is not.
I then personally discussed the problem with a compiler
developer. The C standard really is that evil.
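For comparison, here is getsign written with the union route mentioned elsewhere in the thread. GCC's documentation says type-punning through a union is allowed even with -fstrict-aliasing, as long as the memory is accessed through the union type. Other compilers are not covered by that promise. This is a sketch that assumes float and unsigned are both 32 bits and that float is IEEE 754:

```c
/* Assumes 32-bit unsigned, 32-bit IEEE 754 float. */
unsigned getsign(float f)
{
    union {
        float f;
        unsigned u;
    } pun;

    pun.f = f;            /* store as float ... */
    return pun.u >> 31;   /* ... read back as unsigned: sign is the top bit */
}
```

Since both accesses go through the union object, the compiler has no licence to reorder the store after the load the way the pseudo-assembly above does.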

In case you’re wondering, the C standard is like this so that
compilers can optimize better. The standards committee
likes to support stuff like the AS/400, where you’d actually
get a fault/exception/crash for touching memory of the
wrong type. The committee also cares about trying to keep
up with FORTRAN performance. The screwball DSP vendors
are pretty well represented too. No care is given to the fact
that all normal systems (server, desktop, PDA, cellphone)
use normal 32-bit IEEE float, etc.

On 2/14/06, skaller wrote:

On Mon, 2006-02-13 at 01:20 -0500, Albert Cahalan wrote:

If you want to have a variable that is either a float32
or an int32 there is only one portable way to do it:

union num32 {
        float32 f;
        int32 i;
};

Although technically this isn’t portable either, as the ISO C standard
specifically indicates that writing to a union through one member and
then reading it through a different member invokes
implementation-defined behavior.

Yes, but it is portable IF you obey the rules, and only
take out of the union what you put in. The same applies
to pointer based aliasing – the difference is that
the union ensures proper alignment for both types,
whereas the pointer based thing does not.

This is NOT about alignment. I hit problems casting to
something with less alignment. I tried to read a double
as an array of 4 shorts. This did not work. Read the
pseudo-assembly I put in a different email.

For “int” and “float”, nearly all systems will try to
do a 4-byte alignment. Nearly all systems do not
actually require this; AltiVec is a notable exception
because it doesn’t generate an exception that would
allow the OS to emulate a misaligned access.
Both x86 and regular PowerPC include hardware
support to make unaligned access somewhat fast.

As Sam noted, parameter passing is type based.

You don’t want to read the rules for
AMD64 for the GNU/Linux API … they’re really complicated,
and require passing floats in SSE registers.

I already read it, on several occasions. I have a feeling
that I will be reading it again soon.

Isn’t it cute how they use AL (or AH maybe?) to count
up the floats passed to varargs functions?

On 2/14/06, skaller wrote:

On Mon, 2006-02-13 at 22:27 -0800, Brian Raiter wrote:

I had code like the following:

int getexp(double d){
    short *sp = (short*)&d;
    return (sp[EXPLOCATION] >> EXPSHIFT) & EXPMASK;
}

No matter where I add volatile, the above can fail. I fixed
the code using assembly constraints.

Confirm, it fails on AMD64/gcc 4.0 too (even with no optimisation).
This works though:

int getexp(double d){
    unsigned char *sp = (unsigned char*)(void*)&d;
    return ((sp[EXPLOCATION]+256*sp[EXPLOCATION+1])>>EXPSHIFT) & EXPMASK;
}

[at least, with the short I get indeterminate result, with
the unsigned chars the result is the same every time]

On Mon, 2006-02-13 at 20:49 -0500, Albert Cahalan wrote:
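The EXPLOCATION, EXPSHIFT and EXPMASK macros are never defined in the thread, so here is one self-contained guess at what getexp is after. It copies the bits with memcpy rather than reading them through a short or char pointer, which is a different technique from the two versions above. It assumes double is IEEE 754 binary64 and that double and uint64_t share a byte order:

```c
#include <stdint.h>
#include <string.h>

/* Assumes IEEE 754 binary64: 1 sign bit, 11 exponent bits,
   52 fraction bits. Returns the biased exponent field. */
int getexp(double d)
{
    uint64_t bits;

    memcpy(&bits, &d, sizeof bits);        /* copy, do not alias */
    return (int)((bits >> 52) & 0x7FF);    /* drop fraction, mask off sign */
}
```

Working on the whole 64-bit pattern also removes the dependence on which short or byte holds the exponent, so no per-endianness index is needed.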

On 2/13/06, Ryan C. Gordon wrote:


John Skaller
Felix, successor to C++: http://felix.sf.net

void float32_pack(char *n, float32 f)
{
    // flip the digits on one side as needed for endianness
    n[0] = ((char*)&f)[0];
    n[1] = ((char*)&f)[1];
    n[2] = ((char*)&f)[2];
    n[3] = ((char*)&f)[3];
}

Correct me if I’m wrong, but wouldn’t the above code, given
the same value for f but run on two machines with different
endianness, write a different value into n?

Wouldn’t this sort of defeat the point of a portable packing
function?

However I am ALSO correct :)) That is because the rules are
entirely characterized in terms of alignment requirements.
If the alignments of float32 and uint32 are the same, the above
must work.

No. C is a cruel language that few people really understand.

True. Including the committee members ;(

We can discuss the “restrict” keyword if you’d like your brain
to melt and pour out of your nose. :)

Lol! No thanks. WG14 tried noalias first … but that didn’t
pan out.

Other good ones are
"const" (which isn’t) and “volatile”. We need a C-- language.

We have a C-- language.

Anyway…

I’m not kidding on this.

Neither am I :)

You are of course correct that you cannot use a value via
an alias which is of a type incompatible with the value
actually stored.

however that is not an aliasing rule: the same rule forbids
using an uninitialised value.

Aliases of ANY types are in fact allowed: see ISO C99 6.3.2.3/1,
which says pointers to any incomplete or object type can be converted
to or from void*. Combining the two cases one deduces any such pointer
can be converted to any other.

This ability to alias any object types is obvious anyhow,
since you can put any object types in a union to create an alias.

In the example code you ARE correct that there must be a misuse
of an alias, but your argument that types cannot be aliased is
incorrect. IMHO. Since interpreting the C Standard is difficult :)

In a slightly more difficult example, we examine lvalues
which are assignment targets. In that case there is no abuse
of a wrongly typed value, however we may STILL want to ban
such an operation.

The thing is, objects do not have a type, only value
representations. But storage does have alignment and size.
So the only way to ban storing a float in an int, is to make
sure that they have either different sizes or different alignments.

In this case, we will choose the following alignments:

float32 --> SSE
int32   --> EiX
least upper bound (SSE, EiX) --> 4

no one said alignments had to be integers, nor that the ordering
had to be linear – actually, it has to form a lattice I think,
with alignof unsigned char* being the minimum, and some alignment which
malloc provides being the maximum – pity that one isn’t named,
I need it often.

The C standard really is that evil.

Yup. But it is standardising an evil language, what do
you expect?

In case you’re wondering, the C standard is like this so that
compilers can optimize better.

I’m afraid that isn’t a good account of the process ;(

The standards committee
likes to support stuff like the AS/400, where you’d actually
get a fault/exception/crash for touching memory of the
wrong type. The committee also cares about trying to keep
up with FORTRAN performance. The screwball DSP vendors
are pretty well represented too. No care is given to the fact
that all normal systems (server, desktop, PDA, cellphone)
use normal 32-bit IEEE float, etc.

That isn’t quite fair. Support for weird platforms is important,
because in the future someone may invent nice ones!

I’d complain the other way – C99 actually adds gratuitous
overconstraints on integer types, and fails to add some
others that are vital. The C Standard (and that of C++) is a result
of a political process. I’d have to say … looking at the political
leaders of various countries over time … the standards committees
are doing quite well :)

“Doctor, it hurts when I C things”
“Then don’t C things”

On Tue, 2006-02-14 at 01:53 -0500, Albert Cahalan wrote:


John Skaller
Felix, successor to C++: http://felix.sf.net

mal content wrote:

I basically need to write ‘float32_packl’ (packs a 32 bit
floating point into a 4 byte char array in little endian
byte order).

My suggestion is this:—
#include <math.h>
#include <float.h>
#include <stdint.h>

typedef struct packed_float32 {
    int32_t mantissa; /* for IEEE, we need only 24 bits */
    int16_t exp;      /* for IEEE we need 8 bits, representing
                         from -125 to 128 */
} packed_float32;

#if FLT_RADIX==2

packed_float32
pack_float32(float f)
{
packed_float32 pf = { 0, 0 };

    int minus = signbit(f);
    if (minus)
            f = -f;
   
    while (f != 0 && f < 1) { /* guard: f == 0 would never reach 1 */
            --pf.exp;
            f *= 2;
    }
   
    while (f > 1) {
            ++pf.exp;
            f /= 2;
    }
   
    /* now we have 1 >= f >= 0.1b (binary 0.1, i.e. one half) */
    const uint32_t big_positive_int = 1 << FLT_MANT_DIG;
    pf.mantissa = (uint32_t)(f * big_positive_int); /* the unused most-significant bits are zero */

    if (minus)
            pf.mantissa = -pf.mantissa;
           
    pf.exp -= FLT_MANT_DIG;

    return pf;

}

float
unpack_float32(packed_float32 pf)
{
float f;

    f = pf.mantissa;
    while (pf.exp > 0) {
            f *= 2;
            --pf.exp;
    }
    while (pf.exp < 0) {
            f /= 2;
            ++pf.exp;
    }

    return f;

}

#else
/* write IBM 360 code here… */
#endif

You can easily make a version for double and long double too. No
treacherous pointer hacks, just well-behaved binary arithmetic.
Actually, GCC doesn’t complain even with -Wextra (w00t!!). The only
problem is with non-finite values, you may want to add extra code for that.
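The same mantissa/exponent split can probably be had from the C99 library functions frexpf and ldexpf instead of the hand-written loops. The sketch below is a variation on the idea, not Daniel's code. The names split_float32, split_float32_from and split_float32_to are invented here, and non-finite values are still not handled:

```c
#include <float.h>
#include <math.h>
#include <stdint.h>

typedef struct split_float32 {
    int32_t mantissa;   /* signed, magnitude below 2^FLT_MANT_DIG */
    int16_t exp;        /* value == mantissa * 2^exp */
} split_float32;

split_float32 split_float32_from(float f)
{
    split_float32 s;
    int e = 0;
    float m = frexpf(f, &e);  /* f == m * 2^e, 0.5 <= |m| < 1, or m == 0 */

    s.mantissa = (int32_t)ldexpf(m, FLT_MANT_DIG);  /* scale to an integer */
    s.exp = (int16_t)(e - FLT_MANT_DIG);
    return s;
}

float split_float32_to(split_float32 s)
{
    return ldexpf((float)s.mantissa, s.exp);
}
```

Zero falls out naturally (mantissa 0), and the sign travels with the mantissa, though the sign of a negative zero is lost. Some toolchains need -lm to link the math functions.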

The downside is that it currently requires more bits than necessary.
Obviously, for interchanging data you may want to pack it a bit (e.g.
store it in a bit-field instead of separated members in a struct). It’s
also easy to derive a pack_double64() (and pack_ldouble128()) from it,
just change the FLT_* macros and the size of integers.

If anyone sees a problem with this code please let me know. I currently
find it more reliable than the other commonly suggested way (i.e. the
"pointers orgy" :P).


Daniel K. O.

void float32_pack(char *n, float32 f)
{
    // flip the digits on one side as needed for endianness
    n[0] = ((char*)&f)[0];
    n[1] = ((char*)&f)[1];
    n[2] = ((char*)&f)[2];
    n[3] = ((char*)&f)[3];
}

Correct me if I’m wrong, but wouldn’t the above code, given
the same value for f but run on two machines with different
endianness, write a different value into n?

Yes. Thus the comment about flipping the digits on
one side. You’re supposed to do that for either big-endian
or little-endian, but not both. Wrap an #ifdef around it,
or #define the digits as needed.

Here, start the file off like this:

#if defined(_WIN32) || defined(_WIN64)
#define __BYTE_ORDER 1
#define __BIG_ENDIAN 0
#define __LITTLE_ENDIAN 1
#else
#include <endian.h>
#endif

Then, in that function, flip the digits as instructed:

#if __BYTE_ORDER == __BIG_ENDIAN
n[0] = ((char*)&f)[0];
n[1] = ((char*)&f)[1];
n[2] = ((char*)&f)[2];
n[3] = ((char*)&f)[3];
#else
n[0] = ((char*)&f)[3];
n[1] = ((char*)&f)[2];
n[2] = ((char*)&f)[1];
n[3] = ((char*)&f)[0];
#endif
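A possible alternative to the #if above is to pull the bits into an integer once and then emit the bytes by shifting, so the output order no longer depends on the host. This is only a sketch: float32_packl is the name from the original question, float32_unpackl is made up to match it. It assumes a 32-bit IEEE float that shares its byte order with uint32_t:

```c
#include <stdint.h>
#include <string.h>

/* Pack f into n[0..3], least significant byte first, on any host. */
void float32_packl(unsigned char n[4], float f)
{
    uint32_t u;

    memcpy(&u, &f, sizeof u);             /* float bits as an integer */
    n[0] = (unsigned char)(u & 0xFF);
    n[1] = (unsigned char)((u >> 8) & 0xFF);
    n[2] = (unsigned char)((u >> 16) & 0xFF);
    n[3] = (unsigned char)((u >> 24) & 0xFF);
}

/* Inverse: rebuild the integer from little-endian bytes, then copy. */
float float32_unpackl(const unsigned char n[4])
{
    uint32_t u = (uint32_t)n[0] | ((uint32_t)n[1] << 8) |
                 ((uint32_t)n[2] << 16) | ((uint32_t)n[3] << 24);
    float f;

    memcpy(&f, &u, sizeof f);
    return f;
}
```

Shifts operate on values rather than on memory layout, which is why no __BYTE_ORDER test is needed here.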

Wouldn’t this sort of defeat the point of a portable packing
function?

It would, if you didn’t flip the digits on one side.

On 2/14/06, mal content <artifact.one at googlemail.com> wrote:

In short, there is simply no strictly portable way to do “type
punning”, except in a few limited cases (e.g. to array of unsigned
char). You either work around it, or else you have to make some
assumptions about your platform.

Btw, this is a really good example of what I was saying about a cut off
point for portable software. The solutions we’ve discussed here work on
all the popular compilers (Metrowerks, Intel C++, IBM’s xlc, GCC, MSVC),
and all the popular OSes (Mac OS 9/X, Windows, Linux, etc).

There comes a point where you really have to give the C standard the
finger.

I strongly agree.

An “int” on the original gameboy is 8-bits, and while this would surely
break all sorts of code inside SDL, we don’t tapdance to support it,

That isn’t standard because it can’t hold -32767 to 32767.

and frankly, if you want to get any work done, you won’t either. Same
with this float/int aliasing nonsense. Cast the thing or use a union and
if it breaks on your black-and-white Palm Pilot, write some inline asm

I hope you mean “cast the thing to (char*)”, because otherwise:

The cast breaks on really normal Linux systems. It breaks
with AMD64 and gcc 4.0, and it breaks with PowerPC and
some gcc 3.x.

Note that I posted inline asm that is portable to any system
that uses gcc, no matter what the CPU. The code was “”.
Assembly constraints force the variable out to memory.

Take reasonable efforts for portability, but emphasis on the word
"reasonable," everyone.

Definitely, but I think “reasonable” ought to include the above
mentioned platforms. (AMD64/gcc4, PowerPC/gcc3)

On 2/14/06, Ryan C. Gordon wrote: