CPU specific build flags for OS X Universal Binaries

So I think we’re close to getting the Universal Binaries finished. The
tricky part has been figuring out how to maintain 10.2 compatibility in
this whole process. I think we’re finally past this (pending some
final 10.2 test results).

One thing I still need help with: which build flags should I be
defining to enable CPU-specific optimizations like AltiVec or SSE/MMX?
I know how to pass architecture-specific flags in Xcode, but I don’t
know what actual flags (if any) I need to use for SDL.

Currently, for PowerPC I think I need -DGCC_ALTIVEC
-DUSE_ALTIVEC_BLITTERS. Are there any others? And I’m not sure what
flags (if any) I need for Intel. Also, does additional code need to be
included in the Xcode project for x86? (The Xcode project seems to
omit non-Mac code.) Currently, the x86 side compiles without any
flags, but I don’t know if it gets any SSE capabilities by default.

Thanks,
Eric
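
One way to check what the x86 side "gets by default" is to ask the compiler itself. The sketch below is only an illustration, not SDL code: it relies on the macros that GCC-family compilers predefine when a SIMD instruction set is enabled for the compile (`__ALTIVEC__`, `__MMX__`, `__SSE__`, `__SSE2__`). The function name `compiled_simd_support` is made up for this example, and whether SDL's own sources key off these macros or only off its `-D` defines is not established in this thread.

```c
/* Reports which SIMD instruction set the compiler was told to target.
 * These macros are predefined by GCC-family compilers when the matching
 * instruction set is enabled (for example via -maltivec or -msse);
 * nothing is probed at run time. */
const char *compiled_simd_support(void)
{
#if defined(__ALTIVEC__)
    return "altivec";
#elif defined(__SSE2__)
    return "sse2";
#elif defined(__SSE__)
    return "sse";
#elif defined(__MMX__)
    return "mmx";
#else
    return "none";
#endif
}
```

Printing the result from a tiny test program built once per architecture shows whether a given set of per-architecture flags actually turned anything on.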

E. Wing wrote:

So I think we’re close to getting the Universal Binaries finished. The
tricky part has been figuring how to maintain 10.2 compatibility in
this whole process. I think we’re finally past this (pending on some
final 10.2 test results).

One thing I still need help with is what build flags should I be
defining to enable CPU specific optimizations like Altivec or SSE/MMX?
I know how to pass architecture specific flags in Xcode, but I don’t
know what actual flags (if any) I need to use for SDL.

Currently, for PowerPC I think I need -DGCC_ALTIVEC
-DUSE_ALTIVEC_BLITTERS. Are there any others? And I’m not sure what
flags (if any) I need for Intel. Also, does additional code need to be
included in the Xcode project for x86? (The Xcode project seems to
omit non-Mac code.) Currently, the x86 side compiles without any
flags, but I don’t know if it gets any SSE capabilities by default.

You should define USE_ASMBLIT. Then the cpu mmx/sse/whatever available
will be used (you also need to use a gcc compiler with i386 defined).
You will also need nasm to compile the hermes blitters. Man, I feel I’ve
been meaning to rewrite these blitters for gcc for ages…

Stephane
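
"Whatever is available will be used" implies a run-time check of the CPU. A minimal sketch of such a check follows, assuming a reasonably current GCC or Clang that ships the `<cpuid.h>` helper header (the gcc-4.0 toolchain of this thread may not). It is not how SDL or Hermes implement detection, and the function names are hypothetical. On non-x86 targets every query simply returns 0.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang helper; assumed to be available */
#endif

/* Feature bits reported in EDX by CPUID leaf 1. */
#define CPUID_EDX_MMX  (1u << 23)
#define CPUID_EDX_SSE  (1u << 25)
#define CPUID_EDX_SSE2 (1u << 26)

/* Returns the EDX feature word of CPUID leaf 1, or 0 when the
 * instruction is unavailable or the target is not x86. */
static unsigned int cpuid_leaf1_edx(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return edx;
#endif
    return 0;
}

int cpu_has_mmx(void)  { return (cpuid_leaf1_edx() & CPUID_EDX_MMX)  != 0; }
int cpu_has_sse(void)  { return (cpuid_leaf1_edx() & CPUID_EDX_SSE)  != 0; }
int cpu_has_sse2(void) { return (cpuid_leaf1_edx() & CPUID_EDX_SSE2) != 0; }
```

A blitter selection routine could call these once at startup and pick the fastest implementation the CPU supports, falling back to plain C otherwise.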

Sorry to hijack the thread… but the following stood out to me :^)

On Wed, Jan 18, 2006 at 05:07:38PM -0800, E. Wing wrote:

So I think we’re close to getting the Universal Binaries finished. The
tricky part has been figuring how to maintain 10.2 compatibility in
this whole process. I think we’re finally past this (pending on some
final 10.2 test results).

I just got set up to build Tux Paint for OS X using Xcode.
Unfortunately, when I hit the “10.2.8 compatibility” switch, suddenly
things no longer link.

I have a feeling it might be Fink-related. (We pick up libiconv and
libintl and I think the SDL-and-friends libs from Fink, and the
SDL-and-friends frameworks for Xcode directly from libSDL.org.)

Is anyone out here still able to do 10.2.8 builds and willing to give
us a hand? :^) (I can probably walk through all this on a weekday
evening (Pacific time) or weekend, via IRC.)


-bill!
bill at newbreedsoftware.com
http://www.newbreedsoftware.com/

Tux Paint 2006 wall calendar, CDROM, bumper sticker & apparel:
http://www.cafepress.com/newbreedsw

PS - Thanks to Ryan/icculus for the Mac mini and Martin Fuhrer for the
"how to build Tux Paint on OS X" walk-thru last weekend. :^)

You should define USE_ASMBLIT. Then the cpu mmx/sse/whatever available
will be used (you also need to use a gcc compiler with i386 defined).
You will also need nasm to compile the hermes blitters. Man, I feel I’ve
been meaning to rewrite these blitters for gcc for ages…

Stephane

Okay, I’m totally out of my league here so I’m going to need some help.

I defined USE_ASMBLIT and added all the Hermes files. I found a
switch in Xcode called “Use nasm to process .asm files” which I
enabled.

(For the curious, the description of the switch is:
Activating this setting indicates that .asm files should be processed
with nasm instead of as. By default, this build setting is turned off.
This setting only applies to the Intel architecture.
[GCC_USE_NASM_FOR_ASM_FILETYPE, -nasm])

Compiling just the Intel side, I encountered undefined symbols which
prompted me to add the files:
SDL_mixer_MMX.c/h
SDL_yuv_mmx.c

But after doing this, the compile seems to fail on
SDL_mixer_MMX.c. I get the following 2 error lines:
{standard input}:22:Alignment too large: 15. assumed.
{standard input}:80:Alignment too large: 15. assumed.

This is the build command/log for that one file:
cd /Users/ewing/DEVELOPMENT/CODETEST/UniversalBinarySDL/SDL12/Xcode/SDL
/usr/bin/gcc-4.0 -x c -arch i386 -pipe -Wno-trigraphs
-fpascal-strings -fasm-blocks -O3 -DENABLE_QUARTZ
-DPTHREAD_NO_RECURSIVE_MUTEX -DSDL_USE_PTHREADS
-DTARGET_API_MAC_CARBON -DTARGET_API_MAC_OSX -DMACOSX -DHAVE_OPENGL
-fmessage-length=0 -ftree-vectorize -msse3 -mmacosx-version-min=10.4
-I/Users/ewing/DEVELOPMENT/CODETEST/UniversalBinarySDL/SDL12/Xcode/SDL/build/SDL.build/Deployment/Framework.build/SDL.hmap
-Wall -Wno-four-char-constants
-F/Users/ewing/DEVELOPMENT/CODETEST/UniversalBinarySDL/SDL12/Xcode/SDL/build/Deployment
-I/Users/ewing/DEVELOPMENT/CODETEST/UniversalBinarySDL/SDL12/Xcode/SDL/build/Deployment/include
-I/Developer/SDKs/MacOSX10.4u.sdk/System/Library/Frameworks/Carbon.framework/Headers
-I/Developer/SDKs/MacOSX10.4u.sdk/usr/X11R6/include -I…/…/src
-I…/…/include -I…/…/src/audio -I…/…/src/cdrom -I…/…/src/endian
-I…/…/src/events -I…/…/src/file -I…/…/src/hermes
-I…/…/src/joystick -I…/…/src/main -I…/…/src/thread
-I…/…/src/timer -I…/…/src/video
-I/Users/ewing/DEVELOPMENT/CODETEST/UniversalBinarySDL/SDL12/Xcode/SDL/build/SDL.build/Deployment/Framework.build/DerivedSources
-DUSE_ASMBLIT -isysroot /Developer/SDKs/MacOSX10.4u.sdk -c
/Users/ewing/DEVELOPMENT/CODETEST/UniversalBinarySDL/SDL12/Xcode/SDL/…/…/src/audio/SDL_mixer_MMX.c
-o /Users/ewing/DEVELOPMENT/CODETEST/UniversalBinarySDL/SDL12/Xcode/SDL/build/SDL.build/Deployment/Framework.build/Objects-normal/i386/SDL_mixer_MMX.o
{standard input}:22:Alignment too large: 15. assumed.
{standard input}:80:Alignment too large: 15. assumed.

There is also a secondary problem. When compiling the PowerPC side,
Xcode seems to be trying to compile the Hermes .asm files with
/usr/libexec/gcc/darwin/ppc/as which results in a compile failure. The
problem is that I don’t think there is a way in Xcode to conditionally
specify if an actual file is to be compiled or not based on
architecture. This is expected to be done through C-preprocessor
macros. I tried using -fno-asm to disable all assembly processing for
just the PowerPC side, but it also disables the use of the word
’inline’ which apparently is used in the code base.

I don’t know the rules for .asm files, but is there a way we can
encapsulate them somehow so the PPC side can compile? (Or does anybody
know any Xcode tricks for handling this?)

Thanks,
Eric
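
On the `-fno-asm` side effect: GCC documents that flag as making `asm`, `inline` and `typeof` ordinary identifiers rather than keywords, while the double-underscore spellings `__asm__`, `__inline__` and `__typeof__` remain keywords. The flag changes how C source is parsed, so it is not expected to stop Xcode from handing .asm files to the assembler. A small illustration, with a made-up helper name, of code that compiles either way under a GCC-compatible compiler:

```c
/* Builds with or without -fno-asm, because __inline__ remains a
 * keyword even when plain "inline" is not recognized. */
static __inline__ int clamp_u8(int v)
{
    /* Clamp v into the 0..255 range of an 8-bit colour channel. */
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}
```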

E. Wing wrote:

You should define USE_ASMBLIT. Then the cpu mmx/sse/whatever available
will be used (you also need to use a gcc compiler with i386 defined).
You will also need nasm to compile the hermes blitters. Man, I feel I’ve
been meaning to rewrite these blitters for gcc for ages…

Stephane

Okay, I’m totally out of my league here so I’m going to need some help.

I defined the USE_ASMBLIT and added all the Hermes files. I found a
switch in Xcode called “Use nasm to process .asm files” which I
enabled.

(For the curious, the description of the switch is:
Activating this setting indicates that .asm files should be processed
with nasm instead of as. By default, this build setting is turned off.
This setting only applies to the Intel architecture.
[GCC_USE_NASM_FOR_ASM_FILETYPE, -nasm])

Compiling just the Intel side, I encountered undefined symbols which
prompted me to add the files:
SDL_mixer_MMX.c/h
SDL_yuv_mmx.c

But after doing this, the compile seems to fail on the
SDL_mixer_MMX.c. I get the following 2 error lines:
{standard input}:22:Alignment too large: 15. assumed.
{standard input}:80:Alignment too large: 15. assumed.

Replace each “.align 16” with a “.align 8”.
Doing so is not too big of a performance issue, so it could be used on
all platforms if it works for you.

Stephane

Replace each “.align 16” with a “.align 8”.
Doing so is not too big of a performance issue, so it could be used on
all platforms if it works for you.

Okay, that seemed to work. But I’m getting a linker error now in the last step.

ld: /Users/ewing/DEVELOPMENT/CODETEST/UniversalBinarySDL/SDL12/Xcode/SDL/build/SDL.build/Deployment/Framework.build/Objects-normal/i386/SDL_yuv_mmx.o
has local relocation entries in non-writable section (__TEXT,__text)
/usr/bin/libtool: internal link edit command failed

I searched around for this but I don’t see any solutions for it. I’ve
verified that -mdynamic-no-pic is not being invoked, and that’s about
all I could find. Any ideas?

Also, for the PowerPC problem, is there a way to use something like C
preprocessor macros in the .asm files? Basically, I would like to put
a #ifndef ppc block around each file. Strangely when I tried it,
this actually let the PowerPC side compile as normal, but the Intel
side failed because of “error: label or instruction expected at start
of line”.

Thanks,
Eric

E. Wing wrote:

Replace each “.align 16” with a “.align 8”.
Doing so is not too big of a performance issue, so it could be used on
all platforms if it works for you.

Okay, that seemed to work. But I’m getting a linker error now in the last step.

Can you send the exact patch you used to the list so that it can be
tested and applied ?

ld: /Users/ewing/DEVELOPMENT/CODETEST/UniversalBinarySDL/SDL12/Xcode/SDL/build/SDL.build/Deployment/Framework.build/Objects-normal/i386/SDL_yuv_mmx.o
has local relocation entries in non-writable section (__TEXT,__text)
/usr/bin/libtool: internal link edit command failed

I searched around for this but I don’t see any solutions for it. I’ve
verified that -mdynamic-no-pic is not being invoked, and that’s about
all I could find. Any ideas?

Are you sure you’re using the -fPIC flag ? If you’re using an IDE, that
might be an option with the words “position independent code” in it.

Also, for the PowerPC problem, is there a way to use something like C
preprocessor macros in the .asm files. Basically, I would like to put
a #ifndef ppc block around each file. Strangely when I tried it,
this actually let the PowerPC side compile as normal, but the Intel
side failed because of “error: label or instruction expected at start
of line”.

Well, you definitely do not want to do that. I think the right way is to
find how to do conditional file compilation in Xcode. This has to exist :)

Stephane

Replace each “.align 16” with a “.align 8”.

Can you send the exact patch you used to the list so that it can be
tested and applied ?

Before I do that, I’ve been posting to the Xcode list for help too and
somebody said this:

“Better replace it with 4. “.align x” means “align to 2^x” here, so
currently you’re aligning to 256 byte boundaries.”

Should I be doing this instead?

verified that -mdynamic-no-pic is not being invoked, and that’s about
Are you sure you’re using the -fPIC flag ? If you’re using an IDE, that
might be an option with the words “position independent code” in it.

According to the gcc man page, -fPIC is the default on Darwin and Mac
OS X. Just in case, I manually added the switch. The only related
switch to be found in the Xcode IDE is “Generate Position-Dependent
Code” which has the description:

Faster function calls for applications. Not appropriate for shared
libraries (which need to be position-independent).
[GCC_DYNAMIC_NO_PIC, -mdynamic-no-pic]

I have made sure to keep this switch off.

The same person on the Xcode list from above said this:
“Since you are using assembler, I guess it’s your assembler that
contains non-pic code.”

I’m not sure if that’s true or not.

Also, for the PowerPC problem, is there a way to use something like C
preprocessor macros in the .asm files.

Well, you definitely do not want to do that. I think the right way is to
find how to do conditional file compilation in Xcode. This has to exist :)

Unfortunately, I’m fairly certain this doesn’t exist. That’s why I’m
looking for something like a preprocessor trick. Any other ideas?

Thanks,
Eric

E. Wing wrote:

Replace each “.align 16” with a “.align 8”.

Can you send the exact patch you used to the list so that it can be
tested and applied ?

Before I do that, I’ve been posting to the Xcode list for help too and
somebody said this:

“Better replace it with 4. “.align x” means “align to 2^x” here, so
currently you’re aligning to 256 byte boundaries.”

No, .align X aligns over the next X boundary. He’s probably confusing it
with .p2align X which aligns over the next 2^X.
Try it if you don’t believe me (and here I hope the gcc guys didn’t
change the .align semantics in a recent version ;) )
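
For reference, the two readings of `.align` being debated differ only in how the operand is interpreted. Which one applies depends on the assembler and object format. The earlier "Alignment too large: 15. assumed." message is what one would expect if the operand is treated as an exponent capped at 15, but that is an inference from the message, not something confirmed in this thread. The sketch below just does the arithmetic for each reading; the function names are invented for the example.

```c
#include <stdint.h>

/* Byte-count reading: round addr up to the next multiple of n.
 * n must be a power of two. */
uint32_t align_to_bytes(uint32_t addr, uint32_t n)
{
    return (addr + n - 1u) & ~(n - 1u);
}

/* Power-of-two reading: round addr up to the next multiple of 2^x.
 * x must be less than 32. */
uint32_t align_to_pow2(uint32_t addr, unsigned int x)
{
    return align_to_bytes(addr, (uint32_t)1u << x);
}
```

Starting from address 0x1001, an operand of 8 gives 0x1008 under the byte-count reading and 0x1100 (a 256-byte boundary) under the power-of-two reading; an operand of 4 under the power-of-two reading gives 0x1010, the same as 16 under the byte-count reading.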

Should I be doing this instead?

No.

verified that -mdynamic-no-pic is not being invoked, and that’s about

Are you sure you’re using the -fPIC flag ? If you’re using an IDE, that
might be an option with the words “position independent code” in it.

According to the gcc man page, -fPIC is the default on Darwin and Mac
OS X. Just in case, I manually added the switch. The only related
switch to be found in the Xcode IDE is “Generate Position-Dependent
Code” which has the description:

Faster function calls for applications. Not appropriate for shared
libraries (which need to be position-independent).
[GCC_DYNAMIC_NO_PIC, -mdynamic-no-pic]

I have made sure to keep this switch off.

The same person on the Xcode list from above said this:
“Since you are using assembler, I guess it’s your assembler that
contains non-pic code.”

I’m not sure if that’s true or not.

Well, some of this code is not PIC-clean, since no one made it PIC-clean
so far. However, even though this could probably cause crashes at
runtime with some compilers on some systems, this shouldn’t prevent it
from compiling.

To be perfectly honest, I don’t like this code because it adds a
dependency on nasm, and thus can’t be used on platforms with no nasm
support (qnx for example), because it’s not PIC-clean, and because some
ld versions are buggy when linking nasm files. A gcc rewrite has been
on my todo list for a long time; I even started it, IIRC…

Also, for the PowerPC problem, is there a way to use something like C
preprocessor macros in the .asm files.

Well, you definitely do not want to do that. I think the right way is to
find how to do conditional file compilation in Xcode. This has to exist :)

Unfortunately, I’m fairly certain this doesn’t exist. That’s why I’m
looking for something like a preprocessor trick. Any other ideas?

Yet another reason to rewrite that stuff in gcc style, where you could
simply use gcc #ifdefs :)

Also, to use some nasm-specific preprocessor stuff, you’d need nasm
(an x86-specific assembler) to run on PowerPC, if only in order
to parse the .asm files. It sounds like there is very little chance this
can be accomplished.

Stephane
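
To make the "gcc style" idea concrete: when the assembly lives inside a C file as GCC inline assembly, the same file can be compiled for every architecture, and the preprocessor selects the x86 path or a portable fallback. The sketch below uses a byte swap as a stand-in. It is not Hermes or SDL blitter code, and `swap32` is a hypothetical name.

```c
#include <stdint.h>

/* Reverses the byte order of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* x86-only path: GCC extended inline assembly, compiled only
     * when targeting x86. */
    __asm__("bswap %0" : "+r"(v));
    return v;
#else
    /* Portable fallback: this is what a PowerPC build would compile. */
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
#endif
}
```

Because the guard is ordinary C preprocessing, the build system needs no per-architecture file list: both slices of a Universal Binary compile the same source file.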

No, .align X aligns over the next X boundary. He’s probably confusing it
with .p2align X which aligns over the next 2^X.
Try it if you don’t believe me (and here I hope the gcc guys didn’t
change the .align semantics in a recent version ;) )

As I said, I know nothing about this and am perfectly happy to take
your word on it :) I actually don’t have the ability to test anything
though. I only have PPC based Macs. I’m just setting up the build
system (and trying to get some binaries ready).

Well, some of this code is not PIC-clean, since no one made it PIC-clean
so far. However, even though this could probably cause crashes at
runtime with some compilers on some systems, this shouldn’t prevent it
from compiling.

To be perfectly honest, I don’t like this code because it adds a
dependency on nasm, and thus can’t be used on platforms with no nasm
support (qnx for example), because it’s not PIC-clean, and because some
ld versions are buggy when linking nasm files. A gcc rewrite has been
for a long time in my todo list, I even started it IIRC…

Well, you definitely do not want to do that. I think the right way is to
find how to do conditional file compilation in Xcode. This has to exist :)

Unfortunately, I’m fairly certain this doesn’t exist. That’s why I’m
looking for something like a preprocessor trick. Any other ideas?

Yet another reason to rewrite that stuff in gcc style, where you could
simply use gcc #ifdefs :)

Also, to use some nasm-specific preprocessor stuff, you’d need to have
nasm (an x86-specific assembler) to work under powerpc, if only in order
to parse the .asm files. It sounds like there is very little chance this
can be accomplished.

Okay, so I think we’ve gone as far as we can to support the Intel Macs
until the assembly code is rewritten. If you could bump up the
priority on your to-do list, I’m sure x86 Mac users would be grateful :)

In the meantime, I guess I will set up the build system to not do any
architecture-specific optimizations for x86.

Thanks,
Eric

From: Stephane Marchesin <stephane.marchesin at wanadoo.fr>