Hi there,
I recently tried to use SDL 1.3 (latest SVN development version) with
ScummVM on Mac OS X. The main motivation is that we are looking into
vsync-enabled fullscreen graphics code, for various reasons.
The current SDL 1.2 Quartz backend is so totally, utterly messed up
(I feel entitled to say that, as I did contribute my share of code to
it and share the blame) that it's virtually impossible to add that
there. In fact, while it has code which tries to implement VBL
syncing, the way it's done, combined with the way CoreGraphics works
with LCD screens, means that the VBL syncing code actually increases
the tearing effect instead of hiding it.
Anyway, I briefly considered ripping out the existing fullscreen code
and replacing it with something new, when I discovered that SDL 1.3
already does that. Great. Very elegant new code, I must say, kudos to
Sam and Ryan and everybody else responsible.
However, when I tried ScummVM with the new API (using the SDL_compat
compatibility layer), I was disappointed to see that it suddenly took
up 90+% of my CPU power; the whole thing was unbearably sluggish.
Upon investigating with Shark, it turned out that it spends virtually
all of that time doing internal texture conversions, in a function
called glgProcessPixels. Ouch!
Some googling quickly revealed the following insightful pointers:
<http://developer.apple.com/documentation/GraphicsImaging/Conceptual/OpenGL-MacProgGuide/opengl_performance/chapter_13_section_4.html>
<http://developer.apple.com/documentation/GraphicsImaging/Conceptual/OpenGL-MacProgGuide/opengl_performance/chapter_13_section_2.html#//apple_ref/doc/uid/TP40001987-CH213-SW23>
So, in short: if you use 8888 mode for your texture data, you are
fine; if you use 1555 mode, it's still OK; anything else will cause
you PAIN PAIN PAIN. Sure enough, quickly hacking the ScummVM code to
allocate a (1)555 (15 bpp) surface instead of a 565 (16 bpp) surface
gave a big speed boost; the app was usable again (but still took
about 50% CPU time).
My question now: would it be possible to take this into account in
the compat layer, or in the SDL Cocoa OpenGL code? Given that Apple
specifically documents this bottleneck... Because it seems that SDL
does a much better job at these bitmap conversions. At least on my
PowerBook G4 1.5 GHz, with Radeon 9700 XT onboard graphics. As a
quick test of that claim, I changed SDL_compat.c, replacing in line 487
    SDL_VideoTexture = SDL_CreateTexture(desired_format,
                                         SDL_TEXTUREACCESS_LOCAL,
                                         width, height);

by

    SDL_VideoTexture = SDL_CreateTexture(SDL_PIXELFORMAT_RGB888,
                                         SDL_TEXTUREACCESS_LOCAL,
                                         width, height);
and the speed also went up to “fast enough” again (although the
colors were incorrect this way).
Bye,
Max