Speed weirdness

I am having a strange speed problem. I am running on a 3 GHz XP
machine with the desktop set to 32-bit color mode.

I have a 640x480 background buffer (returned by the SDL_Image lib
loader) that I blit to a 640x480 screen buffer created with:

SDL_SetVideoMode(640, 480, 32, SDL_SWSURFACE);

I then blit 100 70x70 alpha-channel images to the screen and do a full
screen SDL_UpdateRect().

I am getting around 150fps. This seems slow to me, so I started
playing around. Nothing seemed to speed it up.

Then I noticed something strange. When my mouse cursor went over the
"close" box of the SDL window, a tool-tip popped up and for about 1
second after that, the fps jumped way up, then back to 150fps. I did
a little experimenting and found that if I dropped a menu down in
another app (Windows Task Manager for example) and left it down, my
fps went to 300+ and stayed there until I clicked to get rid of the
menu.

I checked the CPU usage and my SDL app is only using 30% of the CPU.
Why is this? And why does dropping a menu on another app cause the
frame rate to double? Can anyone else reproduce this? When my app
has focus, I’d like it to use a good chunk of the CPU.

Is this a SDL thing? Or a Windows thing?

Tankko

I am having a strange speed problem. I am running on a 3 GHz XP
machine with the desktop set to 32-bit color mode.

I have a 640x480 background buffer (returned by the SDL_Image lib
loader) that I blit to a 640x480 screen buffer created with:

SDL_SetVideoMode(640, 480, 32, SDL_SWSURFACE);

I then blit 100 70x70 alpha-channel images to the screen and do a full
screen SDL_UpdateRect().

I am getting around 150fps. This seems slow to me, so I started
playing around. Nothing seemed to speed it up.

  1. SDL_SWSURFACE. What for?
  2. Why force a specific pixel depth? Why not use SDL_ANYFORMAT and convert
    your surfaces to the appropriate format? Even in the same pixel depth SDL
    might have to do something like convert between orderings.
  3. Are your surfaces in screen format? Or are you having it do on-the-fly
    conversion in blits? If your screen surface is not 100% the same as the OS’s
    screen this could be a double-whammy, convert-blitting to the screen then
    convert-blitting to the OS.
  4. Alpha channel blending is a full-out read-modify-write operation. Unless
    you arrange some form of hardware acceleration, it’s going to be slow. Note
    that alpha=128 is a special case for which SDL is optimized.
  5. Full-screen updating. If you’re going to be updating the entire screen
    every frame, why not use SDL_DOUBLEBUF and SDL_Flip? That’s what it’s there
    for.
  6. When you run the numbers, 150 FPS in 640x480x32 software mode is 175 MB/s
    channeled through your CPU without hardware accel for the screen update
    alone. Considering all the alpha blits it works out to around 525MB/s. I’d
    say that’s pretty impressive.
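
As a rough check of the screen-update figure in point 6, the arithmetic can be written out in C. This is only a sketch: the function names are made up for illustration, and it covers the full-screen copy alone, not the alpha blits. At 640x480, 32 bits per pixel, and 150 frames per second it gives 184,320,000 bytes/s, which is roughly 175 MiB/s.

```c
#include <stdint.h>

/* Bytes moved per second by one full-surface copy repeated every frame.
 * Hypothetical helper, not part of SDL. */
static uint64_t bytes_per_second(uint32_t width, uint32_t height,
                                 uint32_t bits_per_pixel, uint32_t fps)
{
    uint64_t bytes_per_frame =
        (uint64_t)width * height * (bits_per_pixel / 8);
    return bytes_per_frame * fps;
}

/* The same figure expressed in MiB/s (1 MiB = 1048576 bytes). */
static double mib_per_second(uint32_t width, uint32_t height,
                             uint32_t bits_per_pixel, uint32_t fps)
{
    return (double)bytes_per_second(width, height, bits_per_pixel, fps)
           / (1024.0 * 1024.0);
}
```

Calling mib_per_second(640, 480, 32, 150) reproduces the screen-update number quoted above.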

Then I noticed something strange. When my mouse cursor went over the
"close" box of the SDL window, a tool-tip popped up and for about 1
second after that, the fps jumped way up, then back to 150fps. I did
a little experimenting and found that if I dropped a menu down in
another app (Windows Task Manager for example) and left it down, my
fps went to 300+ and stayed there until I clicked to get rid of the
menu.

Can’t say, except I expect that FPS reading is illusory somehow.

When my app has focus, I’d like it to use a good chunk of the CPU.

20 years from now the surviving mutant roaches on this planet will wonder why
your program consumes 100% CPU to do essentially nothing, the same way I
wonder why freaking Dungeons of Daggorath feels the need to pin my dual
Opteron.

Consuming 100% CPU is also bad for delicate timings in your program, not
good. This is because the operating system will temporarily suspend your
program, often, and at the worst possible times, because your program’s

  1. so impolite that it’s put in a low-priority queue out of self-defense
  2. giving no hints whatsoever about what time it would prefer to be suspended

On Sunday 17 April 2005 09:24, Tankko Omaskio wrote:
  1. SDL_SWSURFACE. What for?

Please understand, I am doing speed tests. I want to see what the SW speed is.

  2. Why force a specific pixel depth? Why not use SDL_ANYFORMAT and convert
    your surfaces to the appropriate format? Even in the same pixel depth SDL
    might have to do something like convert between orderings.

Interesting. I’ll explore this.

  3. Are your surfaces in screen format? Or are you having it do on-the-fly
    conversion in blits? If your screen surface is not 100% the same as the OS’s
    screen this could be a double-whammy, convert-blitting to the screen then
    convert-blitting to the OS.

All surfaces are in screen format.

  4. Alpha channel blending is a full-out read-modify-write operation. Unless
    you arrange some form of hardware acceleration, it’s going to be slow. Note
    that alpha=128 is a special case for which SDL is optimized.

True, but I need full alpha channel blending.
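
To illustrate why per-pixel alpha blending is a read-modify-write operation, here is a minimal sketch of blending 32-bit pixels in software. It assumes a 0xAARRGGBB source layout and uses a plain rounded division. It is not SDL's actual blitter, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Blend one 8-bit channel: (src*alpha + dst*(255-alpha)) / 255, rounded. */
static uint8_t blend_channel(uint8_t src, uint8_t dst, uint8_t alpha)
{
    return (uint8_t)((src * alpha + dst * (255 - alpha) + 127) / 255);
}

/* Blend a 0xAARRGGBB source pixel over a destination pixel.
 * The result carries no alpha (top byte is 0). */
static uint32_t blend_argb_over(uint32_t src, uint32_t dst)
{
    uint8_t a = (uint8_t)(src >> 24);
    uint8_t r = blend_channel((uint8_t)(src >> 16), (uint8_t)(dst >> 16), a);
    uint8_t g = blend_channel((uint8_t)(src >> 8),  (uint8_t)(dst >> 8),  a);
    uint8_t b = blend_channel((uint8_t)src,         (uint8_t)dst,         a);
    return ((uint32_t)r << 16) | ((uint32_t)g << 8) | (uint32_t)b;
}

/* Blend a row of pixels in place: every destination pixel is read,
 * combined with the source, and written back. */
static void blend_row(const uint32_t *src, uint32_t *dst, size_t count)
{
    for (size_t i = 0; i < count; i++)
        dst[i] = blend_argb_over(src[i], dst[i]);
}
```

Unlike an opaque copy, the inner loop has to read the destination before writing it, which is the extra cost point 4 refers to.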

  5. Full-screen updating. If you’re going to be updating the entire screen
    every frame, why not use SDL_DOUBLEBUF and SDL_Flip? That’s what it’s there
    for.

Again, I am doing speed tests.

Can’t say, except I expect that FPS reading is illusory somehow.

It’s not just the FPS printed, the demo visibly runs much much faster.

Consuming 100% CPU is also bad for delicate timings in your program, not
good. This is because the operating system will temporarily suspend your
program, often, and at the worst possible times, because your program’s

  1. so impolite that it’s put in a low-priority queue out of self-defense
  2. giving no hints whatsoever about what time it would prefer to be suspended

True, but once again, I am running benchmarks. But, even with a final
app, I would expect more than 30% when the app had focus.
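
For a final app (as opposed to a benchmark), one common way to give the OS a hint about when to suspend the program is to sleep for whatever is left of each frame. The helper below is a sketch with a hypothetical name. It only computes the wait, given tick values from a millisecond counter such as SDL_GetTicks().

```c
#include <stdint.h>

/* Milliseconds left in the current frame, or 0 if the frame overran.
 * frame_start and now are readings of a millisecond tick counter;
 * frame_ms is the target frame length (for example 16 for about 60 FPS).
 * Hypothetical helper, not part of SDL. */
static uint32_t frame_delay_ms(uint32_t frame_start, uint32_t now,
                               uint32_t frame_ms)
{
    /* Unsigned subtraction stays correct if the counter wraps around. */
    uint32_t elapsed = now - frame_start;
    return (elapsed < frame_ms) ? (frame_ms - elapsed) : 0;
}
```

In an SDL 1.2 loop this would be used roughly as: record SDL_GetTicks() at the start of the frame, draw, then pass a non-zero result of frame_delay_ms() to SDL_Delay().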

The real question of this post had to do with the doubling of speed
with a menu pulled down on another app, all the rest was background,
but you bring up some great points for optimization once my speed
tests are done.

Tankko

  1. SDL_SWSURFACE. What for?

Please understand, I am doing speed tests. I want to see what the SW speed
is.

Understood. :) Sorry if I sounded frustrated, I was unaware it was a
benchmark. We get ‘why do I only get 10FPS in 1600x1280x32 software
doublebuffering’ questions every other week, and all too often I have to hack
quickly-written SDL apps to correct their excessive CPU consumption; running
a triple-duty server/compiler/gaming box means I’m unusually conscious of
waste.

  5. Full-screen updating. If you’re going to be updating the entire screen
    every frame, why not use SDL_DOUBLEBUF and SDL_Flip? That’s what it’s there
    for.

Again, I am doing speed tests.

Still, since you’re doing complete fullscreen redraws, you might as well use
SDL_DOUBLEBUF and SDL_Flip(). It won’t confer any advantage in software mode
but hardware doublebuffering really flies if you can get it – 50% less
data movement for every frame.

Can’t say, except I expect that FPS reading is illusory somehow.

It’s not just the FPS printed, the demo visibly runs much much faster.

SDL and/or the OS might not bother updating covered parts, leading to increased
framerates via less work, but a tooltip doesn’t cover THAT much. Hard to say
without seeing the code. I don’t suppose you could post a minimal example
that reproduces this?
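
One way to rule out a measurement artifact in such a minimal example is to average the frame count over a fixed window rather than timing single frames. The counter below is a sketch under that assumption. The struct and function names are hypothetical, and the tick values are expected to come from a millisecond counter such as SDL_GetTicks().

```c
#include <stdint.h>

typedef struct {
    uint32_t window_start; /* tick value when the current window began */
    uint32_t frames;       /* frames counted so far in this window */
    double   last_fps;     /* most recent completed measurement */
} FpsCounter;

static void fps_init(FpsCounter *c, uint32_t now)
{
    c->window_start = now;
    c->frames = 0;
    c->last_fps = 0.0;
}

/* Call once per frame, after the screen update.
 * Returns 1 when a new reading was stored in last_fps, otherwise 0. */
static int fps_tick(FpsCounter *c, uint32_t now, uint32_t window_ms)
{
    uint32_t elapsed;

    c->frames++;
    elapsed = now - c->window_start;
    if (elapsed < window_ms)
        return 0;

    /* Average over the whole window that actually elapsed. */
    c->last_fps = c->frames * 1000.0 / elapsed;
    c->window_start = now;
    c->frames = 0;
    return 1;
}
```

With a window of 1000 ms, the reading is printed about once per second and is less sensitive to the resolution of the tick counter than a per-frame calculation.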

Consuming 100% CPU is also bad for delicate timings in your program,
not good. This is because the operating system will temporarily suspend
your program, often, and at the worst possible times, because your
program’s 1) so impolite that it’s put in a low-priority queue out of
self-defense 2) giving no hints whatsoever about what time it would
prefer to be suspended

True, but once again, I am running benchmarks. But, even with a final
app, I would expect more than 30% when the app had focus.

Why? If your app can do the same thing with only 20%, or 10%, or 1%, why use
30? The more you can do with less the faster your final app will be, and the
less your app does with more the slower it will be. And CPU usage will cease
to be a useful benchmark once it’s hardware accelerated anyway. I still
think 520MB/s is impressive speed as is, you might be better off considering
ways to do more with that 520.

The real question of this post had to do with the doubling of speed
with a menu pulled down on another app, all the rest was background,
but you bring up some great points for optimization once my speed
tests are done.

Tankko

Good, glad I’m helping.

On Monday 18 April 2005 11:22, Tankko Omaskio wrote: