Faster way?

I need a bit of help speeding something up.
I’m porting a DOS-based isometric engine to SDL, and I got it running,
but I’m having a speed problem. On my PII-450 I get 16 fps at 640x480,
and on my P1-150MMX laptop I get 4.8 fps at 640x480. Needless to say,
that is lousy performance, so I really need some ideas on how to speed
this thing up.

Here is the basic design:
The display is set to 640x480x8bpp (SDL_HWSURFACE).
The drawing functions draw to a back buffer (not an SDL surface, but
a big old chunk o’ memory pointed to by char *virtual_screen).
When the drawing is done, it calls my wrapper function
display_screen(char *buffer).
I’m not worried about the map-drawing function right now because
I know it is not optimized at all.
I did some checking, and the main loop runs about 150,000 times/sec
if neither the draw function nor the screen update function is called. I
also had these functions return right away (so I could find out how
much time was lost to function call overhead), and the main loop
was still running at over 140,000 times/sec.
If I put in just the display_screen function, it drops to around
39 fps (at 640x480x8bpp), and if I comment out the SDL_UpdateRect
call it jumps up to around 440 fps, but that is useless because I
don’t see anything that way…

I was hoping that this function would run much faster than it does,
because I can’t see any way to speed it up. Does anyone have any ideas?

_screenWidth, _pitch and _bufferSize are globals set when SDL is
initialized.

/* Copy the tightly packed back buffer to the screen surface one row at
   a time (the surface pitch may be wider than _screenWidth), then ask
   SDL to show the result. */
void display_screen(char *buffer) {
    int screenOffset = 0;
    int bufferOffset;
    int locked = 0;

    /* Some surfaces must be locked before their pixels are touched. */
    if (SDL_MUSTLOCK(screen))
    {
        if (SDL_LockSurface(screen) < 0)
            return;
        locked = 1;
    }

    for (bufferOffset = 0; bufferOffset < _bufferSize; bufferOffset += _screenWidth)
    {
        memcpy(&bits[screenOffset], &buffer[bufferOffset], _screenWidth);
        screenOffset += _pitch;
    }

    if (locked)
        SDL_UnlockSurface(screen);

    /* Push the whole frame to the display. */
    SDL_UpdateRect(screen, 0, 0, _screenWidth, _screenHeight);
}

Thanks, I just put this in and it helps a bit; I got the fps up to 18.51.
I think I should be able to get it up to 30 fps by optimising the
isometric engine itself. I’d like to get it up to 50, but I don’t think
that’s going to be possible for a (windowed) 640x480 display.

			-fjr

John Garrison wrote:

You might want to try using SDL_Flip() instead of your own double buffer. I
have found that SDL_UpdateRect is pretty slow compared to whatever Sam
does for SDL_Flip(). In fact, the ex2 demo in my PowerDraw lib went up from
10 to 27 fps just by getting rid of all the update rects and adding SDL_Flip() as
the double buffer. The SDL_Flip optimization alone took it from 21 fps to
27 fps, so that might be what you are looking for.

Why are windowed displays on Linux so slow? I have never done any
graphics programming on Linux, but coming from a Mac background these
frame rates seem very low. I am just curious.

Sean

Many Mac programs run at 60 fps on 60 MHz 601 CPUs. Is this an SDL thing,
a Linux thing, or an X thing? Or am I imagining it?

Sean

I think it’s just because the code I’m working with is sooo very
unoptimized. I’m still trying to figure it out (as I said earlier,
I’m porting this from DOS, so initially all I did was change the
graphics & keyboard I/O), but it looks like the engine examines every
map point, which is really the wrong way to do it. You should figure
out the range of map points that can be seen, then only look at those.
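
As a rough illustration of "figure out the range of map points that can be seen", here is a minimal sketch. It assumes a diamond-style isometric projection, which may not match what the ported engine actually does; the names TileRange and visible_tile_range are made up for the example. It inverts the projection at the corners of the view rectangle and returns a clamped bounding box in map coordinates.

```c
#include <assert.h>

/* Inclusive range of map coordinates that may be visible. */
typedef struct {
    int mx_min, mx_max;
    int my_min, my_max;
} TileRange;

/* Integer division rounding toward negative infinity (b > 0). */
static int floor_div(int a, int b) {
    int q = a / b;
    if (a % b != 0 && a < 0) q--;
    return q;
}

/* Integer division rounding toward positive infinity (b > 0). */
static int ceil_div(int a, int b) {
    int q = a / b;
    if (a % b != 0 && a > 0) q++;
    return q;
}

static int clamp(int v, int lo, int hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Assumed (hypothetical) projection: tile (mx, my) is drawn at
 *   sx = (mx - my) * tile_w / 2,   sy = (mx + my) * tile_h / 2.
 * The view rectangle [x0,x1] x [y0,y1] is given in world pixels.
 * One tile of padding covers tiles that only partly overlap the view.
 */
TileRange visible_tile_range(int x0, int y0, int x1, int y1,
                             int tile_w, int tile_h,
                             int map_w, int map_h) {
    int hw = tile_w / 2, hh = tile_h / 2;
    int d = 2 * hw * hh;
    TileRange r;

    assert(hw > 0 && hh > 0 && map_w > 0 && map_h > 0);
    assert(x0 <= x1 && y0 <= y1);

    /* mx = (sx/hw + sy/hh) / 2 grows toward the bottom-right corner. */
    r.mx_min = clamp(floor_div(x0 * hh + y0 * hw, d) - 1, 0, map_w - 1);
    r.mx_max = clamp(ceil_div(x1 * hh + y1 * hw, d) + 1, 0, map_w - 1);
    /* my = (sy/hh - sx/hw) / 2 grows toward the bottom-left corner. */
    r.my_min = clamp(floor_div(y0 * hw - x1 * hh, d) - 1, 0, map_h - 1);
    r.my_max = clamp(ceil_div(y1 * hw - x0 * hh, d) + 1, 0, map_h - 1);
    return r;
}
```

The box is a little generous, because the view is a rotated rectangle in map space, but the draw loop then visits a few hundred tiles instead of the whole map.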

				-fjr


You might want to try using SDL_Flip() instead of your own double buffer. I
have found that SDL_UpdateRect is pretty slow compared to whatever Sam
does for SDL_Flip(). In fact, the ex2 demo in my PowerDraw lib went up from
10 to 27 fps just by getting rid of all the update rects and adding SDL_Flip() as
the double buffer. The SDL_Flip optimization alone took it from 21 fps to
27 fps, so that might be what you are looking for.


seanh21 at mail.execpc.com wrote:

Why are windowed displays on Linux so slow? I have never done any
graphics programming on Linux, but coming from a Mac background these
frame rates seem very low. I am just curious.

The Dgen Sega Genesis emulator has been known to run upwards of 60 fps on my 400 MHz
computer, so I wouldn’t call Linux windowed displays slow.

AUGH!
I give up. If anyone wants to try their hand at speeding this
isometric engine up, I could use the help. Here is the tarball:

http://fjramsay.simplenet.com/pub/isotest.tar.gz

I’ve noticed one piece of really strange behavior: if you run it as
root you get a higher frame rate.

I really want a user account to hit 50fps with this.

		-fjr

Hello,

Frank Ramsay wrote:

If I put in just the display_screen function, it drops to around
39 fps (at 640x480x8bpp), and if I comment out the SDL_UpdateRect
call it jumps up to around 440 fps, but that is useless because I
don’t see anything that way…

From my experience with XWindows and running SDL in a window, I would
advise you to avoid updates of the whole screen. The 39 fps you get for
your app only updating the screen is not that bad.
The graphical isometric frontend for worldforge
(http://www.worldforge.org), which I am developing, tries to figure out
which parts of the screen actually need a screen update, and so it
updates only the needed parts. (If you only need to update half of the
screen you get 80 fps…) Well, if I need to scroll the screen, there is
(currently…) no way but to update the whole screen, but scrolling
even at 20 fps does not look that bad.

Using your own backbuffer is quite a good idea, especially if you plan
any alpha blits, because read access to the “real” screenbuffer (either
video RAM or shared mem) is always slower compared to main memory access.

Just my 0.02 euro.


Karsten-O. Laux
klaux at student.uni-kl.de
http://www.rhrk.uni-kl.de/~klaux
UIN 21614933 (Bert)

Ya, I started a thread on that before. When you run it as root, SDL will use
DGA, so it will run faster. I’m writing up my own isometric engine, btw. Still
working on tiling it.

-Garrett, WPI student majoring in Computer Science.

“He who joyfully marches in rank and file has already earned
my contempt. He has been given a large brain by mistake, since
for him the spinal cord would suffice.” -Albert Einstein

On Sat, 14 Aug 1999, you wrote:

I’ve noticed one piece of really strange behavior: if you run it as
root you get a higher frame rate.

It doesn’t seem to be working that way on my system. I just got a new
box (P3 450) and I get between 18 and 18.5 fps no matter who I run it
as or what I do.

To the original poster: one thing I did notice is that the mouse flickers
like crazy when it’s over the isotest window. That leads me to
believe that it’s redrawing the whole screen with each frame.
This is generally a Bad Idea ™ unless the whole screen changes
between frames. It would probably take some major restructuring of
the code, but if you could implement an “updates as required” method to
only update the parts of the screen that need it, I’m willing to bet
your frame rates would jump quite a bit right there.
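
For what an "updates as required" scheme might look like, here is a minimal sketch of dirty-rectangle bookkeeping. The names (DirtyList, dirty_add, MAX_DIRTY) are invented for the example and are not part of SDL or of the isotest code. The drawing code reports each region it changed, and only those regions get pushed to the display at the end of the frame.

```c
#include <assert.h>

#define MAX_DIRTY 64   /* arbitrary capacity chosen for the sketch */

typedef struct { int x, y, w, h; } Rect;

typedef struct {
    Rect rects[MAX_DIRTY];
    int count;
    int screen_w, screen_h;
} DirtyList;

void dirty_init(DirtyList *d, int screen_w, int screen_h) {
    assert(screen_w > 0 && screen_h > 0);
    d->count = 0;
    d->screen_w = screen_w;
    d->screen_h = screen_h;
}

/* Forget all recorded regions (call after the screen was updated). */
void dirty_clear(DirtyList *d) {
    d->count = 0;
}

/* Smallest rectangle containing both a and b. */
static Rect rect_union(Rect a, Rect b) {
    Rect r;
    int x1 = (a.x + a.w > b.x + b.w) ? a.x + a.w : b.x + b.w;
    int y1 = (a.y + a.h > b.y + b.h) ? a.y + a.h : b.y + b.h;
    r.x = a.x < b.x ? a.x : b.x;
    r.y = a.y < b.y ? a.y : b.y;
    r.w = x1 - r.x;
    r.h = y1 - r.y;
    return r;
}

/*
 * Record that a region changed. The region is clipped to the screen.
 * If the list is full, everything recorded so far is collapsed into one
 * bounding box, so no update is ever lost.
 */
void dirty_add(DirtyList *d, int x, int y, int w, int h) {
    Rect r;
    int i;

    if (x < 0) { w += x; x = 0; }
    if (y < 0) { h += y; y = 0; }
    if (x + w > d->screen_w) w = d->screen_w - x;
    if (y + h > d->screen_h) h = d->screen_h - y;
    if (w <= 0 || h <= 0)
        return;                      /* nothing visible changed */

    if (d->count == MAX_DIRTY) {
        for (i = 1; i < d->count; i++)
            d->rects[0] = rect_union(d->rects[0], d->rects[i]);
        d->count = 1;
    }
    r.x = x; r.y = y; r.w = w; r.h = h;
    d->rects[d->count++] = r;
}
```

With SDL, the recorded rectangles could be copied into an SDL_Rect array and handed to SDL_UpdateRects(screen, count, rects) in place of the full-screen SDL_UpdateRect call, followed by dirty_clear().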

-KW

This makes sense. Since the kind of engine I’m thinking of would be
for real-time strategy (WarCraft, Civ:CTP, Age of Empires, etc.), I don’t
need to worry about having layers above the units. I could have three
buffers:
One for the ground stuff (ground/trees/rocks/etc.)
One for the units & buildings
A combined buffer (the traditional back buffer)

This way I just have to track the movement on the unit buffer,
merge that into the combined buffer, and update those areas
of the screen. The only time I’d have to update the entire screen
is when the view window moves… Even then I could speed it up by
having the ground buffer contain a larger area than the view shows
and just grabbing the rectangle I need for the view from it; this
would prevent the engine from having to re-render the ground layer
for every screen move…
This also means I don’t have to optimize the rendering engine any
further, since it would not get used nearly as much. I just did
a quick test by setting the screen size to 320x200 (figuring that 1/4 of
a 640x480 screen changes each frame); this gives root almost
80 fps and the regular user over 40 fps.
I think I’m on to something, thanks!
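
The three-buffer idea could be sketched roughly as below. This is only an illustration under assumptions of my own: 8bpp tightly packed buffers, palette index 0 meaning "transparent" in the unit layer, and made-up names (Layer, compose_region). It rebuilds one rectangle of the combined buffer from an oversized ground buffer and a view-sized unit buffer.

```c
#include <assert.h>
#include <string.h>

typedef struct {
    unsigned char *pixels;  /* 8bpp, one byte per pixel */
    int w, h;               /* tightly packed: pitch == w */
} Layer;

/*
 * Rebuild the rectangle (x, y, w, h) of the combined back buffer.
 * ground may be larger than the view; (view_x, view_y) is where the
 * view's top-left corner sits inside it. units and combined are
 * view-sized. Index 0 in the unit layer is assumed to be transparent.
 */
void compose_region(Layer *combined, const Layer *ground, const Layer *units,
                    int view_x, int view_y,
                    int x, int y, int w, int h) {
    int row, col;

    assert(units->w == combined->w && units->h == combined->h);
    assert(x >= 0 && y >= 0 && w >= 0 && h >= 0);
    assert(x + w <= combined->w && y + h <= combined->h);
    assert(view_x >= 0 && view_y >= 0);
    assert(view_x + combined->w <= ground->w);
    assert(view_y + combined->h <= ground->h);

    for (row = y; row < y + h; row++) {
        unsigned char *dst = combined->pixels + row * combined->w + x;
        const unsigned char *gnd =
            ground->pixels + (view_y + row) * ground->w + view_x + x;
        const unsigned char *unt = units->pixels + row * units->w + x;

        memcpy(dst, gnd, (size_t)w);       /* ground layer first */
        for (col = 0; col < w; col++)      /* then units on top */
            if (unt[col] != 0)
                dst[col] = unt[col];
    }
}
```

Scrolling the view then only changes view_x and view_y as long as the view stays inside the pre-rendered ground buffer; the ground layer needs re-rendering only when the view is about to leave it.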

		-fjr

btw, when will there be a graphical front end for worldforge? I read
about it on linuxgames the other day, but it seemed to indicate a text
mode front end was being developed first.


Frank Ramsay wrote:

I think I’m on to something, thanks!

:-) This is good to hear.

                  -fjr

btw, when will there be a graphical front end for worldforge? I read
about it on linuxgames the other day, but it seemed to indicate a text
mode front end was being developed first.

Hm, in fact the graphical frontend, named uclient, is currently the most
advanced frontend.
-->> http://www.worldforge.org/website/clients/uclient
--
Karsten-O. Laux
klaux at student.uni-kl.de
http://www.rhrk.uni-kl.de/~klaux
UIN 21614933 (Bert)

Frank Ramsay wrote:

This makes sense. Since the kind of engine I’m thinking of would be
for real-time strategy (WarCraft, Civ:CTP, Age of Empires, etc.), I don’t

Uhm… After playing Civ:CTP for two weeks now, I still hadn’t noticed that it is a
real-time strategy game ;-))

Vasek

DOH! You’re right. I was thinking of the iso-engine itself, which is very
similar to a real-time strategy one.

		-fjr


seanh21 at mail.execpc.com wrote:

Many Mac programs run 60fps on 60mhz 601 cpus. Is this an SDL thing? or
a linux thing? or an X thing? or am I imagining it?

Well, in all fairness, you can’t compare a native Mac app to an emulation. I think you
are imagining it. I haven’t really had any problem with Linux speed. In fact, Basilisk
II in a window is much faster on Linux than on Windows.


Well, all Linux-promotion aside, there was a fundamental design
decision in X Windows to provide a network-aware GUI. X Windows’
design is network-based, and although there are extensions to provide
direct access to an X server for increased speed, a non-root user
generally doesn’t have access to these, for security reasons. The
networking ability of X allows you to remotely run applications, but
performance may suffer for local access to the screen, especially if a
non-root user is running the program.

 MacOS and Windows don't care so much about security, so they give 
 regular users direct access to the video buffer (which could let one 
 program grab data from another program without its knowledge or 
 consent, a no-no for a secure OS).
 
 If you need the speed that DGA gives, for a non-root user, you can 
 make your program suid root.  Just be aware that the onus of security 
 now falls on your program, so if someone figures out how to break into 
 your program, they can still cause havoc.
 
 If the program is going to be running on a single-user home machine 
 (as most games do), security isn't such a great concern.  In this 
 case, you can provide an install program that makes the game 
 executable suid root (giving appropriate warning to the user).  Of 
 course, the install itself must be run as root, but that's normal for 
 installing shared binaries.  You can have your installer detect if 
 it's not run as root, and in that case, warn the user that they won't 
 get the increased performance of DGA unless they manually make the 
 game binary suid root.
 
 Warren E. Downs

______________________________ Reply Separator _________________________________
Subject: Re: [SDL] faster way?
Author: at internet-mail
Date: 8/15/99 10:55 PM

seanh21 at mail.execpc.com wrote:

Many Mac programs run 60fps on 60mhz 601 cpus. Is this an SDL thing? or
a linux thing? or an X thing? or am I imagining it?

Well, in all fairness, you can’t compare a native Mac app to an emulation. I think
you are imagining it. I haven’t really had any problem with Linux speed. In fact,
Basilisk II in a window is much faster on Linux than on Windows.

Sean

On Sun, 15 Aug 1999, John Garrison wrote:

seanh21 at mail.execpc.com wrote:

Why are windowed displays on linux so slow? I have never done any
graphics programming on linux but from my mac background these framerates
seem very low. I am just curious.

The Dgen Sega Genesis emulator has been known to run upwards of 60 fps on my
400 MHz computer, so I wouldn’t call Linux windowed displays slow.

Warren Downs wrote:

 If the program is going to be running on a single-user home machine
 (as most games do), security isn't such a great concern.  In this
 case, you can provide an install program that makes the game
 executable suid root (giving appropriate warning to the user).  Of
 course, the install itself must be run as root, but that's normal for
 installing shared binaries.  You can have your installer detect if
 it's not run as root, and in that case, warn the user that they won't
 get the increased performance of DGA unless they manually make the
 game binary suid root.

Apart from the security issues, running as suid root has one more disadvantage: the
meaning of $HOME changes, so if your game saves something in e.g. ~/.mygame, it will
be saved in root’s home directory :-(

To solve this, I would write a wrapper (script or binary) that
retrieves the correct home directory before running the suid game
binary. If you don’t want people running the binary directly, require
a parameter (say, the real home directory), and if it is not provided, give
a message about needing to run the wrapper instead. The wrapper would
of course provide that parameter to the suid binary.
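
As a possible alternative to the wrapper (not what is described above, just a sketch assuming a POSIX system), the suid binary could look up the home directory of the real user itself instead of relying on $HOME. The function name real_user_home is made up for the example.

```c
#include <assert.h>
#include <pwd.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy the home directory of the *real* user into buf.
 * Returns 0 on success, -1 if the lookup fails or buf is too small.
 */
int real_user_home(char *buf, size_t buflen) {
    struct passwd *pw;

    assert(buf != NULL && buflen > 0);

    /* getuid() is the real uid (whoever started the program);
       geteuid() would be root inside a suid root binary. */
    pw = getpwuid(getuid());
    if (pw == NULL || pw->pw_dir == NULL || pw->pw_dir[0] == '\0')
        return -1;
    if (strlen(pw->pw_dir) >= buflen)
        return -1;

    strcpy(buf, pw->pw_dir);
    return 0;
}
```

Files created while the effective uid is still root will be owned by root wherever they are written, so a game would probably also want to give up its privileges before saving anything.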

 One additional design method would be to make a general game engine 
 that runs suid root and can access the screen via DGA.  Then, have the 
 main game program, which provides the game's "character", as a 
 separate binary, which communicates with the suid root binary via an 
 IPC method, feeding it the graphics, sound, etc.  Basically the same 
 idea as X, except moving the rendering engine across the IPC barrier.
 
 Warren

______________________________ Reply Separator _________________________________
Subject: Re: [SDL] faster way?
Author: at internet-mail
Date: 8/17/99 10:17 AM

Warren Downs wrote:

 If the program is going to be running on a single-user home machine 
 (as most games do), security isn't such a great concern.  In this
 case, you can provide an install program that makes the game
 executable suid root (giving appropriate warning to the user).  Of
 course, the install itself must be run as root, but that's normal for 
 installing shared binaries.  You can have your installer detect if
 it's not run as root, and in that case, warn the user that they won't 
 get the increased performance of DGA unless they manually make the
 game binary suid root.

Apart from the security issues, running as suid root has one more disadvantage: the
meaning of $HOME changes, so if your game saves something in e.g. ~/.mygame, it will
be saved in root’s home directory :-(

And in this case, you need to make the suid root executable only
execute scripts from a certain directory (not an option that the user
passes in), then make that hard-coded directory owned by root. The
directory itself should only be root-writable (so no new files can be
added by others), and the files in it should also be only root
writable. This is the only way to securely execute scripts from
within a suid root binary. And of course, the binary itself must be
only root writable.

 Warren

______________________________ Reply Separator _________________________________
Subject: Re: [SDL] faster way?
Author: at internet-mail
Date: 8/19/99 6:13 AM

Vaclav Slavik wrote:

Warren Downs wrote:

 If the program is going to be running on a single-user home machine 
 (as most games do), security isn't such a great concern.  In this
 case, you can provide an install program that makes the game
 executable suid root (giving appropriate warning to the user).  Of
 course, the install itself must be run as root, but that's normal for 
 installing shared binaries.  You can have your installer detect if
 it's not run as root, and in that case, warn the user that they won't 
 get the increased performance of DGA unless they manually make the
 game binary suid root.

Apart from the security issues, running as suid root has one more disadvantage: the
meaning of $HOME changes, so if your game saves something in e.g. ~/.mygame, it will
be saved in root’s home directory :-(

Err, I think there's another problem: for my OpenVentura Isometric Game
Development System I'm using embedded Python, so the Python code
is executed as root, and anyone can do anything! :-(

So, for the time being, anyone who wants to use an extension language
can't chroot his game…

Well, I'm not sure, but...


Letter written from Malaga, Spain; we're in our major festival, "la feria".