Performance SDL 1.2.7 vs Allegro

I'm presently evaluating libraries for performance and found a large
difference between the two: Allegro runs at ~220 FPS while SDL manages
only ~70 FPS. Duplicating this single statement halves the SDL frame
rate: SDL_BlitSurface(image, NULL, pkScreen, NULL). As an extra note,
without double buffering Allegro ran at ~255 FPS while SDL ran at ~75 FPS.

Is there a problem with the DirectX driver in SDL, am I missing
something, or is Allegro simply a lot faster, at least on Windows?

TEST SPECIFICS:

  • Both libraries were using a DirectX driver.
  • The screen mode was 640x480x16 fullscreen with double buffering under
    Windows for both libraries.
  • 24- and 32-bit screen modes halved the FPS under SDL.
  • SDL was initialised with SDL_INIT_VIDEO | SDL_INIT_TIMER and the video
    with SDL_SetVideoMode(640, 480, 16, SDL_FULLSCREEN).
  • The image loaded after setting the video mode was a 640x480x24 image
    in either PNG or BMP format.
  • I am using "if (SDL_GetTicks() > uLastTime + 1000)" to time one second
    in SDL.
  • I tried the precompiled binaries and libs for SDL-MinGW, but they
    refused to run, citing 'failed to initialize 0x0000018' at runtime,
    which I believe has something to do with the DLL.
  • I built a standard SDL and an optimised version using: export
    CFLAGS="-funroll-loops -fexpensive-optimizations -march=i686 -mmmx".
    Both builds produced the same result.
  • I used the same compiler to make a standard build of Allegro.
  • The system runs WinXP on an AMD Athlon XP 1700+ with a GeForce2 Ti and
    DirectX 9.0c.
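For context, the benchmark described in these bullets can be sketched
roughly as follows. This is a reconstruction for illustration only, not
the original test program; the image path, the SDL_DOUBLEBUF flag, and
the exit handling are assumptions.

    /* Minimal SDL 1.2 blit benchmark sketch (not the original code). */
    #include <SDL.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        SDL_Init(SDL_INIT_VIDEO | SDL_INIT_TIMER);
        SDL_Surface *pkScreen =
            SDL_SetVideoMode(640, 480, 16, SDL_FULLSCREEN | SDL_DOUBLEBUF);
        SDL_Surface *image = SDL_LoadBMP("image.bmp"); /* 640x480x24 source */

        Uint32 uLastTime = SDL_GetTicks();
        Uint32 uFrames = 0;
        int bRunning = 1;

        while (bRunning) {
            SDL_BlitSurface(image, NULL, pkScreen, NULL);
            SDL_Flip(pkScreen);
            ++uFrames;

            /* Report FPS once per second, as in the timing test above. */
            if (SDL_GetTicks() > uLastTime + 1000) {
                printf("%u FPS\n", (unsigned)uFrames);
                uFrames = 0;
                uLastTime = SDL_GetTicks();
            }

            SDL_Event e;
            while (SDL_PollEvent(&e))
                if (e.type == SDL_KEYDOWN || e.type == SDL_QUIT)
                    bRunning = 0;
        }

        SDL_FreeSurface(image);
        SDL_Quit();
        return 0;
    }

Note that because the loaded surface is 24-bit and the screen is 16-bit,
every SDL_BlitSurface() call here performs a pixel-format conversion,
which is relevant to the numbers reported above.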

Regards,
Grembo.

Have you ever heard of vsync?

If vsync is on, your FPS is limited to the vertical retrace rate of your
monitor. 60 Hz and 70 Hz are common settings, so your monitor could be
set to 70 Hz, and that is why you are maxing out at 70 FPS in SDL.

What vsync does is wait to write to video memory until the vertical
retrace interrupt fires; it does this to avoid tearing. Have you ever
drawn an image, or say a coloured rectangle, to the screen really fast
and had it flicker? That is tearing, and it is the result of drawing to
video memory while the monitor is being updated (i.e. the first frame
might only get 1/3 of the box, the next frame 2/3, the third frame the
full box, and then it starts over).

Vsync is a good thing and should really only be turned off when you are
benchmarking, but here's some code to turn vsync on and off (it uses the
WGL_EXT_swap_control extension, so it needs an active OpenGL context on
Windows) (:

#include <windows.h> /* for APIENTRY and wglGetProcAddress() */

void VSyncOn(char On)
{
    typedef void (APIENTRY *WGLSWAPINTERVALEXT)(int);

    /* Needs a current OpenGL context, or the lookup returns NULL. */
    WGLSWAPINTERVALEXT wglSwapIntervalEXT =
        (WGLSWAPINTERVALEXT)wglGetProcAddress("wglSwapIntervalEXT");
    if (wglSwapIntervalEXT)
    {
        wglSwapIntervalEXT(On); /* set or unset vertical synchronisation */
    }
}

Let us know if that changes the results of your benchmarking at all.


SDL mailing list
SDL at libsdl.org
http://www.libsdl.org/mailman/listinfo/sdl

Thanks for your reply :)

I am not using OpenGL but rather DirectX. The DirectX refresh rate of
the mode in question is forced to 100 Hz via my video driver, and OpenGL
vsync has already been forced off there as well. I do fully understand
vsync. While the theory seems logical, it falls apart when doubling this
statement halves the FPS in double-buffered mode:
SDL_BlitSurface(image, NULL, pkScreen, NULL).


P.S. I just tried your code; just so you know, it has made no difference,
but I really do appreciate your efforts and I thank you.


Wow, I'm surprised. Did you call it like this?

VSyncOn(0);

I'm not sure if you have to call it before or after setting the video
mode, or if that even matters.


Oh OK, never mind then. Maybe someone else knows; that is an odd one.


Problem solved!

SDL_DisplayFormat() is now my best friend. The frame rate is now equal
to Allegro's at ~220 FPS.
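For anyone finding this thread later: the fix amounts to converting the
loaded 24-bit surface to the display's 16-bit pixel format once, up
front, so SDL_BlitSurface() no longer converts pixels on every frame. A
minimal sketch; the helper name, path, and error handling are mine, not
from the original code:

    #include <SDL.h>

    /* Load a BMP and convert it once to the display's pixel format.
     * Must be called after SDL_SetVideoMode(), since SDL_DisplayFormat()
     * needs to know the current screen format. Sketch only. */
    SDL_Surface *LoadOptimised(const char *path)
    {
        SDL_Surface *raw = SDL_LoadBMP(path);   /* e.g. a 640x480x24 BMP */
        if (!raw)
            return NULL;

        SDL_Surface *converted = SDL_DisplayFormat(raw); /* matches screen */
        SDL_FreeSurface(raw);   /* keep only the converted copy */
        return converted;
    }

With the source surface in the screen's format, each blit becomes a
straight copy (or a hardware blit) instead of a per-frame 24-to-16-bit
conversion, which is consistent with the 70-to-220 FPS jump reported.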


Haha, badass!!

Good to hear that (:

Something else pretty cool: a few people are working on something called
glSDL.

SDL itself doesn't make use of most of the hardware acceleration your
computer's hardware might have (it might use some, but the vast majority
it doesn't), but glSDL is going to be an OpenGL backend for the SDL
functions, so that hardware acceleration gets used behind the scenes
without you having to change any code (:

Pretty awesome stuff... can't wait till it comes out.


I gather the main improvements in this implementation would be taking
advantage of texture memory on the video card and potentially adding
rotated blits to SDL. This sounds very promising. It would probably
require some extensions to SDL to control what does and doesn't go into
texture RAM: using texture RAM for an often modified/read surface would
decrease performance, so it sounds like they will have to extend SDL a
little further to discriminate between static and volatile surfaces.
Anyway, I'm thinking about it way too much :)
Thanks for the info.


I suspect that this code will have no effect if vsync is forced on
or off in the driver config, as you can do with some drivers…
Totally platform and driver dependent, of course.

//David Olofson - Programmer, Composer, Open Source Advocate

.- Audiality -----------------------------------------------.
| Free/Open Source audio engine for games and multimedia.   |
| MIDI, modular synthesis, real time effects, scripting,... |
'-----------------------------------> http://audiality.org -'
--- http://olofson.net --- http://www.reologica.se ---

> I gather the main improvement in this implementation would be
> taking advantage of texture memory on the video card and

Yep - or actually, we have to keep surfaces in VRAM to have OpenGL
accelerate anything at all in the normal case. (Some drivers will
probably make use of DMA to accelerate glDrawPixels() if you use a
h/w supported pixel format, but so far, I've never seen that, at
least not on a Linux driver.)

> potentially adding rotated blits to SDL.

Maybe eventually, but that’ll be if/when rotated blits are a part of
the official SDL API, implemented (one way or another) on all
backends. The problem is that rotated blits take a lot more power
than plain blits when implemented in software, so using them
effectively makes your code depend on h/w acceleration for proper
operation (as in usable framerates) - and then you’re probably better
off using OpenGL directly.

You could of course think of glSDL as a handy OpenGL wrapper, but I
think it’s way too low level to make sense. Could happen that a pure
accelerated 2D lib is forked off of glSDL eventually, though…

> This sounds very promising. It would probably require some extensions
> to SDL to be able to manipulate what goes in and what doesn't go into
> texture ram.

That “extension” is already in place; the SDL_HWSURFACE flag. Only
blits from hardware surfaces to the display surface are OpenGL
accelerated.

This is actually more accidental than intentional; we can’t reliably
accelerate blits from software surfaces, as there is no requirement
that they are locked when modified. More specifically, we can’t tell
when you modify a s/w surface, so we can’t cache it in VRAM.

Note that glSDL/wrapper (found here: http://olofson.net/mixed.html)
does accelerate blits from any type of surface - but indeed, it
breaks if you modify surfaces without locking/unlocking them, which
makes it slightly incompatible with the official SDL API.
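The locking convention being referred to can be sketched like this. A
hypothetical helper, assuming a 16-bpp software surface; the
checkerboard fill is just for illustration:

    #include <SDL.h>

    /* Modify a surface's pixels only between SDL_LockSurface() and
     * SDL_UnlockSurface(). The unlock is what lets a backend like
     * glSDL/wrapper notice the change and re-upload its cached texture. */
    void FillCheckerboard(SDL_Surface *s)
    {
        if (SDL_MUSTLOCK(s) && SDL_LockSurface(s) < 0)
            return;

        for (int y = 0; y < s->h; ++y) {
            Uint16 *row = (Uint16 *)((Uint8 *)s->pixels + y * s->pitch);
            for (int x = 0; x < s->w; ++x)
                row[x] = ((x / 8 + y / 8) & 1) ? 0xFFFF : 0x0000;
        }

        if (SDL_MUSTLOCK(s))
            SDL_UnlockSurface(s);
    }

Writing to s->pixels outside a lock/unlock pair is exactly the case that
breaks the wrapper described above, since it has no way to see the
modification.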

> Using texture ram on an often modified/read surface would decrease
> performance

Depends… With good drivers, uploading a texture is at least as fast
as an optimal SDL s/w blit - and if the surface has an alpha channel,
the total performance will be very much higher than doing it all in
software.

> so it sounds like they will have to extend SDL a little further to
> discriminate between static and volatile surfaces.

Actually, I suggested an extension SDL_InvalidateRect() that would
tell the backend that a specific part of a surface has been modified.
This would avoid the requirement to update the whole surface/texture
if only part of it was changed. This was actually in one SDL release,
but was backed out again. Probably just as well, as it needs explicit
application support, is irrelevant to most backends, and probably
isn’t all that useful. (How often do you actually do procedural fx in
a small area of a surface? I could use that for the new radar display
in Kobo Deluxe, but I think it’s a rather unusual case.)

> Anyway, I'm thinking about it way too much :)

Unlikely. ;-)

This is quite tricky stuff, since the SDL API wasn’t really designed
with this type of backends in mind. I think we got it mostly right
now, but the problem is that there is a direct conflict between being
totally true to the SDL API, and allowing your average SDL
application to be fully accelerated. :-/

//David Olofson - Programmer, Composer, Open Source Advocate

ps. I’m the author of the original glSDL/wrapper, which has been
converted into an SDL backend by Stephane Marchesin, with some
occasional help/interference from me.


Well, thanks for the education. I’ve only been fiddling with SDL for a
day and have learnt quite a bit, but it looks like I have a lot more to
discover.
I’ve downloaded your wrapper and am giving it a bash. Cheers :o)

David Olofson wrote:>On Friday 13 August 2004 08.39, grembo wrote:

I gather the main improvement in this implementation would be
taking advantage of texture memory on the video card and

Yep - or actually, we have to keep surfaces in VRAM to have OpenGL
accelerate anything at all in the normal case. (Some drivers will
probably make use of DMA to accelerate glWritePixels() if you use a
h/w supported pixel format, but so far, I’ve never seen that, at
least not on a Linux driver.)

potentially adding rotated blits to SDL.

Maybe eventually, but that’ll be if/when rotated blits are a part of
the official SDL API, implemented (one way or another) on all
backends. The problem is that rotated blits take a lot more power
than plain blits when implemented in software, so using them
effectively makes your code depend on h/w acceleration for proper
operation (as in usable framerates) - and then you’re probably better
off using OpenGL directly.

You could of course think of glSDL as a handy OpenGL wrapper, but I
think it’s way too low level to make sense. Could happen that a pure
accelerated 2D lib is forked off of glSDL eventually, though…

This sounds very
promising. It would probably require some extensions to SDL to be
able to manipulate what goes in and what doesn’t go into texture
ram.

That “extension” is already in place; the SDL_HWSURFACE flag. Only
blits from hardware surfaces to the display surface are OpenGL
accelerated.

This is actually more accidental than intentional; we can’t reliably
accelerate blits from software surfaces, as there is no requirement
that they are locked when modified. More specifically, we can’t tell
when you modify a s/w surface, so we can’t cache it in VRAM.

Note that glSDL/wrapper (found here: http://olofson.net/mixed.html)
does accelerate blits from any type of surface - but indeed, it
breaks if you modify surfaces without locking/unlocking them, which
makes it slightly incompatible with the official SDL API.

Using texture ram on an often modified/read surface would
decrease performance

Depends… With good drivers, uploading a texture is at least as fast
as an optimal SDL s/w blit - and if the surface has an alpha channel,
the total performance will be very much higher than doing it all in
software.

so it sounds like they will have to extend SDL
a little further to discriminate between static and volatile
surfaces.

Actually, I suggested an extension SDL_InvalidateRect() that would
tell the backend that a specific part of a surface has been modified.
This would avoid the requirement to update the whole surface/texture
if only part of it was changed. This was actually in one SDL release,
but was backed out again. Probably just as well, as it needs explicit
application support, is irrelevant to most backends, and probably
isn’t all that useful. (How often do you actually do procedural fx in
a small area of a surface? I could use that for the new radar display
in Kobo Deluxe, but I think it’s a rather unusual case.)

[…]


SDL mailing list
SDL at libsdl.org
http://www.libsdl.org/mailman/listinfo/sdl

David Olofson wrote:

so it sounds like they will have to extend SDL
a little further to discriminate between static and volatile
surfaces.

Actually, I suggested an extension SDL_InvalidateRect() that would
tell the backend that a specific part of a surface has been modified.
This would avoid the requirement to update the whole surface/texture
if only part of it was changed. This was actually in one SDL release,
but was backed out again. Probably just as well, as it needs explicit
application support, is irrelevant to most backends, and probably
isn’t all that useful. (How often do you actually do procedural fx in
a small area of a surface? I could use that for the new radar display
in Kobo Deluxe, but I think it’s a rather unusual case.)

Well, IMHO all emulators effectively are procedural graphics generators
that mostly only update part of a surface at a time, and there are many
of them that use SDL as the backend (including the one I work on). I’d
like to have access to a GL backend that could take small updates from a
software surface for this purpose (especially on OS X, where I think
GL surfaces are much cheaper for compositing, and 2D is all implicitly
double buffered and then composited, making for very expensive drawing).

Isn’t SDL_UpdateRects already the equivalent of the SDL_InvalidateRect()
you mention above?

Fred

Just one question.

The code you wrote is Windows specific. Maybe on Linux it is turned off?

I ask because I have one animation that updates really fast, and it’s
broken; some steps of the animation show only half of the objects. It
seems to be a case of vsync being OFF.

Thanks

Piero

On Fri, 2004-08-13 at 09:16, Fred wrote:

[…]

[…]

Isn’t SDL_UpdateRects already the equivalent of the
SDL_InvalidateRect() you mention above?

Sort of, but AFAIK it works only for the display surface.

Though the first argument is a surface… It would probably require
internal changes, but I guess we could make SDL_UpdateRects() work
for any surface without breaking the API.

Now, you mentioned emulators. Those are a special case in that they’ll
generally want to render directly to the screen, and thus, they lock
it, mess with it and then unlock it and flip. Disaster! Directly
accessing the VRAM is not possible with OpenGL, so glSDL creates a
shadow surface that you get when you lock the screen. That’s the good
news; at least it works. The bad news is that in order for this to
work as expected, glSDL has to copy the whole frame buffer into the
shadow surface every time you lock the screen. This is an insanely
expensive operation on pretty much any video card. (*)

(*) Well, except for the latest 3DLabs Wildcat Realizm cards.
Those specifically accelerate that operation, which kinda’
makes sense considering that the cards are meant to be
usable for final rendering, where you’ll want to grab each
frame to store it on disk. Previous generations of video
cards are only sufficient for editing and preview rendering,
and are designed for real time rendering only. The occasional
screenshot and a few weird applications is not reason enough
for such optimizations on that class of hardware.

One way to get around this issue is to use a set of buffer surfaces of
various sizes. To update an area of the screen, pick a buffer surface
of a suitable size, lock it, render into it, unlock it, and then blit
it to the screen. The unlock operation will invalidate the buffer
surface, and the next blit will force the new graphics to be
uploaded.

//David Olofson - Programmer, Composer, Open Source Advocate

On Friday 13 August 2004 14.16, Fred wrote:

I hear SDL doesn’t support vsync because it isn’t possible to detect
the vertical retrace with some drivers.

Piero B. Contezini wrote:

Just one question.

The code you wrote is Windows specific. Maybe on Linux it is turned off?

I ask because I have one animation that updates really fast, and it’s
broken; some steps of the animation show only half of the objects. It
seems to be a case of vsync being OFF.

Thanks

Piero

[…]

It does support vsync where possible (use double buffered h/w
surfaces), though since there is no call to enable/disable it (where
supported by drivers and/or OS), it relies on the driver
configuration.

Further, there is no explicit vsync call (to sync the CPU directly
with the retrace), since it’s simply not possible to do on many
targets, even if they support retrace sync. Probably just as well,
since sync’ing the CPU is pretty pointless in a non-RT multitasking
environment, and it’s actually the page flip operation that needs
retrace sync anyway. Applications should just block waiting for the
old page to be released, so they can render the next frame into it.

//David Olofson - Programmer, Composer, Open Source Advocate

On Saturday 14 August 2004 00.58, grembo wrote:

I hear SDL doesn’t support vsync because it isn’t possible to detect
the vertical retrace with some drivers.

David Olofson wrote:

(*) Well, except for the latest 3DLabs Wildcat Realizm cards.
Those specifically accelerate that operation, which kinda’
makes sense considering that the cards are meant to be
usable for final rendering, where you’ll want to grab each
frame to store it on disk. Previous generations of video
cards are only sufficient for editing and preview rendering,
and are designed for real time rendering only. The occasional
screenshot and a few weird applications is not reason enough
for such optimizations on that class of hardware.

Well, as far as I know, most video drivers accelerate this operation
with a blit (both ati & nvidia drivers for Linux do, and probably their
Windows counterparts too). Since the data has to travel across the AGP
bus anyway, the speed won’t be as good as what you get from old high-end
SGI workstations. Or are these drivers doing something else? Without
additional hardware that doesn’t sound possible to me…

[Some SGI stations feature a physical memory design called UMA, where
system memory and video memory are the same, so there is no need to
transfer between system and video memory over a bus, just memory copies.]

Stephane

David Olofson wrote:

[…]

Well, as far as I know, most video drivers accelerate this
operation with a blit (both ati & nvidia drivers for Linux do,
and probably their Windows counterparts too). Since the data has to
travel across the AGP bus anyway, the speed won’t be as good as
what you get from old high-end SGI workstations.

I don’t know how they do it (or not), but the net result on the setups
I’ve seen so far is that transfers to and from VRAM have about the
same bandwidth as software blits - that is, just a tiny fraction of
the theoretical AGP (or even PCI) bandwidth. Dunno what I’m doing
wrong, but the only time I’ve actually seen DMA transfers in action
was with DirectDraw. (On a pretty early AGP card, and it wasn’t much
faster than CPU transfers, but at least it was asynchronous…)

Or are these
drivers doing something else? Without additional hardware that
doesn’t sound possible to me…

Well, the hottest model is a PCI Express card, but other than that, I
think it’s just a matter of making use of DMA when transferring both
to and from system RAM. One would think that’s the obvious way of
doing it, but it seems like it just won’t work in real life. :-/

//David Olofson - Programmer, Composer, Open Source Advocate

On Saturday 14 August 2004 14.44, Stephane Marchesin wrote:

David Olofson wrote:

Now, you mentioned emulators. Those are a special case in that they’ll
generally want to render directly to the screen, and thus, they lock
it, mess with it and then unlock it and flip. Disaster! Directly
accessing the VRAM is not possible with OpenGL, so glSDL creates a
shadow surface that you get when you lock the screen. That’s the good
news; at least it works. The bad news is that in order for this to
work as expected, glSDL has to copy the whole frame buffer into the
shadow surface every time you lock the screen. This is an insanely
expensive operation on pretty much any video card.

Well, in my case the emulator has an array that represents the emulated
machine’s display file, a software surface in the display’s format in a
1x ratio to the emulated display, and a single buffered software display
surface in the final ratio to be displayed after filtering (hq2x,
scale3x etc.). The display surface is indeed locked (if SDL_MUSTLOCK is
set) as it is being updated.

One way to get around this issue is to use a set of buffer surfaces of
various sizes. To update an area of the screen, pick a buffer surface
of a suitable size, lock it, render into it, unlock it, and then blit
it to the screen. The unlock operation will invalidate the buffer
surface, and the next blit will force the new graphics to be
uploaded.

Does this mean that you could support a GL accelerated display with a
software display surface being updated with SDL_UpdateRects with no
extra copying of textures out of video RAM? Or is it the case that only
complete textures can be updated at a time via OpenGL?

Fred