Sluggish SDL_Flip

This is my configuration:
Athlon XP 2000+
GeForce4 MX440 64MB DDR AGP4x
Asus A7S333 (SiS333) (BIOS v1004, the latest one)
Windows 98
DirectX 8.1
NVidia GeForce drivers 30.82 (the latest ones)

Dev-C++ 4.9.6.0
SDL 1.2.5a (I’m using libSDL.a and libSDL_main.a from the mingw32 pack. OT:
why don’t you provide SDL for dev-c++ on libsdl.org?)
glSDL 0.3
linker options: -lmingw32 -lSDLmain -lSDL -mwindows -lglSDL -lopengl32

In my test program, I obtain the following results:
1) in software mode, it's very fast
2) in both hardware and OpenGL mode, without double buffering, it runs at
warp speed - I can barely see it (however, it flickers)
3) in both hardware and OpenGL mode, with double buffering, it's incredibly
slow (about 10ms per frame! It makes no difference whether
SDL_Delay(10) is commented out or not!)
If I comment out SDL_Flip, I can't see anything: that means I'm using
a real double-buffered surface and SDL_Flip doesn't just do an SDL_UpdateRect
on the whole screen. Also, this way the program terminates really fast, which
means that SDL_Flip is the actual bottleneck in the whole thing.

CRV?ADER/KY

KnowledgE is PoweR
-------------- next part --------------
#include <stdio.h>
#include <stdlib.h>

#include "glSDL.h"

/* Options:
   USE_SW    DirectX, software surface
   USE_HW    DirectX, hardware surface
   USE_HWDB  DirectX, hardware surface, double buffered
   USE_GL    OpenGL, hardware surface
   USE_GLDB  OpenGL, hardware surface, double buffered
*/

#define USE_SW

SDL_Surface * screen;

int main(int argc, char * argv[])
{
    SDL_Rect flrect, clrect;

    if (SDL_Init(SDL_INIT_VIDEO) < 0) {
        fprintf(stderr, "Couldn't initialize SDL: %s\n", SDL_GetError());
        exit(1);
    }
    atexit(SDL_Quit);

#ifdef USE_SW
    screen = SDL_SetVideoMode(1024, 768, 32, SDL_SWSURFACE | SDL_FULLSCREEN);
#elif defined USE_HW
    screen = SDL_SetVideoMode(1024, 768, 32, SDL_HWSURFACE | SDL_FULLSCREEN);
#elif defined USE_HWDB
    screen = SDL_SetVideoMode(1024, 768, 32, SDL_HWSURFACE | SDL_FULLSCREEN | SDL_DOUBLEBUF);
#elif defined USE_GL
    screen = SDL_SetVideoMode(1024, 768, 32, SDL_GLSDL | SDL_HWSURFACE | SDL_FULLSCREEN);
#elif defined USE_GLDB
    screen = SDL_SetVideoMode(1024, 768, 32, SDL_GLSDL | SDL_HWSURFACE | SDL_FULLSCREEN | SDL_DOUBLEBUF);
#endif

    if (screen == NULL) {
        fprintf(stderr, "Can't set video mode: %s\n", SDL_GetError());
        exit(1);
    }

    flrect.w = flrect.h = 100;
    flrect.x = flrect.y = 0;
    clrect = flrect;    /* previous position, to be cleared each frame */

    /* Move a 100x100 square diagonally, one pixel per frame. */
    for ( ; flrect.y < 500; flrect.x++, flrect.y++) {

        SDL_FillRect(screen, &clrect, 0x0);     /* erase old position */
        SDL_FillRect(screen, &flrect, 0xFF);    /* draw new position */

#ifdef USE_SW
        /* A 101x101 update covers both the old and the new position. */
        SDL_UpdateRect(screen, clrect.x, clrect.y, 101, 101);
#elif defined(USE_HWDB) || defined(USE_GLDB)
        SDL_Flip(screen);
        /* After the flip we're drawing to the other buffer, so
           repeat the fills to keep both buffers consistent. */
        SDL_FillRect(screen, &clrect, 0x0);
        SDL_FillRect(screen, &flrect, 0xFF);
#endif

        clrect = flrect;
        //SDL_Delay(10);
    }

    return 0;
}

SDL 1.2.5a (I’m using libSDL.a and libSDL_main.a from the mingw32 pack.
OT: why don’t you provide SDL for dev-c++ on libsdl.org?)

Isn’t dev-c++ just an IDE for mingw? So what would you expect SDL to
provide for dev-c++? The mingw32 pack should be all you need.

In my test program, I obtain the following results:
1) in software mode, it's very fast
2) in both hardware and OpenGL mode, without double buffering, it runs at
warp speed - I can barely see it (however, it flickers)

     On these two tests your program is not waiting for the vertical
retrace, so that's expected. You have to use a delay there. But being
that fast is not good, because you're actually losing visual information.

3) in both hardware and OpenGL mode, with double buffering, it's incredibly
slow (about 10ms per frame! It makes no difference whether
SDL_Delay(10) is commented out or not!)

     Because you're waiting for the vertical retrace. Your program is
actually just as fast as in hardware with a single buffer, but as it has
to wait for the vertical retrace, it WILL spend some time waiting so it
won't have tearing. If your video card is set to, let's say, 75Hz, it
won't update the screen more than 75 times per second, no matter whether
it takes 1ms or less to draw the whole scene; there's no point in updating
it more often than that, as you can't show two or more different frames in
a single refresh.

If I comment out SDL_Flip, I can't see anything: that means I'm using
a real double-buffered surface and SDL_Flip doesn't just do an
SDL_UpdateRect on the whole screen. Also, this way the program terminates
really fast, which means that SDL_Flip is the actual bottleneck in the
whole thing.

     Not exactly a bottleneck, it's just doing it the way it should be
done. You can't sync your animation based on how many times you can update
the screen, as that depends on the vertical retrace of the display adapter
(which the user can change), but on time itself. Start experimenting with
SDL_GetTicks: determine how much time your animation should take and sync
it according to how much time has passed between the start of the animation
(the first tick count you got) and the current tick count - not the number
of frames you could draw, which is completely wrong.

     Just a small example: suppose you want to move an object 200
points on the x axis in 3 seconds, from 0 to 200. You'd have to do
something like this:

     Uint32 start, now;
     // We'll use double, but you can use int anyway
     double x;

     start = SDL_GetTicks();

     while( (now = SDL_GetTicks()) - start < 3000 )
     {
             x = 200.0 * (double)(now - start) / 3000.0;
             // Draw your object and update the display
             ...
     }

     Of course, this code can be improved, as there's no way to stop
the animation (as in a game, if the user wants to skip it, etc.), but it
can give you a rough idea of how things work. It also won't stop you from
wasting processor power (after all, it will still be drawing more frames
than it needs to), but use SDL_Delay for that (there's absolutely no
point in updating the screen more often than necessary).
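
     For instance, one way to combine the two ideas (a rough sketch - the
~10ms frame budget below is an arbitrary figure for illustration, not
anything SDL prescribes):

     Uint32 start = SDL_GetTicks();
     Uint32 now;

     while( (now = SDL_GetTicks()) - start < 3000 )
     {
             Uint32 frame_start = now;
             double x = 200.0 * (double)(now - start) / 3000.0;

             // Draw the object at position x and update the display
             // ...

             // Sleep off whatever is left of a ~10ms frame budget,
             // so we don't render frames nobody can see
             Uint32 frame_time = SDL_GetTicks() - frame_start;
             if( frame_time < 10 )
                     SDL_Delay(10 - frame_time);
     }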

     Paulo

Well, since it's an IDE you don't have a direct interface to mingw, and if
you don't know how to use mingw but do know how to use Dev-C++, then you'd
want something that was already set up for Dev-C++ so you could code and
compile like you normally would. I used SDL with Dev-C++ for a while but
had problems when I was trying to put in SDL_Net, which I had done before
with MSVC no problem. I now use mingw in Windows and g++ in BSD (:

BTW, I think sdl-config doesn't work in Dev-C++ :P

I tested it out, the fps is 85 (like my vertical refresh rate). Thanks!
However, how can I disable video sync - while keeping double buffering -
for benchmark purposes?

If you use the SDL_DOUBLEBUF flag, you have to use SDL_Flip() to
update the display, AFAIK. (The name of the flag is a little
misleading, SDL_PAGEFLIP would have been better, IMHO.)

If you just want to find out how much time each frame takes, do
something like this in your game loop:

while (!finished) // game loop
{
    Uint32 start = SDL_GetTicks();

    handle_events();
    draw_stuff();

    // this is how long the frame took, in ms
    Uint32 elapsed = SDL_GetTicks() - start;

    SDL_Flip(screen);
}

In other words, don't include SDL_Flip in your measurements. The
flip itself doesn't take much time, but waiting for the vertical
blank does.
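
If you also want an average frame rate, a minimal sketch (the one-second
reporting window below is just an arbitrary choice):

Uint32 t0 = SDL_GetTicks();
int frames = 0;

while (!finished) // game loop
{
    handle_events();
    draw_stuff();
    SDL_Flip(screen);
    frames++;

    // report the average fps once per second
    Uint32 t = SDL_GetTicks();
    if (t - t0 >= 1000)
    {
        printf("fps: %.1f\n", frames * 1000.0 / (t - t0));
        t0 = t;
        frames = 0;
    }
}

Since SDL_Flip is inside the counted loop here, the average will settle at
the refresh rate (the 85 fps you saw) as long as retrace sync is on.

--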
Matthijs Hollemans
All Your Software
www.allyoursoftware.com

I tested it out, the fps is 85 (like my vertical refresh rate). Thanks!
However, how can I disable video sync - while keeping double buffering -
for benchmark purposes?

     I don't think you can do it, and anyway, there's no point in
doing so: if you want to benchmark how much time it takes to render an
image on the hardware surface, you can assume the same values as for a
hardware surface with a single buffer. This is safe to assume because the
double buffering technique uses two buffers for the screen (duh! :P), but
shows only one and draws on the other. When there's a vertical retrace you
flip them (yeah, it just changes the "pointer" to where the image starts
in video memory). Actually, you tell SDL to do it with SDL_Flip, but this
is the way it has worked since the dinosaur age :).

     Actually, what I miss in SDL is the ability to wait for the
vertical retrace on a full screen hardware surface with a single buffer,
so I can draw to video memory while the image is not being sent to the
monitor, as I found some (onboard) video cards that can't deliver a full
screen hardware surface with a double buffer at 640x480x32bpp. I assume
here that it's actually IMPOSSIBLE to get a true software full screen
surface, i.e., SDL will convert it to a single buffer hardware surface
in video memory at some point. If anyone can tell me I'm wrong, I'd
appreciate it.

     Paulo

     I don't think you can do it, and anyway,

You can with many drivers, but this is entirely driver dependent. On
Win32, you'll usually find a checkbox on one of the pages in the advanced
video settings dialog. On XFree86 and other Un*x driver architectures, it
may be possible to control it through the config file and/or through an
environment variable.
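
As a concrete example of the environment variable route (a sketch, assuming
NVIDIA's Linux OpenGL drivers - __GL_SYNC_TO_VBLANK is specific to them,
and other drivers use different mechanisms or none at all):

#include <stdlib.h>
#include "SDL.h"

int main(int argc, char *argv[])
{
    /* Must be set before the GL context is created, i.e. before
       SDL_SetVideoMode(). "0" asks the driver not to wait for the
       retrace. NVIDIA-Linux-specific, not an SDL API. */
    setenv("__GL_SYNC_TO_VBLANK", "0", 1);

    SDL_Init(SDL_INIT_VIDEO);
    /* ... set an OpenGL video mode and run the benchmark loop ... */
    SDL_Quit();
    return 0;
}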

there's no point in doing so: if you want to benchmark how much time it
takes to render an image on the hardware surface, you can assume the same
values as for a hardware surface with a single buffer. This is safe to
assume because the double buffering technique uses two buffers for the
screen (duh! :P), but shows only one and draws on the other.

In fact, there's another reason it might be a bad idea to benchmark
without retrace sync, at least if you're benchmarking the application,
as opposed to the OpenGL subsystem: rendering at full speed continuously
drastically changes the way scheduling is done. Instead of blocking on
each retrace, the rendering thread will run furiously for several frames
at a time, and will then be preempted by some other thread. This might
change the total behavior of the system enough that cache optimizations
can't be benchmarked reliably.

That said, when dealing with video at <100 Hz, this is probably not a
major issue, unless you have some rather memory intensive thread
working in the background. (Procedural texture rendering or something.)

With low latency audio (often >1000 Hz) and the like, this is a real
issue to a much greater extent, even when the background system load is
"normal". In a game, there is a lot of background activity (a whole game,
more specifically :-), so there, this is always relevant.

When there's a vertical retrace you flip them (yeah, it just changes the
"pointer" to where the image starts in video memory). Actually, you tell
SDL to do it with SDL_Flip, but this is the way it has worked since the
dinosaur age :).

Actually, it seems that with most modern cards, it's rather "tell the
video card to change the pointer/offset during the next retrace". There
is an important difference: in this case, the application never
synchronizes directly with the retrace, but rather just blocks waiting
for a "new" rendering buffer from the driver. (Not much of a difference
with double buffering, really, but a big difference with triple buffering.)

     Actually, what I miss in SDL is the ability to wait for the
vertical retrace on a full screen hardware surface with a single
buffer, so I can draw to video memory while the image is not being
sent to the monitor, as I found some (onboard) video cards that can't
deliver a full screen hardware surface with a double buffer at
640x480x32bpp. I assume here that it's actually IMPOSSIBLE to get a
true software full screen surface, i.e., SDL will convert it to a
single buffer hardware surface in video memory at some point. If anyone
can tell me I'm wrong, I'd appreciate it.

Well, a software surface by definition resides in system RAM, and thus,
there is no way your average video card can display it.

The only exception would be the integrated chips that steal some system
RAM for VRAM - but that's hidden by the drivers anyway, so you still have
to ask for a h/w surface. A helluva lot faster for s/w rendering than
anything a PCI or AGP card can offer, though.

Why do you want a software surface for this anyway? If you want single
buffering and "direct access", you should simply use a single buffered
h/w surface.
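
That is, in SDL terms (the same mode the test program's USE_HW case sets,
with an illustrative resolution):

    screen = SDL_SetVideoMode(640, 480, 32, SDL_HWSURFACE | SDL_FULLSCREEN);
    /* Single buffered: fills/blits go straight to the visible frame
       buffer, so no SDL_Flip() is needed - at the price of tearing. */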

Explicit retrace sync is another issue, though. Given that it’s usually
the driver or the video card that does the actual sync + flip operation,
you can’t really sync without flipping. If the driver allows faking a
double buffered display, where both “pages” actually use the same buffer,
that might work (flipping becomes a NOP - except that it’ll still sync
with the retrace), but other than that, you’re out of luck on many
platforms…

//David Olofson - Programmer, Composer, Open Source Advocate

http://olofson.net | http://www.reologica.se

You can with many drivers, but this is entirely driver dependent. On
Win32, you'll usually find a checkbox on one of the pages in the advanced
video settings dialog. On XFree86 and other Un*x driver architectures, it
may be possible to control it through the config file and/or through an
environment variable.

Do you have any pointers on where exactly one should look on XFree86? I've
been able to include various driver settings in the "Device" section of
XF86Config in the past, but I can't find any docs on the options accepted
by the latest radeon driver in 4.2.0.

I'm asking because I'm having the opposite problem to what's being
discussed here - vsync is disabled (even though DGA claims it can wait
for retrace) and I would really like to enable it, since I'm not very
interested in the horrible flicker I'm getting now.

latimerius