Speeding up glClear?

I get the sense that this is kind of an eternal question, but it’s
been driving me a bit batty. I’ve been running some timing on a very
simple 3D maze program to try to figure out why my framerates are so
low on older machines - for instance, my P2 550 runs it at about 4
frames per second. The timing indicates that the call each frame of:

glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT)

takes over 250 ms to complete at 800x600 fullscreen resolution.

I’ve come up with a few hypotheses for how to speed this up, but I’m
hoping the expertise out here will be able to give me ideas I haven’t
thought of or information I don’t know.

My first thought is to lower the resolution to 640x480 for instance,
but 640x480 still has 64% of the pixels of 800x600, so that should
conceptually be only about a 50% speed increase at best - 6 fps
instead of 4 fps isn’t quite the tradeoff I’m looking for.

My second thought is that in my projection matrix, the near z plane is
too close to the camera - it’s at a distance of 0.2. I’ve encountered
other problems with the near plane being too close to the camera, but
those were all rendering errors, not slow performance. Still, I’m
wondering if perhaps the depth clearing will be faster if the near
z plane is at least a distance of 1.0 from the camera. Call it
intuition.

Third thought is to throw in a skybox of some kind, which will in
theory remove the need to clear the color buffer with each frame.
Currently, a lot of the cleared background is visible over the tops of
the walls of the maze, and there’s no floor to the maze, so that area
has to be cleared as well.

Can anyone tell me if I’m on the right track with these ideas, or give
guidance if there’s something I’ve missed?

Thanks,
Steve

My second thought is that in my projection matrix, the near z plane is
too close to the camera - it’s at a distance of 0.2. I’ve encountered
other problems with the near plane being too close to the camera, but
those were all rendering errors, not slow performance. Still, I’m
wondering if perhaps the depth clearing will be faster if the near
z plane is at least a distance of 1.0 from the camera. Call it
intuition.

Probably not. Clearing the depth buffer just writes the same clear
value to every pixel, so what’s on the pixel before doesn’t matter.

What you can do though, if you don’t need a lot of depth buffer
precision, is to use the range 0 - 0.5 in one frame, and then 0.5 - 1
in the next, then 0 - 0.5 and so on, thus removing the necessity to
clear. I don’t remember the specifics but there was a way to do that.
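For what it’s worth, the specifics Gabriel can’t remember are usually done with glDepthRange plus a flipped depth test. Below is a minimal, hypothetical sketch - the struct and function names are my own invention, and it assumes a GL context already exists; the helper only computes the per-frame state:

```cpp
#include <cassert>

// Hypothetical sketch of the alternating-depth-range trick (names are
// invented). Even frames draw into the near half of the depth range
// with GL_LESS; odd frames draw into the far half, reversed, with
// GL_GREATER. Stale depth values from the previous frame then always
// fail the new frame's test, so no explicit depth clear is needed.
struct DepthTrick {
    double range_near;  // first argument to glDepthRange()
    double range_far;   // second argument to glDepthRange()
    bool   use_greater; // true -> glDepthFunc(GL_GREATER), else GL_LESS
};

DepthTrick depth_trick_for_frame(unsigned long frame)
{
    if (frame % 2 == 0)
        return DepthTrick{0.0, 0.5, false}; // depths land in [0, 0.5]
    return DepthTrick{1.0, 0.5, true};      // depths land in [0.5, 1]
}
```

Each frame you would then call glDepthRange(t.range_near, t.range_far) and glDepthFunc(t.use_greater ? GL_GREATER : GL_LESS) before drawing, and drop GL_DEPTH_BUFFER_BIT from the clear - at the cost of half your depth buffer precision, which is why it only makes sense when you have precision to spare.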

Third thought is to throw in a skybox of some kind, which will in
theory remove the need to clear the color buffer with each frame.

This one is correct, and in fact, widely used. If you played Doom or
Duke Nukem 3D and used one of the clip cheats, stepping out of the map
caused the background to smear (the classic “hall of mirrors” effect).
That’s because those engines assumed the map was closed, and therefore
never cleared the screen at all.

    --Gabriel

What you can do though, if you don’t need a lot of depth buffer
precision, is to use the range 0 - 0.5 in one frame, and then 0.5 - 1
in the next, then 0 - 0.5 and so on, thus removing the necessity to
clear. I don’t remember the specifics but there was a way to do that.

I saw something like this just today, which involved switching the
depth test itself from GL_LESS to GL_GREATER with each frame as
well. As I fail to grok why this would work for the moment, I’m going
to try option 3 first.

Third thought is to throw in a skybox of some kind, which will in
theory remove the need to clear the color buffer with each frame.

This one is correct, and in fact, widely used. If you played Doom or
Duke Nukem 3D and used one of the clip cheats, stepping out of the map
caused the background to smear (the classic “hall of mirrors” effect).
That’s because those engines assumed the map was closed, and therefore
never cleared the screen at all.

I distinctly recall liberally using IDCLIP, heh. Excellent, this will
be where I start then - thanks much! I’m very curious to know,
though, why mapping a texture across the entire screen would be so
much faster than setting a single color value across it all. Any
insights to offer?

Thanks,
Steve

250ms for a glClear is really slow. It’s even slower than software
rendering. What’s your video hardware/OpenGL drivers?

Stephane

250ms for a glClear is really slow. It’s even slower than software
rendering. What’s your video hardware/OpenGL drivers?

On the slow system, it’s an old 16MB card, I don’t even remember the
manufacturer; the OpenGL drivers are the default ones that come with
Windows 98. I’ve noticed slow performance on most systems though,
honestly - it runs just fine on mine, a 2.8 GHz Pentium with a gig of
memory and a 256MB GeForce FX 5200. I haven’t run any timing on the
slow systems, though.

Steve

I distinctly recall liberally using IDCLIP, heh. Excellent, this will
be where I start then - thanks much! I’m very curious to know,
though, why mapping a texture across the entire screen would be so
much faster than setting a single color value across it all. Any
insights to offer?

Not sure, but…

  1. When you call glClear and then render everything else, you end up touching every pixel multiple times, because every frame you first fill the whole screen with the clear value and then draw over much of it. With the Z buffer you only render what is actually visible (not occluded by other objects’ pixels) - so you save a whole lot of fillrate.

  2. Odds are that the texture you map on the screen will end up touching far fewer pixels than screenWidth*screenHeight, and that can change per frame - e.g. 30% of the texture surface can be clipped by the screen and another 20% hidden by polygons already in the Z buffer, so GL only needs to render 50% of the original texture - another big gain. glClear touches all pixels no matter what is happening on the screen.

  3. Other reasons that I forgot or don’t have time to write :wink:

Koshmaar

P.S. Sorry for clumsy English…
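To put rough numbers on the fillrate argument above - assuming a 16-bit color buffer and a 16-bit depth buffer, which would be typical for a 16MB card of that era:

```cpp
#include <cassert>

// One full-screen clear at 800x600 touches every pixel once in the
// color buffer and once in the depth buffer. With 16-bit color and
// 16-bit depth that is 4 bytes per pixel in total.
const long kPixels        = 800L * 600L;   // 480,000 pixels
const long kBytesPerClear = kPixels * 4L;  // 1,920,000 bytes (~1.8 MiB)

// At the reported 278 ms per clear, the implied write bandwidth is
// bytes / seconds -- under 7 MB/s, far below what even a P2-era CPU
// can push over a slow bus, which is why the numbers point at an
// unaccelerated (or badly mismatched) rendering path.
const double kImpliedMBps = (kBytesPerClear / 1e6) / 0.278;
```

In other words, the 250+ ms clear isn’t plausible as a raw memory-fill cost; something in the driver path is pathological, which fits the software-rendering diagnosis elsewhere in the thread.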

250ms for a glClear is really slow. It’s even slower than software
rendering. What’s your video hardware/opengl drivers ?

On the slow system, it’s an old 16MB card, I don’t even remember the
manufacturer; the OpenGL drivers are the default ones that come with
Windows 98.

IIRC, the default Windows OpenGL drivers are software-only drivers.
Assuming they are pre-MMX software and you have a slow video bus, then
you could see this kind of performance. I have an old P2 laptop that
gets about 4 seconds per frame (not a typo) on a lot of OpenGL apps.

	Bob Pendleton
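One way to confirm this diagnosis at runtime: query glGetString(GL_RENDERER) after the context is up - Microsoft’s default software implementation identifies itself as “GDI Generic”. A small sketch (the helper name is made up; only the string check is shown so it stays self-contained):

```cpp
#include <cstring>

// Returns true if the GL_RENDERER string looks like Microsoft's
// unaccelerated default OpenGL implementation on Windows.
bool looks_like_software_gl(const char* renderer)
{
    return renderer != 0 && std::strstr(renderer, "GDI Generic") != 0;
}

// In the app, after SDL_SetVideoMode(..., SDL_OPENGL):
//   const char* r = (const char*)glGetString(GL_RENDERER);
//   if (looks_like_software_gl(r))
//       fprintf(stderr, "no hardware acceleration: %s\n", r);
```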

SDL mailing list
SDL at libsdl.org
http://www.libsdl.org/mailman/listinfo/sdl

Huh?
Sorry, I have to ask:
Are you really sure that it is glClear that takes this long?
glClear just fills the buffers with the given clear values.

Even a software renderer does a better job than this.

I am running my apps using Mesa software rendering on a 700 MHz Athlon
with quite a decent frame rate.

Regards,
Johannes

< http://libufo.sourceforge.net > The OpenGL GUI Toolkit


I suspected it shouldn’t take that long either, but I don’t know how
else to explain it. I don’t have the code right in front of me, but it
looks something like this:

CStopWatch clearWatch;
clearWatch.reset();
clearWatch.start();

COutput::Clear(); // this function just calls glClear() with color
                  // and depth bits

clearWatch.stop();
printf("display cleared in %d ms\n", clearWatch.getTime());

And when I look in stdout.txt after running this from my P2 550
running Win98, I see a bunch of lines that look like this:

display cleared in 278 ms
display cleared in 266 ms

etc. etc.

It might be completely far-fetched, but is it possible that because
I’m double buffering, all of my rendering calls are just being cached
instead of actually executed, and the rendering itself isn’t happening
until the next frame is being cleared?

I hope that makes sense to somebody other than me, and I hope I’m way
wrong, heh.

Thanks all for your help,
Steve


It might be completely far-fetched, but is it possible that because
I’m double buffering, all of my rendering calls are just being cached
instead of actually executed, and the rendering itself isn’t happening
until the next frame is being cleared?

That could happen. Throw in a call to glFinish before, and after, the
call to glClear. That should isolate that one call for timing. Also, it
is much more accurate to get the cumulative time over several hundred
calls and then divide to get the average than to try to measure each
individual action. I don’t know the precision of the timer you are
using; I doubt it is the source of the problem right now, but it may be
a problem after you put in glFinish.

	Bob Pendleton
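Bob’s two suggestions might look like this in practice - a glFinish on each side of the clear so the measurement isolates that one call, plus an average over many frames so a coarse timer’s jitter washes out. The accumulator below is plain C++ with invented names; the GL calls are shown as comments so the sketch stays self-contained:

```cpp
#include <cassert>

// Accumulates many per-frame samples and reports the mean, which is
// far more trustworthy than one reading from a low-resolution timer.
struct ClearTimer {
    double total_ms;
    long   frames;

    ClearTimer() : total_ms(0.0), frames(0) {}

    // Per frame, time this bracket and pass the elapsed milliseconds:
    //   glFinish();            // drain previously queued commands
    //   watch.start();
    //   glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    //   glFinish();            // force the clear to actually complete
    //   watch.stop();
    void add_sample(double ms) { total_ms += ms; ++frames; }

    double average_ms() const { return frames ? total_ms / frames : 0.0; }
};
```

Printing average_ms() once every few hundred frames, instead of one printf per frame, also keeps the measurement itself from perturbing the frame time.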


Something must be wrong here… glClear just can’t take that long. I
personally used to develop stuff for glQuake in Win98 on an AMD 450 with
a Voodoo2 and later a 16MB TNT card. And it never ran under 30-40 fps.
And that’s with a glClear every frame.

// Tomaz



It has been said more than once by developers from both ATI and NVidia that
this alternating depth range trick is a bad idea. It might have been useful in older generation graphics
cards, but modern graphics cards are highly optimized for this kind of
thing (fast clear etc.), so you’re wasting a lot of depth buffer precision
for a laughable gain.

Besides, as others have pointed out, most real games won’t need color clear
at all because they always render a background, and the depth clear is well
covered by advanced Z buffering tricks done by the hardware. Don’t try to
be too clever, you’ll actually make the driver’s life harder.

cu,
Nicolai
