Looking for ideas to speed things up a little

So, quietly ignoring my issues with 1.2 crashing and 1.3 having problems
with keyboard input…

I’m looking for hints and tips to try to speed things up a little…
However the issues I’m facing could quite possibly be due to my own
naivety… Long-time C programmer here, but very new to SDL…

My little application (runs under Linux) opens a window, plots points,
draws lines and prints text from a bitmap font to the screen. Works great,
but it seems I have a choice of “fast” mode whereby I draw to the surface
then update it to the screen in “batch” mode, or “slow” mode whereby I
draw to the surface and update to the screen after every point/line…

e.g. this trivial bit of code to draw moire patterns:

Fast:
for (x = 0 ; x < 640 ; x += 3)
{
  drawLine (320,240, x, 0) ;
  drawLine (320,240, x, 479) ;
}
SDL_UpdateRect (myScreen, 0,0,0,0) ;

Slow:
for (y = 0 ; y < 480 ; y += 3)
{
  drawLine (320,240, 639,y) ;
  drawLine (320,240, 0,y) ;
  SDL_UpdateRect (myScreen, 0,0,0,0) ;
}

Same for text and scrolling the screen - if I do an UpdateRect after every
character, it’s slow, but after every line it’s faster (although I
possibly could be much cleverer with characters and only update the box of
the actual character, however…)

Doing graphics like this on little embedded systems is something I’ve been
doing for a very long time - and in these systems, poking pixels directly
into the hardware is usually the “done thing”, but this is my first time
with SDL and I’m wondering if I’m missing a trick or 2… Or maybe I’ve
just not looked hard enough yet…

My initialisation is:

if ((myScreen = SDL_SetVideoMode (screenWidth, screenHeight, 32, SDL_HWSURFACE)) == NULL)
… etc.

it seems my hardware doesn’t support HWSURFACE, so it defaults to
SWSURFACE anyway.

I suspect my application is somewhat different from most applications:
I’m drawing/updating a tiny bit of the screen at a time, rather than
re-rendering large patches of it, and I want each change visible ASAP
without waiting for a vertical refresh/UpdateRect, which is slowing things
down somewhat. I appreciate there are issues with vertical refresh and
tearing, but for now I’m not letting them concern me…

So any hints or tips would be appreciated, even if it’s nothing more than
"that’s the way it is".

Thanks,

Gordon

[snip snip]
Same for text and scrolling the screen - if I do an UpdateRect
after every character, it’s slow, but after every line it’s faster
(although I possibly could be much cleverer with characters and
only update the box of the actual character, however…)

You can’t use it like a 6845 chip or VESA VGA memory… you should
switch to the mindset of a modern GPU for PCs.

[snip snip]
if ((myScreen = SDL_SetVideoMode (screenWidth, screenHeight, 32,
SDL_HWSURFACE)) == NULL)
… etc.

it seems my hardware doesn’t support HWSURFACE, so it defaults to
SWSURFACE anyway.

The SWSURFACE can be updated faster than the eye can see. By all
means write to it as fast as you can. But telling the graphics
driver to update to video (SDL_UpdateRect) is only useful up to
the LCD monitor refresh rate (sort of). So it’s best to drop the
VGA-bare-metal methods/thinking and put in an update loop that has
a reasonable frame rate…
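
A minimal sketch of the timing decision behind such an update loop, assuming a target of roughly 60 updates per second. The `FramePacer` type and both functions are hypothetical helpers, not part of SDL; in an SDL 1.2 program the timestamp would come from SDL_GetTicks().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of SDL: decides when the software
 * surface should next be pushed to the display. */
typedef struct {
    uint32_t interval_ms;  /* minimum time between updates, e.g. 16 for ~60 Hz */
    uint32_t last_ms;      /* timestamp of the last update */
    int      dirty;        /* non-zero if anything was drawn since then */
} FramePacer;

/* Record that something was drawn to the surface. */
static void pacer_mark_dirty(FramePacer *p)
{
    assert(p != NULL);
    p->dirty = 1;
}

/* Returns 1 if the caller should update the screen now, 0 otherwise.
 * Unsigned subtraction keeps the comparison valid if the millisecond
 * counter wraps around. */
static int pacer_should_present(FramePacer *p, uint32_t now_ms)
{
    assert(p != NULL);
    if (!p->dirty)
        return 0;
    if ((uint32_t)(now_ms - p->last_ms) < p->interval_ms)
        return 0;
    p->last_ms = now_ms;
    p->dirty = 0;
    return 1;
}

/* Intended use inside the drawing code (SDL calls shown as comments so
 * the sketch compiles without SDL headers):
 *
 *   drawLine (320,240, x, 0) ;
 *   pacer_mark_dirty (&pacer) ;
 *   if (pacer_should_present (&pacer, SDL_GetTicks ()))
 *     SDL_UpdateRect (myScreen, 0,0,0,0) ;
 */
```

Drawing stays as fast as the CPU allows, while the expensive update call happens at most once per interval and only when something has changed.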

I suspect my application is somewhat different from most
applications - I’m drawing/updating a tiny bit of the screen at a
time and want it visible ASAP without waiting for a vertical
refresh/UpdateRect which is slowing things down somewhat, rather
than re-rendering large patches of it and I appreciate issues with
vertical refresh and tearing, however for now I’m not letting them
concern me…

IMHO, it’s best to drop old-style thinking and use a
game-rendering-loop style. Anyhow, there is a complicated GPU,
layers and layers of GPU driver code, probably OS compositing
thrown in, an LCD monitor at 60Hz… your pixel is going through
so many layers – it is not a 1-to-1 match to graphics memory
anymore, unless you are using SDL on an embedded system.

Cheers,
Kein-Hong Man (esq.)
Kuala Lumpur, Malaysia

OK, so I think my answer is “that’s the way it is”… :-)

What I have for one application is lots of inputs needing to be graphed,
with digital needles flicking back and forth, gauges climbing and falling,
the odd bit of text coming in and so on. It’s all pretty much
asynchronous, with each process pretty much isolated from the others, so I
was looking for some way for one thread to not continually call UpdateRect,
which might bog down the rest of the threads.

I was just wondering if I was missing a trick or something.

Thanks,

Gordon

I agree that you should only update the whole screen once per frame. The
way you’re calling SDL_UpdateRect(), with all four values zero, updates the
entire screen, which is equivalent to using SDL_Flip().

Alternatively, you can still update after each line you draw, but you need
to specify only the regions that have actually changed or else you’re
making the CPU do too much work. You can figure out your own dirty rects
to update, or you could try Sprig, which has it built in. Each primitive
in Sprig returns its dirty rect, so you could update after each, and
there’s a system in the library that can collect them and batch update the
screen.
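
A rough sketch of the do-it-yourself dirty-rect approach, for illustration only. `Rect`, `line_bounds` and `rect_union` are hypothetical names (this is not Sprig's API); `Rect` is a stand-in with the same four fields as SDL 1.2's SDL_Rect so that the example compiles without SDL headers.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for SDL_Rect: x, y is the top-left corner, w, h the size. */
typedef struct { int x, y, w, h; } Rect;

/* Bounding box of a line from (x0,y0) to (x1,y1), both ends included. */
static Rect line_bounds(int x0, int y0, int x1, int y1)
{
    Rect r;
    r.x = x0 < x1 ? x0 : x1;
    r.y = y0 < y1 ? y0 : y1;
    r.w = (x0 < x1 ? x1 - x0 : x0 - x1) + 1;
    r.h = (y0 < y1 ? y1 - y0 : y0 - y1) + 1;
    return r;
}

/* Grow *acc so that it also covers r. An empty accumulator
 * (w == 0 or h == 0) simply becomes r. */
static void rect_union(Rect *acc, Rect r)
{
    int right, bottom;
    assert(acc != NULL);
    if (acc->w == 0 || acc->h == 0) {
        *acc = r;
        return;
    }
    /* Compute the far edges before moving the origin. */
    right  = (acc->x + acc->w > r.x + r.w) ? acc->x + acc->w : r.x + r.w;
    bottom = (acc->y + acc->h > r.y + r.h) ? acc->y + acc->h : r.y + r.h;
    if (r.x < acc->x) acc->x = r.x;
    if (r.y < acc->y) acc->y = r.y;
    acc->w = right - acc->x;
    acc->h = bottom - acc->y;
}
```

With SDL 1.2 the accumulated rectangle would be passed as SDL_UpdateRect (myScreen, d.x, d.y, d.w, d.h), or several SDL_Rects could be collected and handed to SDL_UpdateRects() in one call. Note that a line from the screen centre to a corner has a bounding box covering a quarter of the screen, so the saving is largest for small primitives such as single characters.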

By the way, if you do have any overlapping primitives, you may get
flickering if you update after each one.

Jonny D


What I have for one application is lots of inputs needing to be graphed,
with digital needles flicking back and forth, gauges climbing and falling,
the odd bit of text coming in and so on. It’s all pretty much
asynchronous, with each process pretty much isolated from the others, so I
was looking for some way for one thread to not continually call UpdateRect,
which might bog down the rest of the threads.

Are you doing GUI operations from multiple threads? This CAN work on
Macs, Windows & Linux, but in all cases it requires synchronization,
because behind the scenes there’s only ONE thread actually doing GUI
operations. So, it’s better for your application to also have only one
thread doing GUI operations, so there won’t be lock contention.

If you’re using the various threads to directly modify GUI elements,
then I recommend that you re-dedicate them to processing data, and
then have them communicate with another thread that handles the GUI.
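
One possible shape for that hand-off, sketched with POSIX threads: the data threads push readings into a small mutex-protected queue and only the GUI thread pops them and draws. The `Reading` and `ReadingQueue` types, the queue size and the function names are all assumptions made for this example; SDL's own mutex functions could be used in place of pthreads.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical message: which gauge changed and its new value. */
typedef struct { int gauge_id; double value; } Reading;

#define QUEUE_CAP 64

typedef struct {
    Reading         items[QUEUE_CAP];
    int             head, count;   /* ring buffer state */
    pthread_mutex_t lock;
} ReadingQueue;

static void queue_init(ReadingQueue *q)
{
    assert(q != NULL);
    q->head = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
}

/* Called by the data threads. Returns 0 if the queue is full, so a
 * slow display never blocks data collection. */
static int queue_push(ReadingQueue *q, Reading r)
{
    int ok = 0;
    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_CAP) {
        q->items[(q->head + q->count) % QUEUE_CAP] = r;
        q->count++;
        ok = 1;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Called only by the GUI thread. Returns 0 when nothing is waiting. */
static int queue_pop(ReadingQueue *q, Reading *out)
{
    int ok = 0;
    pthread_mutex_lock(&q->lock);
    if (q->count > 0) {
        *out = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        ok = 1;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}
```

The GUI thread would drain the queue once per pass, redraw the affected gauges, and then make a single update call, so the lock is held only for the brief copy in and out of the queue.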

I was just wondering if I was missing a trick or something.

If this was being done in OpenGL I’d say to have a static background,
and only a handful of moving indicators. You pre-load the images
(commonly called textures) to the GPU, then render them in the correct
positions independently. E.g., you could have a pressure-gauge: it’d
have a background, and it would have a rotating needle. You’d pre-load
the images for both, obtaining a texture id for each of them. When you
wanted to display that gauge you’d move to the correct position, then
rotate to the correct orientation (which may sound weird, but it’s
just the way OpenGL does that), then display the appropriate texture.
For the next component of that gauge you’d undo the rotation (assuming
that any rotation was involved), move a little bit more, then rotate
again, repeating this process for all of the gauge components. If you
use a particular texture in multiple locations then you’d keep a list
of positions and their corresponding rotations in a structure assigned
to that texture.

Why is this faster? Textures commonly involve a lot of data, so they
take a long time to move to the GPU. However, the GPU memory is itself
fast, so once they’re there they can be displayed very quickly. By
pre-loading the textures, and then specifying the location and angle
to display them at, you can greatly reduce the amount of data that you
send to the GPU. Imagine the difference between a 3-color, 8 bit per
color, 8 * 8 image, and 4 32-bit floats. By pre-loading the textures,
you can reduce the data sent to the GPU on each draw to 1/12 of what
it otherwise would be.
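
The 1/12 figure can be checked with a few lines of arithmetic. The helper names here are made up for the example, and the sizes are exactly the ones quoted above (an 8 * 8 image with 3 colors at 8 bits each, versus 4 floats of 32 bits each).

```c
#include <assert.h>
#include <stddef.h>

/* Bytes in a raw image: width * height * channels * bytes per channel. */
static size_t image_bytes(size_t w, size_t h, size_t channels,
                          size_t bytes_per_channel)
{
    assert(bytes_per_channel > 0);
    return w * h * channels * bytes_per_channel;
}

/* Bytes in n 32-bit floats. */
static size_t float_bytes(size_t n)
{
    return n * 4;
}

/* image_bytes (8, 8, 3, 1) is 192 bytes and float_bytes (4) is 16 bytes,
 * so the per-draw traffic drops to 16/192 = 1/12 of the image size. */
```

For larger textures the ratio is far more favourable, since the four floats stay the same size while the image grows.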


If you want a tip, I can recommend a specific hardware/software setup that should guarantee minimum latency:

  1. An analog CRT monitor (not LCD), to be certain the video card’s RAMDAC scanout is actually reaching pixels immediately.
  2. Direct software updates to a framebuffer provided by fbdev on Linux, with no X11 involved.

Poking pixels into a raw framebuffer is going to hit the CRT display immediately.

There’s a reason that CRTs are used in latency comparisons with LCD monitors, such as the final image on this page of this review:
http://www.behardware.com/articles/712-4/lcd-david-vs-goliath-iolair-vs-dell.html

Where SDL fits into this equation I’m not entirely sure, as I have no experience using SDL with fbdev and such, but I figured a direct answer is best.

LordHavoc
Author of DarkPlaces Quake1 engine - http://icculus.org/twilight/darkplaces
Co-designer of Nexuiz - http://alientrap.org/nexuiz
"War does not prove who is right, it proves who is left." - Unknown
"Any sufficiently advanced technology is indistinguishable from a rigged demo." - James Klass
"A game is a series of interesting choices." - Sid Meier

Just a quick note of thanks to all who replied. I’ve more or less gotten
the gist of it all now. Turns out I wasn’t doing anything too “wrong”, but
I had some calls to SDL_UpdateRect which were not needed, so once I
optimised them out and re-worked things I got a better effect.

It’s an interesting issue though - I’m not writing a game, so there is no
well defined loop, but rather processing lots of asynchronous events
(sensor reading, inputs from external devices) in a (pseudo) threaded
manner.

But that’s just one little test project. I have another project on the go
which I’ve just converted from curses to SDL. curses has a similar screen
refresh scenario, so it’s not been too hard to emulate, although getting
it right is tricky. E.g. my program is interpreting a little script and
the script prints data to the screen: do I do an update after every bit
of data (which could be a single character, and thus slow down the entire
system), or after every line (which could look really slow if it prints 2
numbers on the same line, then does calculations and prints a 3rd and a
newline 5 minutes later), and so on… Plenty of room to experiment and get
the right balance though.

Thanks,

Gordon

I’ve run stuff that spewed a tremendous amount of output to a shell
window… but then it ran much faster without doing any such
updates. So the output did indicate a lot of progress, but it cut
performance. Decoupling processing and visual updates is probably
a good thing if performance is desirable.

If you try 7-zip, say on a big folder, the progress dialog box
updates only several times a second (but long enough for filenames
to register in your brain), even if it is actually processing 100
files per second (even with a per-file method like zip/deflate.)
It is doubtful whether a faster update (e.g. 50 filenames per
second) would be useful to the user. Of course, a full log file
has its uses too, but it may not be needed for normal usage. So
the 7-zip progress dialog box is what I would consider a
well-designed UI.
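
For the script-output case, one way to strike that balance is to update on every newline and otherwise only once a partial line has been waiting longer than some limit. The sketch below is just one such policy, with a hypothetical `TextFlusher` type and an assumed limit chosen by the caller; timestamps are in milliseconds, as SDL_GetTicks() would supply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flush policy for text output, not an SDL API. */
typedef struct {
    uint32_t max_wait_ms;  /* longest a drawn character may stay unseen */
    uint32_t first_ms;     /* when the oldest pending character was drawn */
    int      pending;      /* characters drawn since the last update */
} TextFlusher;

/* Call after drawing character c at time now_ms.
 * Returns 1 if the screen should be updated right away. */
static int flusher_put(TextFlusher *f, char c, uint32_t now_ms)
{
    assert(f != NULL);
    if (f->pending == 0)
        f->first_ms = now_ms;
    f->pending++;
    if (c == '\n') {           /* a finished line is always shown */
        f->pending = 0;
        return 1;
    }
    return 0;
}

/* Call periodically so that a partial line still appears when the
 * script goes quiet. Returns 1 if the screen should be updated. */
static int flusher_poll(TextFlusher *f, uint32_t now_ms)
{
    assert(f != NULL);
    if (f->pending > 0 &&
        (uint32_t)(now_ms - f->first_ms) >= f->max_wait_ms) {
        f->pending = 0;
        return 1;
    }
    return 0;
}
```

With a limit of, say, 50 ms, two numbers printed without a newline show up almost at once, while a burst of thousands of characters still costs only a handful of screen updates.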

Cheers,
Kein-Hong Man (esq.)
Kuala Lumpur, Malaysia