Help to increase performance in SDL_ttf marquee program

Friends,
My program that displays text messages is very slow and inefficient.
Please, help me with tips and tricks.

Thanks.
-------------- next part --------------
A non-text attachment was scrubbed…
Name: marqueeslow.c
Type: application/octet-stream
Size: 3108 bytes
Desc: not available
URL: http://lists.libsdl.org/pipermail/sdl-libsdl.org/attachments/20100331/91aa9689/attachment.obj

Buy better hardware.

On 31.03.2010 19:37, Ricardo Leite wrote:

My program that displays text messages is very slow and inefficient.
Please, help me with tips and tricks.


Christoph Nelles

E-Mail : @Christoph_Nelles
Jabber : eazrael at evilazrael.net ICQ : 78819723

PGP-Key : ID 0x424FB55B on subkeys.pgp.net
or http://evilazrael.net/pgp.txt

oO

OK,
now help me with programming tips and tricks (for the C language and the SDL library,
please).

Thanks for the fast answer.

2010/3/31 Christoph Nelles wrote:

On 31.03.2010 19:37, Ricardo Leite wrote:

My program that displays text messages is very slow and inefficient.
Please, help me with tips and tricks.

Buy better hardware.




SDL mailing list
SDL at lists.libsdl.org
http://lists.libsdl.org/listinfo.cgi/sdl-libsdl.org

My program that displays text messages is very slow and inefficient.
Please, help me with tips and tricks.

Buy better hardware.

DO NOT buy better hardware. That’s just contributing to the problem and making it worse instead of fixing it.

First thing to do is run it through a profiler to see what’s taking so much time. What language are you programming in, and what OS are you on? That will help in finding a good profiler.

----- Original Message ----

From: Christoph Nelles
Subject: Re: [SDL] Help to increase performance in SDL_ttf marquee program.

On 31.03.2010 19:37, Ricardo Leite wrote:

Hmm, great idea :)

Do you know of a link or an example for me?

2010/3/31 Jesse Palser wrote:

Use OpenGL to draw text…


Linux Slackware 13 - kernel 2.6.29.6
Language: ANSI C
Compiler: gcc v 4.3.3
Lib: SDL version 1.2.13

CPU: Intel Atom 1.6 GHz
1 GB RAM

Thanks a lot.

2010/3/31 Mason Wheeler wrote:

[…]

Then give us more information :P Do you cache the output, or do you
render the text every frame?

Hm, Intel Atom… Buy better hardware :P ;) And an irony detector G

On 31.03.2010 19:53, Ricardo Leite wrote:

oO

OK,
now help me with programming tips and tricks (for the C language and the SDL
library, please).



First of all, what hardware, operating system and video subsystem are you
using? The “correct” setup and methods for maximum performance depend a lot
on that.

For example, certain X servers (like XFree86 and X.org) have a fixed pixel
format for the desktop, and that applies to all windows - often even in
fullscreen mode. (XFree86 and X.org don’t really have fullscreen modes, but
rather zoom in on a “locked” window on the standard desktop.) So, if you have
a 16 or 32 bpp desktop, asking SDL for an 8 bpp display will result in you
getting a software surface that is converted from 8 bpp to the desktop format
on the fly. Not very efficient…

What you should probably do here is just ask for “0 bpp”, that is, whatever
the default is. Then you should use SDL_DisplayFormat() and
SDL_DisplayFormatAlpha() to convert your source surfaces (‘text’ in your case)
to formats suitable for fast blitting to whatever display surface you have at
hand. That is, you don’t really have to care about the pixel format as long as
you’re not doing pixel level software rendering.
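
To get a feel for why a mismatched depth hurts, here is a rough sketch of the kind of per-pixel work an on-the-fly 8 bpp to 32 bpp conversion implies. This is not SDL's actual code: `expand_8_to_32` and `conversion_bytes_per_frame` are made-up names for illustration, and a 256-entry palette of packed 32-bit colours is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not SDL code: expand 8 bpp palettized pixels to
 * 32 bpp by looking each index up in a 256-entry palette. */
void expand_8_to_32(const uint8_t *src, const uint32_t *palette,
                    uint32_t *dst, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++)
        dst[i] = palette[src[i]];
}

/* Bytes written per frame when a whole w x h surface is converted to a
 * display format with the given number of bytes per pixel. */
size_t conversion_bytes_per_frame(size_t w, size_t h, size_t bytes_per_pixel)
{
    return w * h * bytes_per_pixel;
}
```

For the 1280x300 window used in this thread, `conversion_bytes_per_frame(1280, 300, 4)` gives 1536000 bytes written on every frame before anything useful is drawn. Asking for 0 bpp and converting the text surface once with SDL_DisplayFormat() should avoid repeating that work per frame.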

Also, very few video subsystems support windows with hardware surfaces (that
requires specific hardware support to be practically useful in a desktop
environment), and some don’t even support it in fullscreen mode. Nothing you
have to worry about most of the time, but keep in mind that SDL_Flip() may
actually be a full-screen shadow-to-display copy, rather than the classic
(very low cost) flip of DMA pointers. This may be of great importance if your
application usually only changes a small area of the screen at a time.

One thing that may be an absolute performance killer is alpha blending. It’s
really rather expensive in software - and even more so if your surface is
"straight" as opposed to RLE accelerated! (See SDL_SetAlpha() and
SDL_SetColorKey().)

Try to use opaque surfaces first, then colorkeyed RLE accelerated, and if you
really need to, RLE accelerated RGBA surfaces. (You may increase the
"contrast" on the alpha channel to eliminate pixels that are nearly opaque or
nearly transparent. It’s the pixels that are actually blended that are
expensive when you’re using RLE! The opaque ones are just copied whereas the
transparent ones are skipped entirely.)
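
A minimal sketch of the "increase the contrast on the alpha channel" idea, assuming 32-bit pixels with alpha in the top 8 bits (0xAARRGGBB). A real SDL surface should be read through the masks in its SDL_PixelFormat instead; `sharpen_alpha`, `sharpen_alpha_row` and the thresholds are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: alpha values at or below `lo` become fully
 * transparent, values at or above `hi` become fully opaque, and the
 * range in between is stretched linearly. Requires lo < hi. */
uint8_t sharpen_alpha(uint8_t a, uint8_t lo, uint8_t hi)
{
    if (a <= lo)
        return 0;
    if (a >= hi)
        return 255;
    return (uint8_t)(((unsigned)(a - lo) * 255u) / (unsigned)(hi - lo));
}

/* Apply sharpen_alpha() to a run of pixels.
 * Assumption: alpha lives in bits 24..31 of each pixel. */
void sharpen_alpha_row(uint32_t *pixels, size_t count, uint8_t lo, uint8_t hi)
{
    size_t i;
    for (i = 0; i < count; i++) {
        uint8_t a = (uint8_t)(pixels[i] >> 24);
        uint32_t rgb = pixels[i] & 0x00FFFFFFu;
        pixels[i] = ((uint32_t)sharpen_alpha(a, lo, hi) << 24) | rgb;
    }
}
```

With thresholds of, say, 32 and 224, nearly transparent pixels are skipped and nearly opaque ones are plain copies, so only the pixels in between still need real blending. This would be run once on the rendered text surface, before RLE acceleration is enabled.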

Finally, SDL_FillRect() may not be all that much faster than
SDL_BlitSurface(), so don’t rely on it being fast just because it seems
trivial… Unless you actually have a background image or something, it may be
better to make the ‘text’ surface opaque, so you don’t have to clear the
screen before blitting.

On Wednesday 31 March 2010, at 19.37.06, Ricardo Leite wrote:

Friends,
My program that displays text messages is very slow and inefficient.
Please, help me with tips and tricks.


//David Olofson - Developer, Artist, Open Source Advocate

.— Games, examples, libraries, scripting, sound, music, graphics —.
| http://olofson.net http://kobodeluxe.com http://audiality.org |
| http://eel.olofson.net http://zeespace.net http://reologica.se |
’---------------------------------------------------------------------’

…and a nuclear power plant. ;)

/me is running a quad core dual video card monster that burns around 1 kW at
maximum load. That’s almost a criminal offense these days…! :-D

On Wednesday 31 March 2010, at 20.10.20, Christoph Nelles wrote:

On 31.03.2010 19:53, Ricardo Leite wrote:

oO

OK,
now help me with programming tips and tricks (for the C language and the SDL
library, please).

Then give us more information :P Do you cache the output, or do you
render the text every frame?

Hm, Intel Atom… Buy better hardware :P ;) And an irony detector G



The hardware has to be this one. $ystem Requirement$
This routine is part of a low-cost news display device for a public
school in Brazil.
The better hardware is reserved for the classrooms.

2010/3/31 Christoph Nelles wrote:

[…]

That’d be my question. Depending on the size of the text, you may
wish to blit the entire string first, and then scroll that bitmap.
Or break it into words, and then blit them each individually.
e.g.:

No good deed goes unpunished.

You could break that up into:
No
good
deed
goes
unpunished.

And then, at various points during the animation, you’d be
blitting:

| N|o

| No g|ood

| No good|

| good dee|d

goo|d deed go|es

de|ed goes u|npunished

| unpunish|ed

unp|unished |
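
The per-word approach boils down to a visibility test for each word on every frame. A minimal sketch, assuming each word has been rendered to its own surface once and its position within the full message is known; the `Word` struct and `word_on_screen` are hypothetical names, not part of SDL or SDL_ttf.

```c
/* Hypothetical description of one pre-rendered word. */
typedef struct {
    int x;  /* left edge of the word within the full message, in pixels */
    int w;  /* width of the word's surface, in pixels */
} Word;

/* Returns 1 if any part of the word falls inside the visible window
 * [scroll, scroll + view_w) and stores its on-screen x in *screen_x.
 * Returns 0 (leaving *screen_x untouched) if the word is fully outside. */
int word_on_screen(const Word *word, int scroll, int view_w, int *screen_x)
{
    int left = word->x - scroll;   /* word's left edge in screen space */
    int right = left + word->w;    /* one past its right edge */

    if (right <= 0 || left >= view_w)
        return 0;
    *screen_x = left;
    return 1;
}
```

Only the words for which this returns 1 need to be blitted. A negative `screen_x` means the word is partly cut off at the left edge; as far as I know SDL_BlitSurface() clips that case itself.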

-bill!

On Wed, Mar 31, 2010 at 08:10:20PM +0200, Christoph Nelles wrote:

Then give us more information :P Do you cache the output or do you
render the text every frame?

Thanks,
I will test the idea.

2010/3/31 Bill Kendrick wrote:

[…]

Ok, thanks.
I will test this.

2010/3/31 David Olofson wrote:

[…]

ANSI C language, Linux Slackware 13.

My program runs at 28 frames per second on:

Linux Slackware 13 - kernel 2.6.29.6
Language: ANSI C
Compiler: gcc v 4.3.3
Lib: SDL version 1.2.13
CPU: Intel Atom 1.6 GHz
1 GB RAM

CPU use: 25%
Memory: total=102496k used=117976k

2010/3/31 Mason Wheeler wrote:

[…]

This is my program in C:

/*

My marquee example: slow performance :(

compile:

gcc -c -MMD -o marqueeslow.o marqueeslow.c
gcc -o marqueeslow marqueeslow.o -lSDL -lSDL_ttf

*/

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <SDL/SDL.h>
#include <SDL/SDL_ttf.h>

int main(int argc, char *argv[]){
SDL_Surface *screen;
SDL_Surface *text;
TTF_Font *font;
SDL_Color white = { 0xFF, 0xFF, 0xFF, 0 };
SDL_Color black = { 0x00, 0x00, 0x00, 0 };
SDL_Color *forecol;
SDL_Color *backcol;
SDL_Rect dstrect;

char *mytext=
{"You are never too old to set another goal "
    "or to dream a new dream. "
        "C. S. Lewis"};

int step;
int delay=15; // Big values=slow marquee. Small values=unstable marquee cadence

int w_text, nada;
Uint32 Start=SDL_GetTicks();
Uint32 Elapsed;

Uint32 fps_ini, fps_now;
int fps;

int w=1280; // Width: of window
int h=300;  // Height of window and font

SDL_Rect sourceArea = {0,0,w,h};

//--------------------------------------------
forecol=&white;
backcol=&black;

//------------------------------------
if ( SDL_Init(SDL_INIT_VIDEO|SDL_INIT_TIMER)<0 ){
    printf("Error in SDL_init(): %s\n", SDL_GetError());
    exit(1);
}
//-----------------------------------
screen=SDL_SetVideoMode(w, h, 8,
    SDL_HWACCEL|SDL_RESIZABLE|SDL_DOUBLEBUF
);

if (screen==NULL) {
    printf("Error in SDL_SetVideoMode(): %s\n", SDL_GetError());
    exit(2);
}
//-----------------------------------
if (TTF_Init()<0){
    printf("Error in TTF_Init(): %s\n", SDL_GetError());
    exit(3);
}
//-----------------------------------

font=TTF_OpenFont("Arial.ttf", h ); // Put the path to your TrueType font file here

if (font==NULL) {
    printf("Error in TTF_OpenFont(): %s\n", SDL_GetError());
    exit(4);
}


text=TTF_RenderText_Blended( font, mytext, *forecol );

TTF_SizeText( font, mytext, &w_text, &nada ); // Get the pixel width of the rendered text

step=-w; // Start with the text just off the right edge of the window

fps=0;
    fps_ini=SDL_GetTicks();
    fps_now=fps_ini;
while( 1 ){
    Start = SDL_GetTicks();

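    if( step<0 ){
        // Text is still entering from the right: draw it at x=-step
        dstrect.x=-step;
        sourceArea.x=0;
    }else{
        // Text has reached the left edge: scroll by moving the source window
        dstrect.x=0;
        sourceArea.x=step;
    }
    // dstrect.y was never initialized in the original, and SDL_BlitSurface()
    // may write the clipped rectangle back into dstrect, so set these every frame.
    dstrect.y=0;
    sourceArea.y=0;
    sourceArea.w=w;
    sourceArea.h=h;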

    SDL_FillRect( screen, NULL ,
        SDL_MapRGB( screen->format, backcol->r, backcol->g, backcol->b )
        );

    SDL_BlitSurface( text, &sourceArea, screen, &dstrect );
    SDL_Flip( screen );
    fps++;

    step++;

    if( step>w_text ){
        step=-w;
        SDL_FreeSurface(text);

        text = TTF_RenderText_Blended( font, mytext, *forecol );
        TTF_SizeText( font, mytext, &w_text, &nada );
    }

    Elapsed = SDL_GetTicks() - Start;

    fps_now=SDL_GetTicks();

    /* FPS calc */
    if((fps_now-fps_ini)>=1000){
        printf("Frames:%d / %3.3f Sec\n", fps,
            (float)(fps_now-fps_ini)/1000 );
        fps_ini=fps_now;
        fps=0;
    }

    // Cadence adjust
    if( Elapsed<delay ){
        SDL_Delay( delay - Elapsed );
    }

}

// Never ending

    printf("Never print this...\n");
SDL_FreeSurface(text);
TTF_CloseFont(font);

return 0;

}
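
One note on the "Cadence adjust" step in the listing above: sleeping for `delay - Elapsed` on every frame can let timer rounding and oversleeping accumulate, which may show up as uneven scrolling. A possible alternative, sketched below under the assumption of a millisecond tick counter like SDL_GetTicks(), schedules every frame against a fixed start time. `Pacer`, `pacer_init` and `pacer_wait_ms` are hypothetical names, and tick counter wrap-around is ignored.

```c
#include <stdint.h>

/* Hypothetical pacing state; all times are in milliseconds. */
typedef struct {
    uint32_t base_ticks;   /* time from which the schedule is counted */
    uint32_t frame_count;  /* frames scheduled since base_ticks */
    uint32_t frame_ms;     /* target frame period */
} Pacer;

void pacer_init(Pacer *p, uint32_t now, uint32_t frame_ms)
{
    p->base_ticks = now;
    p->frame_count = 0;
    p->frame_ms = frame_ms;
}

/* Call once per frame after drawing. Returns how many milliseconds to
 * sleep so that the next frame starts on schedule. If the frame is
 * already late, returns 0 and restarts the schedule from `now`. */
uint32_t pacer_wait_ms(Pacer *p, uint32_t now)
{
    uint32_t target;

    p->frame_count++;
    target = p->base_ticks + p->frame_count * p->frame_ms;
    if (now <= target)
        return target - now;

    /* Fell behind: do not try to catch up, just start over. */
    p->base_ticks = now;
    p->frame_count = 0;
    return 0;
}
```

In the main loop, the return value would be passed to SDL_Delay() in place of `delay - Elapsed`, so an oversleep in one frame shortens the wait in the next instead of being lost.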

2010/3/31 Ricardo Leite wrote:

[…]

Well, one might think that this shouldn’t really help much, as only the area
that fits in the target surface is actually touched - but it’s not quite that
simple! There are cache locality issues, and when using RLE, the RLE decoder
has to skip into each scanline when blitting from anywhere but the left edge
and on. Probably no massive impact though, but it’s there…

On Wednesday 31 March 2010, at 20.33.02, Bill Kendrick wrote:

[…]

Or break it into words, and then blit them each individually.

[…]



Wait… Is that a multicore Atom (I have a dual core on my desk, so at least I
know they exist ;-), or does this mean rendering is hardware accelerated?

Of course, retrace sync’ed flips could pull the CPU use below 100%, but unless
you’re at one frame per refresh (as in “full frame rate”), you won’t be going
much below 100% before you switch to the next “notch” and get a higher frame
rate.

Anyway, one advanced trick you might want to try is partial (or “smart”)
updates. Quite tricky stuff indeed, but if you have a fixed framerate and a
fixed scrolling speed, you might get away with turning the ‘text’ surface into
a “mask” that just contains “scroll step size” wide edges around the letters,
and then transparent pixels. (Transparency will be optimized out by the RLE,
both from memory and blit operations.) You blit as usual, but without clearing
the screen, and what happens is that the left and right edges of the letters
move, while the solid colored areas in between are untouched. This should
reduce the bandwidth requirements to a fraction of what you have now.

On Wednesday 31 March 2010, at 20.55.57, Ricardo Leite wrote:

ANSI C language, Linux Slackware 13.

The FPS on my program is 28 frames per second in:

Linux Slackware 13 - kernel 2.6.29.6
Language: ANSI C
Compiler: gcc v 4.3.3
Lib: SDL version 1.2.13
CPU: Intel Atom 1.6 GHz
1 GB RAM

CPU use is:25%
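
The edge-mask idea from this message can be illustrated on a single scanline. This is only a sketch under several assumptions: the glyphs are solid (not antialiased), the text scrolls left by exactly `step` pixels per frame, the row is padded with at least `step` background pixels on the right, and the display still holds the previous frame. `build_edge_mask` and the MASK_* values are hypothetical.

```c
/* What the mask says to do with each pixel of the scanline. */
enum {
    MASK_SKIP = 0,  /* transparent: screen already shows the right colour */
    MASK_TEXT = 1,  /* draw the text colour */
    MASK_BACK = 2   /* draw the background colour */
};

/* row[u] is 1 for a text pixel and 0 for background.
 * After scrolling left by `step`, the screen position that now shows
 * row[u] showed row[u - step] in the previous frame. Only positions
 * where those two differ need to be written. */
void build_edge_mask(const unsigned char *row, unsigned char *mask,
                     int width, int step)
{
    int u;
    for (u = 0; u < width; u++) {
        /* Anything left of the row counts as background. */
        unsigned char prev = (u >= step) ? row[u - step] : 0;

        if (row[u] == prev)
            mask[u] = MASK_SKIP;
        else
            mask[u] = row[u] ? MASK_TEXT : MASK_BACK;
    }
}
```

For the row 0 0 1 1 1 1 1 0 0 0 and a step of 2, only four of the ten pixels end up in the mask: two text-coloured pixels at the left edge of the run and two background-coloured pixels just past its right edge. Everything else is transparent and would be skipped by an RLE blit.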



Multicore: dual core with Hyper-Threading.

4 logical CPUs:
2 physical cores, each split into 2 hardware threads.
Here is the information from /proc/cpuinfo on Linux:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 28
model name : Intel® Atom™ CPU 330 @ 1.60GHz
stepping : 2
cpu MHz : 1598.770
cache size : 512 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc
arch_perfmon pebs bts pni dtes64 monitor ds_cpl tm2 ssse3 cx16 xtpr pdcm
lahf_lm

bogomips : 3197.54
clflush size : 64
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 28
model name : Intel® Atom™ CPU 330 @ 1.60GHz
stepping : 2
cpu MHz : 1598.770
cache size : 512 KB
physical id : 0
siblings : 4
core id : 1
cpu cores : 2
apicid : 2
initial apicid : 2
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc
arch_perfmon pebs bts pni dtes64 monitor ds_cpl tm2 ssse3 cx16 xtpr pdcm
lahf_lm

bogomips : 3196.86
clflush size : 64
power management:

processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 28
model name : Intel® Atom™ CPU 330 @ 1.60GHz
stepping : 2
cpu MHz : 1598.770
cache size : 512 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 1
initial apicid : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc
arch_perfmon pebs bts pni dtes64 monitor ds_cpl tm2 ssse3 cx16 xtpr pdcm
lahf_lm

bogomips : 3260.73
clflush size : 64
power management:

processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 28
model name : Intel® Atom™ CPU 330 @ 1.60GHz
stepping : 2
cpu MHz : 1598.770
cache size : 512 KB
physical id : 0
siblings : 4
core id : 1
cpu cores : 2
apicid : 3
initial apicid : 3
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc
arch_perfmon pebs bts pni dtes64 monitor ds_cpl tm2 ssse3 cx16 xtpr pdcm
lahf_lm
bogomips : 3196.91
clflush size : 64
power management:

2010/3/31 David Olofson wrote:

[…]

Caching and having source image data in the right format is the key.
Memory alignment to architecture boundaries helps too (i.e. 8-byte
alignment on the 32/64-bit Atom architecture). And since you are probably
running on shared-memory video hardware, blitting only what is necessary will
lower memory bus pressure and help performance.
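
For the alignment part, rounding a size up to the next multiple of 8 is a one-liner. A small sketch; `round_up_to_multiple` and `aligned_pitch` are hypothetical helper names, not SDL functions.

```c
#include <stddef.h>

/* Round `value` up to the next multiple of `multiple` (must be > 0). */
size_t round_up_to_multiple(size_t value, size_t multiple)
{
    return ((value + multiple - 1) / multiple) * multiple;
}

/* Bytes per scanline for a surface of the given width, padded so that
 * every row starts on an `alignment`-byte boundary. */
size_t aligned_pitch(size_t width_px, size_t bytes_per_pixel, size_t alignment)
{
    return round_up_to_multiple(width_px * bytes_per_pixel, alignment);
}
```

A 13 pixel wide character surface would be widened to 16 pixels this way, and a 13 pixel row at 2 bytes per pixel would get a 32-byte pitch.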

I would proceed as follows:

  • break up your font into surfaces, one for each character, and
    pre-render these; alternatively you can create one large surface
    containing all characters and create an array of rectangle coordinates
    (x,y,w,h) which records the location of each character in the texture
  • ensure that the source (pre-rendered character surfaces) and the
    target (screen) have the same surface format; if they are not the same,
    use a surface-surface blit with the pre-rendered character surface into
    new ones that match the screen; possibly resize these new surfaces to
    have 64bit aligned widths (i.e. make width a multiple of 8)
  • when you scroll, perform a screen-to-screen blit to move the existing
    (i.e. already-rendered) text; then draw only the fragment that is new;
    this should just be a vertical slice of one of the pre-rendered
    character surfaces
  • to get better smoothness into the scrolling, use a more sophisticated
    delay implementation (see SDL_gfx’s framerate manager)
  • using one of the SDL demos as benchmark, I’d also experiment with the
    screen’s resolution, refresh rate and format to find the one that is
    visually acceptable but performs best, then stick to that.
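
The scroll-then-draw-a-slice step from the list above can be sketched on a plain pixel buffer. This is not SDL code: it assumes 32-bit pixels, a pitch given in pixels, and a `slice` buffer holding the new columns (`h` rows of `step` pixels each) already cut from a pre-rendered character surface. `scroll_left` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scroll a w x h pixel buffer left by `step` pixels and fill the freed
 * columns on the right from `slice`. Requires 0 < step <= w. */
void scroll_left(uint32_t *pixels, int w, int h, int pitch,
                 const uint32_t *slice, int step)
{
    int y;
    for (y = 0; y < h; y++) {
        uint32_t *row = pixels + (size_t)y * (size_t)pitch;

        /* Source and destination overlap, so memmove, not memcpy. */
        memmove(row, row + step, (size_t)(w - step) * sizeof *row);

        /* Only this narrow strip is actually new each frame. */
        memcpy(row + (w - step), slice + (size_t)y * (size_t)step,
               (size_t)step * sizeof *row);
    }
}
```

Per frame this touches each visible pixel once for the move plus a `step` wide strip for the new data, instead of clearing and redrawing the whole window.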

If you have any kind of 3D acceleration, I’d either switch to hardware
accelerated vector based font rendering such as GLTT
(http://gltt.sourceforge.net/) or move all the pre-rendered characters
into texture memory and render quads to the screen
(http://nehe.gamedev.net/data/lessons/lesson.asp?lesson=43).

Happy coding.

On 3/31/10 10:37 AM, Ricardo Leite wrote:

Friends,
My program that displays text messages is very slow and inefficient.
Please, help me with tips and tricks.

Thanks.



[…]

  • when you scroll, perform a screen-to-screen blit to move the existing
    (i.e. already-rendered) text; then draw only the fragment that is new;
    this should just be a vertical slice of one of the pre-rendered
    character surfaces

Be warned though, that this can be extremely slow unless you have hardware
acceleration, or “real” shared video memory.

One would think that shared video memory should be perfect for software
rendering, but in my experience, some of these solutions actually handle
software rendering worse than video cards with their own VRAM! Could be the
drivers not understanding that the “VRAM” is actually in the address space of
the CPU…

Of course, hardware scrolling (as in changing the RAMDAC DMA pointer rather
than blitting data around) would provide an incredible speed-up, but
unfortunately, there is very limited and non-portable support for that in
widely available APIs. If you’re coding for a specific device, it may be worth
looking into, though.

On Friday 02 April 2010, at 16.14.25, Andreas Schiffler wrote:

