Follow up (more details, test results, replies)

One of the things I didn’t do in the original post (mostly for brevity)
was to give more of the “background story” to this post. I should
have at least included this information in the readme, because several posts
in this thread offer solutions that, however elegant, don’t solve my problem.

For starters, I didn't explain why using OpenGL is flat out for my
project. OpenGL is a large, complex graphics library that virtually everyone
likely to modify my code is 1) not going to have, 2) not going to know much
about, and 3) not going to spend the considerable amount of time needed to set
up and understand unless no other alternative exists. Because multiple
alternatives /do/ exist, discussion of OpenGL has no bearing.
I ask people who post OpenGL solutions on this - not an OpenGL mailing
list - to remember why a lot of us come to SDL. Not to find a wrapper around
any other library, but to find a Simple, Direct, Multimedia library that we
hope will be a complete (or nearly complete) solution for most simple cross-
platform 2D projects, thus freeing us from having to juggle any other library.

SDL isn't a complete solution for me and my project. The major reason is
speed. To attain anything close to the performance of the several
single-platform ports of my project (Mac, X11, and Win32 being three), I must
solve at least one of two problems: Get text onto the application SDL surface
faster, or figure out why SDL_UpdateRects is so bloody slow on all tested
machines and using all tested drivers.
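
For readers who have not used the call in question: SDL 1.2 lets an application collect several changed rectangles and pass them to SDL_UpdateRects in one call. The following is only a sketch of that batching pattern, not the code from text_speed.c. The names Rect, RectBatch, batch_add, batch_flush and record_flush are hypothetical, and the limit of 64 is taken from the "64 stored update rects" mentioned in the test results. The callback stands in for a real call to SDL_UpdateRects(screen, count, rects), so the sketch compiles without SDL.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RECTS 64                      /* matches "64 stored update rects" */

typedef struct { int x, y, w, h; } Rect;  /* stand-in for SDL_Rect */

/* Called with all pending rectangles whenever the batch is flushed. */
typedef void (*FlushFn)(const Rect *rects, int count);

typedef struct {
    Rect    rects[MAX_RECTS];
    int     count;
    FlushFn flush;
} RectBatch;

void batch_init(RectBatch *b, FlushFn flush)
{
    assert(b != NULL && flush != NULL);
    b->count = 0;
    b->flush = flush;
}

/* Hand every pending rectangle to the callback, then empty the batch. */
void batch_flush(RectBatch *b)
{
    assert(b != NULL);
    if (b->count > 0) {
        b->flush(b->rects, b->count);
        b->count = 0;
    }
}

/* Queue one rectangle, flushing first if the batch is already full. */
void batch_add(RectBatch *b, Rect r)
{
    assert(b != NULL);
    if (b->count == MAX_RECTS)
        batch_flush(b);
    b->rects[b->count++] = r;
}

/* Example callback that only counts. A real one would call
 * SDL_UpdateRects(screen, count, rects) here instead. */
int flush_calls = 0;
int flushed_rects = 0;

void record_flush(const Rect *rects, int count)
{
    (void)rects;
    flush_calls++;
    flushed_rects += count;
}
```

Timing the callback separately from the drawing code is one way to see which of the two problems dominates on a given machine.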

The source provided makes it easy to determine what problem is more
important. Or not, as you will see below.

Kein-Hong Man,

Thank you very much for your tests of the supplied code. Your setup
instructions for each test are perfectly clear, but I’m astonished at your
results. In particular, your results show a much larger proportion of time
spent in update_rects than mine do. Here are mine:

=======================================

Test results, with source as supplied (no changes to code, 64 stored update
rects)

Compiled with Visual C++ (optimized for speed). Using SDL 1.2.11.

On a 2.2 GHz desktop machine running Windows XP, Pentium IV, Radeon 8500 video
card, 32-bit display:

Time needed to display text pages: 3971.
- spent in text output: 1454
- spent in screen refreshes: 2511
- spent in miscellaneous: 6

Time needed to display text lines: 2684.
- spent in text output: 1225
- spent in screen refreshes: 1447
- spent in miscellaneous: 12

Time needed to display individual characters: 4571.
- spent in text output: 1670
- spent in screen refreshes: 2784
- spent in miscellaneous: 117

On a 633 MHz desktop machine running Windows 98, Celeron, no separate video
card, 16-bit display:

Time needed to display text pages: 12268.
- spent in text output: 4985
- spent in screen refreshes: 7033
- spent in miscellaneous: 250

Time needed to display text lines: 13793.
- spent in text output: 4530
- spent in screen refreshes: 8830
- spent in miscellaneous: 433

Time needed to display individual characters: 22978.
- spent in text output: 7543
- spent in screen refreshes: 13170
- spent in miscellaneous: 2265

======================================
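
To compare these figures with other people's results, it helps to turn them into shares of the total. A minimal sketch follows; the helper name percent_of_total is hypothetical and not part of the supplied source.

```c
#include <assert.h>

/* Share of `total` taken by `part`, rounded to the nearest whole percent.
 * Both values must use the same time unit. */
int percent_of_total(long part, long total)
{
    assert(total > 0 && part >= 0);
    return (int)((part * 100 + total / 2) / total);
}
```

Applied to the numbers above, screen refreshes account for about 63% (pages), 54% (lines) and 61% (characters) of the time on the Windows XP machine, and about 57%, 64% and 57% on the Windows 98 machine.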

My earlier code (the version before the present one) worked much the same
way as Kein-Hong Man describes: Render each character in the current font to
an individual 8-bit SDL surface, use SDL_SRCCOLORKEY and SDL_RLEACCEL for
speed, and use palette edits to enable the many text colors needed.
Text output results for that method were almost twice as slow when the
application was run in 8-bit mode (256 colors), and quite a lot worse when it
was run in any other color depth. As you can see from the above results, a
doubling of the time needed to output text has a significant effect on total
performance on my test machines.
These results pass the “Is it reasonable?” test. After all, when you turn
a font into graphics and also insist upon blitting from 8-bit surfaces to non-
8-bit surfaces, you can expect things to be slow.

When I replaced the pre-rendered blitting code with the current pixel
offset code, my SDL app went from "it hurts my eyes it flickers so much"
to “just barely acceptable … on fast machines”. If this improvement hadn’t
happened, I might not have offered an SDL version at all.
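
The pixel offset code itself is not reproduced in this thread, so the following is only a guess at what such a scheme might look like, not the method used in text_speed.c. It assumes each glyph is stored as a list of offsets to its "on" pixels, precomputed for one surface width, and that the destination is a 32-bit pixel buffer. The names OffsetGlyph and draw_glyph are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical glyph: offsets (in pixels) from the glyph's top-left corner
 * to every "on" pixel, precomputed as y * pixels_per_row + x. */
typedef struct {
    const size_t *offsets;
    size_t        count;
} OffsetGlyph;

/* Write `color` to each "on" pixel. `dst` points at the destination pixel
 * for the glyph's top-left corner. Background pixels are never touched,
 * so no color key test or per-glyph blit call is needed. */
void draw_glyph(uint32_t *dst, const OffsetGlyph *g, uint32_t color)
{
    size_t i;
    assert(dst != NULL && g != NULL);
    for (i = 0; i < g->count; i++)
        dst[g->offsets[i]] = color;
}
```

With a real SDL surface the offsets would have to be rebuilt whenever the surface pitch changes, and the surface may need to be locked before writing to its pixels.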

Maestros, gurus, experts, and cognoscenti: I claim SDL (UpdateRects in
particular) is slow, especially on Windows, but also on Linux. The obvious
response to such a claim is “no, /the way you use SDL/ is slow”. Well, I’ve
tried the example code, followed the tutorials, read the docs and this mailing
list, scoured the web for ideas, tried out all the drivers listed, optimized
for weeks on end, used four different compilers (bcc, msvc, dev-c++, gcc under
Konsole), and STILL HAVEN’T GOTTEN ANYTHING LIKE THE SPEED OF X11, MAC, or
WIN32!!

Sorry for yelling. It’s just that I am /so/ frustrated. I don’t pretend to
be an expert. Any help or ideas are much appreciated.

> For starters, I didn't explain why using OpenGL is flat out for my
> project. OpenGL is a large, complex graphics library that virtually everyone
> likely to modify my code is 1) not going to have, 2) not going to know much
> about, and 3) not going to spend the considerable amount of time needed to set
> up and understand unless no other alternative exists. Because multiple
> alternatives /do/ exist, discussion of OpenGL has no bearing.
> I ask people who post OpenGL solutions on this - not an OpenGL mailing
> list - to remember why a lot of us come to SDL. Not to find a wrapper around
> any other library, but to find a Simple, Direct, Multimedia library that we
> hope will be a complete (or nearly complete) solution for most simple
> cross-platform 2D projects, thus freeing us from having to juggle any other
> library.
>
> SDL isn't a complete solution for me and my project. The major reason is
> speed. To attain anything close to the performance of the several
> single-platform ports of my project (Mac, X11, and Win32 being three), I must
> solve at least one of two problems: Get text onto the application SDL surface
> faster, or figure out why SDL_UpdateRects is so bloody slow on all tested
> machines and using all tested drivers.

The reason SDL is so slow, I think, is very simple: it’s not using
your video card. Correct me if I’m wrong, but I believe that because
not all systems can be assumed to have video cards (especially in the
realm of handheld devices, an area that SDL was perhaps not initially
intended for but has left itself open to being ported to), SDL does not
default to using your video card. All of the blitting, drawing,
blending, etc. is done on your processor and ONLY your processor.
That’s why everyone was suggesting OpenGL as an addition to direct SDL.

While I understand the desire to keep things as simple as possible, I
feel like I should point out that SDL is in essence a "wrapper"
around a bunch of very system-specific libraries to accomplish the
goal that you can write one piece of code and it runs basically the
same on many different setups. I’m sure you realize the difference
and just phrased it mistakenly, but in your usage the problem is that
OpenGL is the only thing which I WOULDN’T call a wrapper. OpenGL is
as cryptic as it is because it gets down and dirty with the video
hardware, and was not written to be elegant to understand, but rather
FAST, which is what you’re asking for.

OpenGL will not run on absolutely all systems, especially mobile
ones, where the idea of a graphics accelerator is so obscure as to
pass into the realm of the absurd. SDL, however, by not relying on
having a video card or OpenGL, can still be run just as effectively
on many handheld/mobile systems.

So it seems to me, though I’m not by ANY means an expert in this
area, that you may well have reached the maximum of performance at
the limitations you’ve set for yourself. AFAIK, when writing text
you either have to have the text pre-rendered to a file and draw
that, or have all your characters stored as small images and then
access them according to whichever character you need, for all
characters which might need displaying that frame. If you have a lot
of characters, as you’ve discovered I think, that can involve a LOT
of function calls, and when drawing with the processor, that can take
a while. (“a while” being relative here)

The only suggestion I have, which I hope you will take as an apology for
what has probably been a rather blunt and perhaps a bit offensive-sounding
email (though I assure you no offense was intended), is that you
may be able to save yourself some time in drawing at the expense of
memory. You could cache the text you need drawn, once drawn the
first time, as images stored in memory. That way, when you need to
redraw the text, you can simply draw the cached image without needing to
redraw every last character every frame, and I think that could save
you some time between frames, though your initial draw may well take
as much time as every redraw you’re doing now. I’m not an expert at
this, as I’ve said before. I generally use OpenGL, and have SDL as my
backend for user input, sound, and simply setting up a drawing area
for my projects.
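
A rough sketch of the caching idea, under several assumptions: whole strings are cached rather than single characters, the cache is a small fixed-size table with round-robin replacement, and an integer id stands in for the pre-rendered image. With SDL 1.2 the id would be an SDL_Surface pointer drawn with SDL_BlitSurface. The names TextCache, cache_init and cached_image, and both size limits, are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define CACHE_SLOTS 8     /* assumed cache size */
#define MAX_TEXT    80    /* assumed longest cacheable string, including NUL */

typedef struct {
    char text[MAX_TEXT];
    int  image_id;        /* stand-in for a pre-rendered surface */
    int  used;
} CacheSlot;

typedef struct {
    CacheSlot slots[CACHE_SLOTS];
    int renders;          /* how many times the slow rendering path ran */
    int next;             /* next slot to overwrite (round-robin) */
} TextCache;

void cache_init(TextCache *c)
{
    assert(c != NULL);
    memset(c, 0, sizeof *c);
}

/* Return the image for `text`, rendering it only on a cache miss. */
int cached_image(TextCache *c, const char *text)
{
    int i;
    CacheSlot *slot;

    assert(c != NULL && text != NULL);

    for (i = 0; i < CACHE_SLOTS; i++) {
        if (c->slots[i].used && strcmp(c->slots[i].text, text) == 0)
            return c->slots[i].image_id;   /* hit: nothing is re-rendered */
    }

    /* Miss: pay the rendering cost once (simulated by a counter here). */
    c->renders++;
    if (strlen(text) >= MAX_TEXT)
        return c->renders;                 /* too long to store; not cached */

    slot = &c->slots[c->next];
    c->next = (c->next + 1) % CACHE_SLOTS;
    strcpy(slot->text, text);
    slot->image_id = c->renders;
    slot->used = 1;
    return slot->image_id;
}
```

This only pays off when the same text is redrawn many times. Text that changes on every frame misses the cache each time and costs memory on top of the original rendering time.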

> The source provided makes it easy to determine what problem is more
> important. Or not, as you will see below.
>
> Kein-Hong Man,

– Scott

On Apr 17, 2007, at 9:25 PM, Leon Marrick wrote:

> For starters, I didn't explain why using OpenGL
>
> …snip
> …snip
> The only suggestion I have, which I hope you will take as an apology for
> what has probably been a rather blunt and perhaps a bit offensive-sounding
> email (though I assure you no offense was intended), is that you
> may be able to save yourself some time in drawing at the expense of
> memory. You could cache the text you need drawn, once drawn the
> first time, as images stored in memory. That way, when you need to
> redraw the text, you can simply draw the cached image without needing to
> redraw every last character every frame, and I think that could save
> you some time between frames, though your initial draw may well take
> as much time as every redraw you’re doing now. I’m not an expert at
> this, as I’ve said before. I generally use OpenGL, and have SDL as my
> backend for user input, sound, and simply setting up a drawing area
> for my projects.

I have to apologize. I just re-read your first post, and you said
you already tried pre-rendering.

Sorry I couldn’t offer something more helpful.

I am curious, however, who you expect your users to be in that they
won’t know OpenGL? I was under the impression that most anyone
involved in graphics programming needed to have at least a
rudimentary understanding of OpenGL or Direct3D, and thus OpenGL
would not be an insurmountable obstacle in the realm of graphics
project source code? Please forgive my ignorance if I am mistaken,
however.

– Scott

On Apr 18, 2007, at 12:54 AM, Scott Harper wrote:

On Apr 17, 2007, at 9:25 PM, Leon Marrick wrote:

Leon Marrick wrote:

> SDL isn't a complete solution for me and my project. The major reason is
> speed. To attain anything close to the performance of the several
> single-platform ports of my project (Mac, X11, and Win32 being three), I must
> solve at least one of two problems: Get text onto the application SDL surface
> faster, or figure out why SDL_UpdateRects is so bloody slow on all tested
> machines and using all tested drivers.

For your case, 4.0ms per frame on the first test seems plenty fast
to me. That’s 250 frames per second. For my tests, running your
code using directx mode gives about 4.0ms as well. Exactly why
does your app need performance in the hundreds of frames per
second range?

If blitting is accelerated, no problem. But if it is not
accelerated, then I don’t think much magic can be done, but I
still got about 60 frames per second. I can live with that, or
else perhaps it can be run in directx mode by default…
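
The frame-rate figures in this reply follow from a simple conversion between time per frame and frames per second. A minimal sketch, with a hypothetical helper name:

```c
#include <assert.h>

/* Convert a per-frame time in milliseconds to frames per second. */
double frames_per_second(double ms_per_frame)
{
    assert(ms_per_frame > 0.0);
    return 1000.0 / ms_per_frame;
}
```

At 4.0 ms per frame this gives 250 frames per second, and doubling the frame time to 8.0 ms halves the rate to 125. Going the other way, 60 frames per second corresponds to roughly 16.7 ms per frame.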

> =======================================
>
> Test results, with source as supplied (no changes to code, 64 stored update
> rects)
> [snip]
>
> [snip] Text output results for that method were almost twice as slow when the
> application was run in 8-bit mode (256 colors), and quite a lot worse when it
> was run in any other color depth. As you can see from the above results, a
> doubling of the time needed to output text has a significant effect on total
> performance on my test machines.

Doubling of the time sounds about right, based on scaling my
timing data.

> When I replaced the pre-rendered blitting code with the current pixel
> offset code, my SDL app went from "it hurts my eyes it flickers so much"
> to “just barely acceptable … on fast machines”. If this improvement hadn’t
> happened, I might not have offered an SDL version at all.

How much does it really affect playability? I mean, 250 fps vs 125
fps for text_speed.c … What kind of app or game is it anyway? If
it is real time, you don’t need 250 fps, while if it is an
angband-type game, does the keyboard repeat rate allow 250 fps?
How many fps is your actual app or game getting anyway (which I
think is more important than trying to optimize text_speed.c to
the hilt)?

> [snip]

> Sorry for yelling. It’s just that I am /so/ frustrated. I don’t pretend to
> be an expert. Any help or ideas are much appreciated.


Cheers,
Kein-Hong Man (esq.)
Kuala Lumpur, Malaysia

Kein-Hong Man wrote:

Leon Marrick wrote:

> SDL isn't a complete solution for me and my project.
>
> [snip snip snip]

Sorry about that double post, I will be more careful in the future…

–
khman
KL, MY