X11 performance

Stephane Peter wrote:

I know the internal blitter of my video card is an order of magnitude
faster than what I get now! Now, how do I get to use it???

Well, the fact is that the X server is the only software responsible for it.
Only the X server decides how to accelerate things, and there is no way that
I know of in the current X11 spec that allows a client to force the server to
use acceleration (X11 was not designed with games in mind!).

Yes, I know there is no explicit “put this in video memory” flag, but I
am trying to find out what requirements need to be satisfied so
that XFree86 will put a pixmap in video memory (I know it could get
kicked out and so on, but I guess that if I use it madly, it should stay
there for a bit)…

I was thinking of creating an X11 extension that would give
DirectX-style control, but I’m not looking very seriously into it
(hacking and recompiling the X server sounds awful, but dynamically
loadable modules in XFree86 4.0 could make this much easier to hack)…
But DGA 2.0 could help (but is only fullscreen).

Most if not all X11 drivers (especially in XFree86) at least have
blits and rectangle fills hardware accelerated (otherwise the performance
would be utterly slow; try the framebuffer-only X server to get an idea)…

Window-to-window XCopyArea and XFillRectangle to a Window are definitely
accelerated (100-something megabytes per second and up), both on the S3
ViRGE and the Matrox G200.

I really have to try setting a pixmap as the background of a window and
using XClearArea; I’ve heard good things about that.
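
Something like the following is what I have in mind (a rough, untested
sketch; "win" and the 640x480 size are made up):

    /* The background-pixmap trick: the server repaints the window
       from its background pixmap entirely on its own, so no image
       data crosses the wire. */
    #include <X11/Xlib.h>

    void blit_via_background(Display *dpy, Window win, Pixmap back)
    {
        /* If "back" is cached in video memory, the repaint can be a
           screen-to-screen blit. */
        XSetWindowBackgroundPixmap(dpy, win, back);

        /* Repaint the window from its background, without generating
           Expose events. */
        XClearArea(dpy, win, 0, 0, 640, 480, False);
        XFlush(dpy);
    }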

We achieved extraordinary results with some native DirectX tests through
careful hardware usage. Our way of using the hardware resulted in a very
unusual 2D library, with features like memory management that are more
often found in 3D libraries than in 2D ones. For example, where most 2D
libraries give you access to the pixel data of surfaces, we found we could
optimize better by disallowing direct surface access and having the
library user “upload” the surface data instead, similar to texture
management in OpenGL.

This mostly works the same way in X11, although you don’t have any control
over it (which is a shame for game developers!). However, with a bit of
knowledge of how X servers work (and mostly XFree in this case), you can
arrange things to increase the chances that hardware acceleration will be
used (using Pixmaps that may be stored in video memory, for instance)…

Doing things like DirectX is almost impossible in X, because this is just
plain dirty and in contradiction with what X11 stands for. The only solution
would be to use a new X11 extension, something like DGA 2.0 …

No explicit control would do, as long as it at least happens sometimes!
Explicit control would be much better, that’s true, but it would also be
contrary to the regular X philosophy.

I tried using Pixmaps of various sizes (my XFree86 3.3.5 advertises that
it reserves 9 128x128 areas for pixmap caching, so I tried 640x480,
128x128 and 127x127 pixmaps), to no avail.

DGA 2.0 sounds much better. I expect that the DGACopyArea in it is
hardware accelerated (though I do not exactly understand how you would
address off-screen video memory), and it even has a colorkey blit (YES!).
But it requires a lot of stuff, like being fullscreen and all sorts of
permission crappiness…

I do not want direct access to the framebuffer, and the accelerated
functions I want should work both in windowed and fullscreen mode, so
while I appreciate being able to switch to fullscreen (which I can also
do with xf86vmode and an override_redirect window), this is a lot to
pay.
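
For reference, the xf86vmode + override_redirect approach looks roughly
like this (a sketch from memory, untested; the 640x480 target and the
names are just examples):

    /* Switch the display to 640x480 with XF86VidMode, then cover the
       screen with a window the window manager will not manage. */
    #include <X11/Xlib.h>
    #include <X11/extensions/xf86vmode.h>

    void go_fullscreen(Display *dpy, int screen, Window win)
    {
        XF86VidModeModeInfo **modes;
        XSetWindowAttributes attr;
        int i, nmodes;

        /* Find and switch to a 640x480 mode line. */
        XF86VidModeGetAllModeLines(dpy, screen, &nmodes, &modes);
        for (i = 0; i < nmodes; i++) {
            if (modes[i]->hdisplay == 640 && modes[i]->vdisplay == 480) {
                XF86VidModeSwitchToMode(dpy, screen, modes[i]);
                XF86VidModeSetViewPort(dpy, screen, 0, 0);
                break;
            }
        }
        XFree(modes);

        /* override_redirect bypasses the window manager, so the
           window can cover the whole (mode-switched) screen.  Set it
           before mapping the window. */
        attr.override_redirect = True;
        XChangeWindowAttributes(dpy, win, CWOverrideRedirect, &attr);
        XMoveResizeWindow(dpy, win, 0, 0, 640, 480);
        XMapRaised(dpy, win);
    }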

Is there any way to tell that a pixmap has been cached in hardware and
will be HW accelerated in blitting? Is there any way to request a
pixmap that must reside in acceleratable video memory?

Nope. I can tell you under what conditions XAA will currently
put a pixmap in offscreen memory though. If it has an area
larger than 64000 pixels and there’s room for it (and provided
the driver has allowed this) it will get stuck in offscreen
memory. It can get kicked out by the server at any time to
make room for something else though.
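
To make that concrete, the back-buffer pattern under discussion looks
like this (a rough sketch; "win", "gc" and "depth" are placeholders):

    #include <X11/Xlib.h>

    Pixmap make_back_buffer(Display *dpy, Window win, GC gc, int depth)
    {
        /* 640x480 is 307200 pixels, well above the 64000-pixel
           cutoff, so a 3.9.x server with a cooperative driver may
           place this in offscreen video memory (with no guarantee
           that it stays there). */
        Pixmap back = XCreatePixmap(dpy, win, 640, 480, depth);

        /* ... render the frame into "back" with normal Xlib calls ... */

        /* Flip: if the pixmap is cached offscreen, this becomes a
           screen-to-screen blit done by the card's blitter. */
        XCopyArea(dpy, back, win, gc, 0, 0, 640, 480, 0, 0);
        return back;
    }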

Hmm… Is this about 3.3.x or 3.9.x? Raster told me that my benchmark
test kicked ass with 3.9.16 but sucked with 3.3.x (with a pixmap of
640x480, well over 64000 pixels, and I have 8 megs of video memory, so
it should fit).

That’s assuming you’re talking about using Pixmaps for
back buffers. Pixmaps used as GC tiles are handled a little
differently.

“Differently”? How so exactly? I am currently using a Pixmap as a back
buffer (I XCopyArea a 640x480 pixmap to a 640x480 window).
--
Pierre Phaneuf
Ludus Design, http://ludusdesign.com/
“First they ignore you. Then they laugh at you.
Then they fight you. Then you win.” – Gandhi

Sam Lantinga wrote:

But for the shared pixmap vs. shared image, I would agree (a small
improvement though, in the 2%-5% range).

That’s correct. Also, many X servers do not support X shared mem pixmaps
while they do support shared mem images.
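
For the record, a shared memory image is set up something like this (a
rough sketch; error checking omitted):

    #include <sys/ipc.h>
    #include <sys/shm.h>
    #include <X11/Xlib.h>
    #include <X11/extensions/XShm.h>

    XImage *make_shm_image(Display *dpy, XShmSegmentInfo *shminfo,
                           int depth, int w, int h)
    {
        XImage *img = XShmCreateImage(dpy,
                                      DefaultVisual(dpy, DefaultScreen(dpy)),
                                      depth, ZPixmap, NULL, shminfo, w, h);

        /* Allocate the shared segment and attach it on both sides:
           in our address space and in the X server. */
        shminfo->shmid = shmget(IPC_PRIVATE,
                                img->bytes_per_line * img->height,
                                IPC_CREAT | 0777);
        shminfo->shmaddr = img->data = shmat(shminfo->shmid, NULL, 0);
        shminfo->readOnly = False;
        XShmAttach(dpy, shminfo);
        XSync(dpy, False);

        /* Draw into img->data, then push it to a window with
           XShmPutImage(dpy, win, gc, img, 0, 0, 0, 0, w, h, False); */
        return img;
    }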

Hmm, making a mental note of this…
--
Pierre Phaneuf
Ludus Design, http://ludusdesign.com/
“First they ignore you. Then they laugh at you.
Then they fight you. Then you win.” – Gandhi

Stephane Peter wrote:

Here is a post from Mark Vojkovich, one of the main XFree86 developers:

By the way, thanks Stéphane (correct? are you French?)! You’ve been
one of the biggest helps I’ve had!

Is there any XFree86-specific forum that would be appropriate for this,
preferably the kind where knowledgeable people like this Mark Vojkovich
(heard about him before, but never “met” him in email) hang out?

Thanks again!
--
Pierre Phaneuf
Ludus Design, http://ludusdesign.com/
“First they ignore you. Then they laugh at you.
Then they fight you. Then you win.” – Gandhi

Sam Lantinga wrote:

Pierre, your basic complaint is that X does not allow you the precise
control over which pixmaps are put into video memory and which are not.
This is unfortunately a limitation of the X server, and though I have
sent a message to the XFree86 development list, I do not expect this
to change.

While I would like precise control over which pixmaps are put into
video memory, this is not exactly what I expected. I actually expected
no control over this.

What I did expect is that the X server would use the large amount
of video memory I have to do intelligent caching. If I XCopyArea a 600k
pixmap as quickly as the X server can do it, without ever touching the
pixmap itself, I would expect the X server to stick it in video memory!

Now what happens is one of two things. Either:

  • a bunch of smaller pixmaps blitted “every once in a while” (probably
    the icons in the dock of my window manager or something like that, which
    are blitted only when they get an Expose event or at some similarly
    extremely slow rate) occupy the video memory, preventing my large pixmap,
    blitted a large number of times per second, from being accelerated.

Or:

  • there is about 6 or 7 megabytes of expensive SDRAM on my expensive
    video card that are just gathering electronic dust. If I ever hear
    that’s the case, I might become a very infuriated and motivated XFree86
    developer.

So the choices I see are “dumb” and “dumber”. Not exactly cool. I’m
only expecting “okay” behavior from my X server, not total control over
framebuffer memory usage! Proof of this is that I don’t even care for
DGA; I’d rather the people who should know better, the guys who wrote
the driver for my video card, decide how best to use it.

You might take a look at the new SDL framebuffer console driver, which
is in its infancy but has direct access to the entire video memory and
acceleration for supported video cards. Currently, both the 3Dfx and
Matrox cards are supported by the fbcon driver.

I thought about this, but it isn’t to my liking. None of the stock
distributions (for Intel machines) that I know of use fbcon by
default. I don’t even know how to set up fbcon on my computer (okay, I
didn’t even try), so expecting people who have trouble finding the
Pause key on their keyboard to do so is maybe a lot to ask (for the
record, the top question we get about Quadra is “What is DINPUT.DLL?”).

Almost everyone that is going to play games on their Linux box has X
working at least minimally. Our design philosophy for our next
developments places a big emphasis on being well integrated with the
regular user interface and windowing system. Having an optional fbcon
driver that could yield even better performance is nice, but it is
unacceptable to require such a thing as fbcon for our games.

I’m still smarting from requiring Svgalib (which doesn’t work worth crap
with newer video cards as delivered by most Linux distributions).
--
Pierre Phaneuf
Ludus Design, http://ludusdesign.com/
“First they ignore you. Then they laugh at you.
Then they fight you. Then you win.” – Gandhi

In article <38721214.14828CF5 at ludusdesign.com>,
Pierre Phaneuf writes:

Stephane Peter wrote:

Here is a post from Mark Vojkovich, one of the main XFree86 developers:

By the way, thanks Stéphane (correct? are you French?)! You’ve been
one of the biggest helps I’ve had!

Yes I am French ;-). I bet you are too ;-)

Is there any XFree86-specific forum that would be appropriate for this,
preferably the kind where knowledgeable people like this Mark Vojkovich
(heard about him before, but never “met” him in email) hang out?

This was actually a post on the xfree86-devel list, which is available if
you register as an XFree86 developer. You can probably get subscribed to it
just by asking (look on the XFree86 web site for info).
--
Stephane Peter
Programmer
Loki Entertainment Software

“Microsoft has done to computers what McDonald’s has done to gastronomy”

Stephane Peter wrote:

By the way, thanks Stéphane (correct? are you French?)! You’ve been
one of the biggest helps I’ve had!

Yes I am French ;-). I bet you are too ;-)

French Canadian actually, I live in Montréal. :-)

This was actually a post on the xfree86-devel list, which is available if
you register as a XFree86 developer. You can probably get subscribed to it
by only asking (look on the XFree86 web site for info).

Thanks, I am sending an e-mail to the XFree86 people at this very
moment!
--
Pierre Phaneuf
Ludus Design, http://ludusdesign.com/
“First they ignore you. Then they laugh at you.
Then they fight you. Then you win.” – Gandhi

  • there is about 6 or 7 megabytes of expensive SDRAM on my expensive
    video card that are just gathering electronic dust. If I ever hear
    that’s the case, I might become a very infuriated and motivated XFree86
    developer.

Mind if I ship you my card (Diamond Viper V550 with a TNT chip)? I’m
using X 3.3.6 at the moment and it’s nowhere near accelerated (two causes
for that: one, I have to put up with the unaccelerated SVGA server, and
two, the card needs an IRQ and my BIOS doesn’t have that option; a BIOS
upgrade is planned).

Alain “currently living in Sherbrooke, QC” Toussaint

Alain Toussaint wrote:

  • there is about 6 or 7 megabytes of expensive SDRAM on my expensive
    video card that are just gathering electronic dust. If I ever hear
    that’s the case, I might become a very infuriated and motivated XFree86
    developer.

Mind if I ship you my card (Diamond Viper V550 with a TNT chip)? I’m
using X 3.3.6 at the moment and it’s nowhere near accelerated (two causes
for that: one, I have to put up with the unaccelerated SVGA server, and
two, the card needs an IRQ and my BIOS doesn’t have that option; a BIOS
upgrade is planned).

Hmm… Sorry, but I’m not yet an XFree86 developer (I did send my
application, though), and even if I were, I’m not into driver development,
sorry again! I plan to work on an extension that will allow client
access to the XAA functions (not completely directly, but whatever).

If you want to ship me your card anyway, well thanks, I could find a use
for it, no doubt about this! :-)
--
Pierre Phaneuf
Ludus Design, http://ludusdesign.com/

Hi everyone,

I am frequently recompiling SDL after making minor modifications,
but I find that just doing “make” followed by “make install” often
doesn’t pick up my latest changes; instead I have to do “make clean”
followed by “make” and “make install”.

Does anyone know the trick to force my changes to be incorporated
into each build, without resorting to the time-consuming “make clean”
every time? Somehow, even though “make” is obviously recompiling
the modified source files, the latest changes don’t get picked up
unless I “make clean” first…

(The changes I’m making are for allowing SDL to drive custom graphics
hardware)

Thanks,

Steve Madsen

Hi everyone,

I am frequently recompiling SDL after making minor modifications,
but I find that just doing “make” followed by “make install” often
doesn’t pick up my latest changes; instead I have to do “make clean”
followed by “make” and “make install”.

Does anyone know the trick to force my changes to be incorporated
into each build, without resorting to the time-consuming “make clean”
every time? Somehow, even though “make” is obviously recompiling
the modified source files, the latest changes don’t get picked up
unless I “make clean” first…

Automake madness:
rm -vf `find src -name '*.la'`

See ya!
-Sam Lantinga (slouken at devolution.com)

Lead Programmer, Loki Entertainment Software
--
“Any sufficiently advanced bug is indistinguishable from a feature”
– Rich Kulawiec

Sam Lantinga wrote:

Hi everyone,

I am frequently recompiling SDL after making minor modifications,
but I find that just doing “make” followed by “make install” often
doesn’t pick up my latest changes; instead I have to do “make clean”
followed by “make” and “make install”.

As make will only recompile .c files that have changed, you have to build
a dependency file so that make knows what to rebuild when an included
file changes.

Does anyone know the trick to force my changes to be incorporated
into each build, without resorting to the time-consuming “make clean”
every time? Somehow, even though “make” is obviously recompiling
the modified source files, the latest changes don’t get picked up
unless I “make clean” first…

Automake madness:
rm -vf `find src -name '*.la'`

Am I overlooking something, or why is there no such thing as ‘make dep’
with SDL? Basically, a line like

find . -name "*.c" | sed "s/^\.\///" | xargs gcc $(OPTIONS) -M > Makefile.dep

where $(OPTIONS) are the compilation options, should produce a file that
can be included in the Makefile. I just ripped it from the ClanLib Makefile
and adjusted it a bit, but without dependencies I would go nuts, as a
full rebuild of ClanLib usually takes 4 minutes on my PII 350 ;)
--
Daniel Vogel                  My opinions may have changed,
666 @ http://grafzahl.de      but not the fact that I am right

where $(OPTIONS) are the compilation options, should produce a file that
can be included in the Makefile. I just ripped it from the ClanLib Makefile
and adjusted it a bit, but without dependencies I would go nuts, as a
full rebuild of ClanLib usually takes 4 minutes on my PII 350 ;)

Automake generates dependencies. The problem is it doesn’t generate
them for the .la files.

-Sam Lantinga				(slouken at devolution.com)

Lead Programmer, Loki Entertainment Software
--
“Any sufficiently advanced bug is indistinguishable from a feature”
– Rich Kulawiec

Sam Lantinga wrote:

where $(OPTIONS) are the compilation options, should produce a file that
can be included in the Makefile. I just ripped it from the ClanLib Makefile
and adjusted it a bit, but without dependencies I would go nuts, as a
full rebuild of ClanLib usually takes 4 minutes on my PII 350 ;)

Automake generates dependencies. The problem is it doesn’t generate
them for the .la files.

Noticed right after I pressed ‘send’ - that explains the .deps dir ;)
When does automake generate dependencies? During each build, or only if
the .P files are non-existent?
--
Daniel Vogel                  My opinions may have changed,
666 @ http://grafzahl.de      but not the fact that I am right

Noticed right after I pressed ‘send’ - that explains the .deps dir ;)
When does automake generate dependencies? During each build, or only if
the .P files are non-existent?

I’m not sure.

-Sam Lantinga				(slouken at devolution.com)

Lead Programmer, Loki Entertainment Software
--
“Any sufficiently advanced bug is indistinguishable from a feature”
– Rich Kulawiec

Sam Lantinga wrote:

Noticed right after I pressed ‘send’ - that explains the .deps dir ;)
When does automake generate dependencies? During each build, or only if
the .P files are non-existent?

I’m not sure.

I asked because, if the latter is true, changing the deps of a file (by
adding an include), compiling it, and then changing the included file
wouldn’t update the object file. Well, then I would be forced to do a make
clean if I couldn’t remember which .o file to delete. Okay, this is
highly theoretical… sorry.
--
Daniel Vogel                  My opinions may have changed,
666 @ http://grafzahl.de      but not the fact that I am right

Steve Madsen wrote:

I am frequently recompiling SDL after making minor modifications,
but I find that just doing “make” followed by “make install” often
doesn’t pick up my latest changes; instead I have to do “make clean”
followed by “make” and “make install”.

Does anyone know the trick to force my changes to be incorporated
into each build, without resorting to the time-consuming “make clean”
every time? Somehow, even though “make” is obviously recompiling
the modified source files, the latest changes don’t get picked up
unless I “make clean” first…

(The changes I’m making are for allowing SDL to drive custom graphics
hardware)

Looks like the dependencies aren’t right. Change a file that you know
isn’t being taken into account, then use “make -p 2>&1 | less” and
meditate on the output. Look for the rules that create files you’d like
updated (like libsdl.so, for example) and meditate some more.
--
Pierre Phaneuf
Ludus Design, http://ludusdesign.com/

Sam Lantinga wrote:

Automake madness:
rm -vf `find src -name '*.la'`

automake is a general, all-around madness if you ask me… ;)

http://www.tip.net.au/~millerp/rmch/recu-make-cons-harm.html
--
Pierre Phaneuf
Ludus Design, http://ludusdesign.com/

Pierre Phaneuf writes:

| Sam Lantinga wrote:
|
| > Pierre, your basic complaint is that X does not allow you the precise
| > control over which pixmaps are put into video memory and which are not.
| > This is unfortunately a limitation of the X server, and though I have
| > sent a message to the XFree86 development list, I do not expect this
| > to change.
|
| While I would like precise control over which pixmaps are put into
| video memory, this is not exactly what I expected. I actually expected
| no control over this.
|
| What I did expect is that the X server would use the large amount
| of video memory I have to do intelligent caching. If I XCopyArea a 600k
| pixmap as quickly as the X server can do it, without ever touching the
| pixmap itself, I would expect the X server to stick it in video memory!
|
| Now what happens is one of two things. Either:
|
| - a bunch of smaller pixmaps blitted “every once in a while” (probably
| the icons in the dock of my window manager or something like that, which
| are blitted only when they get an Expose event or at some similarly
| extremely slow rate) occupy the video memory, preventing my large pixmap,
| blitted a large number of times per second, from being accelerated.
|
| Or:
|
| - there is about 6 or 7 megabytes of expensive SDRAM on my expensive
| video card that are just gathering electronic dust. If I ever hear
| that’s the case, I might become a very infuriated and motivated XFree86
| developer.
|
| So the choices I see are “dumb” and “dumber”. Not exactly cool. I’m
| only expecting “okay” behavior from my X server, not total control over
| framebuffer memory usage! Proof of this is that I don’t even care for
| DGA; I’d rather the people who should know better, the guys who wrote
| the driver for my video card, decide how best to use it.

The problem here is the way XFree does pixmap caching. When
starting the server you get something like this:

    Setting up tile and stipple cache:
            32 128x128 slots
            32 256x256 slots
            16 512x512 slots

This cache is placed “under” your screen in memory, with the same width
as the screen, as many accelerators can’t work with multiple strides
at the same time.

You can view the contents at least with 3.9 servers by adding
Option “ShowCache” to the device section in XF86Config. Then switch
to a lower resolution with ctrl-alt-+/- and move the screen downwards.
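
The relevant XF86Config fragment looks something like this (the
Identifier is whatever your card’s Device section already uses):

    Section "Device"
        Identifier "My Video Card"
        Option     "ShowCache"
    EndSection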

If I remember correctly, 3.3 servers don’t do any kind of pixmap
caching, only patterns, tiles and stipples. 3.9 also caches pixmaps
to some extent, but with limitations. I don’t know the inner
workings of XAA too well (I’ve only worked on the driver side of
things), but I’d guess that the pixmap has to fit inside one cache
slot, etc.

Good heuristics are pretty hard to find, but for example at the
moment my cache seems to contain all the pixmaps wmmoon in my
WindowMaker dock would use. That includes 60 pixmaps of the moon in
different phases, text labels, plus the font it uses. Pretty
efficient, for that application at least, I’d say.

With DGA2 in XFree86 3.9 you can use all normal Xlib routines, plus
in practice you get straight access to screen-to-screen blits and
rectangle fills via DGA itself.
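
From memory, the DGA 2.0 calls in question look roughly like this (treat
the exact signatures as guesses; mode setup, framebuffer mapping and
error handling are omitted):

    #include <X11/Xlib.h>
    #include <X11/extensions/xf86dga.h>

    void dga2_blits(Display *dpy, int screen)
    {
        /* Copy a 640x480 area from offscreen memory (laid out below
           the visible screen) to the top-left of the screen. */
        XDGACopyArea(dpy, screen, 0, 480, 640, 480, 0, 0);

        /* Fill a 640x480 rectangle with the card's solid-fill
           engine; the last argument is the fill color. */
        XDGAFillRectangle(dpy, screen, 0, 0, 640, 480, 0);
    }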

| > You might take a look at the new SDL framebuffer console driver, which
| > is in its infancy but has direct access to the entire video memory and
| > acceleration for supported video cards. Currently, both the 3Dfx and
| > Matrox cards are supported by the fbcon driver.
|
| I thought about this, but it isn’t to my liking. None of the stock
| distributions (for Intel machines) that I know of use fbcon by
| default. I don’t even know how to set up fbcon on my computer (okay, I
| didn’t even try), so expecting people who have trouble finding the
| Pause key on their keyboard to do so is maybe a lot to ask (for the
| record, the top question we get about Quadra is “What is DINPUT.DLL?”).
|
| Almost everyone that is going to play games on their Linux box has X
| working at least minimally. Our design philosophy for our next
| developments places a big emphasis on being well integrated with the
| regular user interface and windowing system. Having an optional fbcon
| driver that could yield even better performance is nice, but it is
| unacceptable to require such a thing as fbcon for our games.
|
| I’m still smarting from requiring Svgalib (which doesn’t work worth crap
| with newer video cards as delivered by most Linux distributions).

Actually, newer versions of SVGALib support, for example, Voodoo 3
and NVIDIA cards natively, and more via VESA, but acceleration is
only a wild dream there…

// Jarno