Mattias Engdegård wrote:
Hmm, I didn’t benchmark XSun… Hmm, just did a “x11perf
-copypixwin500”, and I’m not sure… It looks fast (it is faster on the
Sun SparcStation 5 than on my overclocked Pentium 225 with Matrox G200),
but not by much (something like 20 MB/s vs 17 MB/s on my Pentium).
The old 143 MHz UltraSparc I’m sitting in front of does 298 copypixwin500/s,
almost 75 MB/s, but only 104 shmput500/s (both in 8bpp).
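As a sanity check on those figures, throughput is just the operation rate times the bytes moved per operation. A minimal sketch, assuming each operation covers a 500x500 pixel area at 8bpp (one byte per pixel) and that MB means 10^6 bytes; the function name is made up for illustration:

```c
/* Throughput in MB/s for `ops_per_sec` operations, each moving a square
 * `side` x `side` pixel area at `bytes_per_pixel` bytes per pixel.
 * Assumption: 1 MB = 10^6 bytes. */
static double throughput_mb_per_s(double ops_per_sec, int side,
                                  int bytes_per_pixel)
{
    double bytes_per_op = (double)side * side * bytes_per_pixel;
    return ops_per_sec * bytes_per_op / 1e6;
}
```

With these assumptions, throughput_mb_per_s(298, 500, 1) gives 74.5, which matches "almost 75 MB/s", and the 104 shmput500/s figure works out to 26 MB/s.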
Very reasonable. This looks to me like the pixmap is stored in main
memory and you are getting some kind of non-PIO transfer to the
framebuffer (bus master/DMA). Or maybe you just have a very good bus.
BTW, if your UltraSparc is old, what should I call my SparcStation 5?
Yes, it depends on the game. A tile-and-sprite-based game doesn’t need
to update its tiles and sprites all the time. You could upload a single
largish pixmap for each sprite that would contain all the animated
frames (and probably also upload its precalculated clip mask).
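To make the single-pixmap idea concrete, here is a rough sketch of how the source rectangle for one animation frame could be computed before copying it out of the big pixmap. The grid layout (frames of equal size, `cols` per row) and all names are assumptions for illustration, not something from the game discussed here:

```c
/* Source rectangle of one animation frame inside a larger sprite pixmap. */
struct frame_rect {
    int x, y, w, h;
};

/* Assumed layout: frames are numbered left to right, top to bottom,
 * `cols` frames per row, each `w` x `h` pixels. */
static struct frame_rect sprite_frame(int frame, int w, int h, int cols)
{
    struct frame_rect r;
    r.x = (frame % cols) * w;   /* column within the sheet */
    r.y = (frame / cols) * h;   /* row within the sheet */
    r.w = w;
    r.h = h;
    return r;
}
```

The resulting x, y, w and h would be what you pass as the source area of the copy; a clip mask laid out the same way could reuse the same rectangle.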
I actually tried that for a tile-and-sprite-based game, and got better
framerates doing it by hand. I think my RLE-encoded sprites were way faster
than the clip mask-based X11 blitting. And I can still do pixel effects.
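For readers who have not seen the technique, a by-hand RLE sprite blit can look roughly like this. It is only a sketch under assumptions: the encoding (pairs of skip/run counts followed by the opaque pixels), the 8bpp buffer and the function names are invented here and are not the actual format used in the game above. It also does no clipping and trusts the encoded data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical RLE layout for each sprite row, repeated until the row
 * is full:  [skip] [run] [run opaque pixel bytes...]
 * `skip` transparent pixels are jumped over, then `run` opaque pixels
 * are copied. One byte per pixel (8bpp). Returns the start of the
 * next encoded row. */
static const uint8_t *rle_blit_row(const uint8_t *src, uint8_t *dst, int width)
{
    int x = 0;
    while (x < width) {
        int skip = *src++;
        int run = *src++;
        x += skip;                          /* leave the background alone */
        memcpy(dst + x, src, (size_t)run);  /* copy the opaque run */
        src += run;
        x += run;
    }
    return src;
}

/* Blit a w x h RLE sprite into an 8bpp buffer `fb` whose rows are
 * `fb_pitch` bytes apart, with the top-left corner at (dst_x, dst_y). */
static void rle_blit(const uint8_t *rle, int w, int h,
                     uint8_t *fb, int fb_pitch, int dst_x, int dst_y)
{
    for (int y = 0; y < h; y++)
        rle = rle_blit_row(rle, fb + (size_t)(dst_y + y) * fb_pitch + dst_x, w);
}
```

The point of the encoding is that transparent pixels cost nothing per pixel: there is no mask test, only a pointer bump, and the opaque runs go through memcpy.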
Okay, maybe not for sprites, but background tiles could do well. Oh, but
then you have to XShmPutImage a sprite on the background, which wouldn’t
be okay; you want to merge the two beforehand. I know window-to-window
copies are very fast, so there is a lot to gain there: if you have any
repetitive image on the screen, blit it once through shared memory, then
XCopyArea it everywhere else.
Of course, if shared memory is faster, go for it!
No, they don’t saturate the bus. The problem is with wait states, I
think. While it doesn’t go much faster with DMA or bus mastering, you
can start calculating the next frame while it transfers, so you get a
better overall framerate. With a DirectX test we did, one of my friends
was getting nearly twice as many MB/s blitting from system memory to
video memory as I could get in Linux on a faster machine with a better
video card. Aww…
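The framerate argument can be put as a small back-of-the-envelope model. This is an idealized sketch, assuming the CPU is completely free during a DMA transfer and ignoring bus contention; the function names and any timings you plug in are illustrative, not measurements:

```c
/* Frames per second when the CPU computes a frame and then pushes it
 * to the framebuffer itself (PIO), one step after the other. */
static double fps_serial(double compute_ms, double transfer_ms)
{
    return 1000.0 / (compute_ms + transfer_ms);
}

/* Frames per second when the transfer of frame N runs in the background
 * (DMA or bus mastering) while the CPU computes frame N+1. The slower
 * of the two steps sets the pace. */
static double fps_overlapped(double compute_ms, double transfer_ms)
{
    double slowest = compute_ms > transfer_ms ? compute_ms : transfer_ms;
    return 1000.0 / slowest;
}
```

With 10 ms of computing and 10 ms of transfer per frame, the serial version gives 50 frames per second and the overlapped one gives 100, even though the transfer itself is no faster.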
Very interesting! X11 should use that a lot more. There should be no reason
XShmPutImage has to use PIO — it could use the same bus mastering/DMA
as DirectX, and you can even ask for an event to be sent upon completion.
Barring any DMA-related memory limitation or stuff like that, this
should work. I think this is the big problem, but I don’t know enough
about bus transfers. But an asynchronous DMA transfer that would trigger
an event upon completion (as XCopyArea and XShmPutImage both can already
do) would be sweet.
Does XFree86 4.0 do better in this regard? I have no modern PC to try it
on, alas.
I have no idea.
--
Pierre Phaneuf
Systems Exorcist