This isn’t specific to SDL, but rather about platform parity between
Windows and Linux. I’m really starting to get pissed off.
Take two computers:
Windows machine:
- Pentium 200 MMX
- 66 MHz system bus
- Voodoo Banshee 16 MB
- 96 MB of RAM
Linux machine:
- Pentium 225 MMX
- 75 MHz system bus
- Matrox Millennium G200 SD 8 MB (there is also a Voodoo2 12 MB, but it is not involved here)
- 96 MB of RAM
We ran some blitting tests (at 640x480, 16-bit), only to find out the
following: my machine can barely move around 30 megabytes per second,
while the Windows machine can do 474 freakin’ megabytes per second!!!
We’re talking 2D here. We ran the test in both fullscreen and windowed
modes on Windows, so telling me to use DGA isn’t going to cut it (and in
reality DGA isn’t faster, or only barely so, nowhere near 10-15 times
faster).
Obviously, X isn’t storing the pixmap in video memory (and using the
hardware blitter to copy it).
Also, note that the Windows machine still gets 56 megabytes per second
when blitting from a surface in system memory, which is almost twice as
fast as my machine. I suspect the X server is using memcpy() instead of
a bus-mastered or DMA transfer to do the blitting…
Oh yes, our test program is basically this: create a 640x480 Window and
two Pixmaps of the same size, one painted white and the other painted
black, then XCopyArea each of them in succession to the window. I have
to call XSync after each copy to avoid flooding the X server.
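For anyone wanting to reproduce the figures, here is a minimal sketch of how a MB/s number can be derived from such a timed copy loop. The function names are hypothetical (they are not from the actual test program), and it assumes 1 MB = 1,000,000 bytes and a depth that is a whole number of bytes per pixel.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes moved by one full-window copy at the given size and depth.
 * Assumes the depth is a multiple of 8 bits (true for 16-bit). */
size_t bytes_per_blit(unsigned width, unsigned height, unsigned bits_per_pixel)
{
    assert(bits_per_pixel % 8 == 0);
    return (size_t)width * height * (bits_per_pixel / 8);
}

/* Throughput in MB/s for `copies` blits completed in `elapsed_seconds`.
 * Assumption: 1 MB = 1,000,000 bytes. */
double blit_megabytes_per_second(size_t bytes_per_copy,
                                 unsigned long copies,
                                 double elapsed_seconds)
{
    assert(elapsed_seconds > 0.0);
    return (double)bytes_per_copy * (double)copies / elapsed_seconds / 1e6;
}
```

At 640x480 in 16-bit, each copy moves 614400 bytes, so 100 copies in 2 seconds would come out to roughly 30.7 MB/s.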
Raster (of Enlightenment fame) told me that my test ran around 210
megabytes per second (accelerated) on XFree86 4.0, but I’d like to know
what can be done for XFree86 3.3.x? Am I missing something big? What
about this, reported by my X server at startup (with “xaa_benchmark”):
(--) SVGA: Using XAA (XFree86 Acceleration Architecture)
(--) SVGA: XAA: Solid filled rectangles
(--) SVGA: XAA: Screen-to-screen copy
(--) SVGA: XAA: 8x8 color expand pattern fill
(--) SVGA: XAA: CPU to screen color expansion (TE/NonTE imagetext, TE/NonTE polytext)
(--) SVGA: XAA: Using 9 128x128 areas for pixmap caching
(--) SVGA: XAA: Caching tiles and stipples
(--) SVGA: XAA: General lines and segments
(--) SVGA: XAA: Dashed lines and segments
CPU to framebuffer                      45.71 Mpix/sec ( 91.42 MB/s)
10x1 solid rectangle fill               19.80 Mpix/sec ( 39.60 MB/s)
40x40 solid rectangle fill             190.37 Mpix/sec (380.74 MB/s)
400x400 solid rectangle fill           240.61 Mpix/sec (481.22 MB/s)
10x10 screen copy                       59.18 Mpix/sec (118.36 MB/s)
40x40 screen copy                      149.13 Mpix/sec (298.26 MB/s)
400x400 screen copy                    197.84 Mpix/sec (395.68 MB/s)
400x400 aligned screen copy (scroll)   200.39 Mpix/sec (400.78 MB/s)
10x10 8x8 color expand pattern fill    106.10 Mpix/sec (212.20 MB/s)
400x400 8x8 color expand pattern fill  243.43 Mpix/sec (486.86 MB/s)
10x10 CPU-to-screen color expand         5.36 Mpix/sec ( 10.72 MB/s)
416x400 CPU-to-screen color expand     235.11 Mpix/sec (470.22 MB/s)
10x10 screen-to-screen color expand     66.76 Mpix/sec (133.52 MB/s)
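To relate these benchmark figures to the blit test, here is a small sketch of the unit conversion. The MB/s column is exactly twice the Mpix/sec column, which implies 2 bytes per pixel (16-bit). The helper names are hypothetical, and 1 MB is assumed to mean 1,000,000 bytes.

```c
#include <assert.h>

/* Convert a Mpix/sec figure to MB/s at the given depth.
 * Assumes the depth is a multiple of 8 bits. */
double mpix_to_mbytes_per_second(double mpix_per_second,
                                 unsigned bits_per_pixel)
{
    assert(bits_per_pixel % 8 == 0);
    return mpix_per_second * (double)(bits_per_pixel / 8);
}

/* Number of full-window copies per second that a given MB/s rate allows.
 * Assumption: 1 MB = 1,000,000 bytes. */
double full_window_copies_per_second(double mbytes_per_second,
                                     unsigned width, unsigned height,
                                     unsigned bits_per_pixel)
{
    double bytes_per_copy =
        (double)width * (double)height * (double)(bits_per_pixel / 8);
    return mbytes_per_second * 1e6 / bytes_per_copy;
}
```

Under those assumptions, 30 MB/s is about 48 full 640x480 copies per second, while the 395.68 MB/s reported for the 400x400 screen copy would allow about 644.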
Where the f**k is that 395 MB/s I see for screen copy??? Or that 91 MB/s
for “CPU to framebuffer”??? If I can get one half of those numbers,
I’ll be a happy camper.
Anybody got an idea?

--
Pierre Phaneuf
Ludus Design, http://ludusdesign.com/
“First they ignore you. Then they laugh at you.
Then they fight you. Then you win.” -- Gandhi