Fellows, in the SDL sources I found this:
/*
 * Set up a blit between two surfaces -- split into three parts:
 * The upper part, SDL_UpperBlit(), performs clipping and rectangle
 * verification. The lower part is a pointer to a low level
 * accelerated blitting function.
 *
 * These parts are separated out and each used internally by this
 * library in the optimimum places. They are exported so that if
 * you know exactly what you are doing, you can optimize your code
 * by calling the one(s) you need.
 */
int SDL_LowerBlit (SDL_Surface *src, SDL_Rect *srcrect,
                   SDL_Surface *dst, SDL_Rect *dstrect)
{
(…)
IMHO: SDL_LowerBlit() is fast, but its parameters must already be clean (clipped and verified).
This is about eliminating the tiny overhead of clipping. It’s only relevant if
you’re doing thousands of blits per frame.
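To make that overhead concrete, here is a rough sketch of the kind of clipping that has to happen before a low-level blit can run safely. It is not SDL's actual code: `Rect` and `clip_blit()` are hypothetical stand-ins so the snippet compiles without SDL, and it assumes the source rectangle already lies inside the source surface.

```c
/* Hypothetical stand-in for SDL_Rect, so the sketch compiles without SDL. */
typedef struct { int x, y, w, h; } Rect;

/*
 * Clip a blit against the destination bounds (dst_w x dst_h), shifting the
 * source rectangle by whatever gets cut off at the left/top edge.
 * Returns 1 if anything is left to draw, 0 if the blit is fully clipped.
 * Assumption: src already lies inside the source surface.
 */
static int clip_blit(Rect *src, Rect *dst, int dst_w, int dst_h)
{
    dst->w = src->w;
    dst->h = src->h;

    /* Cut off the part hanging over the left and top edges. */
    if (dst->x < 0) { src->x -= dst->x; dst->w += dst->x; dst->x = 0; }
    if (dst->y < 0) { src->y -= dst->y; dst->h += dst->y; dst->y = 0; }

    /* Cut off the part hanging over the right and bottom edges. */
    if (dst->x + dst->w > dst_w) dst->w = dst_w - dst->x;
    if (dst->y + dst->h > dst_h) dst->h = dst_h - dst->y;

    src->w = dst->w;
    src->h = dst->h;
    return dst->w > 0 && dst->h > 0;
}
```

For a 32x32 sprite placed at (-8, 470) on a 640x480 target, this leaves a 24x10 region starting at source offset (8, 0). That is a handful of compares and adds per blit, which is why skipping it only pays off at very high blit counts.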
Another idea: my hardware is fixed (Atom, i386-compatible), so the
SDL_BlitSurface() function could be rewritten in assembly:
Memory block copies (the scanlines of a surface blit) are trivial for the
compiler to optimize, so I doubt you’ll improve the bandwidth by using asm.
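As an illustration, the inner part of an opaque blit with matching pixel formats boils down to something like the sketch below. The function name `blit_rows()` and its parameters are made up for this example and are not SDL's internals; the point is only that each scanline is a single `memcpy()`, which compilers and C libraries already handle well.

```c
#include <stdint.h>
#include <string.h>

/*
 * Copy a w x h block of pixels, one scanline at a time.
 * Pitches are in bytes and may be larger than w * bpp (row padding).
 * No clipping is done here: the caller must pass a valid rectangle.
 */
static void blit_rows(uint8_t *dst, size_t dst_pitch,
                      const uint8_t *src, size_t src_pitch,
                      size_t w, size_t h, size_t bpp)
{
    size_t row_bytes = w * bpp;

    for (size_t y = 0; y < h; ++y) {
        memcpy(dst, src, row_bytes);  /* one scanline */
        dst += dst_pitch;
        src += src_pitch;
    }
}
```

With a loop this simple, the time goes into moving bytes to the destination memory, not into executing instructions, so hand-written assembly has little left to win.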
The real problem here is that PC hardware, since the days of Pentium or so, is
not designed for software rendering. The expansion slot bus (ISA, VLB, PCI,
AGP, etc…) forms a serious bottleneck between the CPU and the VRAM, as the
chipsets are really designed to use those busses for DMA.
Now, you might think that an integrated shared memory video solution would
eliminate this problem, making VRAM as fast as normal RAM, but no! It seems
like most of the time, the driver will point the CPU at the area where the
video chip maps “its” VRAM (which is the only way to access it “directly” with
a non-integrated video card), rather than directly to the physical RAM. Thus,
you’ll have both the bottleneck of the bus, and the slow-down caused by the
video chip fighting the CPU for RAM access while forwarding those "VRAM"
accesses.
You may be able to hack the driver to get it to tell you where the “VRAM
window” is, but that memory might be banked or interleaved in strange ways
that you can’t see when accessing through the video chip… Again, this is one
of those things that might be worth checking out if you’re coding for a
specific device, but this will be even more non-portable than hardware
scrolling.
//David Olofson - Developer, Artist, Open Source Advocate
.--- Games, examples, libraries, scripting, sound, music, graphics ---.
| http://olofson.net http://kobodeluxe.com http://audiality.org |
| http://eel.olofson.net http://zeespace.net http://reologica.se |
'---------------------------------------------------------------------'

On Monday 05 April 2010, at 16.12.41, Ricardo Leite wrote: