“More Tricks of the Game Programming Gurus” has a nice write-up of
how to use Duff’s device to unroll loops, and even covers optimizing
it a little.
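For anyone who hasn't seen the technique, here is a minimal sketch of it in C. It is only an illustration: copy_pixels_duff is a made-up name, not an SDL function, and the real blitters do more work per pixel than a plain byte copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy `count` bytes from src to dst, unrolled 8x with Duff's device.
 * copy_pixels_duff is a hypothetical name used for illustration only. */
static void copy_pixels_duff(uint8_t *dst, const uint8_t *src, size_t count)
{
    size_t n;

    assert(dst != NULL && src != NULL);
    if (count == 0) {
        return;             /* otherwise the do/while body would run once */
    }
    n = (count + 7) / 8;    /* number of passes through the unrolled body */
    switch (count % 8) {    /* jump into the body to handle the remainder */
    case 0: do { *dst++ = *src++;
    case 7:      *dst++ = *src++;
    case 6:      *dst++ = *src++;
    case 5:      *dst++ = *src++;
    case 4:      *dst++ = *src++;
    case 3:      *dst++ = *src++;
    case 2:      *dst++ = *src++;
    case 1:      *dst++ = *src++;
            } while (--n > 0);
    }
}
```

The switch jumps into the middle of the loop body to copy the leftover count % 8 bytes first, and every later pass copies a full 8 with a single loop test.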
Well, the latest CVS code has Duff-unrolled blitters. This increases the
code size, and usually isn’t much faster than the original blitters.
I am interested in performance feedback on the new blitters.
I found that in some cases Duff unrolling was slower than my optimized
blitters. The only thing I can think of is that the unrolling increases
the code size and can thrash the instruction cache.
Please let me know how they work for you!
I left the unrolling as a compile-time define option. If you want to try
without unrolling, edit src/video/SDL_blit.h and comment out the
#define USE_DUFFS_LOOP at the bottom.
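Roughly, a switch like this can be wired up as in the sketch below. Only the USE_DUFFS_LOOP name comes from the header; BLIT_LOOP and blit_row are hypothetical names for illustration, and the actual macros in SDL_blit.h may be laid out differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define USE_DUFFS_LOOP  /* comment this line out to get the plain loop */

#ifdef USE_DUFFS_LOOP
/* Run `op` exactly `width` times, unrolled 4x with Duff's device.
 * BLIT_LOOP is a hypothetical macro, not the one in SDL_blit.h. */
#define BLIT_LOOP(op, width)                        \
    do {                                            \
        int w_ = (width);                           \
        int n_ = (w_ + 3) / 4;                      \
        if (w_ > 0) {                               \
            switch (w_ & 3) {                       \
            case 0: do { op;                        \
            case 3:      op;                        \
            case 2:      op;                        \
            case 1:      op;                        \
                    } while (--n_ > 0);             \
            }                                       \
        }                                           \
    } while (0)
#else
/* Plain loop with the same interface, one `op` per iteration. */
#define BLIT_LOOP(op, width)                        \
    do {                                            \
        int n_;                                     \
        for (n_ = (width); n_ > 0; --n_) { op; }    \
    } while (0)
#endif

/* Copy one row of 8-bit pixels; blit_row is a hypothetical helper. */
static void blit_row(uint8_t *dst, const uint8_t *src, int width)
{
    assert(dst != NULL && src != NULL);
    BLIT_LOOP(*dst++ = *src++, width);
}
```

Both variants give the same result, so you can flip the define and time the same test case each way.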
-Sam Lantinga (slouken at devolution.com)
Lead Programmer, Loki Entertainment Software
--
“Any sufficiently advanced bug is indistinguishable from a feature”
-- Rich Kulawiec