SDL slower with Hermes?

My big project right now with SDL is a small 3D rendering engine,
designed around portability rather than speed, but faster is better (of
course). I finally upgraded to a K6-2 recently and wanted to see just
how much of a speed improvement the Hermes MMX code would give. To my
surprise, my program was demonstrably slower! Where I could get up to
30 fps without it, my frame rate dropped to 26 with it. Most curious.
I don’t do a whole lot of blitting in my program, so I don’t see why my
frame rate should drop so much! This is true whether full screen or
windowed. Anyone got any explanations?

