SDL_RenderPresent() / BitBlt() 16ms pause

Hi,
I have a large application using SDL2 running on Linux and Windows.
On Windows, close analysis of timings shows that the call to SDL_RenderPresent() occasionally takes 16ms rather than <= 4ms. This happens about 8% of the time. No other delays are ever seen (i.e. nothing between 5ms and 15ms).
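
For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of a per-millisecond histogram of call durations. The names PresentHistogram, histogram_add and histogram_print are hypothetical and not part of SDL. The SDL calls are only shown in the trailing comment, so the block builds without SDL. It assumes the durations come from SDL_GetPerformanceCounter() and SDL_GetPerformanceFrequency().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Durations of MAX_BUCKET_MS or more all land in the last bucket. */
#define MAX_BUCKET_MS 32

/* Hypothetical helper type: counts[i] holds calls that took [i, i+1) ms. */
typedef struct {
    unsigned long counts[MAX_BUCKET_MS + 1];
    unsigned long total;
} PresentHistogram;

/* Record one measured duration, given in performance-counter ticks. */
void histogram_add(PresentHistogram *h, unsigned long long ticks,
                   unsigned long long ticks_per_second)
{
    unsigned long long ms;
    assert(h != NULL && ticks_per_second > 0);
    ms = (ticks * 1000ULL) / ticks_per_second; /* whole ms, rounded down */
    if (ms > MAX_BUCKET_MS) {
        ms = MAX_BUCKET_MS;
    }
    h->counts[ms]++;
    h->total++;
}

/* Print only the buckets that received at least one sample. */
void histogram_print(const PresentHistogram *h)
{
    size_t i;
    assert(h != NULL);
    for (i = 0; i <= MAX_BUCKET_MS; i++) {
        if (h->counts[i] != 0) {
            printf("%2lu ms: %lu of %lu calls\n",
                   (unsigned long)i, h->counts[i], h->total);
        }
    }
}

/* Intended use around the call being measured (needs SDL2):
 *
 *   Uint64 t0 = SDL_GetPerformanceCounter();
 *   SDL_RenderPresent(renderer);
 *   Uint64 t1 = SDL_GetPerformanceCounter();
 *   histogram_add(&hist, t1 - t0, SDL_GetPerformanceFrequency());
 */
```

If the behaviour described above is present, a histogram like this should show samples only in the low buckets and around 16 ms, with nothing in between.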

(I was using 2.0.12, but have moved to hg 2.0.13 to include https://bugzilla.libsdl.org/show_bug.cgi?id=5171)

Compiling from source and including other timing checks in

video/windows/SDL_windowsframebuffer.c

the 16ms can be seen within

WIN_UpdateWindowFramebuffer()

and specifically the call to BitBlt()

so this would therefore seem to be a Windows issue.

Tests show similar issues with the direct3d11 driver, both with and without SDL_RENDERER_PRESENTVSYNC.

A similar problem appears to have been reported here: https://social.msdn.microsoft.com/Forums/windowsdesktop/en-US/2cbe4674-e744-41d6-bc61-3c8e381aa942/how-to-make-bitblt-faster-for-copying-screen?forum=windowsdirectshowdevelopment

As a (probably very dangerous) experiment I have replaced my SDL_RenderPresent() call with:

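/* SDL timer callback: runs on SDL's timer thread, not the main thread. */
Uint32 rpcallback(Uint32 interval, void *ref)
{
    SDL_Renderer *renderer = (SDL_Renderer *)ref;
    SDL_RenderPresent(renderer);
    return 0; /* returning 0 cancels the timer, so this fires only once */
}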

...

SDL_AddTimer(1, rpcallback, renderer);

and it does work, but seems unwise

Q1) Is this a viable solution? Thread safety would seem to be at risk.
Q2) The 16ms delay hasn’t disappeared completely; the AddTimer() call occasionally shows the same delay (0.1% of the time) [ irrelevant if Q1 is ill-advised ]
Q3) Is there some aspect of the app that is triggering this behaviour? Small stand-alone tests do not appear to show the problem.

Thanks

You are likely syncing to vblank, where ~16 milliseconds is 60fps.

You can create your SDL_Renderer without the SDL_RENDERER_PRESENTVSYNC flag, but some video drivers may still force vsync underneath SDL.

Hi, yes, tried with and without SDL_RENDERER_PRESENTVSYNC, but always get the delay. So it must be the lower layers. Is there any way around this?

(although what I don’t quite understand is why there’s never any other delays apart from a few ms or 16ms – why never 8ms? or 13ms?)

After a long test, timings are 150105 at 16ms and 9156668 at <4ms, i.e. 1:61.00
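
The arithmetic behind that ratio can be checked with a short sketch. The function names fast_per_slow and slow_fraction are hypothetical. The inputs are the two counts quoted above.

```c
#include <assert.h>

/* Number of fast calls observed per slow call. */
double fast_per_slow(unsigned long slow, unsigned long fast)
{
    assert(slow > 0);
    return (double)fast / (double)slow;
}

/* Fraction of all calls that were slow. */
double slow_fraction(unsigned long slow, unsigned long fast)
{
    assert(slow + fast > 0);
    return (double)slow / (double)(slow + fast);
}
```

With slow = 150105 and fast = 9156668, fast_per_slow() gives roughly 61.00 and slow_fraction() gives roughly 0.0161, i.e. about one slow call in every 62.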

Because 16ms is 60 FPS, likely the monitor’s refresh rate.

It could also be this SDL bug

Hi, vsync would make sense if the timings were distributed between 0 and 16ms, according to how busy my app was. But the timings I see are either 0ms or 16ms, with 16ms happening 1 in 62 times.

If for some reason it delays a monitor frame or waits for vsync, it will be 16ms. Maybe your app is running out of drawables in the swapchain. Depending on how SDL’s DirectX driver sets up the swapchain, the OS may make your app wait until another drawable is available, hence the 1 frame delay.

Hi, my drawable is actually just a single SDL_RenderCopy – all the work of the app is poking directly into the texture; there are no other SDL primitives being used. The 0/16ms timings I see are exactly caused by the call to BitBlt (I’ve put timers into the SDL code to see this). It seems so unlikely that the ratio of 16ms:0ms is also 1:61.

By drawable I mean the images in the swapchain, which the SDL DirectX backend must set up to actually show anything on screen.

Anyway, it’s weird that SDL is using a memory copy to get your app on the screen.

Hi, I really don’t understand what’s going on. If my app produces an image to display, I don’t get how the delay to render is either immediate or 16ms, and never, say, 10ms. If there’s a vsync every 16ms, then the delay should be randomly distributed over 0…16ms. There is no way that my app is ready at exactly the right time 98% of the time! And a full frame is ‘missed’ exactly 1 time in 62. (My app is not full screen, and actually has multiple SDL windows open.)
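
The expectation in this post can be written down as a small idealised model. It is only a sketch of the reasoning, not of what SDL or Windows actually does. It assumes refreshes fall on exact multiples of the period, and that every present blocks until the next one. The function name wait_for_next_refresh_us is hypothetical, and the 16667 microsecond period in the note below assumes a 60 Hz display.

```c
#include <assert.h>

/* Time in microseconds that a frame becoming ready at ready_us would wait,
 * if presentation always blocked until the next refresh boundary.
 * Assumes refreshes occur at exact multiples of period_us, starting at 0. */
unsigned long wait_for_next_refresh_us(unsigned long ready_us,
                                       unsigned long period_us)
{
    unsigned long phase;
    assert(period_us > 0);
    phase = ready_us % period_us; /* how far into the current refresh we are */
    return phase == 0 ? 0 : period_us - phase;
}
```

Under this model, frames ready at 4000, 10000 and 16000 microseconds would wait 12667, 6667 and 667 microseconds respectively. That is a spread of values, unlike the two-valued 0ms/16ms timings reported in this thread.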

I think the double- (or triple-) buffering which SDL uses can impact on this and possibly cause the effect you are seeing.

This is such an interesting problem. Looking around, I found someone else complaining about the same issue with BitBlt(), and they aren’t using SDL. The following is from 10 years ago, but it sounds like the same issue (I’m very skeptical of some of the answers, but some of them sound reasonable): How to make BitBlt faster (for copying screen)? This makes me think the only way SDL could “fix” the issue is by not using BitBlt().
[edit] aagh, i just noticed the link is the same one included in the original question. doh!

Wait, I’m just re-reading this…why are you in BitBlt() at all?

To be clear, SDL_RenderPresent() shouldn’t be using BitBlt() unless you’re on the software renderer. The Direct3D renderer should never be using that codepath.

Hi,
Sorry to confuse: my original tests and timings were around SDL_RenderPresent() in my main loop, and I saw the same 0ms/16ms effect regardless of the renderer, software or d3d11. In my further digging I was pursuing the ‘software’ path and found the delay could be narrowed down to the BitBlt() call.