Inconsistent behaviour of SDL_RenderPresent with VSYNC on macOS/Metal

oviano · June 20, 2019, 6:45am

Perhaps I’m misunderstanding how this is meant to work but I thought that when v-sync is enabled the SDL_RenderPresent function should block waiting for the vertical refresh.

In practice, I find that to actually “wait” for the v-sync I need to issue another draw command after SDL_RenderPresent.

E.g. I get the following timings for each command:

user_interface.render(true): 15us
SDL_RenderPresent(renderer): 31us
SDL_SetRenderTarget(renderer, NULL): 2us
SDL_RenderDrawPoint(renderer, 0, 0): 15914us

user_interface.render(true): 17us
SDL_RenderPresent(renderer): 33us
SDL_SetRenderTarget(renderer, NULL): 4us
SDL_RenderDrawPoint(renderer, 0, 0): 16926us

slime · June 25, 2019, 11:04pm

Typically you’re meant to call SDL_RenderPresent after doing all rendering for the frame (and after making sure the active render target is set to NULL). I can imagine timings could be weird because that’s not the case in either of your examples.

oviano · June 26, 2019, 5:07am

Thanks Alex - but you’re incorrect and there is definitely a problem or non-standard behaviour.

The pseudo code I pasted has a call to draw the UI, but replace that with drawing a rectangle or something and you get the same result.

The issue is caused by the fact that SDL does not prepare the command encoder until the first draw (maybe this is an optimisation?), and it is only at this point that it waits for the v-sync.

Specifically, to “fix” this, you can add this to the end of METAL_RenderPresent:

METAL_ActivateRenderCommandEncoder(renderer, MTLLoadActionLoad);

This waits for the v-sync. There might be a reason you’re not doing this, but be aware that it makes the vsync/render-present idea inconsistent under METAL.

slime · June 29, 2019, 4:47pm

The optimization you pointed out is extremely common in most Metal codebases, especially for the backbuffer. Apple recommends delaying acquiring a drawable (backbuffer) for the current frame for as long as possible, to give previously submitted frames as much time as possible to complete.

https://developer.apple.com/library/archive/documentation/3DDrawing/Conceptual/MTLBestPracticesGuide/Drawables.html#//apple_ref/doc/uid/TP40016642-CH2-SW1

The concept of a frame is dependent on a Present call. The way your pseudocode is structured, your UI is drawn on frame A, that frame is submitted, and then you draw a point to the backbuffer on frame B (before any event processing, logic updates, etc), and then the whole thing repeats.

Do you still get inconsistent vsync timings if you restructure your code to make all drawing happen immediately before the Present call, instead of both before and after?

slime · June 29, 2019, 5:00pm

Also, to clarify - where the driver stalls on the CPU when vsync is enabled is really dependent on the backend, platform, etc.

In SDL’s Metal backend it will stall on the first draw because that’s where a drawable is acquired. That doesn’t mean the driver waits for vsync for the current frame at that point; it just needs to wait there for a previous frame’s backbuffer to complete rendering, since there’s basically a ring-buffer of backbuffers (Direct3D calls this a swap chain).
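To illustrate why a wait for a free backbuffer can look like a wait for vsync, here is a minimal sketch of a toy swap-chain model. It is not how Metal or SDL actually schedules drawables. It assumes (purely for illustration) that the display shows one queued frame per refresh and that a drawable becomes reusable at the vsync where it is shown. The function name `simulate_acquire_waits` and all of its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* First vsync strictly after time t, for a display refreshing every period_us. */
static int64_t next_vsync_after(int64_t t, int64_t period_us)
{
    return (t / period_us + 1) * period_us;
}

/*
 * Toy model (an assumption, not Metal's real behaviour):
 *  - there are n_drawables backbuffers used in ring order;
 *  - the display shows at most one presented frame per vsync;
 *  - a drawable is free again at the vsync where its frame is shown.
 * The CPU spends work_us per frame between acquiring and presenting.
 * wait_us[k] receives how long frame k stalled acquiring its drawable.
 */
void simulate_acquire_waits(int n_drawables, int64_t period_us, int64_t work_us,
                            int n_frames, int64_t *wait_us)
{
    enum { MAX_FRAMES = 64 };
    int64_t shown[MAX_FRAMES]; /* vsync time at which each frame is shown */
    int64_t now = 0;           /* CPU clock in microseconds */

    assert(n_drawables > 0 && period_us > 0 && work_us >= 0);
    assert(n_frames > 0 && n_frames <= MAX_FRAMES);

    for (int k = 0; k < n_frames; k++) {
        /* Acquire: reuse the drawable of frame k - n_drawables once it is free. */
        int64_t acquired = now;
        if (k >= n_drawables && shown[k - n_drawables] > now)
            acquired = shown[k - n_drawables];
        wait_us[k] = acquired - now;

        /* Encode the frame and present it. */
        now = acquired + work_us;

        /* The frame is shown at the next vsync, but never before the
           frame queued ahead of it has had its own refresh. */
        int64_t vs = next_vsync_after(now, period_us);
        if (k > 0 && shown[k - 1] + period_us > vs)
            vs = shown[k - 1] + period_us;
        shown[k] = vs;
    }
}
```

In this model, with 3 drawables, a 16667us refresh period and 50us of CPU work per frame, the first three acquisitions return immediately and every later one stalls for close to one refresh period. The stall is paced by the display, so it measures about the same as a vsync wait even though the CPU is actually waiting for a drawable to be released.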

In an OpenGL backend the driver often waits inside Present, but the concept of a swap chain is abstracted from the external OpenGL API so it’s really up to the driver, and some may wait for the next swap chain drawable to be available the first time a draw or clear operation happens on the backbuffer as well.

As long as you order your code so drawing to the screen happens at the end of the frame, it doesn’t really matter where the backend/driver waits for the next drawable.

oviano · June 29, 2019, 5:21pm

Thanks.

The pseudo code was just meant to show the type of call required to make it wait. If I take away the call to draw the point, it stalls instead on the next loop, at the first draw call inside drawing the UI. It is always the first draw after the present that blocks for 16.67ms (in the case of 60Hz), while all other calls are as good as instant, which led me to believe it must be waiting on the v-sync. With Direct3D 11 it is always RenderPresent that blocks. Only Metal was behaving differently.

Actually, what I was trying to achieve was smooth video playback where the video framerate matches the refresh rate. By knowing when the v-sync occurs, I can ensure video frames are scheduled to be presented midway between two successive v-syncs. This means small variations in the time taken to get the next frame, draw the UI, etc. don’t cause the video frame to miss the v-sync. That is what happens if I don’t do this and the frame presentation times happen to fall too close to the v-sync boundaries: you get judder caused by dropped or duplicated frames.
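The midpoint scheduling described above can be sketched roughly as follows. This is a minimal illustration, not my actual player code: it assumes a timer that was restarted at a known v-sync and a known refresh period, and the function name `delay_to_midpoint_us` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Given the time elapsed since a known v-sync (elapsed_us) and the
 * refresh period (period_us), return how long to wait so that the
 * present call lands midway between two successive v-syncs.
 * Assumption: the timer was restarted exactly at a v-sync.
 */
int64_t delay_to_midpoint_us(int64_t elapsed_us, int64_t period_us)
{
    assert(elapsed_us >= 0 && period_us > 0);

    int64_t phase = elapsed_us % period_us; /* position inside the current interval */
    int64_t midpoint = period_us / 2;

    if (phase <= midpoint)
        return midpoint - phase;            /* midpoint of this interval is still ahead */
    return period_us - phase + midpoint;    /* aim for the midpoint of the next interval */
}
```

For example, at 60Hz (a period of about 16667us), being 100us past the v-sync gives a delay of 8233us, while being 10000us past it gives 15000us, which targets the middle of the following interval.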

By adding the draw-point call after the present, I know that once it returns a v-sync has just completed, and I can calibrate things more easily: each loop starts at time zero after the last v-sync, and that is where I reset my timer. I guess I could instead reset the timer after the first draw in the next loop, but that might not always be in the same place, depending on what needed drawing for the UI, and it would then be wrong on, for example, D3D, where RenderPresent does indeed seem to block for the v-sync. In my local copy of SDL2 I activate the command encoder at the end of METAL_RenderPresent, which gives me consistent behaviour across platforms (and is what I envisaged SDL2 doing in the first place, though point totally taken re: Metal best practices).

Although from what you are saying, the delay doesn’t necessarily mean it waited for a v-sync? Which is odd, because it still fixes the issue I was having…