On the SDL_GPU "Uniforms are best for pushing small amounts of data" advice

Hey folks, I’m a beginner when it comes to graphics programming, so I realize I might be asking the wrong question, but…

The docs say:

Uniforms are best for pushing small amounts of data. If you are pushing more than a matrix or two per call you should consider using a storage buffer instead.

I would like to know more about the technical reasons for this, as well as the actual size thresholds. A “matrix or two” is 128 bytes, right (two 4x4 float matrices at 64 bytes each)? So is anything more than that supposed to be passed via a storage buffer? What exactly will happen if I pass a larger amount of data? E.g. I want view, projection, inverse view and inverse projection matrices available in my shaders.
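To put numbers on that example, here is a rough sketch of the arithmetic. The `Mat4` and `CameraUniforms` types are made up for illustration (they are not SDL types), and the sketch assumes 32-bit floats with no padding between the matrices.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 4x4 float matrix: 16 floats, 64 bytes with 32-bit floats. */
typedef struct { float m[16]; } Mat4;

/* Hypothetical per-frame camera block holding the four matrices. */
typedef struct {
    Mat4 view;
    Mat4 projection;
    Mat4 inverse_view;
    Mat4 inverse_projection;
} CameraUniforms;

/* Size in bytes of `count` matrices. */
static size_t matrices_bytes(size_t count) {
    return count * sizeof(Mat4);
}

/* Sanity checks for the sizes discussed in the post. */
static void check_camera_sizes(void) {
    assert(sizeof(float) == 4);             /* assumption of this sketch */
    assert(matrices_bytes(2) == 128);       /* "a matrix or two" */
    assert(sizeof(CameraUniforms) == 256);  /* four matrices */
}
```

So the four camera matrices come to 256 bytes, twice the "matrix or two" figure from the docs.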

Another example: if I have an array of light data (colors/positions/directions/attenuations/etc.), should that already go the storage buffer route? (I assume yes, since they aren’t even changing that often).
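For a sense of scale, here is a sketch of how big such a light array might get. The `Light` layout (four vec4-sized fields) and the `MAX_LIGHTS` value of 64 are assumptions picked for illustration, not anything SDL prescribes.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LIGHTS 64  /* hypothetical upper bound */

/* Hypothetical light record; every field is padded to four floats so the
   layout stays simple on the shader side. */
typedef struct {
    float position[4];
    float direction[4];
    float color[4];
    float attenuation[4];
} Light;

/* Size in bytes of an array holding `count` lights. */
static size_t light_array_bytes(size_t count) {
    return count * sizeof(Light);
}

/* Sanity checks, assuming 32-bit floats. */
static void check_light_sizes(void) {
    assert(sizeof(Light) == 64);
    assert(light_array_bytes(MAX_LIGHTS) == 4096);
}
```

With these assumed numbers the array is 4096 bytes, which is 32 times the 128 bytes of "a matrix or two".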

Also, I was thinking it’s related to Vulkan push constants (which I don’t have much experience with either), which have a size limit, but looking at the SDL source code, I’m not seeing them being used for pushing vertex/fragment uniform data.

In general, some more clarity on this topic would be greatly appreciated!

SDL_PushGPUVertex/FragmentUniformData() seem to be meant to work like push constants, where the data is stored directly in the command buffer. With push constants, how much data can be stored is API- and implementation-dependent (the Vulkan spec only requires a minimum of 128 bytes!). Push constants are supposed to be an easy way to provide small values to shaders that change frequently, possibly every draw call.
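As a rough illustration of that limit, here is a small check of whether a payload would fit in push constants. The helper names are made up; the 128-byte figure is the Vulkan guaranteed minimum mentioned above, and in a real Vulkan program the actual limit would come from the device's `maxPushConstantsSize`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimum push constant capacity every Vulkan implementation must offer. */
#define GUARANTEED_PUSH_CONSTANT_BYTES 128u

/* True if the payload fits on any conforming Vulkan device. */
static bool fits_guaranteed_push_constants(size_t payload_bytes) {
    return payload_bytes <= GUARANTEED_PUSH_CONSTANT_BYTES;
}

/* True if the payload fits on a device reporting `device_limit` bytes
   (hypothetical helper; the limit would be queried from the device). */
static bool fits_push_constants(size_t payload_bytes, size_t device_limit) {
    return payload_bytes <= device_limit;
}

/* Two 4x4 float matrices fit everywhere; four are only safe on devices
   that report a larger limit. */
static void check_push_constant_examples(void) {
    assert(fits_guaranteed_push_constants(2 * 64));
    assert(!fits_guaranteed_push_constants(4 * 64));
    assert(fits_push_constants(4 * 64, 256));
}
```

This is presumably where the "matrix or two" rule of thumb comes from: 128 bytes is exactly two 4x4 float matrices.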

Interestingly, the actual SDL_GPU implementations for all current backends seem to instead use a pool of uniform buffers located in GPU memory, copying whatever you push to a buffer.
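A very simplified sketch of that idea follows. This is not SDL's actual code: the `UniformBuffer` type, the 32 KiB capacity and the 256-byte offset alignment are all assumptions chosen for illustration, and a plain array stands in for the CPU-writable GPU memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UNIFORM_BUFFER_SIZE 32768u  /* assumed capacity, illustrative only */
#define UNIFORM_ALIGNMENT   256u    /* assumed offset alignment (power of two) */

/* Hypothetical uniform buffer: a fixed block plus a bump-style write offset. */
typedef struct {
    uint8_t  memory[UNIFORM_BUFFER_SIZE];
    uint32_t write_offset;
} UniformBuffer;

/* Round `value` up to the next multiple of `alignment` (a power of two). */
static uint32_t align_up(uint32_t value, uint32_t alignment) {
    return (value + alignment - 1u) & ~(alignment - 1u);
}

/* Copy pushed data into the buffer and return the offset it was written at,
   or UINT32_MAX if it does not fit (a real pool would grab another buffer). */
static uint32_t uniform_buffer_push(UniformBuffer *ub, const void *data,
                                    uint32_t length) {
    uint32_t offset = ub->write_offset;
    if (length > UNIFORM_BUFFER_SIZE - offset) {
        return UINT32_MAX;
    }
    memcpy(ub->memory + offset, data, length);
    /* The next push starts at the next aligned offset. */
    ub->write_offset = align_up(offset + length, UNIFORM_ALIGNMENT);
    return offset;
}
```

Under these assumptions, every push consumes at least one aligned slot (256 bytes here) however small the data is, and large pushes use up the buffer quickly.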

This isn’t without limitations, however. While SDL_GPU creates the uniform buffers in GPU memory, they’re created in a part of GPU memory that can be directly written to by the CPU for fast and easy updating (no need for transfer buffers and copy passes). The problem is that there often isn’t much of this memory available, so you don’t want to fill it with stuff like light and material arrays, the projection matrix, etc.