SDL_GPU.h

Hey everybody, time to update your SDL3 version to the latest github source code. Hint-hint, wink-wink.

Actually it looks like it was added about 13 days ago, but I missed it until now and haven’t seen any mention of it yet.
Don’t forget to check out SDL/test/testgpu_spinning_cube.c for a brief example.

Please feel free to share any early tips or tutorials if you create them. I’m really interested in what happens when the community has time to tinker.
Good luck!

Hi!
Do I understand correctly that this new SDL_GPU API is more of a wrapper for modern APIs like Metal/Vulkan/DirectX 12? And that for simple 3D it is still much easier to use OpenGL, as before?
I assume we need a step-by-step tutorial with a lot of comments, because that code looks like it was written for those who know Vulkan and not for old SDL users (S no longer stands for “simple” :sweat_smile:).

P.S. The test mentioned above is SDL/test/testgpu_spinning_cube.c in the libsdl-org/SDL repository on GitHub.

I will not, but I really want to post the Alucard “I am interested in this.” video from Castlevania: Symphony of the Night.

I am very much looking forward to the official SDL3 release!

Here’s my attempt at simplifying SDL/test/testgpu_simple_clear.c.

/* a (hopefully) simplified standalone version of SDL's testgpu_simple_clear.c program*/
#include <SDL3/SDL.h>

int windowW = 900;
int windowH = 900;

int main(int argc, char ** argv)
{
	SDL_Init(SDL_INIT_VIDEO);
	SDL_Window * win = SDL_CreateWindow("My GPU Test", windowW, windowH, 0);

	SDL_GPUShaderFormat supportFlags = SDL_GPU_SHADERFORMAT_SPIRV; 
	SDL_GPUDevice * gpuDevice = SDL_CreateGPUDevice(supportFlags, SDL_TRUE, NULL);
	SDL_ClaimWindowForGPUDevice(gpuDevice, win);
	
	int frameCount = 0;
	bool run = true;
	Uint64 startTick = SDL_GetTicks();
	while(run)
	{
		SDL_Event ev;
		while(SDL_PollEvent(&ev))
		{
			switch(ev.type)
			{
				case SDL_EVENT_KEY_DOWN:
					switch(ev.key.key)
					{
						case SDLK_ESCAPE:
							run = false;
							break;
					}
					break;
				case SDL_EVENT_QUIT:
					run = false;
					break;
			}
		}

		SDL_GPUCommandBuffer * cmdBuffer = SDL_AcquireGPUCommandBuffer(gpuDevice);
		if(cmdBuffer)
		{
			uint32_t w, h;
			SDL_GPUTexture * swapChainTexture = SDL_AcquireGPUSwapchainTexture(cmdBuffer, win, &w, &h);
			/* The swapchain texture can be NULL without that being an error, so only record a render pass when we actually got one. */
			if(swapChainTexture)
			{
				SDL_GPUColorTargetInfo colorInfo;
				SDL_zero(colorInfo);
				colorInfo.texture = swapChainTexture;
				colorInfo.clear_color.r = 0.1f;
				colorInfo.clear_color.g = 0.5f;
				colorInfo.clear_color.b = 0.1f;
				colorInfo.clear_color.a = 1.0f;
				colorInfo.load_op = SDL_GPU_LOADOP_CLEAR;
				colorInfo.store_op = SDL_GPU_STOREOP_STORE;
				SDL_GPURenderPass * renderPass = SDL_BeginGPURenderPass(cmdBuffer, &colorInfo, 1, NULL);
				SDL_EndGPURenderPass(renderPass);
			}
			SDL_SubmitGPUCommandBuffer(cmdBuffer);
			frameCount ++;
		}
		else
		{
			SDL_Log("Failed to acquire command buffer: %s", SDL_GetError());
			run = false;
		}
	}

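	Uint64 elapsedMs = SDL_GetTicks() - startTick;
	if(elapsedMs > 0) /* avoid dividing by zero if we quit immediately */
	{
		SDL_Log("FPS: %" SDL_PRIu64 " frames per sec", ((Uint64)frameCount * 1000) / elapsedMs);
	}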
	SDL_ReleaseWindowFromGPUDevice(gpuDevice, win);
	SDL_DestroyGPUDevice(gpuDevice);
	SDL_DestroyWindow(win);
	SDL_Quit();
}

There appears to be some default VSync state built into the current SDL_GPU libs. I don’t know whether there is a way to disable that for testing purposes.
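One quick way to confirm that presentation really is being throttled is to compare the measured average frame rate with the display's refresh rate. The sketch below is standalone C++ with no SDL calls. The helper names average_fps and looks_vsync_capped and the 5% tolerance are my own assumptions, and the refresh rate is assumed to be known in advance.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Average frames per second over an interval measured in milliseconds.
// Returns 0.0 when no time has elapsed, to avoid dividing by zero.
double average_fps(std::uint64_t frames, std::uint64_t elapsed_ms)
{
    if (elapsed_ms == 0) {
        return 0.0;
    }
    return static_cast<double>(frames) * 1000.0 / static_cast<double>(elapsed_ms);
}

// True when the measured rate sits within `tolerance` (a fraction, 0.05 = 5%)
// of the display refresh rate, which hints that vsync is limiting the loop.
// The tolerance value is an arbitrary assumption, not something SDL defines.
bool looks_vsync_capped(double fps, double refresh_hz, double tolerance = 0.05)
{
    assert(refresh_hz > 0.0);
    return std::fabs(fps - refresh_hz) <= refresh_hz * tolerance;
}
```

In the clear-screen program above, frameCount and the elapsed SDL_GetTicks() time would be the two inputs to average_fps. A result close to the monitor's refresh rate suggests the cap is active, while a result in the thousands suggests it is not.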

Check out the shell script at /SDL/test/testgpu/build-shaders.sh where you will find examples of building the shaders for your desired operating system. There may be additional programs that you need to install before running this script.

For instance, on an Ubuntu Linux machine we need the glslangValidator program, which can be installed with
$ sudo apt-get install glslang-tools

When the script is run, the cube.glsl file gets compiled and output to testgpu_spirv.h (or to your OS’s version of a compiled shader program).

We see this spirv.h header file included just like a normal header at the top of the testgpu_spinning_cube.c file, and then the data is utilized in the load_shader() function.
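If a shader built this way refuses to load, one cheap sanity check is the SPIR-V magic number: a SPIR-V module starts with the 32-bit word 0x07230203. The sketch below assumes the generated header exposes the shader as a byte array plus a length (names such as cube_vert_spv and cube_vert_spv_len appear in the cube code later in this thread). has_spirv_magic is a hypothetical helper of mine, not part of SDL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true when the buffer starts with the SPIR-V magic number 0x07230203.
// Both byte orders are accepted, since the magic word is what tells a consumer
// which endianness the module was written in.
bool has_spirv_magic(const unsigned char *data, std::size_t len)
{
    assert(data != nullptr || len == 0);
    if (len < 4) {
        return false;
    }
    const std::uint32_t little = static_cast<std::uint32_t>(data[0])
                               | (static_cast<std::uint32_t>(data[1]) << 8)
                               | (static_cast<std::uint32_t>(data[2]) << 16)
                               | (static_cast<std::uint32_t>(data[3]) << 24);
    const std::uint32_t big = (static_cast<std::uint32_t>(data[0]) << 24)
                            | (static_cast<std::uint32_t>(data[1]) << 16)
                            | (static_cast<std::uint32_t>(data[2]) << 8)
                            | static_cast<std::uint32_t>(data[3]);
    return little == 0x07230203u || big == 0x07230203u;
}
```

A call such as has_spirv_magic(cube_vert_spv, cube_vert_spv_len) before creating the shader would catch an empty or truncated header early. It does not validate the rest of the module.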

As of now there are 77 new structs and 85 new functions in the new GPU API. Not all of them will be used in normal operations. I think that the new SDL_GPU API is going to be about the same level of difficulty as learning OpenGL, yet benefit from Vulkan and other back ends being actively updated.

I don’t know currently if we should consider this to be in a beta testing stage or not. If that is the case, then I think it’s completely fine to start learning the basics and playing with it, but I don’t recommend plugging it straight into your production code just yet.

I don’t even want to guess how long before the LazyFoo tutorials might drop, but I do hope to see them some day.

I succeeded in a similar rewrite of the testgpu_spinning_cube.c file. The bad news is that I began building an OOP C++ framework around the code, so I technically added a bit of unnecessary bloat.
I had it down to 613 lines of C++ code and still showing the spinning cube.
I think you could get it down to about 500 lines if it were pure C code and you skipped rendering to an MSAA texture (set color_target.texture to the swapchain texture and remove nearly all references to MSAA).

I’m switching my focus to simplification of use rather than reduction in line length.
My next step is to separate the GPU setup and management into one class, and the cube/object local stuff to another class. That way I can define multiple objects with their own shaders/pipelines/vectors.

I’m also working on intermediate helper classes that will create the required structs with sane default values, but allow you to change those values before the final object is returned. (Factory classes).
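To make the helper-class idea concrete, here is a minimal sketch of what such a builder could look like. TextureDesc is a stand-in struct, not an SDL type, and TextureDescBuilder with its method names is purely hypothetical. A real version would fill an SDL_GPUTextureCreateInfo instead.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a create-info struct; NOT an SDL type.
struct TextureDesc {
    std::uint32_t width;
    std::uint32_t height;
    std::uint32_t layers;
    std::uint32_t mip_levels;
    std::uint32_t sample_count;
};

// Starts from sane defaults (one layer, one mip level, one sample) and lets
// the caller override selected fields before build() returns the struct.
class TextureDescBuilder {
public:
    TextureDescBuilder(std::uint32_t width, std::uint32_t height)
        : desc_{width, height, 1, 1, 1}
    {
        assert(width > 0 && height > 0);
    }

    TextureDescBuilder &samples(std::uint32_t count)
    {
        desc_.sample_count = count;
        return *this;
    }

    TextureDescBuilder &mipLevels(std::uint32_t levels)
    {
        desc_.mip_levels = levels;
        return *this;
    }

    // Returns a copy, so the builder can be reused for a similar texture.
    TextureDesc build() const { return desc_; }

private:
    TextureDesc desc_;
};
```

Usage would read something like TextureDescBuilder(800, 600).samples(4).build(), with everything not mentioned keeping its default. Returning the struct by value keeps ownership simple, since the builder holds no GPU resources itself.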

Here’s my version of SDL/test/testgpu_spinning_cube.c (some code was copy-pasted directly). I went ahead and removed MSAA to reduce the line count; it stands at 519 lines right now. I wanted to post this version before I start breaking the GPU and object code into separate classes.
Please note that I copied SDL/test/testgpu/cube_spirv.h into the project folder and put that folder on the include path (-I.) with this compile command:
g++ main.cpp -I. -lSDL3

/* A slightly smaller rewrite of SDL/test/testgpu_spinning_cube.c */
#ifndef SIMPLIFIEDSDLGPUCODE
#define SIMPLIFIEDSDLGPUCODE

#include <SDL3/SDL.h>
#include <cube_spirv.h>

typedef struct VertexData
{
	float x, y, z, red, green, blue;
} VertexData;
static const VertexData vertexData[] = {
	/* Front face. */
	/* Bottom left */
	{ -0.5,  0.5, -0.5, 1.0, 0.0, 0.0 }, /* red */
	{  0.5, -0.5, -0.5, 0.0, 0.0, 1.0 }, /* blue */
	{ -0.5, -0.5, -0.5, 0.0, 1.0, 0.0 }, /* green */

	/* Top right */
	{ -0.5, 0.5, -0.5, 1.0, 0.0, 0.0 }, /* red */
	{ 0.5,  0.5, -0.5, 1.0, 1.0, 0.0 }, /* yellow */
	{ 0.5, -0.5, -0.5, 0.0, 0.0, 1.0 }, /* blue */

	/* Left face */
	/* Bottom left */
	{ -0.5,  0.5,  0.5, 1.0, 1.0, 1.0 }, /* white */
	{ -0.5, -0.5, -0.5, 0.0, 1.0, 0.0 }, /* green */
	{ -0.5, -0.5,  0.5, 0.0, 1.0, 1.0 }, /* cyan */

	/* Top right */
	{ -0.5,  0.5,  0.5, 1.0, 1.0, 1.0 }, /* white */
	{ -0.5,  0.5, -0.5, 1.0, 0.0, 0.0 }, /* red */
	{ -0.5, -0.5, -0.5, 0.0, 1.0, 0.0 }, /* green */

	/* Top face */
	/* Bottom left */
	{ -0.5, 0.5,  0.5, 1.0, 1.0, 1.0 }, /* white */
	{  0.5, 0.5, -0.5, 1.0, 1.0, 0.0 }, /* yellow */
	{ -0.5, 0.5, -0.5, 1.0, 0.0, 0.0 }, /* red */

	/* Top right */
	{ -0.5, 0.5,  0.5, 1.0, 1.0, 1.0 }, /* white */
	{  0.5, 0.5,  0.5, 0.0, 0.0, 0.0 }, /* black */
	{  0.5, 0.5, -0.5, 1.0, 1.0, 0.0 }, /* yellow */

	/* Right face */
	/* Bottom left */
	{ 0.5,  0.5, -0.5, 1.0, 1.0, 0.0 }, /* yellow */
	{ 0.5, -0.5,  0.5, 1.0, 0.0, 1.0 }, /* magenta */
	{ 0.5, -0.5, -0.5, 0.0, 0.0, 1.0 }, /* blue */

	/* Top right */
	{ 0.5,  0.5, -0.5, 1.0, 1.0, 0.0 }, /* yellow */
	{ 0.5,  0.5,  0.5, 0.0, 0.0, 0.0 }, /* black */
	{ 0.5, -0.5,  0.5, 1.0, 0.0, 1.0 }, /* magenta */

	/* Back face */
	/* Bottom left */
	{  0.5,  0.5, 0.5, 0.0, 0.0, 0.0 }, /* black */
	{ -0.5, -0.5, 0.5, 0.0, 1.0, 1.0 }, /* cyan */
	{  0.5, -0.5, 0.5, 1.0, 0.0, 1.0 }, /* magenta */

	/* Top right */
	{  0.5,  0.5,  0.5, 0.0, 0.0, 0.0 }, /* black */
	{ -0.5,  0.5,  0.5, 1.0, 1.0, 1.0 }, /* white */
	{ -0.5, -0.5,  0.5, 0.0, 1.0, 1.0 }, /* cyan */

	/* Bottom face */
	/* Bottom left */
	{ -0.5, -0.5, -0.5, 0.0, 1.0, 0.0 }, /* green */
	{  0.5, -0.5,  0.5, 1.0, 0.0, 1.0 }, /* magenta */
	{ -0.5, -0.5,  0.5, 0.0, 1.0, 1.0 }, /* cyan */

	/* Top right */
	{ -0.5, -0.5, -0.5, 0.0, 1.0, 0.0 }, /* green */
	{  0.5, -0.5, -0.5, 0.0, 0.0, 1.0 }, /* blue */
	{  0.5, -0.5,  0.5, 1.0, 0.0, 1.0 } /* magenta */
};

class GPU
{
	public:
	GPU()
	{
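		initialized = false;
		win = NULL;
		gpuDevice = NULL;
		activeBuffer = NULL;
		transferBuffer = NULL;
		depthTexture = NULL;
		pipeline = NULL;
		frameCount = 0; // draw() increments this, so it must start from a known value

		sampleCount = SDL_GPU_SAMPLECOUNT_1;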
	}

	/*
	 * Simulates desktop's glRotatef. The matrix is returned in column-major
	 * order.
	 */
	void rotate_matrix(float angle, float x, float y, float z, float *r)
	{
		float radians, c, s, c1, u[3], length;
		int i, j;
	
		radians = angle * SDL_PI_F / 180.0f;
	
		c = SDL_cosf(radians);
		s = SDL_sinf(radians);
	
		c1 = 1.0f - SDL_cosf(radians);
	
		length = (float)SDL_sqrt(x * x + y * y + z * z);
	
		u[0] = x / length;
		u[1] = y / length;
		u[2] = z / length;
	
		for (i = 0; i < 16; i++) {
			r[i] = 0.0;
		}
	
		r[15] = 1.0;
	
		for (i = 0; i < 3; i++) {
			r[i * 4 + (i + 1) % 3] = u[(i + 2) % 3] * s;
			r[i * 4 + (i + 2) % 3] = -u[(i + 1) % 3] * s;
		}
	
		for (i = 0; i < 3; i++) {
			for (j = 0; j < 3; j++) {
				r[i * 4 + j] += c1 * u[i] * u[j] + (i == j ? c : 0.0f);
			}
		}
	}
	
	/*
	 * Simulates gluPerspectiveMatrix
	 */
	
	void perspective_matrix(float fovy, float aspect, float znear, float zfar, float *r)
	{
		int i;
		float f;
	
		f = 1.0f/SDL_tanf(fovy * 0.5f);
	
		for (i = 0; i < 16; i++) {
			r[i] = 0.0;
		}
	
		r[0] = f / aspect;
		r[5] = f;
		r[10] = (znear + zfar) / (znear - zfar);
		r[11] = -1.0f;
		r[14] = (2.0f * znear * zfar) / (znear - zfar);
		r[15] = 0.0f;
	}
	
	/*
	 * Multiplies lhs by rhs and writes out to r. All matrices are 4x4 and column
	 * major. In-place multiplication is supported.
	 */
	void multiply_matrix(float *lhs, float *rhs, float *r)
	{
		int i, j, k;
		float tmp[16];
	
		for (i = 0; i < 4; i++) {
			for (j = 0; j < 4; j++) {
				tmp[j * 4 + i] = 0.0;
	
				for (k = 0; k < 4; k++) {
					tmp[j * 4 + i] += lhs[k * 4 + i] * rhs[j * 4 + k];
				}
			}
		}
	
		for (i = 0; i < 16; i++) {
			r[i] = tmp[i];
		}
	}

	SDL_GPUTexture * CreateDepthTexture()
	{
		SDL_GPUTextureCreateInfo depthTextureInfo;
		SDL_GPUTexture * result;
		SDL_zero(depthTextureInfo);
		depthTextureInfo.type = SDL_GPU_TEXTURETYPE_2D;
		depthTextureInfo.format = SDL_GPU_TEXTUREFORMAT_D16_UNORM;
		depthTextureInfo.width = windowW;
		depthTextureInfo.height = windowH;
		depthTextureInfo.layer_count_or_depth = 1; 
		depthTextureInfo.num_levels = 1;
		depthTextureInfo.sample_count = sampleCount;
		depthTextureInfo.usage = SDL_GPU_TEXTUREUSAGE_DEPTH_STENCIL_TARGET;
		depthTextureInfo.props = 0;
		result = SDL_CreateGPUTexture(gpuDevice, &depthTextureInfo);
		if(!result)
		{
			SDL_Log("Failed to create depth Texture: %s", SDL_GetError());
		}
		return result;
	}

	bool init(SDL_Window * targetWindow, const char * gpuName = NULL)
	{
		win = targetWindow;
		SDL_GetWindowSize(win, &windowW, &windowH);
		oldWindowW = windowW;
		oldWindowH = windowH;
		if(initialized)
		{
			// unclaim old window, close GPU
		}
		// Claim window
		SDL_GPUShaderFormat spirv = SDL_GPU_SHADERFORMAT_SPIRV;
		gpuDevice = SDL_CreateGPUDevice(spirv, SDL_TRUE, NULL);
		SDL_ClaimWindowForGPUDevice(gpuDevice, win);
		initialized = true;

		// Prep Shaders 
		// Vertex Shader
		SDL_GPUShaderCreateInfo shaderInfo;
		SDL_zero(shaderInfo);
		shaderInfo.num_samplers = 0;
		shaderInfo.num_storage_buffers = 0;
		shaderInfo.num_storage_textures = 0;
		
		shaderInfo.num_uniform_buffers = 1;
		shaderInfo.props = 0;

		shaderInfo.format = SDL_GPU_SHADERFORMAT_SPIRV;
		shaderInfo.code = cube_vert_spv;
		shaderInfo.code_size = cube_vert_spv_len;
		shaderInfo.entrypoint = "main";

		shaderInfo.stage = SDL_GPU_SHADERSTAGE_VERTEX;
		SDL_GPUShader * vertexShader = SDL_CreateGPUShader(gpuDevice, &shaderInfo);
		// Fragment Shader
		shaderInfo.num_uniform_buffers = 0;
		
		shaderInfo.code = cube_frag_spv;
		shaderInfo.code_size = cube_frag_spv_len;
		shaderInfo.stage = SDL_GPU_SHADERSTAGE_FRAGMENT;
		SDL_GPUShader * fragmentShader = SDL_CreateGPUShader(gpuDevice, &shaderInfo);
		if(!(fragmentShader && vertexShader))
		{
			SDL_Log("A shader has failed.");
			initialized = false;
			SDL_Log("Error check: %s", SDL_GetError());
			return false;
		}

		// Create Buffers
		SDL_GPUBufferCreateInfo bufferInfo;
		bufferInfo.usage = SDL_GPU_BUFFERUSAGE_VERTEX;
		bufferInfo.size = sizeof(vertexData);
		bufferInfo.props = 0;
		activeBuffer = SDL_CreateGPUBuffer(gpuDevice, &bufferInfo);
		SDL_SetGPUBufferName(gpuDevice, activeBuffer, "MyGPUBuffer");
		// Transfer Buffer
		SDL_GPUTransferBufferCreateInfo tbInfo;
		tbInfo.usage = SDL_GPU_TRANSFERBUFFERUSAGE_UPLOAD;
		tbInfo.size = sizeof(vertexData);
		tbInfo.props = 0;
		transferBuffer = SDL_CreateGPUTransferBuffer(gpuDevice, &tbInfo);
		if(!transferBuffer)
		{
			initialized = false;
			SDL_Log("Error check: %s", SDL_GetError());
			return false;
		}

		// Upload Static Data
		void * map = SDL_MapGPUTransferBuffer(gpuDevice, transferBuffer, SDL_FALSE);
		SDL_memcpy(map, vertexData, sizeof(vertexData));
		SDL_UnmapGPUTransferBuffer(gpuDevice, transferBuffer);
		
		SDL_GPUCommandBuffer *cmd = SDL_AcquireGPUCommandBuffer(gpuDevice);
		SDL_GPUCopyPass *cpass = SDL_BeginGPUCopyPass(cmd);
		SDL_GPUTransferBufferLocation loc;
		loc.transfer_buffer = transferBuffer;
		loc.offset = 0;
		SDL_GPUBufferRegion dest;
		dest.buffer = activeBuffer;
		dest.offset = 0;
		dest.size = sizeof(vertexData);
		SDL_UploadToGPUBuffer(cpass, &loc, &dest, SDL_FALSE);
		SDL_EndGPUCopyPass(cpass);
		SDL_SubmitGPUCommandBuffer(cmd);

		SDL_ReleaseGPUTransferBuffer(gpuDevice, transferBuffer);
		
		sampleCount = SDL_GPU_SAMPLECOUNT_1;

		// Graphics Pipeline setup
		SDL_GPUColorTargetDescription tempColor;
		SDL_GPUGraphicsPipelineCreateInfo pipelineInfo;

		SDL_zero(pipelineInfo);
		SDL_zero(tempColor);

		tempColor.format = SDL_GetGPUSwapchainTextureFormat(gpuDevice, win);

		pipelineInfo.target_info.num_color_targets = 1;
		pipelineInfo.target_info.color_target_descriptions = &tempColor;
		pipelineInfo.target_info.depth_stencil_format = SDL_GPU_TEXTUREFORMAT_D16_UNORM;
		pipelineInfo.target_info.has_depth_stencil_target = SDL_TRUE;

		pipelineInfo.depth_stencil_state.enable_depth_test = 1;
		pipelineInfo.depth_stencil_state.enable_depth_write = 1;
		pipelineInfo.depth_stencil_state.compare_op = SDL_GPU_COMPAREOP_LESS_OR_EQUAL;

		pipelineInfo.multisample_state.sample_count = sampleCount;

		pipelineInfo.primitive_type = SDL_GPU_PRIMITIVETYPE_TRIANGLELIST;

		pipelineInfo.vertex_shader = vertexShader;
		pipelineInfo.fragment_shader = fragmentShader;

		vertexBinding.index = 0;
		vertexBinding.input_rate = SDL_GPU_VERTEXINPUTRATE_VERTEX;
		vertexBinding.instance_step_rate = 0;
		vertexBinding.pitch = sizeof(VertexData);

		SDL_GPUVertexAttribute vertexAttributes[2];
		vertexAttributes[0].binding_index = 0;
		vertexAttributes[0].format = SDL_GPU_VERTEXELEMENTFORMAT_FLOAT3;
		vertexAttributes[0].location = 0;
		vertexAttributes[0].offset = 0;

		vertexAttributes[1].binding_index = 0;
		vertexAttributes[1].format = SDL_GPU_VERTEXELEMENTFORMAT_FLOAT3;
		vertexAttributes[1].location = 1;
		vertexAttributes[1].offset = sizeof(float) * 3;

		pipelineInfo.vertex_input_state.num_vertex_bindings = 1;
		pipelineInfo.vertex_input_state.vertex_bindings = &vertexBinding;
		pipelineInfo.vertex_input_state.num_vertex_attributes = 2;
		pipelineInfo.vertex_input_state.vertex_attributes = (SDL_GPUVertexAttribute*) &vertexAttributes;

		pipelineInfo.props = 0;

		pipeline = SDL_CreateGPUGraphicsPipeline(gpuDevice, &pipelineInfo);
		if(!pipeline)
		{
			SDL_Log("Error creating pipeline: %s", SDL_GetError());
			return false;
		}
		SDL_ReleaseGPUShader(gpuDevice, vertexShader);
		SDL_ReleaseGPUShader(gpuDevice, fragmentShader);

		SDL_GetWindowSizeInPixels(win, &windowW, &windowH);

		// depthTexture info
		depthTexture = CreateDepthTexture();

		angleX = 10;
		angleY = 20;
		angleZ = 30;
		return true;
	}

	void close()
	{
		SDL_ReleaseGPUTexture(gpuDevice, depthTexture);
		SDL_ReleaseWindowFromGPUDevice(gpuDevice, win);
		SDL_ReleaseGPUBuffer(gpuDevice, activeBuffer);
		SDL_ReleaseGPUGraphicsPipeline(gpuDevice, pipeline);

		SDL_DestroyGPUDevice(gpuDevice);
		gpuDevice = NULL;
	}

	void resize()
	{
		if(depthTexture)
		{
			SDL_ReleaseGPUTexture(gpuDevice, depthTexture);
		}
		// The depth texture must match the swapchain size, which is measured in pixels.
		SDL_GetWindowSizeInPixels(win, &windowW, &windowH);
		depthTexture = CreateDepthTexture();
	}

	void draw()
	{
		SDL_GPUCommandBuffer * cmdBuffer;
		cmdBuffer = SDL_AcquireGPUCommandBuffer(gpuDevice);
		if(cmdBuffer)
		{
			swapchainTexture = SDL_AcquireGPUSwapchainTexture(cmdBuffer, win, (unsigned int *) &windowW, (unsigned int *) &windowH);
			if(swapchainTexture)
			{
				// double check window is correct size
				if(oldWindowW != windowW || oldWindowH != windowH)
				{
					resize();
				}
				oldWindowW = windowW;
				oldWindowH = windowH;

				float matrix_rotate[16], matrix_modelview[16], matrix_perspective[16], matrix_final[16];
				rotate_matrix((float)angleX, 1.0f, 0.0f, 0.0f, matrix_modelview);
				rotate_matrix((float)angleZ, 0.0f, 1.0f, 0.0f, matrix_rotate);
				multiply_matrix(matrix_rotate, matrix_modelview, matrix_modelview);
		
				/* Pull the camera back from the cube */
				matrix_modelview[14] -= 2.5f;
		
				perspective_matrix(45.0f, (float)windowW/windowH, 0.01f, 100.0f, matrix_perspective);
				multiply_matrix(matrix_perspective, matrix_modelview, (float*) &matrix_final);
		
				angleX += 3;
				angleY += 2;
				angleZ += 1;
			
				if(angleX >= 360) angleX -= 360;
				if(angleX < 0) angleX += 360;
				if(angleY >= 360) angleY -= 360;
				if(angleY < 0) angleY += 360;
				if(angleZ >= 360) angleZ -= 360;
				if(angleZ < 0) angleZ += 360;

				SDL_zero(colorInfo);
				colorInfo.clear_color.a = 1.0f;
				colorInfo.clear_color.r = 0.5f;
				colorInfo.load_op = SDL_GPU_LOADOP_CLEAR;
				colorInfo.store_op = SDL_GPU_STOREOP_STORE;
				colorInfo.texture = swapchainTexture;

				SDL_GPUDepthStencilTargetInfo depthInfo;
				SDL_zero(depthInfo);
				depthInfo.clear_depth = 1.0f;
				depthInfo.load_op = SDL_GPU_LOADOP_CLEAR;
				depthInfo.store_op = SDL_GPU_STOREOP_DONT_CARE;
				depthInfo.texture = depthTexture;
				depthInfo.cycle = SDL_TRUE;

				SDL_GPUBufferBinding binding;
				binding.buffer = activeBuffer;
				binding.offset = 0;

				SDL_PushGPUVertexUniformData(cmdBuffer, 0, matrix_final, sizeof(matrix_final));

				renderPass = SDL_BeginGPURenderPass(cmdBuffer, &colorInfo, 1, &depthInfo);
				SDL_BindGPUGraphicsPipeline(renderPass, pipeline);
				SDL_BindGPUVertexBuffers(renderPass, 0, &binding, 1);
				SDL_DrawGPUPrimitives(renderPass, 36, 1, 0, 0);
				SDL_EndGPURenderPass(renderPass);
		
				frameCount ++;
			}
			SDL_SubmitGPUCommandBuffer(cmdBuffer);
		}
	}

	public:
	SDL_GPUDevice * gpuDevice;
	SDL_GPUGraphicsPipeline * pipeline;
	SDL_GPUSampleCount sampleCount;
	SDL_GPUBuffer * activeBuffer;
	SDL_GPUTransferBuffer * transferBuffer;
	SDL_GPUComputePipeline * computePipeline;
	
	//Initialization
	SDL_GPUVertexBinding vertexBinding;

	// Rendering
	SDL_GPURenderPass * renderPass;
	SDL_GPUColorTargetInfo colorInfo;

	// GPU Textures
	SDL_GPUTexture * depthTexture;
	SDL_GPUTexture * swapchainTexture;

	size_t frameCount;
	SDL_Window * win;
	int windowW, windowH, oldWindowW, oldWindowH;
	bool initialized;
	float angleX, angleY, angleZ;
};

#endif

int main()
{
	SDL_Init(SDL_INIT_VIDEO);
	SDL_Window * win = SDL_CreateWindow("My GPU", 800, 800, SDL_WINDOW_RESIZABLE);
	GPU myGPU;
	myGPU.windowW = 800;
	myGPU.windowH = 800;
	if(!myGPU.init(win, NULL))
	{
		SDL_Log("GPU setup failed, exiting.");
		SDL_DestroyWindow(win);
		SDL_Quit();
		return 1;
	}

	bool run = true;
	while(run)
	{
		SDL_Event ev;
		while(SDL_PollEvent(&ev))
		{
			switch(ev.type)
			{
				case SDL_EVENT_WINDOW_RESIZED:
					break;
				case SDL_EVENT_KEY_DOWN:
					switch(ev.key.key)
					{
						case SDLK_ESCAPE:
							run = false;
							break;
					}
					break;
				case SDL_EVENT_QUIT:
					run = false;
					break;
			}
		}
		myGPU.draw();
	}
	myGPU.close();
	SDL_DestroyWindow(win);
	SDL_Quit();
}

You probably will want MSAA later on for antialiasing. You will see the issue if you slow down the cube to a snail’s pace.

I just updated my copy of SDL3 for the week, and the code above no longer compiles because of some minor renames in the GPU API; for example, SDL_GPUVertexBinding is now SDL_GPUVertexBufferDescription.
(Also, SDL_TRUE and SDL_FALSE are no longer a thing at the moment.)

This is not a huge problem; fixing it is a simple refactor and a minor rewrite. Still, it is good confirmation that we are in an early-access or preview stage, and further changes can be expected.

Please have fun learning the new 3D API, it’s going to be worth it, but be ready for occasional frustration as the API itself is still going through its own growing pains.

I just downloaded and installed the latest sources. The demo with the cube is running. Linux Mint.
It’s amazing what you can do with SDL3.

.../SDL3/build/test/testgpu_spinning_cube

Is that what was promised here?

Yes, currently that is one of the only examples in existence of the new API in use. I’m trying to use the source code of that test program in order to learn the new API. Anyone with knowledge in Vulkan or one of the other 3D libraries will likely have an easier time, though.

Since the GPU API is so new, there is little explanation of how to use it. However, it is similar enough to Vulkan that if the SDL Wiki does not explain an object or function well enough, there is probably a Vulkan tutorial or Wikipedia article about a similar object or function that will help.

If you choose to play with the API, know this:
It takes a lot of lines of code to work with the GPU with the current SDL_GPU API (or with Vulkan) because no structures are created with default values. This means a large amount of data needs to be provided by you in order to initialize any struct.
This is a blessing and a curse; you have gained control over everything with the downside that you now need to know how to control everything.
I don’t want to even think about Mobile vs Desktop setups, that’s something to worry about after stable releases of the API come out.

There has been a lot of work going into the SDL3/examples/ folder for other topics, which makes me hopeful that we will see some GPU tutorials popping up there in a couple of months… or possibly some time before 2026?

Early into researching the set up for multiple objects I saw a recommendation that each object get its own pipeline.

Now I’ve come across this post, which indicates that while it is common to see one-pipeline-per-object in tutorials, it is not actually the recommended method.

I see the logic behind sharing pipelines between objects in order to save processing time (Flyweight pattern), but this also means that I need a way to group objects by pipeline in order to render them all while that pipeline is currently loaded.

I’m thinking of using a hash map such as this:

std::unordered_map <SDL_GPUGraphicsPipeline *, std::vector <MyObject *> > objectDirectory;

All objects that use a specific pipeline can be mapped to it, then iterated over in the MyGPU.RenderPresent() function.
The downside to this approach is that I wanted to have a draw function in the MyObject class rather than in my GPU class.
Instead I will have MyObject.show() and MyObject.hide().

Does anyone know a better way or the standard way to share pipelines?

Edit: This post does something similar, but without the unordered_map, they just maintain a couple of pipelines and keep std::vector(s) of those objects to render when the pipeline is loaded.
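For anyone who wants to experiment with the grouping idea, here is a minimal standalone sketch of the unordered_map approach. Pipeline and MyObject are stand-in structs (in real code the key would be an SDL_GPUGraphicsPipeline pointer), and group_by_pipeline and binds_per_frame are hypothetical helper names. The visible flag plays the role of show()/hide().

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Stand-ins for illustration only; NOT SDL types.
struct Pipeline { int id; };
struct MyObject { const Pipeline *pipeline; bool visible; };

using ObjectDirectory =
    std::unordered_map<const Pipeline *, std::vector<const MyObject *>>;

// Buckets the visible objects by the pipeline they draw with.
// The returned pointers refer into `objects`, so that vector must outlive
// the directory and must not be resized while the directory is in use.
ObjectDirectory group_by_pipeline(const std::vector<MyObject> &objects)
{
    ObjectDirectory directory;
    for (const MyObject &obj : objects) {
        assert(obj.pipeline != nullptr);
        if (obj.visible) {
            directory[obj.pipeline].push_back(&obj);
        }
    }
    return directory;
}

// Pipeline binds needed per frame when drawing bucket by bucket: one per
// bucket instead of one per object. Buckets are only created on insertion,
// so every bucket holds at least one object.
std::size_t binds_per_frame(const ObjectDirectory &directory)
{
    return directory.size();
}
```

A render function would then loop over the directory, bind each pipeline once, and draw every object in that bucket. Note that unordered_map iteration order is unspecified, so if draw order matters (for example, opaque before transparent) a small ordered list of pipelines may be the simpler choice.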

Here’s a Vulkan Series on Youtube I’m following to get caught up.

Unrelated to all this, your bio says:

Chose the name after running through a blender tutorial. There’s no change-name option here? …Life has consequences.

https://discourse.libsdl.org/u/guildeddoughnut/preferences/account then click the pencil icon next to your username.

(If it turns out I only have that pencil because I’m an admin here, I can definitely change it for you; DM me.)

Often you can do:

SDL_GPUStruct MyStruct;
SDL_zero(MyStruct);

…and then just fill in the non-zero parts you care about.
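A standalone sketch of the same zero-then-fill pattern, for anyone who wants to try it without SDL installed. ExampleCreateInfo is a stand-in struct, not a real SDL type, make_vertex_buffer_info is a hypothetical helper, and the usage value 1 is a made-up flag. std::memset is used here on the assumption that it has the same effect as SDL_zero on a plain struct.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for an SDL_GPU create-info struct; NOT a real SDL type.
struct ExampleCreateInfo {
    int usage;
    unsigned int size;
    unsigned int props;
};

// Zero the whole struct first, then set only the fields that matter.
ExampleCreateInfo make_vertex_buffer_info(unsigned int size_in_bytes)
{
    assert(size_in_bytes > 0);
    ExampleCreateInfo info;
    std::memset(&info, 0, sizeof(info)); // same idea as SDL_zero(info)
    info.usage = 1;                      // hypothetical "vertex buffer" flag
    info.size = size_in_bytes;
    return info;                         // props was never touched and stays 0
}
```

The benefit shows up when a struct gains a new field in a later SDL revision: code that zeroes first keeps passing a defined value for it, whereas code that assigns every field by hand leaves the new one uninitialized.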

There is a huge pile of GPU API examples over at TheSpydog’s GitHub, which we intend to migrate to SDL3’s examples directory. I started on that work, but other obligations (and the question of what to do with shader binaries) have slowed me down.

We are hoping to lock down the API in the next few days. We will still add new stuff at that point, but we won’t change or remove existing stuff. From there we’re motoring towards an official release.

(And we feel like the massive fundamental changes, like replacing SDL_bool with the standard C99/C++ bool type, are officially done now. May I not have to eat these words this week, though, haha.)

Thank you @icculus, that was a ton of clarification in a single post.

I wanted to try out the example above, but I noticed that there was a major change in SDL_gpu.h.

old:

extern SDL_DECLSPEC SDL_GPUTexture *SDLCALL SDL_AcquireGPUSwapchainTexture(
    SDL_GPUCommandBuffer *command_buffer,
    SDL_Window *window,
    Uint32 *w,
    Uint32 *h);

new:

extern SDL_DECLSPEC bool SDLCALL SDL_AcquireGPUSwapchainTexture(
    SDL_GPUCommandBuffer *command_buffer,
    SDL_Window *window,
    SDL_GPUTexture **swapchain_texture,
    Uint32 *swapchain_texture_width,
    Uint32 *swapchain_texture_height);

Here is the entire source for those who want to try it out with the current release.

// gcc main.c -o main -lSDL3

#include <SDL3/SDL.h>

int windowW = 900;
int windowH = 900;

int main(int argc, char ** argv)
{
	SDL_Init(SDL_INIT_VIDEO);
	SDL_Window * win = SDL_CreateWindow("My GPU Test", windowW, windowH, 0);

	SDL_GPUShaderFormat supportFlags = SDL_GPU_SHADERFORMAT_SPIRV; 
	SDL_GPUDevice * gpuDevice = SDL_CreateGPUDevice(supportFlags, true, NULL);
	SDL_ClaimWindowForGPUDevice(gpuDevice, win);
	
	int frameCount = 0;
	bool run = true;
	Uint64 startTick = SDL_GetTicks();
	while(run)
	{
		SDL_Event ev;
		while(SDL_PollEvent(&ev))
		{
			switch(ev.type)
			{
				case SDL_EVENT_KEY_DOWN:
					switch(ev.key.key)
					{
						case SDLK_ESCAPE:
							run = false;
							break;
					}
					break;
				case SDL_EVENT_QUIT:
					run = false;
					break;
			}
		}

		SDL_GPUCommandBuffer * cmdBuffer = SDL_AcquireGPUCommandBuffer(gpuDevice);
		if(cmdBuffer)
		{
			SDL_GPUTexture * swapChainTexture = NULL;
			if(!SDL_AcquireGPUSwapchainTexture(cmdBuffer, win, &swapChainTexture, NULL, NULL))
			{
				SDL_Log("Failed to acquire swapchain texture: %s", SDL_GetError());
			}

			/* Even on success the texture can be NULL, so only record a render pass when we actually got one. */
			if(swapChainTexture)
			{
				SDL_GPUColorTargetInfo colorInfo;
				SDL_zero(colorInfo);
				colorInfo.texture = swapChainTexture;
				colorInfo.clear_color.r = SDL_sin(frameCount / 100.0f) / 2 + 0.5f;
				colorInfo.clear_color.g = SDL_sin(frameCount / 120.0f) / 2 + 0.5f;
				colorInfo.clear_color.b = SDL_sin(frameCount / 133.0f) / 2 + 0.5f;
				colorInfo.clear_color.a = 1.0f;
				colorInfo.load_op = SDL_GPU_LOADOP_CLEAR;
				colorInfo.store_op = SDL_GPU_STOREOP_STORE;
				SDL_GPURenderPass * renderPass = SDL_BeginGPURenderPass(cmdBuffer, &colorInfo, 1, NULL);
				SDL_EndGPURenderPass(renderPass);
			}
			SDL_SubmitGPUCommandBuffer(cmdBuffer);
			frameCount ++;
		}
		else
		{
			SDL_Log("Failed to acquire command buffer: %s", SDL_GetError());
			run = false;
		}
	}

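	Uint64 elapsedMs = SDL_GetTicks() - startTick;
	if(elapsedMs > 0) /* avoid dividing by zero if we quit immediately */
	{
		SDL_Log("FPS: %" SDL_PRIu64 " frames per sec", ((Uint64)frameCount * 1000) / elapsedMs);
	}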
	SDL_ReleaseWindowFromGPUDevice(gpuDevice, win);
	SDL_DestroyGPUDevice(gpuDevice);
	SDL_DestroyWindow(win);
	SDL_Quit();
}

It was a lot easier to get up and running with this than when I attempted Vulkan! Still don’t have a good understanding of all the steps, but slowly learning.

Has anyone else noticed a significant delay before the screen updates? (4 frames/ticks, to be exact.)

The console on the left does a simple log whenever it notices input in the app_iterate function, and the cube is drawn in app_iterate as well. The first image is when input is first held; the second is when 4 frames have passed and input has already been released, but the cube still hasn’t moved. The last image is when the cube starts moving.

[Three screenshots: input first held; four frames later, input released and cube not yet moved; cube starts moving]

app_iterate looks like this: (pretty much the same as the skybox example from TheSpydog)

SDL_AppResult SDL_AppIterate(void *appstate)
{
	SDL_Log("%f",Movement.x);

	vec2 pos = (vec2){CamPos.x, CamPos.z};
	vec2 fwd = (vec2){-CamPos.x, -CamPos.z};
	vec2 rht = vec2_rot_90_ccw(fwd);
	pos = vec2_add(pos, vec2_scl(fwd, Movement.z * 0.1f));
	pos = vec2_add(pos, vec2_scl(rht, Movement.x * 0.1f));
	CamPos.x = pos.x;
	CamPos.z = pos.y;
	
	// Draw
	SDL_GPUCommandBuffer* cmdbuf = SDL_AcquireGPUCommandBuffer(VIEW->device);
	if (cmdbuf == NULL)
	{
		SDL_Log("AcquireGPUCommandBuffer failed: %s", SDL_GetError());
		return SDL_APP_FAILURE;
	}

	SDL_GPUTexture* swapchainTexture;
	if (!SDL_AcquireGPUSwapchainTexture(cmdbuf, VIEW->window, &swapchainTexture, NULL, NULL)) {
		SDL_Log("AcquireGPUSwapchainTexture failed: %s", SDL_GetError());
		return SDL_APP_FAILURE;
	}

	if (swapchainTexture != NULL)
	{
		Matrix4x4 proj = Matrix4x4_CreatePerspectiveFieldOfView(
			75.0f * SDL_PI_F / 180.0f,
			512.0f / 512.0f,
			0.01f,
			100.0f
		);
		
		Matrix4x4 view = Matrix4x4_CreateLookAt(
			CamPos,
			(vec3) { 0, 0, 0 },
			(vec3) { 0, 1, 0 }
		);

		Matrix4x4 viewproj = Matrix4x4_Multiply(view, proj);
	
		SDL_GPUColorTargetInfo colorTargetInfo = {
			.texture = swapchainTexture,
			.clear_color = (SDL_FColor){ 0.3f, 0.3f, 0.0f, 1.0f },
			.load_op = SDL_GPU_LOADOP_CLEAR,
			.store_op = SDL_GPU_STOREOP_STORE
		};

		SDL_PushGPUVertexUniformData(cmdbuf, 0, &viewproj, sizeof(viewproj));
		
		SDL_GPURenderPass* renderPass = SDL_BeginGPURenderPass(cmdbuf, &colorTargetInfo, 1, NULL);
		SDL_BindGPUGraphicsPipeline(renderPass, Pipeline);
		SDL_BindGPUVertexBuffers(renderPass, 0, &(SDL_GPUBufferBinding){ .buffer = VertexBuffer, .offset = 0}, 1);
		SDL_BindGPUIndexBuffer(renderPass, &(SDL_GPUBufferBinding){ IndexBuffer, 0 }, SDL_GPU_INDEXELEMENTSIZE_16BIT);
		SDL_DrawGPUIndexedPrimitives(renderPass, 36, 1, 0, 0, 0);
		SDL_EndGPURenderPass(renderPass);
		
	}

	SDL_SubmitGPUCommandBuffer(cmdbuf);

	return SDL_APP_CONTINUE;
}

I hope there’s a way to get this delay reduced. Probably I’m just missing something fundamental?

Edit: I can reduce the lag by using the MAILBOX or IMMEDIATE swapchain, but isn’t 4 frames of delay for normal “vsync” a bit too high?

I suppose the 4-frame delay corresponds to how many frames are in flight or stored in the swapchain. Is it possible to set this to a lower number, like 2 or even 1?
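To put numbers on that, here is a small worked calculation. It assumes each queued frame costs exactly one refresh interval, which is a simplification, and queue_latency_ms is a hypothetical helper name.

```cpp
#include <cassert>

// Delay in milliseconds between submitting a frame and seeing it on screen
// when `frames_queued` frames are waiting, each taking one refresh interval.
double queue_latency_ms(unsigned int frames_queued, double refresh_hz)
{
    assert(refresh_hz > 0.0);
    return static_cast<double>(frames_queued) * 1000.0 / refresh_hz;
}
```

Under that assumption, 4 queued frames on a 60 Hz display come to roughly 67 ms, 2 frames to roughly 33 ms, and 1 frame to roughly 17 ms, which is why the difference is easy to feel when moving a camera.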

One thing I’m still confused about – even after inspecting the readme, the Moonside blog post, and the SDL_gpu_examples repo – is how exactly shader workflows are supposed to work in SDL3. More specifically, what would a workflow look like if I want to update shader code at runtime? For example, could I write shaders in HLSL, then cross-compile and load them at runtime using the SDL Shadercross header on all platforms?

Also: has the original idea for SDL’s own shading language been ditched? In the GDC’23 talk you also mentioned your own cross platform byte code format. I guess that has been ditched, too?