Big performance differences between SDL 1.x and 2.03

My application (CAT3D) uses windows from 1800x1000 to 2500x1400 pixels, and for some interactive drawing the users complained that CAT3D based on SDL 1.x was much faster and nicer to use than the equivalent built on SDL 2.03.

Our implementation on SDL 1.x writes directly to a SDL_Surface.

The implementation on SDL 2.03 uses a system memory buffer where all the drawing is done before copying to an SDL_Texture and rendering (SDL_RenderCopy followed by SDL_RenderPresent).
The texture is created with SDL_PIXELFORMAT_ARGB8888 and SDL_TEXTUREACCESS_STREAMING. I am forced to keep a system memory buffer for SDL 2.03 because, as far as I know, a texture is a write-only region, and I need to query the content of any pixel of the window.

I started to investigate the image refresh time on both systems. The test draws 4096 random filled rectangles, and I found that SDL 1.x is three times faster than the SDL 2.03 solution.
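For readers who want to reproduce a test of this kind, here is a minimal sketch of the drawing half only, assuming a tightly packed 32-bit ARGB buffer in system memory. The names fill_rect and draw_random_rects are hypothetical and are not taken from CAT3D. The upload and present step (SDL_UpdateTexture or SDL_LockTexture, then SDL_RenderCopy and SDL_RenderPresent) would have to be timed around it separately.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Fill a rectangle in a tightly packed 32-bit ARGB buffer, clipping it to
   the buffer bounds. Returns the number of pixels actually written. */
long fill_rect(uint32_t *buf, int buf_w, int buf_h,
               int x, int y, int w, int h, uint32_t color)
{
    assert(buf != NULL && buf_w > 0 && buf_h > 0);

    /* Clip the rectangle to the buffer. */
    int x0 = x < 0 ? 0 : x;
    int y0 = y < 0 ? 0 : y;
    int x1 = x + w > buf_w ? buf_w : x + w;
    int y1 = y + h > buf_h ? buf_h : y + h;
    if (x0 >= x1 || y0 >= y1)
        return 0;                      /* nothing visible */

    for (int row = y0; row < y1; ++row) {
        uint32_t *p = buf + (size_t)row * (size_t)buf_w;
        for (int col = x0; col < x1; ++col)
            p[col] = color;
    }
    return (long)(x1 - x0) * (long)(y1 - y0);
}

/* Draw `count` pseudo-random opaque rectangles (4096 in the test described
   above). Returns the total number of pixels written. */
long draw_random_rects(uint32_t *buf, int buf_w, int buf_h,
                       int count, unsigned seed)
{
    long total = 0;
    srand(seed);
    for (int i = 0; i < count; ++i) {
        int x = rand() % buf_w;
        int y = rand() % buf_h;
        int w = 1 + rand() % buf_w;
        int h = 1 + rand() % buf_h;
        uint32_t color = 0xFF000000u | ((uint32_t)rand() & 0x00FFFFFFu);
        total += fill_rect(buf, buf_w, buf_h, x, y, w, h, color);
    }
    return total;
}
```

Timing this loop on its own, and then again with the texture upload included, would show how much of the 3x difference comes from drawing and how much from the transfer to the texture.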

Am I doing something wrong with SDL 2.03 ?

NOTE: I try to update only the region of the texture that is modified at each access, not the whole area, of course.

Thanks for any suggestions !

Armando
------------------------
Armando Alaminos Bouza

It would really help if we could actually see the code in question. It’s quite possible that you ported the SDL 1 code to SDL 2 in a highly inefficient way, but we can’t really know that, or know how to help you, without looking at it.

Mason

From: alabouza
To: sdl at lists.libsdl.org
Sent: Monday, December 14, 2015 1:22 PM
Subject: [SDL] Big performance differences between SDL 1.x and 2.03

If you’re not using the software renderer, it is going to be much slower. Under the OpenGL and Direct3D renderers, the entire memory buffer must be uploaded to the GPU every time you update. You might get better performance with those renderers by separating the buffer into small tiles (say 64x64 or 128x128), each backed by a streaming texture, and only updating the ones that have been modified, but even then it’s doubtful you’ll get the same performance as with the software renderer or SDL 1.2.
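The bookkeeping for the tile idea can be sketched as follows. This is only an illustration under stated assumptions: a 128x128 tile size, a fixed upper bound on the grid, and the hypothetical names TileGrid, tile_grid_init and tile_grid_mark. In a real program each dirty tile would then be uploaded to its own streaming texture, which is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TILE_SIZE   128   /* assumed tile edge in pixels */
#define MAX_TILES_X 32    /* assumed upper bounds for the grid */
#define MAX_TILES_Y 32

typedef struct {
    int width, height;        /* frame buffer size in pixels */
    int tiles_x, tiles_y;     /* number of tiles per axis */
    bool dirty[MAX_TILES_Y][MAX_TILES_X];
} TileGrid;

/* Set up a grid that covers a width x height frame buffer. */
void tile_grid_init(TileGrid *g, int width, int height)
{
    assert(g != NULL && width > 0 && height > 0);
    g->width = width;
    g->height = height;
    g->tiles_x = (width + TILE_SIZE - 1) / TILE_SIZE;
    g->tiles_y = (height + TILE_SIZE - 1) / TILE_SIZE;
    assert(g->tiles_x <= MAX_TILES_X && g->tiles_y <= MAX_TILES_Y);
    memset(g->dirty, 0, sizeof g->dirty);
}

/* Mark every tile overlapped by the rectangle (x, y, w, h).
   Returns how many tiles changed from clean to dirty. */
int tile_grid_mark(TileGrid *g, int x, int y, int w, int h)
{
    assert(g != NULL && w > 0 && h > 0);

    /* Clip to the frame buffer first so tile indices stay in range. */
    int x0 = x < 0 ? 0 : x;
    int y0 = y < 0 ? 0 : y;
    int x1 = x + w > g->width ? g->width : x + w;
    int y1 = y + h > g->height ? g->height : y + h;
    if (x0 >= x1 || y0 >= y1)
        return 0;

    int newly_dirty = 0;
    for (int ty = y0 / TILE_SIZE; ty <= (y1 - 1) / TILE_SIZE; ++ty) {
        for (int tx = x0 / TILE_SIZE; tx <= (x1 - 1) / TILE_SIZE; ++tx) {
            if (!g->dirty[ty][tx]) {
                g->dirty[ty][tx] = true;
                ++newly_dirty;
            }
        }
    }
    return newly_dirty;
}
```

For a 2500x1400 window this gives a 20x11 grid. A small brush stroke touches one or two tiles, while a change covering most of the window still marks most of the grid.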

Basically, try forcing the use of the software renderer (pass SDL_RENDERER_SOFTWARE as the only value for flags in SDL_CreateRenderer) and see if that improves anything.
------------------------
Nate Fries

Dear Nathaniel.

Thank you for your confirmation, I was afraid of something like that.

I have tried several approaches, such as using SDL_LockTexture with different versions of memcpy (the Intel fast implementation, my own SSE2 version, etc.), but never reached the old SDL 1.2.15 performance. I take care to update only the affected area of the texture.
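As an illustration of the partial update described here, the following sketch copies one rectangle of the system memory frame buffer into a locked destination, row by row. It assumes dst and dst_pitch are what SDL_LockTexture returns when given the same rectangle (a pointer to the first pixel of the locked area and the row length in bytes) and that pixels are 32-bit ARGB. The function name copy_region is hypothetical, and plain memcpy stands in for any tuned variant.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the w x h region at (x, y) of a src_w-pixel-wide frame buffer into
   `dst`, whose rows are dst_pitch bytes apart. The two pitches usually
   differ, so the copy has to go one row at a time. */
void copy_region(uint8_t *dst, int dst_pitch,
                 const uint32_t *src, int src_w,
                 int x, int y, int w, int h)
{
    assert(dst != NULL && src != NULL);
    assert(w > 0 && h > 0 && x >= 0 && y >= 0 && x + w <= src_w);
    assert(dst_pitch >= w * (int)sizeof(uint32_t));

    const size_t row_bytes = (size_t)w * sizeof(uint32_t);
    for (int row = 0; row < h; ++row) {
        const uint32_t *s = src + (size_t)(y + row) * (size_t)src_w + (size_t)x;
        memcpy(dst + (size_t)row * (size_t)dst_pitch, s, row_bytes);
    }
}
```

When the modified region is small, the cost of this copy is typically minor compared with the lock, unlock and present steps, which may be why swapping memcpy implementations made little difference.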

The problem is evident in a few situations when a “paintbrush” segmentation action by the user affects almost all the area, because while he/she draws in a horizontal plane the coronal and sagittal views are refreshed simultaneously to show the progress from all points of view. The texture refresh is forced at almost every pixel of mouse movement (I started imposing a minimum resolution for the additional view refreshes, and it helped).

The advantage of hardware or software rendering varies with the PC configuration: in some cases software was better, and in others hardware was.

Anyway, the application can read an .INI file with the preference (RENDERER = HARDWARE or RENDERER=SOFTWARE). This is important because some old configurations with XP or Server 2003 have no support for hardware rendering.
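A preference line of this kind could be parsed roughly as below. This is a sketch, not CAT3D's actual parser: the enum, the function name parse_renderer_line, and the choice to treat the key and values as case-sensitive are all assumptions. The result would then select SDL_RENDERER_SOFTWARE or SDL_RENDERER_ACCELERATED as the flags argument of SDL_CreateRenderer.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

typedef enum {
    RENDERER_UNSPECIFIED,   /* line is not a valid RENDERER setting */
    RENDERER_HARDWARE,
    RENDERER_SOFTWARE
} RendererPref;

/* Parse one line such as "RENDERER = HARDWARE" or "RENDERER=SOFTWARE".
   Whitespace around the key, the '=' and the value is ignored. */
RendererPref parse_renderer_line(const char *line)
{
    static const char key[] = "RENDERER";
    assert(line != NULL);

    while (isspace((unsigned char)*line)) ++line;
    if (strncmp(line, key, sizeof key - 1) != 0)
        return RENDERER_UNSPECIFIED;
    line += sizeof key - 1;

    while (isspace((unsigned char)*line)) ++line;
    if (*line != '=')
        return RENDERER_UNSPECIFIED;
    ++line;
    while (isspace((unsigned char)*line)) ++line;

    /* Length of the value without trailing whitespace or newline. */
    size_t len = strlen(line);
    while (len > 0 && isspace((unsigned char)line[len - 1])) --len;

    if (len == 8 && strncmp(line, "HARDWARE", 8) == 0)
        return RENDERER_HARDWARE;
    if (len == 8 && strncmp(line, "SOFTWARE", 8) == 0)
        return RENDERER_SOFTWARE;
    return RENDERER_UNSPECIFIED;
}
```

Returning RENDERER_UNSPECIFIED for anything unrecognized lets the caller fall back to a default, for example software rendering on the older systems mentioned above.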

Regards.
------------------------
Armando Alaminos Bouza

A brief consideration:

If, with the software renderer, texture locks guaranteed read-write access, there would be no need to create an additional frame buffer in system memory.
------------------------
Armando Alaminos Bouza

"The problem is evident in a few situations when a “paintbrush” segmentation action by the user affects almost all the area, because while he/she draws in a horizontal plane the coronal and sagittal views are refreshed simultaneously to show the progress from all points of view."
Cutting the framebuffer into smaller tiles as I suggested in my previous post is probably the simplest way to tackle this issue. Unfortunately this would require modification of your code to draw into the framebuffer as well.

Maybe with more information about what the application does, I (or someone else here) could come up with something more useful. I tried googling it last night and all I could find that might have been relevant was a single website in (I think) Italian describing an application for viewing CAT scan data, which doesn’t seem to be this application.
------------------------
Nate Fries

Dear Nathaniel. Thank you for your help and interest in my problem !

Our applications are for Radiotherapy planning (simulation of high energy photons and electrons in the tissues) and image guided neurosurgery. There is a brochure of CAT3D here http://www.slideshare.net/alabouza/cat3d-brief-presentation

The only complaint concerned the manual segmentation of anatomy, for the reasons previously explained, and only inside CAT3D.

The image at this link ( https://goo.gl/photos/JzAYmLGFik5fQnmY9 ) shows the paintbrush segmentation window. As the user drags the brush, the system reformats and shows the three orthogonal planes with the evolution of the drawing in all planes (I cut parts of the window to remove the patient name).

I made some changes to the code to mitigate the issue, and it now seems adequate. I had to introduce a minimum resolution for mouse movement to avoid continuous image refresh on all the planes. I also grouped several frame buffer modifications before the actual rendering, to avoid repeated access to the texture for small regions of the frame buffer.

It is not clear to me how a set of tiles covering the big window area would help. My problem is triggered by frequent modification of almost the whole window, so in those cases I would have to update and render most of the tiles at once.

Regards.
------------------------
Armando Alaminos Bouza

When you say paintbrush, it makes me think of an image editing program, not this type of program. My suggestion would probably still help if paintbrushes are used similarly in this program, since instead of uploading the entire framebuffer for each small mouse movement it would only upload a portion of it. It could be worth trying, but tbh your workaround is a much simpler change to make, so as long as it’s acceptable to your users it might not be worth the bother.
------------------------
Nate Fries