SDL2 performance on Raspberry Pi

Even though they probably have one of the larger user bases on that distribution and architecture, they may not have the resources to tackle this issue. It probably requires tracking down bugs in LLVM, or compiling and testing newer versions, and LLVM is rather large.

Maybe downgrading to the old LLVM version would help? I have not tried that on Debian yet.

@rtrussell, @ChliHug,

I’m currently developing my app on Debian 9 Stretch on a laptop, and not a powerful one (MSI CR70).
Anyway, SDL2 runs in accelerated mode, so I have no performance issues to complain about in my particular case.
Are there simple tests I could run to give you some information, to check whether the problem really concerns Stretch or the Raspberry distro?

Here is the glxgears -info output:

Running synchronized to the vertical refresh.  The framerate should be
approximately the same as the monitor refresh rate.
GL_RENDERER   = Mesa DRI Intel(R) Haswell Mobile 
GL_VERSION    = 3.0 Mesa 13.0.6
GL_VENDOR     = Intel Open Source Technology Center
GL_EXTENSIONS = GL_ARB_multisample GL_EXT_abgr GL_EXT_bgra GL_EXT_blend_color   GL_EXT_blend_minmax GL_EXT_blend_subtract GL_EXT_copy_texture GL_EXT_polygon_offset GL_EXT_subtexture GL_EXT_texture_object GL_EXT_vertex_array GL_EXT_compiled_vertex_array GL_EXT_texture GL_EXT_texture3D GL_IBM_rasterpos_clip GL_ARB_point_parameters GL_EXT_draw_range_elements GL_EXT_packed_pixels GL_EXT_point_parameters GL_EXT_rescale_normal GL_EXT_separate_specular_color GL_EXT_texture_edge_clamp GL_SGIS_generate_mipmap GL_SGIS_texture_border_clamp GL_SGIS_texture_edge_clamp GL_SGIS_texture_lod GL_ARB_framebuffer_sRGB GL_ARB_multitexture GL_EXT_framebuffer_sRGB GL_IBM_multimode_draw_arrays GL_IBM_texture_mirrored_repeat GL_3DFX_texture_compression_FXT1 GL_ARB_texture_cube_map GL_ARB_texture_env_add GL_ARB_transpose_matrix GL_EXT_blend_func_separate GL_EXT_fog_coord GL_EXT_multi_draw_arrays GL_EXT_secondary_color GL_EXT_texture_env_add GL_EXT_texture_filter_anisotropic GL_EXT_texture_lod_bias GL_INGR_blend_func_separate GL_NV_blend_square GL_NV_light_max_exponent GL_NV_texgen_reflection GL_NV_texture_env_combine4 GL_S3_s3tc GL_SUN_multi_draw_arrays GL_ARB_texture_border_clamp GL_ARB_texture_compression GL_EXT_framebuffer_object GL_EXT_texture_compression_s3tc GL_EXT_texture_env_combine GL_EXT_texture_env_dot3 GL_MESA_window_pos GL_NV_packed_depth_stencil GL_NV_texture_rectangle GL_ARB_depth_texture GL_ARB_occlusion_query GL_ARB_shadow GL_ARB_texture_env_combine GL_ARB_texture_env_crossbar GL_ARB_texture_env_dot3 GL_ARB_texture_mirrored_repeat GL_ARB_window_pos GL_EXT_stencil_two_side GL_EXT_texture_cube_map GL_NV_depth_clamp GL_APPLE_packed_pixels GL_APPLE_vertex_array_object GL_ARB_draw_buffers GL_ARB_fragment_program GL_ARB_fragment_shader GL_ARB_shader_objects GL_ARB_vertex_program GL_ARB_vertex_shader GL_ATI_draw_buffers GL_ATI_texture_env_combine3 GL_ATI_texture_float GL_EXT_shadow_funcs GL_EXT_stencil_wrap GL_MESA_pack_invert GL_NV_primitive_restart GL_ARB_depth_clamp 
GL_ARB_fragment_program_shadow GL_ARB_half_float_pixel GL_ARB_occlusion_query2 GL_ARB_point_sprite GL_ARB_shading_language_100 GL_ARB_sync GL_ARB_texture_non_power_of_two GL_ARB_vertex_buffer_object GL_ATI_blend_equation_separate GL_EXT_blend_equation_separate GL_OES_read_format GL_ARB_color_buffer_float GL_ARB_pixel_buffer_object GL_ARB_texture_compression_rgtc GL_ARB_texture_float GL_ARB_texture_rectangle GL_EXT_packed_float GL_EXT_pixel_buffer_object GL_EXT_texture_compression_dxt1 GL_EXT_texture_compression_rgtc GL_EXT_texture_rectangle GL_EXT_texture_sRGB GL_EXT_texture_shared_exponent GL_ARB_framebuffer_object GL_EXT_framebuffer_blit GL_EXT_framebuffer_multisample GL_EXT_packed_depth_stencil GL_APPLE_object_purgeable GL_ARB_vertex_array_object GL_ATI_separate_stencil GL_EXT_draw_buffers2 GL_EXT_draw_instanced GL_EXT_gpu_program_parameters GL_EXT_texture_array GL_EXT_texture_integer GL_EXT_texture_sRGB_decode GL_EXT_timer_query GL_OES_EGL_image GL_AMD_performance_monitor GL_ARB_copy_buffer GL_ARB_depth_buffer_float GL_ARB_draw_instanced GL_ARB_half_float_vertex GL_ARB_instanced_arrays GL_ARB_map_buffer_range GL_ARB_texture_rg GL_ARB_texture_swizzle GL_ARB_vertex_array_bgra GL_EXT_texture_swizzle GL_EXT_vertex_array_bgra GL_NV_conditional_render GL_AMD_conservative_depth GL_AMD_draw_buffers_blend GL_AMD_seamless_cubemap_per_texture GL_ARB_ES2_compatibility GL_ARB_blend_func_extended GL_ARB_debug_output GL_ARB_draw_buffers_blend GL_ARB_draw_elements_base_vertex GL_ARB_explicit_attrib_location GL_ARB_fragment_coord_conventions GL_ARB_provoking_vertex GL_ARB_sample_shading GL_ARB_sampler_objects GL_ARB_seamless_cube_map GL_ARB_shader_texture_lod GL_ARB_texture_cube_map_array GL_ARB_texture_gather GL_ARB_texture_multisample GL_ARB_texture_query_lod GL_ARB_texture_rgb10_a2ui GL_ARB_uniform_buffer_object GL_ARB_vertex_type_2_10_10_10_rev GL_EXT_provoking_vertex GL_EXT_texture_snorm GL_MESA_texture_signed_rgba GL_NV_texture_barrier GL_ARB_get_program_binary 
GL_ARB_robustness GL_ARB_separate_shader_objects GL_ARB_shader_bit_encoding GL_ARB_texture_compression_bptc GL_ARB_timer_query GL_ARB_transform_feedback2 GL_ARB_transform_feedback3 GL_ANGLE_texture_compression_dxt3 GL_ANGLE_texture_compression_dxt5 GL_ARB_compressed_texture_pixel_storage GL_ARB_conservative_depth GL_ARB_internalformat_query GL_ARB_map_buffer_alignment GL_ARB_shader_atomic_counters GL_ARB_shader_image_load_store GL_ARB_shading_language_420pack GL_ARB_shading_language_packing GL_ARB_texture_storage GL_ARB_transform_feedback_instanced GL_EXT_framebuffer_multisample_blit_scaled GL_EXT_transform_feedback GL_AMD_shader_trinary_minmax GL_ARB_ES3_compatibility GL_ARB_arrays_of_arrays GL_ARB_clear_buffer_object GL_ARB_compute_shader GL_ARB_copy_image GL_ARB_explicit_uniform_location GL_ARB_framebuffer_no_attachments GL_ARB_invalidate_subdata GL_ARB_program_interface_query GL_ARB_robust_buffer_access_behavior GL_ARB_shader_image_size GL_ARB_shader_storage_buffer_object GL_ARB_stencil_texturing GL_ARB_texture_query_levels GL_ARB_texture_storage_multisample GL_ARB_texture_view GL_ARB_vertex_attrib_binding GL_KHR_debug GL_KHR_robustness GL_ARB_buffer_storage GL_ARB_clear_texture GL_ARB_internalformat_query2 GL_ARB_multi_bind GL_ARB_query_buffer_object GL_ARB_seamless_cubemap_per_texture GL_ARB_shader_draw_parameters GL_ARB_texture_mirror_clamp_to_edge GL_ARB_texture_stencil8 GL_ARB_vertex_type_10f_11f_11f_rev GL_EXT_shader_integer_mix GL_INTEL_performance_query GL_ARB_clip_control GL_ARB_conditional_render_inverted GL_ARB_cull_distance GL_ARB_derivative_control GL_ARB_get_texture_sub_image GL_ARB_pipeline_statistics_query GL_ARB_shader_texture_image_samples GL_ARB_texture_barrier GL_EXT_polygon_offset_clamp GL_KHR_blend_equation_advanced GL_KHR_context_flush_control GL_KHR_robust_buffer_access_behavior GL_ARB_shader_atomic_counter_ops GL_ARB_shader_clock GL_EXT_shader_samples_identical GL_MESA_shader_integer_functions 
VisualID 206, 0xce
378 frames in 5.0 seconds = 75.572 FPS
300 frames in 5.0 seconds = 59.980 FPS
300 frames in 5.0 seconds = 59.981 FPS

xvinfo:

X-Video Extension version 2.2
screen #0
Adaptor #0: "GLAMOR Textured Video"
number of ports: 16
port base: 117
operations supported: PutImage 
supported visuals:
  depth 24, visualID 0x21
number of attributes: 5
  "XV_BRIGHTNESS" (range -1000 to 1000)
          client settable attribute
          client gettable attribute (current value is 0)
  "XV_CONTRAST" (range -1000 to 1000)
          client settable attribute
          client gettable attribute (current value is 0)
  "XV_SATURATION" (range -1000 to 1000)
          client settable attribute
          client gettable attribute (current value is 0)
  "XV_HUE" (range -1000 to 1000)
          client settable attribute
          client gettable attribute (current value is 0)
  "XV_COLORSPACE" (range 0 to 1)
          client settable attribute
          client gettable attribute (current value is 0)
maximum XvImage size: 8192 x 8192
Number of image formats: 2
  id: 0x32315659 (YV12)
    guid: 59563132-0000-0010-8000-00aa00389b71
    bits per pixel: 12
    number of planes: 3
    type: YUV (planar)
  id: 0x30323449 (I420)
    guid: 49343230-0000-0010-8000-00aa00389b71
    bits per pixel: 12
    number of planes: 3
    type: YUV (planar)

Al

I too have seen an extreme performance drop from SDL1 to SDL2 on the Raspberry Pi. I’m using an Atari ST (a 4MHz M68k computer) emulator that can use either SDL1 or SDL2 (compile-time option), and I have not found any SDL compile-time option or anything like that which can make SDL2 as fast as SDL1.
I’ve tried both the old firmware-side driver and the new one from Anholt.

Oh, no. This is just a software renderer we’re talking about. Testing the issue probably requires building Mesa and/or LLVM which are rather large projects.

If you got one of the hardware accelerated renderers running, you’re all good.
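
For anyone who wants to script that distinction, here is a minimal sketch in Python. It assumes the software renderers can be recognised by a few substrings in the GL renderer string (llvmpipe, softpipe, swrast). That list is my assumption and may not be exhaustive. The helper names are made up for illustration.

```python
# Substrings that Mesa's software rasterizers usually put in the renderer
# string. This list is an assumption and may not be exhaustive.
SOFTWARE_MARKERS = ("llvmpipe", "softpipe", "swrast", "software rasterizer")


def is_software_renderer(renderer: str) -> bool:
    """Return True if the renderer string looks like a software renderer."""
    lowered = renderer.lower()
    return any(marker in lowered for marker in SOFTWARE_MARKERS)


def renderer_from_glxgears_info(output: str):
    """Extract the GL_RENDERER value from `glxgears -info` output, or None."""
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        # Only the line whose key is exactly GL_RENDERER is of interest.
        if sep and key.strip() == "GL_RENDERER":
            return value.strip()
    return None


if __name__ == "__main__":
    sample = (
        "GL_RENDERER   = Mesa DRI Intel(R) Haswell Mobile \n"
        "GL_VERSION    = 3.0 Mesa 13.0.6\n"
    )
    renderer = renderer_from_glxgears_info(sample)
    kind = "software" if is_software_renderer(renderer) else "hardware"
    print(renderer, "->", kind)
```

Fed the GL_RENDERER line posted above, this reports a hardware renderer, which matches the "all good" case.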

What interface does SDL1 use to show stuff on screen? /dev/fb0 or a dispmanx element? Such an emulator usually likes to access pixels directly and the current renderers are not optimized for that. OpenGL ES might just be the wrong thing here. A dumb KMS buffer could be better (something I was planning on testing), but that’s currently not implemented in the new KMSDRM driver in SDL2.

Good question, ChliHug. I don’t quite know, but I know for sure that it’s not dispmanx, as the emulator works out of the box on both Linux and OSX on other architectures. If you want, you can check it out yourself: https://hg.tuxfamily.org/mercurialroot/hatari/hatari

I also want to try out the new experimental vc4 driver but I only have a Raspbian Jessie. When I open raspi-config on my Jessie system, there is no “GL (Full KMS)” option under “Advanced Options/GL Driver”. Instead, it just asks me whether I want to enable or disable the experimental OpenGL driver. When I enable it, SDL_GetCurrentVideoDriver() still returns “x11” but it is noticeably faster. Does this mean that SDL is using the new vc4 driver now, or could it also be that it is using its standard OpenGL for X11 driver? Is there any way I can check that SDL is using the new vc4 driver?

That’s odd. Are you sure you have the latest version? dpkg -s raspi-config | grep Version shows “20170705” for me.
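
If you would rather compare the version in a script, a rough sketch in Python follows. It assumes the raspi-config version is a plain date-like number such as the “20170705” above. Other Debian version formats would need a real version comparison. The function names are hypothetical.

```python
def dpkg_version(status_output: str):
    """Return the Version field from `dpkg -s <package>` output, or None."""
    for line in status_output.splitlines():
        if line.startswith("Version:"):
            return line.split(":", 1)[1].strip()
    return None


def is_at_least(version, reference="20170705"):
    """Compare purely numeric versions; anything else counts as unknown (False)."""
    if version is None or not version.isdigit():
        return False
    return int(version) >= int(reference)


if __name__ == "__main__":
    # Sample text standing in for the output of: dpkg -s raspi-config
    sample = "Package: raspi-config\nStatus: install ok installed\nVersion: 20170705\n"
    version = dpkg_version(sample)
    print(version, "is recent enough:", is_at_least(version))
```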

X is probably using the vc4 driver. There are a few ways you can check things.

  • If the vc4 kernel module has been loaded: lsmod | grep vc4 should show a vc4 line.
  • If the vc4 driver is active: The file /dev/dri/card0 should exist.
  • If X found the card0 file: Run grep /dev/dri /var/log/Xorg.0.log and it should say whether it managed to open it or not.
  • What OpenGL renderer is being used: Run glxinfo | grep string and it should say “OpenGL renderer string: Gallium 0.4 on VC4 V3D 2.1”.
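
The checks in the list above can be bundled into one script. This is only a sketch in Python, assuming lsmod and glxinfo are installed and that the X log lives at /var/log/Xorg.0.log (the usual location, but an assumption here). The helper names are my own.

```python
import os
import re
import subprocess


def vc4_module_loaded(lsmod_output: str) -> bool:
    """True if some line of `lsmod` output starts with the module name vc4."""
    return any(line.split()[:1] == ["vc4"] for line in lsmod_output.splitlines())


def dri_lines(xorg_log: str):
    """Return the Xorg log lines that mention /dev/dri."""
    return [line for line in xorg_log.splitlines() if "/dev/dri" in line]


def opengl_renderer(glxinfo_output: str):
    """Return the value of 'OpenGL renderer string' from glxinfo, or None."""
    match = re.search(r"^OpenGL renderer string:\s*(.+)$", glxinfo_output, re.MULTILINE)
    return match.group(1).strip() if match else None


def run(cmd):
    """Run a command and return its stdout, or '' if it cannot be started."""
    try:
        return subprocess.run(cmd, capture_output=True, text=True, check=False).stdout
    except OSError:
        return ""


if __name__ == "__main__":
    print("vc4 module loaded:", vc4_module_loaded(run(["lsmod"])))
    print("/dev/dri/card0 exists:", os.path.exists("/dev/dri/card0"))
    try:
        # Assumed log path; adjust if your X server logs elsewhere.
        with open("/var/log/Xorg.0.log", errors="replace") as log:
            for line in dri_lines(log.read()):
                print("Xorg:", line.strip())
    except OSError:
        print("Xorg log not readable")
    print("OpenGL renderer:", opengl_renderer(run(["glxinfo"])))
```

The parsing functions take plain text, so they can also be tried on output pasted from another machine.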

You can’t directly test whether SDL is using it, because it goes through X11. The exception is creating your own OpenGL context, but that should report the same renderer string as above.

I can confirm that you will see the video driver as ‘x11’ and the renderer as ‘opengl’ irrespective of whether you are using the Mesa software driver or the VC4 accelerated driver. I don’t know of any direct way you can tell the difference via SDL; here all I see is a big increase in the rendering speed (much bigger on Raspbian Stretch than on Jessie).

Richard.

Sorry, I wasn’t running the latest version. I’ve updated to the latest Jessie version and I think SDL is using the vc4 driver now.

Please be aware that (unless it has been fixed very recently) enabling the VC4 driver in Jessie breaks sound output via the HDMI port. The sound is there, but it is grossly distorted. This is fixed in Stretch, but my understanding is that it won’t be fixed in Jessie. If you’re not using HDMI sound this probably isn’t an issue for you.

Richard.

Yes, I noticed that as well. Actually, I don’t get any sound via HDMI. It only works via jack audio out.