How to access the data pointer on an SDL_RWops structure?

vanfanel · October 12, 2017, 10:55am

Hi,

I am trying to debug a program that has audio problems on ARM but not on X86_64 (https://github.com/cyxx/rawgl/issues/17).

One of its functions (Mixer_impl::playSoundWav in mixer.cpp) receives a data pointer that is wrapped in an SDL_RWops structure with:
SDL_RWops *rw = SDL_RWFromConstMem(data, size);
and then loaded into a Mix_Chunk with:
Mix_Chunk *chunk = Mix_LoadWAV_RW(rw, 1);
The input pointer “data” holds the same bytes on both architectures, BUT the Mix_Chunk data pointer (chunk->abuf) differs in some bytes!
So the problem must be in SDL_RWFromConstMem() or Mix_LoadWAV_RW(). To confirm, I need to see what is behind the data pointer inside the SDL_RWops rw, but the structure is confusing and, according to https://wiki.libsdl.org/SDL_RWops, the “unknown” union is platform specific.
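
To pin down exactly where two dumps of the same buffer diverge (for example chunk->abuf saved on each architecture), a small helper along these lines might be useful. It is only a sketch: the name first_diff is hypothetical, and it assumes both dumps are already in memory and have the same length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the index of the first byte at which
   a and b differ, or n if the first n bytes are identical. */
size_t first_diff(const unsigned char *a, const unsigned char *b, size_t n)
{
    size_t i;

    assert(n == 0 || (a != NULL && b != NULL));

    for (i = 0; i < n; i++) {
        if (a[i] != b[i])
            return i;
    }
    return n;
}
```

Calling it in a loop, restarting just past each returned index, would list every differing offset rather than only the first one.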

Also, did SDL_RWFromConstMem() change between SDL 2.0.4 and 2.0.5, which is when the audio problems started appearing on ARM?

The rawgl programmer, cyxx, seems to agree that something is wrong here with the current SDL_RWFromConstMem() implementation on ARM…

ChliHug · October 17, 2017, 10:08pm

The memory RWops always saves the pointer in the hidden.mem.base member, so you can simply access it through that. That said, the ConstMem RWops really should not change the data in there.

You can also get fancy by fishing for the offsets by creating a dummy SDL_RWops with magic values. You can then search the allocated union for the values.

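#include <SDL.h>

/* Arbitrary bit pattern; the uintptr_t cast keeps it well-defined on 32-bit targets. */
#define MAGIC_POINTER ((uintptr_t)0xE64B2A253897CE03ULL)

/* Finds the byte offset of the base pointer inside SDL_RWops.
   Returns 0 on success, 1 on failure. */
static int FindRWOffset(size_t * pointer)
{
	size_t i;
	size_t max_offset = sizeof(SDL_RWops) - sizeof(void *);
	void * test_pointer = (void *)MAGIC_POINTER;
	/* The dummy RWops is never read from, so the bogus pointer is not dereferenced. */
	SDL_RWops * rw = SDL_RWFromConstMem(test_pointer, 1);

	if (rw == NULL)
		return 1;

	/* Scan the structure byte by byte for the magic pointer value. */
	for (i = 0; i <= max_offset; i++) {
		if (SDL_memcmp(&test_pointer, (Uint8 *)rw + i, sizeof(void*)) == 0) {
			*pointer = i;
			break;
		}
	}

	SDL_RWclose(rw);

	if (i > max_offset)
		return 1;

	return 0;
}

int main(int argc, char * argv[])
{
	size_t offset;
	char str[] = "Found me!";
	SDL_RWops * rw = SDL_RWFromConstMem(str, sizeof(str));

	if (rw == NULL || FindRWOffset(&offset)) {
		SDL_Log("Could not find offset");
		if (rw != NULL)
			SDL_RWclose(rw);
		return 1;
	}

	/* size_t has no portable printf specifier in older C, so cast explicitly. */
	SDL_Log("Offset at %u", (unsigned int)offset);
	SDL_Log("%p %s", (void *)rw->hidden.mem.base, (const char *)rw->hidden.mem.base);
	SDL_Log("%p %s", *(void **)((Uint8 *)rw + offset), *(char **)((Uint8 *)rw + offset));

	SDL_RWclose(rw);

	return 0;
}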

The memory reading functions of RWops never changed.

Regarding the issue: I can’t see how the memory RWops could fail on ARM. The problem may be in SDL_mixer instead.

Is the sound data you’re having issues with and that is passed to Mix_LoadWAV_RW in the wave format or something else? Can you dump it into a file and upload it? I may be able to test it on a Raspberry Pi too.

I’m seeing weird behavior myself: on Linux x86 it doesn’t play some wave files (that was due to the files being extended wave files and SDL 2.0.5 not having support for that), and on Linux ARM Vorbis gets a bit distorted for some sample files.

vanfanel · October 18, 2017, 12:30am

Hi, ChliHug, and thanks for your response!

I uploaded an example you can easily test to my bug report on SDL’s bugzilla:
https://bugzilla.libsdl.org/show_bug.cgi?id=3876
It includes a wav file that reproduces the distorted sample on ARM only (not on X86_64).

It seems that Ryan has already reproduced the issue on ARM and is investigating.

ChliHug · October 18, 2017, 4:06am

Aha! It’s the SDL_Convert_F32_to_S16_Scalar function. Or… well… the resampler, depending on whether floating point values over 1.0 and under -1.0 are valid or not.

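It calculates *dst = (Sint16) (*src * 32767.0f); which can cause issues if *src is over 1.0 or under -1.0: the product then falls outside the Sint16 range, and converting an out-of-range floating point value to an integer type is undefined behavior in C, so different architectures are free to produce different results. The other conversion functions that don’t check this also produce garbage output.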
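
A minimal sketch of a range-checked scalar conversion, to illustrate the idea. This is not the actual patch or SDL’s code: the function name f32_to_s16_clamped is hypothetical, it uses plain stdint types instead of SDL’s, and clamping symmetrically to ±32767 is just one possible choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: converts n float samples to signed 16-bit,
   clamping inputs outside [-1.0, 1.0] instead of letting the cast overflow.
   NaN inputs are not handled in this sketch. */
void f32_to_s16_clamped(int16_t *dst, const float *src, size_t n)
{
    size_t i;

    assert(n == 0 || (dst != NULL && src != NULL));

    for (i = 0; i < n; i++) {
        const float sample = src[i];

        if (sample >= 1.0f) {
            dst[i] = 32767;
        } else if (sample <= -1.0f) {
            dst[i] = -32767;
        } else {
            /* In range: the product fits in int16_t, the cast truncates toward zero. */
            dst[i] = (int16_t)(sample * 32767.0f);
        }
    }
}
```

With the check in place, an input of 1.5 maps to 32767 on every architecture, where the unchecked cast may give different garbage on ARM and x86.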

SSE has these nice saturation opcodes. Does NEON have them too?

vanfanel · October 18, 2017, 9:50am

@ChliHug: Thanks for the patch on bugzilla, it works and no more distortion is heard with it!

As for NEON supporting saturation opcodes: according to the Arm Developer technical docs, these saturating opcodes exist:

| Mnemonic | Description | See |
|---|---|---|
| VQABS | Absolute value, saturate | V{Q}ABS and V{Q}NEG |
| VQADD | Add, saturate | V{Q}ADD, VADDL, VADDW, V{Q}SUB, VSUBL, and VSUBW |
| VQDMLAL, VQDMLSL | Saturating Doubling Multiply Accumulate, and Multiply Subtract | VQDMULL, VQDMLAL, and VQDMLSL (by vector or by scalar) |
| VQDMULL | Saturating Doubling Multiply | VQDMULL, VQDMLAL, and VQDMLSL (by vector or by scalar) |
| VQDMULH | Saturating Doubling Multiply returning High half | VQ{R}DMULH (by vector or by scalar) |
| VQMOV{U}N | Saturating Move (register) | VMOVL, V{Q}MOVN, VQMOVUN |
| VQNEG | Negate, saturate | V{Q}ABS and V{Q}NEG |
| VQRDMULH | Saturating Doubling Multiply returning High half | VQ{R}DMULH (by vector or by scalar) |
| VQRSHL | Shift Left, Round, saturate (by signed variable) | V{Q}{R}SHL (by signed variable) |
| VQRSHR{U}N | Shift Right, Round, saturate (by immediate) | VQ{R}SHR{U}N (by immediate) |
| VQSHL | Shift Left, saturate (by immediate) | VSHL, VQSHL, VQSHLU, and VSHLL (by immediate) |
| VQSHL | Shift Left, saturate (by signed variable) | V{Q}{R}SHL (by signed variable) |
| VQSHR{U}N | Shift Right, saturate (by immediate) | VQ{R}SHR{U}N (by immediate) |
| VQSUB | Subtract, saturate | V{Q}ADD, VADDL, VADDW, V{Q}SUB, VSUBL, and VSUBW |

I have never done any ARM assembly, and it has been 15 years since I did some Z80 assembly, so I am not sure I can help much with that optimization, but I think it should be doable, just as with SSE.
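
For what it is worth, the saturating narrow from the list (VQMOVN) is also reachable from C through NEON intrinsics, so no hand-written assembly would be needed. The following is only a rough sketch under assumptions: the function name f32_to_s16_sat is hypothetical, it is not SDL’s implementation, and on non-NEON builds it falls back to a scalar loop that clamps to the same limits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON_SKETCH 1
#endif

/* Hypothetical helper: float -> int16 conversion with saturation.
   Out-of-range inputs end up at 32767 or -32768. */
void f32_to_s16_sat(int16_t *dst, const float *src, size_t n)
{
    size_t i = 0;

    assert(n == 0 || (dst != NULL && src != NULL));

#ifdef HAVE_NEON_SKETCH
    /* Four samples per iteration. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t scaled = vmulq_n_f32(vld1q_f32(src + i), 32767.0f);
        int32x4_t wide = vcvtq_s32_f32(scaled);   /* float -> int32, truncating */
        vst1_s16(dst + i, vqmovn_s32(wide));      /* VQMOVN: narrow with saturation */
    }
#endif

    /* Scalar tail (or the whole buffer without NEON), same limits. */
    for (; i < n; i++) {
        const float scaled = src[i] * 32767.0f;

        if (scaled >= 32767.0f) {
            dst[i] = 32767;
        } else if (scaled <= -32768.0f) {
            dst[i] = -32768;
        } else {
            dst[i] = (int16_t)scaled;
        }
    }
}
```

The widening step matters here: the samples are first converted to 32-bit integers, where values such as 2.0 * 32767 still fit, and only then narrowed to 16 bits with saturation.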