How do you read binary files with varied content using RWops?

Suppose we have a struct with several primitive type fields, a pointer to a list of structs of another type, and a pointer to a list of texture handles. Now we need to load such an object from any stream with RWops. The stream holds the data in that order: first the dozen or so fields of various simple types, then the whole list of structures, and finally the list of textures.

You have to use SDL_RWread. How would you do it? There are three general ways:

  1. Read all fields without checking what SDL_RWread returns (YOLO).
  2. Load all fields, checking what SDL_RWread returns, and if there is a problem, clean up and stop further loading.
  3. The same as in point 2, plus additionally checking the correctness of the loaded data, i.e. not only the SDL_RWread result, but also whether the value written to each struct field falls within its valid range.

The easiest, fastest and at the same time the worst solution is the first one. We don’t check anything, we just load the data into the fields of the structure, without taking care of the completeness and correctness of the data. If the game crashes later, who cares.

The best solution is the third one, because it lets you detect not only a stream that runs out of data, but also values that are present yet make the object unusable. Unfortunately, this is the slowest solution, which in the case of larger files can have a negative impact on performance (e.g. longer game boot time).
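To make option 2 concrete, here is a minimal sketch of the "check every read, clean up on failure" pattern. Everything in it is an assumption for illustration: `MemStream` and `mem_read` are a stand-in I made up so the example compiles without SDL (`mem_read` has the same shape as SDL_RWread and likewise returns the number of whole objects read), and the `Font`/`Glyph` layout is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory stream; stands in for SDL_RWops in this sketch. */
typedef struct { const uint8_t *data; size_t size, pos; } MemStream;

/* Same shape as SDL_RWread: returns the number of whole objects read. */
static size_t mem_read(MemStream *s, void *ptr, size_t size, size_t maxnum)
{
    if (size == 0) return 0;
    size_t avail = (s->size - s->pos) / size;
    size_t n = maxnum < avail ? maxnum : avail;
    memcpy(ptr, s->data + s->pos, n * size);
    s->pos += n * size;
    return n;
}

typedef struct { uint8_t width, height; } Glyph;   /* hypothetical */

typedef struct {                                    /* hypothetical */
    uint16_t version;
    uint16_t glyph_count;
    Glyph   *glyphs;
} Font;

/* Returns 1 on success, 0 on failure. On failure nothing stays allocated
 * and *out is left untouched. Integers are read in host byte order here. */
static int font_load(MemStream *s, Font *out)
{
    Font f = {0};

    if (mem_read(s, &f.version, sizeof f.version, 1) != 1) goto fail;
    if (mem_read(s, &f.glyph_count, sizeof f.glyph_count, 1) != 1) goto fail;

    if (f.glyph_count > 0) {
        f.glyphs = malloc(f.glyph_count * sizeof *f.glyphs);
        if (!f.glyphs) goto fail;
        /* A short read on the list is treated the same as any other failure. */
        if (mem_read(s, f.glyphs, sizeof *f.glyphs, f.glyph_count)
                != f.glyph_count) goto fail;
    }

    *out = f;
    return 1;

fail:
    free(f.glyphs);   /* free(NULL) is a no-op */
    return 0;
}
```

Loading into a local `Font` and only copying it to `*out` on success means the caller never sees a half-filled object. Option 3 would add range checks (for example an upper bound on `glyph_count`) before the allocation.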

Second question — would you be reluctant to load the field values one at a time (one field equals one SDL_RWread call), or would you rather load as much data as possible at once (a whole block of data for all fields)?

I’d like to know your opinion on this. Do you prefer efficiency and no error handling, or is error handling important to you?

Since error handling is what keeps malicious users from being able to blow up your game, and also keeps you from shooting yourself in the foot later if the file format (or in-memory struct) changes, I’d say it’s pretty important.

As to the second question, as so often happens with game development, It Depends™. A modern operating system will load the data in chunks and keep it in a buffer anyway, so loading one field at a time isn’t gonna be dog slow like it was a long time ago (especially with SSDs), and it involves less memory wrangling if your in-memory struct doesn’t match the file format. However, if you’re doing asynchronous I/O it might be easier to load the file in chunks yourself and then assign to the struct fields from there.
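As a rough illustration of the "load a chunk, then assign to the struct fields" variant, the sketch below decodes a header from a byte buffer with explicit little-endian reads, so the file layout and the in-memory struct stay independent. The 8-byte header layout, `FontHeader`, and the helper names are all made up for this example; in real code the buffer would be filled by a single SDL_RWread call whose return value is checked first.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode little-endian integers from raw bytes, independent of host byte order. */
static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical on-disk header: u32 magic, u16 version, u16 glyph_count. */
enum { HEADER_SIZE = 8 };

/* In-memory form; field types and padding need not match the file. */
typedef struct {
    uint32_t magic;
    int      version;
    size_t   glyph_count;
} FontHeader;

/* Returns 1 if buf holds a complete header, 0 if it is too short. */
static int header_decode(const uint8_t *buf, size_t len, FontHeader *out)
{
    if (len < HEADER_SIZE) return 0;
    out->magic       = le32(buf);
    out->version     = le16(buf + 4);
    out->glyph_count = le16(buf + 6);
    return 1;
}
```

One read call and one length check cover the whole header, and changing the struct later (reordering fields, widening a type) does not change the file format.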

Either way, I’m generally not a fan of having the file format match the in-memory structs the game is actually using.


Thanks for the answer. For me, it is very important to detect a problem right away when loading data from a file and to stop loading, even if it is a critical error and the game has to report a problem and shut down. Unfortunately, I’ve seen a lot of code on the web where developers didn’t even check what SDL_RWread returns, let alone test the correctness of the values read from the stream. I hope these were just quick tests and not production code. :wink:

I asked about these things mainly because in my project I’m working on code that loads custom font data from binary files/streams. For each field of the structure I check whether the data has been read (I test the result of SDL_RWread and, on error, goto the cleanup code at the end of the routine), and if the read succeeded I also test the value against its allowed range.

I’ve written quite a few wrappers for this purpose. Each one takes the stream context, a pointer to the destination, and the minimum and maximum values allowed. It returns true if the data has been read and the value is valid; otherwise it returns false and the loading can be terminated.
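Wrappers of the kind described above might look roughly like this. This is only my guess at their shape, not the actual code: the function names, the `GlyphInfo` fields, and the ranges are invented, and `MemStream`/`mem_read` are a hypothetical stand-in with the same shape as SDL_RWread so the sketch compiles without SDL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory stream; stands in for SDL_RWops in this sketch. */
typedef struct { const uint8_t *data; size_t size, pos; } MemStream;

/* Same shape as SDL_RWread: returns the number of whole objects read. */
static size_t mem_read(MemStream *s, void *ptr, size_t size, size_t maxnum)
{
    if (size == 0) return 0;
    size_t avail = (s->size - s->pos) / size;
    size_t n = maxnum < avail ? maxnum : avail;
    memcpy(ptr, s->data + s->pos, n * size);
    s->pos += n * size;
    return n;
}

/* Read one byte and accept it only if min <= value <= max. */
static bool read_u8_in_range(MemStream *s, uint8_t *out, uint8_t min, uint8_t max)
{
    uint8_t v;
    if (mem_read(s, &v, sizeof v, 1) != 1) return false;   /* short read */
    if (v < min || v > max) return false;                  /* invalid value */
    *out = v;
    return true;
}

/* Read a little-endian 16-bit value with the same range check. */
static bool read_u16le_in_range(MemStream *s, uint16_t *out,
                                uint16_t min, uint16_t max)
{
    uint8_t raw[2];
    if (mem_read(s, raw, 1, sizeof raw) != sizeof raw) return false;
    uint16_t v = (uint16_t)(raw[0] | (raw[1] << 8));
    if (v < min || v > max) return false;
    *out = v;
    return true;
}

typedef struct { uint16_t codepoint; uint8_t width, height; } GlyphInfo; /* hypothetical */

/* Stops at the first field that fails to read or validate. */
static bool glyph_info_read(MemStream *s, GlyphInfo *g)
{
    return read_u16le_in_range(s, &g->codepoint, 32, 0xFFFF)
        && read_u8_in_range(s, &g->width, 1, 64)
        && read_u8_in_range(s, &g->height, 1, 64);
}
```

Because `&&` short-circuits, the first failing field ends the load. Note that `*g` may be partially filled when the function returns false, so the caller should discard it in that case.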

If you read chunks into a buffer and then copy from that buffer to wherever you want (essentially implementing your own SDL_RWops for a buffered stream), you get both the speed of reading in chunks and the per-field reads into each struct that you mentioned. It splits the problem in two: getting data from disk into memory, and then from that memory into your struct. In my tests this is usually faster, for example when a game loads from an SD card or a slow hard drive.
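A minimal sketch of that idea, under a few assumptions: `BufReader`, `buf_read`, and the 4096-byte chunk size are my own choices, and this is not a full SDL_RWops implementation (no seek, size, or close). The underlying source is a callback with the same parameters as SDL_RWread; `MemSource` is a hypothetical in-memory source that counts how often it is called, so the saving is visible.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Callback with SDL_RWread-like parameters. With SDL, a thin adapter would
 * cast ctx to SDL_RWops* and forward to SDL_RWread. */
typedef size_t (*ReadFn)(void *ctx, void *ptr, size_t size, size_t maxnum);

typedef struct {
    ReadFn  read;
    void   *ctx;
    uint8_t buf[4096];   /* chunk size is an arbitrary choice */
    size_t  len, pos;    /* valid bytes in buf, and how many are consumed */
} BufReader;

/* Copy exactly n bytes to dst, refilling the chunk buffer as needed.
 * Returns false if the source runs out before n bytes were delivered. */
static bool buf_read(BufReader *b, void *dst, size_t n)
{
    uint8_t *out = dst;
    while (n > 0) {
        if (b->pos == b->len) {
            b->len = b->read(b->ctx, b->buf, 1, sizeof b->buf);
            b->pos = 0;
            if (b->len == 0) return false;   /* end of data or read error */
        }
        size_t take = b->len - b->pos;
        if (take > n) take = n;
        memcpy(out, b->buf + b->pos, take);
        b->pos += take;
        out    += take;
        n      -= take;
    }
    return true;
}

/* Hypothetical in-memory source that counts calls, for demonstration only. */
typedef struct { const uint8_t *data; size_t size, pos; unsigned calls; } MemSource;

static size_t mem_source_read(void *ctx, void *ptr, size_t size, size_t maxnum)
{
    MemSource *s = ctx;
    s->calls++;
    if (size == 0) return 0;
    size_t avail = (s->size - s->pos) / size;
    size_t n = maxnum < avail ? maxnum : avail;
    memcpy(ptr, s->data + s->pos, n * size);
    s->pos += n * size;
    return n;
}
```

With this in place, reading 10000 one-byte fields through `buf_read` results in only three calls to the underlying source instead of 10000, while the calling code still reads field by field and can check every result.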