Feature Discussion: Unicode encoder / decoder interfaces + implementation for UTF-8

Hi,

I’d like to know whether the community is interested in the following features for SDL:

A “bytes (of a particular Unicode encoding) → Unicode code points” decoder interface

A “Unicode code points → bytes (of a particular Unicode encoding)” encoder interface

An implementation of both for the UTF-8 encoding.

Remarks: By “interface” I mean the standard approach of a struct with pointers to functions, akin to

typedef struct SDL_UnicodeDecoder {
  SDL_Error (*getCodePoint)(struct SDL_UnicodeDecoder*, Uint32* codePoint);
  ...
} SDL_UnicodeDecoder;

typedef struct SDL_UTF8UnicodeDecoder  {
  SDL_MyInterface parent;
  ...
} SDL_UTF8UnicodeDecoder;

SDL_Error SDL_UTF8UnicodeDecoder_create(SDL_UTF8UnicodeDecoder** decoder);

Furthermore, I’d be able to provide implementations for the UTF-8 encoding.
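To make the encoding direction concrete, here is a minimal sketch of what the UTF-8 side could look like for a single code point. The name `utf8_encode` and its signature are assumptions made for illustration only; this is not an existing SDL function and not necessarily the shape the final interface would take.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of SDL): encodes one Unicode code point
 * as UTF-8 into `out`, which must have room for 4 bytes.
 * Returns the number of bytes written, or 0 if `cp` is not a valid
 * Unicode scalar value (a surrogate, or above U+10FFFF). */
static size_t utf8_encode(uint32_t cp, uint8_t out[4])
{
    if (cp < 0x80) {                       /* 1 byte: 0xxxxxxx */
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            return 0;                      /* surrogates are not encodable */
        }
        out[0] = (uint8_t)(0xE0 | (cp >> 12));
        out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (uint8_t)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (uint8_t)(0xF0 | (cp >> 18));
        out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (uint8_t)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              /* beyond the Unicode range */
}
```

An encoder object behind a function-pointer interface would presumably wrap something like this and append the bytes to a caller-supplied buffer or stream.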

Rationale: We cannot sidestep internationalization by treating text as opaque forever. Sooner or later, when dealing with, e.g., file system abstractions, text rendering, or even data access, we can no longer treat text as opaque “sequences of bytes”. The foundation of all this is the ability to encode and decode Unicode text.

Let me know what you think of that.

Another forum in which you cannot fix typos. Anyway, obviously this should not be SDL_MyInterface but SDL_UnicodeDecoder.

typedef struct SDL_UnicodeDecoder {
  SDL_Error (*getCodePoint)(struct SDL_UnicodeDecoder*, Uint32* codePoint);
  ...
} SDL_UnicodeDecoder;

typedef struct SDL_UTF8UnicodeDecoder  {
  SDL_UnicodeDecoder parent;
  ...
} SDL_UTF8UnicodeDecoder;

SDL_Error SDL_UTF8UnicodeDecoder_create(SDL_UTF8UnicodeDecoder** decoder);
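For discussion, here is a self-contained sketch of how the "struct with function pointers" pattern above might be filled in for UTF-8. Everything in it is an assumption made for illustration: the names deliberately drop the `SDL_` prefix (`UnicodeDecoder`, `UTF8Decoder`, `DecodeStatus`, `UTF8Decoder_init`) so they are not mistaken for real SDL API, the decoder reads from an in-memory byte range, and errors are reported through a small enum rather than whatever `SDL_Error` would turn out to be.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes standing in for the proposal's SDL_Error. */
typedef enum { DEC_OK, DEC_END, DEC_INVALID } DecodeStatus;

/* Generic decoder interface: one function pointer per operation. */
typedef struct UnicodeDecoder UnicodeDecoder;
struct UnicodeDecoder {
    DecodeStatus (*getCodePoint)(UnicodeDecoder *self, uint32_t *codePoint);
};

/* UTF-8 implementation. `parent` must be the first member so that a
 * UnicodeDecoder* can be cast back to a UTF8Decoder*. */
typedef struct UTF8Decoder {
    UnicodeDecoder parent;
    const uint8_t *cur;   /* next byte to read */
    const uint8_t *end;   /* one past the last byte */
} UTF8Decoder;

static DecodeStatus utf8_getCodePoint(UnicodeDecoder *self, uint32_t *codePoint)
{
    /* Smallest code point allowed for each sequence length; anything
     * lower is an overlong encoding and gets rejected. */
    static const uint32_t min_value[4] = { 0, 0x80, 0x800, 0x10000 };
    UTF8Decoder *d = (UTF8Decoder *)self;
    uint32_t cp;
    uint8_t lead;
    int extra, i;

    if (d->cur >= d->end) return DEC_END;

    lead = *d->cur;
    if (lead < 0x80)                { cp = lead;        extra = 0; }
    else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
    else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
    else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
    else return DEC_INVALID;        /* stray continuation or invalid lead byte */

    if ((size_t)(d->end - d->cur) < (size_t)extra + 1) return DEC_INVALID; /* truncated */

    for (i = 1; i <= extra; i++) {
        uint8_t b = d->cur[i];
        if ((b & 0xC0) != 0x80) return DEC_INVALID;  /* not a continuation byte */
        cp = (cp << 6) | (uint32_t)(b & 0x3F);
    }

    if (cp < min_value[extra] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return DEC_INVALID;         /* overlong, out of range, or surrogate */

    d->cur += extra + 1;            /* consume the sequence only on success */
    *codePoint = cp;
    return DEC_OK;
}

/* Hypothetical initializer: wires up the function pointer and the input range. */
static void UTF8Decoder_init(UTF8Decoder *d, const void *bytes, size_t len)
{
    d->parent.getCodePoint = utf8_getCodePoint;
    d->cur = (const uint8_t *)bytes;
    d->end = d->cur + len;
}
```

A caller would only hold a `UnicodeDecoder *` (here `&d.parent`) and call `dec->getCodePoint(dec, &cp)` in a loop until it returns something other than `DEC_OK`, so decoders for other encodings could be swapped in without touching the calling code. Whether invalid input should stop decoding, as in this sketch, or be replaced with U+FFFD is a design question the interface would still need to settle.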