While playing (and experimenting with) a game (not my own) based on SDL 2 with SDL_ttf, I was first happy to see it could display UTF-8 text (a previous version of that game couldn't), then perplexed by its inability to show anything beyond the 16-bit BMP subset of Unicode, even though I knew the font it used had glyphs beyond that.
So I downloaded the source to SDL_ttf v2.0.14, kind of expecting to see some really old UCS-2 code under the hood, I'll admit, but was delighted to see that, on the contrary, UTF-8 is used as the basis for everything internally.
Still, there are (or were) two problems, both fairly easily fixed:
First, Unicode code points were, for some reason, crammed into 16-bit variables (type Uint16) everywhere except in the function UTF8_getch, which reads a character from a UTF-8 string and properly returns it as a 32-bit value (Uint32). (Only for that value to be cut down to smithereens immediately afterwards.)
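To illustrate the pattern (the decoder's declaration here is paraphrased, and the caller is a made-up example, not a line from the actual source):

    #include <SDL_stdinc.h>   /* Uint16, Uint32 */

    /* The decoder already returns a full 32-bit code point
     * (paraphrased declaration): */
    extern Uint32 UTF8_getch(const char **src, size_t *srclen);

    void example(const char *text, size_t textlen)
    {
        /* Before: the 32-bit result was squeezed into a Uint16, silently
         * dropping the high bits of anything outside the BMP: */
        Uint16 bad = (Uint16)UTF8_getch(&text, &textlen); /* U+1F600 becomes U+F600 */

        /* After: keep the full 32 bits all the way to the glyph lookup: */
        Uint32 good = UTF8_getch(&text, &textlen);
        (void)bad; (void)good;
    }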
Second, when a character map for the loaded font was to be selected, there was no attempt to find a full Unicode (UCS-4) map before looking for (and settling on) a mere 16-bit Unicode subset (UCS-2) map.
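Just to make the idea concrete, here is a rough sketch of the selection order using FreeType's charmap IDs (Windows platform 3 / encoding 10 is UCS-4, 3/1 is UCS-2; exactly which Apple IDs to include is debatable). This is my rough reconstruction of the idea, not the actual patch:

    #include <ft2build.h>
    #include FT_FREETYPE_H

    /* Pick a full Unicode (UCS-4) charmap if the font has one,
     * and only fall back to a 16-bit (UCS-2/BMP) map otherwise. */
    static void select_best_unicode_charmap(FT_Face face)
    {
        FT_Int i;
        /* First pass: full Unicode maps. */
        for (i = 0; i < face->num_charmaps; i++) {
            FT_CharMap cm = face->charmaps[i];
            if ((cm->platform_id == 3 && cm->encoding_id == 10) || /* Windows, UCS-4 */
                (cm->platform_id == 0 && cm->encoding_id == 4)) {  /* Apple, Unicode 2.0 full */
                FT_Set_Charmap(face, cm);
                return;
            }
        }
        /* Second pass: settle for a BMP-only map. */
        for (i = 0; i < face->num_charmaps; i++) {
            FT_CharMap cm = face->charmaps[i];
            if (cm->platform_id == 3 && cm->encoding_id == 1) {    /* Windows, UCS-2 */
                FT_Set_Charmap(face, cm);
                return;
            }
        }
    }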
With those two things fixed, I did indeed get the mentioned game to show Unicode beyond the 16-bit BMP.
So… I am wondering: is there perhaps a way to apply these fixes to the official SDL_ttf library as well?
I should say that my fixed version is not terribly well tested. I haven't even written any SDL 2 application of my own to test it with yet. Everything seems to work without a problem in the mentioned game, which isn't mine (and is closed-source), but I don't think it makes particularly heavy use of the library.
In particular, the existing official library has a few functions that take Unicode code points as 16-bit parameters (ugh), which I really think should be upgraded to 32-bit. When I made my changes I did just that as well (though I don't think the mentioned game uses them). However, although I know very little about how program libraries are made, I suspect this could cause ABI compatibility problems, in which case I assume the old 16-bit versions would have to remain (even if deprecated) while the 32-bit versions are given new names. (I don't know what those new names should be, though; see the sketch after the list for one possibility.) The functions in question are these:
TTF_GlyphMetrics
TTF_RenderGlyph_Solid
TTF_RenderGlyph_Shaded
TTF_RenderGlyph_Blended
TTF_GetFontKerningSizeGlyphs
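Just as an illustration of how that could look, with the obvious caveat that the "32" name below is only a placeholder I made up:

    #include <SDL_ttf.h>

    /* Hypothetical 32-bit companion to TTF_GlyphMetrics (placeholder name): */
    extern int TTF_GlyphMetrics32(TTF_Font *font, Uint32 ch,
                                  int *minx, int *maxx,
                                  int *miny, int *maxy, int *advance);

    /* The old 16-bit entry point would remain for ABI compatibility,
     * simply forwarding to the wide version: */
    int TTF_GlyphMetrics(TTF_Font *font, Uint16 ch,
                         int *minx, int *maxx, int *miny, int *maxy, int *advance)
    {
        return TTF_GlyphMetrics32(font, (Uint32)ch, minx, maxx, miny, maxy, advance);
    }

The same pattern would presumably work for the render and kerning functions in the list.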
With these changes, the only remaining 16-bit parameters would be the code units of UCS-2 or UTF-16 strings, which, I think, is as it should be. No other 16-bit variables should really be let anywhere near anything Unicode.
By the way, I did find an earlier discussion of these matters (http://forums.libsdl.org/viewtopic.php?t=9228), but nothing final seems to have come of it, and it also seemed to focus largely on adding a function to let the user select character maps for a font. I'm not saying such a function couldn't have a use, perhaps in a font viewer or font editor application. But I would imagine the normally desired behavior of the library is to automatically pick a full Unicode (UCS-4) character map if it can find one, and a 16-bit UCS-2 map only otherwise.