IME (Input Method Editor) planning

Simos_Xenitellis · May 28, 2009, 4:22am

Hi All,

Here are some thoughts for the IME work in SDL that takes place this summer.
The aim is to start a discussion on the API that will support the
input method in SDL.

As far as I understand, SDL does not provide GUI elements such as text
input widgets.
For the IME support requirements, we need to bring to SDL such a basic
text input widget.
Shall SDL perform the full drawing of the characters pressed, or shall
the programmer
do it instead (with help from SDL)?

For Latin, Greek and Cyrillic (LGC) there is no strong need for visual
feedback when pre-editing
(for example, when typing a compose sequence such as dead_acute + a),
which makes it easier
than more complex scripts.
SDL would consume key events as they are pressed until it would be
able to produce
a new printable character. Then, the updated string is printed over
the previous version.

When visual feedback is required while pre-editing, things get
complicated and SDL has to take
over the input and display of text.

Would it be acceptable for SDL to take over when the program gives control
to a text input box, until a condition arises (such as pressing Esc,
Enter, Tab)?
This might be acceptable for simple programs (such as when typing your
name for the scores
in an arcade game), however it would not be acceptable for a network
game where you
type a message in a multi-line textbox and at the same time you make
interruptions to control
the game elements with your mouse.

Simos

Jiang_Jiang · May 28, 2009, 6:36am

Hi Simos,

As far as I understand, SDL does not provide GUI elements such as text
input widgets. For the IME support requirements, we need to bring to
SDL such a basic text input widget. Shall SDL perform the full drawing
of the characters pressed, or shall the programmer do it instead (with
help from SDL)?

I don’t think a built-in text widget is needed, because programmers using
SDL can receive SDL_TEXTINPUT events, read the text from the
‘event.text.text’ field, and then draw it wherever they want. SDL itself
simply doesn’t know the position or font style of this text.

For Latin, Greek and Cyrillic (LGC) there is no strong need for visual
feedback when pre-editing (for example, when typing a compose sequence
such as dead_acute + a), which makes it easier than more complex scripts.
SDL would consume key events as they are pressed until it would be able
to produce a new printable character. Then, the updated string is
printed over the previous version.

IMHO, an additional event type should be defined for “pre-editing” events;
in Mac OS X, this is called “marked text”. As for the “consume key events”
part, I’m not sure whether normal SDL_KEYDOWN/SDL_KEYUP/… events should be
generated at this point. Maybe we can let the application developer
decide which state (inputting text or controlling the game) they’re in and
filter these events within the application.

Would it be acceptable for SDL to take over when the program gives
control to a text input box, until a condition arises (such as pressing
Esc, Enter, Tab)? This might be acceptable for simple programs (such as
when typing your name for the scores in an arcade game), however it
would not be acceptable for a network game where you type a message
in a multi-line textbox and at the same time you make interruptions to
control the game elements with your mouse.

One more thing I’d like to bring up: since we are defining a
cross-platform API, we should consider systems that use virtual keyboards,
like iPhone/iPod Touch, NDS, PSP, Wii, PS3, etc. On these systems, the only
way to input text is to bring up the system virtual keyboard (with some
special APIs provided by that system).

In my upcoming emails, I will give a brief overview of how input methods
work for CJK (Chinese, Japanese, Korean) languages and a detailed discussion
of how input methods work on Mac OS X. Finally, I’ll summarize the current
status of SDL Unicode text input support.

Jiang

Alissa_Sabre · May 29, 2009, 12:42pm

Here are some thoughts for the IME work in SDL that takes place this summer.

Great.

The aim is to start a discussion on the API that will support the
input method in SDL.

Great.

As far as I understand, SDL does not provide GUI elements such as text
input widgets.
For the IME support requirements, we need to bring to SDL such a basic
text input widget.

I don’t think so.

IMHO, one of the strengths of SDL is not having such features built in.
SDL itself provides a basic abstraction of the minimal graphics
features, and higher-layer functions are provided as separate
libraries. I have a feeling it’s better to make minimal changes to the
SDL interface, and put higher-level functions for input method
handling in a separate library, e.g., sdlim.

I’m sorry, at this moment, I’m not sure whether we can achieve the
goal through this path…

When visual feedback is required while pre-editing, things get
complicated

True.

and SDL has to take
over the input and display of text.

Be careful. SDL (or a separate input method support library) could
take over the input and display preedit and/or candidate texts, but it
will be a very tough job. The problem is that each input method has its
own manner of showing candidates, and each application has its own style
of showing preedits. If you are really resolved to go this way, I will
not stop you, but you will surely need a lot of work (and very good luck).

Alissa Sabre @ SL


Scott_Harper · May 29, 2009, 6:07pm

Be careful. SDL (or a separate input method support library) could
take over the input and display preedit and/or candidate texts, but it
will be a very tough job. The problem is that each input method has its
own manner of showing candidates, and each application has its own style
of showing preedits. If you are really resolved to go this way, I will
not stop you, but you will surely need a lot of work (and very good luck).

So here’s just a random idea I had on the subject. What if IME were
like a service available to SDL developers? By that I mean that, as
the dev, if you want to enable IME support, you might use SDL_IME (or
whatever it should be called) to first check whether the system is
using an IME, and so whether you should be sending your input to it at all.
Then, if you should be using it, whenever the user is in a state for
editing text, it is the responsibility of the developer to feed
all of the user input into the SDL_IME functions, which do
their own tracking of input, purely behind the scenes. Finally, to
display the text, the developer queries SDL_IME for what text they
should be drawing, and whether or not that is a completed text
fragment. If it is a completed text fragment, they append it to their
own string; otherwise, they draw their current string plus the
temporary fragment, underlined.
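
A minimal sketch of this "service" flow, written in Python for brevity rather than in SDL's C. Everything here is hypothetical: `ImeService`, `feed`, `query` and `render` are illustrative names, not an existing SDL or SDL_IME API, and the tiny conversion table only stands in for whatever the real input method would do.

```python
from dataclasses import dataclass, field


@dataclass
class ImeService:
    """Hypothetical stand-in for the proposed SDL_IME service."""
    # Toy conversion table standing in for the system input method.
    table: dict = field(default_factory=lambda: {"ka": "か", "na": "な"})
    pending: str = ""      # keys fed so far that are not yet a finished fragment
    completed: str = ""    # finished fragment waiting to be collected

    def feed(self, key):
        """The application forwards every key press while the user edits text."""
        self.pending += key
        if self.pending in self.table:
            self.completed += self.table[self.pending]
            self.pending = ""

    def query(self):
        """Return (text, is_completed), the two things the post asks for."""
        if self.completed:
            text, self.completed = self.completed, ""
            return text, True
        return self.pending, False


def render(committed, ime):
    """Return the updated committed string and the line the app would draw."""
    text, done = ime.query()
    if done:
        committed += text
        return committed, committed
    # Unfinished fragment: drawn after the committed text. Square brackets
    # stand in for the underline in this text-only sketch.
    return committed, committed + (f"[{text}]" if text else "")
```

The application keeps ownership of its own string and of all drawing; the service only tracks input state, which is the split the post describes.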

As for selecting choices (my experience here is only limited to
Japanese, sorry if it differs significantly from others), you can find
out 1) if you should even worry about drawing choices, and 2) what
choices there are available to draw and a list of them (maybe which
one is currently selected, but perhaps that’s best left to the dev? I
don’t know in this scenario) and then leave it up to the developer
(again) to draw out the list and highlight the selected option.

I suppose it wouldn’t be a bad idea to provide some basic sample
drawing functions that a dev could fall back on should s/he
not feel up to the task of handling all the IME drawing themselves right off
the bat, but still it doesn’t seem that difficult, assuming that the
information I mentioned could be available from an IME library in the
first place.

How does that sound?

– Scott

…as a man’s knowledge widens, ever the way he can follow grows
narrower; until at last he chooses nothing, but does only and wholly
what he must do…
- Ursula K. Le Guin, "A Wizard of Earthsea"

Bob_Pendleton · May 29, 2009, 10:16pm

Hi All,

Here are some thoughts for the IME work in SDL that takes place this summer.
The aim is to start a discussion on the API that will support the
input method in SDL.

As far as I understand, SDL does not provide GUI elements such as text
input widgets.

That is correct, SDL does not have the concept of a widget. In 1.3 you
can have a separate window. Having said that, I should point out that
it is possible that not all implementations of 1.3 will actually
support multiple windows. If the underlying OS does not support them,
SDL won’t either.

Can we look at how the dialogs used by existing IMEs interact with the
graphics environments of typical SDL applications? SDL may be used in
windowed, and full screen modes. It can be used with single and double
buffering, and with its 2D API and OpenGL rendering. The IME must be
able to be used, or ignored, in all of those graphics environments.

Just for fun, I started SCIM and ran Extreme TuxRacer and tried to use
the input method in the game. When I left the game, the SCIM dialog
box was on my desktop, but it was not visible while I played the game.
To me, it would seem that we can not count on the dialogs of existing
IME systems being visible on top of full screen 3D games. But, I can
imagine wanting to use one to chat with my CJK friends during an FPS
that is animating the full screen all the time. We must verify whether
or not the dialogs used by existing IMEs can be made visible on top of
the different SDL graphic modes. Which raises the question, do we
always want to do “on the spot” IME dialogs, which are more work for
the programmer, or do we want to allow the use of the IME’s dialog box
when it is possible?

Just to give folks something to think about and throw rocks at, I’m
going to propose an API.

The first principle is that if an application wants to use an IME it
must take special actions to get it. If it just ignores the problem
then it will not get an IME. The second principle is that we can’t require
SDL to contain fonts. If a system IME includes fonts we can use those,
but we can’t require them for SDL.

Because I am lazy I am not going to type the “SDL” part of the name, OK?

SelectIME(optional standard locale name) - This function tells SDL
that we want to use an IME and tells SDL what the locale is. If the
locale is null or empty then SDL will use the default system locale,
otherwise it will use the provided locale. If that locale is not
supported it returns false, otherwise it returns true.

I’m going to assume that turning on IME support causes SDL to start
sending IME-specific events to the event queue. To handle these events
let’s provide a special IME filter function.

IMEMode(mode) - The mode tells the IME either to use its own dialog
or to let the program display the interaction text. If you ask for
a mode that cannot be supported the function returns false. We may
need an IMEQueryModes() to find out which modes are possible. In SDL
1.3 these functions would also have a renderer as an argument so that
they can look at the display mode and see what is possible.

IMEEventFilter(SDL_Event *) - You simply pass all events to this
function. The function returns NULL if it handled the event and it
returns the event pointer if it did not. A function like this lets us
hide a huge amount of detail.
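
The filter contract can be modelled in a few lines. This is only a sketch, in Python rather than C, of the proposal as stated: `ImeFilter`, `select_ime`, `filter` and `pump` are hypothetical names, the list of supported locales and the "ja_JP" default are assumptions made for the example, and events are plain dictionaries rather than real `SDL_Event` structures.

```python
class ImeFilter:
    """Hypothetical model of the proposed SelectIME() / IMEEventFilter() pair."""

    def __init__(self):
        self.active = False   # True once select_ime() has succeeded

    def select_ime(self, locale=None, supported=("ja_JP", "zh_CN", "ko_KR")):
        # An empty or missing locale means "use the system default";
        # "ja_JP" is an assumed default for this sketch.
        locale = locale or "ja_JP"
        self.active = locale in supported
        return self.active

    def filter(self, event):
        """Return None if the IME consumed the event, else the event itself."""
        if self.active and event.get("type") == "keydown":
            return None
        return event


def pump(events, ime):
    """Application loop: pass every event through the filter first."""
    return [e for e in events if ime.filter(e) is not None]
```

An application that never calls `select_ime()` sees every event unchanged, which matches the first principle above: no IME unless the program asks for one.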

I’m going to assume that when the IME has completed the input of one
or more characters it will generate the correct sequence of text
events so that the program will just get the right input.

If the program is handling the display of the interaction text, then
we need an event that sends that text to the program. It would just
contain text that the IME wants displayed. The IME can send these at
any time and the program is required to display them. Before sending
the input text to the program the IME must send another event that
clears the IME text. I see this being used in a program where text is
being input in a line, or column, and when the IME text is sent it is
displayed either where the next character would be drawn, or on a line
or column of its own. Of course, no such events would ever be sent if
the IME is using its own input dialog.
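
On the application side, the event sequence described here (interaction text, then a clear, then the ordinary text events) might be handled roughly as follows. This is a Python sketch; the event names "ime_text", "ime_clear" and "text" are invented for illustration and are not SDL event types.

```python
def apply_event(state, event):
    """state is (line, ime_text); return the updated pair."""
    line, ime_text = state
    kind = event["type"]
    if kind == "ime_text":      # the IME wants this interaction text displayed
        return line, event["text"]
    if kind == "ime_clear":     # sent before the final input text arrives
        return line, ""
    if kind == "text":          # ordinary committed text input
        return line + event["text"], ime_text
    return state


def displayed(state):
    """Draw the interaction text where the next character would go."""
    line, ime_text = state
    return line + ime_text
```

Keeping the interaction text separate from the committed line means a clear event never has to undo anything in the program's own buffer.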

Well, that makes perfect sense to me… I’m well aware that it
ignores details of communication with real IMEs. But, it might work,
and it is simple…

Bob Pendleton


Bob Pendleton: writer and programmer
email: Bob at Pendleton.com
web: www.TheGrumpyProgrammer.com

Donny_Viszneki · May 30, 2009, 9:18pm

Very interesting discussion! Here I go!

As far as I understand, SDL does not provide GUI elements such as text
input widgets.

I agree with others who have said that SDL should remain this way.

Can we look at how the dialogs used by existing IMEs interact with the
graphics environments of typical SDL applications? SDL may be used in
windowed, and full screen modes. It can be used with single and double
buffering, and with its 2D API and OpenGL rendering. The IME must be
able to be used, or ignored, in all of those graphics environments.

Being a person whose native and supplementary languages are pretty
well accommodated by a US English keyboard layout (in most gaming
environments no one is offended if you omit details like accent marks)
I am pretty ignorant of any IME other than compose/dead keys like
holding ALT and punching in codes on MSWindows, pressing
Control+Shift+U and then punching in codes on Debian, or holding the
Option/Alt key on Mac OS X.

Having said that, I have this to say: I am not even certain where the
realm of things called IME begins and ends. So far it seems that it
includes means of inputting text defined by your “desktop environment”
(for lack of a better term – please don’t say OS, though) as well
as third-party applications for intercepting the user’s input and
transforming it into different input for other applications. Is this
right? Does anything that is called IME fall outside of this (poor)
definition?

Moving along…

When visual feedback is required while pre-editing, things get
complicated and SDL has to take
over the input and display of text.

The problem is that each input method has its
own manner of showing candidates, and each application has its own style
of showing preedits.

We must verify whether
or not the dialogs used by existing IMEs can be made visible on top of
the different SDL graphic modes. Which raises the question, do we
always want to do “on the spot” IME dialogs, which are more work for
the programmer, or do we want to allow the use of the IME’s dialog box
when it is possible?

I think we need a comprehensive discussion of the input methods SDL
may support. (I don’t use the term “IME” since I am still not entirely
sure what that term encompasses.) It is critical to compare the APIs
and capabilities of these different input methods. I expect, for
instance, that the stylus-driven input interfaces common on portables all
have no problem overlaying the input editor GUI on top of the
application. (That may or may not be accurate.) I’m not so sure,
however, that Mac OS X, for instance, can accommodate OpenGL
applications with a text edit field (and I’m quite certain that most
developers would not want to be constrained to using an Apple text
widget for i18n-capable text input). We need to know what these
various platforms and IME applications have in common, and the technical
details of how they interface with users’ applications, before we can
begin deciding upon implementation details and how much implementation
belongs in SDL proper or in the application / satellite libraries.

Where platforms fall short of providing good integration with SDL
applications, we could consider platform emulation to some degree. I
give an example of emulating Mac OS X compose-key behavior further
down in this email.

Just to give folks something to think about and throw rocks at, I’m
going to propose an API.

SelectIME(optional standard locale name) - This function tells SDL
that we want to use an IME and tells SDL what the locale is. If the
locale is null or empty then SDL will use the default system locale,
otherwise it will use the provided locale. If that locale is not
supported it returns false, otherwise it returns true.

Locale (à la setlocale() et al, see
http://linux.die.net/man/3/setlocale) seems to me to be an issue best
handled separately. For input, locale really only matters for
formatting expectations of things like numbers (is one thousand
1,000.00 or 1.000,00 for example) and phone numbers (1-412-555-1212 or
1.412.555.1212) and I think is outside the scope of this discussion.

(Did you maybe mean character encoding?)

I’m going to assume that turning on IME support causes SDL to start
sending IME specific events to the event queue. To handle these events
let’s provide a special IME filter function.

Is it possible to fake complex IME if you (the SDL layer) can make
certain text-edit-ish assumptions about the application mode?
(Provided, obviously, by the application letting SDL know just that.)
For instance, if I am on Mac OS X, and I hold ALT+E to compose an
acute accent, and then press “e” to make “é,” would it be good for SDL
to fake this by first sending a bare accent character, and
then, once I hit “e,” sending two events: backspace, then “é”?
The application could do things like kill or restart the input session
if it was closing the target of the user’s input, or if the user
clicks somewhere else to move the cursor.

In my upcoming emails, I will give a brief overview of how input methods
work for CJK (Chinese, Japanese, Korean) languages and detailed discussion
on how input methods work on Mac OS X, finally, I’ll summarize the current
status of SDL Unicode text input support.

I look forward to your future posts!

–
http://codebad.com/

Rainer_Deyke · May 31, 2009, 12:08am

Donny Viszneki wrote:

Having said that, I have this to say: I am not even certain where the
realm of things called IME begins and ends. So far it seems that it
includes means of inputting text defined by your “desktop environment”
(for lack of a better term – please don’t say OS, though) as well
as third-party applications for intercepting the user’s input and
transforming it into different input for other applications. Is this
right? Does anything that is called IME fall outside of this (poor)
definition?

There are four levels of international input support:

1. Combinations of simultaneous keys: AltGr + a = ?.
2. Combinations of sequential keys, composed using dead keys: " + a = ä.
3. Combinations of sequential keys, composed in place: k + a =
   underlined か. Underlined か + return = か.
4. Substitution through a menu: underlined か + space = ? or ? or ?
   or ? or ? or ? or ? or any number of other kanji that can be read as
   “ka”, selected from a menu that pops up when space is pressed.

I tend to use the term IME to refer to levels 3 and 4. Level 3 is
comparatively easy to handle; level 4 is hard.
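
To make the difference between levels 3 and 4 concrete, here is a small state-machine sketch in Python. The class and its tables are hypothetical; the kana mapping and the three sample kanji (all readable as "ka") are illustrative data chosen for this example, not taken from the post or from any real input method.

```python
KANA = {"ka": "か"}                      # romaji -> kana (level 3)
CANDIDATES = {"か": ["火", "科", "可"]}   # sample kanji read as "ka" (level 4)


class Composer:
    def __init__(self):
        self.keys = ""       # raw keys not yet converted
        self.preedit = ""    # underlined text, not yet committed
        self.menu = []       # candidate menu, empty when closed

    def press(self, key):
        """Process one key; return committed text ('' if nothing yet)."""
        if self.menu and key.isdigit():
            index = int(key) - 1
            if 0 <= index < len(self.menu):
                return self._commit(self.menu[index])
            return ""
        if key == "return" and self.preedit:
            return self._commit(self.preedit)       # level 3: accept the kana
        if key == "space" and self.preedit:
            self.menu = CANDIDATES.get(self.preedit, [])  # level 4: open menu
            return ""
        if len(key) == 1:
            self.keys += key
            if self.keys in KANA:
                self.preedit += KANA[self.keys]
                self.keys = ""
        return ""

    def _commit(self, text):
        self.keys, self.preedit, self.menu = "", "", []
        return text
```

Level 3 only needs the `preedit` string to be drawn underlined; level 4 additionally needs `menu` to be displayed somewhere, which is where the drawing problem discussed in this thread comes from.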
Rainer Deyke - rainerd at eldwood.com

Simos_Xenitellis · May 31, 2009, 1:44am

Donny Viszneki wrote:

Having said that, I have this to say: I am not even certain where the
realm of things called IME begins and ends. So far it seems that it
includes means of inputting text defined by your “desktop environment”
(for lack of a better term – please don’t say OS, though) as well
as third-party applications for intercepting the user’s input and
transforming it into different input for other applications. Is this
right? Does anything that is called IME fall outside of this (poor)
definition?

There are four levels of international input support:

1. Combinations of simultaneous keys: AltGr + a = ?.

This can be dealt with by the keyboard layout and should be similar to
the layouts that assign ? to a single key by itself. I am not sure if
non-X.Org systems would require something special here.

2. Combinations of sequential keys, composed using dead keys: " + a = ä.

When we make the SDL IME, we would typically duplicate the compose
sequence table that is found in X.Org. I do not know if we could take a
shortcut here. In X.Org, compose sequences can be at most five keys long,
and the IME has to track the sequence as it is being built. As soon as
the keys typed match a known compose sequence, the resulting character is
printed.
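
The tracking described here can be sketched as a lookup over key-name tuples. This is a Python illustration under stated assumptions: `ComposeTracker` is a hypothetical helper, the four-entry table is a tiny sample in the spirit of the X.Org compose table rather than a copy of it, and the five-key limit is taken from the post.

```python
MAX_SEQUENCE = 5  # upper bound on sequence length, as stated in the post

# A tiny sample table; the real X.Org table is far larger.
COMPOSE = {
    ("dead_acute", "a"): "á",
    ("dead_acute", "e"): "é",
    ("dead_diaeresis", "a"): "ä",
    ("Multi_key", "o", "c"): "©",
}
assert all(len(seq) <= MAX_SEQUENCE for seq in COMPOSE)

# Every proper prefix of a known sequence means "keep waiting".
PREFIXES = {seq[:n] for seq in COMPOSE for n in range(1, len(seq))}


class ComposeTracker:
    def __init__(self):
        self.pending = ()

    def press(self, keysym):
        """Return the text produced by this key press ('' while composing)."""
        candidate = self.pending + (keysym,)
        if candidate in COMPOSE:
            self.pending = ()
            return COMPOSE[candidate]
        if candidate in PREFIXES:
            self.pending = candidate
            return ""
        # Not part of any sequence: abandon what was pending. Ordinary
        # single-character keys pass through; anything else yields nothing.
        was_composing = bool(self.pending)
        self.pending = ()
        if was_composing or len(keysym) != 1:
            return ""
        return keysym
```

Because no sequence in the sample table starts with an ordinary letter, plain keys such as "k" are never swallowed, which is the property the list below depends on.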

This type is a special case of [3] below, with these differences:

- The list of compose sequences in X.Org is global, and can be used as
  long as the current keyboard layout can produce the proper keys to
  start a sequence.
- Due to being global, the starting ‘key’ has to be special so as not
  to interfere with potential keyboard layouts. For example, for
  ‘k + a = か’, if there was a compose sequence that started with ‘k’,
  it would break any layout that has a ‘k’ in it: if someone pressed ‘k’,
  the IME would start a matching process, which means that the ‘k’ would
  not show.

3. Combinations of sequential keys, composed in place: k + a =
   underlined か. Underlined か + return = か.

4. Substitution through a menu: underlined か + space = ? or ? or ?
   or ? or ? or ? or ? or any number of other kanji that can be read as
   “ka”, selected from a menu that pops up when space is pressed.

I tend to use the term IME to refer to levels 3 and 4. Level 3 is
comparatively easy to handle; level 4 is hard.

If we are to re-implement 3 and 4 from scratch, it would make sense
to add [2] as a special case of [3].

SDL should use as much as possible from the IME of the host OS.

Simos

Donny_Viszneki · May 31, 2009, 2:18am

Combinations of sequential keys, composed using dead keys: " + a = ä.

When we make the SDL IME, we would typically duplicate the compose
sequence table that is found in X.Org. I do not know if we could take a
shortcut here. In X.Org, compose sequences can be at most five keys long,
and the IME has to track the sequence as it is being built. As soon as
the keys typed match a known compose sequence, the resulting character is
printed.

You don’t seem very fond of actually interoperating with the input
systems actually provided by each platform. I would hesitate before
deciding upon emulating the behavior you expect to be provided by the
user’s platform.

due to being global, the starting ‘key’ has to be special so as not
to mess with potential keyboard layouts.
For example, for ‘k + a = か’, if there was a compose sequence that
started with ‘k’, then it would
mess up with any layout that has a ‘k’ in it. If someone would press ‘k’,
the IME would start a matching process which means that the ‘k’ would not show.

I think you and Rainer are on the same page, and you just misunderstood.

–
http://codebad.com/

Simos_Xenitellis · May 31, 2009, 2:34am

Very interesting discussion! Here I go!

As far as I understand, SDL does not provide GUI elements such as text
input widgets.

I agree with others who have said that SDL should remain this way.

Can we look at how the dialogs used by existing IMEs interact with the
graphics environments of typical SDL applications? SDL may be used in
windowed, and full screen modes. It can be used with single and double
buffering, and with its 2D API and OpenGL rendering. The IME must be
able to be used, or ignored, in all of those graphics environments.

Being a person whose native and supplementary languages are pretty
well accommodated by a US English keyboard layout (in most gaming
environments no one is offended if you omit details like accent marks)
I am pretty ignorant of any IME other than compose/dead keys like
holding ALT and punching in codes on MSWindows, pressing
Control+Shift+U and then punching in codes on Debian, or holding the
Option/Alt key on Mac OS X.

The Alt+number, Ctrl+Shift+U and the OS X variation are extra
features of the IME for entering a wider variety of Unicode characters.
For more details, see Unicode input - Wikipedia

The Ctrl+Shift+U shortcut is a GTK+ feature and requires typing the
Unicode code point value of a character (as in 2318 = ⌘). This
functionality conforms to ISO 14755,
http://www.cl.cam.ac.uk/~mgk25/volatile/ISO-14755.pdf

In Windows there are quite a few variations. The universal Unicode
style as in GTK+ requires a registry setting in order to be enabled.
For the other legacy variations, see
http://www.fileformat.info/tip/microsoft/enter_unicode.htm

I suspect with OS X it should be similar to GTK+/Linux.

In the case of Windows, the character appears as soon as you finish
typing the numbers.
In Linux, this functionality is provided by the GTK+ library, so any
application linked to GTK+ has the Ctrl+Shift+U shortcut. You have visual
feedback as you type the hex digits; you see them being typed. When you
press Space, the hex digits are replaced with the resulting Unicode
character.
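
The GTK+-style behaviour described above (digits visible while typing, Space to commit) can be sketched as follows. This is an illustrative Python model, not GTK+ code; `HexEntry` and the key names "ctrl+shift+u" and "space" are assumptions made for the example.

```python
class HexEntry:
    def __init__(self):
        self.digits = None   # None when idle, otherwise the hex digits so far

    def press(self, key):
        """Return (committed_text, visible_preedit) after one key press."""
        if key == "ctrl+shift+u":
            self.digits = ""             # start a new code point entry
            return "", ""
        if self.digits is None:
            # Idle: ordinary single-character keys pass straight through.
            return (key if len(key) == 1 else ""), ""
        if key == "space":
            digits, self.digits = self.digits, None
            if not digits:
                return "", ""
            value = int(digits, 16)
            # Reject values outside Unicode and the surrogate range.
            valid = value <= 0x10FFFF and not 0xD800 <= value <= 0xDFFF
            return (chr(value) if valid else ""), ""
        if len(key) == 1 and key in "0123456789abcdefABCDEF":
            self.digits += key
        return "", self.digits           # the digits stay visible as feedback
```

Typing Ctrl+Shift+U, then 2, 3, 1, 8, then Space yields U+2318, the example character mentioned above.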

Simos

Rainer_Deyke · May 31, 2009, 3:45am

Simos Xenitellis wrote:

When we make the SDL IME, we would typically duplicate the compose
sequence table that is found in X.Org. I do not know if we could take a
shortcut here. In X.Org, compose sequences can be at most five keys long,
and the IME has to track the sequence as it is being built. As soon as
the keys typed match a known compose sequence, the resulting character is
printed.

If I set up my (Windows) keyboard layout to produce ä when I press ‘"’
followed by ‘a’, I expect SDL to respect that. Likewise when I set my
keyboard to produce か for ‘k’ + ‘a’. The correct thing to do here is
to cooperate with the platform IME as much as possible.

The problem is (4):

Substitution through a menu: underlined か + space = ? or ? or ?
or ? or ? or ? or ? or any number of other kanji that can be read as
“ka”, selected from a menu that pops up when space is pressed.

We can’t rely on the platform IME to correctly draw a menu over an
SDL window, especially in full-screen mode. If the platform provides
the /contents/ of that menu in plain text, SDL applications should be
able to draw their own IME menus that are consistent with the platform
IME. If the platform IME does not provide the contents of the menu, we
are fucked.
Rainer Deyke - rainerd at eldwood.com

Jiang_Jiang · May 31, 2009, 3:51am

Hi,

We can’t rely on the platform IME to correctly draw a menu over an
SDL window, especially in full-screen mode. If the platform provides
the /contents/ of that menu in plain text, SDL applications should be
able to draw their own IME menus that are consistent with the platform
IME. If the platform IME does not provide the contents of the menu, we
are fucked.

In Mac OS X, the fact is that there is no way to retrieve such content,
but there exist some workarounds that allow input methods to draw
their window on top of our full-screen SDL apps. As discussed in [1],
even Apple themselves used such a hackish approach in their apps.

Jiang

[1] http://www.cocoabuilder.com/archive/message/cocoa/2008/1/31/197724

Donny_Viszneki · May 31, 2009, 9:21am

In my reply I will try to bring things into focus as I outlined in my
first email by discussing the capabilities and limitations of these
input methods and the technical details of the APIs through which they
are accessible.

Being a person whose native and supplementary languages are pretty
well accommodated by a US English keyboard layout (in most gaming
environments no one is offended if you omit details like accent marks)
I am pretty ignorant of any IME other than compose/dead keys like
holding ALT and punching in codes on MSWindows, pressing
Control+Shift+U and then punching in codes on Debian, or holding the
Option/Alt key on Mac OS X.

The Alt+number, Ctrl+Shift+U and the OS X variation are extra
features of the IME for entering a wider variety of Unicode characters.
For more details, see Unicode input - Wikipedia

Oh, I was actually unaware that Mac OS X had an analog to the numeric
code entry method I described for MSWindows and Debian (or GTK+,
apparently). I was only aware of things like sequencing Alt+Quote, ‘o’
to get ‘ö’ (‘O’ avec le tréma).

If no one objects, I think we can differentiate these types of symbolic
extended character entry methods from the numeric catalog-number
methods mentioned just below by referring to them as “mnemonic” entry
methods.

Oh, in fact, that is what others are calling them as well, as
described in the Wikipedia article you referred to, and an IETF memo
linked therein (RFC 1345 - Character Mnemonics and Character Sets)

I suggested a method in a previous email for generating visual feedback
for mnemonic entry, which I believe would require a minimal amount of
effort to support in existing or future applications: the application
tells SDL when the user is entering text, and SDL then generates
false keyboard input events which closely approximate the desired
visual feedback. Example: composing O+tréma (ö) by pressing Alt+Quote,
“o” would send the application first a “naked tréma” character, then
once the user finalizes the mnemonic by pressing “o,” the application
would receive a backspace keystroke (erasing the “naked tréma”) and
then the full character O+tréma (ö). Comments?
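
A rough Python sketch of this faked-event idea, to make the event order explicit. The function names are hypothetical, the key names "alt+e" and "alt+quote" follow the examples given in this thread, and the event tuples are invented for illustration rather than being SDL events.

```python
DEAD_KEYS = {"alt+e": "´", "alt+quote": "¨"}     # key -> standalone accent
COMBINED = {("´", "e"): "é", ("¨", "o"): "ö"}     # (accent, letter) -> result


def fake_events(keys):
    """Translate physical key presses into the events the application sees."""
    events = []
    pending = None   # the standalone accent currently shown, if any
    for key in keys:
        if key in DEAD_KEYS:
            if pending is not None:
                events.append(("backspace", None))   # replace the old accent
            pending = DEAD_KEYS[key]
            events.append(("text", pending))
        elif pending is not None:
            events.append(("backspace", None))       # erase the naked accent
            # Fall back to the bare letter when no combination is known.
            events.append(("text", COMBINED.get((pending, key), key)))
            pending = None
        else:
            events.append(("text", key))
    return events


def replay(events):
    """What a naive text field would show after handling the events."""
    text = ""
    for kind, value in events:
        text = text[:-1] if kind == "backspace" else text + value
    return text
```

The application needs no IME awareness at all in this scheme; it only has to handle text and backspace, which is the appeal of the approach and also its limit once a candidate menu is involved.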

(For posterity: that mnemonic is used on Mac OS X, and may be defined
in the aforementioned IETF memo RFC-1345.)

The Ctrl+Shift+U shortcut is a GTK+ feature and requires typing the
Unicode code point value of a character (as in 2318 = ⌘). This
functionality conforms to ISO 14755,
http://www.cl.cam.ac.uk/~mgk25/volatile/ISO-14755.pdf

Hey! An actual standard! This won’t be so hard if more answers are
like this one!

My gut tells me that it might be advisable to break down a standard
such as this into two parts: one appropriate for implementation within
SDL-proper, and one appropriate for implementation within applications
and/or an SDL support library. Comments?

In Windows there are quite a few variations. The universal Unicode
style as in GTK+ requires
a registry setting in order to be enabled.

Should SDL query the MSWindows “Registry” to find these settings and
try imitate

For the other legacy variations, see
How to enter Unicode characters in Microsoft Windows

Even though this does not say what standard(s) these MSWindows and
MSWindows application behaviors may follow, it is very clear and
concise! I’d call this one a win!

I suspect with OS/X it should be similar to GTK+/Linux.

In what way do you suspect they should be similar?

(As a side note: Is anyone familiar with how GTK+ operates under
MSWindows? I assume GTK+ applications under MSWindows can cooperate
with all the entry methods described at the fileformat.info given
above.)

In the case of Windows, the character appears as soon as you finish
typing the numbers.
In Linux, this functionality is provided by the GTK+ library, so any
application linked to GTK+
has the Ctrl+Shift+U shortcut. You have visual feedback as you type
the hex digits;
you see them being typed. When you press Space, the hex digits are
replaced with the
resulting Unicode character.
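
The hex entry behaviour described above can be sketched as follows, in Python for brevity. This is a hypothetical illustration, not GTK+ or SDL code: the function name, the choice to cancel on any non-hex key, and the preview format (a "u" followed by the digits typed so far) are assumptions.

```python
import string


def hex_entry(keys):
    """Return (feedback, result) for the keys typed after the shortcut.

    feedback lists the preview text shown after each accepted hex digit.
    result is the committed character, or None if the entry was cancelled,
    left unfinished, or named an invalid code point.
    """
    digits = ""
    feedback = []
    for key in keys:
        if key == " ":  # Space commits the entry
            if not digits:
                return feedback, None
            value = int(digits, 16)
            # Reject values beyond Unicode and the surrogate range.
            if value > 0x10FFFF or 0xD800 <= value <= 0xDFFF:
                return feedback, None
            return feedback, chr(value)
        if len(key) == 1 and key in string.hexdigits:
            digits += key
            feedback.append("u" + digits)
        else:
            return feedback, None  # assumption: any other key cancels
    return feedback, None  # ran out of keys before Space was pressed
```

Typing 2, 3, 1, 8 and then Space yields the previews "u2", "u23", "u231", "u2318" and finally the character U+2318.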

What about other GUI toolkits such as Qt? Does anyone know what they offer?

Simos, from a previous email you contributed to the thread, you seem
to know about Xorg compose key sequences. Do you have any helpful
links to information on that as well?

Simos has given us a lot of information that will be useful for
emulating extended keyboard input for many platforms, but so far we
don’t have a lot of information about how to tie into these existing
systems and benefit from them without having to emulate them from
scratch. We need more technical details – even if those details are
simply that there is no API that can feasibly be used by SDL – on the
APIs of these extended input methods. As I’ve expressed before, I
think emulation should be the last line of defense against a situation
where no extended input is supported, but if we are to emulate, Simos
has given us a lot of great material for doing so! :)

–
http://codebad.com/

Simos_Xenitellis · June 1, 2009, 7:52am

Very interesting discussion! Here I go!

As far as I understand, SDL does not provide GUI elements such as text
input widgets.

I agree with others who have said that SDL should remain this way.

Can we look at how the dialogs used by existing IMEs interact with the
graphics environments of typical SDL applications? SDL may be used in
windowed, and full screen modes. It can be used with single and double
buffering, and with its 2D API and OpenGL rendering. The IME must be
able to be used, or ignored, in all of those graphics environments.

Being a person whose native and supplementary languages are pretty
well accommodated by a US English keyboard layout (in most gaming
environments no one is offended if you omit details like accent marks)
I am pretty ignorant of any IME other than compose/dead keys like
holding ALT and punching in codes on MSWindows, pressing
Control+Shift+U and then punching in codes on Debian, or holding the
Option/Alt key on Mac OS X.

Having said that, I have this to say: I am not even certain where the
realm of things called IME begins and ends. So far it seems that it
includes means of inputting text defined by your “desktop environment”
(for lack of a better term – please don’t say OS, though) as well
as third-party applications for intercepting the user’s input and
transforming it into different input for other applications. Is this
right? Does anything that is called IME fall outside of this (poor)
definition?

Moving along…

When visual feedback is required while pre-editing, things get
complicated and SDL has to take
over the input and display of text.

The problem is that each input method has its
own manner of showing candidates, and each application has its own style
of showing preedits.

We must verify whether
or not the dialogs used by existing IMEs can be made visible on top of
the different SDL graphic modes. Which raises the question: do we
always want to do “on the spot” IME dialogs, which are more work for
the programmer, or do we want to allow the use of the IME’s dialog box
when it is possible?

I think we need a comprehensive discussion of the input methods SDL
may support. (I don’t use the term “IME” since I am still not entirely
sure what that term encompasses.) It is critical to compare the APIs
and capabilities of these different input methods. I expect, for
instance, that the stylus-driven input interfaces common on portables
have no problem overlaying the input editor GUI on top of the
application. (That may or may not be accurate.) I’m not so sure,
however, that Mac OS X, for instance, can accommodate OpenGL
applications with a text edit field (and I’m quite certain that most
developers would not want to be constrained to using an Apple text
widget for i18n-capable text input.) We need to know what these
various platforms and IME applications have in common, and the technical
details of how they interface with users’ applications, before we can
begin deciding upon implementation details and how much implementation
belongs in SDL proper or in the application / satellite libraries.

On top of this, I think it would be beneficial to have an SDL sample (something
to reside in SDL/test/) that implements a rudimentary text input box.
While looking for examples, I found one based on SDL_Input,
though an example based on pure SDL would be preferable.
Then it would be easy to ask contributors to try out the demos on their
systems and report back the results.
Is there such an example available that can be contributed?

Where platforms fall short of providing good integration with SDL
applications, we could consider platform emulation to some degree. I
give an example of emulating Mac OS X compose keys further
down in this email.

Just to give folks something to think about and throw rocks at I’m
going to propose an API

SelectIME(optional standard locale name) - This function tells SDL
that we want to use an IME and tells SDL what the locale is. If the
locale is null or empty then SDL will use the default system locale,
otherwise it will use the provided locale. If that locale is not
supported it returns false, otherwise it returns true.

Locale (à la setlocale() et al, see
http://linux.die.net/man/3/setlocale) seems to me to be an issue best
handled separately. For input, locale really only matters for
formatting expectations of things like numbers (is one thousand
1,000.00 or 1.000,00 for example) and phone numbers (1-412-555-1212 or
1.412.555.1212) and I think is outside the scope of this discussion.

(Did you maybe mean character encoding?)

Locale would be important to identify the encoding of the typed text.
There could be an issue with Linux, which expects UTF-8, Windows, which
expects UTF-16,
and other smaller platforms that may still expect non-Unicode encodings.
Should SDL produce the default platform-specific encoding and then expect
the application to convert further as needed?

I’m going to assume that turning on IME support causes SDL to start
sending IME-specific events to the event queue. To handle these events,
let’s provide a special IME filter function.

Is it possible to fake complex IME if you (the SDL layer) can make
certain text-edit-ish assumptions about the application mode?
(Provided, obviously, by the application letting SDL know just that.)
For instance, if I am on Mac OS X, and I press Alt+E to compose an
acute accent, and then press “e” to make “é,” would it be good for SDL
to fake this by first sending an accent character, and
then, once I hit “e,” sending two events: backspace, then “é”?
The application could do things like kill or restart the input session
if it was closing the target of the user’s input, or if the user
clicks somewhere else to move the cursor.

That would be indeed something to write an example program about.
I do not know how feasible it would be.

Simos

In my upcoming emails, I will give a brief overview of how input methods
work for CJK (Chinese, Japanese, Korean) languages and detailed discussion
on how input methods work on Mac OS X, finally, I’ll summarize the current
status of SDL Unicode text input support.

I look forward to your future posts!

–
http://codebad.com/

Simos_Xenitellis · June 1, 2009, 8:28am

I am replying to the second part of your e-mail here,
which focuses on the “numeric catalog-number method” of entering
Unicode characters.

In my reply I will try to bring things into focus as I outlined in my
first email by discussing the capabilities and limitations of these
input methods and the technical details of APIs they are accessible
to.

Being a person whose native and supplementary languages are pretty
well accommodated by a US English keyboard layout (in most gaming
environments no one is offended if you omit details like accent marks)
I am pretty ignorant of any IME other than compose/dead keys like
holding ALT and punching in codes on MSWindows, pressing
Control+Shift+U and then punching in codes on Debian, or holding the
Option/Alt key on Mac OS X.

The Alt+number, Ctrl+Shift+U and the OS/X variation are extra
features of the IME
for entering a wider variety of Unicode characters.
For more details, see Unicode input - Wikipedia

Oh, I was actually unaware that Mac OS X had an analog to the numeric
code entry method I described for MSWindows and Debian (or GTK+,
apparently). I was only aware of things like sequencing Alt+Quote, ‘o’
to get ‘ö’ (‘O’ avec le tréma).

If no one objects, I think we can differentiate these types of symbolic
extended character entry methods from the numeric catalog-number
methods mentioned just below by referring to them as “mnemonic” entry
methods.

Oh, in fact, that is what others are calling them as well, as
described in the Wikipedia article you referred to, and an IETF memo
linked therein (RFC 1345 - Character Mnemonics and Character Sets)

I suggested a method in a previous email for generating visual
feedback for mnemonic entry, which I believed would take minimal
effort to support in existing or future applications: the application
tells SDL when the user is entering text, and SDL then generates
false keyboard input events that closely approximate the desired
visual feedback. Example: composing O+tréma (ö) by pressing Alt+Quote,
“o” would first send the application a “naked tréma” character; then,
once the user finalizes the mnemonic by pressing “o,” the application
would receive a backspace keystroke (erasing the “naked tréma”) and
then the full character O+tréma (ö). Comments?

I looked more carefully at the SDL_Input test application, exampleCodeSDL_Input.
It implements Backspace in the text input field.
What we need to add is an effect while typing, so that intermediate characters
are marked (underlined, for example) to show that, depending on
what is pressed next, the screen output will change to the
final character.
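
As a rough illustration of the underlining idea, here is a hypothetical text field model, in Python for brevity. It keeps committed text and the pending intermediate characters separately, so the pending part can be drawn differently. The class and method names are assumptions and are not taken from SDL_Input or SDL.

```python
class TextField:
    """Committed text plus a pending (pre-edit) segment, kept separately."""

    def __init__(self):
        self.committed = ""
        self.preedit = ""

    def set_preedit(self, text):
        """Replace the pending segment, e.g. with a dead-key accent."""
        self.preedit = text

    def commit(self, text):
        """Append final text and clear whatever was pending."""
        self.committed += text
        self.preedit = ""

    def backspace(self):
        """Delete from the pending segment first, then from committed text."""
        if self.preedit:
            self.preedit = self.preedit[:-1]
        else:
            self.committed = self.committed[:-1]

    def render(self):
        """Return two text rows: the content, and a marker row under the pre-edit."""
        content = self.committed + self.preedit
        marker = " " * len(self.committed) + "~" * len(self.preedit)
        return content, marker.rstrip()
```

After committing "caf" and setting the pre-edit to an acute accent, `render()` marks only the fourth column. Committing "é" then clears the marker row. A real renderer would draw an underline instead of the "~" row.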

(For posterity: that mnemonic is used on Mac OS X, and may be defined
in the aforementioned IETF memo RFC-1345.)

Ctrl+Shift+U is a GTK+ feature that requires typing the
Unicode code point value
of a character (as in 2318 = ⌘). This functionality conforms to ISO 14755,
http://www.cl.cam.ac.uk/~mgk25/volatile/ISO-14755.pdf

Hey! An actual standard! This won’t be so hard if more answers are
like this one!

My gut tells me that it might be advisable to break down a standard
such as this into two parts: one appropriate for implementation within
SDL-proper, and one appropriate for implementation within applications
and/or an SDL support library. Comments?

All three (Windows, Linux, OS/X) support the same
functionality, with a different
key shortcut to start the compose sequence. Therefore, there is
common functionality
that can be shared.

The differences are:

- the shortcut that begins the “numeric catalog-number method”
- the visual effect that one sees while composing the numeric value

In Windows there are quite a few variations. The universal Unicode
style as in GTK+ requires
a registry setting in order to be enabled.

Should SDL query the MSWindows “Registry” to find these settings and
try imitate

For simplicity, it would be OK if SDL simply supported it anyway.
The only issue I can think of is that the user may inadvertently
type the key sequence that starts numeric entry, for example
by pressing Alt and ‘+’ while playing a game.

For the other legacy variations, see
How to enter Unicode characters in Microsoft Windows

Even though this does not say what standard(s) these MSWindows and
MSWindows application behaviors may follow, it is very clear and
concise! I’d call this one a win!

I suspect with OS/X it should be similar to GTK+/Linux.

In what way do you suspect they should be similar?

I have not used those shortcuts in practice on OS/X or Windows, so I do not know
the visual effect when one is entering the numeric value.
I assume that in Windows there is no visual feedback.

(As a side note: Is anyone familiar with how GTK+ operates under
MSWindows? I assume GTK+ applications under MSWindows can cooperate
with all the entry methods described at the fileformat.info given
above.)

I think that Win32 GTK+ offers the Ctrl+Shift+U functionality, while it
does not offer
the full set of shortcuts described at fileformat.info. The last part is
just me reading
the bug reports at bugzilla.gnome.org, without actually trying it out.

In the case of Windows, the character appears as soon as you finish
typing the numbers.
In Linux, this functionality is provided by the GTK+ library, so any
application linked to GTK+
has the Ctrl+Shift+U shortcut. You have visual feedback as you type
the hex digits;
you see them being typed. When you press Space, the hex digits are
replaced with the
resulting Unicode character.

What about other GUI toolkits such as Qt? Does anyone know what they offer?

My current understanding (Qt 3.x) is that Qt does not offer an input method
of its own and uses what is provided by the system. I do not know whether
Qt 4.x changes this and offers an input method. The “numeric catalog-number
method” is simply a value-added feature when an input method is in place.

Simos

Simos_Xenitellis · June 1, 2009, 9:32am

I am replying to the last part of the questions on input details about
Xorg compose sequences.

Simos, from a previous email you contributed to the thread, you seem
to know about Xorg compose key sequences. Do you have any helpful
links to information on that as well?

In XOrg, the compose sequences are listed in
http://cgit.freedesktop.org/xorg/lib/libX11/tree/nls/en_US.UTF-8/Compose.pre
(warning: big text file)

There are two general types of compose sequences (all mnemonic): the dead key
sequences, which start with a dead key, and the compose sequences, which
start with the Compose key (Multi_key).
For the latter there is no default key, and the user has to configure
one when enabling the feature.
This is really bad in usability terms. A common choice is the
Right-Win key.
You get mnemonics like RightWin + ( + 1 + 0 + ), which produces ⑩
(the Unicode character for 10 in a circle).
In effect, the RightWin key becomes an additional global ‘dead key’
that one can use
to start selected compose sequences.

For more, see

I have to admit that the ‘extended keyboard input’ is a niche that
(sadly) may cover a small percentage
of our users. It is indeed a small percentage for XOrg and GTK+ users.

Simos has given us a lot of information that will be useful for
emulating extended keyboard input for many platforms, but so far we
don’t have a lot of information about how to tie into these existing
systems and benefit from them without having to emulate them from
scratch. We need more technical details – even if those details are
simply that there is no API that can feasibly be used by SDL – on the
APIs of these extended input methods. As I’ve expressed before, I
think emulation should be the last line of defense against a situation
where no extended input is supported, but if we are to emulate, Simos
has given us a lot of great material for doing so!

When emulating, in X.Org it is rather straightforward because all
compose sequences
reside in the same file, Compose.pre
(nls/en_US.UTF-8/Compose.pre in the libX11 Git repository).
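
To give an idea of what emulation from that file would involve, here is a minimal sketch in Python (for brevity) that reads rules of the form `<key> <key> ... : "result"` into a lookup table. The function name is hypothetical, and the sample lines in the usage note are modeled on the file's format rather than copied from it. Backslash escapes inside the quoted result are left undecoded.

```python
import re

# One Compose rule: key names in angle brackets, a colon, the quoted result.
RULE = re.compile(r'^\s*((?:<\w+>\s*)+):\s*"((?:[^"\\]|\\.)*)"')


def parse_compose(lines):
    """Map tuples of key names to result strings.

    Comment lines, blank lines and anything else that does not look like
    a rule are skipped.
    """
    table = {}
    for line in lines:
        match = RULE.match(line)
        if match:
            keys = tuple(re.findall(r"<(\w+)>", match.group(1)))
            table[keys] = match.group(2)
    return table
```

For example, given the lines `<dead_acute> <e> : "é" eacute` and `<Multi_key> <parenleft> <1> <0> <parenright> : "⑩" U2469`, the table maps `("dead_acute", "e")` to "é" and the five-key Multi_key sequence to "⑩". An emulation layer would also need prefix matching, to know when a partial sequence can still be completed.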

There is actually a complication in Windows in the sense that it does
not offer the same list
of dead key compose sequences.
While in Linux you can stack dead keys in order to produce characters like
? (uses three dead keys, stacked),
in Windows, the layout offers only a special ‘compound dead key’, so that one
presses deadkey+letter to get the result.
That is, there is a special dead key for the set of [dasia, grave,
ypogegrammeni] (the three squiggles),
which means that for GTK+ to work in Win32, it needs to define and
keep an extra list of ‘compound dead keys’.
AFAIK, such a list does not exist and one would need to extract the
information by analysing
the Win32 keyboard layout files, a non-trivial task.

The reason why Win32 cannot stack dead keys could be either

- a design deficiency, or
- a usability choice, so that fewer keys are typed to produce the resulting character.

A side effect of not being able to stack dead keys is that the keyboard layouts
tend to have too many useful keys used up as dead keys.

Coming back to SDL: emulating the host input method gives finer control
over functionality and features, at the cost of increased
complexity and a greater
chance of missing something in some obscure layout.

Simos

slouken · June 3, 2009, 4:56am

Locale would be important to identify the encoding of the typed text.

There could be an issue with Linux, which expects UTF-8, Windows, which
expects UTF-16,
and other smaller platforms that may still expect non-Unicode encodings.
Should SDL produce the default platform-specific encoding and then expect
the application to convert further as needed?

SDL should use UTF-8 encoding for all text in and out of the API. There are
already some text encoding/decoding functions available in SDL_stdinc.h.
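
To make the UTF-8 rule concrete, here is a tiny sketch, in Python for brevity, of normalising platform-encoded input to UTF-8 before it reaches the application. The function name is hypothetical. On the C side I believe SDL_iconv_string in SDL_stdinc.h plays this role, but check the header for the exact signature.

```python
def to_utf8(raw, platform_encoding):
    """Re-encode bytes from a platform encoding into UTF-8 bytes."""
    return raw.decode(platform_encoding).encode("utf-8")
```

With this, UTF-16 input from Windows and ISO-8859-7 input from a legacy Greek setup both reach the application as the same UTF-8 bytes for the same character.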

See ya,
–Sam