Input Method Editors for Latin based character set

algrass
Posts: 114
Joined: Mon Dec 10, 2007 9:02 pm
Location: West Midlands, UK

Re: Input Method Editors for Latin based character set

Post by algrass » Fri Jul 13, 2012 3:47 pm

Bhikkhu wrote:
The ligature features use many -> one substitutions, so e.g. f + f -> ff ligature.
Stylistic alternates (salt) can use one -> one substitutions, but they require manual input, which is inefficient, but very flexible as the user can decide which letter form to insert. One glyph has a lookup table for several other forms to substitute for that glyph — the user has to choose from a dialogue presented by the application, which Stylistic Set to use.
Thanks again for your inputs.
Your answer on many -> one substitution is clear, and that would still be OK.
The second answer is limited to one -> one substitution, but if it requires manual input then I think it may not be as useful, especially if that input is something like Alt + 5 digits.

To answer William's and Vanisaac's question, let me explain how the Atok IME pad works. The Japanese language uses a syllabic "alphabet" consisting of 228 characters plus some 2100 other ideograms derived from the much larger set of Chinese ideograms. In all, therefore, to write in Japanese a user needs to select characters out of some 2350 glyphs. The way they have solved the problem is to use the Latin alphabet, which they call Romaji, to enter the sound of the Japanese word spelt in Romaji letters. The Atok IME pad then transliterates transparently and pops up a dialogue box listing the words that correspond to the phonetic input. The candidates shown in the pop-up window can be few or many, depending on the word that was entered. The user chooses the right one with the cursor keys and then proceeds to type the next text. There is no need to enter special codes such as %%, etc. You just look at the dialogue box and choose one of the candidate glyphs shown, with the mouse, cursor keys or shortcuts.
The process is very user friendly and effective. It does slow down the typing speed, but it works well and suits all kinds of users, whether they are PC literate or not.

In the case of adopting such a method for Latin character fonts the job is much simpler, as here we do not need transliteration as such, only the opportunity to flag, in a pop-up dialogue box, alternative styles of letters whenever one of the basic letters is typed. For example, if I were typing the word "corrugated", then upon typing the letter "o" after the first "c" the IME pad would pop up the dialogue box, expecting that the next letter to be entered could well be an r, a v, or a w, as in such cases (provided that the font is cursive) those letters would need a short leading stroke. So the user can choose the most appropriate style on the fly and continue typing undisturbed. I have been using the Atok bar for a while and I can assure you that it is painless and very user friendly.
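
To make the pop-up idea concrete, here is a minimal sketch in Python. It only models the decision of which forms to offer; the trigger table, the alternate glyph names such as `r.lead`, and the function name `candidates` are all hypothetical, invented for illustration. Hooking into the keyboard and drawing the dialogue box are not shown.

```python
# Hypothetical table: after "o", the letters r, v and w have an
# alternate form with a short leading stroke (glyph names are made up).
TRIGGERS = {"o": ("r", "v", "w")}
ALTERNATES = {
    "r": ["r", "r.lead"],
    "v": ["v", "v.lead"],
    "w": ["w", "w.lead"],
}


def candidates(previous: str, typed: str) -> list[str]:
    """Return the forms to offer for `typed`, given the letter before it.

    A list with a single item means no pop-up is needed.
    """
    if typed in TRIGGERS.get(previous, ()):
        return ALTERNATES[typed]
    return [typed]
```

With this table, typing "r" straight after "o" would offer two forms, while typing "o" after "c" would offer only the plain letter and so would not interrupt the user.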

I estimate that making a cursive script such as the one I am completing truly cursive, following the rule that the pen is never lifted from the paper, essentially requires another set of 52 glyphs. Then obviously one can add some alternative styles, for example a minuscule "z" without descender if the main one was designed with a descender, and so on. Clearly, if such a system were available, one would take advantage of this IME to add several alternative styles as well.

According to my understanding, this Romaji transliteration method is also used for the Chinese language, where they have more than 50000 glyphs. I believe it may also apply to Korean and other similar languages.

I trust this clarifies how the Atok pad works.

vanisaac
Posts: 337
Joined: Sun Mar 30, 2003 1:33 pm
Location: Washington State, USA

Re: Input Method Editors for Latin based character set

Post by vanisaac » Fri Jul 13, 2012 11:49 pm

If it's truly contextual, you can actually hard-code the rules into OpenType/Graphite, and it will automatically select the right contextual glyphs. I think you are becoming the embodiment of the old axiom of every problem looking like a nail. There are lots of technical answers to alternate glyph selection already, and Unicode and smart font technologies were designed to accommodate a lot of different options. So let's clearly define the problem, and then we can move forward on a solution.
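
As a rough illustration of what "contextual" means here, the following Python sketch simulates one such rule outside any font. It is not OpenType or Graphite code; the rule itself (r, v and w take a short-lead alternate after "o") is taken from the example earlier in the thread, and the glyph suffix `.short` and the function name `shape` are assumptions made for illustration.

```python
# Hypothetical contextual rule, modelled on the "corrugated" example:
# a letter in NEEDS_SHORT_LEAD is swapped for its alternate form when
# it directly follows a letter in HIGH_EXIT.
HIGH_EXIT = {"o"}
NEEDS_SHORT_LEAD = {"r", "v", "w"}


def shape(text: str) -> list[str]:
    """Map typed letters to glyph names, applying the contextual rule."""
    glyphs = []
    previous = ""
    for ch in text:
        if previous in HIGH_EXIT and ch in NEEDS_SHORT_LEAD:
            glyphs.append(ch + ".short")  # made-up alternate glyph name
        else:
            glyphs.append(ch)
        previous = ch
    return glyphs
```

The point of the sketch is that the user types normally and the substitution happens without any prompt, which is how a smart font would behave once the rules are built into it.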

PS, I know that I am intimately familiar with the vagaries of oriental orthography and computer input, and my understanding is that the others involved are also quite familiar with those technologies and their possibilities as well.

vanisaac
Posts: 337
Joined: Sun Mar 30, 2003 1:33 pm
Location: Washington State, USA

Re: Input Method Editors for Latin based character set

Post by vanisaac » Sat Jul 14, 2012 12:19 am

algrass wrote: According to my understanding, this Romaji transliteration method is also used for the Chinese language, where they have more than 50000 glyphs. I believe it may also apply to Korean and other similar languages.

I trust this clarifies how the Atok pad works.
Actually, there are several Chinese input methods: some use Pinyin (the Chinese equivalent of Romaji), some use Bopomofo (the Chinese equivalent of kana), and some, like Wubi, ZhengMa, and Cangjie, are radical/shape/stroke based. See http://en.wikipedia.org/wiki/Chinese_in ... _computers. Atok is really just a single IME in a stand-alone, platform-independent app, rather than a full IME format. It actually contains a bunch of additional features like date formats, handwriting recognition, kanji search, etc. Korean usually uses regular keyboards, as two of the three Unicode Korean text models (don't ask) allow for syllables to be composed, so Hangul encoding and input is simple alphabetic. Amharic (Ethiopian) is actually more commonly input with an IME than Korean.
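
To show why composed Hangul needs no candidate list, here is a small Python sketch of the standard Unicode arithmetic for precomposed syllables: 19 leading consonants, 21 vowels and 28 trailing positions (including "none"), laid out in order from U+AC00. The function name `compose` and its index-based interface are just for illustration; a real keyboard driver would first map key presses to these indices.

```python
HANGUL_BASE = 0xAC00   # first precomposed syllable, U+AC00
VOWEL_COUNT = 21
TAIL_COUNT = 28        # 27 trailing consonants plus "no trailing consonant"


def compose(lead: int, vowel: int, tail: int = 0) -> str:
    """Compose one Hangul syllable from jamo indices in Unicode order.

    lead is 0..18, vowel is 0..20, tail is 0..27 (0 means none).
    """
    if not (0 <= lead < 19 and 0 <= vowel < VOWEL_COUNT and 0 <= tail < TAIL_COUNT):
        raise ValueError("jamo index out of range")
    return chr(HANGUL_BASE + (lead * VOWEL_COUNT + vowel) * TAIL_COUNT + tail)
```

Because each syllable follows directly from the letters typed, there is nothing for the user to choose, unlike the Japanese and Chinese cases.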

algrass
Posts: 114
Joined: Mon Dec 10, 2007 9:02 pm
Location: West Midlands, UK

Re: Input Method Editors for Latin based character set

Post by algrass » Sat Jul 14, 2012 1:53 pm

Vanisaac wrote:
If it's truly contextual, you can actually hard-code the rules into OpenType/Graphite, and it will automatically select the right contextual glyphs. I think you are becoming the embodiment of the old axiom of every problem looking like a nail. There are lots of technical answers to alternate glyph selection already, and Unicode and smart font technologies were designed to accommodate a lot of different options. So let's clearly define the problem, and then we can move forward on a solution.
Well, I thought the question had been well understood by now: how to enter a Latin based extended character set, inclusive of alternate styles of glyphs, in a user friendly way. You say it can be done with OTF/Graphite. I do not understand the term graphite. As a scientist I am familiar with graphite for nuclear reactors, now obsolete, or the new material graphene, but graphite for typography is kind of double Dutch to me. The quest for the Higgs boson is more familiar to me, and I don't believe that it has yet been found despite the recent announcements. Yet that would not preclude me from talking to someone at a basic level even if he or she were not skilled in the art or science of physics.
Going back to the IME: if such a method exists, I would be grateful if someone could let me know so I can buy it and start using my Latin based character set. But my understanding is that such a method does not exist as a product, and that is why this discussion was started.
Cheers all the same.

Bhikkhu Pesala
Top Typographer
Posts: 8345
Joined: Tue Oct 29, 2002 5:28 am
Location: Seven Kings, London UK

Re: Input Method Editors for Latin based character set

Post by Bhikkhu Pesala » Sat Jul 14, 2012 7:25 pm

algrass wrote:Well, I thought the question had been well understood by now: how to enter a Latin based extended character set inclusive of alternate styles of glyphs in a user friendly way.
I thought the answer was well understood by now — there isn't a user-friendly way if the user has to stop and choose the alternate glyphs, or stop and think which shortcut to use. Users should just be able to type normally and the smart font technology should decide which letter form is best depending on the preceding letter(s).

Graphite (SIL)

algrass
Posts: 114
Joined: Mon Dec 10, 2007 9:02 pm
Location: West Midlands, UK

Re: Input Method Editors for Latin based character set

Post by algrass » Sun Jul 15, 2012 7:19 am

Thanks for the link to Graphite (SIL), now I know what the generic term Graphite in the previous thread meant. That stub was interesting to read but I could never have found it as in my mental domain the word "graphite" was not linked to an alternative software programming language for font development.

I agree with your comment that a user friendly way should not involve the user in such choices. However, from what I have been able to assess by reading some literature on OT font development by Adobe, and having bought PagePlus X6 over the last two days specifically to assess OTF (as Word 2010 is so primitive), I conclude that a perfect solution is not available even with OTF, as the rules for "advanced" cursive writing are too many, above all because most of them are dictated by personal choices made by the user at the time of writing. It is for this reason that I thought a lookup table, such as the existing technology of the Atok bar or even the MS IME pad, might be adequate if it could be adapted to a Latin character set. Though I do agree that the letters o, r, v, and w occur so frequently that the dialogue box would pop up often enough to perhaps become annoying for the user.

Thanks for the discussion. I will now consider the matter resolved and closed.

I have some comments on the Autokerning features of FC. In which thread should I address those?

William
Top Typographer
Posts: 1997
Joined: Tue Sep 14, 2004 6:41 pm
Location: Worcestershire, England

Re: Input Method Editors for Latin based character set

Post by William » Sun Jul 15, 2012 7:34 am

vanisaac wrote: PS, I know that I am intimately familiar with the vagaries of oriental orthography and computer input, and my understanding is that the others involved are also quite familiar with those technologies and their possibilities as well.
Well, I know only a little about oriental orthography. I know some things about computer input, but not all. I knew nothing about the use of an Input Method Editor for entering oriental characters and have learned from this thread.

Also, I do not remember having previously known the phrase "Spencerian loops" and so I looked it up and learned much about handwriting style.

I am simply participating in this thread, answering questions when I am able to do so, learning some things that I did not know before and interested in trying to develop a solution where none exists at the present time.
algrass wrote: Going back to the IME if such a method exists I would be grateful if someone can let me know so I can buy it and start using my Latin based character set. But my understanding is that such a method does not exist as a product and that is why this discussion was started.
Cheers all the same.
I do not know if such a product exists. I feel that it would be a very useful product to have available if it were usable with many application programs. What I mean is a system that sits between the keyboard and the application program so that, say, it would be usable with both Microsoft WordPad and Serif PagePlus without either of those programs themselves being modified. That could perhaps be done by having the new program substitute some sequences with Private Use Area characters automatically and having some characters substitutable by a Private Use Area character as a user choice: the rules for substitution being in a text file, so the rules could be changed by an end user fairly easily if so desired.
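
A rough Python sketch of the automatic part of this idea follows. It covers only the text substitution, not the hard part of sitting between the keyboard and the application. The rule-file format (`sequence -> hex code point`), the sample sequences and the function names `parse_rules` and `substitute` are all assumptions made up for illustration; the code points are simply taken from the Private Use Area, which starts at U+E000.

```python
# Hypothetical rule file: one "sequence -> PUA code point (hex)" per line.
RULES_TEXT = """
# sequence -> Private Use Area code point
ct -> E000
st -> E001
ffi -> E002
"""


def parse_rules(text: str) -> dict[str, str]:
    """Read the rules, skipping blank lines and lines starting with '#'."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sequence, _, codepoint = line.partition("->")
        rules[sequence.strip()] = chr(int(codepoint.strip(), 16))
    return rules


def substitute(text: str, rules: dict[str, str]) -> str:
    """Replace each matching sequence, trying longer sequences first."""
    keys = sorted(rules, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(rules[key])
                i += len(key)
                break
        else:
            # No rule matched at this position: keep the character as typed.
            out.append(text[i])
            i += 1
    return "".join(out)
```

Since the rules live in plain text, an end user could add or change a line without touching the program, which is the property described above. The font in use would of course need glyphs at the chosen Private Use Area code points.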

My %%t idea may not be the best way to go, but it might possibly be the best that can be achieved without a major redesign of the operating system and the font system.

I did some programming some years ago, including FORTRAN, Pascal, C and Visual Basic, but I have no idea how one gets a program to sit between the keyboard input and an application program in a Windows system.

However, it may be that the discussion in this thread leads to a solution being produced.

William Overington

15 July 2012

William
Top Typographer
Posts: 1997
Joined: Tue Sep 14, 2004 6:41 pm
Location: Worcestershire, England

Re: Input Method Editors for Latin based character set

Post by William » Sun Jul 15, 2012 7:42 am

algrass wrote: Thanks for the discussion. I will now consider the matter resolved and closed.
Oh, I was preparing my post that is next after your post and had not seen your post when I posted mine.

I was hoping that the discussion was getting towards trying to get something produced as it would be a useful facility.
algrass wrote: I have some comments on the Autokerning features of FC. In which thread should I address those?
I suggest starting a new thread in the FontCreator - Discussion section of the forum.

William Overington

15 July 2012

algrass
Posts: 114
Joined: Mon Dec 10, 2007 9:02 pm
Location: West Midlands, UK

Re: Input Method Editors for Latin based character set

Post by algrass » Mon Jul 16, 2012 6:08 am

William,
The IME pad works transparently and, in principle, does not affect any word processor you may use. Generally it is optimised to work with MS Word; the Atok IME pad works with both Word and the word processing program made by Just Systems, a Japanese company that specializes in this field for the Japanese market. The program works as an interactive layer and detects what the user has pressed on the keyboard. It then transliterates (not translates) the entries typed in Romaji (i.e. using the Roman alphabet) into the corresponding characters of their language. This is what is done with Japanese. Most other Asian languages, such as Hindi or even Arabic, have to resort to something similar, as their character sets are so much bigger than our Roman set. Interestingly, despite much effort to use native language input on computers, the Roman alphabet is de facto the preferred method. For example, in Japan all laptops have a dedicated Japanese keyboard input limited to the Hiragana syllabary, which can be converted to Katakana at the press of a function key (F7, if my memory does not fail me). Yet most users don't bother with that native entry and prefer Romaji entry. In fact I was in Japan just at the time of the launch of the new MacBook Retina, and I was considering buying it, as the Japanese Hiragana symbols on the keys can be totally ignored and one can just use the QWERTY symbols. Of course these IME pads provide other facilities, such as character recognition (essential with Chinese and Japanese ideograms), dictionaries, spellers etc., but for Latin text these would not be necessary, as they are already provided by the main application, such as Word or PagePlus.
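
For anyone curious what the transliteration layer does, here is a toy sketch in Python. It is in no way how Atok is implemented: the Romaji table holds only a handful of syllables chosen for the example, and the function names are made up. The Hiragana to Katakana step relies on the fact that the two blocks are laid out in parallel in Unicode, 0x60 code points apart.

```python
# Tiny, deliberately incomplete Romaji table (illustration only).
ROMAJI = {"ka": "か", "na": "な", "ni": "に", "ho": "ほ", "mi": "み", "n": "ん"}


def to_hiragana(romaji: str) -> str:
    """Greedy longest-match transliteration; unknown letters pass through."""
    keys = sorted(ROMAJI, key=len, reverse=True)
    out = []
    i = 0
    while i < len(romaji):
        for key in keys:
            if romaji.startswith(key, i):
                out.append(ROMAJI[key])
                i += len(key)
                break
        else:
            out.append(romaji[i])
            i += 1
    return "".join(out)


def to_katakana(text: str) -> str:
    """Shift Hiragana (U+3041..U+3096) to the matching Katakana."""
    return "".join(
        chr(ord(ch) + 0x60) if "\u3041" <= ch <= "\u3096" else ch
        for ch in text
    )
```

A real IME adds the step this sketch leaves out, namely looking up the kana reading in a dictionary and offering the matching ideograms for the user to pick.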

Now that you have looked further into the cursive script, which we call copperplate here in the UK, inclusive of the fancy Spencerian loops, you will realise that the choice of which one to use is largely a gut feeling, decided at the time the user is writing or typing. That is why I think canonical rules are probably not the best solution, and I favour a pop-up dialogue box, not very intrusive, that just flashes up the alternative characters to choose from. Incidentally, if you have not realised this yet, this kind of copperplate script requires a large number of kerning pairs. In my font I have reached 2440 pairs and I still have to include the vowels with diacritics.

I am contacting the company Just Systems in Japan to ask them if they are interested in adapting their IME for my cursive script and offer it to them as collateral so to speak. I am resolved to see this job being done by someone and probably the commercial interest might be a deciding factor.
All the best.
