Input Method Editors for Latin based character set

algrass
Posts: 114
Joined: Mon Dec 10, 2007 9:02 pm
Location: West Midlands, UK

Input Method Editors for Latin based character set

Post by algrass » Mon Jul 09, 2012 2:16 pm

I would like to propose a discussion group on transliteration Input Method Editors for entering alternative styles of Latin characters into a Latin based character set. It is often desirable to have alternative styles of some characters and to be able to enter them into an application as easily as typing the normal characters. This technique of transliteration is well known and established as the routine input method for languages in Central and Far East Asia, as well as for Arabic. My interest, though, is to adapt such a transliteration Input Method Editor for use specifically with Latin based character fonts.
Is anyone interested in joining a new group to promote a solution to this topic?

Known transliteration Input Editors currently available:
1- Microsoft IME available for Windows XP and 7.
2- Atok IME Pad (this is commercial software for entering Kanji characters)
3- Google IME pad. Google currently offers transliteration for 20 different languages. Their program looks adaptable for Latin texts.

Bhikkhu Pesala
Top Typographer
Posts: 8345
Joined: Tue Oct 29, 2002 5:28 am
Location: Seven Kings, London UK

Re: Input Method Editors for Latin based character set

Post by Bhikkhu Pesala » Mon Jul 09, 2012 7:56 pm

I have used the Microsoft Keyboard Layout Creator to create my own customised Windows keyboard with which I can type a wide range of accented characters for European languages, Pāḷi, and Sanskrit.

Full List of Shortcuts

I suppose you could create a similar keyboard to type alternate characters in the Private Use Area, which is the right place to map alternate contextual ligatures. Such a keyboard would be hard to use, and there is no agreed standard mapping method to follow.

I am not familiar with any input method used outside of Windows on a desktop PC.
My Fonts • Reviews: MainType • Font Creator • Help • FC12 Pro + MT9.0 @ Win10 1903 build 18362.356

vanisaac
Posts: 337
Joined: Sun Mar 30, 2003 1:33 pm
Location: Washington State, USA

Re: Input Method Editors for Latin based character set

Post by vanisaac » Tue Jul 10, 2012 2:15 am

I actually have an MSKLC layout that I can use to type in any of a number of scripts: Greek, Cyrillic, Hiragana, Katakana, Runic, Ogham, Cherokee, Arabic, Hebrew, etc. The problem with any Windows IME is that they do not recognize capital letters, and I'm not even sure exactly how it works with the non-alphabetic keys. Whether these same limitations exist for the Google and Atok IMEs - and whether custom IMEs can be defined for those tools - is a technical hurdle that needs to be investigated before you can even start on this kind of project.

William
Top Typographer
Posts: 1997
Joined: Tue Sep 14, 2004 6:41 pm
Location: Worcestershire, England

Re: Input Method Editors for Latin based character set

Post by William » Tue Jul 10, 2012 7:25 am

Could you explain how this would be used please?

For example, would it be like the following, or would it be different?

Suppose that a font contains a glyph for an alternate lowercase e where the glyph is of a lowercase e with a swash flourish such as a calligrapher might use at the end of a line of text.

Suppose there is a line of poetry ending a poem.

For example, as follows.

And beyond, an apple tree

If someone wanted to key that line of poetry so that the alternate lowercase e where the glyph is of a lowercase e with a swash flourish is used at the end of the line, would that person key the text as follows?

And beyond, an apple tre%e

Is what you are wanting to develop a method so that keying that sequence would automatically produce a display using the alternate lowercase e?
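The escape-sequence idea in the question above could be pictured with the following minimal Python sketch. It assumes, purely for illustration, that the sequence %e stands for a Private Use Area code point U+E000; that code point and the names ALTERNATES and transliterate are hypothetical and are not taken from any font or IME mentioned in this thread.

```python
# Hypothetical mapping from a typed escape sequence to a Private Use Area
# code point that would hold the alternate glyph. U+E000 is an assumed value.
ALTERNATES = {"%e": "\uE000"}


def transliterate(typed: str) -> str:
    """Replace each known two-character escape sequence with its alternate."""
    out = []
    i = 0
    while i < len(typed):
        pair = typed[i:i + 2]
        if pair in ALTERNATES:
            out.append(ALTERNATES[pair])
            i += 2  # consume both the '%' and the letter
        else:
            out.append(typed[i])
            i += 1
    return "".join(out)


line = transliterate("And beyond, an apple tre%e")
```

With a font that maps a swash e to the assumed code point, the last character of line would display as the alternate glyph. Note that the stored text is then no longer the plain word "tree".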

Is that what you want to achieve, or is what you want to achieve something different?

On what platforms and with what software are you wanting the system you seek to work please?

---

An approach that I have used is to produce a pdf (portable document format) document containing one of each of the alternate character glyphs; glyphs that are mapped into the plane 0 Private Use Area. A user could then copy from that document onto the clipboard, paste into WordPad and then reformat using the font. That technique helps a person using a program such as WordPad, which does not have an Input Symbol facility, at least in the version on the computer that I am using, which runs Windows XP.

An example of such a pdf document, for my Sonnet Calligraphic font, is available in the following place.

viewtopic.php?p=15058#p15058

That technique is a little awkward to use, yet does work well without needing to add special software into the operating system. A pdf can be made to have internal hyperlinks, so an opening page with the letters of the alphabet each set up as a hyperlink to another page and a collection of pages, one for each letter, with the alternate glyphs on the pages could be produced within the pdf.

However, that method is not automatic, so an automated method would be of interest.

William Overington

10 July 2012

algrass
Posts: 114
Joined: Mon Dec 10, 2007 9:02 pm
Location: West Midlands, UK

Re: Input Method Editors for Latin based character set

Post by algrass » Wed Jul 11, 2012 12:13 pm

Thanks for your input, chaps. I will try to be a little clearer with the attached example in pdf.
You will notice that I deliberately misspelt the word "most" as "morst" so that I could show two cases. In the first, the normal "r" is used, which in most cases requires a long leading stroke, and no ligature is formed. In the second, the "r" follows the letter "o" or "v" or "w"; a short trailing stroke is available on such letters, so the letter "r" also requires a short leading stroke. For this case I created an alternative small letter "r", which I placed in the Private Use Area at hex E028. Using the Alt+ decimal code I can then call up this alternative letter, as you can see in the second part of the example.
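As a small worked example of the hex-to-decimal step mentioned above, the Python sketch below converts E028 to the decimal value that an Alt+ decimal code would use. The variable names are only illustrative, and whether a given application accepts the Alt code is not something this sketch can show.

```python
# The alternate "r" is described as sitting at hex E028.
code_point = int("E028", 16)

# Decimal value of the same code point, as typed in an Alt+ decimal code.
decimal_code = code_point

# The character itself; it only displays as the alternate "r" when a font
# that maps a glyph to U+E028 is selected.
alternate_r = chr(code_point)

# The Private Use Area of plane 0 runs from U+E000 to U+F8FF.
in_private_use_area = 0xE000 <= code_point <= 0xF8FF

print(decimal_code, in_private_use_area)  # prints: 57384 True
```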

The problem is that if the font is to be made available to the general public, then this method, like any of the known methods, is really too technical and hence unsuitable. For us it is relatively easy to use one of the methods you referred to, and others that come to mind, but for the general public this would not be acceptable. Furthermore, if a friendly transliteration IME is available, one can expand the alternative characters to any number. For example, in my character set I am providing over 52 alternative characters, even without bothering with Spencerian loops. If I were to add those as well, then the added characters would be substantially more. I trust this makes clear the objective behind my request.

I believe that the only way to go is to use an IME pad, such as the Google Android IME for Greek which works on a Windows PC as well. I looked at this download and it works very well plus it seems easily customizable. Unfortunately I do not have sufficient programming skills to undertake a dissecting of this program. But I understand that modifications are allowed by Google. After all it means creating a link to a different character set database.

Finally, I am not aware why the IME cannot handle capital letters. The Google IME can handle both small and capital letters. I am using the MS IME pad to enter kana and kanji characters and I do not see this problem, as those characters are essentially capitals: they are defined within an essentially bilinear height for horizontal, left-to-right writing, and within a fixed column width when writing vertically from right to left.

William I live in Stourbridge which used to be part of Worcestershire!

Cheers
Attachments
Example Alternative letter small R.pdf
Example of character ligature
(13.04 KiB) Downloaded 236 times

Bhikkhu Pesala
Top Typographer
Posts: 8345
Joined: Tue Oct 29, 2002 5:28 am
Location: Seven Kings, London UK

Re: Input Method Editors for Latin based character set

Post by Bhikkhu Pesala » Wed Jul 11, 2012 4:54 pm

There are several reasons why OpenType glyph substitutions are the best solution to this problem:
  1. Users don't have to learn any special shortcuts or install any special keyboards. They just type as usual.
  2. The text string in the application is not modified, so words that contain ligatures do not break spell-checking.
  3. Search and replace operations work normally.
  4. If the user changes the font to one lacking OpenType features, the text is still correct.
  5. Even in applications that don't support OpenType features, the fonts work, though no ligatures are substituted.
The downside, of course, is that there are not yet many applications that support OpenType features.

The way to go for the future is surely to promote and support application support for OpenType or other Smart Font technologies.

algrass
Posts: 114
Joined: Mon Dec 10, 2007 9:02 pm
Location: West Midlands, UK

Re: Input Method Editors for Latin based character set

Post by algrass » Thu Jul 12, 2012 7:36 am

Your suggestions are noted, but since I am not familiar with OpenType fonts I am not able to comment further. However, I spent some time yesterday reading about OTF on the fontshop.com website. It seems to me that the literature on this subject assumes that the reader knows what OTF can do for him or her, and dwells more on technical issues which are of no help to people who are still wondering whether OTF meets their objectives. Based on what I could understand, it seems to me that OTF will not be sufficiently "smart" to make some smart decisions by itself, as the built-in rules would be too many for the font to make the best choices. For example, if the letter "o" has been typed then all the following letters need to be chosen from another set of letters having short lead-strokes. Likewise, if the letters "v" or "w" are chosen during typing then the following vowels or consonants have to be chosen from another set having an even shorter leading stroke. And I can think of further additional rules for Spencerian and non-Spencerian loops. A simple dialog pop-up window would leave the choice to the user. However, I do accept that my limited knowledge on this subject might be misleading and I will let the matter rest. Please note that my comments refer to cursive scripts, otherwise called Copperplate, in which both the leading tail and the trailing tail are connected to the previous and following letters, unlike all the commercially available cursive fonts, which break these rules.
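The two rules described above (a short lead-stroke after "o", an even shorter one after "v" or "w") can be written down as a small table-driven sketch in Python. The glyph name suffixes .short and .shorter and the function name choose_glyphs are assumptions made for this illustration; they do not describe the actual font or any particular font technology.

```python
from typing import List


def choose_glyphs(word: str) -> List[str]:
    """Pick a glyph name for each letter based on the letter before it."""
    glyphs = []
    previous = ""
    for letter in word:
        if letter.isalpha() and previous == "o":
            # After "o": use the set with short lead-strokes.
            glyphs.append(letter + ".short")
        elif letter.isalpha() and previous in ("v", "w"):
            # After "v" or "w": use the set with even shorter lead-strokes.
            glyphs.append(letter + ".shorter")
        else:
            glyphs.append(letter)
        previous = letter
    return glyphs


print(choose_glyphs("forest"))
print(choose_glyphs("avril"))
```

In "forest" only the r changes, because it is the only letter that follows an o; in "avril" the r follows a v and so takes the other form. Further rules, for example for loops, would be extra branches of the same kind.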
However, I do see the commercial opportunity in having such a small program available in the market, and perhaps Erwin should take a look at it and come out with an innovative software product specific to European Latin and Greek/Cyrillic languages. I believe the market for the Asian languages is well covered by the Atok program sold by Just Systems.

I also noted that the second line in the Example I posted yesterday does not show the letter "r" linked to the previous letter "o". This was my mistake which went unnoticed. I forgot to load the latest version of my font in which I had already adjusted the kerning pair "o-r".

Best regards

Bhikkhu Pesala
Top Typographer
Posts: 8345
Joined: Tue Oct 29, 2002 5:28 am
Location: Seven Kings, London UK

Re: Input Method Editors for Latin based character set

Post by Bhikkhu Pesala » Thu Jul 12, 2012 8:11 am

algrass wrote:For example, if the letter "o" has been typed then all the following letters need to be chosen from another set of letters having short lead-strokes.
The beauty of OpenType glyph substitutions is that the user doesn't have to select which glyph to insert. A simple lookup table is used. So if a user types f, then i, the ligature fi is used. If the user backspaces, then types another f, the ff ligature is used. If they type f, f, then l, the ff ligature is exchanged for the ffl ligature, etc. It's fully automatic.

Code:

lookup ligaSub { 
    sub f f i -> ffi; 
    sub f f l -> ffl; 
    sub f f -> ff; 
    sub f i -> fi; 
}
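To illustrate how a lookup like the one above behaves, here is a rough Python simulation. It is only a sketch of the matching idea, assuming that the rules are tried in the order listed (longest sequences first) at each position; it is not how any real shaping engine is implemented, and the names LIGATURES and apply_ligatures are made up for the example.

```python
# Rules in the same order as the lookup above: longer sequences come first,
# so "f f i" is tried before "f f" and "f i".
LIGATURES = [
    (("f", "f", "i"), "ffi"),
    (("f", "f", "l"), "ffl"),
    (("f", "f"), "ff"),
    (("f", "i"), "fi"),
]


def apply_ligatures(glyphs):
    """Return a new glyph list with the first matching rule applied at each position."""
    out = []
    i = 0
    while i < len(glyphs):
        for sequence, ligature in LIGATURES:
            n = len(sequence)
            if tuple(glyphs[i:i + n]) == sequence:
                out.append(ligature)  # many glyphs -> one ligature glyph
                i += n
                break
        else:
            # No rule matched here: keep the glyph unchanged.
            out.append(glyphs[i])
            i += 1
    return out


print(apply_ligatures(list("office")))  # ['o', 'ffi', 'c', 'e']
print(apply_ligatures(list("waffle")))  # ['w', 'a', 'ffl', 'e']
```

The input list stands for the typed text, which is never modified; only the returned glyph list changes. That is why spell-checking and searching keep working on the original characters.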

William
Top Typographer
Posts: 1997
Joined: Tue Sep 14, 2004 6:41 pm
Location: Worcestershire, England

Re: Input Method Editors for Latin based character set

Post by William » Thu Jul 12, 2012 9:27 am

Does anyone know the answer to the following question please?

Can the right-hand side of a glyph substitution statement (I do not know if that is the correct parlance) in an OpenType GSUB table have more than one glyph listed within that place?

William Overington

12 July 2012

William
Top Typographer
Posts: 1997
Joined: Tue Sep 14, 2004 6:41 pm
Location: Worcestershire, England

Re: Input Method Editors for Latin based character set

Post by William » Thu Jul 12, 2012 9:57 am

algrass wrote: I also noted that the second line in the Example I posted yesterday does not show the letter "r" linked to the previous letter "o". This was my mistake which went unnoticed. I forgot to load the latest version of my font in which I had already adjusted the kerning pair "o-r".
You are welcome to post another pdf in a later post.

I studied the pdf that you posted yesterday with great interest and found the two versions of the lowercase r.

If you do try another pdf a sentence that shows both types of r would be of interest.

For example, something like the following.

She writes about the flowers in a forest.

Can anyone think of any words that contain the sequence vr that could be used in an example please?

Thus far I have thought of avril, the French word for April.

William Overington

12 July 2012

vanisaac
Posts: 337
Joined: Sun Mar 30, 2003 1:33 pm
Location: Washington State, USA

Re: Input Method Editors for Latin based character set

Post by vanisaac » Thu Jul 12, 2012 10:38 am

William wrote:Can the right-hand side of a glyph substitution statement (I do not know if that is the correct parlance) in an OpenType GSUB table have more than one glyph listed within that place?
No. Graphite does, however, support one-to-many and many-to-many substitutions. I believe that AAT also supports multiple substitutions as well. OpenType only allows one-to-one (single substitution) and many-to-one (ligature) substitutions.

William
Top Typographer
Posts: 1997
Joined: Tue Sep 14, 2004 6:41 pm
Location: Worcestershire, England

Re: Input Method Editors for Latin based character set

Post by William » Fri Jul 13, 2012 6:11 am

vanisaac wrote:
William wrote:Can the right-hand side of a glyph substitution statement (I do not know if that is the correct parlance) in an OpenType GSUB table have more than one glyph listed within that place?
No. Graphite does, however, support one-to-many and many-to-many substitutions. I believe that AAT also supports multiple substitutions as well. OpenType only allows one-to-one (single substitution) and many-to-one (ligature) substitutions.
Thank you.

That seems to imply to me that an OpenType solution will not provide the solution to the problem.

----

Thinking about the original problem, I ask the following three questions please.

----

1. If a transliteration Input Method Editor method were implemented and were in use with the font, how would the following phrase be entered?

The three great forests

The letter r is used three times. As I understand it at the present time, the r in three and the r in great would be the one design of r and the r in forest would be the other design of r, because that r follows a letter o. Is that correct?

----

2. If a transliteration Input Method Editor method were implemented and were in use with the font, if the font were to have an alternate design of a letter t with a swash ascender design, such that that t could be used wherever the person using the font felt that it would look good, how would that person indicate that the alternate t were to be used?

Would it be by using %%t or in some other way?

----

3. If a transliteration Input Method Editor method were implemented and someone obtained a copy, how would he or she use the facility on a computer?

Would it be some system level method of which one is unaware unless one uses it? Would it be usable with any application program that one is using?

----

William Overington

13 July 2012

algrass
Posts: 114
Joined: Mon Dec 10, 2007 9:02 pm
Location: West Midlands, UK

Re: Input Method Editors for Latin based character set

Post by algrass » Fri Jul 13, 2012 6:41 am

Your comments and inputs were much appreciated. I am enclosing another pdf example sheet in which I try to answer some of the questions. Judge for yourself if I grasped the issue well enough and do not forget to bear in mind that I am a "dilettante" in typography.

More specifically I think that I understood Pesala's lookup substitution list. And it seems very easy to me to implement. If it is so, it really begs the question: why isn't OT the font by default?
With reference to William's question, I believe I understood it and rephrased it in example 8 in my pdf file. The reply is meaningful and I look forward to learning what Pesala's answer will be.

If I may be allowed to use a more mathematical expression, the question is: if a character set has, say, 365 glyphs in all, of which some are standard and some are modified versions, then in the lookup ligaSub list, does the right hand side of the equality sign need to be another of those 365 glyphs, or can it be a combination of two glyphs kerned together? The same question can be extended to those few cases where three letters require some form of common ligature.

I have some other questions as whatever the answer will be I can see that OT is the way to go.
1- How do I go about converting my ttf font to OT?
2- Does Microsoft Word support OT fonts fully?
3- What other software supports OT fonts?
4- Do Apple's programs support OTF?

Regards
Attachments
Example2.pdf
New example page
(32.55 KiB) Downloaded 234 times

Bhikkhu Pesala
Top Typographer
Posts: 8345
Joined: Tue Oct 29, 2002 5:28 am
Location: Seven Kings, London UK

Re: Input Method Editors for Latin based character set

Post by Bhikkhu Pesala » Fri Jul 13, 2012 8:11 am

algrass wrote:If it is so, it really begs the question: why isn't OT the font by default?
Application support is limited, adding OpenType Features involves more work, and most fonts don't really need them.
algrass wrote:If I may be allowed to use a more mathematical expression the question is: If a character set has, say, 365 glyphs in all of which some are standard and some are modified versions, in the lookup ligaSub list, does the right hand side of the equality sign need to be another of those 365 glyphs or a combination of two glyphs kerned together? The same question can be extended to those few cases where three letters require some form of common ligatures.
The ligature features use many -> one substitutions, so e.g. f + f -> ff ligature.
Stylistic alternates (salt) can use one -> one substitutions, but they require manual input, which is inefficient but very flexible, as the user can decide which letter form to insert. One glyph has a lookup table of several other forms that can be substituted for it; the user has to choose, from a dialogue presented by the application, which Stylistic Set to use.

Code:

lookup StylisticAlternates {
    sub asterisk -> [asterisk asteriskmath uni2051 uni2042 uni203B uni273B];
    sub plus -> [plus uni2795 uni271A-uni271C uni2720];
    sub at -> [at uni2121 uni213B uni260E uni260F uni2709];
    sub copyright -> [copyright uni2117 uniF000 estimated uni2139 uni2638];
    sub multiply -> [multiply uni2715-uni2718 uni274E];
    sub dagger -> [dagger uni2620 uni2622 uni26A1 uni26B0 uni2690];
    sub daggerdbl -> [daggerdbl uni2623 uni26A0 uni2621 uni26B1 uni2691];
    sub bullet -> [bullet uni204D uni2023  uni2767 uni2712 uni261B];
}
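The choice that such a lookup leaves to the user can be pictured with this small Python sketch. It copies two of the glyph lists from the script above; the function name pick_alternate and the idea of passing the user's choice as a list index are assumptions made for illustration, not a description of how any application presents the dialogue.

```python
# Two of the alternate lists from the lookup above, keyed by the base glyph.
ALTERNATES = {
    "asterisk": ["asterisk", "asteriskmath", "uni2051", "uni2042", "uni203B", "uni273B"],
    "at": ["at", "uni2121", "uni213B", "uni260E", "uni260F", "uni2709"],
}


def pick_alternate(glyph: str, choice: int) -> str:
    """Return the alternate the user chose, or the base glyph if none applies."""
    options = ALTERNATES.get(glyph)
    if options is None or not 0 <= choice < len(options):
        # No alternates defined, or the choice is out of range: keep the glyph.
        return glyph
    return options[choice]


print(pick_alternate("asterisk", 2))  # uni2051
print(pick_alternate("a", 3))         # a
```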
[Image: Stylistic Alternates.png (24.27 KiB)]
algrass wrote:1- How do I go about converting my ttf font to OT?
Study the tutorial on Adding OpenType Features. The free program MS VOLT can also be used, but I prefer the OpenType Compiler method. Once a script has been written, it's very easy to add the same features to similar fonts, and sharing or editing scripts is straightforward.
algrass wrote:2- Does Microsoft Word support OT fonts fully?
Word 2010 supports OpenType Features. Levels of OpenType support in some well-known applications.
algrass wrote:3- What other software supports OT fonts?
Word 2010, InDesign, Serif™ PagePlus X5, PagePlus X6, or DrawPlus X5.
algrass wrote:4- Do Apple's programs support OTF?
I am sure they do, but I am not sure if they support mine (see my sig).

vanisaac
Posts: 337
Joined: Sun Mar 30, 2003 1:33 pm
Location: Washington State, USA

Re: Input Method Editors for Latin based character set

Post by vanisaac » Fri Jul 13, 2012 9:31 am

William wrote:2. If a transliteration Input Method Editor method were implemented and were in use with the font, if the font were to have an alternate design of a letter t with a swash ascender design, such that that t could be used wherever the person using the font felt that it would look good, how would that person indicate that the alternate t were to be used?

Would it be by using %%t or in some other way?
The easiest way is to define the different glyphs as ligatures of the base character and a variation selector, and then have the keyboard set up to input variation selectors. Variation selectors are default ignorable code points, so you would be able to do a proper search, and if the text were copied to a program where the font weren't available, it would still be displayed as legible text.
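A rough Python sketch of that idea follows. It assumes, for illustration only, that the swash t is selected by the letter t followed by U+FE00 (VARIATION SELECTOR-1) and that the font names the glyph t.swash; the table VARIANT_GLYPHS and both function names are hypothetical. Only the sixteen selectors U+FE00 to U+FE0F are handled here.

```python
VS1 = "\uFE00"  # VARIATION SELECTOR-1

# Hypothetical font table: (base character, selector) -> alternate glyph name.
VARIANT_GLYPHS = {("t", VS1): "t.swash"}


def is_variation_selector(ch: str) -> bool:
    """True for the variation selectors U+FE00..U+FE0F."""
    return 0xFE00 <= ord(ch) <= 0xFE0F


def to_glyphs(text: str) -> list:
    """Map text to glyph names, treating base + selector as a ligature."""
    glyphs = []
    i = 0
    while i < len(text):
        pair = (text[i], text[i + 1]) if i + 1 < len(text) else None
        if pair in VARIANT_GLYPHS:
            glyphs.append(VARIANT_GLYPHS[pair])
            i += 2
        elif is_variation_selector(text[i]):
            i += 1  # selector with no matching glyph: ignored, nothing drawn
        else:
            glyphs.append(text[i])
            i += 1
    return glyphs


def strip_selectors(text: str) -> str:
    """What a search, or a font lacking the alternates, effectively sees."""
    return "".join(ch for ch in text if not is_variation_selector(ch))


typed = "forest" + VS1
print(to_glyphs(typed))        # ['f', 'o', 'r', 'e', 's', 't.swash']
print(strip_selectors(typed))  # forest
```

The point of the design is that the stored text remains ordinary letters plus an ignorable marker, so it stays legible and searchable when the font is not available.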
