
Posted: Mon Aug 11, 2008 8:39 pm
by Bhikkhu Pesala
I like the language selection method best — use a different template for each language. That would permit a decent range of punctuation on each template — things like Smart Quotes, em-dashes, etc.

French needs only 40 extended glyphs, and most other languages need fewer than French. That leaves 70 spaces on each additional language template for other glyphs.

Posted: Tue Aug 12, 2008 5:55 am
by William
Timo Kähkönen wrote: Professionals also need a simple way to scan a font and do the mappings. As far as I know, there is no program like this for beginners or professionals.
Some years ago there was a program available named Your Handwriting II which was produced by Data Becker.

viewtopic.php?t=1837

William Overington

12 August 2008

Posted: Tue Aug 12, 2008 7:09 am
by William
Bhikkhu Pesala wrote:I like the language selection method best — use a different template for each language. That would permit a decent range of punctuation on each template — things like Smart Quotes, em-dashes, etc.
I wonder if language selection could perhaps be implemented as an option with a language template being produced dynamically by reading in an XML file specifically for that language. For example, by reading in a file named French.xml in order to produce a template for French.

In that way, Scanahand could be supplied with a few such XML files and others could be added by an end user or downloaded from the Scanahand forum if other users of Scanahand uploaded their own XML files to the forum.

If the results were superimposable, then a font for, say, French, Portuguese and Latvian could be produced by producing templates from French.xml, Portuguese.xml and Latvian.xml, filling in the templates and then scanning in the templates. In the event of two glyphs for the same character being on two different templates, maybe Scanahand could prompt on-screen as to whether to keep the existing glyph or to use the new glyph: Scanahand discarding the glyph that is not chosen.
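
As a rough illustration of the idea above, here is a minimal Python sketch of reading per-language XML files and superimposing them. The XML layout (a `template` element holding `glyph` elements with a hexadecimal `codepoint` attribute) and the function names are assumptions made for this example only; Scanahand defines no such format. The sketch only reports code points that occur on more than one template, which is the point where the program would prompt the user.

```python
import xml.etree.ElementTree as ET

# Hypothetical file contents: this schema is invented for illustration.
FRENCH_XML = """<template language="French">
  <glyph codepoint="00E9"/>
  <glyph codepoint="00E0"/>
  <glyph codepoint="00E7"/>
</template>"""

PORTUGUESE_XML = """<template language="Portuguese">
  <glyph codepoint="00E3"/>
  <glyph codepoint="00E7"/>
</template>"""


def read_template(xml_text):
    """Return (language, list of code points) from one language definition."""
    root = ET.fromstring(xml_text)
    codepoints = [int(g.get("codepoint"), 16) for g in root.findall("glyph")]
    return root.get("language"), codepoints


def merge_templates(xml_texts):
    """Superimpose several templates and collect code points seen more than once."""
    owner = {}      # code point -> language that supplied it first
    conflicts = []  # (code point, first language, later language)
    for text in xml_texts:
        language, codepoints = read_template(text)
        for cp in codepoints:
            if cp in owner:
                conflicts.append((cp, owner[cp], language))
            else:
                owner[cp] = language
    return owner, conflicts


owner, conflicts = merge_templates([FRENCH_XML, PORTUGUESE_XML])
for cp, first, later in conflicts:
    # This is where a keep-or-replace prompt would be shown to the user.
    print(f"U+{cp:04X} appears on both the {first} and {later} templates")
```

Here the only duplicate is U+00E7 (c with cedilla), which is on both sample templates.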

This approach would keep Scanahand straightforward for beginners yet also have the powerful ability to produce comprehensive fonts for any code points in the Unicode Basic Multilingual Plane.

Indeed, such a language selection facility could be used for other purposes as well, such as for including various symbols and accessing codepoints in the Unicode Private Use Area.

Scanahand would know how to allocate codepoints because the scan would be processed with respect to the same XML file as had been used to produce the template. Although templates would ideally have small glyphs printed on them as a guide, this would not be essential for an advanced user, as long as he or she had a chart of some sort (maybe just handwritten on the cardboard of the inside of an old cereal box) available so that he or she knew how to fill in the template.

All of the above could be implemented without affecting the inclusion in Scanahand of both the basic template and the second template discussed in some of the earlier posts in this thread. I am just thinking of what could perhaps be done after inclusion of both of those templates has been achieved.

William Overington

12 August 2008

Posted: Tue Aug 12, 2008 2:04 pm
by Timo Kähkönen
Huh! Here is now the "first" draft version of dynamic font template creation:

http://www.royalcomics.org/puhekupla/draw_template.php

It is neither beautiful nor logical, but hopefully the idea of dynamic templates can be seen. With it you can
a) select glyphs by language (there may be mistakes)
b) type Unicode character ranges
c) type glyphs as characters

Does this have any features that would be good to implement in Scanahand?

It uses PDFlib for template creation and libdmtx for Data Matrix creation. At the top of every page there are small black-and-white squares that contain page-related information (at present only the character range of the page).

EDIT: The number of rows and columns in the template can be set. Reducing the number of cells may help when making Asian fonts, logos and other complex shapes.

Posted: Tue Aug 12, 2008 3:33 pm
by Bhikkhu Pesala
That is very easy to use, and seems to work well. :)

I made myself a Pali Template

I think the row/column count needs to be fixed at 10/11 for Scanahand to be able to interpret it correctly, so large fonts may not be possible, though a dynamic template is clearly better than a single fixed template.

Posted: Tue Aug 12, 2008 4:41 pm
by Timo Kähkönen
Bhikkhu Pesala wrote:I think the row/column count needs to be fixed at 10/11 for Scanahand to be able to interpret it correctly, so large fonts may not be possible, though a dynamic template is clearly better than a single fixed template.
Good to hear that the template generator works!

If Scanahand moves to dynamic templates, then there must be a Data Matrix barcode on every page. This barcode can also store other page-related properties, such as
a) row and column count
b) various "metrics" of the page (margins, page size, glyph title cell)
c) draft name for the font
d) and of course the Unicode ranges of the page

This information can be gzipped or bzipped, and Base 256 encoding (all byte values 0-255) can be used when generating the barcode. It is not reasonable to store too much data in one barcode square, because the module size becomes too small: when printed on an inkjet, the modules blend together and reading and decoding the barcode will fail. That is why I chunked the data across multiple squares.

So if DT (Dynamic Templates) is implemented in Scanahand and people have old templates without barcodes, which use the 10x11 layout, Scanahand simply falls back to the default, which is 10x11. So no problem!
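
The compress-and-chunk step can be sketched in Python as below. This is an illustration under stated assumptions, not the demo's actual code: the metadata field names, the JSON serialization, the 40-byte limit per square, and reading 10x11 as columns by rows are all made up for the example. It uses zlib (DEFLATE, the same compression family as gzip), and actually drawing each chunk as a Data Matrix square (for instance with libdmtx) is left out.

```python
import json
import zlib

# Assumed default for old templates without barcodes: 10 columns, 11 rows.
DEFAULT_GRID = {"columns": 10, "rows": 11}


def encode_page_info(info, max_chunk_bytes=40):
    """Compress page metadata and split it into chunks, one per barcode square.

    max_chunk_bytes is an arbitrary illustrative limit; a real limit would
    depend on the printer and on the module size that still scans reliably.
    """
    raw = json.dumps(info, separators=(",", ":")).encode("utf-8")
    packed = zlib.compress(raw, 9)
    return [packed[i:i + max_chunk_bytes]
            for i in range(0, len(packed), max_chunk_bytes)]


def decode_page_info(chunks):
    """Rejoin the chunks in reading order and recover the metadata.

    An empty list stands for a page with no barcode, so the default grid is used.
    """
    if not chunks:
        return dict(DEFAULT_GRID)
    return json.loads(zlib.decompress(b"".join(chunks)).decode("utf-8"))


page = {
    "columns": 10,
    "rows": 11,
    "draftname": "MyHand",  # hypothetical draft name
    "ranges": [[33, 44], [68, 78], [40000, 40000]],
}
chunks = encode_page_info(page)
print(len(chunks), "barcode square(s) needed")
assert decode_page_info(chunks) == page
```

For very short metadata the compressed form may be no smaller than the raw bytes, so a real implementation might compress only when it helps.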

Posted: Tue Aug 12, 2008 7:34 pm
by William
Timo Kähkönen wrote:Huh! Here is now the "first" draft version of dynamic font template creation:

http://www.royalcomics.org/puhekupla/draw_template.php
I tried it for the preset English template and then I decided to experiment.

I tried Custom template (Type Unicode ranges) and used 59143-59252 so as to start with 59143 and use 110 code points.
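
For readers who want to check the arithmetic, a few lines of Python (an illustration only) confirm the size of that range and show it in the usual U+ notation:

```python
# Private Use Area of the Basic Multilingual Plane
PUA_FIRST, PUA_LAST = 0xE000, 0xF8FF

first, last = 59143, 59252
count = last - first + 1
in_pua = PUA_FIRST <= first and last <= PUA_LAST

print(f"U+{first:04X}..U+{last:04X}: {count} code points")  # U+E707..U+E774: 110 code points
print("inside the Private Use Area:", in_pua)                # inside the Private Use Area: True
```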

Well, wow and wow again!

I saved the generated pdf to the local hard disc, copied it and renamed it as experimental.pdf and uploaded it to the web.

http://www.users.globalnet.co.uk/~ngo/experimental.pdf

I am amazed and delighted!

William Overington

12 August 2008

Posted: Tue Aug 12, 2008 9:18 pm
by Timo Kähkönen
William wrote: I tried Custom template (Type Unicode ranges) and used 59143-59252 so as to start with 59143 and use 110 code points.
And there ARE many empty slots, because the font is slightly incomplete. I had to select a sample font that covers nearly all of plane 0, and this one has 63546 glyphs out of 65536.

My template creator demo does not detect control characters, other empty glyphs, or glyphs missing from the sample font. So it is a quick and dirty example version.

Posted: Wed Aug 13, 2008 7:06 am
by William
Timo Kähkönen wrote:
William wrote: I tried Custom template (Type Unicode ranges) and used 59143-59252 so as to start with 59143 and use 110 code points.
And there ARE many empty slots because of little incompleteness of the font. I had to select sample font that has nearly all of plane 0 covered. And this one has 63546 glyphs of 65536.
I realized overnight that I had not explained my amazement and delight at the results of your experiment and that I should add a few notes for new readers of this forum, hoping that those readers who already knew about it would not mind it being repeated in this thread. Having this morning seen your post it seems a good idea to add the explanation as a reply.

Since the days of using metal type I have been interested in ligatures.

When I began to learn about electronic fonts I found that, although glyphs for ligatures could be added, mapped to the Unicode Private Use Area, there was a culture amongst some people that this should not be done, that no more glyphs for ligatures should be added to regular Unicode, and that ligature glyphs should be unmapped within a font and only be accessible using glyph substitution technology. The ligature glyphs in U+FB00 to U+FB06 were included in Unicode only for backward compatibility with some prior standard.

I thought that this missed out the very real fact that people using non-OpenType-aware software packages could not access ligature glyphs to produce printouts, so I decided to produce some code point allocations for ligature glyphs within the Unicode Private Use Area.

http://www.users.globalnet.co.uk/~ngo/golden.htm

http://www.unicode.org/mail-arch/unicod ... /0223.html

Doug Ewell wrote as follows.

http://www.unicode.org/mail-arch/unicod ... /0422.html

The following two posts are from James Kass, the producer of the Code2000 font.

http://www.unicode.org/mail-arch/unicod ... /0426.html

http://www.unicode.org/mail-arch/unicod ... /0441.html

When I tried the range 59143-59252 I expected it to have all blank cells. I was amazed that there were any glyphs shown at all! Also I was reminded of the phrase "I laughed out loud" in the following post.

http://www.unicode.org/mail-arch/unicod ... /0009.html

When I had looked at the English example and at the example which Bhikkhu Pesala posted, I had not realized that the Code2000 font was being used. So, it was with amazement that I saw the ct ligature glyph in the top left cell of the pdf which the system produced for me! Similar perhaps to when James Kass saw the ct ligature in the Unicode mailing list post from Doug Ewell in 2002.

Although those posts all date from 2002, it seems to me that the use of Private Use Area codepoints for ligature glyphs is still needed in 2008, perhaps more so now: OpenType technology has brought more interest in glyphs for ligatures, yet people without the very expensive packages can neither display nor print them!

The following blog from Thomas Phinney may also be of interest to readers.

http://blogs.adobe.com/typblography/200 ... priva.html
Timo Kähkönen wrote: In my template creator demo there is no detection of control characters and other empty glyphs and missing ones of sample font.
Well, that is, in my opinion, a benefit and not a defect. Please do not change it.
Timo Kähkönen wrote: So it's quick and dirty exemplary version.
Well, "quick" only because of your skill and ability to produce such excellent results. I would not be able to produce it in a month!

It is not "dirty". In my opinion, it is a great step forward.

I have thought of a few matters that I would like to mention. Could you possibly consider making the inclusion of the guidelines across the cell an option please and making the glyph in the cell an option too please? People using black and white printers might have problems with scans that have unwanted dark pieces in them.

Also, in Unicode, U+FFFE and U+FFFF (65534 and 65535) are non-characters. Could it be a useful convention that if someone uses 65535 in one of your templates that Scanahand then uses that glyph for the .notdef glyph of the font?
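
The suggested convention could be sketched in Python like this. It is only an illustration of the proposal, not an existing Scanahand feature; the function name is hypothetical, and the `uniXXXX` glyph names follow the common glyph-naming convention for BMP code points, chosen here just for the example.

```python
# The two noncharacters mentioned above (the BMP also has U+FDD0..U+FDEF).
NONCHARACTERS = {0xFFFE, 0xFFFF}
NOTDEF_MARKER = 0xFFFF  # 65535: proposed marker for the .notdef glyph


def glyph_name_for_cell(codepoint):
    """Decide which glyph a filled-in template cell should become."""
    if codepoint == NOTDEF_MARKER:
        return ".notdef"
    if codepoint in NONCHARACTERS:
        return None  # U+FFFE: no meaning proposed, so the cell is ignored
    return f"uni{codepoint:04X}"


print(glyph_name_for_cell(65535))  # .notdef
print(glyph_name_for_cell(59143))  # uniE707
```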

William Overington

13 August 2008

Posted: Wed Aug 13, 2008 9:19 am
by Timo Kähkönen
William wrote: ... making the inclusion of the guidelines across the cell an option ...
... making the glyph in the cell an option too please ...
Now there are Guidelines on/off and Sample characters on/off options:
http://www.royalcomics.org/puhekupla/draw_template.php?
William wrote: Could it be a useful convention that if someone uses 65535 in one of your templates that Scanahand then uses that glyph for the .notdef glyph of the font?
What do the FontCreator people think about using Kahkonen-templates in Scanahand? A good or a bad thing? If High-Logic thinks it is okay, then Scanahand should either recognize automatically, or take as manual input, a few parameters of the templates:
a) Unicode ranges of pages, e.g. UnicodeRange = 33-44,68-78,40000
b) Column and row count, e.g. ColumnsRows = 10x11 (width first, of course, so the column count comes first)
c) Vertical metrics of slots, for example as a percentage of slot height. If slot height is 100%, the vertical metrics could be e.g. [Windescent, Baseline, x-height, Capheight, Winascent] = [5.00, 20.43, 50.32, 70.83, 95.00]. In this case there would be a 5% top and bottom margin that is not included in the glyph shape.
d) If Scanahand has no automatic recognition of the template's borders, then also:
- PageWidth
- PageHeight
- Top/Bottom/Left/Rightmargin
- PageTitleCellHeight
- GlyphTitleCellHeight
- BorderLineWidth
- ShortGuidelineLength (meaning short horizontal black lines crossing column's left and right borders)

and possibly:
- SignatureCell parameters also. What is the reason for Signature in Scanahand?

So is this Dynamic Template going to be "a public standard"? It would be very interesting to develop such a standard.
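
To make the parameter list above concrete, here is a minimal Python sketch of how a program might read the UnicodeRange and ColumnsRows values and turn a vertical metric into a pixel row of a scanned slot. The function names are hypothetical, and it is an assumption (suggested by the ordering of the example values) that the percentages are measured from the bottom of the slot.

```python
def parse_unicode_ranges(text):
    """'33-44,68-78,40000' -> [(33, 44), (68, 78), (40000, 40000)]"""
    ranges = []
    for part in text.split(","):
        first, _, last = part.strip().partition("-")
        ranges.append((int(first), int(last or first)))
    return ranges


def parse_columns_rows(text):
    """'10x11' -> (10, 11), column count first."""
    columns, rows = (int(n) for n in text.lower().split("x"))
    return columns, rows


def slot_count(ranges):
    """Number of template cells needed for the given inclusive ranges."""
    return sum(last - first + 1 for first, last in ranges)


def percent_to_pixel_row(percent_from_bottom, slot_height_px):
    """Pixel row, counted from the top of a scanned slot, for one metric line."""
    return round(slot_height_px * (100.0 - percent_from_bottom) / 100.0)


ranges = parse_unicode_ranges("33-44,68-78,40000")
columns, rows = parse_columns_rows("10x11")
print(slot_count(ranges), "cells needed;", columns * rows, "cells per page")

# Example metrics from the list above, applied to a slot scanned 400 pixels high.
metrics = {"Windescent": 5.00, "Baseline": 20.43, "x-height": 50.32,
           "Capheight": 70.83, "Winascent": 95.00}
pixel_rows = {name: percent_to_pixel_row(p, 400) for name, p in metrics.items()}
```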

Posted: Wed Aug 13, 2008 11:40 am
by William
Timo Kähkönen wrote:
William wrote: ... making the inclusion of the guidelines across the cell an option ...
... making the glyph in the cell an option too please ...
Now there is Guidelines on/off and Sample characters on/off:
http://www.royalcomics.org/puhekupla/draw_template.php?
Excellent.
Timo Kähkönen wrote: So is this Dynamic Template going to be "a public standard"? It would be very interesting to develop such standard.
I feel that "a published specification" would perhaps be a better way to look at it.

For example I published the specification for the golden ligatures collection.

http://www.users.globalnet.co.uk/~ngo/golden.htm

It is not a standard and I do not seek it to be a standard.

It exists and people may use it if they so wish. Some of it has been used by James Kass within the Code2000 font and I have used various parts of it within some of my own fonts, for example Quest text, Chronicle Text and the 10000 font for which there are threads in the High-Logic Gallery forum.

As the golden ligatures collection is not a standard and is not purported to be a standard, there is no problem when other people use other code point allocations for some of the same ligature glyphs.

You have already made progress which has amazed me. I did not realize that a template could be produced as a pdf as a result of entering information in a form on a web page.

William Overington

13 August 2008

Posted: Thu Aug 14, 2008 6:12 am
by William
I have been thinking further about the pdf producing facility produced by Timo Kähkönen.

http://www.royalcomics.org/puhekupla/draw_template.php?

I am thinking that when the Submit button is pushed, the client-side form sends a package of information, in text form, to the server-side processor. Would it be a good idea to add an extra page to the pdf and start with the word ITEMS and then include, in text form, an exact copy of the text, including field names, that the client-side form sent to the processor?

William Overington

14 August 2008

Posted: Thu Aug 14, 2008 6:29 am
by Timo Kähkönen
William wrote: ... Would it be a good idea to add an extra page to the pdf and start with the word ITEMS and then include, in text form, an exact copy of the text, including field names, that the client-side form sent to the processor?
Yes, it IS a good idea. The human needs it: to check that everything in the pdf is as expected, to remember what the parameters were, and possibly to copy the information to a text editor (or Excel), modify it there, and then go back to the form and enter the modified information.

On the template pages there is no room to include meta information at a readable font size, but on a title sheet there would be room.

The program needs only the barcode with encoded data.
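
A minimal Python sketch of such an ITEMS page follows. The `name=value` line layout and the field names are assumptions for illustration; the real form's field names are not given in this thread. The point is only that the text can be written out verbatim and read back without loss.

```python
def build_items_page(fields):
    """Text for the extra page: the word ITEMS, then one name=value line per field."""
    lines = ["ITEMS"]
    lines += [f"{name}={value}" for name, value in fields.items()]
    return "\n".join(lines)


def parse_items_page(text):
    """Recover the submitted fields from the text of an ITEMS page."""
    lines = text.splitlines()
    if not lines or lines[0] != "ITEMS":
        raise ValueError("not an ITEMS page")
    # Split on the first '=' only, so values may themselves contain '='.
    return dict(line.split("=", 1) for line in lines[1:] if line)


# Hypothetical field names, using example values from this thread.
submitted = {
    "UnicodeRange": "33-44,68-78,40000",
    "ColumnsRows": "10x11",
    "Guidelines": "on",
}
page_text = build_items_page(submitted)
print(page_text)
assert parse_items_page(page_text) == submitted
```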

Posted: Fri Aug 15, 2008 7:58 am
by William
I got to thinking that if the program were adapted so that OpenType glyph substitution information could be included in the pdf, then the template generator program would be capable of being used to produce OpenType fonts with automatic ligature substitution, discretionary ligatures and alternate glyphs. All of these features might not be used by Scanahand at the present time or even in the medium future timescale, yet the template program would allow the information to be encoded ready to be used if the opportunity arose.

At first I thought that that might be a very lengthy task to achieve, needing lots of discussion by people who know about OpenType.

However, I am wondering whether the infrastructure could be achieved by adding one or two large text area inputs to the form and allowing people to add into those text area inputs whatever text they like and an exact copy of that text would then be included on a page of the pdf. Maybe one text area with the name OpenTypeParameters or maybe several text areas with names based on the names of the various OpenType tables?

The system could be produced such that if there is no text in a text area, then nothing is added to the pdf, so if OpenType information was not being specified for the font then the pdf would not be affected by the availability of one or more of such text areas on the form.

It would need some discussion by people who know about OpenType, but maybe a specification would be straightforwardly achievable.

William Overington

15 August 2008

Posted: Fri Aug 15, 2008 9:06 am
by Timo Kähkönen
William wrote: ... OpenType glyph substitution information could be included in the pdf...
... automatic ligature substitution, discretionary ligatures and alternate glyphs...
Do you mean, e.g., that if the user has included the characters f and i in the template, the template generator program would also add the ligature glyph fi AUTOMATICALLY, or only if the user has MANUALLY INSTRUCTED the program to add the ligature fi at some Unicode code point?

The automatic method would need preinstalled ligature tables; in the manual method this information comes from the user. The table structure could look like this: LigatureLeft[0]="f", LigatureRight[0]="i", UnicodePointOfLigature[0]=45321, or it could be formatted according to the OpenType tables.
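
In Python, that table structure might be sketched as follows. The class and function names are hypothetical, and 45321 is simply the example value given above. The lookup shows the automatic case: a ligature is offered only when both of its components are on the template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ligature:
    left: str       # corresponds to LigatureLeft[i]
    right: str      # corresponds to LigatureRight[i]
    codepoint: int  # corresponds to UnicodePointOfLigature[i]


# One-entry table using the example values from the post.
LIGATURE_TABLE = [Ligature("f", "i", 45321)]


def ligatures_for(template_chars, table):
    """Return the ligatures whose two components are both on the template."""
    chars = set(template_chars)
    return [lig for lig in table if lig.left in chars and lig.right in chars]


print(ligatures_for("fig", LIGATURE_TABLE))  # both f and i present: fi is offered
print(ligatures_for("fox", LIGATURE_TABLE))  # no i on the template: nothing offered
```

In the manual method the same table would simply be filled in from what the user typed into the form.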
William wrote: ...adding one or two large text area inputs to the form and allowing people to add into those text area inputs whatever text they like and an exact copy of that text would then be included on a page of the pdf. Maybe one text area with the name OpenTypeParameters or maybe several text areas with names based on the names of the various OpenType tables?
That's of course possible. The first page(s) of the template pdf could include meta information about the font, both as text and as a barcode. On the actual glyph outline pages of the template there is not much room for barcodes containing meta information. It would also be possible to extract the whole metadata to an XML file that could be shared with other people or fed into a font generator program.
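
As a sketch of that last idea, the metadata could be written to and read back from XML with Python's standard library. The element and attribute names (`fonttemplate`, `range`, `draftname` and so on) are invented for this example and are not part of any published specification.

```python
import xml.etree.ElementTree as ET


def metadata_to_xml(draft_name, columns, rows, ranges):
    """Serialize template metadata to an XML string (hypothetical schema)."""
    root = ET.Element("fonttemplate", draftname=draft_name,
                      columns=str(columns), rows=str(rows))
    for first, last in ranges:
        ET.SubElement(root, "range", first=str(first), last=str(last))
    return ET.tostring(root, encoding="unicode")


def xml_to_metadata(xml_text):
    """Read the same metadata back from the XML string."""
    root = ET.fromstring(xml_text)
    ranges = [(int(r.get("first")), int(r.get("last")))
              for r in root.findall("range")]
    return (root.get("draftname"), int(root.get("columns")),
            int(root.get("rows")), ranges)


xml_text = metadata_to_xml("MyHand", 10, 11, [(33, 44), (68, 78)])
print(xml_text)
assert xml_to_metadata(xml_text) == ("MyHand", 10, 11, [(33, 44), (68, 78)])
```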