Emoji

William
Top Typographer
Posts: 2038
Joined: Tue Sep 14, 2004 6:41 pm
Location: Worcestershire, England

Re: Emoji

Post by William »

I started from the following post in the archive of the Unicode public mailing list.

http://www.unicode.org/mail-arch/unicod ... /0067.html

After some searching I managed to find the files for the emoji in the following directory.

http://www.unicode.org/Public/6.0.0/charts/blocks/

Here are direct download links.

http://www.unicode.org/Public/6.0.0/cha ... U1F300.pdf

http://www.unicode.org/Public/6.0.0/cha ... U1F600.pdf

William Overington

4 June 2010

Re: Emoji

Post by William »

I had been looking through the example glyphs in (the present version of) the emoji code chart document.

http://www.unicode.org/Public/6.0.0/cha ... U1F300.pdf

Later, quite separately, I was having a look through various photographs of San Gimignano, Italy within Google Streetview.

I noticed the following picture, which seemed to have a resonance with one of the emoji images.

http://maps.google.com/?ie=UTF8&ll=43.4 ... eoqmHPByFg

Checking back in the emoji document I found the following.

U+1F306 CITYSCAPE AT DUSK

There are a number of differences between the two images as well as similarities, and some readers might perhaps find it interesting to compare and contrast the two images.

William Overington

8 June 2010

Re: Emoji

Post by William »

On 4 June 2010 I posted links to the emoji code chart files.

Versions produced on 14 June 2010 have now been uploaded.

The links to download them are the same as before.

William Overington

16 June 2010

Re: Emoji

Post by William »

The http://www.unicode.org/Public/6.0.0/cha ... U1F300.pdf document includes the following character.

U+1F308 RAINBOW

Starting from http://maps.google.com, search for San Gimignano, zoom in several steps, and then drag the orange man icon to highlight Streetview availability. One of the photographs linked is as follows.

http://maps.google.com/maps?f=q&source= ... po-6883577

William Overington

18 June 2010

Re: Emoji

Post by William »

Some readers might find the following thread that I started of interest.

http://www.unicode.org/mail-arch/unicod ... /0435.html

The thread is about keying emoji characters using an ordinary keyboard.

William Overington

30 June 2010

Re: Emoji

Post by William »

Most of the emoji characters are now available in Unicode in the following code chart, though some got unified with other Unicode characters that had been defined in earlier versions of Unicode.

----

Miscellaneous Symbols And Pictographs

http://www.unicode.org/charts/PDF/U1F300.pdf

----

Also of interest may be the following.

Emoticons

http://www.unicode.org/charts/PDF/U1F600.pdf

The following document explains that emoticons are not exactly the same as emoji.

http://www.unicode.org/faq/emoji_dingbats.html

I found that page from a link in the following page.

http://www.unicode.org/press/pr-6.0.html

----

The above and many other code charts are available from the following web page.

http://www.unicode.org/charts/

Regarding the unifying of some of the emoji with other Unicode characters that had been defined in earlier versions of Unicode, fortunately the following document is still available.

http://www.unicode.org/~scherer/emoji4u ... jidata.pdf

Adobe Reader dates the document as 27/04/2010 15:48:30 when I opened the document from the link while preparing this post.

William Overington

13 December 2010
vanisaac
Posts: 337
Joined: Sun Mar 30, 2003 1:33 pm
Location: Washington State, USA

Re: Emoji

Post by vanisaac »

And just to keep up with matters, the Beta for Unicode 6.1 has been released. It includes 17 new emoji added to the Emoticons and Miscellaneous Symbols and Pictographs blocks; seven new scripts: Meroitic Hieroglyphs, Meroitic Cursive, Sora Sompeng, Chakma, Sharada, Takri, and Miao (Pollard); 143 Arabic Mathematical Alphabetic Symbols; additions to 14 already supported scripts; 10 new punctuation characters; and several compatibility or presentation characters. Unicode 6.1 is scheduled for final release in February.

Re: Emoji

Post by William »

The following has an interesting proposal for the future of the encoding of emoji in plain text: please note that the file is 1.1 Megabytes in size.

http://std.dkuug.dk/JTC1/SC2/WG2/docs/n4182.pdf

It is linked from the following page.

http://std.dkuug.dk/JTC1/SC2/WG2/docs/

William Overington

16 January 2012

Re: Emoji

Post by vanisaac »

Just to add a bit of context for those unfamiliar with the history of emoji and the Unicode encoding thereof, this proposal by Peter Edberg (of Apple) is to allow the specification of emoji presentation styles for characters that were not encoded as emoji, but rather as regular compatibility or graphic characters.

In short, the emoji collections from Japanese phone manufacturers were not simply copied and pasted into Unicode; the entire repertoire was compared with characters already in the standard. Where a match was found, a new, duplicate character was not encoded, and the emoji use was instead unified with the pre-existing character.

The emoji-specific characters are generally interpreted as being colored and shaded, sometimes with animation. Obviously, these are features that are not found in regular text characters. The characters that were unified with an emoji use therefore have two different interpretations: as regular black-and-white text elements, map symbols, etc., and also as colored, shaded, animated emoji characters.

Fortunately, Unicode has a handy, built-in method of limiting the presentation form of a given character through the use of Variation Selectors. These Variation Selectors can be used to select the Japanese, Chinese, or historical styles of the Unified CJK (Han) ideographs, specific shaping forms of 'Phags Pa letters, and styles of mathematical operators. Peter proposes to use the 15th and 16th Variation Selectors to limit the glyph presentation of these dual text/emoji characters to text-style (VS-15) or emoji-style (VS-16). One of these characters without a Variation Selector can be freely represented with either style of glyph.
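As a rough sketch of the mechanism described above, and assuming Python 3, the following builds the three forms of one dual text/emoji character: bare, followed by VS-15 (U+FE0E), and followed by VS-16 (U+FE0F). The helper name with_presentation is hypothetical, and U+2614 UMBRELLA WITH RAIN DROPS is picked here purely as an illustration of a character that was in Unicode before the emoji additions. Whether a display actually shows a different glyph for each sequence depends on the font and the rendering software.

```python
import unicodedata

VS15 = "\uFE0E"  # VARIATION SELECTOR-15: asks for text-style presentation
VS16 = "\uFE0F"  # VARIATION SELECTOR-16: asks for emoji-style presentation


def with_presentation(base: str, style: str) -> str:
    """Return base alone, or base followed by VS-15 or VS-16.

    Hypothetical helper for illustration only; it does not check that
    the base character is one for which such a sequence is defined.
    """
    if len(base) != 1:
        raise ValueError("expected exactly one base character")
    if style == "any":
        return base          # no selector: either glyph style may be shown
    if style == "text":
        return base + VS15
    if style == "emoji":
        return base + VS16
    raise ValueError("style must be 'any', 'text' or 'emoji'")


base = "\u2614"  # assumption: chosen only as an example character
print(unicodedata.name(base))
for style in ("any", "text", "emoji"):
    sequence = with_presentation(base, style)
    # Show the sequence as code points, since the glyphs depend on the font.
    print(style, " ".join(f"U+{ord(ch):04X}" for ch in sequence))
```

The selector is an ordinary character placed directly after the base character in plain text, so no markup or font switching is needed to request one style or the other.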

Re: Emoji

Post by William »

Regarding the following document.

http://www.unicode.org/L2/L2014/14093-u ... -emoji.pdf

Please consider in particular the last two paragraphs of section 3 of the document.

Having seen the earlier HTML version I produced the following document.
The_format_of_the_readouts.dat_file_suggested_for_possible_use_in_the_application_of_localized_read-out_labels.pdf
The format of the readouts.dat file suggested for possible use in the application of localized read-out labels
The format is almost an exact copy of the format in the document available in the following post.

viewtopic.php?p=21048#p21048

Since publishing the readouts.dat format document I have received helpful advice about the XLIFF format.

I knew nothing at all about XLIFF.

I found the following.

http://en.wikipedia.org/wiki/OASIS_%28organization%29

http://en.wikipedia.org/wiki/XLIFF

http://docs.oasis-open.org/xliff/xliff- ... -core.html

I have not studied it all yet.

Suppose that the Unicode Technical Committee accepts section 3 of the 14093-utr51-draft-emoji.pdf document. (I hope they do.)

Suppose that a manufacturer of a text-to-speech system then sees that section and asks as follows.

"In adding that facility into our text-to-speech system, is there a portable file format that we can use so that our user community can crowd-source localizations of the emoji into the many languages with which our text-to-speech system is used?"

I feel that, as XLIFF exists, the readouts.dat format may well never be used by most businesses. However, the readouts.dat format might perhaps be useful for student projects and for some research and development projects.

I am thinking about adding some features to assist a software tool in converting from readouts.dat format to XLIFF format, while keeping the original lightweight processing demand for anyone who wishes to program a routine that reads a readouts.dat file into, say, a text-to-speech program.

So, as well as serving its original intended use, the readouts.dat format, augmented in this way, may be a convenient way for people to enter localization data into a computer system before an XLIFF file is produced from the readouts.dat file.

William Overington

3 May 2014

Re: Emoji

Post by William »

Here are some notes that might be of interest to some readers.

The XLIFF 2.0 candidate standard is due to be published today.

http://docs.oasis-open.org/xliff/xliff- ... -cs01.html

http://docs.oasis-open.org/xliff/xliff- ... cos01.html

References to XLIFF in my draft text below are to the XLIFF 1.2 format.

http://docs.oasis-open.org/xliff/xliff- ... -core.html

http://en.wikipedia.org/wiki/XLIFF

MY DRAFT TEXT SO FAR ONLY

If the first character of a line of a readouts.dat file is an ASTERISK, the line is a comment as far as the primary purpose of a readouts.dat file is concerned, that purpose being to provide a text string in a particular language that describes in words a particular pictograph character.

It would be possible, in principle, to use a specially written software tool to convert the contents of a readouts.dat file to an XLIFF file that has the same information content, though presented in an XLIFF structure.

For this purpose, several additional features are now introduced into the format of a readouts.dat file, though in such a way that the same original format may be used when using the readouts.dat file for its primary purpose.

These features are as follows. All are defined only for lines whose first character is an ASTERISK, and each is selected by the second character of such a line.

*[

On a line starting *[ the text after the second character, if there is any, can be used in the source string of an XLIFF trans-unit element.

As conversion of a localization line of a readouts.dat file to XLIFF coding takes place, the latest use of a *[ line indicates which string should be used in the source string of the XLIFF trans-unit element related to that localization line.

For example

*[en-GB

Please note that no quotation marks are used in the *[ line.

*]

On a line starting *] the text after the second character, if there is any, can be used in the target string of an XLIFF trans-unit element.

As conversion of a localization line of a readouts.dat file to XLIFF coding takes place, the latest use of a *] line indicates which string should be used in the target string of the XLIFF trans-unit element related to that localization line.

For example

*]en-GB

Please note that in the original use for a readouts.dat file there would only be one use of *] in the file, at the start of the file before any localization lines.

There might not be any use of a *[ in the file if the source is pictograph symbols such as emoji.
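The *[ and *] rules in the draft above could be sketched roughly as follows, assuming Python 3. The names LocalizationLine and scan_readouts are hypothetical, and the format of a localization line itself is not described in this post, so the sketch keeps each such line as an unparsed string. It also assumes that blank lines carry no data and that a *[ or *] line with no text after the second character clears the current value; neither point is settled by the draft.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class LocalizationLine:
    text: str               # the raw localization line, left unparsed
    source: Optional[str]   # text of the latest *[ line, or None
    target: Optional[str]   # text of the latest *] line, or None


def scan_readouts(lines: Iterable[str]) -> List[LocalizationLine]:
    """Pair each localization line with the latest *[ and *] text seen."""
    source: Optional[str] = None
    target: Optional[str] = None
    result: List[LocalizationLine] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # assumption: blank lines are ignored
        if line.startswith("*"):
            marker = line[1:2]  # second character, or "" if there is none
            if marker == "[":
                source = line[2:] or None  # assumption: empty text clears
            elif marker == "]":
                target = line[2:] or None  # assumption: empty text clears
            # Any other line starting with an asterisk is a plain comment.
            continue
        result.append(LocalizationLine(line, source, target))
    return result


# Placeholder content: the localization lines here are not real
# readouts.dat entries, only stand-ins for illustration.
sample = [
    "*]en-GB",
    "* an ordinary comment",
    "placeholder localization line 1",
    "*[en-GB",
    "*]fr-FR",
    "placeholder localization line 2",
]
for entry in scan_readouts(sample):
    print(entry)
```

A program that only wants the primary use of the file can skip every line starting with an asterisk, as before; only a conversion tool needs to look at the second character.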

----

As I learn which information in the comments of a readouts.dat file could usefully be formatted so as to facilitate automated transfer into an XLIFF file, I may define other features, for example lines in a readouts.dat file that start with *{ and *} and maybe *_ and, if needed, some others as well.

William Overington

5 May 2014

Re: Emoji

Post by William »

William wrote:
Regarding the following document.

http://www.unicode.org/L2/L2014/14093-u ... -emoji.pdf

Please consider in particular the last two paragraphs of section 3 of the document.

Regarding the above referenced document.

Some readers might like to try the following.

On the following web page,

http://www.unicode.org/L2/L-curdoc.htm

please find the link about document L2/14-093 and note that there is a link

Snapshot; HTML version

and clicking that link leads to working draft 4 dated 2013-06-09, though maybe 2014-06-09 is intended as it is now version 4.

That draft no longer mentions read-out labels yet does refer to a TTS name, meaning text-to-speech.

I noticed that in particular: there appear to be other changes in the document as well.

William Overington

20 June 2014

Re: Emoji

Post by William »

There are lots of new encoding proposals linked from the following page.

http://www.unicode.org/L2/L-curdoc.htm

Some readers might like to view my contribution in the Other Reports section of the following document that is linked from that page.

http://www.unicode.org/L2/L2015/15019-pubrev.html

William Overington

3 February 2015

Re: Emoji

Post by William »

The following document is interesting.

http://www.unicode.org/L2/L2016/16008r3 ... -emoji.pdf

Can these customized items be implemented using FontCreator?

Please note in particular the Private Use section, which seems to open up vast amounts of coding space for private use, with graceful display of a fallback character if the private encoding is not recognized.

I am wondering whether it is possible to use such a coding on web pages using a webfont.

William Overington

30 January 2016