Some readers have read the thread “A facility for entering Esperanto accented characters” and also have read the thread “Using the Hot Beverage character”.
Here is a link to another typecase_ pdf. This one would be named typecase_f_ligatures.pdf except that I am having problems with using it.
If I try to copy and paste the glyphs into WordPad, then, instead of the desired ligature glyphs, WordPad displays glyphs of the underlying characters that make up the ligature character.
Is it possible to do anything so as to be able to copy the glyphs into WordPad please? I have tried Paste Special… with no success.
I tried making typecase_spaces.pdf and could not copy out into WordPad, Word 97 or SC Unipad without each of the spaces being converted to U+0020. I also tried making typecase_combination_border_private_use_area.pdf and found that I could not copy out into WordPad (I got an array of spots which would not format to another font), but that I can copy out into Word 97 and into SC Unipad. From Word 97 and SC Unipad, I can then copy and paste into WordPad.
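One way to check what actually arrived after a paste is to list the code point of every space-like character in the pasted text. The following is only a rough sketch: the function name describe_spaces is made up for illustration, and it assumes the pasted text is already available as a Python string.

```python
import unicodedata

def describe_spaces(text):
    """List (code point, Unicode name) for each space-like character in text.

    "Space-like" here means general category Zs (space separators such as
    U+0020 and U+2003) or Cf (format characters such as U+200B).
    """
    found = []
    for ch in text:
        if unicodedata.category(ch) in ("Zs", "Cf"):
            found.append((f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")))
    return found

# An EM SPACE that survived the copy, and one that was flattened to U+0020.
survived = describe_spaces(">\u2003<")
flattened = describe_spaces("> <")
print(survived)
print(flattened)
```

If every entry comes back as U+0020, the special spaces were lost somewhere between the pdf and the clipboard.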
The combination border, thirteen sorts including the special square space which is used in the typecase_ pdf, is implemented in my Quest text font and in my 10000 font: the combination border in the 10000 font using different glyphs from those in the Quest text font.
The clipboard contains spaces, even in Unicode Text:
0000 3E0020003C000D00 0A003E0020003C00 > < > <
0010 0D000A003E002000 3C000D000A003E00 > < >
0020 20003C000D000A00 3E0020003C000D00 < > <
0030 0A003E0020003C00 0D000A0045006100 > < E a
0040 6300680020007300 7000610063006500 c h s p a c e
0050 2000690073002000 6C006F0063006100 i s l o c a
0060 7400650064002000 6200650074007700 t e d b e t w
0070 650065006E002000 6100200067007200 e e n a g r
0080 6500610074006500 72002D0074006800 e a t e r - t h
0090 61006E0020007300 690067006E002000 a n s i g n
00A0 61006E0064002000 61000D000A006C00 a n d a l
00B0 6500730073002D00 7400680061006E00 e s s - t h a n
00C0 2000730069006700 6E002E0020005400 s i g n . T
00D0 6800650020006E00 75006D0062006500 h e n u m b e
00E0 7200730020006100 7200650020007400 r s a r e t
00F0 6800650020006800 6500780061006400 h e h e x a d
0100 6500630069006D00 61006C0020007600 e c i m a l v
0110 61006C0075006500 7300200075007300 a l u e s u s
0120 650064000D000A00 69006E0020007400 e d i n t
0130 6800650020005500 6E00690063006F00 h e U n i c o
0140 6400650020006300 6F00640065002000 d e c o d e
0150 6300680061007200 74002E000D000A00 c h a r t .
0160 3200300030003000 0D000A0032003000 2 0 0 0 2 0
0170 300031000D000A00 3200300030003200 0 1 2 0 0 2
0180 0D000A0032003000 300033000D000A00 2 0 0 3
0190 3200300030003400 0D000A0032003000 2 0 0 4 2 0
01A0 300035000D000A00 3E0020003C000D00 0 5 > <
01B0 0A003E0020003C00 0D000A003E002000 > < >
01C0 3C000D000A003E00 20003C000D000A00 < > <
01D0 3E0020003C000D00 0A003E0020003C00 > < > <
01E0 0D000A0032003000 300036000D000A00 2 0 0 6
01F0 3200300030003700 0D000A0032003000 2 0 0 7 2 0
0200 300038000D000A00 3200300030003900 0 8 2 0 0 9
0210 0D000A0032003000 300041000D000A00 2 0 0 A
0220 3200300030004200 000000 2 0 0 B
So I suspect it’s due to your PDF creator (PDFlib+PDI 6.0.0p2 in case of Serif PagePlus 12). It is also possible that Acrobat Reader replaces the spaces with regular spaces, but it would surprise me.
FYI Microsoft naming fields are stored in UTF-16 format (2 bytes per plane 0 character), while Macintosh naming fields are stored in Mac format (one byte per character).
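A small sketch of why a plain-text search can miss the Microsoft name. It assumes the Microsoft strings are stored as big-endian UTF-16 and the Macintosh strings as Mac Roman, and it uses the font name "Sonnet" purely as sample text.

```python
name = "Sonnet"

# Two bytes per character, high-order byte first (assumed Microsoft storage).
microsoft_bytes = name.encode("utf-16-be")
# One byte per character (assumed Macintosh storage).
macintosh_bytes = name.encode("mac_roman")

# What a byte-for-byte search for the plain word would look for.
plain = name.encode("ascii")

print(microsoft_bytes.hex(" "))
print(plain in macintosh_bytes)
print(plain in microsoft_bytes)
```

In the UTF-16 form every letter is preceded by a zero byte, so the letters are all present but never adjacent, and the plain word does not match.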
I think that the Macintosh name is in plain text yet the Microsoft name is encrypted in some way. It is interesting to look at a .ttf font using the WordPad program.
Do you find that it is the Microsoft name that is not picked up in the search?
As for the dump, I do not know how Erwin produced that dump, but I think that some dump programs put the ASCII-8 equivalent of each byte in the display at the right, regardless of the format of the file from which the data has been retrieved.
It looks like the left column is the byte count at that point in hexadecimal (that is, in decimal, none so far, sixteen so far, thirty-two so far, and so on). The two large columns in the middle appear to be pairs of hexadecimal digits, so that each pair is a byte within the range from hexadecimal 00 to hexadecimal FF. Each pair of bytes is one 16-bit character: it is a bit difficult to make out because of a historical quirk (little-endian byte order) whereby the first byte of the pair is the second byte of the character. So the top line begins 3E00 which means a U+003E character, then 2000 which means a U+0020 character, then 3C00 which means a U+003C character, then 0D00 which means a U+000D character, and so on.
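The reading described above can be checked with a few lines of Python. This is a minimal sketch that assumes the dump really is little-endian UTF-16; the helper name decode_utf16le_hex is made up for illustration, and the sample is the first row of the dump.

```python
def decode_utf16le_hex(hex_text: str) -> str:
    """Decode a run of hex digits (spaces ignored) as little-endian UTF-16."""
    raw = bytes.fromhex(hex_text)
    return raw.decode("utf-16-le")

# The two middle columns of the first row of the dump.
first_row = "3E0020003C000D00 0A003E0020003C00"
decoded = decode_utf16le_hex(first_row)

# One code point per pair of bytes.
code_points = [f"U+{ord(ch):04X}" for ch in decoded]
print(code_points)
```

The sixteen bytes come out as eight characters: greater-than, space, less-than, carriage return, line feed, and then greater-than, space, less-than again.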
I think it unlikely (in fact I think they do not, but I am not in a position to say so conclusively) that the spaces between the letters in the fourth column in any way affect the search. My guess is that they are just put there in the output so that a programmer looking at a dump can quickly pick up where text strings are located in the code.
While I was preparing my post, unknown to me, Erwin posted some information.
Upon reading that I thought that I should therefore (or maybe just possibly) be able to find the name in amongst the black boxes using WordPad. I had in fact got a copy of my Sonnet to a Renaissance Lady font open in WordPad at the time. So I searched for the letter S using the WordPad search facility, hoping to find the word S o n n e t spaced out like that but with a black box between characters rather than spaces as shown here.
So I searched for S using the Edit Find… facility of WordPad and clicked the Find Next button a number of times, looking to see if that was the start of the spaced out word Sonnet at each click.
The sixteenth click brought me to the clear unspaced-out plain text, namely the Microsoft platform information. The description information in that font is quite long, and it took until click seventy-four to get beyond it.
That appears to be at the start of the word Standard.
Click seventy-five brought me to the start of the word Sonnet.
I realized afterwards that I could have checked the Match case checkbox and got there in eleven clicks: however, it was fun searching.
That did not work, so I tried removing all but one of the twenty-one uses of the “square space (private use area)” replacing those in the two lines of border glyphs each with two ordinary spaces and replacing the eight in the middle line between the border glyphs with one of them between > and < characters.
The pdf is as follows: I notice that the pairs of spaces appear each to have been replaced with a single space.
The format is that of a typical hex editor, with the hex characters on the left and the translated letters on the right. Google for "Hex Editors Freeware" for a whole list. It is a good utility to have for investigating deeper.
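For anyone without a hex editor to hand, a dump in roughly this layout can be produced with a short Python function. This is a sketch only: the name hex_dump is made up for illustration, and it prints a dot for every non-printable byte, which is a common convention and not necessarily what the program behind the dump above does.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Return offset, two hex columns and a text column for each row of bytes."""
    half = width // 2
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        left = chunk[:half].hex().upper()
        right = chunk[half:].hex().upper()
        # Printable ASCII bytes are shown as themselves, everything else as a dot.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:04X} {left} {right} {text}")
    return "\n".join(lines)

# The same eight characters as the first row of the clipboard dump.
sample = "> <\r\n> <".encode("utf-16-le")
dump = hex_dump(sample)
print(dump)
```

To look inside a font, the bytes could be read with open(path, "rb").read() and passed to the same function.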