The explanation above covers the essence of the method. From it, one can outline a step‑by‑step procedure for auto‑kerning any set of glyphs, from any Unicode blocks, up to a maximum of 1,435 glyphs. But what if you need more? Many professional fonts have hundreds of extra glyphs used in stylistic variations. What if you want to obtain kerning pairs for all those variations? The strategy here is divide and conquer.
To understand the process, imagine that you decide to work only with the 1,383 characters that remain after excluding the 52 from the Basic Latin block. Since each glyph will be in its own kerning class, it does not even make sense to speak of classes, so we will speak only of glyphs. However you distribute the glyphs across the characters, AutoKern will compute the pairing of each glyph with all the others, in both possible positions (one on the left, the other on the right). It will even compute the pairing of each glyph with itself. In mathematical terms, it will compute the Cartesian product of the set of all glyphs with itself. If you are working only with Latin glyphs, you can very well call this set of glyphs L, and the set of kerning pairs for all these glyphs will be the Cartesian product L×L.
But suppose you decide, to better organize things, to divide all those glyphs into two large groups, placing, for example, all uppercase letters together, separate from the lowercase ones, or all letters that include diacritics separate from those without diacritics. Call the first subset of glyphs A and the second subset B. It should be clear to you that AutoKern will pair all glyphs in subset A among themselves, and also all glyphs in subset B among themselves, and also each glyph in A with each glyph in B (in that order) and each glyph in B with each glyph in A (in that order). In mathematical terms, you will have divided the set L into two disjoint subsets A and B, and induced AutoKern to compute the products A×A, B×B, A×B, and B×A, which is equivalent to computing the Cartesian product L×L, only in parts.
Now suppose you increase your set L of Latin glyphs to include a third subset C, containing stylistic variations of several letters. Now you have more glyphs than AutoKern can process all at once. How can you use AutoKern to obtain the new Cartesian product L×L? I will give you a moment to think…
By making AutoKern compute the Cartesian products of subsets A, B, and C of set L taken two by two! First, the products involving A and B — A×A, B×B, A×B, and B×A, as before; then the products involving A and C — A×A, C×C, A×C, and C×A; finally, the products involving B and C — B×B, C×C, B×C, and C×B. That is, you first run AutoKern with the glyphs from A and from B and save the generated code to an FEA file; next, run it again, this time with the glyphs from A and from C, and save the code to another FEA file; and run a third time with the glyphs from B and C and save the result to a third FEA file. After that, just merge the three generated files into one, which will contain all pairs of all glyphs.
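The claim that the three pairwise runs together cover the whole product L×L can be checked with a small sketch. The glyph names and the helper `pairs_from_runs` below are made up for illustration; the sketch only models which ordered pairs each run would see, not anything AutoKern actually computes.

```python
from itertools import combinations, product

def pairs_from_runs(subsets):
    """Union of the ordered pairs seen when every two subsets are run together."""
    covered = set()
    for first, second in combinations(subsets, 2):
        run_glyphs = first | second  # glyphs loaded for this run
        covered |= set(product(run_glyphs, repeat=2))
    return covered

# Hypothetical subsets A, B, and C of the full glyph set L.
subset_a = {"A", "B", "C"}
subset_b = {"a", "b", "c"}
subset_c = {"A.alt", "a.alt"}
full_set = subset_a | subset_b | subset_c

all_pairs = set(product(full_set, repeat=2))  # L x L
covered = pairs_from_runs([subset_a, subset_b, subset_c])

print(covered == all_pairs)  # True
```

Because `covered` is a set, the pairs that several runs produce (A×A, B×B, and C×C) are stored only once here; in the merged FEA file they show up as repeated lines.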
Note that the products of each subset with itself will be calculated twice, and the pairings of its glyphs will appear duplicated in the unified file. These repetitions must be eliminated. Strictly speaking, that is not mandatory, because the OpenType Designer’s Code Editor lets these errors pass when compiling the code, merely warning about them. But since there will be tens of thousands of repeated lines, removing them will substantially reduce the code’s compilation time and make the final FEA file less gigantic.
A practical problem is that the Code Editor does not have a feature for automatically removing repeated lines. It only warns about them when it checks the code’s syntax, but leaves the removal to the user. Other applications, like Word and Excel, do have this feature, but the number of lines generated by AutoKern is often much larger than these applications support. I solved this problem by asking ChatGPT to write a small Python function for me to remove the duplicate lines in the code generated by AutoKern.
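A function of that kind could look like the sketch below. It is not the function ChatGPT produced, only a minimal illustration under loud assumptions: each statement sits on a single line, kerning rules start with `pos`, and class definitions start with `@`. Check these against the code AutoKern actually generates before relying on it.

```python
def remove_duplicate_lines(lines, prefixes=("pos", "@")):
    """Drop repeated statements, keeping the first occurrence and the original order."""
    seen = set()
    kept = []
    for line in lines:
        statement = line.strip()
        # Only statements with the assumed prefixes are deduplicated;
        # blank lines and block openers/closers pass through untouched.
        if statement.startswith(prefixes):
            if statement in seen:
                continue
            seen.add(statement)
        kept.append(line)
    return kept

# Hypothetical merged content with two repeated rules.
merged = [
    "pos A V -50;",
    "pos V A -40;",
    "pos A V -50;",
    "",
    "pos V A -40;",
]
print(remove_duplicate_lines(merged))
```

The set lookup keeps the work proportional to the number of lines, so files with tens of thousands of lines are not a problem. To process a real file, read its lines with `open(path, encoding="utf-8")`, pass them through the function, and write the result to a new file.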
There is, however, a more serious problem to consider. In the reasoning described above, we considered only unitary kerning classes, with a single glyph each. That is because when you run AutoKern in parts, multi‑glyph kerning classes cause intractable problems. To understand how, suppose that, in the distribution of glyphs into three subsets, the glyph E ends up in subset A, the glyph Eacute ends up in subset B, and the glyph Edieresis in subset C. Also suppose that the AutoKern algorithm determines that the three, having identical left‑side outlines, can be placed in the same right‑side class in the pairings (a Second class).
What will happen is that when you run AutoKern with subsets A and B, the algorithm will create the class Second_E and include in it the glyphs E and Eacute, but not the glyph Edieresis, which will be neither in A nor in B. Next, when you run AutoKern with subsets A and C, the algorithm will create another class Second_E and include in it the glyphs E and Edieresis, but not the glyph Eacute, which will be neither in A nor in C. And later, when you run AutoKern with subsets B and C, the algorithm will either create a numbered class SecondN and include in it the glyphs Eacute and Edieresis (if you placed some other glyph in the character E), or it will create two classes, one for each glyph, Second_Eacute and Second_Edieresis (if you left the character E blank).
Later, when you merge all the classes into the same FEA file, you will have two classes Second_E with different glyphs, and the glyphs Eacute and Edieresis repeated in different classes. And each of these classes will have been paired by the algorithm with all the others. Since all these classes will have identical left‑side outlines, you would expect them to have the same kerning values in the pairings, which in theory would not cause problems. But we do not know the parameters or the logic of the AutoKern algorithm to be sure that the computed values will be the same for all these classes. Even if those values are identical, we do not know how the Code Editor’s compiler will treat these redundancies when checking syntax. Even if the compiler accepts these inconsistencies or redundancies, we do not know how the commercial software that will render the font will handle them.
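The way one class fragments across runs can be reproduced with a toy model. The grouping rule below (glyphs sharing a hypothetical `left_outline` key end up in one class) is only a stand-in, since AutoKern's real logic is unknown; the sketch shows class membership per run and does not try to imitate AutoKern's class names.

```python
from itertools import combinations

# Hypothetical data: glyphs with the same key share a left-side outline.
left_outline = {"E": "E", "Eacute": "E", "Edieresis": "E"}
subsets = {"A": ["E"], "B": ["Eacute"], "C": ["Edieresis"]}

def classes_in_run(glyphs):
    """Group the glyphs present in one run by their left-side outline."""
    groups = {}
    for glyph in glyphs:
        groups.setdefault(left_outline[glyph], []).append(glyph)
    return [sorted(members) for members in groups.values()]

classes_by_run = {}
for (name1, glyphs1), (name2, glyphs2) in combinations(subsets.items(), 2):
    classes_by_run[name1 + "+" + name2] = classes_in_run(glyphs1 + glyphs2)

for run, classes in classes_by_run.items():
    print(run, classes)
# A+B [['E', 'Eacute']]
# A+C [['E', 'Edieresis']]
# B+C [['Eacute', 'Edieresis']]
```

No run ever sees all three glyphs together, so the merged file ends up with three partial versions of what should have been a single class.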
These inconsistencies and redundancies can occur with any multi‑glyph classes, not only the 104 named with letters, but also the hundreds of others named with numbers. In probabilistic terms, it will happen to many of them, potentially involving thousands of glyphs and tens of thousands of pairings. Good luck debugging all that…
It is therefore much safer, when running AutoKern in parts, to exclude the 52 characters of the Basic Latin block, which induces AutoKern to create only single‑glyph classes that are guaranteed to have different names and different glyphs. In short, this applies the KISS principle to the problem: Keep It Simple, Stupid!
This approach, however, as explained earlier, considerably increases the number of classes and pairings between them. It also limits the number of glyphs to be processed in each AutoKern run to 1,383. And since two subsets of the total glyphs will have to be processed in each run, each one will have to have at most half that amount: one with 691, the other with 692 glyphs. To make the math easier, let us normalize to 690 glyphs per subset, 1,380 in each AutoKern run. In practice, the calculations to be performed are as follows:
- Divide the total number of glyphs to be processed by 690. If the result is fractional (it most likely will be), round up to the next integer. This is the number N of subsets into which the glyphs will have to be distributed.
- Divide the total number of glyphs to be processed by this number N of subsets. The result is the number of glyphs that should be placed in each subset. If the division is not exact (it probably will not be), some subsets will have one more or one fewer glyph than the others.
- The number of AutoKern runs needed to compute all pairs of all glyphs is the number of combinations of the N subsets taken 2 at a time, that is, N(N − 1)/2.
Three subsets (more than 1,380 and up to 2,070 glyphs) will require 3 AutoKern runs; four subsets (up to 2,760 glyphs), 6 runs; five subsets (up to 3,450 glyphs), 10 runs; six subsets (up to 4,140 glyphs), 15 runs; seven subsets (up to 4,830 glyphs), 21 runs; eight subsets (up to 5,520 glyphs), 28 runs; and so on.
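The three calculations above can be sketched in a few lines of Python. The function name `plan_runs` is made up for this illustration, and the sketch is meant for totals above 1,380 glyphs, where the work really has to be split.

```python
from math import ceil, comb

MAX_PER_SUBSET = 690  # normalized subset size used in the text

def plan_runs(total_glyphs, max_per_subset=MAX_PER_SUBSET):
    """Return (number of subsets, size of each subset, number of AutoKern runs)."""
    # Step 1: number N of subsets, rounding up.
    n_subsets = ceil(total_glyphs / max_per_subset)
    # Step 2: spread the glyphs as evenly as possible across the N subsets.
    base, extra = divmod(total_glyphs, n_subsets)
    sizes = [base + 1] * extra + [base] * (n_subsets - extra)
    # Step 3: one run for every combination of two subsets.
    runs = comb(n_subsets, 2)
    return n_subsets, sizes, runs

print(plan_runs(2000))  # (3, [667, 667, 666], 3)
```

For a hypothetical font with 2,000 glyphs, this gives three subsets of 667, 667, and 666 glyphs and three AutoKern runs.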
The AutoKern runs do not need to be sequential. To save time, they can be performed simultaneously, in parallel. My notebook — with a 12th Gen Intel Core i7‑1255U 1.70 GHz processor, 16 GB RAM, Windows 11 Pro 64‑bit — can run up to nine instances of FontCreator simultaneously, each one executing an AutoKern on 1,380 glyphs, without a drop in the performance of each instance. You can measure your machine’s performance (if it is a PC) with the Task Manager app, native to Windows: when several instances of FontCreator running AutoKern in parallel occupy 90% of the CPU processing time, you will have reached the maximum performance of your system for this type of processing. Running more instances of FontCreator will force Windows to dedicate less CPU time to each task, nullifying the benefits of parallelism.
The method explained above makes it feasible to automate kerning for all possible pairs of glyphs in a font. The number of glyphs is limited only by the available computational power. The values obtained for each pair, however, should be taken only as default values. The designer will still need to analyze the most frequent pairs in the most widely used languages to check whether the values assigned by AutoKern are indeed the most appropriate, or whether they need to be modified. No algorithm (for now) replaces the expertise of a good type designer.
And here are my suggestions for the FontCreator developers to improve AutoKern:
- DO NOT FIX whatever it is that makes the AutoKern algorithm unable to group glyphs into kerning classes when the 52 Basic Latin characters are missing. It is this providential deficiency that makes it possible to use the tool on partitions of large sets of glyphs and then merge these partitions without inconsistencies or redundancies in the generated classes.
- Create an option to run AutoKern without trying to group glyphs into kerning classes, putting each glyph in its own class from the beginning. When the 52 Basic Latin characters are missing, the many hours the program spends trying unsuccessfully to group the other glyphs into classes are just wasted processing time and electricity (and therefore money).
Even for those who run AutoKern the way it was designed to be used, grouping glyphs into kerning classes may not always be desirable. Kerning classes were invented to extend type designers’ ability to kern glyph pairs manually. Still, that is a modest gain compared to what a computer can do. Today’s machines are powerful enough to calculate the proper spacing of hundreds of thousands of glyph pairs in minutes. But they still take hours to analyze a few thousand glyphs and decide whether they can be grouped into classes. A designer who is going to accept the default values generated by AutoKern, making few later changes, will be more interested in obtaining those values quickly, to have more time later for the necessary adjustments.