- 1 Introduction
- 2 Smart fonts — the basics
- 3 The bad news: Multiple technologies (TIMTOWTDI)
- 4 Text Services Application Programming Interfaces (APIs)
- 5 The Contenders
- 6 Comparison of available technologies
- 7 References
1.1 Your first model
My son puts together plastic models of World War II airplanes. He can tell you the history, performance, armament, kill ratio, and numerous other specifications for a couple dozen aircraft types. When we bought his first AirFix model, we looked for one with “Level 1” difficulty because I knew it would have just a few parts to fit together. With a little experience under his belt, he has now moved on to Level 2 and 3.
Understanding font and rendering technologies is a little like that. Most of us started at Level 1 when we learned that one character was represented by one byte in the computer. We probably developed a mental model of rendering that looked something like this:
That is, the byte value for each character is used to index an array of shapes (i.e. font), and the resultant sequence of shapes is laid out on the paper, one next to the other.
Just like my son, who learned that Level 1 models do not have the detail he wanted, we need to understand that the rendering model shown above is naîve and inadequate for a number of reasons, and we need a more accurate understanding of what is going on during rendering. That is what this section is about.
1.2 Why the simple model doesn’t stand up
One of the first obvious weaknesses of the simple model is that a font cannot support more than one character set and encoding1. For example, the single font named “Times New Roman” supports several different character sets (Western, Eastern European, Cyrillic, Arabic). Further, we know that this same font has different encodings for the same characters on different platforms (e.g., Windows and Macintosh). The simple model does not allow us to do this.
Another weakness of our simple model is that it allows for small character sets only, i.e., those that have less than a few hundred characters. Not only do we need Unicode (so we can get beyond 256 characters), we need an efficient mapping from Unicode character code domain (0 .. 0x10FFFF) to glyph — we do not want our font to require an array of 1,114,112 glyphs just to support a thousand Unicode characters needed for, say, Ethiopic.
Finally, we recognize that rendering (even Latin text) is more difficult than simply putting a bunch of glyphs side by side in a row. In Latin text, for example, diacritics need to be positioned (or even stacked) over the preceding character and simultaneously any dotted letters (i or j) on which the diacritics are stacking probably need to lose their dots. This model obviously cannot handle non-Roman scripts, with their complexities of contextual glyph shapes, reordering, and positioning.
1.3 Simple TrueType model
The first enhancement we will look at is what I’ll call the TrueType model. The essential difference between this model and our simple model is the introduction of one or more mappings between coded character sets and the glyph repertoire. You will recall that in TrueType fonts a table called the cmap is used to find what glyph is to be used for a given character code. Further, a TrueType font can -– and most do -– contain more than one cmap2.
As it turns out, TrueType permits each computer platform to have its own cmaps, and Apple and Microsoft have used different design strategies in their overall system design. Specifically, Microsoft has chosen to implement just one cmap, and it maps from Unicode3 to glyph. This was true even on Windows 3.x that did not otherwise know about Unicode. Windows uses codepages to map from 8-bit character sets to Unicode, but that is the subject of another paper4. Apple, on the other hand, implements multiple cmaps for itself, each one mapping from a particular 8-bit encoding (e.g., MacRoman, MacArabic) to the glyph palette5.
A picture is worth 1000 words, they say, so have a look at the following:
1.4 Why the TrueType model still is not enough
Well we have certainly overcome one of the limitations of the simple model: TrueType fonts support multiple simultaneous encodings and more than a few hundred characters. We can see how a single font such as Times New Roman implements multiple character sets on both Windows and Macintosh, and the two platforms each have their own distinct mappings. However, for a given platform and encoding, the TrueType model is still limited to one-to-one character-to-glyph mappings. That is, a given character code always shows up as the same glyph.
So we call the TrueType model a “dumb font” model because it does not have the ability to assist with complex rendering issues such as diacritic stacking, contextual forms, reordering, etc. This contrasts with a “smart font” which can handle these additional features.
2 Smart fonts — the basics
In order to support the additional complexities of non-Roman scripts, we have to get away from the one-to-one character-to-glyph mappings that have limited us so far. Smart fonts contain additional information (the “smarts”) that assists in implementing complex transformations between the encoded data and the display surface (printer or screen).
2.1 Complex transformation types
Just what do I mean by complex transformations? In order to support non-Roman scripts, we need to facilitate at least the following kinds of complexities (there are more, but this is a sampling):
2.1.1 Ligature Substitution
In some scripts, specific sequences of letters have shortcut forms called ligatures. Often these have developed over the centuries as either more efficient or more ornate ways to write the sequences. As a Latin script example, the ampersand character “&” is actually a ligature of “et”, Latin for “and”. In Arabic there are hundreds of ligatures, some of which are required— you dare not write without them.
2.1.2 Contextual Substitution
In some scripts the shape of a letter is dependent on the letters around it. A simple example of contextual substitution is the sigma in Greek: it appears differently in the middle of the word than at the end of the word. Fonts that can simulate cursive handwriting are another example.
Most of us are used to thinking of glyphs on the page appearing in the same order as they are spoken (which is usually how they are encoded). Indic languages are not so simple: some letters get moved to the beginning (or end) of a syllable or word.
We have already mentioned glyph positioning in regard to diacritic handling. Certain scripts in Asia have characters that visually wrap around other characters. Many glyph positioning problems are contextual in nature, i.e., the position of one glyph is dependent on what is before or after it.
Rendering complex scripts also means having to deal with directionality issues. There are three distinct areas of concern: baseline orientation, script direction, and “bidi” (bi-directional) issues. Orientation distinguishes scripts like Chinese and Mongolian, which can be written top-to-bottom on the page from scripts that are written horizontally. Directionality refers to the fact that while most horizontal scripts run left to right, some such as Hebrew, Arabic and Syriac run right to left. Bidi issues arise from mixing languages that go different horizontal directions, e.g., a Hebrew phrase in an otherwise-English paragraph. Some scripts are internally bi-directional — in Arabic, for example, the nominal flow is right-to-left but numbers are written left-to-right.
2.2 Smart fonts cannot do it alone
We said that smart fonts contain additional information (the “smarts”) that assists in implementing complex transformations between the encoded data and the display surface (printer or screen). While you can put all the additional data you want into a font (for example the TrueType file format is inherently extensible by adding new table definitions), the data is useless without rendering software that can interpret that data and do something useful with it. Thus we have our first axiom:
- Successful rendering of textual data requires a cooperative effort between rendering software and fonts.
What this means is that application developers have to design their applications to specifically take advantage of a given complex script rendering technology. In particular, existing applications cannot magically gain the ability to deal with complex scripts by the addition of some system component. Why is this?
Complex scripts introduce a number of thorny issues that applications need to be aware of. As a simple (and very common) example, consider what an application needs to do as you type text into a new paragraph. A westerner thinking about this problem probably figures he can simply output each character as it comes in from the keyboard. In complex scripts this does not work since appending a new character on the end of a sequence can change the appearance of the preceding characters. So at least the whole last word (and maybe more!) of the line needs to be erased and redrawn.
Consider how an application figures out where to break a line as text approaches the margin of the paper. A simple approach says you iteratively add one character at a time to a text buffer, ask the system how much room the text takes up, and repeat until the margin is reached. In complex scripts this does not work. It might be, for example, that adding a few more characters would cause the line to be shorter (e.g., due to a ligature).
Probably one of the nastiest complexities is known as hit testing. In interactive systems, rendering is not a one-way process from character to screen. You also have to support going the other way, from screen to character, so that when the user clicks their mouse somewhere on the screen you are able to work back through the process and figure out where in the underlying character stream the edit should be made. When ligature or contextual substitution, reordering, or complex glyph positioning is taking place, hit testing is not a simple task.
3 The bad news: Multiple technologies (TIMTOWTDI)
If you haven’t seen this acronym yet, you probably will soon anyway. It means “there is more than one way to do it.”
It is unfortunate that at this point in time there are multiple contenders for the role of complex script technology. Microsoft, while it has the lion’s share of the market, has only recently made progress in this area, and does not implement sufficient extensibility in its software to support all of the world’s writing systems. Apple, which is now on their third generation of such technology, all of which have always had the needed extensibility, does not have the market share and thus application support is lagging.
It is for these reasons that the SIL Graphite project was started, but that of course just adds yet another contender to the ring!
Each of the several contending technologies works differently and, as a result, has defined its own format for the smarts that go into a font. Specifically, each technology defines its own new TrueType tables to contain the data it needs.
3.1 Fonts that Support Multiple Technologies
It should be noted that it is possible to build fonts that support multiple technologies. Because of the extensibility of TrueType fonts, it is feasible for one font file to contain the “smarts” for several different technologies. This makes it possible to build one font that implements the same, or at least similar, behavior on each technology. Until one technology wins the competition and becomes ubiquitous, it may make sense to develop such multi-technology fonts in order to get the application coverage needed.
4 Text Services Application Programming Interfaces (APIs)
Application writers utilize operating system-supplied facilities by invoking one or more APIs (Application Programming Interfaces). Drawing text on a screen is often done by a series of API calls to first measure and divide the text into lines and then draw the physical glyphs of each line. Detecting user events such as mouse clicks involves another set of APIs. Taken together, the APIs needed to draw and edit text, including formatting, line breaking and hit testing, are often called the text services APIs.
Now the bad news: each of the various complex script technologies defines its own APIs. There is not one all-encompassing API that would permit an application writer to write one application that supports all the technologies6. So we get to our second axiom:
- Applications must be designed and implemented with a certain rendering API in mind.
In order to understand the capabilities and limitations of each technology we need to discuss at least some general characteristics of the APIs of each. There are at least two axes to look at:
4.1 Axis 1: Encoding
The first major question is: are the APIs Unicode-based or codepage-based? Unicode-based APIs have access to the full range of Unicode characters. Codepage-based APIs interpret the (typically) 8-bit characters in the text through a mapping called a codepage. So to know what an 8-bit value really represents, you have to know what codepage is being used.
APIs that are not Unicode-based are always going to be living within the limitations imposed by traditional 8-bit character sets. One of the primary limitations here is that there are a lot of characters, whole scripts in fact, defined by Unicode but which are not accessible through system-supplied codepages.
4.1.1 UTF-8 — the 8-bit imposter?
There is a complicating factor to watch out for: there is an encoding form that represents Unicode text as a series of 8-bit bytes, e.g., each Unicode character will require between 1 and 4 bytes to represent it. Called UTF-8, it was invented as a way to pass off Unicode data as if it were 8-bit character data, and thus to be able to sneak it through traditional 8-bit APIs and into interchange media such as data files.
Do not confuse UTF-8, which is a Unicode encoding, with codepage-based encodings. It might be possible to encode Unicode data in a sequence of 8-bit bytes, but passing those bytes to a codepage-based text services API is not likely to get you what you want. Conversely, a Unicode-based API may be implemented in such a way that the text data buffers are UTF-8, but passing codepage-based data to such APIs will give equally bad results.
4.2 Axis 2: Complex-script aware or not
Depending on what operating system you are using, the “standard” or default text services APIs may or may not be complex-script aware. And a given smart-font technology may have different levels of access that achieve different kinds of results. Here are some examples:
4.2.1 Windows TextOut() APIs
From antiquity, the routines DrawText(), TextOut() and (more recently) ExtTextOut() have been the work-horse APIs used to draw text. Except on OS editions localized for regions of the world where complex scripts were needed, these routines did not do any complex script handling. And in fact they still do not, except on Windows 2000 and XP. On those operating systems, which come with Uniscribe as a standard system component, these standard APIs now do complex script shaping even for programs that have not been designed for it or are not expecting it7.
4.2.2 Uniscribe or OTLS?
When Microsoft and Adobe first introduced their OpenType technology, Microsoft promised to supply developers with a set of APIs for extracting and using the OpenType information in a font. The APIs would be bundled together as the OpenType Layout Services (OTLS) library. As it turns out, the OTLS is implemented at a relatively low level, leaving the application responsible for a lot of the work when using OpenType to do either complex script rendering or fancy typography.
When IE 5, Office 2000, and Windows 2000 were in development, Microsoft realized the common need to render Unicode text could be managed by a single OS component, and so built a new set of APIs called Uniscribe. For each supported script, Uniscribe contains a component that acts as a specialized “shaping engine.” Each shaping engine assumes certain OpenType features are implemented in the fonts.
The result is that on this one platform (Windows), there are different levels at which an application might take advantage of one technology (OpenType). Applications such as Adobe’s InDesign utilize OTLS to use OpenType for full flexibility and typographic finesse, while other applications such as Word use Uniscribe (which uses OTLS) to obtain shaping for complex scripts.
4.3 FieldWorks: Multi-technology rendering
FieldWorks represents a new genre of application in respect to rendering: it implements a plug-in architecture. We said earlier in Axiom 2 that applications must be written with a specific API in mind. FieldWorks has abstracted the rendering interface and, using COM technology, allows different rendering plug-ins to be used as they are needed — even within the same document. There are currently two implementations of the rendering interface, one based on the standard Windows text services APIs (thus it does not know anything about complex scripts on most Windows platforms), and one based on SIL Graphite. It is expected that eventually a third implementation will be based on Uniscribe.
Note that Axiom 2 still holds: FieldWorks is designed and implemented with a specific API in mind. In this case it is a high-level API that is implemented as a wrapper around different underlying technologies.
5 The Contenders
Finally we get to a more detailed analysis of each of the rendering technologies.
For FieldWorks (and any other software that would like to use it) SIL International has developed Graphite. The name Graphite applies to both the font technology (you write script descriptions in a high level language and compile them into a font) and the rendering software (a module that applications need to do the rendering).
Currently the only Graphite-aware application is a styled-text editor WorldPad, but of course FieldWorks is still in development. There is significant interest in Graphite from companies outside of SIL (particularly the Linux market), and the SilGraphite open source project aims to stimulate 3rd party development (applications and fonts).
5.2 Uniscribe + OpenType fonts
Microsoft’s entrance into the arena came several years ago when, in cooperation with Adobe, they introduced OpenType (originally called TrueType Open), an enhancement to TrueType font files that provided extra tables intended to support complex scripts. However Microsoft did not simultaneously provide software that could take advantage of these extra tables but assumed that application developers would build their own. They did not, and as a result take-up on the technology has been slow.
Finally, in 1999 Microsoft introduced the needed software: Uniscribe, a Windows system-level component that could take advantage of OpenType fonts. Microsoft Windows 2000 and applications Internet Explorer 5 and Office 2000 were released with support for Uniscribe built in.
5.3 ATSUI + AAT fonts
In 1999, Apple introduced Apple Type Services for Unicode Imaging ( ATSUI) which is now the basis for all Unicode text drawing in the system. The corresponding font format, Apple Advanced Typography ( AAT), is the successor to the Quickdraw GX font technology. ATSUI is new enough that there are few applications available, but Jonathan Kew has adapted TeXgX to ATSUI, renaming it XeTeX (pronounced Zee-Tech).
5.4 Quickdraw GX + GX fonts, WorldScript+ WorldScript modules + simple fonts
Apple was first to provide sophisticated rendering technologies capable of handling complex scripts. Both GX and WorldScript have been supplanted by ATSUI+AAT, but they are still used in some environments.
WorldScript was the first of these two technologies and utilized plug-in modules (external to the Mac OS and external to the fonts) to manage the complex rendering task. Though limited in some ways (e.g., total glyph count), it was relatively easy to make an application “WorldScript aware”, so there were a lot of such applications, including ShoeBox for the Mac.
The GX technology, particularly as it relates to text rendering, was much more capable than WorldScript. Nearly all the complexities needed for non-Roman scripts could be implemented using the powerful state-machine facility of GX. Unfortunately, utilizing GX required application writers to adopt a completely new imaging model, and as a result there was poor uptake on the technology. For this and other reasons, OS 8.6 was the last system release to officially support GX. However, the advanced typography features of GX (i.e., the features that were needed to support complex scripts) live on in ATSUI + AAT.
5.5 SDF Renderer + SDF Description + simple fonts
SDF technology, implemented by SIL’s Tim Erickson, is optimized for handling the contextual substitutions of Arabic script but has been used successfully for other non-Roman scripts. SDF stands for “Script Definition File”, and this is one of the features of the technology that makes SDF so approachable: the contextual mapping rules are written in plain text in a .SDF file, and there is no compilation stage and no special fonts — you simply associate the .SDF file with the target font.
The major applications supporting SDF today are ShoeBox and LinguaLinks. A styled text editor, ScriptPad, is in development.
6 Comparison of available technologies
|Graphite||Uniscribe/ OpenType||ATSUI/ AAT||GX||WorldScript||SDF|
|Supplier||SIL (NRSI & LSDev)||Microsoft||Apple||Apple||Apple||SIL (Tim Erickson)|
|Production Status||in development||released||released||obsolete||obsolete||released|
|Glyph limits||65535||65535||65535||65535||~4000||223 (note 1)|
|Tools||Graphite compiler||MS VOLT and OT Assembler, Adobe FDK, SIL Font:TTF (Perl)||Apple AAT Font Tool, SIL ScriptWrite||Apple TrueEdit and TypeWriter, SIL ScriptWrite||SIL ScriptWrite||SDFE (SDF Editor)|
|Word Processing||WorldPad (note 2)||MS Word 2000||WorldText||SimpleText, WorldWrite, Nisus||ScriptPad (note 2)|
|Specialty||FieldWorks(note 2)||Paratext (note 2)||ShoeBox, Translator’s Assistant, TextUtils, Conc||LinguaLinks, Shoebox|
|Graphics/DTP||MS Publisher 2002||Lightning Draw, RSG, Creator2|
- There is an SDF interface that supports more glyphs, but there is no application that uses it yet.
- In Development — pre-release software available.
Constable, Peter. 2000. Understanding Multilingual software on MS Windows: The answer to the ultimate question of fonts, keyboards and everything. ms. Available in CTC Resource Collection 2000 CD-ROM, by SIL International. Dallas: SIL International.