In this entry I’ll start with some more Latin script examples in order to illustrate some principles, and then discuss some variants for other writing systems. This entry is very incomplete and needs a lot more research; I'm publishing it so people can help.
First, a cameo initial; that is, where the letter has a solid background:
The picture shows an SGML viewer I wrote a long time ago, displaying part of Thomas More’s Natural History. The initial is white on a dark reddish rectangular background. Because the background is such a strong design element, it is the background that aligns with the cap height and with the baseline, not the actual capital “I” inside it. If you are working with a font for decorative initials this is what will happen automatically. This example aligns with the small caps height, by the way; that would work if the entire first line, or at least the first two words, had been set in small caps, but with just the “T” in “IT” in small caps the result is somewhat unsatisfactory.
I am showing this example now because it gives a way of making your own decorative initial: you can draw a rectangle and put any letter inside it, and as long as the letter is well within the rectangle and there is strong contrast you can get away with it. Note that you cannot, in general, do this with an img element and float, as you don’t know what font size the user will have for the text and hence you don’t know how big to make your rectangle. With CSS calc() you could maybe say it is 2LH + 1CH, where CH is the cap height of the current font, but I don’t know that we have that unit yet, and LH (line height) is probably also not supported.
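The arithmetic behind that rectangle can be sketched in a few lines of Python. This is only an illustration, not CSS: the function name cameo_box_height and its parameters are hypothetical, and it assumes the box runs from the cap height of the first line down to the baseline of the last line it occupies, so an n-line box is (n - 1) line heights plus one cap height. The 2LH + 1CH above is the three-line case.

```python
def cameo_box_height(lines: int, line_height: float, cap_height: float) -> float:
    """Height of a box running from the cap height of line 1 to the baseline of line `lines`.

    All lengths use the same unit (for example CSS pixels).
    Hypothetical helper, for illustration only.
    """
    if lines < 1:
        raise ValueError("lines must be at least 1")
    # (lines - 1) line advances take us from the first baseline to the last one;
    # one cap height takes us from the first baseline up to the top of the capitals.
    return (lines - 1) * line_height + cap_height


# A three-line box with a 24-unit line height and a 14-unit cap height:
print(cameo_box_height(3, 24.0, 14.0))  # 62.0
```

The point of the sketch is that both inputs depend on the reader's font and font size, which is why a fixed-size image cannot be made to fit.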
Next an example to show the idea of treating a “glyph” from a font as an image and then running the text around it:
This “A” is from an article by Allan Haley in U&lc Type Selection (London, 1992). It’s hard to see that the top of the A aligns with the cap height in the first line as there are no caps; the feet sit happily on the sixth baseline from the top, and the serif at bottom left protrudes a little further into the margin than would be normal, in order to make an effect. But the most important thing here is that the text abuts the edge of the A as closely as possible.
This is not the usual way to do a drop cap, partly because it wasn’t possible with earlier typesetting systems: with metal type and woodcut or metal initials this level of kerning involved filing (or kerfing) not only the initial but also the individual sorts, or pieces of type, for the first character on each line, and doing that with an apostrophe would be awfully hard. With film photosetters all the way to early laser printers the glyph outline shapes were not available to the renderer - only the character widths.
Today it’s possible to do this sort of work automatically, but the way to do it in CSS would probably be to be able to derive an image from a font glyph, and then use a runaround. The alignment points of the image would have to be set just as for a regular bitmap or SVG image used as a drop cap.
Let’s look at some more Western examples before moving on; I’ll use examples from the same article.
Notice the top bar of the “f” lining up with the heading to the left as well as with the x-height on the first line of text. This is not really a drop cap or initial letter at all.
You can see the rest of that article at uandlc.com starting at page 18. There are quite a few more examples.
Now that we have seen some more complex Western examples, let’s consider other writing systems, starting with Arabic. The initial capital does not work well with traditional Arabic calligraphy, because the individual letters are joined together and you can’t join a large thick letter to a small thin one without something looking really odd. The examples I have seen are mostly modern. Some use a modern sans-serif Arabic font, and it seems one approach is to draw the first letter large in a square, rather as I did with the cameo initial at the start of this blog entry, and to continue with a joiner such as “ـ” and the rest of the word small. If the square (whether cameo/reversed or not) is not used, the sans-serif font seems to work best. A couple of examples (with kind thanks to Roozbeh Pournader and Shervin Afshar) can be found here and here. Typesetters and designers are still experimenting with alignment and positioning and style, as the initial cap seems relatively new. A larger initial word also occurs in Arabic, and in the examples I have seen so far is simply raised, like the raised initial “r” above.
The Arabic tradition of text as playful calligraphy suggests techniques like the “f” above will be common, and indeed they are common in calligraphy: I have seen a great many examples from Iran and Iraq especially. But they are not so common in typeset work, possibly because it quickly becomes too difficult. There are a couple of artistic calligraphic examples here and here by ibrahim abu touq of Jordan, or here is one by Malik Anas Al-Rajab in Iraq. You can’t do anything that wild with OpenType, although there are some sophisticated attempts. The boundary between typography and calligraphy is fluid in the Arabic world, but the typography is only slowly starting to catch up with the calligraphy. The engines simply couldn’t take it.
See also this Google Group discussion.
For Chinese and Japanese I have seen examples with the first ideogram, or the first few characters, set in a box with a line round it, rather like some of the Arabic script examples. In this case, the box aligns with the character grid used for the text. However, it seems uncommon. A copy of GQ Magazine from Japan has nothing like a drop cap anywhere that I can see. I’d love to know more. The examples I have seen of a two-line drop cap in Japanese may just be a function of someone using InDesign or Microsoft Word (or JustSystems?), which support the feature in some way. What about Korean? Mongolian?
For Indic languages such as Hindi (using the Devanagari writing system) the principle seems to be to use the top and bottom extent of the letter, just like an “A” in the simpler Latin script styles. Devanagari is interesting because the nominal baseline is actually not at the bottom of the characters, and the writing system does not really have the same alignment points as the Latin and Greek scripts. As with Hebrew (see below) it’s not just a single character that is made larger. For Hindi it’s generally a syllable, so that in स्थिति it’s the स्थि that is enlarged. I do not have information about alignment: does the result share a common “clothes line” baseline?
I do not yet have information on Indic scripts other than Devanagari (the example here came from Richard Ishida of W3C) but I have seen printed samples in other scripts. I had hoped that the W3C Indic language Requirements Document would clarify this, but the document hasn’t really got far enough yet.
For languages that use Latin characters with lots of diacritical marks, such as Vietnamese and some Slavonic languages, you end up having to decide what to do with accents. Accents, or diacritical marks, are entirely ignored when sizing and positioning an initial cap, but are taken into account when you have to avoid other things (such as text above or below) bumping into the letter. In practice this means that extra marks below a letter will probably cause an extra line or two of the main text to be indented, but the baseline of the main character aligns with the baseline of the nth line of text, and the cap height of the main character aligns with the cap height of the main text on the first line, even if that pushes accents up above the start of the text. This means that a 5-line drop cap (say) will use the same font size for the initial throughout a publication (assuming the body font and line spacing are the same), regardless of accents, and that’s what you want, following the graphic design principles of Similarity and Repetition.
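To make the separation between sizing and collision avoidance concrete, here is a minimal Python sketch. The function names, the initial_cap_ratio parameter (cap height of the initial’s font divided by its em size) and the rule for counting extra indented lines are all assumptions made for illustration; a real layout engine would read these values from the font and would probably use a less crude clearance test.

```python
import math


def initial_font_size(lines, line_height, body_cap_height, initial_cap_ratio):
    """Font size for an initial whose base letter spans `lines` lines of body text.

    The base letter's cap height must reach from the cap height of the first
    line to the baseline of line `lines`. Accents play no part in this.
    """
    target_cap_height = (lines - 1) * line_height + body_cap_height
    return target_cap_height / initial_cap_ratio


def lines_to_indent(lines, line_height, depth_below_baseline):
    """Number of body lines to indent so nothing collides with the initial.

    depth_below_baseline is how far marks under the letter reach below its
    baseline (0 if there are none). Assumed, deliberately conservative rule:
    any mark below the baseline costs at least one extra indented line.
    """
    if depth_below_baseline <= 0:
        return lines
    return lines + math.ceil(depth_below_baseline / line_height)


# 5-line initial, 24-unit line height, 14-unit body cap height, ratio 0.625.
# The size is the same whether or not the letter carries an accent:
print(initial_font_size(5, 24.0, 14.0, 0.625))  # 176.0

# Only the indentation changes when a mark hangs 10 units below the baseline:
print(lines_to_indent(5, 24.0, 0.0))   # 5
print(lines_to_indent(5, 24.0, 10.0))  # 6
```

Keeping the two calculations apart is what gives every 5-line initial in a publication the same size: the accent only ever affects how much text has to move out of the way.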
This leads me round to Hebrew, where the entire first word of a paragraph is sometimes set larger, making a “drop word.” Again I do not have information on alignment, especially of cantillation marks (John Hudson maybe?).
Cyrillic drop caps seem to work the same as Latin ones.
In almost all the cases I have seen, the basic principle is the same: you line up the base character (or syllable or word) with the top of the base characters on the first line and with the baseline, or bottom of the base characters (e.g. for Devanagari), on the last line. For vertical scripts top and bottom are obviously rotated appropriately.