Saturday, May 17, 2014

Drop Caps: Other Writing Systems, Other Styles

In this entry I’ll start with some more Latin script examples in order to illustrate some principles, and then discuss some variants for other writing systems. This entry is very incomplete and needs a lot more research; I'm publishing it so people can help.

First, a cameo initial; that is, where the letter has a solid background:

The picture shows an SGML viewer I wrote a long time ago, displaying part of Thomas More’s Natural History. The initial is white on a dark reddish rectangular background, and because the background is such a strong design element it is that background that aligns with the cap height and with the baseline, not the actual capital “I” inside it. If you are working with a font for decorative initials this is what will happen automatically. This example aligns with the small caps height, by the way; that would work if the entire first line, or at least the first two words, had been set in small caps, but with just the “T” in “IT” in small caps the result is somewhat unsatisfactory.

I am showing this example now because it gives a way of making your own decorative initial: you can draw a rectangle and put any letter inside it, and as long as the letter is well within the rectangle and as long as there is strong contrast you can get away with it. Note that you cannot do this with an img element and float, in general, as you don’t know what font size the user will have for the text and hence you don’t know how big to make your rectangle. With CSS calc() you could maybe say it is 2LH + 1CH, where CH is the cap height of the current font, but I don’t think we have that unit yet, and LH (line height) is probably not supported either.
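
To see why a fixed-size image cannot work, here is a minimal sketch in Python of the 2LH + 1CH arithmetic. The function name and the two ratios (line height 1.25 times the font size, cap height 0.75 times the font size) are assumptions made up for illustration; real fonts and style sheets will differ.

```python
def cameo_height(font_size, n_lines=3, line_height_ratio=1.25, cap_height_ratio=0.75):
    """Height of a cameo rectangle that runs from the cap height of the
    first line down to the baseline of line n_lines.

    The two ratios are hypothetical; real values depend on the font and CSS.
    """
    line_height = font_size * line_height_ratio   # LH
    cap_height = font_size * cap_height_ratio     # CH
    # (n - 1) baseline-to-baseline steps plus one cap height: 2LH + 1CH for n = 3.
    return (n_lines - 1) * line_height + cap_height


# The same three-line initial needs a different rectangle at each user font size.
for size in (12, 16, 20):
    print(size, cameo_height(size))
```

The rectangle height changes with every font size the reader might choose, which is exactly what an img with fixed dimensions cannot follow.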

Next an example to show the idea of treating a “glyph” from a font as an image and then running the text around it:
This “A” is from an article by Allan Haley in U&lc Type Selection (London, 1992). It’s hard to see that the top of the A aligns with the cap height in the first line as there are no caps; the feet sit happily on the sixth baseline from the top, and the serif at bottom left protrudes a little further into the margin than would be normal, in order to make an effect. But the most important thing here is that the text abuts the edge of the A as closely as possible.

This is not the usual way to do a drop cap, partly because it wasn’t possible with earlier typesetting systems: with metal type and woodcut or metal initials this level of kerning involved filing (or kerfing) not only the initial but also the individual sorts, or pieces of type, for the first character on each line, and doing that with an apostrophe would be awfully hard. With film photosetters all the way to early laser printers the glyph outline shapes were not available to the renderer - only the character widths.

Today it’s possible to do this sort of work automatically, but the way to do it in CSS would probably be to be able to derive an image from a font glyph, and then use a runaround. The alignment points of the image would have to be set just as for a regular bitmap or SVG image used as a drop cap.

Let’s look at some more Western examples before moving on; I’ll use examples from the same article.

The raised initial “r” here is thoroughly modern and very conscious: people reading the text will notice that letter. It’s again treated as an image, so that the text can be run up next to the letter. One more of these, showing a different alignment point. I think this does not need to be in scope for CSS, because you would probably have to set the whole paragraph so carefully that you would end up using SVG, or an image.
Notice the top bar of the “f” lining up with the heading to the left as well as to the x-height on the first line of text. This is not really a drop cap or initial letter at all.

You can see the rest of that article starting at page 18. There are quite a few more examples.

Now that we have seen some more complex Western examples, let’s consider other writing systems, starting with Arabic. The initial capital does not work well with traditional Arabic calligraphy, because the individual letters are joined together and you can’t join a large thick letter to a small thin one without something looking really odd. The examples I have seen are mostly modern. Some use a modern sans-serif Arabic font, and it seems one approach is to draw the first letter large in a square, rather as I did with the cameo initial at the start of this blog entry, and to continue with a joiner such as “ـ” and the rest of the word small. If the square (whether cameo/reversed or not) is not used, the sans-serif font seems to work best. A couple of examples (with kind thanks to Roozbeh Pournader and Shervin Afshar) can be found here and here. Typesetters and designers are still experimenting with alignment and positioning and style, as the initial cap seems relatively new. A larger initial word also occurs in Arabic, and in the examples I have seen so far is simply raised, like the raised initial “r” above.

The Arabic tradition of text as playful calligraphy suggests techniques like the “f” above will be common, and indeed they are common in calligraphy: I have seen a great many examples from Iran and Iraq especially. But they are not so common in typeset work, possibly because it quickly becomes too difficult. There are a couple of artistic calligraphic examples here and here by ibrahim abu touq of Jordan, or here is one by Malik Anas Al-Rajab in Iraq. You can’t do anything that wild with OpenType (although there are some sophisticated attempts) and the boundary between typography and calligraphy is fluid in the Arabic world, but the typography is only slowly starting to catch up with the calligraphy. The engines simply couldn’t take it.

See also this Google Group discussion.

For Chinese and Japanese I have seen examples with the first ideogram, or the first few characters, set in a box with a line round it, rather like some of the Arabic script examples. In this case, the box aligns with the character grid used for the text. However, it seems uncommon. A copy of GQ Magazine from Japan has nothing like a drop cap anywhere that I can see. I’d love to know more. The examples I have seen of a two-line drop cap in Japanese may just be a function of someone using InDesign or Microsoft Word (or JustSystems?) which supports the feature in some way. What about Korean? Mongolian?

For Indic languages such as Hindi (using the Devanagari writing system) the principle seems to be to use the top and bottom extent of the letter, just like an “A” in the simpler Latin script styles. Devanagari is interesting because the nominal baseline is actually not at the bottom of the characters, and the writing system does not really have the same alignment points as the Latin and Greek scripts. As with Hebrew (see below) it’s not just a single character that is made larger. For Hindi it’s generally a syllable, so that in स्थिति it’s the स्थि that is enlarged. I do not have information about alignment: does the result share a common “clothes line” baseline?

I do not yet have information on Indic scripts other than Devanagari (the example here came from Richard Ishida of W3C) but I have seen printed samples in other scripts. I had hoped that the W3C Indic language Requirements Document would clarify this, but the document hasn’t really got far enough yet.

For languages like Slavonic and Vietnamese using Latin characters with lots of diacritical marks you end up having to decide what to do with accents. Accents, or diacritical marks, are entirely ignored when sizing and positioning an initial cap, but are taken into account when you have to avoid other things (such as text above or below) bumping into the letter. In practice this means that extra marks below a letter will probably cause an extra line or two of the main text to be indented, but the baseline of the main character aligns with the baseline of the nth line of text, and the cap height of the main character aligns with the cap height of the main text on the first line, even if that pushes accents up above the start of the text. This means that a 5-line drop cap (say) will use the same font size for the initial throughout a publication (assuming the body font and line spacing are the same), regardless of accents, and that’s what you want, following the graphic design principles of Similarity and Repetition.

This leads me round to Hebrew, where the entire first word of a paragraph is sometimes set larger making a “drop word.” Again I do not have information on alignment, especially of cantillation marks (John Hudson maybe?)

Cyrillic drop caps seem to work the same as Latin ones.

In almost all the cases I have seen the basic principle is the same: you line up the base character (or syllable or word) with the top of the base characters on the first line and the baseline, or bottom of the base characters (e.g. for Devanagari) on the bottom. For vertical scripts top and bottom are obviously rotated appropriately.

Wednesday, May 14, 2014

Using images as initial drop caps

Decorative initials are often sold as images, whether bitmap or vector, and whether as a set (such as one for each letter from A through Z to Æ) or singly. It’s common for alphabets to be incomplete, partly because J and W came relatively recently to our alphabet and partly because you often only get letters that occurred at the start of a chapter in the book for which they were designed.

Images lack some important information that is available for glyphs in a font. Fonts have information about where the baseline is to be found, and sometimes also the x-height and cap-height, and the character advance which is generally different from the bounding box. Images only have the bounding box: the size of the image. But we need that other information in order to position a drop cap correctly.

I’ll use an example from the GNU/Linux™ programmers’ manual to illustrate what I mean. Here, the blue and red “L” is a single image (a JPEG image in this case, from my own Web site, of course).

To make clearer exactly what’s going on I drew some lines:

Here, the image actually goes from just inside the left edge to line B; I darkened the image slightly so you could see the rectangle. Vertically it goes from line 1 at the top down to line 4 at the bottom. The image is positioned so that the top of the red “L” lines up exactly with the top of the first line of capital letters (line 2). The baseline of the “L” is harder to see, but lines up exactly with the baseline of the fourth line of text (the line marked 3).

Because the first word is LSEEK, the rest of the first line (SEEK) is set close against the L. The remaining lines are flush against the edge of the image (line B) in this example. I’m not yet ready to describe non-rectangular images.

To get this right, we need to know the distance from line 1 to line 2, so that we can float the image backwards in the text until the top of the “L” lines up (I am assuming the reference to the image occurs just before the SEEK, in that paragraph, and not earlier in the page). We also need to know the position of line 3 so we can align the baseline to the text. In general, if you can't align everything, aligning the baselines is most important, although in this particular case you could reasonably argue that the curvy flourish of the “L” gives extra freedom. Don’t succumb to temptation: do it right.

There are actually three ways to set this initial and get everything aligned.

  1. Choose the size of the main body text and the line spacing so that you get an even number of lines (see below for the actual calculation);
  2. Scale the image so that it fits—this would work best for an SVG image rather than the JPEG that I have;
  3. Allow the image to rise above the start of the paragraph (least desirable but easiest to do).

The formula for the text size is that you want three times the full baseline-to-baseline distance plus the cap height of the text font to equal the height of the letter (not the height of the whole image, of course, which is almost never the same thing). Since in CSS you can never be certain which fonts will be used to display a document for any given user and browser/user agent, you cannot get this formula right for the Web. You can get it right for print, if you are printing your own document, of course.
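
As a rough sketch of that formula in Python: if the line spacing and the cap height are both known multiples of the font size, you can solve for the font size directly. The function name and the ratios in the example are hypothetical; in print you would measure them from the actual font.

```python
def body_font_size(letter_height, n_lines, line_height_ratio, cap_height_ratio):
    """Body font size at which an initial of the given letter height runs
    from the cap height of line 1 to the baseline of line n_lines.

    letter_height is the height of the letter itself, not of the whole image.
    The ratios are line height and cap height as multiples of the font size.
    """
    if n_lines < 1:
        raise ValueError("n_lines must be at least 1")
    # (n - 1) baseline-to-baseline distances plus one cap height,
    # all expressed per unit of font size.
    span_per_unit = (n_lines - 1) * line_height_ratio + cap_height_ratio
    return letter_height / span_per_unit


# Hypothetical numbers: a letter 72 units tall over four lines of text,
# with line spacing 1.25 and cap height 0.75 times the font size.
print(body_font_size(72.0, 4, 1.25, 0.75))  # 72 / (3 * 1.25 + 0.75) = 16.0
```

With four lines this is the “three times the baseline-to-baseline distance plus the cap height” case described above.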

Another difficulty is that CSS does not currently have a way to force lines to have their baselines glued to an invisible grid that’s a multiple of a line height (or a multiple of 0.5LH units, or ⅓LH for some designs). So the superscripted “e” on the first line may mess everything up.

The imaginative reader will also be wondering what should happen when the paragraph with the decorative initial should fall at the start of a page instead of the middle, so that there is no room for the blue flourishes above it. It would be nice to be able to supply alternate images for that case, but I am going to ignore it here, because it would be a separate mechanism entirely.

Scaling the image to fit also requires the same formula, but now instead of the cap height and line spacing being variable we have the image height being variable and the other quantities fixed.
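
Here is a rough sketch in Python of the scaling case. It assumes the image comes with two extra measurements, the distance from the top of the image to the top of the letter (line 1 to line 2 in the figure) and to the letter’s baseline (line 3); the class and field names are hypothetical, since images carry no such metadata today.

```python
from dataclasses import dataclass


@dataclass
class InitialImage:
    height: float      # full image height (line 1 to line 4)
    letter_top: float  # from image top to the top of the letter (line 2)
    baseline: float    # from image top to the letter's baseline (line 3)


def fit_image(img, n_lines, line_height, cap_height):
    """Return (scale, rise_above, hang_below) for an image initial.

    scale      -- factor to apply to the image
    rise_above -- how far the scaled image extends above the cap height of line 1
    hang_below -- how far it extends below the baseline of line n_lines
    """
    # The letter must span (n - 1) line heights plus one cap height.
    target = (n_lines - 1) * line_height + cap_height
    scale = target / (img.baseline - img.letter_top)
    rise_above = img.letter_top * scale
    hang_below = (img.height - img.baseline) * scale
    return scale, rise_above, hang_below


# Hypothetical measurements: the letter occupies rows 48 to 192 of a 240-unit image.
img = InitialImage(height=240.0, letter_top=48.0, baseline=192.0)
print(fit_image(img, n_lines=4, line_height=20.0, cap_height=12.0))
```

The second value is the amount by which the image has to be floated backwards so that the top of the letter, rather than the top of the image, meets the cap height of the first line.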

As with drop caps made from fonts, it is always visible baselines that align, not invisible CSS line boxes. But with images, we need a way to mark the baseline, the indent (which we probably already have with text-indent) and the topmost (cap-height) alignment point, and to give a kern amount for the first line in the event that the letter forms part of a word.

Friday, May 9, 2014

Bruce Rogers on Drop Caps

The following notes on large (or decorative) initial capital letters are taken from Bruce Rogers, Paragraphs on Printing, 1943, New York. The copyright was not (as far as I can tell) renewed, placing the work out of copyright.

If you are short of time skip to the end; figures g, h and i represent, for Bruce Rogers, “how it should be done”. The opinion in the article on whether to align with small caps on the first line may be disputed in the case that the entire first line is set in small caps. The opinion about complex shapes not being worth the trouble to do right comes from the difficulty of hand-set metal type and no longer applies to computer type. Notes in square brackets are mine.


Usually it is desirable to distinguish the opening of a book, or a chapter, or sometimes even of a paragraph, by inserting an initial letter larger than the type. When individual types [i.e. metal pieces of type - Liam] were more precious than they are now a wise printer would have several founts of what was called ‘titling’ type. These were often the larger letters of one of the standard faces in use in his [printing] office, cast without shoulder at the bottom so that they could be easily justified with any body type. A printer, lacking this titling type, would just insert one of the large capital letters of the fount [font], but usually without cutting off the shoulder, which would damage the type for further use with its own lower case. This naturally resulted in an undue amount of white space below the letter, and to balance this he would leave a corresponding space between the side of [p. 113] the letter and the text type in all the lines abutting on the initial, with the exception of the first line which, with the first word usually in capitals, was brought over close against it (a and c, p. 114).

This became such a convention in composition that when rectangular or other shapes of ornamental engraved initials were inserted, the white spacing between the block and the type was still exaggerated, even though the bottom of the letter aligned perfectly with one of the text lines (e).
This was known as the ‘river and bridge’ practice, when it came in for criticism at the hands of writers on typography; but to this day many compositors do not seem to have ever encountered any condemnation of the custom, and in the work of otherwise careful printers there is an astounding indifference to the manner of fitting in an initial letter.

One of the most common errors occurs when the initial is the beginning of a word which is completed in small capitals instead of capitals. Almost invariably the top of the [page 114: figures, inserted at references] [page 115] letter will be found aligned with the top of the small capitals, instead of being lifted slightly above them to align with the capitals of the fount (b).
If your initial will not range [that is, align] at the bottom with any of the lines of type [because pieces of metal type are physical objects of fixed sizes] it is preferable to raise it slightly above the first line, rather than to sink it (d, f).

Of course a safe way out of these perplexities is to use an initial only a little larger than the text and set it so that it aligns at the bottom with the first line of type, extending upward into the blank space left for the chapter heading (j).
This is a ‘modern’ rather than an ‘old-style’ fashion, and is generally more appropriate when the composition is in modern type, though in fact it does occur very early in the history of printing and was common all through the eighteenth century. Bodoni followed this fashion continually. In many of his books the word, after the capital, is continued in lower-case type.

If calligraphic or other initials of free and irregular form are employed it is impossible to give any explicit directions for their setting. They should be dealt with according to their various shapes, and your eye alone must be the final arbiter. It is usually better to let them stand above the type page (in the Bodoni manner) rather than to leave a blank rectangle in the type for them to fill. (Illustration 63).

Because of their shapes the letters A and L are especially difficult to fit as initials. If the A is an article [that is, a stand-alone single-letter word “a”], it can have a blank rectangle to itself (g), but if it is the beginning of a word it is usually better to cut into the top of the body to permit fairly close setting of the remainder of the word (h).
It is also an advantage to letter-space the word slightly to help equalize the space between it and the initial.

The letter L is still more difficult to handle, but if it be large enough to occupy more than two lines of type it should be cut in for the first line (i), as with A (h).
If the opening sentence is a quotation it is always a question how to indicate it, especially if the initial be a rectangular decorative one. The custom in some offices is to indent the whole initial to permit the quotes to be set outside it, but this always looks a misfit. If the quotes can be put in the margin they are better so; but otherwise it is allowable to ignore the opening quotes entirely. Their absence very seldom causes any confusion to the reader.


Figure 63 mentioned in the text is a reproduction of a full page, with the dotted lines showing the boundary of the original page:

Thursday, May 1, 2014

vi and emacs

Years ago as an undergraduate I spent a holiday working for a CAD/CAM software firm and used emacs there. The alternative was an ancient line editor on PR1MOS... so I got pretty used to emacs.

When I got back to University I was for a while a convert. It meant using the department's VAX in the evening, because emacs was banned in busy hours (it used too many system resources).

One day I was with some friends and happened on an insight that a block of C code should be in a loop. I logged in but, not wanting to keep people waiting, used vi to edit the code. There was already an if statement and I just needed to indent the block, so without thinking I used the vi command >% to do that.

Afterwards I reflected. I had never indented a block using >% before. The reason I knew the command was not that I'd memorized thousands of weird keystrokes in case I ever needed them. It was because > is the vi command to indent by one tab stop, and % means from here to the matching bracket. I knew both of those separately and could do, for example, >L to indent every line from the current one to the bottom of the screen, or >1G to indent from here back to the start of a file, and I knew y% to yank (copy) from here to matching bracket, d% to delete, so >% just came naturally. It's a language, verb noun.
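
To make the verb-noun idea concrete, here is a toy model in Python. It is only a sketch and not how vi is implemented: the names are made up, the % motion handles only braces and assumes the current line contains the opening one, and a single tab stands in for “one tab stop”. The point is that operators and motions are defined separately and any pair combines.

```python
def matching_bracket_line(lines, row):
    """Motion '%' (simplified): the line holding the '}' that closes the
    first '{' found on or after the current line."""
    depth = 0
    for i in range(row, len(lines)):
        for ch in lines[i]:
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0:
                    return i
    raise ValueError("no matching bracket")


# Nouns: each motion maps (lines, current row) to a target row.
MOTIONS = {
    "%": matching_bracket_line,
    "1G": lambda lines, row: 0,  # first line of the file
}


# Verbs: each operator takes an inclusive line range and returns
# (new lines, yanked text or None).
def indent(lines, first, last):
    shifted = ["\t" + line for line in lines[first:last + 1]]
    return lines[:first] + shifted + lines[last + 1:], None


def delete(lines, first, last):
    return lines[:first] + lines[last + 1:], lines[first:last + 1]


def yank(lines, first, last):
    return lines, lines[first:last + 1]


OPERATORS = {">": indent, "d": delete, "y": yank}


def run(command, lines, row):
    """Apply a one-character verb followed by a noun, such as '>%' or 'd1G'."""
    verb, noun = command[0], command[1:]
    target = MOTIONS[noun](lines, row)
    first, last = sorted((row, target))
    return OPERATORS[verb](lines, first, last)


code = ["if (x) {", "a();", "b();", "}", "c();"]
indented, _ = run(">%", code, 0)
print(indented)
```

Three verbs and two nouns give six commands here, and nobody had to define >% specially; adding one more motion to the table makes it work with every operator at once.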

If I’d needed to do the same thing in emacs I'd have probably researched whether there was a command for it - not impossible but this was James Gosling’s emacs in 1983 or so, and it was still relatively minimal. When there wasn't, I'd have written a LISP function, documented it, added it to my personal library, bound it to a key, and, by this time, would have entirely forgotten why I wanted it.

For me, the simple clear language means I don’t need to think about editing, I can just do it. But maybe it’s more like the “Unix Philosophy” of having distinct commands that one can combine, rather than a few giant applications that don’t interact with each other, the dominant design pattern at the time.

It’s not about one being better than another, because people are individuals and different tools have different strengths and weaknesses for different people. You do have to learn a language with vi, and understand that it’s a language rather than just a set of weird and arbitrary commands, so the initial learning curve is much steeper than for, say, Windows Notepad. For me it was well worth it.