Robin Sloan
the lab
January 2016

Typographical tune-up

I can’t stand it when a phrase set off in em dashes — like this — gets broken such that a line of text begins with an em dash. I’m not sure if that’s a recognized typographical sin or just my own peeve; either way, I use a lot of em dashes, so I decided to fix the problem.

The regular expression you want is:

/(\w+)(—|&mdash;|&#8212;)/

Which you can then replace with:

<span class="nobr">$1$2</span><wbr>

Or, if you’d like to “upgrade” your em dashes at the same time (certainly not required):

<span class="nobr">$1&VeryThinSpace;&mdash;&VeryThinSpace;</span><wbr>

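The <span> tag binds each em dash to its preceding word, so that none may ever stab nakedly to the left. The <wbr> that follows marks a spot where the line is still allowed to break, after the dash rather than before it. You’ll need a scrap of CSS that looks like this: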

span.nobr { white-space: nowrap; }
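
For anyone wiring this into a build script, here is a minimal Ruby sketch of the substitution. It assumes the em dash may show up either as the literal character or as the entities &mdash; and &#8212;. The method name bind_em_dashes and its upgrade: option are made up for illustration; they aren't from any library.

```ruby
# Hypothetical helper: the name and the upgrade: option are illustrative only.
# Matches a word followed directly by an em dash, written as the literal
# character or as one of two HTML entities (an assumption about the input).
EM_DASH = /(\w+)(—|&mdash;|&#8212;)/

def bind_em_dashes(html, upgrade: false)
  # Single-quoted strings keep \1 and \2 intact as gsub backreferences.
  replacement =
    if upgrade
      '<span class="nobr">\1&VeryThinSpace;&mdash;&VeryThinSpace;</span><wbr>'
    else
      '<span class="nobr">\1\2</span><wbr>'
    end
  html.gsub(EM_DASH, replacement)
end

puts bind_em_dashes('a phrase—like this—gets broken')
```

Note that Ruby's gsub spells the backreferences \1 and \2 where the replacement strings above use $1 and $2.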

While I’ve got the glue out, I also run a bit of code to prevent widows and a particular kind of orphaned I. (That’s another annoyance I’m not sure is universally condemned, but which is definitely condemned by me.)

In Ruby, that processing block looks like this:

# a grab-bag of potential ending punctuation
content = content.gsub(/(\w+) (\w+[\.\!\?\)\:]+)\<\/p\>/,
                       '\1&nbsp;\2</p>')

# links don't necessarily have ending punctuation, tho
content = content.gsub(/(\w+) (\w+)\<\/a\>/,
                       '\1&nbsp;\2</a>')

# no orphan Is
content = content.gsub(/\bI (\w+)/,
                       'I&nbsp;\1')
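
To see what those three passes do together, here is a small sketch that chains them and runs them over one paragraph. The method name glue_widows and the sample markup are invented for the example, and the last pattern is anchored with \b so that words merely ending in a capital I (like "MRI") are left alone.

```ruby
# Hypothetical wrapper: the name glue_widows and the sample text are illustrative.
def glue_widows(content)
  content
    .gsub(/(\w+) (\w+[.!?):]+)<\/p>/, '\1&nbsp;\2</p>') # last two words of a paragraph
    .gsub(/(\w+) (\w+)<\/a>/, '\1&nbsp;\2</a>')         # last two words of a link
    .gsub(/\bI (\w+)/, 'I&nbsp;\1')                      # the pronoun I and its next word
end

sample = '<p>So I decided to fix <a href="#">the problem</a> right now.</p>'
puts glue_widows(sample)
# prints: <p>So I&nbsp;decided to fix <a href="#">the&nbsp;problem</a> right&nbsp;now.</p>
```

The first pass only fires when the paragraph ends in one of the listed punctuation marks, so a paragraph that ends on a bare word is left as it was.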

January 2016, Berkeley