From: Robin Sloan
To: the lab
Sent: January 2016

Typographical tune-up

I can’t stand it when a phrase set off in em dashes — like this — gets broken such that a line of text begins with an em dash. I’m not sure if that’s a recognized typographical sin or just my own peeve; either way, I use a lot of em dashes, so I decided to fix the problem.

The regular expression you want is:

/(\w+)(—|&mdash;|&#8212;)/

Which you can then replace with:

<span class="nobr">$1$2</span>

Or, if you’d like to “upgrade” your em dashes at the same time (certainly not required):

<span class="nobr">$1&VeryThinSpace;—</span>&VeryThinSpace;

The <span> tag binds each em dash to its preceding word, so that none may ever stab nakedly to the left. You’ll need a scrap of CSS that looks like this:

span.nobr { white-space: nowrap; }
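
If your pipeline is Ruby, the same substitution might look something like the sketch below. The method name bind_em_dashes and the EM_DASH constant are made up for illustration, and the entity spellings in the second group are an assumption about what your HTML might contain. Note that the pattern only fires when the dash sits directly against the preceding word, with no space in between.

```ruby
# Hypothetical helper: wraps each word-plus-em-dash pair in a no-break span.
EM_DASH = "\u2014"

def bind_em_dashes(html)
  # (\w+) captures the word right before the dash; the second group accepts
  # the literal character or either of two entity spellings (an assumption).
  html.gsub(/(\w+)(#{EM_DASH}|&mdash;|&#8212;)/,
            '<span class="nobr">\1\2</span>')
end

puts bind_em_dashes("a phrase#{EM_DASH}like this")
# prints: a <span class="nobr">phrase—</span>like this
```

Everything outside the matched word and dash passes through untouched, so it should be safe to run over a whole page of markup.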

While I’ve got the glue out, I also run a bit of code to prevent widows and a particular kind of orphaned I. (That’s another annoyance I’m not sure is universally condemned, but which is definitely condemned by me.)

In Ruby, that processing block looks like this:

# a grab-bag of potential ending punctuation
content = content.gsub(/(\w+) (\w+[\.\!\?\)\:]+)\<\/p\>/,
                       '\1&nbsp;\2</p>')

# links don't necessarily have ending punctuation, tho
content = content.gsub(/(\w+) (\w+)\<\/a\>/,
                       '\1&nbsp;\2</a>')

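# no orphan Is (the \b keeps words that merely end in a capital I,
# like "WWII", from being glued to whatever follows)
content = content.gsub(/\bI (\w+)/,
                       'I&nbsp;\1')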
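
To see the three substitutions working together, here is a small sketch that chains them in one method and runs a sample paragraph through it. The name glue_widows and the sample string are invented for the example; the leading \b on the last pattern is an addition in this sketch, there so that only a standalone I gets glued.

```ruby
# Hypothetical wrapper around the three substitutions above.
def glue_widows(content)
  content
    # last two words of a paragraph, with ending punctuation
    .gsub(/(\w+) (\w+[.!?):]+)<\/p>/, '\1&nbsp;\2</p>')
    # last two words of a link, which may have no punctuation
    .gsub(/(\w+) (\w+)<\/a>/, '\1&nbsp;\2</a>')
    # a standalone I never ends a line; \b is this sketch's addition
    .gsub(/\bI (\w+)/, 'I&nbsp;\1')
end

sample = '<p>So I decided to fix the problem.</p>'
puts glue_widows(sample)
# prints: <p>So I&nbsp;decided to fix the&nbsp;problem.</p>
```

The order of the calls shouldn't matter much here, since none of the replacements produces text that another pattern would match.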

January 2016, Berkeley