Robin Sloan
the lab
January 2016

Typographical tune‑up

I can’t stand it when a phrase set off in em dashes — like this — gets broken such that a line of text begins with an em dash. I’m not sure if that’s a recognized typographical sin or just my own peeve; either way, I use a lot of em dashes, so I decided to fix the problem.

The regular expression you want is:

/(\w+)(—|&mdash;|&#8212;)/

Which you can then replace with:

<span class="nobr">$1$2</span><wbr>

Or, if you’d like to “upgrade” your em dashes at the same time (certainly not required):

<span class="nobr">$1&VeryThinSpace;&mdash;&VeryThinSpace;</span><wbr>

The <span> tag binds each em dash to its preceding word, so that none may ever stab nakedly to the left. You’ll need a scrap of CSS that looks like this:

span.nobr { white-space: nowrap; }
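Here is a rough Ruby sketch of that substitution, in case it helps to see it end to end. The method name bind_em_dashes and the upgrade option are made up for illustration, and the pattern assumes the three alternatives are the literal em dash and its two HTML entity spellings. Note that Ruby writes backreferences as \1 and \2 rather than $1 and $2.

```ruby
# Hypothetical helper: the name and the upgrade: keyword are illustrative assumptions.
# Assumed alternatives: literal em dash, named entity, numeric entity.
EM_DASH = /(\w+)(—|&mdash;|&#8212;)/

def bind_em_dashes(html, upgrade: false)
  replacement =
    if upgrade
      # optional variant: pad the dash with very thin spaces
      '<span class="nobr">\1&VeryThinSpace;&mdash;&VeryThinSpace;</span><wbr>'
    else
      # plain variant: keep whichever dash spelling was matched
      '<span class="nobr">\1\2</span><wbr>'
    end
  html.gsub(EM_DASH, replacement)
end

puts bind_em_dashes("dashes—like this—are everywhere")
```

Only a dash that directly follows a word gets wrapped; a dash with a space in front of it is left alone, because the pattern requires the word and the dash to be adjacent.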

While I’ve got the glue out, I also run a bit of code to prevent widows and a particular kind of orphaned I. (That’s another annoyance I’m not sure is universally condemned, but which is definitely condemned by me.)

In Ruby, that processing block looks like this:

# a grab-bag of potential ending punctuation:
# glue the last two words of each paragraph together
content = content.gsub(/(\w+) (\w+[.!?):]+)<\/p>/,
                       '\1&nbsp;\2</p>')

# links don't necessarily have ending punctuation, tho
content = content.gsub(/(\w+) (\w+)<\/a>/,
                       '\1&nbsp;\2</a>')

# no orphan Is (\b keeps words that merely end in a capital I out of it)
content = content.gsub(/\bI (\w+)/,
                       'I&nbsp;\1')
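For a quick way to try those three passes on a scrap of HTML, something like the following might do. It is only a sketch: the method name glue_widows_and_orphans is invented here, and the last pattern uses a \b word boundary so that a word merely ending in a capital I is left alone.

```ruby
# Hypothetical wrapper: the method name is an illustrative assumption.
def glue_widows_and_orphans(content)
  content
    .gsub(%r{(\w+) (\w+[.!?):]+)</p>}, '\1&nbsp;\2</p>') # last two words of a paragraph
    .gsub(%r{(\w+) (\w+)</a>}, '\1&nbsp;\2</a>')          # last two words of a link
    .gsub(/\bI (\w+)/, 'I&nbsp;\1')                       # a lone I and the word after it
end

puts glue_widows_and_orphans("<p>So I decided to fix the problem.</p>")
```

Each pass replaces an ordinary space with a non-breaking one, so the browser can no longer wrap the line at that spot.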

January 2016, Berkeley