January 2016
Typographical tune-up
I can’t stand it when a phrase set off in em dashes —
The regular expression you want is:
/(\w+)(—|&mdash;|&#8212;)/
Which you can then replace with:
<span class="nobr">$1$2</span><wbr>
Or, if you’d like to “upgrade” your em dashes at the same time (certainly not required):
<span class="nobr">$1 — </span><wbr>
The <span> tag binds each em dash to its preceding word, so that none may ever stab nakedly to the left. You’ll need a scrap of CSS that looks like this:
span.nobr {
white-space: nowrap;
}
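If you'd rather do the em dash pass in Ruby too, a minimal sketch might look like the one below. The method name bind_em_dashes is made up for illustration, and the entity spellings in the pattern (&mdash; and &#8212;) are assumptions about how em dashes might turn up in your markup.

```ruby
# Hypothetical helper, not taken from any library.
# Matches a word followed by an em dash, written either as the literal
# character or as one of two HTML entities (assumed spellings).
EM_DASH_AFTER_WORD = /(\w+)(—|&mdash;|&#8212;)/

def bind_em_dashes(html)
  # Wrap the word and its dash in a no-wrap span, then offer the browser
  # a break opportunity right after it with <wbr>.
  html.gsub(EM_DASH_AFTER_WORD, '<span class="nobr">\1\2</span><wbr>')
end

puts bind_em_dashes('a phrase—like this one')
```

Ruby's gsub writes backreferences as \1 and \2 where the replacement above uses $1 and $2; the single quotes keep those backslashes intact.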
While I’ve got the glue out, I also run a bit of code to prevent widows and a particular kind of orphaned I. (That’s another annoyance I’m not sure is universally condemned, but which is definitely condemned by me.)
In Ruby, that processing block looks like this:
# a grab-bag of potential ending punctuation
content = content.gsub(/(\w+) (\w+[.!?):]+)<\/p>/,
                       '\1&nbsp;\2</p>')
# links don't necessarily have ending punctuation, tho
content = content.gsub(/(\w+) (\w+)<\/a>/,
                       '\1&nbsp;\2</a>')
# no orphan Is (\b keeps words that merely end in I, like "MRI", out of it)
content = content.gsub(/\bI (\w+)/,
                       'I&nbsp;\1')
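To check what a pass like this does to real markup, here is a small sketch that wraps the three substitutions in one method and runs it on a sample paragraph. The method name glue_widows and the NBSP constant are hypothetical, and using the &nbsp; entity as the glue is an assumption; a literal non-breaking space character would work the same way.

```ruby
# Assumed glue: the HTML entity for a non-breaking space.
NBSP = '&nbsp;'

# Hypothetical wrapper around the three substitutions.
def glue_widows(content)
  content
    # last two words of a paragraph that ends in punctuation
    .gsub(%r{(\w+) (\w+[.!?):]+)</p>}, "\\1#{NBSP}\\2</p>")
    # last two words of a link
    .gsub(%r{(\w+) (\w+)</a>}, "\\1#{NBSP}\\2</a>")
    # a standalone I and the word that follows it
    .gsub(/\bI (\w+)/, "I#{NBSP}\\1")
end

puts glue_widows('<p>So I went for a long walk.</p>')
# prints: <p>So I&nbsp;went for a long&nbsp;walk.</p>
```

The %r{...} literal is only there so the slashes in the closing tags don't need escaping.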
January 2016, Berkeley