I can’t stand it when a phrase set off in em dashes—like this—gets broken such that a line of text begins with an em dash. I’m not sure if that’s a recognized typographical sin or just my own peeve; either way, I use a lot of em dashes, so I decided to fix the problem.
The regular expression you want is:
Which you can then replace with:
Or, if you’d like to “upgrade” your em dashes at the same time (certainly not required):
<span class="nobr">$1 — </span><wbr>
The <span> tag binds each em dash to its preceding word, so that none may ever stab nakedly to the left. You’ll need a scrap of CSS that looks like this:
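A single rule such as `.nobr { white-space: nowrap; }` should do it; that exact rule is an assumption based on the class name. To try the whole substitution end to end, here is a minimal Ruby sketch. The pattern `(\w+)—` and the method name `bind_em_dashes` are hypothetical stand-ins, not necessarily the expression used on this site, and the pattern only catches a dash that directly follows a run of word characters.

```ruby
# Assumed pattern: a run of word characters directly followed by an em dash.
EM_DASH = /(\w+)—/

# Wrap each word-plus-dash pair in a no-wrap span, then add <wbr> so the
# line is still allowed to break right after the dash.
def bind_em_dashes(html)
  # Ruby's gsub writes the first capture group as \1 (many editors use $1).
  html.gsub(EM_DASH, '<span class="nobr">\1—</span><wbr>')
end

puts bind_em_dashes("a phrase set off in em dashes—like this—gets broken")
```

Text without an em dash passes through unchanged, so the method should be safe to run over a whole post.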
While I’ve got the glue out, I also run a bit of code to prevent widows and a particular kind of orphaned I. (That’s another annoyance I’m not sure is universally condemned, but which is definitely condemned by me.)
In Ruby, that processing block looks like this:
# the glue in each replacement is a non-breaking space, written here as &nbsp;

# a grab-bag of potential ending punctuation
content = content.gsub(/(\w+) (\w+[\.\!\?\)\:]+)\<\/p\>/, '\1&nbsp;\2</p>')

# links don't necessarily have ending punctuation, tho
content = content.gsub(/(\w+) (\w+)\<\/a\>/, '\1&nbsp;\2</a>')

# no orphan Is
content = content.gsub(/I (\w+)/, 'I&nbsp;\1')
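To watch the three substitutions act on a real paragraph, here is a minimal sketch that wraps them in one method. The method name `glue_widows`, the `NBSP` constant, and the choice of the `&nbsp;` entity as the glue are assumptions made for illustration; the patterns follow the ones above, minus some unneeded backslashes.

```ruby
NBSP = "&nbsp;" # assumed glue: the HTML non-breaking space entity

def glue_widows(content)
  # last two words of a paragraph, with a grab-bag of ending punctuation
  content = content.gsub(/(\w+) (\w+[.!?):]+)<\/p>/) { "#{$1}#{NBSP}#{$2}</p>" }
  # last two words of a link, which may have no ending punctuation
  content = content.gsub(/(\w+) (\w+)<\/a>/) { "#{$1}#{NBSP}#{$2}</a>" }
  # keep an "I" attached to the word that follows it
  content.gsub(/I (\w+)/) { "I#{NBSP}#{$1}" }
end

puts glue_widows('<p>Then I read <a href="/x">the whole thing</a> at last.</p>')
# prints: <p>Then I&nbsp;read <a href="/x">the whole&nbsp;thing</a> at&nbsp;last.</p>
```

The order matters only a little here: each pass looks at a different closing tag or at the letter I, so the passes should not undo one another.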
December 2020, Berkeley