Tuesday, June 19, 2007

widont.rb

I don't mean to knock Shaun Inman and his WordPress plugin Widon't. It's a very useful plugin that eliminates typographical widows by replacing the last space with a non-breaking space. He says the regular expression is '|([^s])s+([^s]+)s*$|'. As printed, that is of course wrong; something was lost in translation here.

First, the pipe characters look weird, but PHP accepts them: its preg functions allow almost any punctuation character as a pattern delimiter. I've always used the forward slash to delimit a regular expression. Second, each s is meant to be \s, the notation for whitespace of any sort.
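To see why the printed pattern can't be right, here is a small sketch comparing it with what I presume was intended. The variable names are mine, and the "intended" pattern is my reconstruction rather than something copied from the plugin source.

```ruby
# The pattern as printed on the blog, with the backslashes lost.
# Here "s" is just the literal letter s.
garbled = /([^s])s+([^s]+)s*$/

# The presumed intended pattern, where \s means any whitespace character.
intended = /([^\s])\s+([^\s]+)\s*$/

title = "Lorem Ipsum Dalor Est"

garbled_match  = garbled.match(title)
intended_match = intended.match(title)

puts garbled_match[0]   # the garbled pattern latches onto the letters inside "Est"
puts intended_match[2]  # the intended pattern captures the last word, "Est"
```

The garbled version still matches, which is what makes the mistake easy to miss. It just matches around a letter s instead of around the last space.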

The actual plugin works fine. I think Shaun's just made a typo on his blog.

Here's a similar piece of code in Ruby:

"Lorem Ipsum Dalor Est".gsub(/([^\s])\s+([^\s]+)$/, '\1&nbsp;\2')
# => "Lorem Ipsum Dalor&nbsp;Est"
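If you wanted this as a reusable method, a minimal sketch might look like the following. The method name widont is borrowed from the plugin's name, not from its code, and I'm assuming the HTML entity &nbsp; as the replacement. Unlike the one-liner, it also keeps the trailing \s* from the PHP pattern and anchors at the end of the string with \z.

```ruby
# Hypothetical helper, not a port of the actual plugin code.
def widont(text)
  # \S is shorthand for [^\s]. The trailing \s* swallows whitespace at the
  # end of the string, so the last real word is the one that gets glued.
  # \z anchors at the very end of the string rather than at each line end.
  text.sub(/(\S)\s+(\S+)\s*\z/, '\1&nbsp;\2')
end

puts widont("Lorem Ipsum Dalor Est")   # last space becomes &nbsp;
puts widont("Lorem Ipsum Dalor Est ")  # trailing space is dropped as well
puts widont("Lorem")                   # a single word is returned unchanged
```

Since only the end of the string can match, sub is enough here; gsub would behave the same way but suggests multiple replacements that never happen.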

I love how Ruby (and JavaScript, if I recall correctly) gives regular expressions special status, with their own literal syntax, much as most modern languages treat strings. In web development and command-line scripting, strings are the only real input/output you work with, and regular expressions are akin to the hand of God.
