Hi, Jeff.
You say “It’s considered good form to demand that regular expressions be considered verboten, totally off limits for processing HTML, but I think that’s just as wrongheaded as demanding every trivial HTML processing task be handled by a full-blown parsing engine.” Well, let me tell you my story.
Once, I had a rep of 1486 on Stack Overflow. I was so excited because finally, FINALLY, I could create my own tags. This was the objective of my life. I had earned 616 rep points in one month. I deleted my Twitter and Google+ accounts so as not to lose a second. I needed a mere fourteen points! My question at http://stackoverflow.com/q/6873945 would finally have a “mozmill” tag; http://stackoverflow.com/q/6797631 and http://stackoverflow.com/q/6797779 would have the “rhinounit” tag; I could solve problems such as http://meta.stackoverflow.com/q/98584 by myself whenever I found them. I rejoiced in anticipation.
Then I found a quite innocent question about extracting some data from HTML. The document seemed to have a pretty stable structure, so I answered with a regex that could solve the problem: http://stackoverflow.com/q/6878032#6878203. Note that I emphasized that the solution was quick’n’dirty and that a less stable document would require a more sophisticated tool.
And I got a downvote. I could see the tags I had dreamed of slipping away. I had just taken two steps back; my journey would be longer. What if more people found my answer and downvoted it too? What if I lost hundreds of rep points?! My tags! MY TAGS! I panicked. I barely managed to hold back my mourning long enough to give, between hiccups, my testimony here.
There is a clear lesson here: do not parse HTML with regular expressions in any way. It can destroy your dreams, your soul, your life. If you do it, you’ll end up smoking crack. I learned the lesson and am trying to rebuild my life, maybe - MAYBE - with the ability to create tags on SO. Do not make my mistake. It is not worth it.