[thelist] Regular Expression Question
Sam-I-Am
sam at sam-i-am.com
Thu Nov 13 10:25:25 CST 2003
> She said "hello, world" [<a target="_new"
> href="link.html">1</a>]
>
> Is translated to:
>
> She said "hi, there." [<a target="_blank"
> href="link.html">1</a>]
>
hi Beau,
I'm also going to side-step your question. If you're using Perl, I'd
recommend using HTML::TokeParser (one of the HTML::Parser family of
modules), which hands you a token for each HTML tag (or text, comment,
etc.) it encounters. For a start tag, the token carries the tag name,
attribute hash, and original string. So you can focus on positive
matches rather than trying to regexp your way through an entire
document with all the possible exceptions.
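
A rough sketch of that approach, assuming the goal is only the
target="_new" to target="_blank" change from the quoted example. The
sub name retarget_links is made up for illustration, and the tag
rebuild is deliberately simple (it normalises attribute quoting and
spacing), so treat it as a starting point rather than a finished
filter:

```perl
use strict;
use warnings;
use HTML::TokeParser;
use HTML::Entities qw(encode_entities);

# Hypothetical helper: copy the document through token by token,
# changing only <a target="_new"> start tags.
sub retarget_links {
    my ($html) = @_;
    my $p = HTML::TokeParser->new(\$html) or die "cannot parse input";
    my $out = '';
    while (my $token = $p->get_token) {
        my $type = $token->[0];
        if ($type eq 'S') {
            # Start tag token: ["S", $tag, \%attr, \@attrseq, $text]
            my (undef, $tag, $attr, $attrseq, $text) = @$token;
            if ($tag eq 'a' && ($attr->{target} // '') eq '_new') {
                $attr->{target} = '_blank';
                # Rebuild the tag with attributes in their original order.
                $text = "<$tag"
                      . join('', map { qq{ $_="} . encode_entities($attr->{$_}) . '"' } @$attrseq)
                      . '>';
            }
            $out .= $text;
        }
        elsif ($type eq 'T' || $type eq 'C' || $type eq 'D') {
            # Text, comment, declaration: original source is in slot 1.
            $out .= $token->[1];
        }
        else {
            # End tags and processing instructions: original source is last.
            $out .= $token->[-1];
        }
    }
    return $out;
}

print retarget_links(qq{She said "hello, world" [<a target="_new"\nhref="link.html">1</a>]}), "\n";
```

Everything that is not a matching start tag is written back exactly as
it appeared in the source, so the text between tags is never touched.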
Failing that, maybe split it into 2 steps? Use /<([^>]+)>/ to capture
the contents of each tag, unescape the quotes (and whatever else you
want to do) inside that capture, and write it back out.
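
A minimal sketch of the two-step idea, assuming the quotes inside tags
are backslash-escaped (\") and that the only other change wanted is
target="_new" to target="_blank". Both sub names are made up for
illustration. It will misfire on any tag that has a literal > inside
an attribute value:

```perl
use strict;
use warnings;

# Step 2 (hypothetical helper): fix up the inside of one tag.
sub fix_tag {
    my ($inside) = @_;
    $inside =~ s/\\"/"/g;                            # unescape \" to "
    $inside =~ s/\btarget="_new"/target="_blank"/;   # assumed extra fix
    return $inside;
}

# Step 1: capture what sits between < and >, then write it back.
sub fix_tags {
    my ($html) = @_;
    # /e evaluates the replacement as code, so fix_tag sees each capture.
    $html =~ s/<([^>]+)>/'<' . fix_tag($1) . '>'/ge;
    return $html;
}

print fix_tags('She said "hello, world" [<a target=\"_new\" href=\"link.html\">1</a>]'), "\n";
```

Text outside the angle brackets never reaches fix_tag, so the quotes in
the sentence itself are left alone.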
hth
Sam