[thelist] Regex

Lindsay Evans lindsay at redsquare.com.au
Thu Feb 28 16:19:15 CST 2002


> It's worth noting that the Perl Cookbook (Recipe 20.6) cites the regexp
> below as invalid for all but the most simple HTML.  If you're using Perl,
> try using a package like HTML::Parser.  Otherwise, you're going to have a
> very hard time constructing a regexp that does this.

Interesting.

Does it happen to mention any specific cases where it doesn't work?
I've used this quite a bit on some rather complex HTML (including XHTML
tags, etc.), and it worked fine.

Though I'd imagine that if you had invalid HTML to start with (e.g. <img
src="arrow.gif" ... alt=" -> ">), it would break.
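
To make that failure case concrete, here's a rough sketch in Python. The
regexp under discussion isn't quoted in this message, so the pattern
<[^>]*> below is only an assumed stand-in for a typical tag-stripping
regexp, and strip_tags and the sample strings are made up for illustration.

```python
import re

# Assumed stand-in for a typical tag-stripping pattern.
# This is NOT the regexp from the thread, which isn't quoted here.
NAIVE_TAG = re.compile(r"<[^>]*>")


def strip_tags(html):
    """Remove everything the naive pattern considers a tag (hypothetical helper)."""
    return NAIVE_TAG.sub("", html)


# Made-up sample inputs.
simple = '<p>Hello <b>world</b></p>'
tricky = '<img src="arrow.gif" alt=" -> ">next'

print(repr(strip_tags(simple)))  # tags removed cleanly
print(repr(strip_tags(tricky)))  # match stops at the > inside alt, leaving debris
```

With a pattern like this, the match ends at the first > it sees, so the
one inside the alt value cuts the tag short and the tail of the attribute
leaks into the output. A real parser (HTML::Parser in Perl, as suggested
above) tracks quoting and shouldn't have that problem.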

Just curious.

--
 Lindsay Evans.
 Developer,
 Red Square Productions.

 [p] 8596.4000
 [f] 8596.4001
 [w] www.redsquare.com.au
