[thelist] Regex
Andrew Moore
amoore at mooresystems.com
Thu Feb 28 16:32:00 CST 2002
For starters, it won't match this by default:
<img
src="arrow.gif">
(beware the intentional line wrapping.)
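
A quick sketch of why, in Python rather than Perl. The regexp from earlier in the thread isn't quoted in this message, so the pattern below is a hypothetical stand-in along the lines of `<img.*?>`. By default `.` does not match a newline, so the wrapped tag is missed unless the DOTALL flag (the equivalent of Perl's /s modifier) is set:

```python
import re

# Hypothetical pattern; the regexp discussed in the thread is not shown here.
IMG_TAG = r"<img.*?>"

# The wrapped tag from the example above, with a newline after "<img".
wrapped = '<img\nsrc="arrow.gif">'

# Default behaviour: "." stops at the newline, so there is no match.
default_match = re.search(IMG_TAG, wrapped)

# With DOTALL, "." also matches newlines and the whole tag is found.
dotall_match = re.search(IMG_TAG, wrapped, re.DOTALL)

print(default_match)
print(repr(dotall_match.group(0)))
```

Whether the original regexp has the same blind spot depends on how it was written; a pattern built on a negated character class such as `[^>]*` would cross the newline without any flag.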
-Andy
On Fri, Mar 01, 2002 at 09:21:07AM +1100, Lindsay Evans wrote:
>
> > It's worth noting that the Perl Cookbook (Recipe 20.6) cites the regexp
> > below as invalid for all but the simplest HTML. If you're using Perl,
> > try using a package like HTML::Parser. Otherwise, you're going to have a
> > very hard time constructing a regexp that does this.
>
> Interesting.
>
> Does it happen to mention any specific cases where it doesn't work?
> I've used this quite a bit on some rather complex html (including xhtml
> tags, etc.), and it worked fine.
>
> Though I'd imagine that if you had invalid html to start with (i.e. <img
> src="arrow.gif" ... alt=" -> ">), it would break.
>
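
The alt=" -> " case above can be sketched the same way. This is an illustration under assumptions, not the regexp from the thread: the naive pattern is hypothetical, the elided attributes are left out, and Python's standard-library html.parser stands in for Perl's HTML::Parser. A pattern that stops at the first ">" cuts the tag off inside the attribute value, while a parser that understands quoting reads the whole tag:

```python
import re
from html.parser import HTMLParser

html = '<img src="arrow.gif" alt=" -> ">'

# Hypothetical naive pattern: it ends at the first ">", which here sits
# inside the quoted alt value, so the match is truncated.
naive = re.search(r"<img[^>]*>", html).group(0)


class ImgCollector(HTMLParser):
    """Collects the attributes of every <img> start tag it sees."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.images.append(dict(attrs))


collector = ImgCollector()
collector.feed(html)
collector.close()

print(repr(naive))
print(collector.images)
```

The parser tracks quote state, which is the part that is awkward to express in a single regexp.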