[thelist] regex help

Chris Spruck cspruck at mindspring.com
Sun Mar 24 11:48:01 CST 2002


Bruce,

At 12:23 PM 3/24/02 -0500, D.Bruce Saurer wrote:
>I read the regex article by Chris Spruck (thank you) and decided to dig in.
>I didn't get far, but thanks to the article I did manage to remove the font
>tags and attributes from an old page using <f+.[^>]+>

Glad you read it and found it helpful!

>I tried for a while to extend it to grab the closing tags as well, but
>settled for a quick two-step macro using TextPad.  Of course, today that's
>not good enough.  I've reread the article and it seems it should be a simple
>thing, but I'm missing it.  Help would be most appreciated.

I tried this expression in TextPad 4.5 and it did the trick.

<FONT[^>]*>[^<]*</FONT>

This is a slightly modified version of the "simpler one" from the "Remove
FONT tags from your web pages" example in my article. It reuses the principle
that finds the > closing the opening <FONT> tag: a negated character class
skips over everything between the tags and stops at the < that starts the
closing </FONT> tag.

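[^<]* matches any run of characters other than <, and the expression then
matches the literal </FONT>. Because [^<]* stops at the first <, the match
fails if another tag sits between the opening and closing FONT tags.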
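
If it helps to see the expression at work outside TextPad, here is a minimal
Python sketch. The sample strings and variable names are made up for
illustration, and the second pattern (with a capture group to keep the
enclosed text) is an assumption of mine, not something from the article.

```python
import re

# The expression from this message: an opening FONT tag with any attributes,
# then text containing no "<", then the literal closing tag.
FONT_BLOCK = re.compile(r"<FONT[^>]*>[^<]*</FONT>")

# Hypothetical variant: the same expression with a capture group around the
# enclosed text, so a replacement can keep the text and drop only the tags.
FONT_KEEP_TEXT = re.compile(r"<FONT[^>]*>([^<]*)</FONT>")

# Made-up sample markup.
sample = '<P><FONT FACE="Arial" SIZE="2">Hello</FONT> world</P>'
nested = '<FONT SIZE="2">a <B>b</B> c</FONT>'

# Replacing the whole match with nothing removes the tags AND the text.
removed = FONT_BLOCK.sub("", sample)

# Replacing the match with group 1 removes only the tags.
kept = FONT_KEEP_TEXT.sub(r"\1", sample)

# [^<]* cannot cross the "<" of the inner <B> tag, so nothing matches here.
nested_match = FONT_BLOCK.search(nested)

print(removed)
print(kept)
print(nested_match)
```

Python's re module is case-sensitive by default, so lowercase <font> tags
would only match if re.IGNORECASE were passed to re.compile.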

Let me know if you need any more help or explanation.

HTH!

Chris



