[thelist] regex help
Chris Spruck
cspruck at mindspring.com
Sun Mar 24 11:48:01 CST 2002
Bruce,
At 12:23 PM 3/24/02 -0500, D.Bruce Saurer wrote:
>I read the regex article by Chris Spruck (thank you) and decided to dig in.
>I didn't get far but, thanks to the article I did manage to remove the font
>tags and attributes from an old page using <f+.[^>]+>
Glad you read it and found it helpful!
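As a rough illustration of why that first expression catches the opening tags but not the closing ones, here is a minimal sketch in Python. Python's re module stands in for TextPad's regex engine here, which is an assumption, although this expression uses only syntax the two share. The helper name matches() is made up for the example.

```python
import re

# The expression quoted above, used as-is (case-sensitive, as written).
ORIGINAL = re.compile(r"<f+.[^>]+>")

def matches(tag):
    """Return True if the expression matches the whole tag."""
    return ORIGINAL.fullmatch(tag) is not None

opening = matches('<font size="2">')     # "<", then "f", then more characters
closing = matches('</font>')             # "<" is followed by "/", not "f"
other_tag = matches('<form action="x">') # any tag starting with "f" qualifies

print(opening, closing, other_tag)
```

The closing tag fails because f+ requires an "f" immediately after the "<". The same looseness means the expression is not specific to FONT: it also matches other tags that begin with "f".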
>I tried for awhile to append it to grab the closing tags as well but
>settled for a quick two-step macro using TextPad. Of course, today that's
>not good enough.
>I've reread the article and it seems it should be a simple thing but, I'm
>missing it. Help would be most appreciated.
I tried this expression in TextPad 4.5 and it did the trick.
<FONT[^>]*>[^<]*</FONT>
This is a slightly modified version of the "simpler one" from the "Remove
FONT tags from your web pages" example in my article. It reuses the same
principle: just as [^>]* skips ahead to the > that closes the opening <FONT>
tag, [^<]* skips over everything between the tags, up to the < that starts
the closing </FONT> tag. In other words, [^<]* matches any run of characters
other than <, and the expression then ends with a literal match of </FONT>.
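To see the pieces of the expression at work outside TextPad, here is a minimal sketch in Python. Using Python's re module in place of TextPad's engine is an assumption, as are the IGNORECASE flag, the sample strings, and the second pattern: it adds a capture group around the inner text, which is not part of the expression above, so that a replacement can keep the text while dropping the tags.

```python
import re

# The expression from the message: opening tag, text with no "<", closing tag.
FONT_ELEMENT = re.compile(r"<FONT[^>]*>[^<]*</FONT>", re.IGNORECASE)

# Hypothetical variant: same expression with a capture group around the
# inner text, so a replacement can keep the text and drop only the tags.
FONT_KEEP_TEXT = re.compile(r"<FONT[^>]*>([^<]*)</FONT>", re.IGNORECASE)

sample = '<p><FONT face="Arial" size="2">Hello</FONT> world</p>'

# The whole element, tags and text together, is what gets matched.
matched = FONT_ELEMENT.search(sample).group(0)

# Replacing each match with its captured text strips just the tags.
stripped = FONT_KEEP_TEXT.sub(r"\1", sample)

# Because [^<]* stops at any "<", a FONT element that contains another
# tag is not matched by this expression.
nested = FONT_ELEMENT.search('<FONT size="2"><b>Hi</b></FONT>')

print(matched)
print(stripped)
print(nested)
```

Note that the match covers the text between the tags as well, so replacing it with nothing would remove that text too. The last check shows the limit of the [^<]* approach: a FONT element wrapped around other markup is skipped rather than matched.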
Let me know if you need any more help or explanation.
HTH!
Chris