[thelist] html stripping

Mehmet Buyukozer mbuyukozer at gmx.co.uk
Thu Apr 29 13:09:35 CDT 2004


Thank you Stephen, Chris, and Joshua.

Stephen: It fails when it encounters '< a' (a space after the bracket). I must
catch ALL links in a given web site.
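
To illustrate the '< a' case, here is a minimal Python sketch of a pattern that
tolerates whitespace after the opening bracket and around the equals sign. It
is not the code from this thread; the names LINK_RE and extract_links are
made up for the example, and a regex like this can still be fooled by links
inside comments or scripts.

```python
import re

# Hypothetical pattern (an assumption for illustration, not from the thread).
# Allows whitespace after "<" and around "=", and accepts double-quoted,
# single-quoted, or unquoted href values. \b after "a" rejects <abbr>, <area>.
LINK_RE = re.compile(
    r"""<\s*a\b[^>]*?\bhref\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))""",
    re.IGNORECASE,
)


def extract_links(html):
    """Return the href value of every anchor tag the pattern finds."""
    links = []
    for match in LINK_RE.finditer(html):
        # Exactly one of the three alternatives captured the value.
        links.append(next(g for g in match.groups() if g is not None))
    return links


sample = (
    '<a href="http://a.example/">one</a> '
    "< a  HREF = 'b.html'>two</a> "
    "<A class=k href=c.html>three</A> "
    '<abbr href="ignored">not a link</abbr>'
)
print(extract_links(sample))
```

For the sample string this prints the three anchor targets and skips the
abbr tag, because the word boundary after "a" keeps other tag names out.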

Chris: I tried this with some pages and, as you said, the result was what I
wanted. But then I tried http://www.extratitles.to, for example, and it
stripped <a> but also the JavaScript portion of the page. So I don't trust
this one, and its official web site says only that it tries to do its best.

Joshua, this site is very good. I am looking at the examples, thank you.

----- Original Message ----- 
From: "Joshua Olson" <joshua at waetech.com>
To: <thelist at lists.evolt.org>
Sent: Thursday, April 29, 2004 8:32 PM
Subject: RE: [thelist] html stripping


> > -----Original Message-----
> > From: Mehmet Buyukozer
> > Sent: Thursday, April 29, 2004 8:38 AM
> >
> > Hi all
> >
> > I am not used to regular expressions, so I think I have some mistakes in
> > the pattern. Other than correcting my code, I would really appreciate it
> > if you have one that leads me to the final result.
>
> Mehmet,
>
> The following link has some very good regexes for matching HTML tags.
> Though they are written for CF's regex, you should be able to easily
> convert the POSIX to what you need:
>
> http://concepts.waetech.com/unclosed_tags/
>
> Best of luck,
>
> <><><><><><><><><><>
> Joshua Olson
> Web Application Engineer
> WAE Tech Inc.
> http://www.waetech.com/service_areas/
> 706.210.0168
>
>
> -- 
> * * Please support the community that supports you.  * *
> http://evolt.org/help_support_evolt/
>
> For unsubscribe and other options, including the Tip Harvester
> and archives of thelist go to: http://lists.evolt.org
> Workers of the Web, evolt !
>


