Hi

Thanks for all the responses, guys... really helpful stuff. I'll definitely be reading more about these tricky (yet amazingly useful) little blighters :)

It's nice to have some support where people know what you are talking about... a relatively unknown concept for me :)

Cheers

Dan

-----Original Message-----
From: thelist-bounces at lists.evolt.org [mailto:thelist-bounces at lists.evolt.org] On Behalf Of Joshua Olson
Sent: 09 February 2006 13:43
To: thelist at lists.evolt.org
Subject: Re: [thelist] Regexs and headaches

> -----Original Message-----
> From: Dan Parry
> Sent: Thursday, February 09, 2006 7:29 AM
>
> I've successfully got it to locate all opening tags and
> ignore self-closers (eg <br/>). It even picks up tags with
> attributes.
>
> But (and this is a big but) it can't find single-letter tags
> (eg <b>). It can find single-letter tags with attributes though
> (eg <a href="http://example.org">)
>
> Here is the regex:
>
> /\<[^\/]([^<>]*)[^\/]>/g

Hi Dan,

The trick, I've found, to making a robust regex for matching HTML is to go back to the RFC and build it in totality. Using ColdFusion's version of regex (which uses slightly different tokens than most regexes), I did this a while back. The resulting regex can be used to find all sorts of variations on tags.

Take a look here if you are interested:

http://concepts.waetech.com/unclosed_tags/

<><><><><><><><><><>
Joshua L. Olson
WAE Tech Inc.
http://www.waetech.com/
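
For anyone reading the thread later, here is a rough sketch of why the regex quoted above skips single-letter tags such as <b>. It has two separate `[^\/]` classes between `<` and `>`, and each must consume one character, so a tag needs at least two characters inside the brackets. The sketch assumes JavaScript (the `/.../g` literal suggests it, but the thread never says so). The `revised` pattern, the `findOpeningTags` helper and the sample string are illustrative only, not something posted by Dan or Joshua, and the pattern is far short of a complete HTML matcher (for example, a `>` inside an attribute value would confuse it).

```javascript
// The regex quoted in the thread: "<", one non-slash character, any run of
// non-bracket characters, one more non-slash character, then ">".
// The two single-character classes force at least two characters inside.
const original = /\<[^\/]([^<>]*)[^\/]>/g;

// Hypothetical alternative (an assumption, not from the thread):
// "<" followed by a letter, then optionally more characters as long as the
// last one before ">" is not "/". The optional group lets "<b>" match,
// while "</b>" and "<br/>" are still rejected.
const revised = /<[A-Za-z](?:[^<>]*[^<>\/])?>/g;

// Hypothetical helper: return every match of pattern in html as an array.
function findOpeningTags(pattern, html) {
  return html.match(pattern) || [];
}

// Made-up sample covering the cases Dan describes.
const sample =
  '<p>Some <b>bold</b> text<br/> and a <a href="http://example.org">link</a></p>';

// The original pattern finds only the <a ...> tag here.
console.log(findOpeningTags(original, sample));

// The revised pattern finds <p>, <b> and the <a ...> tag.
console.log(findOpeningTags(revised, sample));
```

The optional group is the only real change: making everything after the first letter optional removes the two-character minimum while still refusing a trailing slash.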