[Javascript] regexp - how to exclude a substring?

Mike Dougherty mdougherty at pbp.com
Mon May 23 14:48:56 CDT 2005


Have you considered using DOM methods?  If this were client-side work, it'd be straightforward:
var loContainedDivs = document.getElementById("target").getElementsByTagName("div");
if (loContainedDivs.length == 0) { /* no nested divs: do something */ }
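
Wrapped up as a function, the same check might look like the sketch below. The name hasNoNestedDivs is made up for illustration, and it assumes you already hold a DOM element (or anything else that exposes getElementsByTagName):

```javascript
// Hypothetical helper: returns true when `container` holds no nested <div>.
// `container` is assumed to be a DOM element, or the document itself.
function hasNoNestedDivs(container) {
  // getElementsByTagName returns every descendant with that tag name,
  // at any depth, so an empty result means there is no nesting at all.
  var loContainedDivs = container.getElementsByTagName("div");
  return loContainedDivs.length === 0;
}
```

In a browser you would call it as hasNoNestedDivs(document.getElementById("target")).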

Can you instantiate a rendering engine and use its methods to interrogate these html snippets?

On Sat, 21 May 2005 14:51:47 -0700
  Paul Novitski <paul at novitskisoftware.com> wrote:
>Shawn et al.,
>
>I'm parsing some HTML using regular expressions but I'm stumped on one point:
>
>I want to find a string that begins with "<div" and ends with "</div" that does not enclose a 
>nested "</div".
>
>I'm starting by locating a start & end tag pair:
>
>	/<div.*>.*<\/div/si
>
>[si = include newlines + case-insensitive]
>
>I'm actually locating a specific tag using a regexp like this:
>
>	/<div [^>]*id="target".*>.*<\/div/si
>
>That finds my starting & closing tags, but if I've got multiple divs it finds everything up to & 
>including the final </div on the page.
>
>Therefore as my next step I need to know how to exclude "</div" from the innerHTML of the div.
>I've tried (.*(<\/div){0}) but it doesn't seem to work.
>
>1) How do I say "allow any number of any characters but don't allow this substring"?
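
One common way to say "any characters, but never this substring" is to put a negative lookahead in front of every single character consumed. The sketch below shows that idea. The variable and function names are made up for illustration, and it assumes the id attribute is double-quoted. It uses [\s\S] instead of . so that newlines are included without relying on an s flag, which older JavaScript engines do not have.

```javascript
// Match the target div up to the first "</div" that follows it.
// (?:(?!<\/div)[\s\S])* consumes one character at a time, and the
// lookahead refuses to step onto the start of "</div".
var targetRe = /<div [^>]*id="target"[^>]*>((?:(?!<\/div)[\s\S])*)<\/div/i;

// Variant: also refuse "<div", so only a div with no nested div matches.
var leafRe = /<div\b[^>]*>((?:(?!<\/?div\b)[\s\S])*)<\/div\s*>/i;

// Hypothetical helper: returns the first innermost div, or null.
function findLeafDiv(html) {
  var m = leafRe.exec(html);
  return m ? m[0] : null;
}
```

Note that the original pattern also over-matches in the opening tag: .*> is greedy and runs to the last > it can reach, which is why [^>]*> is used here instead.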
>
>2) The direction I'm headed is to be able to include all nested divs in my target div.  In other 
>words, the range of selected text should include an even number of start & end tags of the same 
>tagName as my target tag:
>
>	<div id="target">
>		<div>blah he blah</div>
>		<div>blah he blah
>			<div>blah he blah</div>
>		</div>
>	</div>
>
>I figure that once I solve problem 1) I'll be able to assemble a regular expression that allows 
>nested tags (<div...>...</div) at least to some reasonable level of nesting.  Any suggestions?
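
For arbitrary nesting, a single JavaScript regexp cannot count matching pairs, since the language's regexps have no recursion. An alternative is to use a regexp only to find each div tag and keep the depth count in ordinary code. This is a rough sketch, not a real HTML parser: the function name extractBalancedDiv is made up, and it assumes the id is double-quoted, contains no regexp metacharacters, and that no div-like text appears inside comments, scripts, or attribute values.

```javascript
// Hypothetical helper: returns the target div with all of its nested divs,
// or null if the id is not found or the tags are unbalanced.
function extractBalancedDiv(html, id) {
  // Locate the opening tag of the target div.
  var open = new RegExp('<div\\b[^>]*\\bid="' + id + '"[^>]*>', "i");
  var start = html.search(open);
  if (start === -1) return null;

  // Walk every <div ...> and </div> from that point on, tracking depth.
  var tag = /<(\/?)div\b[^>]*>/gi;
  tag.lastIndex = start;
  var depth = 0;
  var m;
  while ((m = tag.exec(html)) !== null) {
    depth += m[1] ? -1 : 1;        // m[1] is "/" for a closing tag
    if (depth === 0) {
      // The closing tag that balances the target's opening tag.
      return html.slice(start, tag.lastIndex);
    }
  }
  return null;                      // ran out of tags before balancing
}
```

With markup shaped like the example above, this returns the whole target div, inner divs included, and stops before any sibling div that follows it.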
>
>Thanks,
>Paul



