[Javascript] regexp - how to exclude a substring?
Mike Dougherty
mdougherty at pbp.com
Mon May 23 14:48:56 CDT 2005
Have you considered using DOM methods? If this were client-side work, it'd be straightforward:
var loContainedDivs = document.getElementById("target").getElementsByTagName("div");
if (loContainedDivs.length == 0) { /* no nested divs: do something */ }
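A slightly fuller sketch of that check, in case it helps. The helper names hasNoNestedDivs and findLeafDivs are mine, purely for illustration, and the code assumes you already have an element object that supports getElementsByTagName:

```javascript
// Hypothetical helper: true when `element` contains no nested <div>.
function hasNoNestedDivs(element) {
  return element.getElementsByTagName("div").length === 0;
}

// Hypothetical helper: collects the innermost divs under `root`,
// i.e. the descendant divs that contain no further div.
function findLeafDivs(root) {
  var all = root.getElementsByTagName("div");
  var leaves = [];
  for (var i = 0; i < all.length; i++) {
    if (hasNoNestedDivs(all[i])) {
      leaves.push(all[i]);
    }
  }
  return leaves;
}
```

Called on your target div, findLeafDivs should hand back only the divs that enclose no other div, which is the "no nested </div" case you describe.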
Can you instantiate a rendering engine and use its methods to interrogate these HTML snippets?
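If you have to stay with regular expressions, one common idiom for your question 1 is a negative lookahead that is checked before each character is consumed. This is only a sketch: the variable names and the sample string are made up, the id="target" pattern is taken from your example, and I use [\s\S] rather than the s flag because not every JavaScript engine supports that flag:

```javascript
// (?:(?!<\/div)[\s\S])* consumes any character, newlines included,
// as long as "</div" does not start at the current position.
var reTargetDiv = /<div\b[^>]*\bid="target"[^>]*>((?:(?!<\/div)[\s\S])*)<\/div/i;

// Hypothetical sample input, not taken from the original page.
var lsHtml = '<div id="target">one</div><div>two</div>';
var laMatch = lsHtml.match(reTargetDiv);
// laMatch[0] holds the text from "<div" through the first "</div";
// laMatch[1] holds the content between the two tags.
```

This stops at the first "</div" after the opening tag, so it only addresses question 1. It does not balance nested divs (question 2); JavaScript regular expressions have no recursion, so they can only handle nesting to a fixed depth that you spell out by hand.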
On Sat, 21 May 2005 14:51:47 -0700
Paul Novitski <paul at novitskisoftware.com> wrote:
>Shawn et al.,
>
>I'm parsing some HTML using regular expressions but I'm stumped on one point:
>
>I want to find a string that begins with "<div" and ends with "</div" that does not enclose a
>nested "</div".
>
>I'm starting by locating a start & end tag pair:
>
> /<div.*>.*<\/div/si
>
>[si = include newlines + case-insensitive]
>
>I'm actually locating a specific tag using a regexp like this:
>
> /<div [^>]*id="target".*>.*<\/div/si
>
>That finds my starting & closing tags, but if I've got multiple divs it finds everything up to &
>including the final </div on the page.
>
>Therefore as my next step I need to know how to exclude "</div" from the innerHTML of the div.
> I've tried (.*(<\/div){0}) but it doesn't seem to work.
>
>1) How do I say "allow any number of any characters but don't allow this substring"?
>
>2) The direction I'm headed is to be able to include all nested divs in my target div. In other
>words, the range of selected text should include an even number of start & end tags of the same
>tagName as my target tag:
>
> <div id="target">
>   <div>blah he blah</div>
>   <div>blah he blah
>     <div>blah he blah</div>
>   </div>
> </div>
>
>I figure that once I solve problem 1) I'll be able to assemble a regular expression that allows
>nested tags (<div...>...</div) at least to some reasonable level of nesting. Any suggestions?
>
>Thanks,
>Paul