[Javascript] regexp - how to exclude a substring
Paul Novitski
paul at novitskisoftware.com
Mon May 30 11:26:28 CDT 2005
At 04:02 AM 5/30/2005, Alberto Domingo wrote:
>//
>Therefore as my next step I need to know how to exclude "</div" from the
>innerHTML of the div. I've tried (.*(<\/div){0}) but it doesn't seem to work.
>
>1) How do I say "allow any number of any characters but don't allow this
>substring"?
>//
>
>Maybe a stupid idea, but I would first try changing the substring to
>exclude into a single special character (for example |). Then you can do the
>regexp search excluding that special character.
Thanks, Alberto, that's a great suggestion. I've often used transitional
global replacements to solve parsing & expanding problems in the past, but
I completely overlooked that doorway this time.
You could pre-process the HTML by replacing all close-tags with unique
tokens, so you wouldn't have to do it repeatedly:
</a> => chr(01)
</b> => chr(02)
</div> => chr(03)
</h1> => chr(04)
...
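
A minimal sketch of that pre-processing pass in JavaScript. The tag list and the helper names (TAGS, tokenFor, encodeCloseTags, decodeCloseTags) are made up for illustration, and it assumes the HTML contains none of those control characters to begin with:

```javascript
// Hypothetical tag list; the position in the array decides the token:
// a -> \x01, b -> \x02, div -> \x03, h1 -> \x04.
var TAGS = ["a", "b", "div", "h1"];

// Return the single-character token for a tag name, or null if the tag is not listed.
function tokenFor(tag) {
  var i = TAGS.indexOf(tag.toLowerCase());
  return i === -1 ? null : String.fromCharCode(i + 1);
}

// Replace every listed close-tag with its token in one global pass.
// Close-tags that are not in TAGS are left untouched.
function encodeCloseTags(html) {
  return html.replace(/<\/([a-z][a-z0-9]*)\s*>/gi, function (whole, name) {
    var token = tokenFor(name);
    return token === null ? whole : token;
  });
}

// Turn the tokens back into (lowercase) close-tags once the parsing is done.
function decodeCloseTags(text) {
  return text.replace(/[\x01-\x1f]/g, function (ch) {
    var tag = TAGS[ch.charCodeAt(0) - 1];
    return tag ? "</" + tag + ">" : ch;
  });
}
```

For example, encodeCloseTags("<div><b>x</b></div>") leaves the open-tags alone and yields "<div><b>x" followed by chr(02) and chr(03); decodeCloseTags reverses it.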
Then you could ensure that a particular tag didn't enclose nested tags of
the same type with a regular expression such as:
regex: "/<" + HTMLtag(N) + "[^" + HTMLtoken(N) + "]*" + HTMLtoken(N) + "/is"
i.e.: /<div[^\x03]*\x03/is
where \x03 == chr(03) == </div>
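
As a rough illustration, the same expression can be built in JavaScript with the RegExp constructor. This sketch assumes the close-tags have already been replaced and that chr(03) stands for </div>; buildTagRegex and DIV_CLOSE are hypothetical names. Only the "i" flag is used, since a negated character class already matches newlines:

```javascript
// Assumption: "\x03" is the token that replaced every </div>.
var DIV_CLOSE = "\x03";

// Build the equivalent of /<div[^\x03]*\x03/i from a tag name and its token.
function buildTagRegex(tag, token) {
  return new RegExp("<" + tag + "[^" + token + "]*" + token, "i");
}

// Sample input with the close-div already tokenized.
var encoded = "<h2>Title<div class=\"note\">line one\nline two\x03 trailing text";
var match = encoded.match(buildTagRegex("div", DIV_CLOSE));
// match[0] runs from "<div" up to and including the first close-div token.
```

Note that the match stops at the first close token but does not by itself reject a second "<div" inside the matched text, so a nesting check would still need to look for that.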
Cheers,
Paul