[Javascript] regexp - how to exclude a substring

Paul Novitski paul at novitskisoftware.com
Mon May 30 11:26:28 CDT 2005


At 04:02 AM 5/30/2005, Alberto Domingo wrote:
>//
>
>Therefore as my next step I need to know how to exclude "</div" from
>the innerHTML of the div.  I've tried (.*(<\/div){0}) but it doesn't
>seem to work.
>
>1) How do I say "allow any number of any characters but don't allow
>this substring"?
>
>//
>
>Maybe it's a stupid idea, but I would try first changing the substring to
>exclude into a single special character (for example |). Then you can do the
>regexp search excluding that special character.


Thanks, Alberto, that's a great suggestion.  I've often used transitional 
global replacements to solve parsing & expanding problems in the past, but 
I completely overlooked that doorway this time.

You could pre-process the HTML once, replacing every close-tag with a unique 
token, so you wouldn't have to repeat the substitution for each search:

         </a>   => chr(01)
         </b>   => chr(02)
         </div> => chr(03)
         </h1>  => chr(04)
         ...
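
As a rough JavaScript sketch of that pre-processing pass (the names CLOSE_TAG_TOKENS, tokenizeCloseTags and restoreCloseTags are made up for illustration, and it assumes the control characters \x01 to \x04 never occur in the HTML being parsed):

```javascript
// Hypothetical table: closing-tag name -> single control character.
const CLOSE_TAG_TOKENS = new Map([
  ["a", "\x01"],
  ["b", "\x02"],
  ["div", "\x03"],
  ["h1", "\x04"],
]);

// Replace every known closing tag with its one-character token.
function tokenizeCloseTags(html) {
  return html.replace(/<\/([a-z][a-z0-9]*)\s*>/gi, (match, name) => {
    const token = CLOSE_TAG_TOKENS.get(name.toLowerCase());
    return token !== undefined ? token : match; // leave unknown tags alone
  });
}

// Reverse the substitution once the parsing is done.
function restoreCloseTags(text) {
  let out = text;
  for (const [name, token] of CLOSE_TAG_TOKENS) {
    out = out.split(token).join("</" + name + ">");
  }
  return out;
}
```

Note that restoring writes the tag names in lower case, so </DIV> comes back as </div>.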

Then you could ensure that a particular tag didn't enclose nested tags of 
the same type with a regular expression such as:

regex:  "/<" + HTMLtag(N) + "[^" + HTMLtoken(N) + "]*" + HTMLtoken(N) + "/is"

i.e.:   /<div[^•]*•/is

where   • == </div>
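
In JavaScript this might look like the sketch below. It assumes the HTML has already been tokenized so that "\x03" stands for </div>; buildTagPattern and DIV_TOKEN are illustrative names only. A negated character class already matches line breaks, so the sketch uses just the "i" flag.

```javascript
// Assumption: "\x03" stands in for </div>, as in the chr(03) row of the table above.
const DIV_TOKEN = "\x03";

// Build the pattern <tag[^token]*token. Assumes the token is a single
// character with no special meaning inside a regex character class.
function buildTagPattern(tagName, token) {
  return new RegExp("<" + tagName + "[^" + token + "]*" + token, "i");
}

const divPattern = buildTagPattern("div", DIV_TOKEN);

// Simple case: the match runs from <div to the first closing token.
const flat = "<p>intro<div class=\"x\">hello\nworld" + DIV_TOKEN + "<p>after";
const flatMatch = divPattern.exec(flat)[0];

// Nested case: the match still stops at the first closing token, but it
// starts at the outer <div, so the inner opening tag is included.
const nested = "<div>outer <div>inner" + DIV_TOKEN + " tail" + DIV_TOKEN;
const nestedMatch = divPattern.exec(nested)[0];
```

So the pattern guarantees that no closing token falls inside the match, but not that the match is free of a nested opening tag; if that matters, the opening tags would probably need the same token treatment.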

Cheers,
Paul 




