[thelist] Detecting HTML tags with Regular Expressions.
Sam-I-Am
sam at sam-i-am.com
Tue May 8 09:28:12 CDT 2001
> seems to me this'll match: '<(\w+)>' and '<\/(\w+)>'
> but it won't ensure they are valid html tags which you seem to require. The
> only way for that is to actually compare against every possible html tag.
This will miss tags with attributes, like <img src="some_image.gif">.
Although it's not as efficient, I usually use:
/<(\/*[^>]+)>/
(< followed by anything except >, followed by >.)
though if you take a look at
http://msdn.microsoft.com/scripting/jscript/doc/jsgrpregexpsyntax.htm
they suggest
/<(.*)>.*<\/\1>/ to match an opening and closing HTML tag
I tend to steer clear of .* as it doesn't (normally) match on newlines,
and occasionally you see
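As a rough sketch of how these three patterns differ (the sample string and variable names are made up for illustration, not taken from the thread), something like this could be tried in JavaScript:

```javascript
// Hypothetical sample input; not from the original thread.
var sample = 'Hi <b>there</b> <img src="some_image.gif">';

// The quoted suggestion: only bare opening tags such as <b>.
var bareTag = /<(\w+)>/g;
// The negated-class version: < then anything except >, then >.
var anyTag = /<(\/*[^>]+)>/g;
// The MSDN-style pattern: an opening tag and its matching closing tag.
var pairedTag = /<(.*)>.*<\/\1>/;

var bareMatches = sample.match(bareTag);   // misses </b> and the <img ...> tag
var anyMatches = sample.match(anyTag);     // picks up all three tags
var pairedMatch = pairedTag.exec(sample);  // pairedMatch[1] is the tag name

console.log(bareMatches);
console.log(anyMatches);
console.log(pairedMatch && pairedMatch[1]);
```

The paired pattern only finds an element that has both an opening and a closing tag, so a standalone tag like <img> is never matched by it.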
<img
src="this.gif"
name="that"
onmouseover="something()"
alt="alt">
which is perfectly valid. In Perl you can alter this behaviour with the
/s modifier so that .* would work; I don't know of an equivalent in JavaScript.
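One possible workaround in JavaScript is to avoid the dot altogether and use a character class that does match newlines, such as [\s\S] or the negated class [^>]. A minimal sketch, with a made-up multi-line tag and hypothetical variable names:

```javascript
// Hypothetical multi-line tag, similar to the example above.
var multiLine = '<img\nsrc="this.gif"\nalt="alt">';

// . does not match \n, so this cannot span the line breaks.
var dotPattern = /<.*>/;
// [\s\S] matches any character, including \n.
var anyCharPattern = /<[\s\S]*?>/;
// [^>] also matches \n, since \n is not a >.
var negatedPattern = /<[^>]+>/;

var dotResult = dotPattern.test(multiLine);
var anyCharResult = anyCharPattern.test(multiLine);
var negatedResult = negatedPattern.test(multiLine);

console.log(dotResult, anyCharResult, negatedResult);
```

The negated class has the side benefit of stopping at the first >, so it cannot run on into a later tag the way a greedy .* can.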
> > I'm trying to do a
> > form validation
> > in which it will match any user's input, which can include
> > html tags, and
> > compare it with another list of valid html tags they can use.
Needless to say, you should validate on the server side too.
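For the client-side part, the comparison against a list of allowed tags could look roughly like the sketch below. The allowedTags list and the findDisallowedTags function are hypothetical names chosen for illustration, and this is only a convenience check, not a sanitizer:

```javascript
// Hypothetical whitelist; the real list of allowed tags is up to you.
var allowedTags = ['b', 'i', 'a', 'img'];

// Return the names of any tags in the input that are not on the whitelist.
function findDisallowedTags(input) {
  // < then optional /, then the tag name, then anything except >, then >.
  var tagPattern = /<\/?\s*(\w+)[^>]*>/g;
  var disallowed = [];
  var match;
  while ((match = tagPattern.exec(input)) !== null) {
    var name = match[1].toLowerCase();
    if (allowedTags.indexOf(name) === -1) {
      disallowed.push(name);
    }
  }
  return disallowed;
}

var okResult = findDisallowedTags('<b>fine</b> <img\nsrc="this.gif">');
var badResult = findDisallowedTags('<B>ok</B> <blink>no</blink>');

console.log(okResult);
console.log(badResult);
```

An empty array means every tag found was on the list; anything else names the offending tags, once per occurrence.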
hth
Sam