[thelist] Detecting HTML tags with Regular Expressions.

Sam-I-Am sam at sam-i-am.com
Tue May 8 10:05:42 CDT 2001


missing end parens...

> I usually use:
>         /<(\/*[^>]+>/

should be

>         /<(\/*[^>]+)>/
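A quick way to check the corrected pattern, as a sketch only: the sample strings and the `firstTag` helper are made up for illustration and are not part of the original post.

```javascript
// Corrected pattern from above: "<", then a captured run of optional
// slashes plus one or more non-">" characters, then ">".
var tagPattern = /<(\/*[^>]+)>/;

// Hypothetical helper: returns the captured tag text, or null if no tag.
function firstTag(text) {
  var m = tagPattern.exec(text);
  return m ? m[1] : null;
}

console.log(firstTag('<b>bold</b>'));   // b
console.log(firstTag('</p> trailing')); // /p
console.log(firstTag('no tags here'));  // null
```

The capture group keeps the leading slash of a closing tag, so the caller can tell `b` from `/b`.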

such sloppiness deserves a tip. (and it's been a while)

<tip type="working with forms" author="sam-i-am">
When your cgi/javascript/cfm (etc.) isn't giving back the results you
expect, there are a few things you can do.
You can put in debugging routines to print out everything that was
received.
Or you can pick through your html to check that all the elements are named
correctly, that the javascript is sending what it should, etc.
All of them are a pain in the ass.

I wrote a super simple cgi called echoForm that does just that -- it
echoes back the form values it receives. By changing the action of the
form to point to it, I can see what all my form elements are named, and
what values are getting passed through, often quicker than I could by
scanning the html.
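
The echoForm source isn't included in this post, so the following is not it. It is only a rough sketch of the same idea in JavaScript, assuming the form body arrives as application/x-www-form-urlencoded text. The `echoFormBody` name and the sample input are hypothetical.

```javascript
// Not the echoForm CGI itself (its source is not shown here), just a
// sketch of the idea: list every field name and value a form submitted.
// Assumes the body is application/x-www-form-urlencoded.
function echoFormBody(body) {
  var lines = [];
  // URLSearchParams decodes "+" and %XX escapes and keeps repeated names.
  new URLSearchParams(body).forEach(function (value, name) {
    lines.push(name + ' = ' + value);
  });
  return lines.join('\n');
}

console.log(echoFormBody('user=sam&color=red&color=blue&note=hello+world%21'));
```

Repeated names (checkbox groups, multi-selects) come back once per value, which is usually the thing you are hunting for when a form misbehaves.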

Hooking this up with a bookmarklet that does the action switching makes
this (for me at least) a cool tool worth keeping. It's also handy for
looking at how other sites do it.

Here's the bookmarklet. It uses the DOM to switch the action, so NN4 is
out.
NN6 and IE5 (pc and mac) should be ok though.

Watch the wrap. You may have to install it manually by making a new
bookmark and pasting it in, as it's over 256 characters.
<javascript:void(d=document);void(df='http://sam-i-am.com/cgi-bin/echoform.cgi');void(c=(d.all)?d.all.tags('form'):d.getElementsByTagName('form'));with(c){for(i=0;i<length;i++)
item(i).setAttribute('action',df);}>

(as written, the loop re-points every form in the document, not only the first)
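
For anyone who finds the one-liner hard to read, here is an expanded sketch of the same logic. It is an illustration rather than a drop-in replacement: the `switchFormActions` name and the `doc` and `echoUrl` parameters are made up, and `forms[i]` is used where the bookmarklet uses `item(i)`.

```javascript
// Readable sketch of the bookmarklet's logic. `doc` is passed in (a
// hypothetical parameter) so the function can be tried outside a browser.
function switchFormActions(doc, echoUrl) {
  // Older IE exposes document.all.tags(); DOM browsers use getElementsByTagName().
  var forms = doc.all ? doc.all.tags('form') : doc.getElementsByTagName('form');
  // Walk the collection and point each form at the echo script.
  for (var i = 0; i < forms.length; i++) {
    forms[i].setAttribute('action', echoUrl);
  }
  return forms.length; // number of forms re-pointed
}

// In a browser this would be called as:
//   switchFormActions(document, 'http://sam-i-am.com/cgi-bin/echoform.cgi');
```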

The cgi is hosted on my site - feel free to use it. If you want
the source too, I can send it/post it.
(And if you have suggestions on how to improve it, let me know.)

</tip>


> 
> (< followed by anything except >, followed by >.)
> 
> though if you take a look at
> http://msdn.microsoft.com/scripting/jscript/doc/jsgrpregexpsyntax.htm
> 
> they suggest
>         /<(.*)>.*<\/\1>/ to match an opening and closing HTML tag
> 
> I tend to steer clear of .* as it doesn't (normally) match on newlines,
> and occasionally you see
> <img
>         src="this.gif"
>         name="that"
>         onmouseover="something()"
>         alt="alt">
> 
> which is perfectly valid. In perl you can alter this behaviour so .*
> would work; I don't know a way in javascript.
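
To make the newline point concrete, here is a small sketch using a shortened version of the multi-line `<img>` tag above. The `html` variable is just sample data. The last pattern, `[\s\S]`, is one commonly used way to say "any character, newlines included" in JavaScript. It is offered here as a possible workaround and is not from the original thread.

```javascript
// A tag split across lines, like the <img> example above.
var html = '<img\n        src="this.gif"\n        alt="alt">';

// "." stops at line breaks, so this finds nothing.
console.log(/<(.*)>/.test(html));      // false
// A negated class has no such limit: [^>] matches newlines too.
console.log(/<([^>]+)>/.test(html));   // true
// [\s\S] is a common stand-in for "any character, newlines included".
console.log(/<([\s\S]*)>/.test(html)); // true
```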
> 
> > > I'm trying to do a
> > > form validation
> > > in which it will match any user's input, which can include
> > > html tags, and
> > > compare it with another list of valid html tags they can use.
> 
> needless to say, you should validate on the server side too.
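
A rough sketch of the client-side half of that check, under stated assumptions: the allow-list contents and the `disallowedTags` helper are hypothetical, and pulling the tag name out of the match is a simplification that does not handle comments, malformed markup, or `>` inside attribute values.

```javascript
// Hypothetical allow-list; the original thread does not say which tags are valid.
var allowedTags = ['b', 'i', 'a', 'p'];

// Returns the names of tags in `input` that are not on the allow-list.
function disallowedTags(input, allowed) {
  var tagPattern = /<(\/*[^>]+)>/g; // same pattern as above, with the g flag
  var bad = [];
  var m;
  while ((m = tagPattern.exec(input)) !== null) {
    // Drop leading slashes, then keep the name up to the first space or slash.
    var name = m[1].replace(/^\/+/, '').split(/[\s\/]/)[0].toLowerCase();
    if (allowed.indexOf(name) === -1 && bad.indexOf(name) === -1) {
      bad.push(name);
    }
  }
  return bad;
}

// Logs an array containing only "script".
console.log(disallowedTags('<b>hi</b> <script src="x.js"></script>', allowedTags));
```

An empty result means every tag found was on the list. Anything sent from the browser can be forged, so the same check still has to run on the server.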
> 
> hth
> 
> Sam
> 



