[thelist] [ColdFusion] Regular Expression Questions

.jeff jeff at members.evolt.org
Tue Feb 25 13:19:01 CST 2003


beam,

><><><><><><><><><><><><><><><><><><><><><><><><><><><><><
> From: Beam
>
> > <a href="([^:"]+)"
> >
> > All absolute URLs have colons in them between the
> > protocol and the rest of the URL AFAIK (https?:, ftp:,
> > mailto:), and so a URL that is lacking them would be
> > relative.
>
> well, you could have a : in the path somewhere, [...]
><><><><><><><><><><><><><><><><><><><><><><><><><><><><><

if you do, it should be escaped (as %3A).  imo, the only truly valid
characters in paths and query strings are a-z, 0-9, _, -, ., /, %, ?, &, =,
and ;.  i know that rfc 1738 (http://www.w3.org/Addressing/rfc1738.txt)
allows more than that, but in our system, those are the only characters we
accept.  so, with local links, i can safely assume that they'll never
contain a colon.
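
here's a rough sketch of that colon test, in python rather than coldfusion just so it's easy to run.  the function names, the sample markup, and the case-insensitive flag are my own assumptions, not anything from our system:

```python
import re

# pattern quoted at the top of this message: an href value with no colon in it
RELATIVE_HREF = re.compile(r'<a href="([^:"]+)"')

# the character list from the paragraph above
# (assumption: letters are accepted in either case)
LOCAL_CHARS = re.compile(r'^[a-z0-9_\-./%?&=;]+$', re.IGNORECASE)


def relative_hrefs(html):
    """return every href value that contains no colon."""
    return RELATIVE_HREF.findall(html)


def is_valid_local_path(path):
    """true if the path uses only the characters listed above."""
    return LOCAL_CHARS.match(path) is not None


# hypothetical sample markup
sample = (
    '<a href="catalog_request/">a</a> '
    '<a href="http://evolt.org/">b</a> '
    '<a href="../foo/bar">c</a> '
    '<a href="mailto:jeff@example.com">d</a>'
)

print(relative_hrefs(sample))  # ['catalog_request/', '../foo/bar']
```

the http: and mailto: links drop out because the character class can't get past the colon to reach the closing quote.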

><><><><><><><><><><><><><><><><><><><><><><><><><><><><><
> so you could either say: ^(https?|ftp|mailto):\/\//
> where anything that does NOT match is a relative link, or,
> ^(\/?(\.\.?\/)*)
> which is sexier and matches things like
> /foo/bar
> ./foo/bar
> ../../../foo/bar
> but also
> /////wtf
><><><><><><><><><><><><><><><><><><><><><><><><><><><><><

but, unfortunately, it doesn't catch things like 'catalog_request/', which
is also a valid relative link.  you and i know that './catalog_request/' is
semantically the same thing, but i can't expect non-html-savvy clients to
write it that way.
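
to make that concrete, here's a small python sketch (again, python only so it's easy to run; the helper names and the example address are made up) that puts the two quoted patterns next to the plain no-colon check:

```python
import re

# the two patterns from the quoted block, minus the regex delimiters
SCHEME = re.compile(r'^(https?|ftp|mailto)://')
PREFIX = re.compile(r'^(/?(\.\.?/)*)')


def relative_prefix(href):
    """return the leading /, ./ or ../ run that PREFIX captures.

    every part of the pattern is optional, so the match always succeeds
    and the capture is simply empty when there is no such prefix.
    """
    return PREFIX.match(href).group(1)


def is_relative(href):
    """the no-colon rule: no colon means no protocol, so it's relative."""
    return ':' not in href


for href in ['/foo/bar', './foo/bar', '../../../foo/bar',
             'catalog_request/', 'http://evolt.org/']:
    print(repr(href), repr(relative_prefix(href)), is_relative(href))
```

'catalog_request/' and 'http://evolt.org/' both come back with an empty prefix, so the prefix pattern alone can't tell them apart, while the colon check can.  also worth noting: the scheme pattern as quoted requires '//', so it wouldn't recognise a mailto: link as absolute.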

thanks,

.jeff

http://evolt.org/
jeff at members.evolt.org
http://members.evolt.org/jeff/



