[thelist] regex to validate filenames

Sarah Adams mr.sanders at geekjock.ca
Fri Dec 1 10:03:45 CST 2006


I'm writing a regex to validate filenames. I've got it pretty much
worked out, but it's getting a little unwieldy, so I thought I'd ask for
suggestions for simplifying it.

Basically, I want to validate a filename, checking to make sure that it
is a gif or jpg, and that it has no illegal characters in it. The files
are stored on a Linux box, so I'm using that as my basis for filenaming
restrictions. My understanding is that these characters should be
avoided in filenames on Linux:
* : \ / < > | " ? NUL (and all non-printing characters, e.g. \n)
I also want to make sure the filename can't start with whitespace or "."
(which indicates a hidden file).

So here's what I'm thinking:
- start of string:
  ^
- one allowed character that is not whitespace or ".":
  [\w`~!@#$%^&()\-=+[\]{};',]
- 0 or more allowed characters:
  [\w`~!@#$%^&()\-=+[\]{};', \t.]*
- a dot:
  \.
- an allowed file extension:
  (GIF|JPG)
- end of string:
  $

So I have:
^[\w`~!@#$%^&()\-=+[\]{};',][\w`~!@#$%^&()\-=+[\]{};', \t.]*\.(GIF|JPG)$
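To make the pieces easier to check, here is a minimal sketch in Python that assembles the same pattern from its parts and tries it on a few names. The names FIRST, REST, EXTENSION and is_valid_filename are made up for illustration. Two assumptions differ slightly from the pattern as written: "[" is escaped inside the character classes (Python warns about an unescaped one), and re.fullmatch stands in for ^ and $, since Python's $ also matches just before a trailing newline.

```python
import re

# One allowed character for the first position: no whitespace, no ".".
# "[" is escaped because Python warns about an unescaped "[" inside a class.
FIRST = r"[\w`~!@#$%^&()\-=+\[\]{};',]"

# Zero or more allowed characters: same list plus space, tab and ".".
REST = r"[\w`~!@#$%^&()\-=+\[\]{};', \t.]*"

# A dot followed by an allowed (upper-case) extension.
EXTENSION = r"\.(GIF|JPG)"

# fullmatch() below anchors at both ends, so ^ and $ are left out here.
FILENAME_RE = re.compile(FIRST + REST + EXTENSION)


def is_valid_filename(name):
    """Return True if the whole name matches the assembled pattern."""
    return FILENAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["photo.GIF", "my photo.JPG", ".hidden.GIF", "a/b.GIF"]:
        print(repr(candidate), is_valid_filename(candidate))
```

With this sketch, "photo.GIF" and "my photo.JPG" are accepted, while ".hidden.GIF" and "a/b.GIF" are rejected. Lower-case extensions such as "photo.gif" are also rejected unless the pattern is compiled with re.IGNORECASE.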

I tried to figure out a shortcut for specifying "one or more allowed
characters with the first not being whitespace or '.'", so that I
wouldn't have to repeat the list of allowed characters, but I had no luck.
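One possible way to avoid the repetition, assuming the regex engine supports negative lookahead (Perl, PCRE, JavaScript and Python all do), is to assert "the first character is not whitespace or '.'" once and then use the full character list a single time with "+". The sketch below is in Python; ALLOWED and is_valid_filename_lookahead are hypothetical names, and as far as I can tell the pattern accepts the same names as the two-class version, though it is not the only way to do this.

```python
import re

# The full list of allowed characters, written only once.
# "[" is escaped because Python warns about an unescaped "[" inside a class.
ALLOWED = r"[\w`~!@#$%^&()\-=+\[\]{};', \t.]"

# (?![\s.]) looks at the first character without consuming it and fails
# if that character is whitespace or ".". The rest of the pattern is then
# "one or more allowed characters, a dot, an allowed extension".
LOOKAHEAD_RE = re.compile(r"(?![\s.])" + ALLOWED + r"+\.(GIF|JPG)")


def is_valid_filename_lookahead(name):
    """Return True if the whole name matches the lookahead pattern."""
    # fullmatch() anchors at both ends, playing the role of ^ and $.
    return LOOKAHEAD_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["photo.GIF", "a.b.JPG", ".hidden.GIF", "\tphoto.GIF"]:
        print(repr(candidate), is_valid_filename_lookahead(candidate))
```

Written as a single pattern with explicit anchors, this would be ^(?![\s.])[\w`~!@#$%^&()\-=+[\]{};', \t.]+\.(GIF|JPG)$ in engines that accept an unescaped "[" inside a class.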

Suggestions?

-- 
sarah adams
web developer & programmer
portfolio: http://sarah.designshift.com
blog: http://hardedge.ca


