[thelist] cross-platform file names

Jeffery To jeffery.to at gmx.net
Wed Apr 20 15:27:00 CDT 2005


Hassan Schroeder wrote:
> Jeffery To wrote:
> 
>>> <cite>: http://www.ietf.org/rfc/rfc2396.txt
>>> Uniform Resource Identifiers (URI): Generic Syntax
> 
>> URIs and file names are two different things.
> 
> I'd call that arguable but irrelevant if the files are going to be
> used in a Web context.

If I have a file "yucky spaces.txt" available at 
"http://www.foo.com/yucky%20spaces.txt", then the system file name 
"yucky spaces.txt" and the URL segment "yucky%20spaces.txt" are two 
syntactically different strings.
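
As a rough sketch of that distinction (Python and its standard
urllib.parse module are my choice for illustration, nothing the server
at foo.com is known to run), the file name and the URL segment convert
back and forth without either one having to change:

```python
from urllib.parse import quote, unquote

file_name = "yucky spaces.txt"      # the name as stored on disk
url_segment = quote(file_name)      # percent-encoded form for a URL path

print(url_segment)                  # yucky%20spaces.txt

# Decoding the URL segment recovers the original system file name.
print(unquote(url_segment) == file_name)
```

The file on disk keeps its space; only the string used in markup is
encoded.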

> And white space in filenames/URIs fails the transcribability test
> discussed in the cited document:
> 
> 1.5. URI Transcribability
> 
>    The URI syntax was designed with global transcribability as one of
>    its main concerns. ...

I would handwrite the above URL segment as:
y u c k y % 2 0 s p a c e s . t x t

> So I'd still consider it best practice to avoid spaces in filenames.

I agree that, in general, avoiding spaces is a good idea, and I can see 
how people might forget to use %20 instead of spaces in their markup. 
But I'm against blindly following general conventions or common sense 
without examining the circumstances surrounding a problem.

What happens when we extend the practice of avoiding problematic 
characters to file names with accented characters? (If I have two French 
files named "âge" and "âgé", do I rename them to "ge", or "age"?) What 
about Chinese file names? URL encoding (or escaping) preserves system 
file names while providing safe URLs. It also encodes spaces.
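
To make the comparison concrete, here is a small sketch, again in
Python. The strip_accents helper is hypothetical, just my stand-in for
an "avoid problem characters" renaming policy, and "中文.txt" is a
made-up Chinese file name. I'm also assuming UTF-8 URLs and precomposed
(NFC) names.

```python
import unicodedata
from urllib.parse import quote, unquote

def strip_accents(name):
    # Hypothetical renaming policy: decompose, then drop anything non-ASCII.
    decomposed = unicodedata.normalize("NFKD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")

# Normalize to NFC so each accented letter is a single code point.
names = [unicodedata.normalize("NFC", n) for n in ["âge", "âgé", "中文.txt"]]

for name in names:
    encoded = quote(name)  # UTF-8 bytes, percent-encoded
    print(name, "->", encoded, "| renamed:", repr(strip_accents(name)))
    # Encoding is reversible, so the system file name is preserved.
    assert unquote(encoded) == name
```

Under those assumptions the two French names both get renamed to "age"
and collide, and the Chinese name is reduced to ".txt", while the
encoded forms ("%C3%A2ge", "%C3%A2g%C3%A9", "%E4%B8%AD%E6%96%87.txt")
stay distinct and decode back to the originals.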


I can see I'm probably not going to win any arguments here, so I'll quit 
now. I guess I should also limit my file names to 8.3 characters, 
because of the various apps and OSs that long file names could break.

Jeff

BTW I'd be interested to know of any browsers that will show %20's 
instead of spaces to the user. Also I Googled:
   spaces filename OR "file name" url OR uri
but didn't find any relevant pages on problems caused by spaces. What 
keywords should I use instead?

