[thelist] cross-platform file names

Jeffery To jeffery.to at gmx.net
Wed Apr 20 11:53:35 CDT 2005


Hassan Schroeder wrote:
> Fred D Yocum wrote:
> 
>> Yes, but why is it bad practice?
> 
> 
> Your "young upstart Web master" should read the basic documents of
> his chosen trade :-)
> 
> <cite>: http://www.ietf.org/rfc/rfc2396.txt
> 
> Uniform Resource Identifiers (URI): Generic Syntax
> 
> 2.4.3. Excluded US-ASCII Characters
<snip>
> Aside from that, using any *non-printing* character in a file name
> is obviously ambiguous -- is it a space (as above)? a non-breaking
> space? a tab, which might look like a space in an HTML context?
> 
> Bad idea, all around. IMHO :-)

URIs and file names are two different things. If the file system 
supports spaces in file names, the file "this has spaces" should be 
accessible as <http://www.foo.com/this%20has%20spaces>. Note that this 
URL does not contain any excluded characters.
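
As a rough illustration of that mapping, here is a minimal Python sketch. 
The helper name file_url is made up for this example, and it assumes the 
server maps the URL path directly onto the file name:

```python
from urllib.parse import quote

def file_url(base, file_name):
    # Hypothetical helper: percent-encode the file name, then append it
    # to the base URL. safe="" means "/" in the name is escaped as well.
    return base.rstrip("/") + "/" + quote(file_name, safe="")

print(file_url("http://www.foo.com", "this has spaces"))
# http://www.foo.com/this%20has%20spaces
```

The file on disk keeps its spaces; only the reference to it is escaped.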

In fact, any character that may not appear as-is in a URI should be URL 
encoded (escaped), as explained in section 2.4, "Escape Sequences", of 
RFC 2396. Escaping allows reserved and excluded characters, such as "&" 
and "#", as well as characters from other languages, to be represented. 
And speaking of 2.4.3, let me include the last line from that section:

<quote>
Data corresponding to excluded characters must be escaped in order to be 
properly represented within a URI.
</quote>
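
To see why the escaping matters, here is a small Python sketch using the 
standard urllib.parse module. The file name "Q&A #1.html" is a made-up 
example, not one from this thread:

```python
from urllib.parse import quote, urlsplit

name = "Q&A #1.html"  # hypothetical file name with "&", a space and "#"

# Left unescaped, "#" is read as the start of a fragment identifier,
# so the path no longer names the whole file.
raw = urlsplit("http://www.foo.com/" + name)
print(repr(raw.path), repr(raw.fragment))

# Escaped, the whole name stays inside the path component.
escaped = urlsplit("http://www.foo.com/" + quote(name, safe=""))
print(repr(escaped.path), repr(escaped.fragment))
```

In the first case the parser splits the name at "#"; in the second the 
path is "/Q%26A%20%231.html" and the fragment is empty.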

Just because URIs are limited to printable US-ASCII does not mean file 
names should be, as long as the file system supports other characters. 
Otherwise all non-English speakers will have to rename all of their (web 
accessible) files. (I may be wrong, but I believe most modern OSs 
support Unicode file names.) Having a policy regarding file names is a 
different matter.
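
For a non-English file name, a sketch along these lines shows what the 
escaped form might look like. Note the assumption: RFC 2396 does not say 
which character encoding the escaped bytes use, so this example assumes 
UTF-8 (the default for Python's quote), and the server has to agree:

```python
from urllib.parse import quote

# Hypothetical file name; \u00e9 is "e" with an acute accent.
name = "r\u00e9sum\u00e9.html"

# Assumption: the name is encoded as UTF-8 before being percent-escaped.
print(quote(name, safe="", encoding="utf-8"))
# r%C3%A9sum%C3%A9.html
```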

If a file does have characters in its name that a URI cannot contain 
as-is (spaces, "#", non-ASCII letters and so on), then URIs referencing 
it should be URL encoded. This young upstart web master should be doing 
this, if he actually knows what he is doing.

Jeff

BTW, even though the markup may include "ugly" encoded URLs, most modern 
browsers will decode the URL before showing it to the user on the status 
bar.
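
The decoding step is just the reverse of the escaping. A minimal Python 
sketch, which only illustrates the idea and is not how any particular 
browser implements it:

```python
from urllib.parse import unquote

encoded = "http://www.foo.com/this%20has%20spaces"

# unquote() turns each %XX escape back into the character it stands for.
print(unquote(encoded))
# http://www.foo.com/this has spaces
```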

