[thelist] cross-platform file names

Luther, Ron ron.luther at hp.com
Thu Apr 21 08:44:04 CDT 2005

Jeffery To noted:

>>I can see I'm probably not going to win any arguments here, 

Hi Jeffery,

Maybe there is a reason for that?  ;-)

(I think there is a pretty good chance you could be wrong ... However, 
I think that's very dependent upon "local conditions".)

This list encompasses a considerable diversity of folks working on a 
wide variety of projects covering a broad range of topics of differing 
scope. Some work on hobby sites they need to render in an RTL language. 
Some may be the only 'IT' person in a twenty-person company needing 
to render all of their information in a single language that contains 
numerous characters with diacritical marks.

On the other hand, we have some folks here who need to *integrate* 
world-wide data originating in a number of different languages and 
character sets.  There are people here who need the data collected 
from their web apps to feed into 'traditional' big-iron corporate 
applications and downstream systems.  'Traditional' systems that can 
puke, abend, page out support and cause considerable misery and expense 
when fed file names containing spaces and data containing non ASCII-7 
characters.

Now ... if you are working on a project with a small 'local' scope, or 
if you are working with a small organization - you can pretty much do 
what you want.  You know everyone.  You know all of the impacts of your 
decision. And you can quickly fix anything that might go wrong.

However, if you are working in a bigger environment where you don't 
have control - where you don't comprehend the "big picture" ... and 
your 'cutesy' file names and characters take down corporate MRP, Order 
Management, or Financial systems ... that company might just decide to 
change your first name to "remember" -- as in "Remember To? He _used_ 
to work here."

I suspect that this difference in perspective (derived, perhaps, from 
working in a position with different scope) may be why folks are giving 
you such a rough time on this.

>>What happens when we extend the practice of avoiding problematic 
>>characters to file names with accented characters? (If I have two French 
>>files named "âge" and "âgé", do I rename them to "ge", or "age"?) 

As above ... the answer is "it depends".  If you work on a small system 
that you have complete control over -- nothing happens because you are 
aware of and have planned for this input.

OTOH, if your app feeds files and data into 'traditional' systems for 
a larger company based in an English speaking country ... you ARE going 
to cause other people's systems to break. 

And ... generally speaking ... other people don't like that!
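
To make the "âge" / "âgé" question concrete, here is a minimal Python 
sketch, assuming the downstream systems only accept plain 7-bit ASCII 
names without spaces. The helper names (to_ascii_name, find_collisions) 
are made up for illustration; they are not from any tool mentioned in 
this thread.

```python
import unicodedata


def to_ascii_name(name: str) -> str:
    """Hypothetical helper: fold a file name down to 7-bit ASCII."""
    # Split accented letters into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks, keeping the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Discard whatever is still outside ASCII (e.g. CJK characters).
    ascii_only = stripped.encode("ascii", "ignore").decode("ascii")
    # Spaces are a separate hazard downstream, so replace them too.
    return ascii_only.replace(" ", "_")


def find_collisions(names):
    """Hypothetical helper: map each folded name to the originals that clash on it."""
    groups = {}
    for name in names:
        groups.setdefault(to_ascii_name(name), []).append(name)
    return {folded: originals for folded, originals in groups.items()
            if len(originals) > 1}
```

With this folding both French names come out as "age", so the two files 
would collide, and find_collisions(["âge", "âgé"]) reports exactly that. 
A name written entirely in Chinese characters folds to an empty string, 
which is why a blanket "just strip it" rule is not enough on its own and 
somebody still has to decide on a naming scheme.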

>>What about Chinese file names? 

Same answer dude ... multibyte character set names and data can wreak havoc 
on the folks downstream of you ... and cost your company a considerable pile 
of cash and time and meetings to clean it up.  In some cases it may be worth 
it. In other cases it most definitely is not. YMMV.

It's all about the 'scope'. If you are building an on-line sales site for 
the Japanese market, to be managed and administrated in Asia --- then you 
are going to need to work with local multibyte character sets.  And that's 
fine.  But when you need to send the data collected back to the 'home 
office', (whether that's in Edinburgh, Atlanta, or Berlin), you are going 
to need to send that data in a format they can handle ... and that 
'acceptable' format is unlikely to be a mixture of Kanji, Big-5, and Katakana.
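
One way to catch this before the hand-off is to check names at the 
boundary. The following is only a sketch in Python, assuming the 'home 
office' systems accept printable ASCII without spaces; the real rule 
has to come from whoever owns those systems. The function names are 
hypothetical.

```python
def unsafe_characters(name: str) -> list[str]:
    """Return the characters in name outside the assumed safe range.

    Assumption: "safe" means printable 7-bit ASCII excluding the space,
    i.e. code points 0x21 through 0x7E.
    """
    return [ch for ch in name if not (0x21 <= ord(ch) <= 0x7E)]


def split_for_handoff(names):
    """Separate names that can be sent as-is from those needing attention."""
    ok, held = [], []
    for name in names:
        (held if unsafe_characters(name) else ok).append(name)
    return ok, held
```

For example, split_for_handoff(["a.txt", "b c.txt"]) puts "a.txt" in the 
first list and holds back "b c.txt" because of the space. Holding names 
back for a human decision, rather than silently rewriting them, keeps 
the local site free to use whatever character set it needs.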


(Who has spent QUITE a bit of time over the last two years negotiating 
code changes across multiple systems located in multiple time zones to 
correct issues caused by 'spaces', 'diacritical characters', and Asian 
multibyte character sets. Heck, I'm *still* working on some of this 
stuff - and it makes me cranky.)
