[thelist] non-Roman characters in URIs [was: [TIP] - Use UTF-8 whenever possible]
kasimir-k
evolt at kasimir-k.fi
Fri May 12 13:51:25 CDT 2006
T. R. Valentine wrote on 12/05/2006 17:01:
> So, how does someone with an Arabic or Armenian or Chinese ChaJei --
> or any one of dozens -- keyboard layout enter the non-accented Latin
> characters (000000–00007F of Unicode or ASCII) (besides using Alt+
> codes or copying and pasting)?
Good question! It would be great if list members using such keyboards
could tell us how it works in real life. Meanwhile, I had a look at
http://en.wikipedia.org/wiki/Keyboard_layout
"Also, most non-Roman keyboard layouts have the capacity to be used to
input Roman letters as well as the script of the language"
I would think that Roman letters are often printed on these
keyboards, as they are so commonly needed, but I'm just guessing here.
> My concern/interest is in having the Internet as international as possible.
You are not alone there. Another Wikipedia article worth checking out:
http://en.wikipedia.org/wiki/Internationalized_domain_names
"Internationalizing Domain Names in Applications (IDNA) is a mechanism
defined in 2003 for handling internationalized domain names containing
non-ASCII characters. ... it was decided that non-ASCII domain names
should be converted to a suitable ASCII-based form by web browsers ..."
See also http://www.icann.org/topics/idn.html
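To make the conversion step concrete, here is a minimal sketch using the "idna" codec that ships with Python, which as far as I know implements the 2003 version of IDNA. The domain name is a made-up example and the two helper names are my own, not part of any standard.

```python
# Sketch only: relies on Python's built-in "idna" codec (IDNA 2003).
# "bücher.example" is a made-up domain used for illustration.

def to_ascii(domain: str) -> str:
    """Convert an internationalized domain name to its ASCII-based form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert the ASCII-based form back to the original characters."""
    return domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    ascii_form = to_ascii("bücher.example")
    print(ascii_form)              # the "xn--" form that goes over the wire
    print(to_unicode(ascii_form))  # back to the readable form
    print(to_ascii("example.com")) # plain ASCII names pass through unchanged
```

The point is that the ASCII-based form is what is actually sent to DNS, so anyone with a Roman keyboard can still type it, while the browser can display the original characters.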
Many are concerned, though, that domain name internationalization brings
babelization.
Also, it is good to remember that while users of non-Roman keyboards
can usually type Roman characters too (and they might even be
printed on the keys), users of Roman keyboards have no means to type
Chinese or Arabic. So using non-Roman characters (without the possibility
of converting them to Roman characters, which IDNA does provide) would make
the Internet less international, not more.
.k