[thelist] Website in Multiple Languages

Martin Burns martin at easyweb.co.uk
Wed Oct 20 11:22:58 CDT 2004


Bob asked:
> As a general matter, what methods are used to present content in
> multiple languages? I was thinking some sort of CMS where the data is
> stored in a db and translated "on the fly" into the appropriate
> language -- but I'm new to this area.

Hi Bob

You have 2 levels of problem here:

1) Translating and storing the info

This breaks down into
a) What info?

You have both the page content and all the labels that go around it: stuff
like navigation, instructions ("Search") etc. For the labels, you'd usually have
a dictionary for each language, so navigation label 1234 is "Search" in English
and "Rechercher" in French. You may also need finer levels of language
differentiation: EN_GB, EN_US etc.
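
To make the dictionary idea concrete, here is a minimal sketch in Python. The
label IDs, the strings and the function names are all made up for illustration;
a real CMS will have its own storage and naming. The one design point it shows
is the fallback: a regional variant such as EN_US only needs to hold the labels
that differ, and everything else comes from the base language.

```python
# Hypothetical label tables: IDs and strings are illustrative only.
LABELS = {
    "en": {1234: "Search", 1235: "Colour"},
    "en-us": {1235: "Color"},  # only the labels that differ from "en"
    "fr": {1234: "Rechercher", 1235: "Couleur"},
}
DEFAULT_LANGUAGE = "en"  # assumption: English is the site's base language


def normalise(locale):
    """Accept EN_GB, en-GB, en_gb etc. and return the form used as a key."""
    return locale.replace("_", "-").lower()


def fallback_chain(locale):
    """Return the locales to try, most specific first (en-us, then en)."""
    locale = normalise(locale)
    chain = [locale]
    if "-" in locale:
        chain.append(locale.split("-")[0])
    if DEFAULT_LANGUAGE not in chain:
        chain.append(DEFAULT_LANGUAGE)
    return chain


def label(label_id, locale):
    """Look up a label, falling back from region to language to default."""
    for candidate in fallback_chain(locale):
        text = LABELS.get(candidate, {}).get(label_id)
        if text is not None:
            return text
    raise KeyError(label_id)
```

With these tables, label(1234, "EN_US") falls back to the base English entry,
while label(1235, "EN_US") picks up the US-specific spelling.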

Don't forget images too!

Take a look at how (for example) ZenCart does it.

For the page content, you'll need to store the manually translated content in
all your multiple languages for each content asset - each page with all its
data items.
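
One common way to lay that out (a sketch only, not how any particular CMS does
it) is one row per page per language. The table name, column names and helper
functions below are assumptions for illustration, using SQLite so the example
runs on its own.

```python
import sqlite3


def create_store():
    """Create an in-memory database with one row per (page, language)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE page_content ("
        " page_id INTEGER NOT NULL,"
        " language TEXT NOT NULL,"
        " title TEXT NOT NULL,"
        " body TEXT NOT NULL,"
        " PRIMARY KEY (page_id, language))"
    )
    return db


def save_translation(db, page_id, language, title, body):
    """Store (or overwrite) the human-translated version of a page."""
    db.execute(
        "INSERT OR REPLACE INTO page_content VALUES (?, ?, ?, ?)",
        (page_id, language, title, body),
    )


def load_page(db, page_id, language, default_language="en"):
    """Return (language, title, body), falling back to the default language.

    Returns None if the page exists in neither language.
    """
    for candidate in (language, default_language):
        row = db.execute(
            "SELECT language, title, body FROM page_content"
            " WHERE page_id = ? AND language = ?",
            (page_id, candidate),
        ).fetchone()
        if row is not None:
            return row
    return None
```

Returning the language along with the content lets the page tell the user when
they are seeing a fallback rather than their requested language.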

As has previously been mentioned, machine translation really, really doesn't
work if you want to effectively communicate. Not only are there the issues of
machine translators not understanding what you mean, but also the problem of
nuance and idiom. Even between versions of English, there are significant risks
of misunderstanding - I forget the phrase, but I've worked on a project where
the exact same words meant *diametrically opposite* things in US and UK English.


2) Presenting the info

Next you need to work out which info to present to the user, and how.
Browsers are pretty good at telling servers what the user's preferred language
is. Right now, I'm in French Switzerland, in a client environment. My
client-provided PC has its default browser prefs set to request French. My
laptop (on which I'm typing) is set to UK English. When I visit my own website,
the interface (i.e. all the labels, help text and so on) is in French on one
machine, and English on the other, automagically.
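
The browser does this with the Accept-Language request header, which lists
language tags with optional "q" weights. Below is a rough sketch of picking a
language from it; the function names and the set of supported languages are
assumptions for illustration, and most web frameworks already provide something
equivalent, so check yours before rolling your own.

```python
def parse_accept_language(header):
    """Return the language tags from an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0].lower()
        if not tag:
            continue
        quality = 1.0  # a tag with no q value counts as q=1
        for param in parts[1:]:
            if param.lower().startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0  # unreadable weight: treat as not wanted
        if quality > 0:
            # Sort by descending quality, then by position in the header.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def choose_language(header, supported, default="en"):
    """Pick the first requested language the site actually has."""
    for tag in parse_accept_language(header):
        if tag in supported:
            return tag
        primary = tag.split("-")[0]  # fr-ch falls back to fr
        if primary in supported:
            return primary
    return default
```

For example, a browser sending "fr-CH, fr;q=0.9, en;q=0.8" to a site that
supports only "en" and "fr" would get French, via the fr-ch to fr fallback.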

If I had multilingual content too, that would also localise.

You might want to think about the costs & benefits of allowing users to
set a preference without mucking about in the browser. That could be a
simple cookie thing, or if you have user registration, then an explicit
user preference.
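
A sketch of how those sources might be combined, most explicit first: the
registered user's stored preference, then the cookie, then whatever the browser
asked for. The cookie name "lang", the supported set and the function names are
assumptions for illustration.

```python
from http.cookies import SimpleCookie

SUPPORTED = {"en", "fr", "de"}  # hypothetical list of site languages


def language_from_cookie(cookie_header, name="lang"):
    """Read the language cookie from a raw Cookie header, or return None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(name)
    return morsel.value if morsel is not None else None


def resolve_language(account_language=None, cookie_language=None,
                     browser_language=None, default="en"):
    """Pick the first supported language, most explicit source first."""
    for candidate in (account_language, cookie_language, browser_language):
        if candidate and candidate.lower() in SUPPORTED:
            return candidate.lower()
    return default
```

So a logged-in user who chose German gets German even on a PC whose browser
requests French, and an anonymous visitor with a "lang=fr" cookie gets French
regardless of the browser setting.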

But there's a fun problem specifically with the interface, particularly
with tightly defined non-liquid layouts that take into account given
lengths of nav text: the same information is a different size in different
languages. German in particular tends to be significantly longer than the
English equivalent.

Also, if you're working in different languages, you'll need to play the
character set game. Not every language (actually, very few) will display nicely
in ASCII. Even among single-byte, left-to-right languages, you've got a whole
range of fun accents and non-ASCII characters. Google for ISO 8859 for more...
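
To see the problem in miniature, here is a small Python sketch. The sample
string and the helper name are made up; the encodings are the standard ones.
UTF-8 isn't mentioned above, and is included only for comparison as one way to
cover every language with a single character set.

```python
SAMPLE = "Rechercher des réponses"  # illustrative French label


def encodable(text, encoding):
    """Return True if every character of text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# The accented e cannot be represented in ASCII at all.
ascii_ok = encodable(SAMPLE, "ascii")

# ISO 8859-1 (Latin-1) covers it in a single byte...
latin1_bytes = "é".encode("iso-8859-1")

# ...but Latin-1 has no euro sign, while ISO 8859-15 does.
euro_in_latin1 = encodable("€", "iso-8859-1")
euro_in_latin9 = encodable("€", "iso-8859-15")

# UTF-8 uses two bytes for the same accented character.
utf8_bytes = "é".encode("utf-8")

# Reading UTF-8 bytes as if they were Latin-1 gives the classic garbage
# you see when the declared character set does not match the real one.
garbled = utf8_bytes.decode("iso-8859-1")
```

The last line is why the character set you declare to the browser has to match
the one the content was actually saved in.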


To be honest, many CMSs (particularly OSS ones used in multiple countries)
will handle this for you out of the box. Otherwise, it's a Hard Thing to
code from scratch.

Cheers
Martin

-- 
"Names, once they are in common use   | Spammers: Send me email to
 quickly become mere sounds, their    | -> yumyum at easyweb.co.uk <- to train
 etymology being buried, like so many | my filter. Currently killing over
 of the earth's marvels, beneath the  | 99.7% of all known spams stone dead.
 dust of habit." - Salman Rushdie     | http://nuclearelephant.com/projects/dspam


