[Javascript] Re: Charset (sort-of OFF_TOPIC for js)

David Dorward david at dorward.me.uk
Mon Aug 14 08:47:11 CDT 2006


On Mon, Aug 14, 2006 at 09:39:16AM -0400, tedd wrote:

> However, the whole situation seems a bit redundant, doesn't it? You 
> have to both save your document in the right charset encoding AND 
> then again restate the charset encoding in the document. Why twice? 

First, it should be pointed out that sticking the details of the
encoding in a <meta> element and expecting the client to read that is
a nasty hack (which comes with a bunch of provisos).

The proper place for this information is in the HTTP header sent by
the server before the document itself.

That way the server says "Hello! Here is a UTF-8 document" and then
the browser can read it knowing it is UTF-8.
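
As a rough sketch of what that looks like on the wire (not any
particular server's implementation), the Python below builds a raw
HTTP response whose Content-Type header carries the charset. The
function name build_response and the sample markup are made up for
illustration.

```python
def build_response(text: str, encoding: str = "utf-8") -> bytes:
    """Return a raw HTTP response that declares its own encoding."""
    # Encode the body first so Content-Length counts bytes, not characters.
    body = text.encode(encoding)
    headers = (
        "HTTP/1.1 200 OK\r\n"
        # The charset parameter tells the client how to decode the body.
        f"Content-Type: text/html; charset={encoding}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    # Header lines are plain ASCII; the body uses the declared encoding.
    return headers.encode("ascii") + body


response = build_response("<p>caf\u00e9</p>")
```

The header arrives before the first byte of the document, so the
client never has to guess how to start decoding.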

> Is it that some browsers can detect the encoding and others can't?

Detecting the encoding comes down to guesswork. You get situations
such as:

  "The boy ate a chip"

Which, in a document authored by an American, would probably mean "The
boy ate a crisp." in British English. However, if it was authored by a
Brit, then in American English it would probably mean "The boy ate a
french fry."

The same byte can mean different things in different encodings,
without making either encoding obviously wrong.
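
A small Python illustration of that point, assuming the two encodings
in question are ISO-8859-1 (Latin-1) and Windows-1251 (chosen here
only as examples): the single byte 0xE9 decodes cleanly in both, but
to different characters.

```python
raw = b"\xe9"  # one byte, with no label saying which encoding it is in

# Both decodes succeed, so neither encoding is "obviously wrong".
as_latin1 = raw.decode("latin-1")  # Latin small letter e with acute
as_cp1251 = raw.decode("cp1251")   # Cyrillic small letter short i

print(repr(as_latin1), repr(as_cp1251))
```

Nothing in the byte itself says which reading the author intended,
which is why the encoding has to be stated somewhere.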


-- 
David Dorward                                      http://dorward.me.uk



