[Javascript] Re: Charset (sort-of OFF_TOPIC for js)

Scott Reynen scott at randomchaos.com
Mon Aug 14 11:21:40 CDT 2006


On Aug 14, 2006, at 10:32 AM, tedd wrote:

> At 7:08 AM -0700 8/14/06, Bill Moseley wrote:

>> What should display for a byte with a value of 192?  You have to tell
>> the browser how to translate the string of bytes into characters --
>> and that's what the charset setting does.
>
> Okay, assuming your value is HEX, the code point 0192 is Latin  
> Capital Letter A with grave.

Documents don't exist in code points.  They exist in bytes.  A byte  
with a value of 192 (hex C0) only maps to Latin Capital Letter A with  
grave in ISO-8859-1 and similar single-byte encodings; ASCII stops at  
127.  In UTF-8, that byte is not a character on its own: it has the  
form of the first byte of a two-byte sequence.  That's a good example  
of why you should always state your character encoding.
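
A minimal sketch of that point, assuming a runtime that provides  
TextDecoder and TextEncoder (current browsers and Node.js do).  The  
variable names are only illustrative:

```javascript
// The same single byte (decimal 192, hex C0) read under two encodings.
const bytes = new Uint8Array([192]);

// TextDecoder treats the label "iso-8859-1" as an alias of windows-1252;
// byte 0xC0 is "À" (U+00C0) in both.
const asLatin1 = new TextDecoder("iso-8859-1").decode(bytes);

// In UTF-8 a lone 0xC0 is not a complete character, so a non-fatal
// decoder substitutes U+FFFD, the replacement character.
const asUtf8 = new TextDecoder("utf-8").decode(bytes);

// Going the other way, "À" encoded as UTF-8 takes two bytes: C3 80.
const encoded = Array.from(new TextEncoder().encode("\u00C0"));

console.log(asLatin1, asUtf8 === "\uFFFD", encoded);
```

The byte is identical in each case; only the declared encoding changes  
what it means.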

> The point is that I  (perhaps I'm showing my ignorance here) don't  
> know of any characters in the Unicode dB that can't be shown by  
> UTF-8 -- so what's the point of other charset's? Are they for  
> legacy concerns?

There is a large and growing body of data that already exists in  
encodings other than UTF-8, so we can't just tell everyone to use  
UTF-8 any more than we can tell everyone to stop using COBOL.  There  
are also some characters that don't exist in Unicode.  See:

http://www.unicode.org/faq/han_cjk.html

> And, as I've asked before, does any of this affect javascript? Will  
> having the wrong charset cause js to fail in some way?

Character encoding affects any JavaScript working with documents that  
come from the server, e.g. JSP output or AJAX responses.  If the  
server-side document has a different encoding than the client-side  
document, you'll need to convert before displaying.  But if you're  
not actually having that problem, I'm not sure why we're talking  
about this.
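
As a rough sketch of what "convert before displaying" can look like,  
assuming you have the raw response bytes and the Content-Type header  
value in hand, and that TextDecoder is available.  The helper names  
charsetFromContentType and decodeBody are hypothetical, not part of  
any library:

```javascript
// Hypothetical helper: pull the charset out of a Content-Type header
// value, falling back to UTF-8 when none is declared.
function charsetFromContentType(contentType) {
  const match = /charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType || "");
  return match ? match[1].toLowerCase() : "utf-8";
}

// Hypothetical helper: decode raw response bytes with the declared
// encoding instead of assuming the page's own encoding.
function decodeBody(bytes, contentType) {
  return new TextDecoder(charsetFromContentType(contentType)).decode(bytes);
}

// Bytes E9 74 E9 spell "été" in ISO-8859-1.
const body = new Uint8Array([0xe9, 0x74, 0xe9]);
const text = decodeBody(body, "text/plain; charset=ISO-8859-1");

console.log(text);
```

If the same bytes are decoded without the declared charset, the two  
E9 bytes are not valid UTF-8 on their own and come out as replacement  
characters.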

Peace,
Scott



