[thelist] character encoding tip

Alan Wood alan.wood at context.co.uk
Fri Mar 14 03:39:01 CST 2003


Emma Jane Hogbin wrote:

> Make sure your encoded characters match the encoding your
> pages are using.
> A character that looks like this in your code: &#8217; is a UTF-8
> character. However, many pages still use the ISO-8859-1
> encoding. Although browsers might display the character the
> way you intended, it's not right to mix and match.

According to the HTML 4 specification, numeric character references such as
&#8217; are supposed to be independent of the charset/encoding, and so
should work equally well with ISO-8859-1 or UTF-8.

If your code includes the actual character (apostrophe or right single quote
in this example) then you need to use and specify UTF-8 encoding, but it is
not needed for numeric character references (&#8217;) or character entity
references (&rsquo;).
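
To illustrate the distinction, here is a minimal Python sketch (not from the
original post; the helper name fits_in_charset is made up for illustration).
It assumes Python's standard html module as a stand-in for a browser's
reference decoding, and shows that both references are plain ASCII and so
survive either charset, while the literal character cannot be stored in
ISO-8859-1 at all.

```python
import html

RIGHT_QUOTE = "\u2019"    # U+2019 RIGHT SINGLE QUOTATION MARK (decimal 8217)
NUMERIC_REF = "&#8217;"   # numeric character reference
ENTITY_REF = "&rsquo;"    # character entity reference


def fits_in_charset(text: str, charset: str) -> bool:
    """Hypothetical helper: True if text can be encoded in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# The references consist only of ASCII characters, so their bytes are the
# same whether the page is saved as ISO-8859-1 or UTF-8.
same_bytes = NUMERIC_REF.encode("iso-8859-1") == NUMERIC_REF.encode("utf-8")

# Decoding either reference yields the same Unicode character.
decoded_numeric = html.unescape(NUMERIC_REF)
decoded_entity = html.unescape(ENTITY_REF)

# The literal character has a UTF-8 byte sequence but no ISO-8859-1 one.
literal_in_utf8 = fits_in_charset(RIGHT_QUOTE, "utf-8")
literal_in_latin1 = fits_in_charset(RIGHT_QUOTE, "iso-8859-1")

if __name__ == "__main__":
    print("references encode identically:", same_bytes)
    print("numeric ref decodes to U+2019:", decoded_numeric == RIGHT_QUOTE)
    print("entity ref decodes to U+2019:", decoded_entity == RIGHT_QUOTE)
    print("literal fits in UTF-8:", literal_in_utf8)
    print("literal fits in ISO-8859-1:", literal_in_latin1)
```

Under these assumptions, only the last check should come out false, which is
the case where the page would need to be saved and declared as UTF-8.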

Alan Wood
http://www.alanwood.net (Unicode, special characters, pesticide names)

