[thelist] HTML Document Encoding

David Dorward evolt at david.us-lot.org
Wed Oct 1 17:47:13 CDT 2003


On Wed, Oct 01, 2003 at 06:10:26 -0400, Pierre Lemieux wrote:
> I need help understanding how document encoding works on web pages.
> 
> Is the order of the content-type meta in the head important?
> Should it appear first, before any other element?

It shouldn't appear at all. Declare the charset in the HTTP
Content-Type header instead; see
http://www.htmlhelp.com/tools/validator/charset.html
 
> What is supposed to happen if a document is encoded in UTF-8 but the
> content-type is set to 8859-1?

I don't believe it's defined. It's generally a very bad idea to lie to
the client.
 
> It appears browsers do not bother but search engines do: accented characters
> display correctly in the browser but are scrambled in search results
> (Google). 

I've seen scrambled results in Mozilla when given a UTF-8 document
with a claim of 8859-1.
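
The mismatch above can be reproduced in a few lines of Python. This is
only a sketch: the sample string is made up for illustration, and the
decode step stands in for what a client does when it trusts a wrong
label.

```python
# Hypothetical sample text; any string with accented characters works.
text = "café"

# The document is actually stored as UTF-8 bytes.
utf8_bytes = text.encode("utf-8")

# A client that trusts an ISO-8859-1 label maps each byte to one character.
mislabelled = utf8_bytes.decode("iso-8859-1")

# Decoding with the real encoding recovers the original text.
correct = utf8_bytes.decode("utf-8")

print(repr(mislabelled))
print(repr(correct))
```

The two-byte UTF-8 sequence for the accented letter comes out as two
separate characters under the wrong label, which is the scrambling
people typically report.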

> As it appears to me, encoding can be affected by:
> 
>     - the content-type

No, the content-type only carries a message telling the client what
character encoding to expect. It can't change the encoding of the
bytes actually sent.
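
To illustrate the point that the header is only a label, here is a
rough Python sketch. The helper name declared_charset and the sample
header value are assumptions for the example; the parsing uses the
standard library's email.message.Message.

```python
from email.message import Message


def declared_charset(content_type, default=None):
    """Return the charset parameter of a Content-Type value, lowercased.

    Hypothetical helper: it only reads the label and never touches
    the document body.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(default)


# The body is UTF-8 no matter what the header claims about it.
body = "é".encode("utf-8")
label = declared_charset("text/html; charset=ISO-8859-1")

print(label)
print(body)
```

Changing the header value changes what declared_charset returns, but
the bytes in body stay exactly the same.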

>     - the encoding of the document

The encoding can't affect the encoding. The encoding IS the encoding.

>     - the server

It's possible (but unlikely AFAIK) that some form of transcoding could
occur on the server.
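
For reference, transcoding just means decoding the bytes from one
encoding and re-encoding them in another. A minimal Python sketch
follows; the function name transcode is made up here, and this is not
how any particular server implements it.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes from the source encoding and re-encode them as target."""
    return data.decode(source).encode(target)


# "café" stored as ISO-8859-1, converted to UTF-8.
latin1_bytes = b"caf\xe9"
utf8_bytes = transcode(latin1_bytes, "iso-8859-1", "utf-8")

print(utf8_bytes)
```

If a server did this, the label it sends would have to name the target
encoding, not the one the file was saved in.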

>     - the development environment (web.config in .Net)
>     - the applications (databases, editors)

This (and the settings of those applications) can influence the
encoding.
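
As a rough illustration of how an application setting decides the
encoding, this Python sketch saves the same text under two different
encodings and reads back the raw bytes. The helper name saved_bytes is
hypothetical, and the temporary file stands in for whatever an editor
or database export would write.

```python
import os
import tempfile


def saved_bytes(text, encoding):
    """Write text to a temporary file with the given encoding and
    return the raw bytes that ended up on disk."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # newline="" disables newline translation so the bytes are exact.
        with open(path, "w", encoding=encoding, newline="") as handle:
            handle.write(text)
        with open(path, "rb") as handle:
            return handle.read()
    finally:
        os.remove(path)


print(saved_bytes("é", "utf-8"))
print(saved_bytes("é", "iso-8859-1"))
```

The text is identical in both calls; only the "save as" setting
differs, and that alone determines the bytes the server ends up
sending.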

>     - the OS (I use Mac and Windows)

This shouldn't affect it (other than to influence the applications
used, as above).

-- 
David Dorward                                       http://dorward.me.uk/

