[thelist] Removing Microsoft Word special characters

Timothy J. Luoma luomat at operamail.com
Fri Sep 19 08:39:40 CDT 2003


On Fri, 19 Sep 2003 22:47:18 +1000, Ken Schaefer <ken at adOpenStatic.com> 
wrote:

> I only mention charset=UTF-8 because I've stuck that onto various pages 
> that have been converted from MS Word just so that the w3.org validator 
> doesn't complain (so I can validate the rest of the HTML). It seems to 
> deal fine with all the characters mentioned (smart quotes, m-dashes etc)

I ran a quick test:

http://www.tntluoma.com/temp/char-via-utf8.html

but the validator[1] said "Sorry, I am unable to validate this document 
because on line 17 it contained one or more bytes that I cannot interpret 
as utf-8 (in other words, the bytes found are not valid values in the 
specified Character Encoding). Please check both the content of the file 
and the character encoding indication."
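
For anyone who wants to find such bytes locally before uploading, here is a minimal Python sketch that mirrors the kind of check the validator appears to make. The function name find_invalid_utf8_lines is hypothetical and this is not the validator's own code; it only reports which lines fail to decode as UTF-8.

```python
def find_invalid_utf8_lines(data: bytes) -> list:
    """Return the 1-based numbers of lines that are not valid UTF-8.

    Hypothetical helper for illustration only. Splitting on the newline
    byte is safe because 0x0A never occurs inside a multi-byte UTF-8
    sequence.
    """
    bad_lines = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad_lines.append(number)
    return bad_lines


# Made-up sample: line 2 holds the single bytes 0x93 and 0x94, which is
# what Windows-1252 uses for curly double quotes.
page = b"<p>plain ascii</p>\n<p>\x93quoted\x94</p>\n<p>more ascii</p>\n"
print(find_invalid_utf8_lines(page))
```

To check a real file, read it in binary mode (open(path, "rb").read()) and pass the bytes in.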

To test, I just wrote up a sentence in WinWord that used the "smart 
quotes" etc., copied it into a text editor, and saved the page to my 
server.
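
A likely explanation, assuming the text editor saved the file as Windows-1252 (I have not confirmed that), is that the curly quotes and dash went out as single bytes in the 0x80-0x9F range, which are not valid UTF-8 on their own. The Python sketch below illustrates the mismatch and one way to re-encode; the name cp1252_to_utf8 and the sample bytes are made up for illustration.

```python
def cp1252_to_utf8(raw: bytes) -> bytes:
    """Re-encode bytes from Windows-1252 to UTF-8.

    Assumes the input really is Windows-1252. Python's cp1252 codec
    raises UnicodeDecodeError on the few byte values that encoding
    leaves undefined (for example 0x81).
    """
    return raw.decode("cp1252").encode("utf-8")


# Sample in Windows-1252: 0x93 and 0x94 are the curly double quotes,
# 0x97 is the long dash.
word_bytes = b"\x93smart quotes\x94 \x97 etc"

# Served as-is with charset=UTF-8, these bytes cannot be interpreted.
try:
    word_bytes.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False

fixed = cp1252_to_utf8(word_bytes)
print(decodes_as_utf8)   # whether the raw bytes were valid UTF-8
print(fixed)             # the re-encoded bytes
```

The other option is to leave the bytes alone and declare charset=windows-1252 instead, so that the declared encoding matches what was actually saved.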

[1] 
http://validator.w3.org/check?uri=http://www.tntluoma.com/temp/char-via-utf8.html

TjL


-- 
Flash, Pop Up Windows, Animated Gifs... what did I miss?
"Most Annoying Things on the Web" @ 
http://www.tntluoma.com/notes/000340.php

