On Mon, Sep 17, 2001 at 02:17:45PM -0500, Green, Janet wrote:
[1 and 2 have already been answered]
> 3. What are unicode and UTF-8, and how do I know what I am using? I've never
> made a conscious decision on this.
>
> Please post answers in "for dummies" lingo. Thanks.

Unicode is the name of a huge international character set: essentially a really big alphabet with letters, numbers and symbols drawn from just about every human language in the world. You need a lot of characters for that, so the 31-bit code space it was designed around has room for 2,147,483,648 possibilities (wow), although the current standard only goes up to U+10FFFF, which is 1,114,112 of them.

UTF-8 is just one of the ways you can encode a Unicode document, and it's what XML (and HTML) documents use by default.

If your document is mainly written in English and/or other European languages, and you're not fussed about accent symbols, don't worry about UTF-8. UTF-8 is compatible with plain ASCII, i.e. the characters you can see on a US keyboard (except the Euro symbol, if you have one there), so such a file is already valid UTF-8.

If your documents are written in a non-Roman alphabet like Greek, Hebrew, Cyrillic, or Korean, or you mix them in with English or other languages, then you probably want to consider UTF-8 as an encoding (especially if you're mixing scripts together).

You can learn far, far more theory than you or I will ever need to know at <http://czyborra.com/>, and there's a slightly more user-friendly FAQ at <http://www.cl.cam.ac.uk/~mgk25/unicode.html>. They're both geared towards Unix machines and crazy Linux commies like me, but they're useful from a non-denominational perspective too.

"Encode": save to disk in a way the computer can understand next time you load it in.

"Alphabet": not really an alphabet but, um, never mind :)

--
Andrew Chadwick, UNIX/Internet Programmer, PR Newswire Europe, Oxford
--
The views or opinions above are solely mine and are not necessarily those of PR Newswire Europe. The message may contain privileged or confidential information; if you are not a named recipient, notify me, and do not copy, use, or disclose this message.
<andrew.chadwick at prnewswire.co.uk>.
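
To make the "how do I know what I am using" part a little more concrete, here is a minimal Python sketch of what UTF-8 does to a few characters, plus a rough check on some raw bytes. The helper names utf8_bytes and guess_encoding are made up for illustration, and the second one is only a crude test under the assumption that anything that is neither ASCII nor valid UTF-8 is "something else"; it is not a real encoding detector.

```python
def utf8_bytes(text):
    """Return the UTF-8 encoding of text as a list of two-digit hex strings."""
    return [f"{b:02x}" for b in text.encode("utf-8")]


def guess_encoding(raw):
    """Roughly classify raw bytes (hypothetical helper, not a real detector)."""
    if all(b < 0x80 for b in raw):
        # Only 7-bit values: plain ASCII, which is also valid UTF-8.
        return "ascii"
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Some other encoding, for example Latin-1.
        return "unknown"


print(utf8_bytes("A"))        # US-keyboard letter: ['41'], one byte
print(utf8_bytes("\u00e9"))   # e with acute accent: ['c3', 'a9'], two bytes
print(utf8_bytes("\u20ac"))   # Euro sign: ['e2', '82', 'ac'], three bytes

print(guess_encoding(b"hello"))                     # ascii
print(guess_encoding("\u00e9".encode("utf-8")))     # utf-8
print(guess_encoding("\u00e9".encode("latin-1")))   # unknown
```

The first three lines show why plain English text is nothing to worry about: each US-keyboard character is a single byte that looks the same in ASCII and UTF-8, and only accented letters and other scripts take up more than one byte.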