[thelist] problem with accented characters in email

Brian Cummiskey Brian at hondaswap.com
Wed Dec 8 09:29:17 CST 2004


Sarah Sweeney wrote:

> The problem arises when users enter accented characters in their subject 
> or message; these characters come through with some kind of encoding 
> (e.g. "=?iso-8859-1?Q?Version_fran=E7aise_?="). 
> 

Hi Sarah-

It looks like it's encoding the text as iso-8859-1 (Latin-1, the usual 
Western European character set) to account for accented characters: the 
"=E7" in your example is the "c cedilla" in "française". I take it that 
encoding is what you are describing as the problem.
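
To make the example string concrete: it has the form of a MIME 
"encoded-word" (RFC 2047), where "iso-8859-1" names the character set, 
"Q" marks a quoted-printable style encoding, "=E7" is the byte for "ç", 
and underscores stand for spaces. A minimal sketch in Python (used here 
only for illustration, not the Perl or CFM from your setup) that decodes 
it with the standard library:

```python
from email.header import decode_header, make_header

# The encoded subject quoted above
raw_subject = "=?iso-8859-1?Q?Version_fran=E7aise_?="

# decode_header splits the value into (bytes, charset) pairs;
# make_header joins them back into one readable string
subject = str(make_header(decode_header(raw_subject)))

print(subject.strip())  # prints: Version française
```

So the accented characters are not lost; a mail client that understands 
this encoding should display them normally.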

Something built into the back end (the Perl or CFM mail code) must be 
catching these characters and choosing the character set based on them.

In my opinion, the easiest thing to do would be to simply remove those 
characters.  I don't know a thing about CFM, so I can't offer any code 
for it, but I would use some kind of string replacement.

For example, in ASP I would do something like this:

dim formfield
formfield = replace(request("formfield"), "badcharacter", "goodcharacter")
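
The same replacement idea, extended to several characters at once, could 
look roughly like this in Python. This is only a sketch: the mapping 
`REPLACEMENTS` and the function name `replace_accents` are made up for 
illustration, and the table would need whatever characters your users 
actually enter.

```python
# Hypothetical mapping from "bad" characters to "good" ones; extend as needed
REPLACEMENTS = str.maketrans({
    "ç": "c",
    "é": "e",
    "è": "e",
    "à": "a",
})

def replace_accents(value: str) -> str:
    """Swap each listed accented character for its plain equivalent."""
    return value.translate(REPLACEMENTS)

print(replace_accents("Version française"))  # prints: Version francaise
```

Characters that are not in the table pass through unchanged, so anything 
you forget to list will still reach the email.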


php.net has a function that automates this:

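<?php
// Replace accented letters with their unaccented base letter.
// Assumes $text is ISO-8859-1 (single-byte) encoded.
function unaccent($text) {
    // Get the entities table into an array (explicit charset so every key is one byte)
    $trans = get_html_translation_table(HTML_ENTITIES, ENT_COMPAT, 'ISO-8859-1');
    $search = array();
    $replace = array();
    // Create two arrays, for accented and unaccented forms
    foreach ($trans as $literal => $entity) {
        // Don't contemplate other characters such as fractions, quotes etc.
        if (ord($literal) >= 192) {
            $replace[] = substr($entity, 1, 1); // Get 'E' from string '&Eacute;' etc.
            $search[] = $literal;               // Get accented form of the letter
        }
    }
    return str_replace($search, $replace, $text);
}

// Save this file as ISO-8859-1 so the literal below is single-byte
echo unaccent("Hêllò Èvérÿöñë!");

?>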
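
For comparison, here is a rough Python sketch of the same "unaccent" 
idea. It swaps in a different technique from the PHP above: instead of 
the HTML entities table, it uses Unicode decomposition from the standard 
`unicodedata` module and drops the combining accent marks.

```python
import unicodedata

def unaccent(text: str) -> str:
    """Drop diacritics by decomposing characters and removing combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(unaccent("Hêllò Èvérÿöñë!"))  # prints: Hello Everyone!
```

Letters that have no decomposition (for example "ß" or "ø") are left as 
they are, so this is not a guarantee of pure ASCII output.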


I'm sure there are JavaScript options available to you as well, but 
beware of relying on client-side form validation in an environment you 
don't control.
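
Whatever runs in the browser can be switched off or bypassed, so the 
same check is worth repeating on the server. A minimal sketch, again in 
Python for illustration only (`non_ascii_chars` and `submitted` are 
hypothetical names, not part of any framework):

```python
def non_ascii_chars(value: str) -> list:
    """Return the characters in value that fall outside 7-bit ASCII."""
    return [ch for ch in value if ord(ch) > 127]

submitted = "Version française"  # hypothetical form input
offending = non_ascii_chars(submitted)
if offending:
    # Decide here whether to reject the input or clean it up
    print("Non-ASCII characters found:", offending)
```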


However, if after all this you actually WANT the accented characters to 
be passed in, I don't know what to tell you.  If it were me, I'd strip 
them out if your company is an English-speaking company that sells 
English products.  But that's a whole other issue that the business side 
of things should decide, not the programmers.

