[thelist] PHP & UTF-8

Noah St. Amand noah at tookish.net
Mon Dec 19 18:09:24 CST 2005


I'm trying to deal with UTF-8 characters in PHP, and I'm running into 
some bizarre problems. Well, bizarre to me, anyway -- probably common to 
people who deal with this stuff regularly.

I'm building a Drupal-based site that will exist in both French and 
English. In certain places, in order to locate images that have names 
similar to page titles, etc., I have to replace accented characters 
with their unaccented equivalents. So given the string:

assurance_de_qualité

I need to convert it to:

assurance_de_qualite

According to the manual, it seems that:

strtr($string, "é", "e")

should do it, but it does not. It does the replacement, but also inserts 
(or perhaps reveals) a character that I can't decipher after the e. So:

assurance_de_qualité

becomes

assurance_de_qualite?
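
One possible explanation, sketched under the assumption that the string 
is UTF-8 (so that é is the two bytes 0xC3 0xA9): the three-argument form 
of strtr() maps single bytes, and when the two arguments differ in 
length the extra bytes in the longer one are ignored. Only 0xC3 would 
then be replaced, leaving 0xA9 behind as the stray character. The 
variable names below are illustrative only.

```php
<?php
// Assumption: the input is UTF-8, so "é" is the byte pair 0xC3 0xA9.
$string = "assurance_de_qualit\xC3\xA9";

// Three-argument strtr() translates byte for byte. "from" is two bytes and
// "to" is one, so only 0xC3 => "e" is applied and 0xA9 is left in place.
$bytewise = strtr($string, "\xC3\xA9", "e");

// The two-argument (array) form replaces whole substrings instead.
$whole = strtr($string, array("\xC3\xA9" => "e"));

echo bin2hex(substr($bytewise, -2)), "\n"; // 65a9: "e" followed by the stray byte
echo $whole, "\n";                         // assurance_de_qualite
```

If that assumption holds, str_replace("é", "e", $string) should behave 
like the array form, since it also matches whole substrings.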

Where it gets bizarre is if I do:

htmlentities($string)

The é gets converted to "&Atilde;&copy;".
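
A small sketch of how htmlentities() depends on the charset it is told 
to assume. It again assumes the input bytes are UTF-8; the charset is 
passed explicitly because the default has changed between PHP versions 
(ISO-8859-1 before PHP 5.4, UTF-8 in current releases).

```php
<?php
// Assumption: "é" arrives as the UTF-8 byte pair 0xC3 0xA9.
$e_acute = "\xC3\xA9";

// Read as ISO-8859-1, each byte is its own character: 0xC3 is "Ã", 0xA9 is "©".
$as_latin1 = htmlentities($e_acute, ENT_QUOTES, 'ISO-8859-1');

// Read as UTF-8, the two bytes form the single character "é".
$as_utf8 = htmlentities($e_acute, ENT_QUOTES, 'UTF-8');

echo $as_latin1, "\n"; // &Atilde;&copy;
echo $as_utf8, "\n";   // &eacute;
```

Two entities for one visible character would suggest that the string is 
UTF-8 while the function is treating it as a single-byte encoding.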

This all makes me think that the accented e I'm dealing with is not 
actually the character I think it is. Is there any way that I can find 
out if it is indeed the right character? Does anyone have any other 
advice about how to make this work?
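
One way to check which character is really there is to look at the raw 
bytes. The following is only a sketch: describe_bytes() is a 
hypothetical helper, not part of PHP or Drupal. It uses bin2hex() to 
dump the bytes and preg_match() with the /u modifier, which fails on 
input that is not valid UTF-8.

```php
<?php
// Hypothetical helper: report the raw bytes of a string and whether they
// form valid UTF-8.
function describe_bytes($string) {
    return array(
        'hex'  => bin2hex($string),
        // With the /u modifier, preg_match() returns false on invalid UTF-8.
        'utf8' => preg_match('//u', $string) === 1,
    );
}

$utf8   = describe_bytes("\xC3\xA9"); // é encoded as UTF-8
$latin1 = describe_bytes("\xE9");     // é encoded as ISO-8859-1

print_r($utf8);   // hex => c3a9, utf8 => true
print_r($latin1); // hex => e9,   utf8 => false
```

Running the helper on the actual page title would show whether the é is 
stored as c3a9 (UTF-8) or e9 (ISO-8859-1).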

Thanks,
Noah


