[thelist] PHP & UTF-8

Ivo P ipletikosic at gmail.com
Tue Dec 20 12:25:46 CST 2005


Hello,

The first thing I would check is where the input $string is coming
from and the encoding at that source.
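
One rough way to check is to dump the raw bytes of the string. This is
only a sketch: the sample value below is hypothetical and stands in for
whatever Drupal hands you, and mb_check_encoding() assumes the mbstring
extension is installed.

```php
<?php
// Hypothetical sample value; "\xC3\xA9" is "é" written out as UTF-8 bytes.
$string = "assurance_de_qualit\xC3\xA9";

// Hex dump of the raw bytes. A trailing "c3a9" means the é is UTF-8;
// a lone trailing "e9" would mean ISO-8859-1.
echo bin2hex($string), "\n";

// Requires mbstring: true if the bytes form valid UTF-8.
var_dump(mb_check_encoding($string, 'UTF-8'));
```

If the dump ends in "c3a9", the é is stored as two bytes, which would
explain the stray character left behind after the replacement.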

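It is most likely an encoding difference. The "&Atilde;&copy;" output from
htmlentities() suggests the string is UTF-8: in UTF-8, é is the two bytes
0xC3 0xA9, which read as "Ã" and "©" when treated as ISO-8859-1. In that
case you should either transcode the source strings or use the mb_ functions
(http://us2.php.net/manual/en/ref.mbstring.php).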
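
Here is a minimal sketch of both options, assuming the input really is
UTF-8. The function name strip_accents_utf8() is made up for this
example, and the map only covers a handful of French letters, so extend
it as needed.

```php
<?php
// Hypothetical helper; not part of PHP or Drupal.
function strip_accents_utf8($string)
{
    // The array form of strtr() replaces whole substrings, so each
    // multi-byte UTF-8 sequence is swapped as a unit. The three-string
    // form works byte by byte and leaves the second byte behind.
    $map = array(
        "\xC3\xA9" => 'e', // é
        "\xC3\xA8" => 'e', // è
        "\xC3\xAA" => 'e', // ê
        "\xC3\xA0" => 'a', // à
        "\xC3\xA7" => 'c', // ç
    );
    return strtr($string, $map);
}

$utf8 = "assurance_de_qualit\xC3\xA9";
echo strip_accents_utf8($utf8), "\n"; // assurance_de_qualite

// Alternative: transcode to ISO-8859-1 first (needs mbstring). There é is
// the single byte 0xE9, so the three-string strtr() behaves as expected.
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
echo strtr($latin1, "\xE9", "e"), "\n"; // assurance_de_qualite
```

The array form is probably the safer of the two, since transcoding to
ISO-8859-1 cannot represent characters outside that character set.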



On 12/19/05, Noah St. Amand <noah at tookish.net> wrote:
> I'm trying to deal with UTF characters in PHP, and I'm running into some
> bizarre problems. Well, bizarre to me, anyway -- probably common to
> people who deal with this stuff regularly.
>
> I'm building a Drupal-based site that will exist in both French and
> English. In certain places, in order to locate images that have names
> similar to page titles, etc., I have to replace accented characters
> with the unaccented equivalents. So given the string:
>
> assurance_de_qualité
>
> I need to convert it to:
>
> assurance_de_qualite
>
> According to the manual, it seems that:
>
> strtr($string, "é", "e")
>
> should do it, but it does not. It does the replacement, but also inserts
> (or perhaps reveals) a character that I can't decipher after the e. So:
>
> assurance_de_qualité
>
> becomes
>
> assurance_de_qualite?
>
> Where it gets bizarre is if I do:
>
> htmlentities($string)
>
> The é gets converted to "&Atilde;&copy;".
>
> This all makes me think that the accented e I'm dealing with is not
> actually the character I think it is. Is there any way that I can find
> out if it is indeed the right character? Does anyone have any other
> advice about how to make this work?
>
> Thanks,
> Noah


