[thelist] php - htmlentities & charencoding

Taras D taras.di at gmail.com
Thu Jul 6 21:34:44 CDT 2006


Hi all,

I was hoping to get some clarification on a couple of questions I have:

1) When should htmlspecialchars be used? As a general rule, should it
be used for any text that may contain special characters and is going
to be rendered in the browser (i.e. text that isn't inside tags)? I've
got a JavaScript onclick handler whose code includes an ampersand, and
the HTML validator complains. I don't know if I should escape the
ampersand, or even if it's possible (seeing that the text is inside an
HTML attribute).
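
To make the onclick case concrete, here is a minimal sketch. The
handler code and the go() function are made up for illustration, and it
assumes the attribute value is run through htmlspecialchars with
ENT_QUOTES. As far as I understand, the browser decodes the entities in
an attribute value before the script runs, so the handler would still
see a plain ampersand.

```php
<?php
// Hypothetical handler code containing ampersands (names are made up).
$js = "if (a && b) { go('page.php?x=1&y=2'); }";

// Escape for an HTML attribute context; ENT_QUOTES also escapes
// single quotes, which matters inside a quoted attribute value.
$attr = htmlspecialchars($js, ENT_QUOTES, 'UTF-8');

// The validator should now see &amp; instead of a bare ampersand.
echo '<a href="#" onclick="' . $attr . '">link</a>', "\n";
```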

Why would you ever use htmlentities as opposed to htmlspecialchars? The
only reason I can think of is if your page's charset doesn't support
the special character you're trying to render (for example, the euro
sign in Latin1), but then why wouldn't you just change the page's
charset to UTF-8 (unless your editor can't save in UTF-8, which might
indicate it's time to get another editor)? The comment on the PHP manual
entry for htmlentities, 'Please, don't use htmlentities to avoid XSS!
Htmlspecialchars is enough!' seems to suggest that the uses for
htmlentities are limited, since it needn't be used to avoid XSS.
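
For reference, a small sketch of the difference I'm asking about,
assuming a UTF-8 input string (the sample text is made up). Both
functions escape the markup characters; only htmlentities also turns
the accented letter and the euro sign into named entities, which leaves
the output as plain ASCII.

```php
<?php
// Sample UTF-8 text with an e acute, a euro sign and some markup.
$text = "caf\u{E9} costs 5\u{20AC} & <b>more</b>";

// Escapes only &, <, > and quotes; other characters pass through.
$special = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// Escapes every character that has a named HTML entity.
$entities = htmlentities($text, ENT_QUOTES, 'UTF-8');

echo $special, "\n";
echo $entities, "\n";
```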

2) A comment in the PHP manual entry for htmlentities states that the
commenter's function can be used to 'replace any characters in a string
that could be 'dangerous' to put in an HTML/XML file with their numeric
entities (e.g. &#233; for [e acute])'. Why would it be dangerous!?
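
To show the kind of conversion being described, here is a rough sketch.
It is not the function from the manual comment: to_numeric_entities is
a hypothetical helper, and it assumes valid UTF-8 input. Every
non-ASCII character becomes a decimal numeric entity, so the result no
longer depends on which charset the page declares.

```php
<?php
// Hypothetical helper, not the function from the manual comment:
// replaces every non-ASCII character in a UTF-8 string with &#NNN;.
function to_numeric_entities(string $utf8): string
{
    $result = preg_replace_callback(
        '/[^\x00-\x7F]/u',
        function (array $m): string {
            $bytes = array_values(unpack('C*', $m[0]));
            // The lead byte keeps fewer payload bits the longer
            // the UTF-8 sequence is.
            $code = $bytes[0] & (0xFF >> (count($bytes) + 1));
            // Each continuation byte contributes its low six bits.
            for ($i = 1; $i < count($bytes); $i++) {
                $code = ($code << 6) | ($bytes[$i] & 0x3F);
            }
            return '&#' . $code . ';';
        },
        $utf8
    );

    // preg_replace_callback returns null on invalid UTF-8;
    // fall back to the untouched input in that case.
    return $result ?? $utf8;
}

echo to_numeric_entities("caf\u{E9} 5\u{20AC}"), "\n";
```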

3) What are some typical uses of specifying HTTP input/output character
encoding? If it is used to convert output, why wouldn't you just change
the output page's char encoding? If it's used to convert input from,
say, UTF-8 to Latin1, couldn't you just use a function to do this?
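
By 'use a function' I mean something along these lines. This is only a
sketch, assuming the iconv extension is available; the sample string is
made up.

```php
<?php
// "cafe" with an e acute: 5 bytes in UTF-8 (the accent takes two).
$utf8 = "caf\u{E9}";

// Explicit conversion of input from UTF-8 to Latin1 (ISO-8859-1),
// where the same text takes 4 bytes.
$latin1 = iconv('UTF-8', 'ISO-8859-1', $utf8);

// And back again, to show the conversion is lossless for this text.
$roundtrip = iconv('ISO-8859-1', 'UTF-8', $latin1);

echo strlen($utf8), ' ', strlen($latin1), "\n";
```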

That's about it!

Thanks in advance

Taras


