[thelist] the ampersand & and similar entity type characters

Chris Anderson Chris at activeide.com
Sun Nov 8 11:21:34 CST 2009


> Just curious here.  Can someone explain when/why it's compulsory to use
> the entity code rather than the character itself in a document.
> Commonly in validating pages, even the cryptic php strings they won't
> validate unless I convert & to &amp;, etc.
> 
> With simple page titles like "Jack & Jill" they become "Jack &amp;
> Jill".
> 
> Or perhaps:
> 
> index.php?option=page&id=3 becomes index.php?option=page&amp;id=3
> 
> It's just  rather fatiguing...

It's because in markup languages such as HTML and XML, the ampersand is
a special character: it introduces the codes used to write other special
characters (such as < (&lt;), > (&gt;), etc).

If it were not a special character itself, the markup parser would have
to try to determine whether the ampersand was being used as a special
character marker or was content.

As it is though, the parser knows that the ampersand *always* starts a
special character encoding, and simply scans forward to the next
semicolon to extract the encoding.
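
To make that scan concrete, here is a rough sketch in Python. It is not
how any real parser is implemented; the ENTITIES table and the
decode_entities name are made up for illustration, and a real parser
knows far more entities than these four.

```python
# Hypothetical, deliberately tiny entity table (an assumption for this sketch).
ENTITIES = {"amp": "&", "lt": "<", "gt": ">", "quot": '"'}


def decode_entities(text):
    """Replace &name; sequences with the character they stand for."""
    out = []
    i = 0
    while i < len(text):
        if text[i] != "&":
            # Ordinary content: copy it through unchanged.
            out.append(text[i])
            i += 1
            continue
        # An ampersand: scan forward to the next semicolon.
        end = text.find(";", i + 1)
        name = text[i + 1:end] if end != -1 else None
        if name in ENTITIES:
            out.append(ENTITIES[name])
            i = end + 1
        else:
            # Bare or unknown ampersand. A strict parser or validator
            # would report an error here; this sketch passes it through.
            out.append("&")
            i += 1
    return "".join(out)


title = decode_entities("Jack &amp; Jill")
link = decode_entities("index.php?option=page&amp;id=3")
```

The else branch is the case the validator complains about: with a bare
& the scan either finds no semicolon or finds a name it doesn't know.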

What language/platform are you using?
These days most have utility functions that do the encoding for you.
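
As one example, and only assuming Python since I don't know what you're
on, the standard library's html.escape does this. The two input strings
are the ones from your message; the variable names are just for
illustration.

```python
from html import escape

# escape() rewrites &, < and > (and, by default, quote characters)
# as their entity codes so the result is safe to place in markup.
title = escape("Jack & Jill")
href = escape("index.php?option=page&id=3")

print(title)
print(href)
```

The idea is to keep the raw string in your code or database and run it
through the function once, at the point where it gets written into the
page, rather than typing the entity codes by hand.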

Chris


