[thechat] XML as read from the horse's mouth >> OH! MY! GOD!
Warden, Matt
mwarden at mattwarden.com
Sun Apr 22 21:01:27 CDT 2001
> yes I know those are capital letter - and yes I was shouting.
Damn... I could hear you all the way from Ohio...
> I intend to know more tomorrow than I do today.
And you didn't just watch Jerry Springer?
> Now don't get me wrong, I've read a thing or two about the subject,
> but trying to read through this
>
> http://www.w3.org/TR/REC-xml
>
> page, is driving me crazy.
The w3.org specs usually should be the last thing you look at, IMO.
> You want examples, I'll give you examples:
>
> 1) a definition or what ...
Lemme see if I can break it down for ya...
> XML documents are made up of storage units called entities, which
> contain either parsed or unparsed data. Parsed data is made up of
> characters, some of which form character data, and some of which form
> markup.
<person>
<name>elfur</name>
<mugshot>http://elfur.is/mypic.jpg</mugshot>
<favorite-color>neon clear</favorite-color>
</person>
the PERSON element contains both text and markup, correct?
the NAME, MUGSHOT, and FAVORITE-COLOR elements contain text, no markup
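If you'd rather poke at it than read about it, here's a rough sketch using Python's standard xml.etree.ElementTree module (my choice for illustration; any XML parser would do). The variable names are made up. It parses the PERSON snippet and shows that PERSON holds child elements (markup), while each child holds only text.

```python
import xml.etree.ElementTree as ET

# The same PERSON document as above, as a string.
doc = """<person>
<name>elfur</name>
<mugshot>http://elfur.is/mypic.jpg</mugshot>
<favorite-color>neon clear</favorite-color>
</person>"""

person = ET.fromstring(doc)

# PERSON contains markup: three child elements.
child_tags = [child.tag for child in person]

# Each child contains character data only.
child_text = {child.tag: child.text for child in person}

# True when none of the children have child elements of their own.
children_are_leaves = all(len(child) == 0 for child in person)

print(child_tags)
print(child_text)
print(children_are_leaves)
```

The only text sitting directly inside PERSON is the whitespace between the child tags.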
> Markup encodes a description of the document's storage layout
> and logical structure.
<name>elfur</name>
the NAME element "encodes" a description of what the string "elfur" is. It's
a name.
> 2) well-formedness what?
>
> Violations of well-formedness constraints are fatal errors
<unclosedtag>
<name>elfur</name>
that is a fatal error (UNCLOSEDTAG never gets an end tag) and this XML file
cannot be parsed.
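You can watch a parser give up on it. A minimal sketch, again assuming Python's standard xml.etree.ElementTree; is_well_formed is a helper name I made up, not something from the spec or the library.

```python
import xml.etree.ElementTree as ET

# The broken snippet: <unclosedtag> never gets an end tag.
broken = "<unclosedtag>\n<name>elfur</name>\n"

# The same thing with the end tag added.
fixed = "<unclosedtag>\n<name>elfur</name>\n</unclosedtag>"


def is_well_formed(text):
    """Return True if the parser accepts text, False if it raises ParseError."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(broken))
print(is_well_formed(fixed))
```

The parser does not try to guess what you meant; it stops with an error, which is what the spec means by "fatal".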
> 3) what is text?
>
> A parsed entity contains text, a sequence of characters, which may
> represent markup or character data
Not sure what's giving you problems here. Basically, a parsed entity (like
the document holding PERSON above) can contain both markup and
characters/text.
> 4) and this tops everyting >> atomic unit !!
>
> A character is an atomic unit of text as specified by ISO/IEC 10646
> [ISO/IEC 10646] (see also [ISO/IEC 10646-2000]).
Never heard of it referred to as this, but I think this is just saying that a
character is the smallest unit in an XML document. Larger units like text,
markup, etc. are made up of characters (like things are made up of atoms).
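ISO/IEC 10646 is the character set standard that is kept in step with Unicode, so each of those "atoms" is basically a numbered character. A small sketch, assuming Python (whose strings are sequences of Unicode code points) and its standard xml.etree.ElementTree; the variable names are just for illustration.

```python
import xml.etree.ElementTree as ET

# Each character in the text has a number (its code point).
text = "elfur"
code_points = [ord(ch) for ch in text]

# XML lets you write a character by its number with a character reference.
# &#101; is the letter "e", so this parses to the same name.
name = ET.fromstring("<name>&#101;lfur</name>")

print(code_points)
print(name.text)
```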
help any?
--
mattwarden
mattwarden.com