[thechat] XML as read from the horse's mouth >> OH! MY! GOD!

Warden, Matt mwarden at mattwarden.com
Sun Apr 22 21:01:27 CDT 2001


> yes I know those are capital letter - and yes I was shouting.

Damn... I could hear you all the way from Ohio...

> I intend to know more tomorrow than I do today.

And you didn't just watch Jerry Springer?

> Now don't get me wrong, I've read a thing or two about the subject,
> but trying to read through this
>
> http://www.w3.org/TR/REC-xml
>
> page, is driving me crazy.

The w3.org specs should usually be the last thing you look at, IMO.

> You want examples, I'll give you examples:
>
> 1) a definition or what ...

Lemme see if I can break it down for ya...

> XML documents are made up of storage units called entities, which
> contain either parsed or unparsed data. Parsed data is made up of
> characters, some of which form character data, and some of which form
> markup.

<person>
    <name>elfur</name>
    <mugshot>http://elfur.is/mypic.jpg</mugshot>
    <favorite-color>neon clear</favorite-color>
</person>

the PERSON element contains both text and markup, correct?

the NAME, MUGSHOT, and FAVORITE-COLOR elements contain only text, no markup
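
If you want to poke at that yourself, here's a rough sketch using Python's standard xml.etree.ElementTree module. The describe() helper and the PERSON_XML name are just made up for this example, nothing official:

```python
import xml.etree.ElementTree as ET

# The same PERSON document as above, held in a string.
PERSON_XML = """<person>
    <name>elfur</name>
    <mugshot>http://elfur.is/mypic.jpg</mugshot>
    <favorite-color>neon clear</favorite-color>
</person>"""


def describe(xml_text):
    """Return the root tag, a {child tag: child text} dict, and whether
    every child is a leaf (holds text only, no nested elements)."""
    root = ET.fromstring(xml_text)
    # Iterating over an element yields its direct child elements,
    # i.e. the markup nested inside it.
    children = {child.tag: child.text for child in root}
    # len(element) is the number of child elements it has.
    leaf_only = all(len(child) == 0 for child in root)
    return root.tag, children, leaf_only


tag, children, leaf_only = describe(PERSON_XML)
print(tag)        # the outer element
print(children)   # text held by each inner element
print(leaf_only)  # True when the inner elements hold no markup
```

So PERSON is the one with markup inside it, and the three inner elements only carry text.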

> Markup encodes a description of the document's storage layout
> and logical structure.

<name>elfur</name>

the NAME element "encodes" a description of what the string "elfur" is. It's
a name.

> 2) well-formedness what?
>
> Violations of well-formedness constraints are fatal errors

<unclosedtag>
    <name>elfur</name>

that is a fatal error: <unclosedtag> is never closed, so this XML file cannot be parsed.
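
You can watch a parser give up on it. A minimal sketch, again assuming Python's xml.etree.ElementTree (is_well_formed() and BROKEN_XML are names I made up for the example):

```python
import xml.etree.ElementTree as ET

# The snippet above: <unclosedtag> is opened but never closed.
BROKEN_XML = """<unclosedtag>
    <name>elfur</name>
"""


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it hits
    a well-formedness error and refuses to continue."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # The parser stops here; it does not try to guess a repair.
        return False
    return True


print(is_well_formed(BROKEN_XML))
print(is_well_formed("<unclosedtag><name>elfur</name></unclosedtag>"))
```

The parser doesn't try to patch the file up the way a browser does with sloppy HTML. It just stops.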

> 3) what is text?
>
> A parsed entity contains text, a sequence of characters, which may
> represent markup or character data

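Not sure what's giving you problems here. Basically, the text of a parsed
entity (for a simple file like the PERSON one above, that's the whole
document) is just a run of characters. Some of those characters are markup
(the tags) and the rest are character data (like "elfur").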

> 4) and this tops everyting >> atomic unit !!
>
> A character is an atomic unit of text as specified by ISO/IEC 10646
> [ISO/IEC 10646] (see also [ISO/IEC 10646-2000]).

Never heard of it referred to as this, but I think this is just saying that a
character is the smallest unit in an XML document. Larger units like text,
markup, etc. are made up of characters (like things are made up of atoms).
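
To make the "atoms" idea concrete, here's a small sketch in Python. It assumes Python 3, where a string is a sequence of Unicode code points (Unicode's character set is kept in step with ISO/IEC 10646). The code_points() helper is a made-up name for this example:

```python
def code_points(text):
    """List the code point of each character in text, written U+XXXX."""
    # ord() gives the integer code point of a single character.
    return [f"U+{ord(ch):04X}" for ch in text]


# Each character of the string "elfur" is one indivisible unit.
print(code_points("elfur"))
# ['U+0065', 'U+006C', 'U+0066', 'U+0075', 'U+0072']
```

Every tag and every bit of text in the file is built out of units like these.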



help any?


--
mattwarden
mattwarden.com




