[thelist] XHTML or HTML WAS Good Examples of XHTML Usage

Ian Hickson ian at hixie.ch
Sat Sep 6 10:17:54 CDT 2003


On Sat, 6 Sep 2003, TjL wrote:
>>>
>>> Show me a browser that can handle HTML and not XHTML.
>>
>> Windows IE 6.
>>
>> Testcase: http://www.hixie.ch/tests/adhoc/xhtml/mime-type/001
>
> I'll take your word for it, since I'm partially color-blind.

You don't need to be able to distinguish colours to tell that IE fails
that test, since it refuses to even try to render it.


> How about restating this better... show me a real "in the wild"
> example of an XHTML file sent as HTML that causes a browser to fail.

This is a strawman argument. I never said that sending XHTML as HTML
(assuming it complies with XHTML 1.0 Appendix C guidelines) would
cause Tag Soup ("HTML") renderers to fail.

What I said was that using XHTML, and sending it as text/html, is
pointless, because:

 * The XHTML document will typically either be syntactically or
   semantically incorrect,

 * The XHTML document will thus not work if you ever switch to an XML
   MIME type, so in practice will always be sent using an HTML MIME
   type,

 * UAs do not treat text/html documents as XHTML, they treat them as
   HTML, and only render them as expected because of their error
   handling behaviour,

 * UAs will render text/html documents according to the HTML rendering
   rules, which are different from the XHTML rendering rules.

If you're going to be using the Web browser's HTML parser and HTML
rendering rules, why would you send it content that it was not
designed to parse, and only parses due to its error handling logic,
rather than sending it content it _was_ designed to parse?
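
To illustrate the difference between a strict XML parser and a lenient
HTML one, here is a minimal sketch using Python's standard library.
The sample markup and the helper names (BROKEN, TagCollector,
parses_as_xml, tags_seen_by_html_parser) are made up for illustration,
and html.parser is only a stand-in for a browser's Tag Soup parser,
not a model of any particular browser's error handling.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample: XHTML-style markup in which <p> is never closed.
BROKEN = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>Hello<br/></body></html>"
)


class TagCollector(HTMLParser):
    """Lenient parser: records every start tag it sees and never raises."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parses_as_xml(text):
    """Return True only if the text is well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def tags_seen_by_html_parser(text):
    """Return the start tags the lenient HTML parser reports."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    print("well-formed XML:", parses_as_xml(BROKEN))
    print("HTML parser saw:", tags_seen_by_html_parser(BROKEN))
```

The XML parser rejects the document outright, while the lenient parser
carries on and reports the tags it found. That is the sense in which a
mistake in a text/html document goes unnoticed.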


> I think I was referring to the "Evil Mangled Comments Embedding Hack"
>
>    <script type="text/javascript"><!--//--><![CDATA[//><!--
>      ...
>    //--><!]]></script>

The fact that you have to go to such extreme lengths to embed style
and script into your documents should be setting off alarm bells.

The fact that nobody uses the above is even more telling.


>>>> * XHTML documents that use the "/>" notation, as in "<link />", are
>>>> not valid HTML documents. (See the third bullet point in the
>>>> section entitled "The Myth of "HTML-compatible XHTML 1.0
>>>> documents"".)
>>>
>>> Show me a browser that can't handle <link /> but can handle <link>
>>
>> No browsers handle <link/> according to the HTML4 spec.
>
> That's not what I asked.

So what do you mean, "show me a browser which parses <link/>
incorrectly"? Most of them do, as far as I know.

How is that relevant?

If all you are going to do is rely on the UAs' error handling code,
why bother trying to make your markup valid at all? Just write pure
Tag Soup and be done with it.


>>>> Therefore the main advantage of using XHTML, that it has to be
>>>> valid, is lost if the document is then sent as text/html.
>>>
>>> I would say <em>one</em> advantage is lost, but it is not the main one,
>>> to me.
> [...] the first thing I mentioned was that it was easier to find
> mistakes.

Is that not the same thing?


>>>> * If you ever switch your XHTML documents from text/html to text/xml,
>>>> then you will in all likelihood end up with a considerable number
>>>> of XML errors, meaning your content won't be readable by users.
>
> What I meant was that no one is going to send XHTML as XML until well
> after IE6 is dead, and I mean Netscape3 dead.  Which is so far in the
> future as to not even bother with worrying about.

It doesn't matter _what_ browsers are "dead" when you switch MIME
type. The point is that if you write XHTML now, and send it as
text/html, a huge number of those documents _will_ break if you change
the MIME type.

Your own article, for example, as mentioned below.

Now, given that there is little or no chance that these errors will
ever be corrected, and that there is therefore no chance that the MIME
type will ever be changed from text/html, why would you use XHTML?

Why write XHTML if you are simply going to claim it is HTML (text/html
is defined as meaning "treat me as HTML") and rely on parsers "error
handling" the document into a readable state?


> Could be... I don't do Javascript so I have to claim ignorance
> there. I thought we were talking XHTML vs HTML.

The differences between the semantics of JavaScript in XHTML and its
semantics in HTML are one of the main reasons not to use XHTML.


>>>> * A CSS stylesheet written for an HTML document is interpreted
>>>> slightly differently in an XHTML context [...]
>>>
>>> If you are writing XHTML, you are probably aware of that when you
>>> write CSS.
>>
>> [...] Most authors are _not_ aware of this. Were you?
>
> Sure I was... but I use UltraEdit, so I'm pretty much aware of what
> I'm writing.

You are aware of the differences between the semantics of CSS in XHTML
and the semantics of CSS in HTML?


>>   http://www.joinwow.org/learningcenter/markup/articles/2003/m200302.asp
>
> All I can say is that the page was valid when I submitted it, it didn't
> have the invalid SGML characters, and the invalid (non XHTML) markup was
> the navigation stuff to the side of the page that I didn't have anything
> to do with.

_That is exactly my point._

It doesn't matter how good you are or how valid your XHTML is at the
start of the process. By the time it hits the server, if it is sent as
text/html, then errors that have crept in will not be caught.


>>> Yeah I know he says that HTML compatible XHTML is a myth.  Again, show
>>> me a real world example of it breaking anywhere.
>>
>> The simplest example is the XHTML "<br/>" which in HTML means
>> "<br>&gt;".
>
> Is there a browser that renders <br /> as <br>&gt; ?

I believe Emacs/W3 may, but it's been a while since I tested it.

But that isn't relevant. My argument is that XHTML isn't HTML
compatible, not that it isn't Tag Soup compatible.


> My point was that there are a lot of theoretical problems with it,
> but much fewer practical ones.

There are also few practical problems with generating invalid HTML.

What is the point of using XHTML rather than HTML4?


>> In conclusion: on the one hand you claim that using XHTML means
>> requiring a validator, and that validity is key, yet on the other, you
>> claim that so long as things work in most browsers, irrespective of
>> the specs, then it's ok. This seems inconsistent.
>
> XHTML pushes me towards writing stricter code.

No it doesn't. Validators push you towards writing stricter code.
XHTML simply has a different set of rules.

Why is

   <link ... />

...any "stricter" than:

   <link ... >

...? Both are just as valid in their respective languages.


> I have been writing XHTML for a couple years for personal projects,
> but when I got my Treo handheld, I realized that it would download
> faster to write extremely bad HTML. For me in my usage, it was more
> important to write really bad HTML in that situation because speed
> is paramount.

It's probably even faster to write perfectly valid HTML, e.g.:

   <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
   <title> ... </title>
   <p> ...

...which is perfectly valid.


> Arguing against sending XHTML as text/html, when using any of the
> XHTML mime types is just purely impractical, is futile.

I'm not arguing in favour of those other MIME types. I'm arguing in
favour of simple HTML 4.01! It's a perfectly valid specification, it's
not obsolete, and it happens to be the most widely supported variant
of (X)HTML. Why use anything else?


> I also said that XHTML was easier to parse and makes it easier to
> transition to understand some XML concepts.

It is a lot _harder_ to parse if you are just sending it to an HTML4 /
Tag Soup parser and hoping it turns out right. When sent as text/html,
XHTML uses the same parser as HTML4: It is not any easier to parse.


> If you use HTML4 instead of XHTML you aren't going to get parsing
> warnings either.

No, but at least you don't have to rely on possibly-changing error
handling behaviour.


> Another part of the inconsistency is that I wrote a PHP snippet to
> send me application/xhtml+xml when using Mozilla or Opera. Now that
> Opera 7.2 sends the right HTTP_ACCEPT headers, I will probably start
> using that more often. So when I am looking at my own site from my
> own computer, I will be getting application/xhtml+xml (or whatever
> that dreadful MIME type is). So I can get the benefit of XHTML
> without having to inflict it on other people.

See appendix B of my document:

   http://www.hixie.ch/advocacy/xhtml
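
For readers who have not seen this kind of negotiation, the following
is a rough sketch of the idea in the quoted paragraph, written in
Python rather than PHP. It is not the quoted author's snippet: the
function name pick_mime_type and the policy (serve
application/xhtml+xml only when the Accept header lists it explicitly
with a non-zero q value) are assumptions made for illustration.

```python
def pick_mime_type(accept_header):
    """Choose a MIME type from the value of an HTTP Accept header.

    Assumed policy: return application/xhtml+xml only when the client
    names it explicitly with q > 0; otherwise fall back to text/html.
    A bare wildcard such as */* is deliberately not treated as support.
    """
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        if parts[0].lower() != "application/xhtml+xml":
            continue
        quality = 1.0  # q defaults to 1 when the parameter is absent
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat an unparsable q as "not accepted"
        if quality > 0:
            return "application/xhtml+xml"
    return "text/html"


if __name__ == "__main__":
    print(pick_mime_type("text/html,application/xhtml+xml;q=0.9"))
    print(pick_mime_type("*/*"))
```

Note that a sketch like this only chooses the label on the response.
It does nothing to check that the document being sent is actually
well-formed.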


I've also updated the document to take into account some of your
comments.

Cheers,
-- 
Ian Hickson                                      )\._.,--....,'``.    fL
U+1047E                                         /,   _.. \   _\  ;`._ ,.
http://index.hixie.ch/                         `._.-(,_..'--(,_..'`-.;.'

