HTML vs XHTML: Choosing the best tool for the job

Over the last few years, web designers and developers have increasingly favored XHTML over HTML. Why has this happened? Well, it’s probably because a lot of big-name web designers and standards gurus have preached the perceived benefits of XHTML over boring old HTML 4. The benefits they cite are always the same:

  • The markup is cleaner
  • It is valid XML
  • You get all of the benefits of XML
  • The page will render faster
  • It will help future-proof your work
  • It is the standards-based way of doing it!

Although these points are true in theory, they are nowhere near the mark in practice.

The problem doesn’t lie with XHTML itself; rather, it lies in how, and to which browsers, the XHTML is served. A nice green tick from the w3.org validator doesn’t matter a jot if your web server isn’t serving your pages correctly. If you are serving XHTML with the typical text/html MIME type (as about 99% of all web servers do), all your users will get is malformed HTML instead of your wonderful XHTML.

When a page is served with the text/html MIME type, browsers expect the content to be HTML and thus use their “tag-soup” parser to process the markup. This means that if you are serving XHTML (Transitional or not!) with the text/html MIME type, your browser will treat it as malformed HTML! So, in one fell swoop, most of the benefits have gone out of the window. It is no longer valid XML, it is not being parsed by an XML parser, and, most damning of all for standards gurus, it is not valid markup any more! I won’t bother rehashing the details because they are all covered in a brilliant article written by Apple’s WebKit/Safari team.
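To make the difference between the two kinds of parser concrete, here is a minimal Python sketch. It is only an illustration: Python’s html.parser and xml.etree.ElementTree stand in for a browser’s tag-soup and XML parsers, they are not what any browser actually uses, and the sample markup and function names are my own assumptions.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample markup: one well-formed snippet, one with unclosed tags.
WELL_FORMED = "<html><body><p>Hello<br/></p></body></html>"
MALFORMED = "<html><body><p>Hello<br></body></html>"


class TagCollector(HTMLParser):
    """Records every start tag the lenient HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parses_as_xml(markup: str) -> bool:
    """Return True only if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # An XML parser stops at the first well-formedness error.
        return False


def tags_seen_by_html_parser(markup: str) -> list:
    """Return the start tags a forgiving HTML parser finds in the markup."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    for label, markup in (("well-formed", WELL_FORMED), ("malformed", MALFORMED)):
        print(label, "| accepted as XML:", parses_as_xml(markup))
        print(label, "| tags seen as HTML:", tags_seen_by_html_parser(markup))
```

The forgiving parser carries on past the unclosed tags, while the XML parser rejects the whole document. That strictness is exactly what you give up when XHTML is delivered as text/html.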

Alright, so what if we serve the pages with the correct application/xhtml+xml or (the dubious) text/xml MIME type?

Well, here’s the problem: IE doesn’t support XHTML. Yep, that’s right: IE (IE7 included) cannot accept XHTML when served with its correct MIME type. Rather than rendering your expertly marked-up and wonderfully valid page, it will prompt you to download the file. Internet Explorer’s project manager Chris Wilson explains the reasoning behind IE’s lack of XHTML support in this IEBlog post from 2005.

Unfortunately, until Internet Explorer natively supports XHTML, there is little point in using it unless you want to serve XHTML to browsers that support it and HTML to IE. That means maintaining two versions of each page, unless you get into using XSLT. All of this eats up development time that would be better spent on the site or on adding new features.
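For the curious, the “XHTML to some browsers, HTML to IE” approach usually comes down to inspecting the request’s Accept header. The following Python sketch shows one way that decision might look. The function names and the rule used (send XHTML only when the client explicitly lists application/xhtml+xml and does not rank text/html higher) are assumptions for illustration, not a recommendation or anyone’s official algorithm.

```python
def parse_accept(header: str) -> dict:
    """Map each media type in an Accept header to its q-value (default 1.0)."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # Treat an unreadable q-value as "not acceptable".
        prefs[media_type] = q
    return prefs


def choose_mime_type(accept_header: str) -> str:
    """Pick the Content-Type to send for an (X)HTML page.

    Wildcards such as */* are deliberately ignored: a client that never
    names application/xhtml+xml gets plain text/html.
    """
    prefs = parse_accept(accept_header or "")
    xhtml_q = prefs.get("application/xhtml+xml", 0.0)
    html_q = prefs.get("text/html", 0.0)
    if xhtml_q > 0 and xhtml_q >= html_q:
        return "application/xhtml+xml"
    return "text/html"


if __name__ == "__main__":
    # Illustrative headers only; real browsers send longer ones.
    print(choose_mime_type("application/xhtml+xml,text/html;q=0.9,*/*;q=0.5"))
    print(choose_mime_type("*/*"))
```

Even with this in place, you still have to produce markup that works under both parsers, which is where the extra development time goes.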

By writing valid HTML 4.01, you are giving yourself the largest target audience and the widest support, with the minimum outlay. By using a Strict doctype such as <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">, you are ensuring that browsers will try their hardest to render your page as the HTML spec intended, so you will have a better chance that your site will look consistent across browsers. So, overall you get:

  • Better browser support
  • Better search engine support
  • Less server configuration
  • Less development time
  • More time to spend on important things

So, maybe you are beginning to see my point? As it currently stands, all major browser vendors recommend targeting HTML rather than XHTML. These are the people to whom the web community should be listening. These are the guys and girls who write the browsers; they know how to get the best from their software. Unless you need to use other XML applications within your documents, such as MathML, there really is no point in using XHTML.

In spite of all the hype about XHTML, there are some, like Håkon Wium Lie, who think that the emphasis on XHTML over HTML for general web development is misguided:

“[XHTML2] …has some very good ideas that I hope can become part of the web. However, it’s unrealistic to think that all web authors will switch to an XML-based syntax which demands that browsers stop processing the document on the first error. XML’s draconian policy was an attempt to clean up the web. This was done around 1996 when lots of invalid content entered the web. CSS took a different approach: instead of demanding that content isn’t processed, we defined rules for how to handle the undefined. It’s called “forward-compatible parsing” and means we can add new constructs without breaking the old.

So, I don’t think XHTML is a realistic option for the masses. HTML5 is it.”

However, HTML5 is currently only a draft specification and will not be implemented by browser vendors any time soon. So, for the moment, I’ll be sticking with HTML 4 Strict until it becomes clear which technology, XHTML or HTML5, vendors will favor for general web development.

2007-06-11

Comments

  1. Scott

    I totally agree. All the cool kids moved to XHTML thinking it was a more standards-compliant way to develop, but really that's an illusion.

    HTML 4.01 Strict served as text/html is the most compatible way to develop and the code is just as clean and no less valid than XHTML.

    2007-06-12