There is no shortage of pages on the internet that talk about HTML vs. XHTML.  The vast majority of these (in the first few pages of Google results) seem to favor XHTML.  I don't really have an agenda, so I thought I would post my thoughts on the topic.

I have stated on this blog that I use HTML 4.01 Transitional.  I do so because it is easiest for me.  Some people argue that XHTML is easier because there are set rules and if you violate those rules, the documents will not render.  Is that a good thing?  Perhaps my time on the web in the late '90s has made my mind work differently than that of newcomers to the World Wide Web.

The browser wars were ugly.  And I mean literally ugly.  If you wanted to do anything fancy, it required lots of images or compromise.  I learned early on that it was OK that the spacing in IE on my PC was larger than in IE on the Mac.  The fonts were all different sizes from browser to browser and OS to OS.  I learned that graceful fallback was part of the web.  Even now, dealnews.com looks "adequate" in IE 6.  I could make it look perfect.  But the declining traffic from IE 6 does not merit the time it would take me to fix the errors there.

So, when I start thinking about HTML vs. XHTML, I want the more flexible of the two.  I find syntax like nowrap='nowrap' very annoying in XHTML.  Especially since I can't say nowrap='yeswrap' and have it mean anything.  nowrap=1 I could handle.  But, no, it has to be nowrap='nowrap'.  Geez.
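To make the nowrap complaint concrete, here is a minimal Python sketch using only the standard library.  It assumes that a lenient HTML parser (html.parser) and a strict XML parser (xml.etree.ElementTree) are fair stand-ins for how browsers treat HTML and real XHTML.  The names AttrCollector, html_attrs, and is_well_formed_xml are made up for this illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class AttrCollector(HTMLParser):
    """Hypothetical helper: records each start tag and its attributes."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None when
        # the attribute is written in minimized form, e.g. <td nowrap>.
        self.seen.append((tag, dict(attrs)))


def html_attrs(markup):
    """Parse markup the forgiving HTML way and return what was seen."""
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.seen


def is_well_formed_xml(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    # The HTML parser is happy with the short form.
    print(html_attrs("<td nowrap>x</td>"))

    # The XML parser rejects it and insists on the long form.
    print(is_well_formed_xml("<td nowrap>x</td>"))
    print(is_well_formed_xml('<td nowrap="nowrap">x</td>'))

    # Same story for <br> versus <br />.
    print(is_well_formed_xml("<p>one<br>two</p>"))
    print(is_well_formed_xml("<p>one<br />two</p>"))
```

Note that this only checks well-formedness.  nowrap='yeswrap' is perfectly well-formed XML, so the parser above accepts it; it takes a validator working from the DTD to complain about the value.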

OK, OK, this is turning into an XHTML hate post.  I don't want to do that.  There are some things about XHTML that I do like.  I like the self-closing tags.  My OCD (which I have brought up before) has never liked having an open tag without a closing tag.  So, the <br /> format is appealing to me in that sense.  I love that XHTML element names must always be lower case.  I hate upper case HTML.  It just reads funny.  Like camel case function names.  Some folks on our content team used to use Adobe PageMaker to write up deals.  They would copy and paste the HTML from there into our CMS.  The output would be pretty ugly.

So, I like parts of both.  What is interesting to me is the fact that the "big sites" on the internet don't seem concerned with document types or validation.

Site                   DocType                   Validates
Google                 None                      No
Yahoo                  HTML 4.01 Strict          No
Live.com (Microsoft)   XHTML 1.0 Transitional    No
MSN.com                XHTML 1.0 Strict          Yes
Facebook               XHTML 1.0 Strict          No
eBay                   HTML 4.01 Transitional    No
YouTube                HTML 4.01 Transitional    No
Amazon.com             None                      No
Wikipedia              XHTML 1.0 Strict          Yes
MySpace                XHTML 1.0 Transitional    No

So, of the 10 most popular sites on the internet (according to Compete.com), two don't include a document type on their front page at all.  Only two of the sites validate according to the W3C validator.  MSN and Wikipedia both validated on their front page with XHTML 1.0 Strict.  However, neither is sending a Content-Type of application/xhtml+xml.  According to this page, that is a bad thing.  And the search results page for XHTML on MSN.com did not validate.  Kudos to Wikipedia.  Their page on XHTML does validate.  Interestingly, they switch to XHTML 1.0 Transitional for that page.
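For anyone who wants to repeat the DocType and Content-Type part of this check, here is a rough Python sketch.  It is not how the table above was produced.  It works on a page's markup and Content-Type header that you have already fetched some other way, and the function names (doctype_of, served_as_xml, xhtml_served_as_html) are hypothetical.  It only reads the DOCTYPE declaration; it does not validate anything.

```python
import re

# Matches <!DOCTYPE html ...> and captures the public identifier if present.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+html(?:\s+PUBLIC\s+"([^"]+)")?',
    re.IGNORECASE,
)


def doctype_of(markup):
    """Return a label such as 'XHTML 1.0 Strict', or None if no DOCTYPE."""
    match = DOCTYPE_RE.search(markup[:1024])
    if match is None:
        return None
    public_id = match.group(1)
    if public_id is None:
        return "unknown"
    # "-//W3C//DTD XHTML 1.0 Strict//EN" -> "DTD XHTML 1.0 Strict"
    parts = public_id.split("//")
    label = parts[2] if len(parts) >= 3 else public_id
    if label.startswith("DTD "):
        label = label[len("DTD "):]
    return label.strip()


def served_as_xml(content_type):
    """True if the Content-Type header value is application/xhtml+xml."""
    media_type = content_type.split(";")[0].strip().lower()
    return media_type == "application/xhtml+xml"


def xhtml_served_as_html(markup, content_type):
    """True if the page declares XHTML but is not served as XML."""
    label = doctype_of(markup)
    return (
        label is not None
        and label.startswith("XHTML")
        and not served_as_xml(content_type)
    )


if __name__ == "__main__":
    page = (
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
        '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
        "<html></html>"
    )
    print(doctype_of(page))
    print(xhtml_served_as_html(page, "text/html; charset=utf-8"))
```

Whether a page actually validates is a separate question that needs a real validator such as the W3C's.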

So, is the internet broken?  No.  The most important validation is that of your users.  Can they use the site?  Does the site look right in their browser?  Most sites have much bigger navigation and content issues than they do document structure issues.

So, my idea of validation is this: Does it render the same (or damn near) in the browsers that cover 90% of internet users?  If so, then your page validates.  The only way to check that (short of SkyNet, most likely) is the human eye.