Reservations about HTML5

A day or so ago, Adrian Bateman, from Microsoft’s IE team, posted his team’s thoughts on the current draft of the HTML5 spec.

Reading it is brutal. Bateman takes issue with basically everything added since HTML4. He goes through and individually criticizes many of the new tags, sometimes with extremely detailed, multi-paragraph critiques. I guess this is what happens when you’re not sufficiently involved at the beginning.

Of course, there is still plenty of time to complain, since the HTML5 spec won’t reach its final stage until 2022.

Bateman and the IE team, even while sounding like they don’t want HTML5 at all, do bring up a few things that have been bothering me about the spec.

Let me be very clear: I think new tags like <audio> and <video> are wonderful. Breaking the Adobe monopoly is great. There are still some issues (the spec refuses to specify a codec, for instance, which means browser vendors have no common format to build support for), but those types of tags are going to help push the web to a better place.

The parts that bother me are the new, highly touted “structural” tags, like <header>, <footer>, <section>, and worst, <article> and <aside>.

My issue is that all of these tags are perfect semantic additions, if all you want to do is put magazine articles on the web. These are tags that represent a current prevailing paradigm in text-focused web design that has been strongly influenced by print design and layout. They all have a use on A List Apart, and on most blogs, but besides <header>—or possibly <nav>—can you see any of them in Gmail or Google Calendar?

Ryan Doherty already demonstrated a semantic version of Gmail, and didn’t need a single HTML5 tag to do it. (But please, please, don’t assume I’m putting words in Ryan’s mouth about HTML5. He’s also used the <audio> tag to great effect.)

Issues with the ambiguity of certain new tags aside, these tags privilege, even codify, a certain paradigm in design: for lack of a better word, the “wall of text” style. To that end, I worry that (a) they do nothing to help the cause of emerging (hell, already mainstream) web application development, the Rich Internet Application, and (b) they may actively discourage designers from trying new, paradigm-breaking ideas.

I imagine a conversation between a designer and the developer tasked with implementing the design. Maybe this isn’t a “great” developer, but honestly, how many of those are there? The conversation ends: “Well, there just isn’t a tag for that.” Back to the drawing board.

What I like about <div> and <span> is their flexibility. By themselves, they’re nearly meaningless, just a block-level or inline element, respectively. What makes them work is the way we’ve developed the use of the id and class attributes.

Like HTML4 before it, HTML5 with its new tags lacks extensibility. The tags freeze us at a moment in time (about 5 years ago) and in design. To that end, I reiterate John Allsopp’s call for new attributes, like role from XHTML, rather than new tags. Do we really need five new block-level elements? Or should we allow some sort of extensible mechanism to create, in effect, an infinite number of new block-level elements?

Since I suggested it, here’s how I imagine it working: let’s say HTML5 adopts the role attribute, and it’s applicable to everything. Now, instead of XHTML’s short list of values for role, let’s define a list of values, but allow that list to be extended. So our common values (like header, footer, navigation, menu, section, article, frame, search, banner, possibly data and interactive; I’m sure we could come up with far more even now) have some defined meaning, but they are not the only possible values.
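To make the idea a little more concrete, here is a rough Python sketch of how a tool might sort role values into "common" and "extended" buckets. It is only an illustration: the COMMON_ROLES set is lifted from the list above rather than from any spec, and the RoleCollector class, the collect_roles function, and the sample markup are all made up for this example.

```python
from html.parser import HTMLParser

# Illustrative only: the common values named above, not a list from any spec.
COMMON_ROLES = {
    "header", "footer", "navigation", "menu", "section",
    "article", "frame", "search", "banner",
}


class RoleCollector(HTMLParser):
    """Records every role attribute, split into common and extended values."""

    def __init__(self):
        super().__init__()
        self.common = []    # (tag, role) pairs whose role has a defined meaning
        self.extended = []  # (tag, role) pairs using a role outside the common list

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "role" and value:
                bucket = self.common if value in COMMON_ROLES else self.extended
                bucket.append((tag, value))


def collect_roles(markup):
    """Return (common, extended) lists of (tag, role) pairs in document order."""
    collector = RoleCollector()
    collector.feed(markup)
    collector.close()
    return collector.common, collector.extended


# Hypothetical markup: a mail client built from plain <div> and <span>.
SAMPLE = (
    '<div role="header">Inbox</div>'
    '<div role="message-list"><span role="search">Find</span></div>'
)

common, extended = collect_roles(SAMPLE)
print("common:", common)
print("extended:", extended)
```

The point of the split is that an unknown value like message-list is not an error; it is simply a role that nothing has defined yet.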

The tricky part is: how do we go about defining a new role? The simple, naïve solution is allowing people to use anything, and then use CSS selectors and JavaScript to make it work. While this opens up the extensibility to the greatest number of people, it also makes it fairly meaningless, and doesn’t really expand on the id/class system we have now.

The other end of the spectrum is requiring all roles to have a definition that is somehow meaningful to the user-agent. The XML way to do this would be roughly to allow the role attribute to exist in any namespace and still be interpreted, providing a definition for the UA somewhere on the web. Something like this:

...

This, of course, limits the number of people who can create, and host, definitions for roles. It is also incompatible with the HTML-style variant of HTML5 (as opposed to XHTML5, the XML-style variant). So let’s throw this out now.

The other method I can see off the top of my head is a hybrid, and puts much of the onus on the UA to be smart, and on the community to share.

There will need to be a lot of discussion among browser vendors on what “meaningful to the UA” means, and what kind of definitions are necessary from that, but let’s assume that they’ve all come to some sort of consensus—or, since I’ve always thought the consensus model was flawed, that some common format has emerged. There is a file format that can be placed on the web.

This is a hybrid scheme: if all I want is to label some elements with a new role, I can put anything in there. But if I want something meaningful to the UA, with a definition file, I can add it as a <link> element in the document’s <head>.

MIME-types and formats are left as exercises to the reader.
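Here is a small Python sketch of how a UA might tell a merely labelled role from a defined one under this hybrid scheme. Everything about the markup convention is an assumption made for the example: the rel="role-definition" value, the name attribute on <link>, the one-role-per-link rule, and the example.com URL are all hypothetical, since the real format is exactly the part left open.

```python
from html.parser import HTMLParser


class HybridRoleParser(HTMLParser):
    """Collects role definitions from <link> elements and role uses elsewhere."""

    def __init__(self):
        super().__init__()
        self.definitions = {}  # role name -> URL of its definition file
        self.used = []         # role values in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Hypothetical convention: rel="role-definition" plus a name attribute.
        if tag == "link" and attrs.get("rel") == "role-definition":
            name, href = attrs.get("name"), attrs.get("href")
            if name and href:
                self.definitions[name] = href
        elif attrs.get("role"):
            self.used.append(attrs["role"])


def resolve_roles(markup):
    """Map each role used in the page to its definition URL, or None if it is just a label."""
    parser = HybridRoleParser()
    parser.feed(markup)
    parser.close()
    return {role: parser.definitions.get(role) for role in parser.used}


PAGE = """
<html><head>
<link rel="role-definition" name="message-list" href="https://example.com/roles/message-list">
</head><body>
<div role="message-list"></div>
<div role="sparkly-box"></div>
</body></html>
"""

for role, url in resolve_roles(PAGE).items():
    print(role, "->", url or "label only, no definition")
```

In this sketch a role with no matching <link> is still accepted; it just carries no meaning for the UA beyond what CSS and JavaScript give it.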

Now, here’s where the sharing comes in. Let’s say my friend or follower wants to use a similar role, and thinks mine is close enough. He can then add the same <link> tag to his page. Hopefully a smart UA would have cached the definition file, so it would make a conditional GET request and get a 304 Not Modified from my server. If my definition file gets too popular for my server to handle, I could perhaps move it over to Google Code and have my server issue a 301 Moved Permanently, which would hopefully convince UAs to stop pinging me.
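The caching behaviour I’m hoping for can be sketched in Python without touching the network. This is a toy model, not how any real browser cache works: DefinitionCache, FakeServer, both URLs, the ETag value, and the definition body are invented for the example. The only real HTTP pieces are the If-None-Match, ETag, and Location headers and the 200, 301, and 304 status codes.

```python
OLD_URL = "https://my-server.example/roles/message-list"   # hypothetical
NEW_URL = "https://code-host.example/roles/message-list"   # hypothetical
ETAG = '"v1"'
DEFINITION = "definition of the message-list role"         # placeholder body


class DefinitionCache:
    """A toy UA-side cache for role definition files."""

    def __init__(self, transport):
        self.transport = transport  # callable(url, headers) -> (status, headers, body)
        self.entries = {}           # url -> (etag, body)
        self.moved = {}             # old url -> new url, learned from 301 responses

    def fetch(self, url, hops=0):
        if hops > 5:
            raise RuntimeError("too many redirects")
        if url in self.moved:
            # A remembered 301: go straight to the new home, skipping the old host.
            return self.fetch(self.moved[url], hops + 1)
        request_headers = {}
        cached = self.entries.get(url)
        if cached and cached[0]:
            # Conditional GET: only ask for the body if the ETag has changed.
            request_headers["If-None-Match"] = cached[0]
        status, headers, body = self.transport(url, request_headers)
        if status == 304 and cached:
            return cached[1]
        if status == 301:
            self.moved[url] = headers["Location"]
            return self.fetch(headers["Location"], hops + 1)
        if status == 200:
            self.entries[url] = (headers.get("ETag"), body)
            return body
        raise RuntimeError(f"unexpected status {status} for {url}")


class FakeServer:
    """Stand-in for both hosts; logs every request so the traffic can be inspected."""

    def __init__(self):
        self.log = []
        self.migrated = False  # flip to True once the file has moved to NEW_URL

    def __call__(self, url, headers):
        self.log.append((url, dict(headers)))
        if url == OLD_URL and self.migrated:
            return 301, {"Location": NEW_URL}, ""
        if url in (OLD_URL, NEW_URL):
            if headers.get("If-None-Match") == ETAG:
                return 304, {}, ""
            return 200, {"ETag": ETAG}, DEFINITION
        return 404, {}, ""


server = FakeServer()
cache = DefinitionCache(server)
cache.fetch(OLD_URL)    # first visit: full download
cache.fetch(OLD_URL)    # second visit: conditional GET
server.migrated = True
cache.fetch(OLD_URL)    # old host answers 301, cache follows it
cache.fetch(OLD_URL)    # old host is no longer contacted
for url, sent in server.log:
    print(url, sent)
```

Printing the log shows which host each request went to and whether it carried If-None-Match, which is the whole point: the old server only stops hearing from a UA once that UA has seen the 301 for itself.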

There is some question about conflicting role definitions. What makes a “unique key”? Is it a combination of a role and a URL? Can I define multiple roles per URL, or only one? If there are multiple roles per URL, can I, as a developer, mix and match? Which definition would take precedence? These are not trivial questions, and would have to go up for debate in WHATWG.

This is just one idea for an extensible method of building on HTML that doesn’t lock us into a single design paradigm. What I don’t like about the current HTML5 spec is that a lot of it is neither extensible nor forward-thinking. I don’t know how many of you plan to keep making the same old blogs and webzines for the next 13 years, but I, for one, would like to move forward.