Sunday 9 August 2009

HTML5 is Coming!

The latest (8 August 2009) draft version of the HTML5 specifications has just been published.

Some of the additions are special dedicated tags for semantic labeling. These are labels that describe the logical content of a block (what it is, rather than how it displays), although with Cascading Style Sheets ("CSS"), it's also possible to set associated display parameters for just about any tag type (colours, surrounding boxes, and so on).

Microsoft (who aren't on the HTML5 panel) have queried what the point of these things is, since they don't add any new layout specification tools for the benefit of the website designer. We already have the general-purpose <div> tag that lets us mark out blocks of code and assign custom class names and ID names to those blocks, so that they can be displayed in particular ways using CSS. Why duplicate the same functionality in these new tags, <article>, <nav>, <section>, <aside> and so on, if they don't give the webpage designer any new control over how a page appears on screen or on paper that they couldn't already achieve with <div>?

Well, even if Microsoft can't quite see the point of them, there are still a number of really good reasons why the end-users and the internet in general need at least some of these new tags.

Blogging
HTML4 came out at the end of the last century (!), and since then the blog phenomenon has pretty much exploded. Blogging software now makes it really easy for authors to produce a mass of rich, mixed, auto-updated content over tens or hundreds of pages. But search engines have to try to make sense of this mess of articles, article links, widgets and addons, and it's not easy. For instance, suppose that I write and upload a blog article about "Einstein and Fish". On Google, "Einstein and fish" currently only gives one result (if it was two words, it'd count as a "Googlewhack").
But as soon as I post the article, the title "Einstein and Fish" will appear in the "recent posts" box in the sidebar of every single page of my blogspace. Point Google's "advanced search" at my blogspace to find how many articles I've written on "Einstein and fish", and instead of one, it'll report back a list of every blog entry I've ever written as apparently containing that piece of search text. It'll also probably include all the text of every widget I've used on the site (like "NASA Photo of the Day"). And this is even though I'm using Blogger, which is Google's own blogsite company.

When webpage designers and companies like Blogger start using the new tags, general-purpose search engines should find it easier to separate out blog articles and webpage content from the surrounding mess of widgets, navigation links, slogans, adverts and general decorative junk.
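
As a rough illustration of how an indexer might use the new tags, here is a minimal Python sketch that keeps only the text inside <article> and ignores anything inside <nav> or <aside>. The class name ArticleTextExtractor, the helper article_text and the sample page are all made up for this example, and it assumes the page really does mark up its sidebar and article with those tags. It isn't how Google or any real search engine works, just a way of showing that the separation becomes a simple job once the labels exist.

```python
from html.parser import HTMLParser


class ArticleTextExtractor(HTMLParser):
    """Collects text found inside <article>, skipping <nav> and <aside>."""

    SKIP = {"nav", "aside"}  # blocks treated as "not the article" (an assumption)

    def __init__(self):
        super().__init__()
        self.article_depth = 0  # how many <article> tags we are currently inside
        self.skip_depth = 0     # how many <nav>/<aside> tags we are currently inside
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.article_depth += 1
        elif tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "article" and self.article_depth:
            self.article_depth -= 1
        elif tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when inside an article and outside any skipped block.
        if self.article_depth and not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def article_text(html):
    """Return the article-only text of a page as one string."""
    parser = ArticleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical blog page: the sidebar mentions a post title, the article does not.
PAGE = """
<aside><h2>Recent posts</h2><a href="#">Einstein and Fish</a></aside>
<article><h1>Black holes</h1><p>Nothing about aquatic life here.</p></article>
"""

print("Einstein and Fish" in article_text(PAGE))  # False
```

With the sidebar labelled as <aside>, the "recent posts" title no longer counts as a match for this page.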

Client-side reformatting
Some web designers react with outrage at the idea that a browser might display their precious page with a different layout to the one that they carefully designed (to look good on their nice 19" flat-screen monitor).
But people are increasingly looking at web pages on a range of devices including mobile phones and ebook readers, and although website designers can in theory produce separate style sheets that allow a page to be displayed with different layouts on every size of device, in practice there's an awful lot who don't bother (including me! :) ). If we use a dedicated blog site, we maybe hope that the site's engineering people will do all that for us, automatically. With CSS-based layouts, some designers tend to go for absolute pixel widths, and frankly, we don't know what devices and screen sizes might be most important a year from now.

Semantic labeling allows dedicated browsers built into these devices to have a good attempt at reformatting and reflowing pages to fit their own tiny screens, by being able to tell which blocks of HTML are the important page content, and which blocks are just there for decoration or navigation.

New Navigation Tools
One of the results of these new tags is that we can expect to see mini-browsers starting to sprout some new navigation buttons. If you have a long page with several sections that takes several sheets to print out, with a figure or two, an inset box with supplementary material, and a navigation bar, then the layout designed for a large screen is going to be hopeless on an iPhone. So what would be cool on an Android mobile phone browser or iPhone would be a function that scans for <section> tags, and then provides additional [<section][section>] buttons that let you skip forwards or backwards through a page. Inset panels with additional info that the designer has "artily" set into the side of the article could be identified by their HTML5 <aside> tag and stripped out and made available on a separate button as [info]. Similarly, if the author produced a number of figures that are referred to in the text, and marked them with the <figure> tag, it'd be handy if the browser could scan for these when the page is loaded, and provide a [figure] button if it finds one, and [<figure][figure>] navigation buttons if it finds several. And it'd also be really handy on a small screen to be able to strip out the navigation bar and put that onto a separate [nav] button, too.
In fact, if this caught on, it'd also be great to be able to jump around a page using these buttons on a conventional "full-size" browser, too.
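
The button idea can be sketched in a few lines of Python. This is only a guess at how a mini-browser might decide which buttons to show: the names TagCounter and plan_buttons are invented for the example, and applying the "one gets a single button, several get back/forward buttons" rule to <section> as well as <figure> is my own assumption.

```python
from collections import Counter
from html.parser import HTMLParser

WATCHED = ("section", "figure", "aside", "nav")


class TagCounter(HTMLParser):
    """Counts how often each watched tag opens in a page."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag in WATCHED:
            self.counts[tag] += 1


def plan_buttons(html):
    """Return the list of button labels a browser might offer for this page."""
    counter = TagCounter()
    counter.feed(html)
    counter.close()

    buttons = []
    for tag in ("section", "figure"):
        n = counter.counts[tag]
        if n > 1:
            # Several of them: offer back/forward navigation.
            buttons += ["[<%s]" % tag, "[%s>]" % tag]
        elif n == 1:
            # Only one: a single jump button is enough.
            buttons.append("[%s]" % tag)
    if counter.counts["aside"]:
        buttons.append("[info]")  # inset panels go behind one button
    if counter.counts["nav"]:
        buttons.append("[nav]")   # so does the navigation bar
    return buttons


# Hypothetical page: two sections, one figure, one inset panel, one nav bar.
PAGE = (
    "<nav><a href='/'>Home</a></nav>"
    "<section><figure><img src='a.png'></figure></section>"
    "<section><aside>Extra info</aside></section>"
)

print(plan_buttons(PAGE))
# ['[<section]', '[section>]', '[figure]', '[info]', '[nav]']
```

A page that uses none of the new tags gets no buttons at all, which is roughly the situation a <div>-only page leaves the browser in.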

Accessibility
Finally, if you think that it's difficult navigating a modern "fancy" webpage on a mobile phone, imagine how frustrating it must be if you're sight-impaired, and are using an automated text reader. If you're navigating a page "by ear", it could be useful to be able to find your place again by skipping backwards and forwards a section at a time, until you find a title or intro paragraph that you recognise ... or to be able to jump back and forth between a current reading position and the navigation options, no matter where the designer has put those navigation buttons on the page, or where they happen to appear in the webpage's source code.

One of the problems with CSS, wonderful though it is, is that it allows the designer to place any element in any part of the HTML file, onto any part of the page. This means that the sequential order of chunks of HTML in the file doesn't necessarily correspond to the order that they have on the screen. A navigation bar that appears at the top of the screen might appear at the bottom of the code. Labelling the sections logically, in a standardised way, gives audio navigation software the chance of finding key sections of a page and treating them appropriately. For companies and government departments that have disability access policies (and requirements!), adopting HTML5 tags and using them consistently on new projects would be a good initiative both for supporting future standards and for potentially improving long-term disability access.
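
To make that a bit more concrete, here is a small Python sketch of the kind of index that audio navigation software could build: a list of the labelled blocks in source order, each with its first heading as a title, so that the reader can jump to "the navigation" or "the next section" wherever those sit in the file. The names LandmarkIndex and landmarks are invented for this example, it assumes the tags are properly nested, and it isn't based on how any actual screen reader works.

```python
from html.parser import HTMLParser

LANDMARKS = {"nav", "section", "article", "aside"}
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class LandmarkIndex(HTMLParser):
    """Records each labelled block, in source order, with its first heading."""

    def __init__(self):
        super().__init__()
        self.found = []        # [tag, title] pairs in source order
        self.open_stack = []   # indexes into self.found for blocks still open
        self.in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in LANDMARKS:
            self.found.append([tag, None])
            self.open_stack.append(len(self.found) - 1)
        elif tag in HEADINGS:
            self.in_heading = True

    def handle_endtag(self, tag):
        if tag in LANDMARKS and self.open_stack:
            self.open_stack.pop()  # assumes well-nested markup
        elif tag in HEADINGS:
            self.in_heading = False

    def handle_data(self, data):
        # The first heading text inside the innermost open block becomes its title.
        if self.in_heading and self.open_stack and data.strip():
            entry = self.found[self.open_stack[-1]]
            if entry[1] is None:
                entry[1] = data.strip()


def landmarks(html):
    """Return (tag, title) tuples for every labelled block, in source order."""
    index = LandmarkIndex()
    index.feed(html)
    index.close()
    return [(tag, title) for tag, title in index.found]


# Hypothetical page whose navigation bar sits at the bottom of the source.
PAGE = """
<section><h2>Intro</h2><p>...</p></section>
<section><h2>Details</h2><p>...</p></section>
<nav><a href="/">Home</a></nav>
"""

found = landmarks(PAGE)
print(found)
# [('section', 'Intro'), ('section', 'Details'), ('nav', None)]

# The reader can locate the navigation without caring where it sits in the file.
print([i for i, (tag, _) in enumerate(found) if tag == "nav"])  # [2]
```

The section titles give the listener something to recognise when skipping back and forth, and the position of the <nav> block stops mattering.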
