Oli.jp

HTML5 structure—HTML 4 and XHTML 1 to HTML5

We’ve covered a lot of ground so far. To recap, HTML5 has several new sectioning content elements that we can use to give relevant parts of web pages more semantic meaning. These new elements are for ‘chunks of related content’ — basically a logical section of the document:

New ‘sectioning content’ elements in a nutshell

  • <section> — a chunk of related content
  • <article> — an independent, self-contained chunk of related content that still makes sense on its own (e.g. in an RSS feed)
  • <aside> — a chunk of content that is tangentially related to the content that surrounds it, but isn’t essential for understanding that content
  • <nav> — a major navigation block (generally site or page navigation)
  • (cf. <div> — a chunk of content with no additional semantics, e.g. for CSS styling hooks)

With very few exceptions (generally in web applications) these new sectioning content elements should have a title, possibly in a <header> element with any other introductory information. We can use this as a rule of thumb for deciding between <section> and <div>:

Consciously add a title for each <section>, even if you then hide the title with CSS (as is generally the case with <nav>, where the title is kept for accessibility). If it seems like content that shouldn’t have a title when CSS is disabled, then it’s most probably not a <section>.
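As a minimal sketch of this rule of thumb, the fragment below gives a <nav> a real title and hides it visually, while a wrapper with no natural title stays a <div>. The visually-hidden and columns class names and the link targets are placeholders invented for illustration, and the hiding rules are just one common approach, not something HTML5 prescribes.

```html
<!-- In the page <head> -->
<style>
  /* Hypothetical utility class: hidden on screen, still read by screen readers */
  .visually-hidden {
    position: absolute;
    width: 1px;
    height: 1px;
    margin: -1px;
    padding: 0;
    border: 0;
    overflow: hidden;
    clip: rect(0 0 0 0);
  }
</style>

<!-- In the page <body> -->
<nav>
  <!-- The title exists in the markup, so it shows up when CSS is disabled -->
  <h2 class="visually-hidden">Site navigation</h2>
  <ul>
    <li><a href="/">Home</a></li>
    <li><a href="/articles/">Articles</a></li>
  </ul>
</nav>

<!-- A styling hook with no sensible title: a <div>, not a <section> -->
<div class="columns">
  …
</div>
```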

The new sectioning content elements can also contain one or more <footer> elements with additional information, such as author (<address>) or copyright (<small>) info, related links etc. It’s important to note that <header> and <footer> apply to the sectioning content element they’re in (this is <body> for a page header or footer). <header> and <footer> can’t contain other <header>s or <footer>s.
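A small sketch of how this scoping might look in practice, assuming a page with a single article. All text content and the /about/ link are placeholders; the point is only which element each <header> and <footer> belongs to.

```html
<body>
  <!-- Page header: its nearest sectioning ancestor is <body> -->
  <header>
    <h1>Site name</h1>
  </header>

  <article>
    <!-- Article header: applies to this <article> only -->
    <header>
      <h1>Article title</h1>
      <p>Article metadata</p>
    </header>
    <p>Article content…</p>
    <!-- Article footer: author and copyright information for the article -->
    <footer>
      <address>Written by <a href="/about/">Author name</a></address>
      <p><small>Copyright notice for this article</small></p>
    </footer>
  </article>

  <!-- Page footer: again scoped to <body> -->
  <footer>
    <p><small>Copyright notice for the whole site</small></p>
  </footer>
</body>
```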

Finally, while the words “header”, “footer” and “aside” all come with preconceptions, their semantic meaning comes from the types of content they contain, not from their presentation or relative placement. For example, <aside> could contain a footnote, and a <footer> containing a ‘Top of Page’ link could appear at both the top and bottom of a section.
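To make that concrete, here is a hedged sketch of a section that uses both ideas: an <aside> holding a footnote, and a <footer> with a ‘Top of Page’ link at both the top and the bottom. The note-1 id and the text are placeholders made up for the example.

```html
<section>
  <!-- A footer at the top: defined by its content, not its position -->
  <footer><a href="#top">Top of Page</a></footer>

  <h2>Section title</h2>
  <p>Section content that needs a footnote.<sup><a href="#note-1">1</a></sup></p>

  <!-- An aside that is a footnote rather than a sidebar -->
  <aside>
    <p id="note-1">1. Footnote text, tangential to the section content.</p>
  </aside>

  <footer><a href="#top">Top of Page</a></footer>
</section>
```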

Now let’s look at example structures for a basic article page, using the standard layout of a page header (with logo etc), navigation tabs, a main column, a side column, and a page footer.

Converting a simple page to HTML5

Here’s the outline of the parts of our page:

Article Page Layout
  • Page header (site name, logo, search…)
  • Main navigation
  • Main content (wrapper)
    • Article (main column)
      • Article title
      • Article metadata
      • Article content…
      • Article footer
    • Sidebar
      • Sidebar title
      • Sidebar content…
  • Page Footer

So let’s write that in standard POSH HTML 4:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
  <head>
    <title>Article (HTML 4)</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    <div id="branding">
      <h1>Site name</h1>
      <!-- other page heading content -->
    </div>
    <ul id="nav">
      <li>Site navigation</li>
    </ul>
    <div id="content">
      <div id="main"> <!-- main content (the article) -->
        <h1>Article title</h1>
        <p class="meta">Article metadata</p>
        <p>Article content…</p>
        <p class="article-footer">Article footer</p>
      </div>
      <div id="sidebar"> <!-- secondary content -->
        <h2>Sidebar title</h2>
        <p>Sidebar content…</p>
      </div>
    </div>
    <div id="footer">Footer</div>
  </body>
</html>

And here’s the same page in standard POSH XHTML 1.0:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
    <title>Article (XHTML 1)</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  </head>
  <body>
    <div id="branding">
      <h1>Site name</h1>
      <!-- other page heading content -->
    </div>
    <ul id="nav"><li>Site navigation</li></ul>
    <div id="content">
      <div id="main"> <!-- main content (the article) -->
        <h1>Article title</h1>
        <p class="meta">Article metadata</p>
        <p>Article content…</p>
        <p class="article-footer">Article footer</p>
      </div>
      <div id="sidebar"> <!-- secondary content -->
        <h2>Sidebar title</h2>
        <p>Sidebar content…</p>
      </div>
    </div>
    <div id="footer">Footer</div>
  </body>
</html>

Now let’s convert that to HTML5, using the new structural elements:

<!-- 'HTML-style' HTML5 -->
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Article (HTML5)</title>
  </head>
  <body>
    <header id="branding"><!-- page header (not in section etc) -->
      <h1>Site name</h1>
      <!-- other page heading content -->
    </header>
    <nav>
      <ul><li>Main navigation</li></ul>
    </nav>
    <div id="content"> <!-- wrapper for CSS styling and no title so not section -->
      <article><!-- main content (the article) -->
        <header>
          <h1>Article title</h1>
          <p>Article metadata</p>
        </header>
        <p>Article content…</p>
        <footer>Article footer</footer>
      </article>
      <aside id="sidebar"><!-- secondary content for page (not related to article) -->
        <h3>Sidebar title</h3> <!-- ref: HTML5-style heading element levels -->
        <p>Sidebar content</p>
      </aside>
    </div>
    <footer id="footer">Footer</footer><!-- page footer -->
  </body>
</html>

<!-- 'XHTML-style' HTML5 -->
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title>Article (HTML5)</title>
  </head>
  <body>
    <header id="branding"><!-- page header (not in section etc) -->
      <h1>Site name</h1>
      <!-- other page heading content -->
    </header>
    <nav>
      <ul><li>Main navigation</li></ul>
    </nav>
    <div id="content"> <!-- wrapper for CSS styling and no title so not section -->
      <article><!-- main content (the article) -->
        <header>
          <h1>Article title</h1>
          <p>Article metadata</p>
        </header>
        <p>Article content…</p>
        <footer>Article footer</footer>
      </article>
      <aside id="sidebar"><!-- secondary content for page (not related to article) -->
        <h3>Sidebar title</h3> <!-- ref: HTML5-style heading element levels -->
        <p>Sidebar content</p>
      </aside>
    </div>
    <footer id="footer">Footer</footer><!-- page footer -->
  </body>
</html>

Note that here we assume the sidebar contains content not related to the article (such as recent articles), so it’s a descendant of <body> (a page sidebar), not of <article>. If it only contained content tangentially related to the article, we could make the <aside> a child of <article>. We also assume that the page header and footer don’t contain nested <header> or <footer> elements; a complex page header or footer requiring these would need its own <section>.
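The two placements of <aside> described above might look like this. This is a sketch only: the headings and text are placeholders, and the heading levels are one reasonable choice rather than a requirement.

```html
<div id="content">
  <article>
    <h1>Article title</h1>
    <p>Article content…</p>
    <!-- Related to this article only, so it is a child of <article> -->
    <aside>
      <h2>Background</h2>
      <p>Content tangentially related to the article…</p>
    </aside>
  </article>

  <!-- Related to the page as a whole, so it sits outside <article> -->
  <aside id="sidebar">
    <h2>Recent articles</h2>
    <p>Sidebar content…</p>
  </aside>
</div>
```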

doctype, charset & XHTML-style markup

You’ll notice the doctype and charset are both much simpler. HTML5 is case-insensitive, but WHATWG recommend this style of doctype as it will also work in XHTML (which is case-sensitive). While this style of charset declaration is recommended, the pre-HTML5 charset declarations are still valid. Also, in the XHTML-style code examples you’ll note that the charset <meta> element still has an XHTML-style trailing slash in the HTML5 example. In fact XHTML-style markup (a closing “/” on empty elements) like this is also valid HTML5! This makes it very easy to migrate to HTML5 from both HTML and XHTML pages. You should try to avoid mixing HTML and XHTML-style code, however: choose one style and stick with it.

HTML5 or XHTML5? Choose HTML5

If you currently use XHTML 1.x you might be thinking of using XHTML5, the XML-compatible version of HTML5. If your website will have a general audience (in other words, ‘people using IE’), don’t. XHTML5 must be sent with an XML MIME type (like application/xhtml+xml), and even IE8 still doesn’t support this. However, the hallmarks of XHTML coding (writing elements in lower case, correct nesting, closing tags, adding optional elements that add meaning, quoting attribute values) are all either compatible with HTML5 (which is case-insensitive) or encouraged in it.

Browser support (via CSS and JS)

So, does it work? Currently the HTML5 structural elements will work in modern browsers (Firefox 3+, Safari 3+, Opera 9+, Chrome 1+) as long as we declare them as block-level elements via this CSS:

/* Declaring HTML5 elements */
article,aside,details,figcaption,figure,footer,header,hgroup,nav,section,summary{
  display: block;
  }

Additionally, in Internet Explorer 8 and below we need to hack support in via JavaScript (I bet you didn’t see that coming ;-) At its most basic, the method is to create each of the new HTML5 elements once with document.createElement(), after which IE <9 will know about (and be able to style) them:

/*@cc_on'abbr article aside audio canvas details figcaption figure footer header hgroup mark menu meter nav output progress section summary time video'.replace(/\w+/g,function(n){document.createElement(n)})@*/
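For readability, the one-liner above could be written out roughly as follows. This is only a sketch of the same technique: it swaps the JScript conditional-compilation comment for an HTML conditional comment so that only IE 8 and below run it, and the variable names are my own.

```html
<!--[if lt IE 9]>
<script>
  (function () {
    // Same element list as the one-liner above
    var names = ('abbr article aside audio canvas details figcaption figure ' +
                 'footer header hgroup mark menu meter nav output progress ' +
                 'section summary time video').split(' ');
    // Creating each element once is enough for IE <9 to treat it as a
    // known element that can be styled; the created nodes are discarded.
    for (var i = 0; i < names.length; i++) {
      document.createElement(names[i]);
    }
  }());
</script>
<![endif]-->
```

Like the one-liner, this needs to run in the page <head>, before IE parses any of the new elements in the <body>.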

The actual script you should use is longer than this, to support things like printing. The recommended way to use it is to host it yourself by grabbing the latest version from the html5shiv repo (if in doubt copy the content of html5shiv-printshiv.js), saving it on your server, and linking it in the page <head>:

<!--[if lt IE 9]>
<script src="/js/html5shiv-printshiv.js"></script>
<![endif]-->

The HTML5 Shiv is still in active development but there’s no good CDN version to link to currently, so try to keep it updated.

So, all together now…

<head>
  …
  <style type="text/css" media="screen">
  /* Declaring HTML5 elements */
  article,aside,canvas,details,figcaption,figure,footer,header,hgroup,menu,nav,section,summary{
    display: block;
    }
  </style>
  <!--[if lt IE 9]>
    <script src="/js/html5shiv-printshiv.js"></script>
  <![endif]-->
</head>

Finally, if you’re using the excellent HTML5 Boilerplate or Modernizr, note that these have the above CSS and JS shivs already built in.

…but IE requiring JS means we’re screwed, right?

You can easily choose not to worry about IE with JavaScript turned off on a personal weblog, but if an IE user has JS disabled the new elements won’t be recognised, their associated styling will be dropped, and the page will ass-plode (feel that déjà vu). While JavaScript is becoming more of a requirement with the rise of web apps, IE needing JavaScript will probably still be a show-stopper on commercial projects.

You might think that IE’s lack of support for these new elements without JavaScript means you can’t use HTML5 at all, but we can still benefit from HTML5’s greater semantic richness by using HTML5 semantic element names as class names on <div> (so-called “HTML4.5”), in HTML 4, XHTML 1 or HTML5. You’re probably already using a standard set of class and ID names anyway, and this is in effect a standardised set of semantic class names. HTML5 is basically a superset of HTML 4 and XHTML 1, so as long as you don’t use any new elements, HTML5 pages will work in IE. This approach also simplifies a future move to HTML5’s new elements, and if you use the HTML5 doctype you benefit from the more detailed HTML5 validators and specification.

Adding HTML5’s semantics via <div class="">

Here’s the HTML 4 version using HTML5 class names:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
  <head>
    <title>Article (HTML 4), with HTML5 class names</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    <div id="page-header" class="header"> <!-- page header (note class="header") -->
      <h1>Site name</h1>
      <!-- other page heading content -->
    </div>
    <ul id="main-nav" class="nav">
      <li>Site navigation</li>
    </ul>
    <div id="content">
      <div id="main" class="article"> <!-- main content -->
        <div class="header">
          <h1>Article title</h1>
          <p>Article metadata</p>
        </div>
        <p>Article content…</p>
        <p class="footer">Article footer</p>
      </div>
      <div id="sidebar" class="aside"> <!-- secondary content -->
        <h2>Sidebar title</h2>
        <p>Sidebar content…</p>
      </div>
    </div>
    <div id="page-footer" class="footer">Footer</div>
  </body>
</html>

Here’s the XHTML 1 version using HTML5 class names:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
    <title>Article (XHTML 1), with HTML5 class names</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  </head>
  <body>
    <div id="page-header" class="header"> <!-- page header -->
      <h1>Site name</h1>
      <!-- other page heading content -->
    </div>
    <ul id="main-nav" class="nav">
      <li>Site navigation</li>
    </ul>
    <div id="content">
      <div id="main" class="article"> <!-- main content -->
        <div class="header">
          <h1>Article title</h1>
          <p>Article metadata</p>
        </div>
        <p>Article content…</p>
        <p class="footer">Article footer</p>
      </div>
      <div id="sidebar" class="aside"> <!-- secondary content -->
        <h2>Sidebar title</h2>
        <p>Sidebar content…</p>
      </div>
    </div>
    <div id="page-footer" class="footer">Footer</div>
  </body>
</html>

Now in HTML5, again using <div> with HTML5 class names rather than the new HTML5 elements:

<!-- 'HTML-style' HTML5 -->
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Article (HTML5), with HTML5 class names</title>
  </head>
  <body>
    <div id="page-header" class="header"> <!-- page header -->
      <h1>Site name</h1>
      <!-- other page heading content -->
    </div>
    <ul id="main-nav" class="nav">
      <li>Site navigation</li>
    </ul>
    <div id="content">
      <div id="main" class="article"> <!-- main content -->
        <div class="header">
          <h1>Article title</h1>
          <p>Article metadata</p>
        </div>
        <p>Article content…</p>
        <p class="footer">Article footer</p>
      </div>
      <div id="sidebar" class="aside"> <!-- secondary content -->
        <h2>Sidebar title</h2>
        <p>Sidebar content…</p>
      </div>
    </div>
    <div id="page-footer" class="footer">Footer</div>
  </body>
</html>

<!-- 'XHTML-style' HTML5 -->
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title>Article (HTML5), with HTML5 class names</title>
  </head>
  <body>
    <div id="page-header" class="header"> <!-- page header -->
      <h1>Site name</h1>
      <!-- other page heading content -->
    </div>
    <ul id="main-nav" class="nav">
      <li>Site navigation</li>
    </ul>
    <div id="content">
      <div id="main" class="article"> <!-- main content -->
        <div class="header">
          <h1>Article title</h1>
          <p>Article metadata</p>
        </div>
        <p>Article content…</p>
        <p class="footer">Article footer</p>
      </div>
      <div id="sidebar" class="aside"> <!-- secondary content -->
        <h2>Sidebar title</h2>
        <p>Sidebar content…</p>
      </div>
    </div>
    <div id="page-footer" class="footer">Footer</div>
  </body>
</html>

You may be wondering why the HTML 4 and XHTML 1 versions and these HTML5 versions are so similar; after all, only the doctype and charset differ! That’s because one of HTML5’s core principles is compatibility. If we don’t use any new HTML5 elements, a change of doctype might be all that’s required to convert a well-coded HTML or XHTML page to HTML5.

Why bother with HTML5?

Hopefully by now you’re feeling excited about using HTML5 for a personal project. But if you’ve decided not to use HTML5’s new elements because IE doesn’t support them without JavaScript, what’s the point of thinking about HTML5 now? I see several benefits:

  1. Thinking about HTML5’s structural elements (even if we only express the semantics via class names as described above) will make our code more logical and semantic
  2. HTML5 is defined in far greater detail than previous HTML/XHTML specs, giving us more guidance in creating web pages
  3. Another benefit of this detail is more accurate validators (W3C, Validator.nu), with the potential for more detailed error messages
  4. If you think you might convert to HTML5 in the future, the HTML-5-elements-as-class-names approach should remove a lot of the pain of converting (especially with a little regexp magic)
  5. Now that XHTML 2 development is being halted, starting to learn about the official future of HTML is a Good Idea™
  6. Using HTML5 is a sliding scale, not all or nothing. You can get benefits from simply changing the doctype, a five-second job.
  7. Because browsers use the same parser for HTML5 as HTML 4 or XHTML 1, and because backwards compatibility is a central tenet, using an HTML5 doctype today has almost no disadvantages (make sure to check HTML5 differences from HTML 4, specifically sections 3.3 to 3.5).
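To illustrate point 4, here is a sketch of how a fragment written with HTML5 class names might map onto the new elements later. The ‘after’ markup is one plausible conversion based on the examples earlier in this article, not the only possible result.

```html
<!-- Before: HTML5 element names as class names -->
<div id="main" class="article">
  <div class="header">
    <h1>Article title</h1>
    <p>Article metadata</p>
  </div>
  <p>Article content…</p>
  <p class="footer">Article footer</p>
</div>

<!-- After: the new HTML5 elements -->
<article id="main">
  <header>
    <h1>Article title</h1>
    <p>Article metadata</p>
  </header>
  <p>Article content…</p>
  <footer>Article footer</footer>
</article>
```

Most of this is a mechanical rename, but matching each closing </div> to the right opening tag, and cases like <p class="footer">, are where a search-and-replace would need checking by hand.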

It’s possible to just change the doctype and get some benefits from having converted to HTML5 (when you use a validator :). However, the more time you put into HTML5 the greater the reward. You’ll get the most benefit from rethinking your site’s semantics from an HTML5 perspective, although for the present I’d recommend adding these extra semantics via the HTML-5-elements-as-class-names approach for commercial projects.