Microformats in the Semantic Web

What are Microformats?

Microformats are a way of adding simple markup to human-readable data items such as events, contact details or locations, on web pages, so that the information in them can be extracted by software and indexed, searched for, saved, cross-referenced or combined.

More technically, they are items of semantic markup that use only standard “plain old semantic (X)HTML” (i.e. “POSH”) with a set of common class names and “rel” values. They are open and freely available for anyone to use.

Why Microformats

Why did we come up with microformats?
In short, microformats are the convergence of a number of trends:

  1. a logical next step in the evolution of web design and information architecture
  2. a way for people and organisations to publish richer information themselves, without having to rely upon centralized services
  3. an acknowledgement that (outside of specialist areas) “traditional” metadata efforts have either failed or taken so long to garner any adoption that a new approach was necessary
  4. a way to use (X)HTML for data.

The semantic Web

The semantic Web is an evolving extension of the Web. Its premise is that Web content can be expressed not only in natural language but also in a format that software agents can read and use. This lets those agents find, share, and integrate information more easily.

While the semantic Web is designed for machines first, microformats are designed for humans first. The goal of microformats is to create a web of data that anyone can publish and consume. The barrier to entry is low: anyone with an understanding of XHTML can publish data marked up with microformats.

Microformats in practice

A common application of microformats is providing contact or event data. The hCard microformat specification provides guidelines for including contact data within a Web page.

The hCard standard is a simple, open, and distributed format for representing people, companies, organisations, and places. It closely follows the vCard standard and defines a specific name for each piece of contact data.

The different data elements are specified using the class attribute (all class names are lowercase). The complete contact card is identified by the vcard class, so this class is applied to a DIV element that contains the complete contact information. Individual data elements on the card are designated with the appropriate class name. For example, a person’s state is designated by the region class.

The following listing provides a look at a possible hCard for myself. It lists my name, organisation, city (Holtsville), state (New York), and country (USA).

<div id="hcard-Ashish" class="vcard">
  <a class="url fn" href="http://www.hurricanesoftwares.com/">Ashish</a>
  <div class="org">Hurricanesoftwares.com</div>
  <div class="adr">
    <span class="locality">Holtsville</span>,
    <span class="region">NY</span>
    <span class="country-name">USA</span>
  </div>
</div>

Because this is standard XHTML, I can include it in any Web page. Applications that understand the hCard format can read the data, and standard CSS can format it for presentation.
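To make "read by other applications" concrete, here is a minimal sketch of how a program might pull the hCard fields out of the listing above. It uses only Python's standard html.parser module. The names HCardParser, parse_hcard, and HCARD_PROPERTIES are hypothetical, and the property list is a small assumed subset of what hCard defines. It is an illustration, not a complete hCard parser.

```python
from html.parser import HTMLParser

# Assumed subset of hCard property class names (hCard defines many more).
HCARD_PROPERTIES = {"fn", "org", "locality", "region", "country-name"}

# Elements with no closing tag; they must not be pushed onto the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}

SAMPLE = """
<div id="hcard-Ashish" class="vcard">
  <a class="url fn" href="http://www.hurricanesoftwares.com/">Ashish</a>
  <div class="org">Hurricanesoftwares.com</div>
  <div class="adr">
    <span class="locality">Holtsville</span>,
    <span class="region">NY</span>
    <span class="country-name">USA</span>
  </div>
</div>
"""


class HCardParser(HTMLParser):
    """Collects the text of elements whose class names are hCard properties."""

    def __init__(self):
        super().__init__()
        self.card = {}
        # One entry per open element: the hCard properties named in its class.
        self._open = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        self._open.append([c for c in classes if c in HCARD_PROPERTIES])

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._open:
            self._open.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or not self._open:
            return
        # Text is credited only to the innermost open element.
        for name in self._open[-1]:
            self.card[name] = self.card.get(name, "") + text


def parse_hcard(markup):
    """Return a dict mapping hCard property names to their text."""
    parser = HCardParser()
    parser.feed(markup)
    parser.close()
    return parser.card


if __name__ == "__main__":
    print(parse_hcard(SAMPLE))
```

The sketch ignores nested cards and every hCard rule beyond "class name marks the value", which is why a real project would reach for an existing microformats parser instead.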

The hCard Creator tool provides an easy way to assemble the appropriate hCard for a contact.

Another common use of microformats is providing information about events. This is accomplished with the hCalendar format.

The hCalendar specification is an open standard based on the iCalendar standard. The hCalendar format follows the approach used by the hCard standard; that is, class names are used to tag data elements.

The complete event is contained within a DIV element that is assigned the vevent class name. Individual aspects of the calendar entry are contained within this DIV element. The start and end dates are marked by the dtstart and dtend class names, with the title attribute holding the full date in machine-readable ISO 8601 form while the element text shows a human-friendly date.
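As a rough illustration of that structure, the sketch below embeds a small hCalendar entry and reads its start and end times from the title attributes. The event itself (summary, times, location) is invented for the example, the abbr elements follow the common hCalendar date pattern, and the names EventDateParser and parse_event_dates are hypothetical. It assumes the title values are ISO 8601 date-times that Python's datetime.fromisoformat accepts.

```python
from datetime import datetime
from html.parser import HTMLParser

# Hypothetical event; the summary, times, and location are invented.
EVENT = """
<div class="vevent">
  <span class="summary">Microformats meetup</span>:
  <abbr class="dtstart" title="2008-03-14T18:00:00">March 14, 6pm</abbr> to
  <abbr class="dtend" title="2008-03-14T20:00:00">8pm</abbr>, at
  <span class="location">Holtsville, NY</span>
</div>
"""


class EventDateParser(HTMLParser):
    """Reads dtstart and dtend values from title attributes."""

    def __init__(self):
        super().__init__()
        self.dates = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        for name in ("dtstart", "dtend"):
            if name in classes and attributes.get("title"):
                # The title holds the machine-readable date; the element
                # text is only the human-readable form.
                self.dates[name] = datetime.fromisoformat(attributes["title"])


def parse_event_dates(markup):
    """Return a dict with the parsed dtstart and dtend datetimes."""
    parser = EventDateParser()
    parser.feed(markup)
    parser.close()
    return parser.dates


if __name__ == "__main__":
    dates = parse_event_dates(EVENT)
    print(dates["dtstart"], dates["dtend"], dates["dtend"] - dates["dtstart"])
```

Once the dates are real datetime values, an application can sort events, compute durations, or export them to a calendar, which is the point of keeping the full date in the markup.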

The hCalendar Creator is available for marking up your own calendar entries. (Note: I could not get it to work in Internet Explorer 7, but it worked fine in Firefox.) Like hCard data, you can easily present the data on a Web page and style it with CSS while the data is still available in the hCalendar format for use by other applications.

Industry support

I’m happy to say that the IT industry is finally starting to embrace microformats. Yahoo! has been a big proponent of microformats since their inception. The Eventful site uses them, as does the photo-sharing site Flickr. Even Microsoft recognizes the technology, as shown in this blog post about using microformats with SharePoint. The Twitter site also embraces the hCard standard, and Firefox offers the Operator add-on to provide microformats support within the browser.

There are tools for working with microformats in numerous development languages. A good example is Sumo, which offers a microformats parser for JavaScript. For Perl, the Text::Microformat module offers a microformat parser.

Adding context

A key concept of microformats is that they are designed for humans first and machines second. Their purpose is to create larger, more reliable webs of data, published by more people, and they offer a low-cost, efficient way to build such a web. The microformats site has more on the formats covered in this article and on the others currently available.

