January 2010

An Incomplete Directory of Open Standards

Through this open world I’m about to ramble,
Through ice and snows, sleet and rain,
I’m about to ride that mornin’ railroad,
P’raps I’ll die on that train.

During the panel discussion at the recent British Computer Society Open Source event, there was discussion (and confusion) about Open Source versus Open Standards. I was asked “So, can you give us some examples of Open Standards?” I rattled off a few, but I thought I’d add a few more here. There is a lot more to be said on the topic, but a good place to start is to list the standards that I think are important.

If I get the time, I plan to turn this into a nice diagram that is much more easily digestible. If there are important standards that I’ve forgotten and that anyone interested in web sites should know about, please let me know in the comments. I’ve avoided worrying about file formats (e.g. PNG, MPEG, PDF). And REST isn’t a standard – it is an architectural style that was developed in parallel with the HTTP/1.1 protocol. I’m sure there are many important ones I’ve left out though.

This is a long and boring post with a record-breaking number of acronyms. So maybe you should stop reading now.

The Internet Plumbing

These standards are the plumbing of the Internet. Like the sewers under a big city, they are impossible to change and will be there forever. They’re infrastructure. Some people are saying Twitter has already become infrastructure, but I’m not convinced about that yet. These standards are split into layers. The Link Layer is about the physical connection to a network and includes standards such as Ethernet. The Internet Layer routes packets of information across one or more networks using the Internet Protocol (IP). The Transport Layer is responsible for delivering messages between hosts, and uses standards such as TCP (reliable, ordered delivery) or UDP (best-effort delivery). Finally, the Application Layer provides higher level application specific protocols such as DNS, HTTP (and WebDAV) for web servers, FTP, SMTP for mail servers, NTP for time servers, LDAP for user directories and so on. But I’m not here to talk about any of these. I want to talk about the standards that sit on top of these, specifically for web pages.
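To make the layering a bit more concrete, here is a minimal Python sketch in which an application-layer HTTP exchange rides on top of a TCP connection over the loopback interface. It is only an illustration: the one-shot server, the function names (serve_once, fetch, demo) and the /index.html path are all assumptions made up for this example, and a real web server does far more.

```python
import socket
import threading


def serve_once(server_sock):
    """Accept one TCP connection and answer with a tiny HTTP response."""
    conn, _ = server_sock.accept()
    with conn:
        request = conn.recv(1024).decode("ascii")
        first_line = request.split("\r\n")[0]  # e.g. "GET /index.html HTTP/1.1"
        body = f"you asked for: {first_line}"
        response = (
            "HTTP/1.1 200 OK\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Connection: close\r\n"
            "\r\n"
            f"{body}"
        )
        conn.sendall(response.encode("ascii"))


def fetch(host, port, path):
    """Application layer (HTTP text) carried by the transport layer (TCP)."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(1024)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii")


def demo():
    """Run the hypothetical one-shot server on a free loopback port and query it."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(server,))
    worker.start()
    reply = fetch("127.0.0.1", port, "/index.html")
    worker.join()
    server.close()
    return reply
```

Calling demo() returns the raw response text. The code only ever deals with the HTTP text and the TCP socket; the IP and link layers underneath are handled by the operating system.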

Making Web Pages

[Image: First 10 Years of the W3C]

Let’s start with the standards we know and love that make up web pages. Of course we have HTML 4, XHTML and the eagerly awaited HTML 5. We make our HTML pretty using Cascading Style Sheets (CSS) and we interact with the page using the Document Object Model (DOM), which has a large number of associated standards. Note that AJAX is not a standard, despite what you might hear. The XMLHttpRequest DOM API (which can be used to implement AJAX) is currently a last call working draft and may be a W3C standard soon. Another client side standard, Scalable Vector Graphics (SVG), never really took off and probably never will.
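As a rough illustration of what “interacting with the page using the DOM” means, here is a small Python sketch using the standard library’s xml.dom.minidom, which implements a subset of the W3C DOM. In a browser you would do the same thing from JavaScript; the sample markup and the retitle function are made up for this example.

```python
from xml.dom import minidom

# Made-up, well-formed XHTML-style snippet used only for this example.
PAGE = '<html><body><h1 id="title">Open Standards</h1><p>Hello</p></body></html>'


def retitle(markup, new_text):
    """Parse the markup, change the first <h1> through DOM calls, serialise it."""
    doc = minidom.parseString(markup)
    heading = doc.getElementsByTagName("h1")[0]  # DOM lookup by tag name
    heading.firstChild.data = new_text           # replace the text node's content
    heading.setAttribute("class", "updated")     # add an attribute
    return doc.documentElement.toxml()


print(retitle(PAGE, "An Incomplete Directory"))
```

The method names (getElementsByTagName, setAttribute, firstChild) come from the DOM specification itself, which is why they look the same across languages.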

So we have standards to make interactive web pages that may or may not be semantically rich. But the world would be a better place if these pages could be accessed by as many people as possible. So we have accessibility standards as part of the Web Accessibility Initiative (WAI). These include Web Content Accessibility Guidelines (WCAG) for your web page and the imminent Accessible Rich Internet Applications (WAI-ARIA).

Excellent! Our HTML is neat, we’ve styled it, and all humans can interact with it. But what about the machines? They don’t understand our badly structured markup. If we want machines to be able to understand the content, we need to engage with the semantic web standards and Resource Description Framework (RDF). The UK public sector is keen on HTML+RDFa although this is not a W3C standard yet. You can query your RDF data set using SPARQL and define your ontology (formal representation of concepts and relations between them) using the Web Ontology Language (OWL, not WOL). While we’re at it, a related and very successful standard which touches my world is the Dublin Core Metadata Standard, which is an ISO standard. I like this good introduction to Semantic Web standards if you want to read more.
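The core idea behind RDF is that everything is stated as subject, predicate, object triples, and a SPARQL query is essentially a pattern matched against those triples. The Python sketch below is a toy illustration of that idea only: it is not an RDF library or a SPARQL engine, the example.org data is made up, and match and titles are hypothetical helper names. The only real identifier is the Dublin Core element namespace.

```python
DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace

# Made-up example data: (subject, predicate, object) triples.
TRIPLES = [
    ("http://example.org/posts/1", DC + "title", "An Incomplete Directory of Open Standards"),
    ("http://example.org/posts/1", DC + "creator", "http://example.org/people/me"),
    ("http://example.org/posts/2", DC + "title", "Another Post"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def titles(triples):
    """Roughly what `SELECT ?s ?title WHERE { ?s dc:title ?title }` asks for."""
    return {s: o for (s, _, o) in match(triples, predicate=DC + "title")}
```

A real triple store adds typed literals, blank nodes, joins across several patterns and much more, but the wildcard-matching shape is the same.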

The Biggest Standard of the Naughties

Extensible Markup Language (XML) is a hugely successful standard. If you judge the success of a standard by its adoption (which I do), it was the Hit of the Decade. XML Schema (XSD) has replaced the ill-thought-out DTD standard for defining XML structures. Other child standards include the node selection language XPath and query language XQuery. XSL Transformations (XSLT) is my favourite templating language. XML Inclusions (XInclude) joins XML documents together. They’ve also given us XForms to collect data – sadly it hasn’t taken off as I’d have liked.
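To give a flavour of XPath’s node selection, here is a short Python sketch using xml.etree.ElementTree from the standard library, which supports a limited subset of XPath (enough for attribute predicates like the one below). The sample document and the names_by_body function are made up for this example.

```python
import xml.etree.ElementTree as ET

# Made-up sample document listing a few standards and their standards body.
CATALOGUE = """<standards>
  <standard body="W3C"><name>XPath</name></standard>
  <standard body="W3C"><name>XSLT</name></standard>
  <standard body="OASIS"><name>UDDI</name></standard>
</standards>"""


def names_by_body(xml_text, body):
    """Select <standard> elements by attribute and return their <name> text."""
    root = ET.fromstring(xml_text)
    selected = root.findall(f"./standard[@body='{body}']")  # XPath-style predicate
    return [node.findtext("name") for node in selected]


print(names_by_body(CATALOGUE, "W3C"))
```

Full XPath 1.0 (functions, axes, arbitrary expressions) needs a fuller implementation than the standard library provides, so treat this as a taste of the syntax rather than the whole language.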

Also XML related, the Web Services Standards have given us a wonderful way to make remote services play together. The Holy Trinity behind Web Services are SOAP (previously Simple Object Access Protocol) to define the message formats, Web Services Description Language (WSDL) to give service descriptions and Universal Description, Discovery and Integration (UDDI) to find the services. UDDI is actually an OASIS specification, not W3C, but it fits better here.
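For a sense of what a SOAP message looks like, the Python sketch below builds a bare SOAP 1.1 envelope with the standard library. The envelope namespace is the real SOAP 1.1 one; the application namespace, the operation name and the build_envelope function are assumptions invented for this example. In practice the message shape would come from the service’s WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.org/standards"                # hypothetical service namespace


def build_envelope(operation, params):
    """Wrap a hypothetical operation call in a SOAP Envelope/Body."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


print(build_envelope("FindStandard", {"name": "UDDI"}))
```

The resulting XML would normally be sent in the body of an HTTP POST to the service endpoint.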

OASIS standards

All of the standards mentioned so far are open, and unless otherwise stated, are looked after by the World Wide Web Consortium (W3C) and the Internet Engineering Task Force (IETF). These guys look after the Web as we know it. However, there are other standards bodies that create open standards that are more application specific, and some bodies that create standards which might not be considered truly open. Below are some of the important ones.

OASIS (Organization for the Advancement of Structured Information Standards), in their own words, “drives the development, convergence and adoption of open standards for the global information society”. The OASIS standards that touch my world include, in no particular order:

Authentication and Private Data Portability

OASIS tends to focus on fairly large, complex standards. These are always at risk from smaller standards, which are often easier to implement and so have less of a barrier to adoption. The standards that I think will beat SAML include OpenID, which has taken the web by storm recently, and OAuth. OpenID (under the OpenID Foundation) is a web single sign-on protocol similar to SAML. OAuth (now under the IETF) allows a site to request private user data from another site. Both OpenID and OAuth rely on XRDS. While we’re talking about users and social networks, other important not-quite-standards are listed below. A great article to learn more about these is the “Overlap of identity technologies” worked example from Google.
  • OpenSocial (Google) – for building social applications (widgets) and sharing data across networks
  • Friend of a Friend (FOAF) – defines an open technology for connecting social Web sites and the people in them. It uses RDF and OWL.
  • Portable Contacts – for moving your social graph around the internet with you

Content Repository Access, and Java Community Standards

CMIS is a Content Repository access standard. Another very successful repository standard you all know well is SQL, which has been a standard with both the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO) for over 20 years. File system standards haven’t seen the same joy, with most major operating systems using their own standard.
Another open content repository standard is the Java Content Repository (JCR) from the Java Community Process (JCP) Programme. Now while these standards are Java language focussed, they are still open. JCP standards are defined in Java Specification Requests (JSRs), of which there are over 300. Some important, well adopted JCP standards include:


For syndication we have RSS, which is looked after by the RSS Advisory Board (the guy that fixes my boiler is on it), and Atom, which together with its companion publishing protocol AtomPub is an IETF standard. An extension to these, PubSubHubbub, is a Google project which adds near-realtime notification to RSS and Atom feeds. My boiler guy thinks this specification has holes. For outlines, we have OPML (Outline Processor Markup Language). For example, here is my blogroll as OPML.
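A blogroll in OPML is just a tree of outline elements, where feed entries carry an xmlUrl attribute. Here is a minimal Python sketch that pulls the feeds out of such a file. The sample blogroll and the feeds function are made up for illustration and are not my actual blogroll.

```python
import xml.etree.ElementTree as ET

# Made-up OPML blogroll; entries with an xmlUrl attribute point at a feed.
BLOGROLL = """<opml version="2.0">
  <head><title>Blogroll</title></head>
  <body>
    <outline text="Example Blog" type="rss" xmlUrl="http://example.org/feed.xml" htmlUrl="http://example.org/"/>
    <outline text="Folder">
      <outline text="Nested Blog" type="rss" xmlUrl="http://example.net/rss"/>
    </outline>
  </body>
</opml>"""


def feeds(opml_text):
    """Return (title, feed URL) pairs for every outline that has an xmlUrl."""
    root = ET.fromstring(opml_text)
    return [
        (outline.get("text"), outline.get("xmlUrl"))
        for outline in root.iter("outline")  # walks nested outlines too
        if outline.get("xmlUrl")
    ]


print(feeds(BLOGROLL))
```

Outlines without an xmlUrl (such as the folder above) are treated as grouping nodes and skipped.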

Things that start with Open

I thought I’d end with some things I like that aren’t actually standards, but use the word Open in their title.

  • OpenSearch – A set of formats designed to make sharing search results easier
  • OpenStreetMap – Creates and provides free geographic data such as street maps to anyone who wants them. This is more about Open Data than Open Standards, but anyway.
  • OpenCalais – A service that semantically parses your content and identifies people, events, places and more. I used the WordPress plugin Tagaroo on this blog for fun. Only basic use is free, though. Probably doesn’t really belong here. However, below is a screenshot that shows Tagaroo suggesting tags and images for this blog post. The power of semantic analysis.

