in FOAF

Who, what, where, when?

A “Who? what? where? when?” of the Semantic Web is taking shape nicely.

Danny Ayers shows some work with FOAF and the hCard microformat, picking up a theme first explored by Dan Connolly back in 2000: inter-conversion between RDF and HTML person descriptions. Danny generates hCards from SPARQL queries of FOAF, an approach which would pair nicely with GRDDL for going in the other direction.

Meanwhile at W3C, the closing days of the SW Best Practices Group have recently produced a draft of an RDF/OWL Representation of Wordnet. Wordnet is a fantastic resource, containing descriptions of pretty much every word in the English language. Anyone who has spent time in committees, deciding which terms to include in some schema/vocabulary, must surely feel the appeal of a schema which simply contains all possible words. There are essentially two approaches to putting Wordnet into the Semantic Web. A lexically-oriented approach, such as the one published at W3C for Wordnet 2.0, presents a description of words. It mirrors the structure of wordnet itself (verbs, nouns, synsets etc.). Consequently it can be a complete and unjudgemental reflection into RDF of all the work done by the Wordnet team.

The alternate, and complementary, approach is to explore ways of projecting the contents of Wordnet into an ontology, so that category terms (from the noun hierarchy) in Wordnet become classes in RDF. I made a simplistic approach at this some time back (see overview). It has appeal (alonside the linguistic version) because it allows RDF to be used to describe instances of classes for each concept in wordnet, with other properties of those instances. See WhyWordnetIsCool in the FOAF wiki for an example of Wordnet’s coverage of everyday objects.

So, getting Wordnet moving into the SW is great step. It gives us URIs to identify a huge number of everyday concepts. It’s coverage isn’t complete, and it’s ontology is kinda quirky. Aldo Gangemi and others have worked on tidying up the hierarchy; I believe only for version 1.6 of Wordnet so far. I hope that work will eventually get published at W3C or elsewhere as stable URIs we can all use.

In addition to Wordnet there are various other efforts that give types that can be used for the “what” of “who/what/where/when”. I’ve been talking with Rob McCool about re-publishing a version of the old TAP knowledge base. The TAP project is now closed, with Rob working for Yahoo and Guha at Google. Stanford maintain the site but aren’t working on it. So I’ve been working on a quick cleanup (wellformed RDF/XML etc.) of TAP that could get it into more mainstream use. TAP, unlike Wordnet, has more modern everyday commercial concepts (have a look), as well as a lot of specific named instances of these classes.

Which brings me to (Semantic) Wikipedia; another approach to identifying things and their types on the Semantic Web. A while back we added isPrimaryTopicOf to FOAF, to make it easier to piggyback on Wikipedia for RDF-identifying things that have Wiki (and other) pages about them. The Semantic Mediawiki project goes much much further in this direction, providing a rich mapping (classes etc.) into RDF for much of Wikipedia’s more data-oriented content. Very exciting, especially if it gets into the main codebase.

So I think the combination of things like Wordnet, TAP, Wikipedia, and instance-identifying strategies such as “isPrimaryTopicOf”, will give us a solid base for identifying what the things are that we’re describing in the Semantic Web.

And regarding. “Where?” and “when?” … on the UI front, we saw a couple of announcements recently: OpenLayers v1.0, which provides Google-maps-like UI functionality, but opensource and standards friendly. And for ‘when’, a similar offering: the timeline widget. This should allow for fancy UIs to be wired in with RDF calendar or RDF-geo tagged data.

Talking of which… good news of the week: W3C has just announced a Geo incubator group (see detailed charter), whose mission includes updates for the basic Geo (ie. lat/long etc) vocabulary we created in the SW Interest Group.

Ok, I’ve gone on at enough length already, so I’ll talk about SKOS another time. In brief – it fits in here in a few places. When extended with more lexical stuff (for describing terms, eg. multi-lingual thesauri) it could be used as a base for representing the lexically-oriented version of Wordnet. And it also fits in nicely with Wikipedia, I believe.

Last thing, don’t get me wrong — I’m not claiming these vocabs and datasets and bits of UI are “the” way to do ‘who/what/where/when’ on the Semantic Web. They’re each one of several options. But it is great to have a whole pile of viable options at last :)