RDFAuthor does SPARQL

Damian is working on a new version of RDFAuthor that generates SPARQL queries (instead of the older Squish notation). It can also (not sure which protocol(s)) get results from a query service. Here’s a screenshot:

Example screenshot of RDFAuthor for query authoring and result display. Nodes and arcs at top of window, tabular results at bottom

The automatically generated (and hence not 100% user-friendly) query was as follows:

SELECT *
 WHERE
 {
      ?var_4 <http://xmlns.filsa.net/emir/0.2/#subject> ?var_1 .
      ?var_4 <http://xmlns.filsa.net/emir/0.2/#from> ?var_5 .
      ?var_4 <http://xmlns.filsa.net/emir/0.2/#date> ?var_2 .
      ?var_5 <http://xmlns.com/foaf/0.1/name> ?var_3 .
      ?var_5 <http://xmlns.com/foaf/0.1/mbox> <mailto:pldms@mac.com> .
 }

What does this say? Pretty much just the following:

Match where we have (in some default RDF graph) the foaf:name of any things with a foaf:mbox of mailto:pldms@mac.com,
whenever there is something that has an emir:subject and emir:date, and that is also emir:from the first thing. SELECT from that the name, subject and date.
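
To make that reading concrete, here is a toy pattern matcher in Python. It is only a sketch of how a basic graph pattern binds variables, not how Joseki or any real engine works, and the sample triples (message ids, names, dates) are invented for illustration. It also treats every term as a plain string, ignoring the URI/literal distinction.

```python
# Toy basic-graph-pattern matcher. The data below is invented sample data.
EMIR = "http://xmlns.filsa.net/emir/0.2/#"
FOAF = "http://xmlns.com/foaf/0.1/"

TRIPLES = [
    ("_:msg1", EMIR + "subject", "Re: query syntax"),
    ("_:msg1", EMIR + "date", "2005-08-20"),
    ("_:msg1", EMIR + "from", "_:p1"),
    ("_:p1", FOAF + "name", "Example Sender"),
    ("_:p1", FOAF + "mbox", "mailto:pldms@mac.com"),
    ("_:msg2", EMIR + "subject", "Unrelated"),
    ("_:msg2", EMIR + "date", "2005-08-21"),
    ("_:msg2", EMIR + "from", "_:p2"),
    ("_:p2", FOAF + "name", "Someone Else"),
    ("_:p2", FOAF + "mbox", "mailto:someone@example.org"),
]

# The five triple patterns from the generated query; "?x" marks a variable.
PATTERN = [
    ("?var_4", EMIR + "subject", "?var_1"),
    ("?var_4", EMIR + "from", "?var_5"),
    ("?var_4", EMIR + "date", "?var_2"),
    ("?var_5", FOAF + "name", "?var_3"),
    ("?var_5", FOAF + "mbox", "mailto:pldms@mac.com"),
]


def match(pattern, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not pattern:
        yield dict(binding)
        return
    first, rest = pattern[0], pattern[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for p_term, t_term in zip(first, triple):
            if p_term.startswith("?"):
                # Bind the variable, or check it agrees with an earlier binding.
                if candidate.setdefault(p_term, t_term) != t_term:
                    ok = False
                    break
            elif p_term != t_term:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)


results = list(match(PATTERN, TRIPLES))
for row in results:
    print(row["?var_3"], "|", row["?var_1"], "|", row["?var_2"])
```

Only the message sent from the pldms mailbox survives, because the last pattern pins ?var_5 to the person with that foaf:mbox.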

This is based on a template graph whose structure can be seen in the screenshot, decorated with yellow variable markers where a node is marked as unknown. Actually I screwed up after the query executed and subsequently marked the mailbox node as unknown too, before taking the screenshot; if that query had been executed, the resultset would have been much larger. The target database is a Joseki service that has an RDF version of the jena-dev mail archives.

A more readable version of the query (untested) might be:

PREFIX emir: <http://xmlns.filsa.net/emir/0.2/#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name ?date ?sub

WHERE
{
  [
    emir:subject ?sub ;
    emir:date ?date ;
    emir:from
      [
        foaf:name ?name ;
        foaf:mbox <mailto:pldms@mac.com>
      ]
  ]
}
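
I don't know which protocol RDFAuthor speaks, so the following Python is only a guess at one plausible route: the SPARQL protocol's HTTP GET binding, where the query text travels in a `query` parameter. The endpoint URL is a placeholder, not the real Joseki service, and the request is built but never sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs
from urllib.request import Request

# Placeholder endpoint; the address of the real service is not given in this post.
ENDPOINT = "http://example.org/sparql"

# Readable form of the query. The '#' in the emir prefix matches the
# property URIs in the generated query above.
QUERY = """\
PREFIX emir: <http://xmlns.filsa.net/emir/0.2/#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?date ?sub
WHERE {
  [ emir:subject ?sub ;
    emir:date ?date ;
    emir:from [ foaf:name ?name ;
                foaf:mbox <mailto:pldms@mac.com> ] ]
}
"""


def build_request(endpoint, query):
    """Build (but do not send) an HTTP GET request carrying the query."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the SPARQL XML results format.
    return Request(url, headers={"Accept": "application/sparql-results+xml"})


request = build_request(ENDPOINT, QUERY)
print(request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out here because the endpoint is made up.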

Google boost Jabber + VOIP, Skype releases IM toolkit, Jabber for P2P SPARQL?

Interesting times for the personal Semantic Web: “Any client that supports Jabber/XMPP can connect to the Google Talk service” Google Talk and Open Communications. It does voice calls too, using “a custom XMPP-based signaling protocol and peer-to-peer communication mechanism. We will fully document this protocol. In the near future, we plan to support SIP signaling.”

Meanwhile, Skype (the P2P-based VOIP and messaging system) have apparently released developer tools for their IM system. From a ZDNet article:

“Skype wants to embrace the rest of Internet,” Skype co-founder Janus Friis said during a recent interview.

He did offer hypothetical examples. Online gamers involved in massive multiple player mayhem could use Skype IM to taunt rivals and discuss strategy with teammates. Skype’s IM features could be incorporated, Friis suggests, into software-based media players for personal computers, Web sites for dating, blogging or “eBay kinds of auctions,” Friis said.

I spent some time recently looking at collaborative globe-browsing with Google Earth (ie. giving and taking of tours), and yesterday, revisiting Jabber/XMPP as a possible transport for SPARQL queries and responses between friends and FOAFs. Both apps could get a healthy boost from these developments in the industry. Skype is great but the technology could do with being more open; maybe the nudge from Google will help there. Jabber is great but … hardly used by the people I chat with (who are split across MSN, Yahoo, AIM, Skype and IRC).

For a long time I’ve wanted to do RDF queries in a P2P context (eg. see book chapter I wrote with Rael Dornfest). Given Apple’s recent boost for Jabber, and now this from Google, the technology looks to have a healthy future. I want to try exposing desktop, laptop etc RDF collections (addressbooks, calendars, music, photos) directly as SPARQL endpoints exposed via Jabber. There will be some fiddly details, but the basic idea is that Jabber users (including Google and Apple customers) could have some way to expose aspects of their local data for query by their friends and FOAFs, without having to upload it all to some central Web site.
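
One of those fiddly details is how a query would actually ride over Jabber. Here is a rough Python sketch that wraps a SPARQL query in an XMPP `iq` stanza of type `get`. The payload namespace and the recipient address are invented for this example; nothing here is a registered XMPP extension, and a real client library would also handle the stream, authentication and the reply.

```python
import xml.etree.ElementTree as ET

# Made-up namespace for the payload element; purely an assumption.
SPARQL_NS = "http://example.org/protocol/sparql-over-xmpp"
ET.register_namespace("sparql", SPARQL_NS)


def make_query_stanza(to_jid, query, stanza_id):
    """Serialise a SPARQL query as the payload of an <iq type='get'> stanza."""
    # In a live stream the iq element sits in the stream's default
    # namespace (jabber:client); that wrapper is omitted here.
    iq = ET.Element("iq", {"type": "get", "to": to_jid, "id": stanza_id})
    payload = ET.SubElement(iq, "{%s}query" % SPARQL_NS)
    payload.text = query  # ElementTree escapes '<' and '>' on output
    return ET.tostring(iq, encoding="unicode")


SAMPLE_QUERY = (
    "SELECT ?name WHERE { ?p <http://xmlns.com/foaf/0.1/name> ?name }"
)
stanza = make_query_stanza("friend@example.org/laptop", SAMPLE_QUERY, "q1")
print(stanza)
```

The friend's client would answer with an `iq` of type `result` (or `error`) carrying the same id, which is the usual request/response shape for `iq` stanzas.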

Next practical question: which Jabber software library to start hacking with? I was using Rich Kilmer’s Jabber4R but read that it wasn’t being maintained, so I’m wondering about switching to Perl or Python…

SPARQLing Protégé-OWL Jena integration

The Jena ARQ SPARQL engine has been very rapidly integrated into Protégé. Nice work from Holger Knublauch, and from Andy Seaborne who explained how Protégé’s native RDF Java structures could manifest themselves via Jena interfaces so that the ARQ query engine could work against Protégé data. He also gave a handy overview of the ARQ architecture, describing where it has dependencies on Jena, and how it could be attached to other RDF Java libraries instead.

The most amazing thing was how fast it all happened. As a protege-owl lurker, I had been following some discussions on RDF “named graphs”, and jumped in to suggest they take a look at SPARQL’s ability to query against such things.

From my original post

I’d also encourage you to take a look at the SPARQL work on RDF querying, if you haven’t already.

…to Holger’s “This is working indeed!” in less than a day. Holger summarises:

We now have an implementation that wraps a live Protege OWL triple store as a Jena Graph (and Model). This means that arbitrary Jena query services can be executed within Protege.

The relevant call is

OWLModel owlModel = ...; // Protege model
Model model = JenaModelFactory.createModel(owlModel); // Jena model

I also added a quick-and-dirty SPARQL query tab to Protege (see screenshot). This is extremely primitive yet, but hopefully useful on the long run. All this is on CVS and part of the next beta.

Here’s a thumbnail of the screenshot, linking to the full image:
Protégé screenshot showing a SPARQL query and a tabular resultset

I don’t see Andy’s explanation in the list archives, but it is quoted in full in Holger’s post, and is worth reading for those with an interest in Jena and ARQ.

There’s now a Jena Integration of Protege-OWL page explaining the details, and providing a diagram illustrating the integration architecture.

Jena protege integration architecture

The key to this integration is the fact that both systems operate on a low-level “triple” representation of the model. Protege has its native frame store mechanism, which has been wrapped in Protege-OWL with the TripleStore classes. In the Jena world, the corresponding interfaces are called Graph and Model. The Protege TripleStore has been wrapped into a Jena Graph, so that any read access from the Jena API in fact operates on the Protege triples. In order to modify these triples, the conventional Protege-OWL API must be used. However, this mechanisms allows to use Jena methods for querying while the ontology is edited inside Protege.
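
The shape of that wrapping can be sketched in a few lines of Python. This is only an illustration of the adapter idea described above: the class and method names are invented stand-ins, not the Protege-OWL or Jena APIs.

```python
class NativeTripleStore:
    """Stand-in for the editor's own store; all edits go through this API."""

    def __init__(self):
        self._triples = []

    def add(self, s, p, o):
        self._triples.append((s, p, o))

    def triples(self):
        return iter(self._triples)


class ReadOnlyGraphView:
    """Stand-in for the query library's graph interface, backed by the live store."""

    def __init__(self, store):
        self._store = store  # keep a reference; nothing is copied

    def find(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return [
            t for t in self._store.triples()
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        ]

    def add(self, s, p, o):
        raise TypeError("read-only view: edit through the native store")


store = NativeTripleStore()
view = ReadOnlyGraphView(store)

store.add("ex:Pizza", "rdf:type", "owl:Class")
before = len(view.find(p="rdf:type"))

# An edit made through the native API is visible to the view immediately.
store.add("ex:Topping", "rdf:type", "owl:Class")
after = len(view.find(p="rdf:type"))
print(before, after)
```

Because the view holds a reference rather than a copy, queries see edits as they happen, which is the property the quoted passage highlights.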

The details can be explored in CVS, for example see the new SPARQLQueryResults class.

geobloggers: “Network Link” in Google Earth

This is the hidden gem of Google Earth. Adding a “Network Link” allows you to fetch KML data from remote servers. It does this in two ways, Time Based or Location Based. So *anyone* can add dynamic data to Google Maps.

Apparently KML is based on GML. I don’t know how Keyhole/Google’s work differs. There seems to be a role here for something simple enough for Google Earth, WorldWind, and other viewer apps to use, when consulting a remote server for info about some area of interest. Maybe it’s GML Web Feature Servers, maybe KML, maybe geo-extended RSS/Atom, or perhaps generic query interfaces like the SPARQL protocol. SOAP, WSDL and REST fit in the picture somewhere. Probably, various things will be used in different environments, depending on application emphasis. We might be looking up the opening-hours of a shop, contact information for an organization, or jobs, events, photos, blog posts, FOAF profiles etc in a certain area, … it isn’t clear where the line is drawn between ‘geographic’ data and the wider unbounded collection of information about the world. GML has strengths at the geographical end of the spectrum, RDF (and its query system, SPARQL) has strengths at the generic, domain-neutral end. RSS/Atom is serving well as a generic carrier for data syndication. It isn’t clear to me yet where KML fits (or SVG, for that matter), but work on the relationship between GML and RDF would seem timely.

The geobloggers post has examples and links to flickr and del.icio.us-based services that expose this interface. I’m going to try making such a service on top of SPARQL…
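
As a first stab at the server side of a location-based Network Link, here is a small Python sketch. It assumes the viewer appends a `BBOX=west,south,east,north` parameter to the request, and it answers with KML placemarks for the points inside that box. The points are hard-coded, invented samples; in the service I have in mind they would come back from a SPARQL query instead. The function names are my own.

```python
from urllib.parse import parse_qs
import xml.etree.ElementTree as ET

# KML namespace as used in KML 2.0 documents.
KML_NS = "http://earth.google.com/kml/2.0"

# Invented sample points: (name, longitude, latitude).
POINTS = [
    ("Bristol", -2.59, 51.45),
    ("Amsterdam", 4.90, 52.37),
    ("Boston", -71.06, 42.36),
]


def parse_bbox(query_string):
    """Extract (west, south, east, north) from a 'BBOX=w,s,e,n' query string."""
    raw = parse_qs(query_string)["BBOX"][0]
    west, south, east, north = (float(v) for v in raw.split(","))
    return west, south, east, north


def kml_for_view(query_string, points=POINTS):
    """Return a KML document with a Placemark for each point in the view."""
    west, south, east, north = parse_bbox(query_string)
    kml = ET.Element("kml", {"xmlns": KML_NS})
    folder = ET.SubElement(kml, "Folder")
    for name, lon, lat in points:
        # Simple containment test; ignores views that cross the antimeridian.
        if west <= lon <= east and south <= lat <= north:
            placemark = ET.SubElement(folder, "Placemark")
            ET.SubElement(placemark, "name").text = name
            point = ET.SubElement(placemark, "Point")
            # KML coordinates are longitude,latitude,altitude.
            ET.SubElement(point, "coordinates").text = "%s,%s,0" % (lon, lat)
    return ET.tostring(kml, encoding="unicode")


document = kml_for_view("BBOX=-10,45,10,60")
print(document)
```

A view over western Europe returns the Bristol and Amsterdam placemarks and leaves Boston out.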

Sample SPARQL query

PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?book ?title ?authorname
WHERE {
      ?book dc:creator ?author .
      ?author dc:type <http://hoppa.com/Painters/> .
      ?author dc:title ?authorname .
      ?book dc:title ?title .
}

…works with rdf data describing some books by painters. I tested in Dave Beckett’s Redland-based online SPARQL demo. The query finds 5 results. Seems to have some encoding errors, but apart from that, is fine. There are more DawgShows in the ESW wiki. The sparql.org demo (using Jena) also works.