SPARQLing Protégé-OWL Jena integration

The Jena ARQ SPARQL engine has been very rapidly integrated into Protégé. Nice work from Holger Knublauch, and from Andy Seaborne who explained how Protégé’s native RDF Java structures could manifest themselves via Jena interfaces so that the ARP query engine could work against Protégé data. He also gave a handy overview of the ARP architecture, describing where it has dependencies on Jena, and how it could be attached to other RDF Java libraries instead.

The most amazing thing was how fast it all happened. As a protege-owl lurker, I had been following some discussions on RDF “named graphs”, and jumped in to suggest they take a look at SPARQL’s ability to query against such things.

From my original post

I’d also encourage you to take a look at the SPARQL work on RDF querying, if you haven’t already.

…to Holger’s “This is working indeed!” in less than a day. Holger summarises:

We now have an implementation that wraps a live Protege OWL triple store as a Jena Graph (and Model). This means that arbitrary Jena query services can be executed within Protege.

The relevant call is

OWLModel owlModel = ...; // Protege model
Model model = JenaModelFactory.createModel(owlModel); // Jena model

I also added a quick-and-dirty SPARQL query tab to Protege (see screenshot). This is extremely primitive yet, but hopefully useful on the long run. All this is on CVS and part of the next beta.

Here’s a thumbnail of the screenshot, linking to the full image:
Protégé screenshot showing a SPARQL query and a tabular resultset

I don’t see Andy’s explanation in the list archives, but it is quoted in full in Holger’s post, and is worth reading for those with an interest in Jena and ARQ.

There’s now a Jena Integration of Protege-OWL page explaining the details, and providing a diagram illustrating the integration architecture.

Jena protege integration architecture

The key to this integration is the fact that both systems operate on a low-level “triple” representation of the model. Protege has its native frame store mechanism, which has been wrapped in Protege-OWL with the TripleStore classes. In the Jena world, the corresponding interfaces are called Graph and Model. The Protege TripleStore has been wrapped into a Jena Graph, so that any read access from the Jena API in fact operates on the Protege triples. In order to modify these triples, the conventional Protege-OWL API must be used. However, this mechanisms allows to use Jena methods for querying while the ontology is edited inside Protege.

The details can be explored in CVS, for example see the new SPARQLQueryResults class.