in coding

RDF in Ruby revisited

If you’re interested in collaborating on Ruby tools for RDF, please join the public-rdf-ruby@w3.org mailing list at W3C. Just send a note to public-rdf-ruby-request@w3.org with a subject line of “subscribe”.

Last weekend I had the fortune to run into Rich Kilmer at O’Reilly’s ‘Social graph Foo Camp‘ gathering. In addition to helping decorate my tent, Rich told me a bit more about the very impressive Semitar RDF and OWL work he’d done in Ruby, initially as part of the DAML programme. Matt Biddulph was also there, and we discussed again what it would take to include FOAF import into Dopplr. I’d be really happy to see that, both because of Matt’s long history of contributions to the Semantic Web scene, but also because Dopplr and FOAF share a common purpose. I’ve long said that a purpose of FOAF is to engineer more coincidences in the world, and Dopplr comes from the same perspective: to increase serendipity.

Now, the thing about engineering serendipity, is that it doesn’t work without good information flow. And the thing about good information flow, is that it benefits from data models that don’t assume the world around us comes nicely parceled into cleanly distinct domains. Borrowing from American Splendor – “ordinary life is pretty complex stuff“. No single Web site, service, document format or startup is enough; the trick comes when you hook things together in unexpected combinations. And that’s just what we did in the RDF world: created a model for mixed up, cross-domain data sharing.

Dopplr, Tripit, Fire Eagle and other travel and location services may know where you and others are. Social network sites (and there are more every day) knows something of who you are, and something of who you care about. And the big G in the sky knows something of the parts of this story that are on the public record.

Data will always be spread around. RDF is a handy model for ad-hoc data merging from multiple sources. But you can’t do much without an RDF parser and a few other tools. Minimally, an RDF/XML parser and a basic API for navigating the graph. There are many more things you could add. In my old RubyRdf work, I had in-memory and SQL-backed storage, with a Squish query interface to each. I had a donated RDF/XML parser (from Ruby4R) and a much-improved query engine (with support for optionals) from Damian Steer. But the system is code-rotted. I wrote it when I was learning Ruby beginning 7 years ago, and I think it is “one to throw away”. I’m really glad I took the time to declare that project “closed” so as to avoid discouraging others, but it is time to revisit Ruby and RDF again now.

Other tools have other offerings: Dave Beckett’s Redland system (written in C) ships with a Ruby wrapper. Dave’s tools probably have the best RDF parsing facilities around, are fast, but require native code. Rena is a pure Ruby library, which looked like a great start but doesn’t appear to have been developed further in recent years.

I could continue going through the list of libraries, but Paul Stadig has already done a great job of this recently (see also his conclusions, which make perfect sense). There has been a lot of creative work around RDF/RDFS and OWL in Ruby, and collectively we clearly have a lot of talent and code here. But collectively we also lack a finished product. It is a real shame when even an RDF-enthusiast like Matt Biddulph is not in a position to simply “gem install” enough RDF technology to get a simple job done. Let’s get this fixed. As I said above,

If you’re interested in collaborating on Ruby tools for RDF, please join the public-rdf-ruby@w3.org mailing list at W3C. Just send a note to public-rdf-ruby-request@w3.org with a subject line of “subscribe”.

In six months time, I’d like to see at least one solid, well rounded and modern RDF toolkit packaged as a Gem for the Ruby community. It should be able to parse RDF/XML flawlessy, and in addition to the usual unit tests, it should be wired up to the RDF Test Cases (see download) so we can all be assured it is robust. It should allow for a fast C parser such as Raptor to be used if available, falling back on pure Ruby otherwise. There should be a basic API that allows me to navigate an RDF graph of properties and values using clear, idiomatic Ruby. Where available, it should hook up to external stores of data, and include at least a SPARQL protocol client, eventually a full SPARQL implementation. It should allow multiple graphs to be super-imposed and disentangled. Some support for RDFS, OWL and rule languages would be a big plus. Support for other notations such as Turtle, RDFa, or XSLT-based GRDDL transforms would be useful, as would a plugin for microformat import. Transliterating Python code (such as the tiny Euler rule engine) should be considered. Divergence from existing APIs in Python (and Perl, Javascript, PHP etc) should be minimised, and carefully balanced against the pull of the Ruby way. And (thought I lack strong views here) it should be made available under a liberal opensource license that permits redistribution under GPL. It should also be as I18N and Unicode-friendly as is possible in Ruby these days.

I’m not saying that all RDF toolkits should be merged, or that collaboration is compulsory. But we are perilously fragmented right now, and collaboration can be fun. In six months time, people who simply want to use RDF from Ruby ought to be pleasantly suprised rather than frustrated when they take to the ‘net to see what’s out there. If it takes a year instead of six months, sure whatever. But not seven years! It would be great to see some movement again towards a common library…

How hard can it be?

Add Comment Register



  1. In six months time, people who simply want to use RDF from Ruby ought to be pleasantly suprised rather than frustrated when they take to the ‘net to see what’s out there.

    Yup; something like what ARC seems to provide for the PHP world. Might ActiveRDF be an appropriate basis for this?

  2. For those willing to run on JRuby, Jena works well from JRuby and I assume Pellet’s reasoning features could also be accessed from JRuby.

  3. A belated comment, for the record. As you know, all our work on HyperDE (http://www.tecweb.inf.puc-rio.br/hyperde) is in Ruby. At that time, Demetrius Nunes, my student, wrote a replacement for ActiveRecord named SemanticRecord. This later inspired ActiveRDF (http://www.activerdf.org/), which we have used extensively (and extended) in Explorator (http://www.tecweb.inf.puc-rio.br/explorator).

    In both, we also used Sesame directly using it’s Java API and RJB, which works just fine. I suspect it would also work with Jena, although we have not tried this.

    We are currently looking into migrating both to jRuby.

  4. In six months time, I’d like to see at least one solid, well rounded and modern RDF toolkit packaged as a Gem for the Ruby community.

    Soooo… six months later:


    % gem q -rn rdf

    *** REMOTE GEMS ***

    Bulk updating Gem source index for: http://gems.rubyforge.org
    activerdf (1.6.11, 1.6.10, 1.6.9, 1.6.8, 1.6.6, 1.6.5, 1.6.4, 1.6.3, 1.6.2, 1.6.1, 1.6, 1.5, 1.4, 1.3.1, 1.3, 1.2.3, 1.2.2, 1.2.1, 1.2, 1.1, 1.0)
    activerdf_jena (0.1)
    activerdf_rdflite (1.4.1, 1.4, 1.3, 1.2.3, 1.2.2, 1.2.1, 1.2, 1.1, 1.0)
    activerdf_redland (1.2.2, 1.2.1, 1.2, 1.1, 1.0)
    activerdf_rules (0.0.2, 0.0.1)
    activerdf_sesame (0.2.2, 0.2.1, 0.2, 0.1)
    activerdf_sparql (1.3.6, 1.3.5, 1.3.4, 1.3.3, 1.3.2, 1.3.1, 1.3, 1.2.1, 1.2, 1.1, 1.0)
    rdf (0.3)
    rdf-redland (0.5.1.3, 0.5.1.2, 0.5.1)
    rdf_schema_generator (1.1.0, 1.0.0)
    rdfa (0.0.8, 0.0.7, 0.0.6, 0.0.5, 0.0.4, 0.0.3, 0.0.2, 0.0.1)
    rubyrdf (0.0.1)
    rubyrdf-sesame (0.0.1)

    I want to do simple SPARQL queries against dbpedia in ruby. Which gem should I go with?