Remote remotes

I’ve just closed the loop on last weekend’s XMPP / Apple Remote hack, using Strophe.js, a library that extends XMPP into normal Web pages. I hope I’ll find some way to use this in the NoTube project (eg. wired up to Web-based video playing in OpenSocial apps), but even if not it has been a useful learning experience. See this screenshot of a live HTML page, receiving and displaying remotely streamed events (green blob: button clicked; grey blob: button released). It doesn’t control any video yet, but you get the idea I hope.

Remote apple remote HTML demo

Remote apple remote HTML demo, screenshot showing a picture of a handheld apple remote with a grey blob over the play/pause button, indicating a button-up event. Also shows debug text in the HTML indicating ButtonUpEvent: PLPZ.

This webclient needs the JID and password details for an XMPP account, and I think these need to be from the same HTTP server the HTML is published on. It works using BOSH or other tricks, but for now I’ve not delved into those details and options. Source is in the Buttons area of the FOAF svn: webclient. I made a set of images, for each button in combination with button-press (‘down’), button-release (‘up’). I’m running my own ejabberd and using an account ‘buttons@foaf.tv’ on the foaf.tv domain. I also use generic XMPP IM accounts on Google Talk, which work fine although I read recently that very chatty use of such services can result in data rates being reduced.

To send local Apple Remote events to such a client, you need a bit of code running on an OSX machine. I’ve done this in a mix of C and Ruby: imremoted.c (binary) to talk to the remote, and the script buttonhole_surfer.rb to re-broadcast the events. The ruby code uses Switchboard and by default loads account credentials from ~/.switchboardrc.

I’ve done a few tests with this setup. It is pretty responsive considering how much indirection is involved, though the demo UI I made could be prettier. The + and – buttons behave differently from the left, right, menu and play/pause buttons: only + and – send an event immediately. The others wait until the key is released, then send a pair of down/up events. The keys other than play/pause will also forget what’s happening unless you act quickly. This seems to be a hardware limitation. Apparently Apple are about to ship an updated $20 remote; I hope this aspect of the design is reconsidered, as it limits the UI options for code using these remotes.

I also tried it using two browsers side by side on the same laptop, and two laptops side by side. The events get broadcast just fine. There is a lot more thinking to do re serious architecture, where passwords and credentials are stored, etc. But XMPP continues to look like a very interesting route.

Finally, why would anyone bother installing compiled C code, Ruby (plus XMPP libraries), their own Jabber server, and so on? Well, hopefully the work can be divided up. Not everyone will have to install a Jabber server. My thinking is that we can bundle a collection of TV and SPARQL XMPP functionality in a single install, such that local remotes can be used on the network, but also local software (eg. XBMC/Plex/Boxee) can be exposed to the wider network – whether the consumer is XMPP Javascript running inside a Web page as shown here, or an iPhone or a multi-touch table. Each will offer different interaction possibilities, but they can chat away to each other using a common link, and common RDF vocabularies (an area we’re working on in NoTube). If some common micro-protocols over XMPP (sending clicks or sending commands or doing RDF queries) can support compelling functionality, then installing a ‘buttons adaptor’ is something you might do once, with multiple benefits. But for now, apart from the JQbus piece, the protocols are still vapourware. First I wanted to get the basic plumbing in place.

Update: I re-discovered a useful ‘which BOSH server do you need?’ overview, which reminds me that there are basically two kinds of BOSH software offering: those that are built into an existing Jabber/XMPP server (like the ejabberd installation I’m using on foaf.tv), and those that are stand-alone connection managers which proxy traffic into the wider XMPP network. In terms of the current experiment, it means the event stream can (within the XMPP universe) come from addresses other than the host running the Web app. So I should also try installing Punjab. Perhaps it will also let webapps served from other hosts (such as opensocial containers) talk to it via JSON tricks? So far I have only managed to serve working Buttons/Strophe HTML from the same host as my Jabber server, ie. foaf.tv. I’m not sure how feasible the cross-domain option is.

Update x2: Three different people have now mentioned opensoundcontrol to me as something similar, at least on a LAN; it clearly deserves some investigation.

Streaming Apple Events over XMPP

I’ve just posted a script that will re-route the OSX Apple Remote event stream out across XMPP using the Switchboard Ruby library, streaming click-down and click-up events from the device out to any endpoint identified by a Jabber/XMPP JID (i.e. Jabber ID). In my case, I’m connecting to XMPP as the user xmpp:buttons@foaf.tv, who is buddies with xmpp:bob.notube@gmail.com, ie. they are on each other’s Jabber rosters already. Currently I simply send a textual message with the button-press code; a real app would probably use an XMPP IQ stanza instead, which is oriented more towards machines than human readers.
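For concreteness, here’s roughly what that message-sending step might look like in Ruby. This is a hedged sketch using the xmpp4r library rather than Switchboard (so it is not the code in buttonhole_surfer.rb); the JIDs, password and the `button_event_text` helper are invented for illustration.

```ruby
# Format a button event the way the log output reads,
# e.g. "ButtonDownEvent: PLPZ (0x15)".
def button_event_text(direction, name, code)
  "Button#{direction}Event: #{name} (0x#{code.to_s(16)})"
end

# Hypothetical re-broadcast step: connect as the sender account and push
# the event out as a plain-text <message/> to a buddy's JID. A real app
# would probably use an IQ stanza instead, as noted above.
def broadcast_event(sender_jid, password, to_jid, text)
  require 'xmpp4r'  # gem install xmpp4r
  client = Jabber::Client.new(Jabber::JID.new(sender_jid))
  client.connect
  client.auth(password)
  client.send(Jabber::Message.new(to_jid, text).set_type(:chat))
  client.close
end

# broadcast_event('buttons@foaf.tv', 'secret', 'bob.notube@gmail.com',
#                 button_event_text('Down', 'PLPZ', 0x15))
```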

The nice thing about this setup is that I can log in on another laptop to Gmail, run the Javascript Gmail / Google Talk chat UI, and hear a ‘beep’ whenever the event/message arrives in my browser. This is handy for informally testing the lagginess of the connection, which in turn is critical when designing remote control protocols: how chatty should they be? How much smarts should go into the client? Which bit of the system really understands what the user is doing? Informally, the XMPP events seem pretty snappy, but I’d prefer to see some real statistics to understand what a UI might risk relying on.

What I’d like to do now is get a Strophe Javascript client running. This will attach to my Jabber server and allow these events to show up in HTML/Javascript apps…

Here’s sample output of the script (local copy but it looks the same remotely), in which I press and release quickly every button in turn:

Cornercase:osx danbri$ ./buttonhole_surfer.rb
starting event loop.

=> Switchboard started.
ButtonDownEvent: PLUS (0x1d)
ButtonUpEvent: PLUS (0x1d)
ButtonDownEvent: MINU (0x1e)
ButtonUpEvent: MINU (0x1e)
ButtonDownEvent: LEFT (0x17)
ButtonUpEvent: LEFT (0x17)
ButtonDownEvent: RIGH (0x16)
ButtonUpEvent: RIGH (0x16)
ButtonDownEvent: PLPZ (0x15)
ButtonUpEvent: PLPZ (0x15)
ButtonDownEvent: MENU (0x14)
ButtonUpEvent: MENU (0x14)
^C
Shutdown initiated.
Waiting for shutdown to complete.
Shutdown initiated.
Waiting for shutdown to complete.
Cornercase:osx danbri$

Skosdex: SKOS utilities via jruby

I just announced this on the public-esw-thes and public-rdf-ruby lists. I started to make a Ruby API for SKOS.

Example code snippet from the readme.txt (see that link for the corresponding output):

require "src/jena_skos"
s1 = SKOS.new("http://norman.walsh.name/knows/taxonomy")
s1.read("http://www.wasab.dk/morten/blog/archives/author/mortenf/skos.rdf" )
s1.read("file:samples/archives.rdf")
s1.concepts.each_pair do |url,c|
  puts "SKOS: #{url} label: #{c.prefLabel}"
end

c1 = s1.concepts["http://www.ukat.org.uk/thesaurus/concept/1366"] # Agronomy
puts "test concept is "+ c1 + " " + c1.prefLabel
c1.narrower do |uri|
  c2 = s1.concepts[uri]
  puts "\tnarrower: "+ c2 + " " + c2.prefLabel
  c2.narrower do |uri|
    c3 = s1.concepts[uri]
    puts "\t\tnarrower: "+ c3 + " " + c3.prefLabel
  end
end

The idea here is to have a lightweight OO API for SKOS, couched in terms of a network of linked “Concepts”, with broader and narrower relations. But this is backed by a full RDF API (in our case Jena, via Java jruby magic). Eventually, entire apps could be built at the SKOS API level. For now, anything beyond broader/narrower and prefLabel is hidden away in the RDF (and so you’d need to dip into the Jena API to get to this data).
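To illustrate the API shape I mean, here is a toy in-memory mock of that Concept network. This is not the Jena-backed class from svn; the class layout, URIs and labels are invented for illustration.

```ruby
# Toy sketch of the intended SKOS object model: linked Concepts with
# prefLabel plus narrower traversal. The real jena_skos.rb backs this
# with a Jena RDF model via jruby; here a plain Hash stands in.
class Concept
  attr_accessor :uri, :prefLabel, :narrower_uris

  def initialize(uri, label)
    @uri = uri
    @prefLabel = label
    @narrower_uris = []
  end

  # Yield each directly narrower concept URI, mirroring the usage in
  # the readme snippet above.
  def narrower(&block)
    @narrower_uris.each(&block)
  end

  def to_s
    @uri
  end
end

# Minimal usage: two concepts, one narrower than the other.
concepts = {}
agronomy = Concept.new('http://example.org/concept/agronomy', 'Agronomy')
soil     = Concept.new('http://example.org/concept/soil-science', 'Soil science')
agronomy.narrower_uris << soil.uri
[agronomy, soil].each { |c| concepts[c.uri] = c }

agronomy.narrower do |uri|
  puts "narrower: #{concepts[uri].prefLabel}"
end
```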

The distinguishing feature is that it uses jruby (a Ruby implementation in pure Java). As such it can call on the full powers of the Jena toolkit, which go far beyond anything available currently in Ruby. At the moment it doesn’t do much: I just parse SKOS and make a tiny object model which exposes little more than prefLabel and broader/narrower.

I think it’s worth exploring because Ruby is rather nice for scripting, but lacks things like OWL reasoners and the general maturity of Java RDF/OWL tools (parsers, databases, etc.).

If you’re interested just to see how Jena APIs look when called from jruby, see jena_skos.rb in svn. Excuse the mess.

I’m interested to hear if anyone else has explored this topic. Obviously there is a lot more to SKOS than broader/narrower, so I’m very interested to find collaborators or at least a sanity check before taking this beyond a rough demo.

Plans – well my main concern is nothing to do with java or ruby, … but to explore Lucene indexing of SKOS data. I am also very interested in the pragmatic question of where SKOS stops and RDFS/OWL starts, … and how exactly we bridge that gap. See flickr for my most recent sketch of this landscape, where I revisit the idea of an “it” property (skos:it, foaf:it, …) that links things described in SKOS to “the thing itself”. I hope to load up enough overlapping SKOS data to get some practical experience with the tradeoffs.

This is with query expansion, smarter tagging assistants, etc. in mind. So the next step is probably to try building a Lucene index similar to the contrib/wordnet utility that ships with Java Lucene. This creates a Lucene index in which every “document” is really a word from WordNet, with text labels for its synonyms as indexed properties. I also hope to look at the use of SKOS + Lucene for “did you mean?” and auto-completion utilities. It’s also worth noting that Jena ships with LARQ, a Lucene-aware extension to ARQ, Jena’s SPARQL engine.
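To make the wordnet-contrib analogy concrete, here is a toy pure-Ruby version of the idea: one “document” per concept, keyed by its labels, driving prefix-based auto-completion. A real version would build a Lucene index instead; the class name, URIs and labels here are invented for illustration.

```ruby
# Toy sketch of the "one document per concept" indexing idea: each SKOS
# concept becomes entries keyed by its labels, so prefix lookups can
# drive auto-completion or "did you mean?" suggestions.
class LabelIndex
  def initialize
    @entries = []  # [downcased label, concept URI] pairs
  end

  def add(uri, *labels)
    labels.each { |l| @entries << [l.downcase, uri] }
  end

  # Return URIs of concepts with a label starting with the typed prefix.
  def complete(prefix)
    p = prefix.downcase
    @entries.select { |label, _| label.start_with?(p) }.map { |_, uri| uri }
  end
end

index = LabelIndex.new
index.add('http://example.org/concept/agronomy', 'Agronomy', 'Agricultural science')
index.add('http://example.org/concept/agrimony', 'Agrimony')

index.complete('agr')  # matches all three labels above
```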

Beautiful plumage: Topic Maps Not Dead Yet

Echoing recent discussion of Semantic Web “Killer Apps”, an “are Topic Maps dead?” thread has been running on the topicmaps mailing list. Signs of life offered include www.fuzzzy.com (‘Collaborative, semantic and democratic social bookmarking’, Topic Maps meet social networking; featured tag: ‘topic maps’) and a longer list from Are Gulbrandsen, who suggests a predictable hype-cycle dropoff is occurring, as well as a migration of discussions from email into the blog world. For which, see the topicmaps planet aggregator, and through which I indirectly find Steve Pepper’s blog and an interesting post on how TMs relate to RDF, OWL and the Semantic Web (though I’d have hoped for some mention of SKOS too).

Are Gulbrandsen also cites NZETC (the New Zealand Electronic Text Centre), winner of the Topic Maps Application of the Year award at the Topic Maps 2008 conference; see Conal Tuohy’s presentation on Topic Maps for Cultural Heritage Collections (slides in PDF). On NZETC’s work: “It may not look that interesting to many people used to flashy web 2.0 sites, but to anybody who have been looking at library systems it’s a paradigm shift”.

Other Topic Map work highlighted: RAMline (Royal Academy of Music rewriting musical history). “A long-term research project into the mapping of three axes of musical time: the historical, the functional, and musical time itself.”; David Weinberger blogged about this work recently. Also MIPS / Institute for Bioinformatics and Systems Biology who “attempt to explain the complexity of life with Topic Maps” (see presentation from Volker Stümpflen (PDF); also a TMRA’07 talk).

Finally, pointers to opensource developer tools: Ruby Topic Maps and Wandora (Java/GPL), an extraction/mapping and publishing system which amongst other things can import RDF.

Topic Maps are clearly not dead, and the Web’s a richer environment because of this. They may not have set the world on fire but people are finding value in the specs and tools, while also exploring interop with RDF and other related technologies. My hunch is that we’ll continue to see a slow drift towards the use of RDF/OWL plus SKOS for apps that might otherwise have been addressed using TopicMaps, and a continued pragmatism from tool and app developers who see all these things as ways to solve problems, rather than as ends in themselves.

Just as with RDFa, GRDDL and Microformats, it is good and healthy for the Web community to be exploring multiple similar strands of activity. We’re smart enough to be able to flow data across these divides when needed, and having only a single technology stack is, I think, intellectually limiting, socially impractical, and technologically short-sighted.

Lqraps! Reverse SPARQL

(update: files are now in svn; updated the link here)

I’ve just published a quick writeup (with running toy Ruby code) of a “reverse SPARQL” utility called lqraps, a tool for re-constructing RDF from tabular data.

The idea is that such a tool is passed a tab-separated (eventually, CSV etc.) file, such as might conventionally be loaded into Access, spreadsheets etc. The tool looks for a few lines of special annotation in “#”-comments at the top of the file, and uses these to (re)generate RDF. My simple implementation does this with text-munging of SPARQL construct notation into Turtle. As the name suggests, this could be done against the result tables we get from doing SPARQL queries; however it is more generally applicable to tabular data files.
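A minimal sketch of that text-munging approach is below. Note that the `#fields:` and `#template:` directive names are invented here for illustration; see the writeup for the actual annotation syntax.

```ruby
require 'stringio'

# Toy "reverse SPARQL": read "#"-prefixed directives from the top of a
# tab-separated file, then substitute each row's values into a Turtle
# template, one triple line per row.
def tsv_to_turtle(io)
  fields, template, triples = nil, nil, []
  io.each_line do |line|
    line = line.chomp
    if line =~ /^#fields:\s*(.*)/
      fields = $1.split(/\s*,\s*/)     # column names, in order
    elsif line =~ /^#template:\s*(.*)/
      template = $1                    # Turtle with ?var placeholders
    elsif !line.empty? && !line.start_with?('#')
      values = line.split("\t")
      triple = template.dup
      fields.each_with_index { |f, i| triple.gsub!("?#{f}", values[i]) }
      triples << triple
    end
  end
  triples.join("\n")
end

data = <<EOF
#fields: person, name
#template: <?person> <http://xmlns.com/foaf/0.1/name> "?name" .
http://example.org/alice\tAlice
http://example.org/bob\tBob
EOF
puts tsv_to_turtle(StringIO.new(data))
```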

A richer version might be able to output RDF/XML, but that would require use of libraries for SPARQL and RDF. This is of course close in spirit to D2RQ and SquirrelRDF and other relational/RDF mapping efforts, but the focus here is on something simple enough that we could include it in the top section of actual files.

The correct pronunciation btw is “el craps”. As in the card game.

RDF in Ruby revisited

If you’re interested in collaborating on Ruby tools for RDF, please join the public-rdf-ruby@w3.org mailing list at W3C. Just send a note to public-rdf-ruby-request@w3.org with a subject line of “subscribe”.

Last weekend I had the fortune to run into Rich Kilmer at O’Reilly’s ‘Social graph Foo Camp’ gathering. In addition to helping decorate my tent, Rich told me a bit more about the very impressive Semitar RDF and OWL work he’d done in Ruby, initially as part of the DAML programme. Matt Biddulph was also there, and we discussed again what it would take to include FOAF import into Dopplr. I’d be really happy to see that, both because of Matt’s long history of contributions to the Semantic Web scene, but also because Dopplr and FOAF share a common purpose. I’ve long said that a purpose of FOAF is to engineer more coincidences in the world, and Dopplr comes from the same perspective: to increase serendipity.

Now, the thing about engineering serendipity, is that it doesn’t work without good information flow. And the thing about good information flow, is that it benefits from data models that don’t assume the world around us comes nicely parceled into cleanly distinct domains. Borrowing from American Splendor – “ordinary life is pretty complex stuff“. No single Web site, service, document format or startup is enough; the trick comes when you hook things together in unexpected combinations. And that’s just what we did in the RDF world: created a model for mixed up, cross-domain data sharing.

Dopplr, Tripit, Fire Eagle and other travel and location services may know where you and others are. Social network sites (and there are more every day) know something of who you are, and something of who you care about. And the big G in the sky knows something of the parts of this story that are on the public record.

Data will always be spread around. RDF is a handy model for ad-hoc data merging from multiple sources. But you can’t do much without an RDF parser and a few other tools. Minimally, an RDF/XML parser and a basic API for navigating the graph. There are many more things you could add. In my old RubyRdf work, I had in-memory and SQL-backed storage, with a Squish query interface to each. I had a donated RDF/XML parser (from Ruby4R) and a much-improved query engine (with support for optionals) from Damian Steer. But the system has code-rotted. I wrote it when I was first learning Ruby, some 7 years ago, and I think it is “one to throw away”. I’m really glad I took the time to declare that project “closed” so as to avoid discouraging others, but it is time to revisit Ruby and RDF again now.

Other tools have other offerings: Dave Beckett’s Redland system (written in C) ships with a Ruby wrapper. Dave’s tools probably have the best RDF parsing facilities around, are fast, but require native code. Rena is a pure Ruby library, which looked like a great start but doesn’t appear to have been developed further in recent years.

I could continue going through the list of libraries, but Paul Stadig has already done a great job of this recently (see also his conclusions, which make perfect sense). There has been a lot of creative work around RDF/RDFS and OWL in Ruby, and collectively we clearly have a lot of talent and code here. But collectively we also lack a finished product. It is a real shame when even an RDF-enthusiast like Matt Biddulph is not in a position to simply “gem install” enough RDF technology to get a simple job done. Let’s get this fixed. As I said above,

If you’re interested in collaborating on Ruby tools for RDF, please join the public-rdf-ruby@w3.org mailing list at W3C. Just send a note to public-rdf-ruby-request@w3.org with a subject line of “subscribe”.

In six months time, I’d like to see at least one solid, well rounded and modern RDF toolkit packaged as a Gem for the Ruby community. It should be able to parse RDF/XML flawlessly, and in addition to the usual unit tests, it should be wired up to the RDF Test Cases (see download) so we can all be assured it is robust. It should allow for a fast C parser such as Raptor to be used if available, falling back on pure Ruby otherwise. There should be a basic API that allows me to navigate an RDF graph of properties and values using clear, idiomatic Ruby. Where available, it should hook up to external stores of data, and include at least a SPARQL protocol client, eventually a full SPARQL implementation. It should allow multiple graphs to be super-imposed and disentangled. Some support for RDFS, OWL and rule languages would be a big plus. Support for other notations such as Turtle, RDFa, or XSLT-based GRDDL transforms would be useful, as would a plugin for microformat import. Transliterating Python code (such as the tiny Euler rule engine) should be considered. Divergence from existing APIs in Python (and Perl, Javascript, PHP etc) should be minimised, and carefully balanced against the pull of the Ruby way. And (though I lack strong views here) it should be made available under a liberal opensource license that permits redistribution under GPL. It should also be as I18N and Unicode-friendly as is possible in Ruby these days.
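To make the “clear, idiomatic Ruby” wish concrete, here is the kind of graph-navigation API I have in mind. This is purely illustrative; it is not the API of any existing Ruby RDF library, and all names in it are invented.

```ruby
# Sketch of an idiomatic graph API: statements as simple triples, with
# Enumerable mixed in so the usual Ruby iteration idioms just work.
class Graph
  include Enumerable

  def initialize
    @statements = []
  end

  # Add a triple; returns self so calls can be chained.
  def add(subject, predicate, object)
    @statements << [subject, predicate, object]
    self
  end

  def each(&block)
    @statements.each(&block)
  end

  # Find statements matching a pattern; nil acts as a wildcard.
  def query(s = nil, p = nil, o = nil)
    select do |subj, pred, obj|
      (s.nil? || s == subj) && (p.nil? || p == pred) && (o.nil? || o == obj)
    end
  end
end

g = Graph.new
g.add('_:danbri', 'foaf:name', 'Dan Brickley')
 .add('_:danbri', 'foaf:knows', '_:libby')
 .add('_:libby',  'foaf:name', 'Libby Miller')

names = g.query(nil, 'foaf:name').map { |_, _, o| o }
# names now holds both foaf:name values
```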

I’m not saying that all RDF toolkits should be merged, or that collaboration is compulsory. But we are perilously fragmented right now, and collaboration can be fun. In six months time, people who simply want to use RDF from Ruby ought to be pleasantly surprised rather than frustrated when they take to the ‘net to see what’s out there. If it takes a year instead of six months, sure whatever. But not seven years! It would be great to see some movement again towards a common library…

How hard can it be?

Ruby client for querying SPARQL REST services

I’ve started a Ruby conversion of Ivan Herman’s Python SPARQL client, itself inspired by Lee Feigenbaum’s Javascript library. These are tools which simply transmit a SPARQL query across the ‘net to a SPARQL-protocol database endpoint, and handle the unpacking of the results. These queries can result in yes/no responses, variable-to-value bindings (rather like in SQL), or in chunks of RDF data. The default resultset notation is a simple XML format; JSON results are also widely available.

All I’ve done so far is to sit down with the core Python file from Ivan’s package and slog through converting it brainlessly into rather similar Ruby. Take out “self” and “:” from method definitions, write “nil” instead of “None”, write “end” at the end of each method and class, use Ruby’s iteration idioms, and you’re most of the way there. Of course the fiddly detail is where we use external libraries: for URIs, network access, JSON and XML parsing. I’ve cobbled something together which could be the basis for reconstructing all the original functionality in Ivan’s code. Currently it’s more a proof of concept, but enough to be worth posting.
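As an example of that fiddly unpacking detail, here is what decoding the standard SPARQL JSON results format might look like in Ruby. This is a standalone sketch, not Ivan’s actual code or my conversion of it; the sample data is invented.

```ruby
require 'json'

# Unpack the standard SPARQL JSON results format into an array of
# variable => value hashes, one per result row (rather like rows
# returned from an SQL query).
def unpack_sparql_json(text)
  doc = JSON.parse(text)
  doc['results']['bindings'].map do |row|
    pairs = row.map { |var, cell| [var, cell['value']] }
    Hash[pairs]
  end
end

sample = <<EOF
{ "head": { "vars": ["name"] },
  "results": { "bindings": [
    { "name": { "type": "literal", "value": "Dan Brickley" } },
    { "name": { "type": "literal", "value": "Libby Miller" } }
  ] } }
EOF

unpack_sparql_json(sample).each { |row| puts row['name'] }
```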

Files are in SVN and browsable online; there’s also a bundled up download for the curious, brave or helpful. The test script shows all we have at the moment: ability to choose XML or JSON results, and access result set binding rows. Other result formats (notably RDF; there’s no RDF/XML parser currently) aren’t handled, and various bits of the code conversion are incomplete. I’d be super happy if someone came along and helped finish this!

Update: Ruby’s REXML parser is now clumsily wired in; you can get a REXML Document object (see the xml.com writeup) as a way of navigating the resultset.

Ruby dup() and clone() with frozen strings

I’m exhuming some 5-year old Ruby RDF code, and in the process finding a few things got broken while I was in the time capsule. Here in the shiny future, I found myself hitting an unfamiliar “can’t modify frozen string” error. It all worked just fine back in the hazy summer of 2002, I’m sure. I think. Back then SPARQL didn’t exist, Redland’s rapper utility was called rdfdump, the RDFCore syntax cleanup wasn’t finished, Ruby was more permissive about good parenthesis habits, and so on. But that’s another pile of trouble. This frozen string thing was a bit puzzling, but I got to learn about dup() and clone() at least.

On some investigation, it seems the string in question is being passed in from the command line, and apparently the contents of ARGV are frozen. My first guess was that I’d just clone the string object in question, and all would be well. After some fiddling around, I learned the difference between clone() and dup(): the latter doesn’t preserve frozen-ness: I wanted dup(). All seems well. Back to fixing the real problems; details below for the curious, and as a memory jog next time I run into this.

Maybe next time I emerge from the time capsule, Ruby will be fully Unicode happy?

a = "foo"
=> "foo"

a.frozen?
=> false

a.freeze
=> "foo"

b = a.clone
=> "foo"

a.frozen?
=> true

b = a.dup
=> "foo"

b.frozen?
=> false

c = a.clone
=> "foo"

c.frozen?
=> true

Loosely joined

find . -name danbri-\*.rdf -exec rapper --count {} \;


rapper: Parsing file ./facebook/danbri-fb.rdf
rapper: Parsing returned 2155 statements
rapper: Parsing file ./orkut/danbri-orkut.rdf
rapper: Parsing returned 848 statements
rapper: Parsing file ./dopplr/danbri-dopplr.rdf
rapper: Parsing returned 346 statements
rapper: Parsing file ./tribe.net/danbri-tribe.rdf
rapper: Parsing returned 71 statements
rapper: Parsing file ./my.opera.com/danbri-opera.rdf
rapper: Parsing returned 123 statements
rapper: Parsing file ./advogato/danbri-advogato.rdf
rapper: Parsing returned 18 statements
rapper: Parsing file ./livejournal/danbri-livejournal.rdf
rapper: Parsing returned 139 statements

I can run little queries against various descriptions of me and my friends, extracted from places in the Web where we hang out.

Since we’re not yet in the shiny OpenID future, I’m matching people only on name (and setting up the myfb: etc prefixes to point to the relevant RDF files). I should probably take more care around xml:lang, to make sure things match. But this was just a rough test…


SELECT DISTINCT ?n
FROM myfb:
FROM myorkut:
FROM dopplr:
WHERE {
  GRAPH myfb:    { [ a :Person; :name ?n; :depiction ?img ] }
  GRAPH myorkut: { [ a :Person; :name ?n; :mbox_sha1sum ?hash ] }
  GRAPH dopplr:  { [ a :Person; :name ?n; :img ?i2 ] }
}

…finds 12 names in common across Facebook, Orkut and Dopplr networks. Between Facebook and Orkut, 46 names. Facebook and Dopplr: 34. Dopplr and Orkut: 17 in common. Haven’t tried the others yet, nor generated RDF for IM and Flickr, which I probably have used more than any of these sites. The Facebook data was exported using the app I described recently; the Orkut data came via the CSV format dumps they expose (non-mechanisable since they use a CAPTCHA), while the Dopplr list was generated with a few lines of Ruby and their draft API: I list as foaf:knows pairs of people who reciprocally share their travel plans. Tribe.net, LiveJournal, my.opera.com and Advogato expose RDF/FOAF directly. Re Orkut, I noticed that they now have the option to flood your GTalk Jabber/XMPP account roster with everyone you know on Orkut. I’m not sure of the wisdom of actually doing so (but I’ll try it), but it is worth noting that this quietly bridges a large ‘social networking’ site with an open standards-based toolset.

For the record, the names common to my Dopplr, Facebook and Orkut accounts were: Liz Turner, Tom Heath, Rohit Khare, Edd Dumbill, Robin Berjon, Libby Miller, Brian Kelly, Matt Biddulph, Danny Ayers, Jeff Barr, Dave Beckett, Mark Baker. If I keep adding to the query for each other site, presumably the only person in common across all accounts will be …. me.