Commandline PHP for loading RDF URLs into ARC (and Twinkle for query UI)

<?php
if ($argc != 2 || in_array($argv[1], array('--help', '-help', '-h', '-?'))) {
  echo "This is a command line PHP script with one option: URL of RDF document to load\n";
} else {
  include_once('path/to/arc/ARC2.php'); # assumption: adjust to wherever the ARC libraries are unpacked

  $supersecret = "123rememberme"; # Security analysts recommend using date of birth + social security ID here
  # *** be careful with real mysql passwords ***

  $config = array( 'db_host' => 'localhost', 'db_name' => 'sg1', 'db_user' => 'sparql',
    'db_pwd' => $supersecret, 'store_name' => 'crawl', );
  $store = ARC2::getStore($config);
  if (!$store->isSetUp()) { $store->setUp(); }
  $profile = $argv[1];
  echo "Loading data from " . $profile . "\n";
  # Drop whatever was previously loaded from this URL, then load it afresh
  $store->query('DELETE FROM <'.$profile.'>');
  $store->query('LOAD <'.$profile.'>');
}

FWIW, this is what I’m using to (re)load data into an ARC store from the commandline. I’ll try wiring up my old RDF crawler to this when I get time. Each loaded source is stored as a named graph, with the URI it is loaded from being the named graph URI. ARC just needs the path to the unpacked PHP libraries, and connection details for a MySQL database, and comes with a handy SPARQL endpoint script too, which I’ve been testing with Twinkle.
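The delete-then-load pair is what makes a reload safe to repeat: because each source lives in its own named graph, re-crawling a URL replaces that graph rather than piling new triples on top of old ones. Here is a rough Python sketch of that idea. It is not how ARC stores anything internally; the GraphStore class, its method names and the example URL are all made up for illustration.

```python
class GraphStore:
    """Toy named-graph store: maps a graph URI to the set of triples loaded from it."""

    def __init__(self):
        self.graphs = {}

    def reload(self, source_uri, triples):
        # Mirrors DELETE FROM <uri> followed by LOAD <uri>:
        # forget whatever was loaded from this source before, then store the new triples.
        self.graphs.pop(source_uri, None)
        self.graphs[source_uri] = set(triples)

    def quads(self):
        # Yield (graph, subject, predicate, object) for every stored triple.
        for graph, triples in self.graphs.items():
            for s, p, o in triples:
                yield (graph, s, p, o)


store = GraphStore()
src = "http://example.org/foaf.rdf"  # hypothetical source URL
store.reload(src, [("_:a", "foaf:name", "Alice")])
store.reload(src, [("_:a", "foaf:name", "Alice B.")])  # second crawl replaces the first
```

After the second call the store holds only the newer triple for that source, which is the behaviour the script relies on.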

My public sandbox data is currently loaded up as follows. No promises it’ll stay there, but anyway, adding the following to Twinkle 2.0’s config file section for SPARQL endpoints works for me. The endpoint also offers a basic Web interface directly, with HTML, XML, JSON etc.

a sources:Endpoint; rdfs:label "FOAF Social Graph Aggregator sandbox".

Querying across ‘social graph’ fragments

PREFIX owl: <>
SELECT DISTINCT ?who
WHERE {
GRAPH <> { ?p a owl:InverseFunctionalProperty . }
GRAPH <> { [ :openid <> ; ?p ?pv ] }
GRAPH ?src { [ ?p ?pv ; :knows [ :name ?who ] ] }
}

Just a quick post to record this cut-down version of a SPARQL query I’ve been using. The idea is that it is evaluated against a SPARQL RDF dataset where multiple sources are brought together. It tries to find the names of anyone those sources claim I know, regardless of how those datasets actually identify me (mailbox, mailbox hash, IM accounts, weblog or homepage or openid URL, etc). So long as they use a property/value pair that matches something in my main FOAF file, and so long as the property is tagged as “inverse functional” (ie. uniquely identifying) in the FOAF spec, the identification should go through OK.
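For anyone who finds the three GRAPH patterns easier to follow as plain steps, here is a rough Python sketch of the same matching logic over (graph, subject, predicate, object) tuples. It is only an illustration of the idea, not what a SPARQL engine does internally; the function name, the graph labels and all the sample data are invented.

```python
def names_i_know(quads, schema_graph, my_graph, my_openid):
    # Step 1: properties the schema graph types as inverse functional.
    ifps = {s for g, s, p, o in quads
            if g == schema_graph and p == "rdf:type"
            and o == "owl:InverseFunctionalProperty"}
    # Step 2: in my own file, the node carrying my OpenID and its identifying pairs.
    me_nodes = {s for g, s, p, o in quads
                if g == my_graph and p == "foaf:openid" and o == my_openid}
    my_ids = {(p, o) for g, s, p, o in quads
              if g == my_graph and s in me_nodes and p in ifps}
    # Step 3: in any graph, a node sharing one of those pairs is taken to be me;
    # collect the names of whoever that node is said to know in the same graph.
    found = set()
    for g, s, p, o in quads:
        if (p, o) not in my_ids:
            continue
        known = {o2 for g2, s2, p2, o2 in quads
                 if g2 == g and s2 == s and p2 == "foaf:knows"}
        found |= {o3 for g3, s3, p3, o3 in quads
                  if g3 == g and s3 in known and p3 == "foaf:name"}
    return found


quads = [  # hypothetical data
    ("schema", "foaf:mbox_sha1sum", "rdf:type", "owl:InverseFunctionalProperty"),
    ("mine", "_:me", "foaf:openid", "http://example.org/me"),
    ("mine", "_:me", "foaf:mbox_sha1sum", "abc123"),
    ("siteA", "_:u1", "foaf:mbox_sha1sum", "abc123"),
    ("siteA", "_:u1", "foaf:knows", "_:u2"),
    ("siteA", "_:u2", "foaf:name", "Alice"),
    ("siteB", "_:u1", "foaf:mbox_sha1sum", "zzz"),
    ("siteB", "_:u1", "foaf:knows", "_:u2"),
    ("siteB", "_:u2", "foaf:name", "Mallory"),
]
result = names_i_know(quads, "schema", "mine", "http://example.org/me")
```

Note that node identifiers are only compared within one graph, since blank nodes are not shared between sources; the join across sources happens purely through the property/value pairs.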

Flock browser RDF: describing accounts

Flock is a Mozilla-based browser that emphasises social and “web2” themes. From a social-network-mobility thread, I’m reminded to take another look at Flock by Ian McKellar’s recent comments…

I wrote a bunch of that code when I was at Flock.

It’s all in RDF, I think it’s currently in a SQLite triple store in the user’s profile directory

I took a look. Seems not to be in SQLite files, at least in my fresh installation. Instead there is a file flock-data.rdf which looks to be the product of Mozilla’s ageing RDF engine. I had to clean things up slightly before I could process it with modern (Redland in this case) tools, since it uses Netscape’s pre-RDFCore datatyping notation:

cat flock-data.rdf | sed -e 's/NC:parseType/RDF:datatype/' > flock-data-fixed.rdf
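If sed isn’t to hand, the same cleanup is easy to sketch in Python. This is just an equivalent of the one-liner, with a made-up function name and a made-up sample element; like the sed expression (which has no trailing g flag) it rewrites only the first match on each line.

```python
def fix_datatyping(rdfxml_text):
    # Replace the first NC:parseType on each line with RDF:datatype,
    # matching the behaviour of sed's s/NC:parseType/RDF:datatype/ without "g".
    return "\n".join(
        line.replace("NC:parseType", "RDF:datatype", 1)
        for line in rdfxml_text.split("\n")
    )


sample = '<NC:count NC:parseType="Integer">1</NC:count>'  # illustrative input only
fixed = fix_datatyping(sample)
```

To use it on the real file, read flock-data.rdf into a string, pass it through the function, and write the result out under a new name.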

With that tweak out of the way, I can nose around the data using SPARQL. I’m interested in the “social graph” mobility discussions, and in mapping FOAF usage to Brad Fitzpatrick’s model (see detail in his slides).

The model in the writeup from Brad and David Recordon has nodes (standing roughly for accounts) and “is” relations amongst them where two accounts are known to share an owner, or “claims” relations to record a claim associated with one such account of shared ownership with another.

For example, in my Flickr account (username “danbri”) I might claim to own an account on another service (also username “danbri”). However, you’d be wise not to believe flickr-me without more proof; this could come from many sources and techniques. Their graph model is focussed on such data.

FOAF by contrast emphasises the human social network, with the node graph being driven by “knows” relationships amongst people. We do have the OnlineAccount construct, which is closer to the kind of nodes we see in the “Thoughts on the Social Graph” paper, although they also include nodes for email, IM and hashed mailbox, I believe. The SIOC spec elaborates on this level, by sub-classing its notion of User from OnlineAccount rather than from Person.

So anyway, I’m looking at transformations between such representations, and Flock seems a nice source of data, since it watches over my browsing and when I use a site it knows to be “social”, it keeps a record of the account in RDF. For now, here’s a quick query to give you an idea of the shape of the data:

PREFIX fl: <>
PREFIX nc: <>
SELECT DISTINCT ?x ?name ?url ?serviceId ?accountId
FROM <flock-data-fixed.rdf>
WHERE {
?x fl:flockType "Account" .
?x nc:Name ?name .
?x nc:URL ?url .
?x fl:serviceId ?serviceId .
?x fl:accountId ?accountId .
}

Running this with Redland’s “roqet” utility in JSON mode gives:

{
"head": {
"vars": [ "x", "name", "url", "serviceId", "accountId" ]
},
"results": {
"ordered" : false,
"distinct" : true,
"bindings" : [
{
"x" : { "type": "uri", "value": "urn:flock:ljdanbri" },
"name" : { "type": "literal", "value": "danbri" },
"url" : { "type": "literal", "value": "" },
"serviceId" : { "type": "literal", "value": ";1" },
"accountId" : { "type": "literal", "value": "danbri" }
},
{
"x" : { "type": "uri", "value": "urn:typepad:service:danbri" },
"name" : { "type": "literal", "value": "danbri" },
"url" : { "type": "literal", "value": "" },
"serviceId" : { "type": "literal", "value": ";1" },
"accountId" : { "type": "literal", "value": "danbri" }
},
{
"x" : { "type": "uri", "value": "urn:flock:flickr:account:35468151816@N01" },
"name" : { "type": "literal", "value": "danbri" },
"url" : { "type": "literal", "value": "" },
"serviceId" : { "type": "literal", "value": ";1" },
"accountId" : { "type": "literal", "value": "35468151816@N01" }
},
{
"x" : { "type": "uri", "value": "urn:wordpress:service:danbri" },
"name" : { "type": "literal", "value": "danbri" },
"url" : { "type": "literal", "value": "" },
"serviceId" : { "type": "literal", "value": ";1" },
"accountId" : { "type": "literal", "value": "danbri" }
},
{
"x" : { "type": "uri", "value": "urn:flock:youtube:modanbri" },
"name" : { "type": "literal", "value": "modanbri" },
"url" : { "type": "literal", "value": "" },
"serviceId" : { "type": "literal", "value": ";1" },
"accountId" : { "type": "literal", "value": "modanbri" }
},
{
"x" : { "type": "uri", "value": "urn:delicious:service:danbri" },
"name" : { "type": "literal", "value": "danbri" },
"url" : { "type": "literal", "value": "" },
"serviceId" : { "type": "literal", "value": ";1" },
"accountId" : { "type": "literal", "value": "danbri" }
}
]
}
}

You can see there are several bits of information to squeeze in here. Which reminds me to chase up the “accountHomepage” issue in FOAF. Flock sometimes uses a generic URL for the service, and other times an account-specific one. They also distinguish an nc:Name property of the account from a fl:accountId, allowing Flickr’s human-readable account names to be distinguished from the generated ID you’re originally assigned. The fl:serviceId is an internal software service identifier it seems, following Mozilla conventions.
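Since the JSON result format is so regular, turning it into plain rows takes only a few lines in most languages. Here is a hedged Python sketch using just the standard json module; the rows function is a made-up helper name, and the sample is a trimmed copy of one binding from the output above rather than fresh data.

```python
import json


def rows(sparql_json_text):
    """Turn SPARQL JSON results into a list of {variable: value} dicts."""
    doc = json.loads(sparql_json_text)
    names = doc["head"]["vars"]
    out = []
    for binding in doc["results"]["bindings"]:
        # A variable left unbound by the query is simply absent from the binding.
        out.append({n: binding[n]["value"] for n in names if n in binding})
    return out


sample = """{ "head": {"vars": ["x", "name", "accountId"]},
  "results": {"bindings": [
    {"x": {"type": "uri", "value": "urn:flock:ljdanbri"},
     "name": {"type": "literal", "value": "danbri"},
     "accountId": {"type": "literal", "value": "danbri"}}
  ]}}"""
accounts = rows(sample)
```

This drops the "type" information, which is fine for a quick look at the data but would matter if you needed to tell URIs from literals.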

Last little experiment: a variant of the above query, but using CONSTRUCT instead of SELECT, to transform into FOAF’s idiom for representing accounts:

CONSTRUCT {
?x a foaf:OnlineAccount .
?x foaf:name ?name .
?x foaf:accountServiceHomepage ?url .
?x foaf:accountName ?accountId .
}
# WHERE clause as in the SELECT query above

Seems to work… There’s loads of other stuff in flock-data.rdf too, but I’ve not looked around much. For example, you can search tagged URLs:

WHERE {[fl:tag "funny"; nc:URL ?url]}

Begin again

There was an old man named Michael Finnegan
He went fishing with a pinnegan
Caught a fish and dropped it in again
Poor old Michael Finnegan
Begin again.

Let me clear something up. Danny mentions a discussion with Tim O’Reilly about SemWeb themes.

Much as I generally agree with Danny, I’m reaching for a ten-foot bargepole on this one point:

While Facebook may have achieved pretty major adoption for their approach, it’s only very marginally useful because of their overly simplistic treatment of relationships.

Facebook, despite the trivia, the endless wars between the ninja zombies and the pirate vampires; despite being centralised, despite [insert grumble] is massively useful. Proof of that pudding: it is massively used. “Marginal” doesn’t come into it. The real question is: what happens next?

Imagine 35 million people. Imagine them marching thru your front room. Jumping off a table at the same time. Sending you an email. Or turning the tap off when they brush their teeth. 35 million is a fair-sized nation. Taking that 35 million figure I’ve heard waved around, and placing it in the ever-scientific Wikipedia listing … that puts the land of Facebook somewhere between Kenya and Algeria in the population charts. Perhaps the figures are exaggerated. Perhaps a few million have wandered off, or forgotten their passwords. Doubtless some only use it every month or few.

Even a million is a lot of use; and a lot of usefulness.

Don’t let anything I ever say here in this blog be taken as claiming such sites and services are only marginally useful. To be used is to be useful; and that’s something SemWeb people should keep in the forefront of their minds. And usually they do, I think, although the community tends towards the forward-looking.

But let’s be backwards-looking for a minute. My concern with these sites is not that they’re marginally useful, but that they could be even more useful. Slight difference of emphasis. SixDegrees was great, back in 2000 when we started FOAF. But it was a walled garden. It had cool graph traversal stuff that evocatively showed your connection path to anyone else in the network. Their network. Then followed Friendster, which got slow as it proved useful to too many people. Ditto Orkut, which everyone signed up to, then wandered off from when it proved there was rather little to do there except add people. MySpace and Facebook cracked that one, … but guess what, there’ll be more.

I got a signup to Yahoo’s Mash yesterday. Anyone wanna be my friend? It has fun stuff (“Mecca Ibrahim smacked The Mash Pet (your Mash pet)!”), … wiki-like profile editing, extension modules … and I’d hope given that this is 2007, eventually some form of API. People won’t live in Facebook-land forever. Nor in Mash, however fun it is. I still lean towards Jabber/XMPP as the long-term infrastructure for this sort of system, but that’s for another time. The appeal of SixDegrees, of Friendster, of Orkut … wasn’t ever the technology. It was the people. I was there ‘cos others were there. Nothing more. And I don’t see this changing, no matter how much the underlying technology evolves. And people move around, drift along to the next shiny thing, … go wherever their friends are. Which is our only real problem here.

Begin again.

I’ve been messing with RDF a bit. I made a sample SPARQL query that asks (exported RDF from) a few networks about my IM addresses; here are the results from Redland/Rasqal JSON.

Loosely joined

find . -name danbri-\*.rdf -exec rapper --count {} \;

rapper: Parsing file ./facebook/danbri-fb.rdf
rapper: Parsing returned 2155 statements
rapper: Parsing file ./orkut/danbri-orkut.rdf
rapper: Parsing returned 848 statements
rapper: Parsing file ./dopplr/danbri-dopplr.rdf
rapper: Parsing returned 346 statements
rapper: Parsing file ./
rapper: Parsing returned 71 statements
rapper: Parsing file ./
rapper: Parsing returned 123 statements
rapper: Parsing file ./advogato/danbri-advogato.rdf
rapper: Parsing returned 18 statements
rapper: Parsing file ./livejournal/danbri-livejournal.rdf
rapper: Parsing returned 139 statements

I can run little queries against various descriptions of me and my friends, extracted from places in the Web where we hang out.

Since we’re not yet in the shiny OpenID future, I’m matching people only on name (and setting up the myfb: etc prefixes to point to the relevant RDF files). I should probably take more care around xml:lang, to make sure things match. But this was just a rough test…

SELECT DISTINCT ?n
FROM myfb:
FROM myorkut:
FROM dopplr:
WHERE {
GRAPH myfb: {[ a :Person; :name ?n; :depiction ?img ]}
GRAPH myorkut: {[ a :Person; :name ?n; :mbox_sha1sum ?hash ]}
GRAPH dopplr: {[ a :Person; :name ?n; :img ?i2]}
}

…finds 12 names in common across Facebook, Orkut and Dopplr networks. Between Facebook and Orkut, 46 names. Facebook and Dopplr: 34. Dopplr and Orkut: 17 in common. Haven’t tried the others yet, nor generated RDF for IM and Flickr, which I probably have used more than any of these sites. The Facebook data was exported using the app I described recently; the Orkut data was done via the CSV format dumps they expose (non-mechanisable since they use a CAPTCHA), while the Dopplr list was generated with a few lines of Ruby and their draft API: I list as foaf:knows pairs of people who reciprocally share their travel plans. LiveJournal and Advogato, among others, expose RDF/FOAF directly. Re Orkut, I noticed that they now have the option to flood your GTalk Jabber/XMPP account roster with everyone you know on Orkut. Not sure of the wisdom of actually doing so (but I’ll try it), but it is worth noting that this quietly bridges a large ‘social networking’ site with an open standards-based toolset.

For the record, the names common to my Dopplr, Facebook and Orkut accounts were: Liz Turner, Tom Heath, Rohit Khare, Edd Dumbill, Robin Berjon, Libby Miller, Brian Kelly, Matt Biddulph, Danny Ayers, Jeff Barr, Dave Beckett, Mark Baker. If I keep adding to the query for each other site, presumably the only person in common across all accounts will be …. me.
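Matching on name alone boils down to a set intersection, one set per network. A minimal Python sketch of that, with the caveat that the light normalisation (case and whitespace) is an addition of this sketch and not something the SPARQL query does; the function name and the sample names are invented.

```python
def common_names(*name_lists):
    """Return the names that appear in every list, after light normalisation."""
    def norm(name):
        # Collapse runs of whitespace and ignore case, so that trivial
        # differences between exports do not hide a match.
        return " ".join(name.split()).casefold()

    sets = [{norm(n) for n in names} for names in name_lists]
    return set.intersection(*sets) if sets else set()


facebook = ["Alice Example", "Bob Sample", "Carol Test"]   # hypothetical
orkut = ["alice example", "Carol  Test", "Dave Other"]     # hypothetical
dopplr = ["Carol Test", "Alice Example"]                   # hypothetical
everywhere = common_names(facebook, orkut, dopplr)
```

Each extra network passed in can only shrink the result, which is the effect described above: keep adding sites and the intersection heads towards a single person.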

Querying Facebook in SPARQL

A fair few people have been asking about FOAF exporters from Facebook. I’m not entirely sure what else is out there, but Matthew Rowe has just announced a Facebook FOAF generator. It doesn’t dump all 35 million records into your Web browser, thankfully. But it will export a minimal description of you and your Facebook associates. At the moment, you get name, a photo URL, and (in this revision of the tool) a Facebook account name using FOAF’s OnlineAccount construct.

As an aside, this part of the FOAF design provides a way for identifiers from arbitrary services to be described in FOAF without special-purpose support. Some services have shortcut property names, eg. msnChatID and we may add more, but it is also important to allow this kind of freeform, decentralised identification. People shouldn’t have to petition the FOAF spec editors before any given Social Network site’s IDs can be supported; they can always use their own vocabulary alongside FOAF, or use the OnlineAccount construct as shown here.

I’ve saved my Facebook export on my Web site, working on the assumption that Facebook IDs are not private data. If people think otherwise, let me know and I’ll change the setup. We might also discuss whether even sharing the names and connectivity graph will upset people’s privacy expectations, but that’s for another day. Let me know if you’re annoyed!

Here is a quick SPARQL query, which simply asks for details of each person mentioned in the file who has an account on Facebook.

SELECT DISTINCT ?name, ?pic, ?id
WHERE {
[ a :Person;
:name ?name;
:depiction ?pic;
:holdsAccount [ :accountServiceHomepage <> ; :accountName ?id ]
]
}
ORDER BY ?name

I tested this online using Dave Beckett’s Rasqal-based Web service. It should return a big list of the first 200 people matched by the query, ordered alphabetically by name.

For “Web 2.0” fans, SPARQL’s result sets are essentially tabular (just like SQL), and have encodings in both simple XML and JSON. So whatever you might have heard about RDF’s syntactic complexity, you can forget it when dealing with a SPARQL engine.

Here’s a fragment of the JSON results from the above query:

{
"name" : { "type": "literal", "value": "Dan Brickley" },
"pic" : { "type": "uri", "value": "" },
"id" : { "type": "literal", "value": "624168" }
},
{
"name" : { "type": "literal", "value": "Dan Brickley" },
"pic" : { "type": "uri", "value": "" },
"id" : { "type": "literal", "value": "501730978" }
}, ...

What’s going on here? (a) Why are there two of me? (b) And why does it think that one of us has my Facebook FOAF file’s URL as a mugshot picture?

There’s no big mystery here. Firstly, there’s another guy who has the cheek to be called Dan Brickley. We’re friends on Facebook, even though we should probably be mortal enemies or something. Secondly, why does it give him the wrong URL for his photo? This is also straightforward, if a little technical. Basically, it’s an easily-fixed bug in this version of the FOAF exporter I used. When an image URL is not available, the converter is still generating markup like “<foaf:depiction rdf:resource=""/>”. This empty URL is treated in RDF as the extreme case of a relative link, ie. the same kind of thing as writing “../../images/me.jpg” in a normal Web page. And since RDF is all about de-contextualising information, your RDF parser will try to resolve the relative link before passing the data on to storage or query systems (fiddly details are available to those that care). If the foaf:depiction property were simply omitted when no photo was present, this problem wouldn’t arise. We’d then have to make the query a little more flexible, so that it still matched people even if there was no depiction, but that’s easy. I’ll show it next time.
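The relative-link behaviour is easy to see outside of RDF tooling. Here is a small Python sketch using the standard library’s urljoin, which follows the same general URI resolution rules; the base URL is a made-up stand-in for wherever the FOAF file happens to live.

```python
from urllib.parse import urljoin

base = "http://example.org/facebook/export.rdf"  # hypothetical document URL

# An empty reference resolves to the document's own URL...
empty = urljoin(base, "")

# ...in the same way that an ordinary relative link resolves against it.
relative = urljoin(base, "../images/me.jpg")
```

So an empty rdf:resource ends up pointing at the file itself, which is exactly the odd “mugshot” showing up in the results.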

I mentioned a couple of days ago that SPARQL is a query language with built-in support for asking questions about data provenance, ie. we can mix in “according to Facebook”, “according to Jabber” right into the WHERE clause of queries such as the one I show here. I’m not going to get into that today, but I will close with a visual observation about why that is important.

yasn map, borrowed from data junk, valleywag blog
To state the obvious, there’ll always be multiple Web sites where people hang out and socialise. A friend sent me this link the other day; a world map of social networks (thumbnail version copied here). I can’t vouch for the science behind it, but it makes the point that we risk fragmenting Web communities on geographic boundaries if we don’t bridge the various IM and YASN networks. There are lots of ways this can be done, each with different implications for user experience, business model, cost and practicality. But it has to happen. And when it does, we’ll be wanting ways of asking questions against aggregations from across these sites…

OpenID plugin for WordPress

I’ve just installed Alan J Castonguay’s WordPress OpenID plugin on my blog, part of a cleanup that included nuking 11000+ comments in the moderation queue using the Spam Karma 2 plugin. Apologies if I zapped any real comments too. There are a few left, at least!

The OpenID thing appears to “just work”. By which I mean, I could log in via it and leave a comment. I’d be super-grateful if those of you with OpenIDs could take a minute to leave a comment on this post, to see if it works as well as it seems to. If it doesn’t, a bug report would be much appreciated. Those of you with LiveJournals or AOL/AIM accounts already have OpenID, even if you didn’t notice. See the HTML source for my homepage to see how I use my own URL as an OpenID while delegating the hard work to LiveJournal. For more on OpenID, check out these tutorial slides (flash/pdf) from Simon Willison and David Recordon.

Thinking about OpenID-mediated blog comments, the tempting thing then would be to do something with the accumulated URIs. The plugin keeps its data in nice SQL tables, which are presumably accessible to other WordPress plugins. It’s been a while since I made a WordPress plugin, but they seem to have a pretty good framework accessible to them now.

mysql> select user_id, url from wp_openid_identities;
| user_id | url                |
|      46 | |
1 row in set (0.28 sec)

At the moment, it’s just me. It’d be fun to try scooping up RDF (FOAF, SKOS, SIOC, feeds…) from any OpenID URIs that accumulate there. Hmm I even wrote up that project idea a while back – SparqlPress. At the time I tried prototyping it in Redland + PHP, but nowadays I’d probably use Benjamin Nowack’s ARC library, which provides SPARQL query of a MySQL-backed RDF store, and is written in PHP. This gives it the same dependencies as WordPress, making it ideal for pluginization. If anyone’s looking for a modest-sized practical SemWeb project to hack on, that one could be a lot of fun.

There’s a lot of interesting and creative fuss about “social networking” site interop around lately, largely thanks to the social graph paper from Brad Fitzpatrick and David Recordon. I lean towards the “show me, don’t tell me” approach regarding buddylists and suchlike (as does Julian Bond with Ecademy), which is why FOAF has only ever had the mild-mannered “knows” relationship in the core vocabulary, rather than trying to over-formalise “bestest friend EVER” and other teenisms. So what I like about this WordPress plugin is that it gives some evidence-based raw material for decentralised social networking apps. Blog comments don’t tell the whole story; nothing tells the whole story. But rather than maintain a FOAF “knows” list (or blogroll, or blog-reader config) by hand, I’d prefer to be able to partially automate it by querying information about whose blogs I’ve commented on, and vice-versa. There’s a lot that could be built, so intimidatingly much that it’s hard to know where to start. I suggest that everyone in the SemWeb scene having an OpenID with a FOAF file linked from it would be an interesting platform from which to start exploring…

Meanwhile, I’ll try generating an RDF blogroll from any URIs that show up in my OpenID WordPress table, so I can generate a planetplanet or chumpologica configuration automatically…

SPARQL for vocabulary management: theory vs practice

I’ve lately been thinking about whether the named graph support in SPARQL can help us evolve vocabularies and associated code (eg. generators and translators) in parallel, so that we know when the RDF generators are emitting markup that uses properties which aren’t yet documented in the ontology; or when the ontology contains terms that aren’t being used in any actual data.

Here is a quick example in terms of FOAF, in a form that should run directly with Jena’s ARQ commandline tool. The query shown here takes a handful of specified FOAF files and compares their property usage with information about those properties in the RDF/OWL description of the FOAF spec. Query copied below, followed by the results. This is a pretty simple query; there are lots of related ideas we might explore. What I’d like to figure out (help welcomed!) is quite how to check for properties in instance data that don’t have corresponding definitions in the ontology. I expect it has something to do with ‘unbound’, …
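In SPARQL itself, the usual idiom for “has no match” is an OPTIONAL pattern combined with a FILTER on !bound(...), so the ‘unbound’ hunch points in a plausible direction. Stripped of syntax, though, the check is just a pair of set differences. Here is a rough Python sketch of that, over plain (subject, predicate, object) tuples; the function name, the "foaf:" prefix test and the sample triples are all invented for illustration, and the list of property types is the one used in the FILTER of the query in this post.

```python
PROPERTY_TYPES = {
    "owl:ObjectProperty", "owl:DatatypeProperty", "rdf:Property",
    "owl:FunctionalProperty", "owl:InverseFunctionalProperty",
}


def vocabulary_gaps(schema_triples, data_triples, namespace):
    # Properties the ontology declares.
    declared = {s for s, p, o in schema_triples
                if p == "rdf:type" and o in PROPERTY_TYPES}
    # Properties from the same namespace that instance data actually uses.
    used = {p for s, p, o in data_triples if p.startswith(namespace)}
    return {
        "undocumented": used - declared,  # in the data, missing from the spec
        "unused": declared - used,        # in the spec, never seen in the data
    }


schema = [  # hypothetical schema fragment
    ("foaf:name", "rdf:type", "owl:DatatypeProperty"),
    ("foaf:plan", "rdf:type", "rdf:Property"),
]
data = [  # hypothetical instance data
    ("_:a", "foaf:name", "Alice"),
    ("_:a", "foaf:shoeSize", "42"),
    ("_:a", "dc:title", "Something else"),
]
gaps = vocabulary_gaps(schema, data, "foaf:")
```

Properties from other vocabularies are ignored on purpose here, since the question is only about terms that claim to be in the FOAF namespace.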

Anyway, some basic FOAF-related querying for now:

PREFIX foaf: <>
PREFIX vs: <>

# Experiment with using SPARQL to compare actual property usage
# with the property declarations in FOAF. This query asks which
# properties are in use, and gets their label from the FOAF spec.
# Q: how to do the contrary, and specify which deployed properties
# are not in the spec?

PREFIX danbri: <>
PREFIX libby: <>
PREFIX kanzaki: <>
PREFIX edd: <>
PREFIX inkel: <>
PREFIX mattb: <>

PREFIX rdf: <>
PREFIX owl: <>
PREFIX rdfs: <>

SELECT DISTINCT ?prop ?label ?graph ?status

FROM NAMED foaf:
FROM NAMED danbri:
FROM NAMED libby:
FROM NAMED kanzaki:
FROM NAMED edd:
FROM NAMED inkel:
FROM NAMED mattb:

WHERE {
GRAPH foaf: {
{ ?prop rdf:type ?t }
FILTER (?t = owl:ObjectProperty || ?t = owl:DatatypeProperty ||
?t = rdf:Property || ?t = owl:FunctionalProperty ||
?t = owl:InverseFunctionalProperty ) .
?prop rdfs:label ?label . ?prop rdfs:comment ?c .
?prop vs:term_status ?status .
}
OPTIONAL { GRAPH ?graph { ?x ?prop ?y . } }
FILTER ( ?graph != foaf: )
}
ORDER BY ?prop

This gives the following results using ARQ (sparql --query on the commandline):

| prop                   | label                                    | graph    | status     |
| foaf:aimChatID         | "AIM chat ID"                            | danbri:  | "testing"  |
| foaf:aimChatID         | "AIM chat ID"                            | edd:     | "testing"  |
| foaf:based_near        | "based near"                             | inkel:   | "unstable" |
| foaf:based_near        | "based near"                             | mattb:   | "unstable" |
| foaf:based_near        | "based near"                             | kanzaki: | "unstable" |
| foaf:currentProject    | "current project"                        | inkel:   | "testing"  |
| foaf:currentProject    | "current project"                        | libby:   | "testing"  |
| foaf:currentProject    | "current project"                        | kanzaki: | "testing"  |
| foaf:depiction         | "depiction"                              | inkel:   | "testing"  |
| foaf:depiction         | "depiction"                              | libby:   | "testing"  |
| foaf:depiction         | "depiction"                              | danbri:  | "testing"  |
| foaf:depiction         | "depiction"                              | mattb:   | "testing"  |
| foaf:depiction         | "depiction"                              | kanzaki: | "testing"  |
| foaf:depiction         | "depiction"                              | edd:     | "testing"  |
| foaf:depicts           | "depicts"                                | edd:     | "testing"  |
| foaf:family_name       | "family_name"                            | inkel:   | "testing"  |
| foaf:firstName         | "firstName"                              | inkel:   | "testing"  |
| foaf:gender            | "gender"                                 | kanzaki: | "testing"  |
| foaf:givenname         | "Given name"                             | inkel:   | "testing"  |
| foaf:holdsAccount      | "holds account"                          | kanzaki: | "unstable" |
| foaf:homepage          | "homepage"                               | inkel:   | "stable"   |
| foaf:homepage          | "homepage"                               | danbri:  | "stable"   |
| foaf:homepage          | "homepage"                               | mattb:   | "stable"   |
| foaf:homepage          | "homepage"                               | kanzaki: | "stable"   |
| foaf:homepage          | "homepage"                               | edd:     | "stable"   |
| foaf:icqChatID         | "ICQ chat ID"                            | inkel:   | "testing"  |
| foaf:img               | "image"                                  | danbri:  | "testing"  |
| foaf:img               | "image"                                  | mattb:   | "testing"  |
| foaf:img               | "image"                                  | kanzaki: | "testing"  |
| foaf:img               | "image"                                  | edd:     | "testing"  |
| foaf:interest          | "interest"                               | inkel:   | "testing"  |
| foaf:interest          | "interest"                               | libby:   | "testing"  |
| foaf:interest          | "interest"                               | kanzaki: | "testing"  |
| foaf:isPrimaryTopicOf  | "is primary topic of"                    | danbri:  | "testing"  |
| foaf:jabberID          | "jabber ID"                              | inkel:   | "testing"  |
| foaf:jabberID          | "jabber ID"                              | danbri:  | "testing"  |
| foaf:knows             | "knows"                                  | inkel:   | "testing"  |
| foaf:knows             | "knows"                                  | libby:   | "testing"  |
| foaf:knows             | "knows"                                  | danbri:  | "testing"  |
| foaf:knows             | "knows"                                  | mattb:   | "testing"  |
| foaf:knows             | "knows"                                  | kanzaki: | "testing"  |
| foaf:knows             | "knows"                                  | edd:     | "testing"  |
| foaf:made              | "made"                                   | danbri:  | "testing"  |
| foaf:made              | "made"                                   | kanzaki: | "testing"  |
| foaf:maker             | "maker"                                  | inkel:   | "testing"  |
| foaf:maker             | "maker"                                  | libby:   | "testing"  |
| foaf:maker             | "maker"                                  | kanzaki: | "testing"  |
| foaf:maker             | "maker"                                  | edd:     | "testing"  |
| foaf:mbox              | "personal mailbox"                       | inkel:   | "stable"   |
| foaf:mbox              | "personal mailbox"                       | libby:   | "stable"   |
| foaf:mbox              | "personal mailbox"                       | danbri:  | "stable"   |
| foaf:mbox              | "personal mailbox"                       | kanzaki: | "stable"   |
| foaf:mbox              | "personal mailbox"                       | edd:     | "stable"   |
| foaf:mbox_sha1sum      | "sha1sum of a personal mailbox URI name" | inkel:   | "testing"  |
| foaf:mbox_sha1sum      | "sha1sum of a personal mailbox URI name" | libby:   | "testing"  |
| foaf:mbox_sha1sum      | "sha1sum of a personal mailbox URI name" | danbri:  | "testing"  |
| foaf:mbox_sha1sum      | "sha1sum of a personal mailbox URI name" | mattb:   | "testing"  |
| foaf:mbox_sha1sum      | "sha1sum of a personal mailbox URI name" | kanzaki: | "testing"  |
| foaf:mbox_sha1sum      | "sha1sum of a personal mailbox URI name" | edd:     | "testing"  |
| foaf:msnChatID         | "MSN chat ID"                            | inkel:   | "testing"  |
| foaf:msnChatID         | "MSN chat ID"                            | danbri:  | "testing"  |
| foaf:myersBriggs       | "myersBriggs"                            | danbri:  | "testing"  |
| foaf:myersBriggs       | "myersBriggs"                            | mattb:   | "testing"  |
| foaf:myersBriggs       | "myersBriggs"                            | kanzaki: | "testing"  |
| foaf:myersBriggs       | "myersBriggs"                            | edd:     | "testing"  |
| foaf:name              | "name"                                   | inkel:   | "testing"  |
| foaf:name              | "name"                                   | libby:   | "testing"  |
| foaf:name              | "name"                                   | danbri:  | "testing"  |
| foaf:name              | "name"                                   | mattb:   | "testing"  |
| foaf:name              | "name"                                   | kanzaki: | "testing"  |
| foaf:name              | "name"                                   | edd:     | "testing"  |
| foaf:nick              | "nickname"                               | inkel:   | "testing"  |
| foaf:nick              | "nickname"                               | libby:   | "testing"  |
| foaf:nick              | "nickname"                               | danbri:  | "testing"  |
| foaf:nick              | "nickname"                               | mattb:   | "testing"  |
| foaf:nick              | "nickname"                               | kanzaki: | "testing"  |
| foaf:nick              | "nickname"                               | edd:     | "testing"  |
| foaf:pastProject       | "past project"                           | inkel:   | "testing"  |
| foaf:pastProject       | "past project"                           | kanzaki: | "testing"  |
| foaf:plan              | "plan"                                   | danbri:  | "testing"  |
| foaf:plan              | "plan"                                   | kanzaki: | "testing"  |
| foaf:primaryTopic      | "primary topic"                          | inkel:   | "testing"  |
| foaf:primaryTopic      | "primary topic"                          | kanzaki: | "testing"  |
| foaf:primaryTopic      | "primary topic"                          | edd:     | "testing"  |
| foaf:publications      | "publications"                           | kanzaki: | "unstable" |
| foaf:schoolHomepage    | "schoolHomepage"                         | danbri:  | "testing"  |
| foaf:schoolHomepage    | "schoolHomepage"                         | kanzaki: | "testing"  |
| foaf:schoolHomepage    | "schoolHomepage"                         | edd:     | "testing"  |
| foaf:surname           | "Surname"                                | inkel:   | "testing"  |
| foaf:thumbnail         | "thumbnail"                              | danbri:  | "testing"  |
| foaf:title             | "title"                                  | kanzaki: | "testing"  |
| foaf:weblog            | "weblog"                                 | inkel:   | "testing"  |
| foaf:weblog            | "weblog"                                 | libby:   | "testing"  |
| foaf:weblog            | "weblog"                                 | mattb:   | "testing"  |
| foaf:weblog            | "weblog"                                 | kanzaki: | "testing"  |
| foaf:weblog            | "weblog"                                 | edd:     | "testing"  |
| foaf:workplaceHomepage | "workplace homepage"                     | inkel:   | "testing"  |
| foaf:workplaceHomepage | "workplace homepage"                     | libby:   | "testing"  |
| foaf:workplaceHomepage | "workplace homepage"                     | danbri:  | "testing"  |
| foaf:workplaceHomepage | "workplace homepage"                     | mattb:   | "testing"  |
| foaf:workplaceHomepage | "workplace homepage"                     | edd:     | "testing"  |
| foaf:yahooChatID       | "Yahoo chat ID"                          | inkel:   | "testing"  |


Ian Davis shows how even the smallest RDF graph has multiple XML serializations. He missed some variations: (i) rdf:RDF is optional, (ii) rdf:type has special-case treatment in the grammar, and (iii) XML base can interact with URIs.

<foaf:homepage rdf:resource="~eve"/>

Simple RDF graph describing Eve
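
To make point (iii) concrete, here is a rough Python sketch of how a relative reference like the "~eve" in the fragment above resolves differently depending on the XML base in effect. The base URIs are made up for illustration (they are not from Ian's example), and urljoin is standing in for what an RDF/XML parser would do with xml:base.

```python
from urllib.parse import urljoin

# The relative rdf:resource value from the fragment above.
relative = "~eve"

# Hypothetical xml:base values, chosen purely for illustration.
bases = [
    "http://example.org/",
    "http://example.org/people/",
    "http://example.com/home/index.rdf",
]

# Resolve the same relative reference against each base.
resolved = [urljoin(base, relative) for base in bases]

for base, uri in zip(bases, resolved):
    print(base, "->", uri)
```

Same attribute value, three different homepage URIs, so the markup alone doesn't tell you which graph you have until you know the base.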

This variety should not be surprising. A good question to ask here is how much we’d gain from dropping the more esoteric syntactic variations. Imagine for example a syntactic profile of RDF/XML in which rdf:RDF was never needed, rdf:type and literals were never represented as attributes, node elements always (or never!) carried a type, and rdf:nodeID was only used when absolutely (how to define this?) necessary. I still suspect that there would be plenty of variation, because the fundamental practice wouldn’t change: we’d be representing unordered graph data over an ordered tree.
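
The "unordered graph over an ordered tree" point can be sketched in a few lines of Python. This is a toy, not RDF/XML: the element names (graph, statement), the "about" attribute and the triples themselves are all invented for illustration. It only shows that once a set of triples is written into a tree, some order has to be picked, and different picks give different documents for the same graph.

```python
import xml.etree.ElementTree as ET

# A tiny graph as an unordered set of (subject, predicate, object) triples.
# The URIs and values are made up for illustration.
graph = {
    ("http://example.org/~eve", "name", "Eve"),
    ("http://example.org/~eve", "nick", "eve"),
    ("http://example.org/~eve", "homepage", "http://example.org/~eve/"),
}

def to_tree(triples):
    """Write triples into a toy XML tree (not real RDF/XML), in the order given."""
    root = ET.Element("graph")
    for s, p, o in triples:
        stmt = ET.SubElement(root, "statement", {"about": s})
        ET.SubElement(stmt, p).text = o
    return ET.tostring(root, encoding="unicode")

def from_tree(xml_text):
    """Read the toy XML back into a set of triples, discarding document order."""
    root = ET.fromstring(xml_text)
    return {(stmt.get("about"), child.tag, child.text)
            for stmt in root for child in stmt}

# Two serializations of the same graph, differing only in statement order.
doc_a = to_tree(sorted(graph))
doc_b = to_tree(sorted(graph, reverse=True))

print(doc_a == doc_b)                        # compares the documents as text
print(from_tree(doc_a) == from_tree(doc_b))  # compares the graphs they carry
```

The two documents differ as strings, yet both read back to the same set of triples. A tighter syntactic profile would trim some of the variation, but the ordering choice never goes away.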

My take: custom syntactic profiles, ones designed for some particular community and purpose, have a role. We can define them as RDF/XML subsets (in Relax-NG, Schematron or prose), or as non-RDF XML formats, transformable with GRDDL. Either approach makes life somewhat easier for those working with XML tools, at some cost to those in an RDF environment. But we should also stop looking over our shoulder at XML. RDF/XML is painful for XML developers because they find themselves lacking familiar tools when working with RDF.

This is not because of the particular charm of those tools, but because they exist. If the RDF programming environment were anywhere near as rich with tools as XML’s, this would not be such an issue. Developers are pragmatic, and will use what is available. If RDF tools feel less mature than XML tools, developers will naturally complain if their data formats force them to use only RDF tools. SPARQL is an important step here, one that might be accompanied by lightweight API standardisation and other steps to lessen the pain of those moving on from pure-XML toolsets.

I’d much rather see folk work on RDF tools and APIs (how about a SPARQL engine in .js or PHP?) than this endless navel-gazing on RDF’s XML syntax. This is not to downplay the excellent work that’s already out there (Redland, Jena, etc.), just to note that we’ve still some catching up to do with the XML world. We don’t yet have API portability between tools, beyond that offered by the (still draft) SPARQL spec. It’s that sort of portability that inspires confidence in developers, and that will give them the sense that maybe (just maybe…) they could work with a non-XML toolset. Given a choice between worrying about the flexibility of the RDF/XML syntax versus helping test and document SPARQL support in some RDF toolkit, I know how I’d rather be spending my time…