Eve’n’more

Ian Davis shows how even the smallest RDF graph has multiple XML serializations. He missed some variations: (i) rdf:RDF is optional, (ii) rdf:type has special-case treatment in the grammar, and (iii) xml:base can interact with URIs.


<rdf:Description
    xml:base="http://example.org/"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    foaf:name="Eve"
    rdf:type="http://xmlns.com/foaf/0.1/Person">
  <foaf:homepage rdf:resource="~eve"/>
</rdf:Description>

Simple RDF graph describing Eve

This variety should not be surprising. A good question to ask here is how much we’d gain from dropping the more esoteric syntactic variations. Imagine for example a syntactic profile of RDF/XML in which rdf:RDF was never needed, rdf:type and literals were never represented as attributes, node elements always (or never!) carried a type, and rdf:nodeID was only used when absolutely (how to define this?) necessary. I still suspect that there would be plenty of variation, because the fundamental practice wouldn’t change: we’d still be representing unordered graph data over an ordered tree.
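To make the "same graph, different tree" point concrete, here is a minimal Python sketch. It is a toy, not an RDF/XML parser: it assumes a single top-level node element, one blank-node subject (the label `_:eve` is made up for illustration), and only the few constructs used in the example above. The second serialization, VARIANT_B, is my own rewrite of the Eve graph and is not taken from Ian's post.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML = "http://www.w3.org/XML/1998/namespace"
FOAF = "http://xmlns.com/foaf/0.1/"

# Variant A: the serialization shown above (no rdf:RDF, attributes, xml:base).
VARIANT_A = """<rdf:Description
  xml:base="http://example.org/"
  xmlns:foaf="http://xmlns.com/foaf/0.1/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  foaf:name="Eve"
  rdf:type="http://xmlns.com/foaf/0.1/Person">
  <foaf:homepage rdf:resource="~eve"/>
</rdf:Description>"""

# Variant B (illustrative rewrite): typed node element, literal as a child
# element, absolute homepage URI instead of xml:base plus a relative one.
VARIANT_B = """<foaf:Person
  xmlns:foaf="http://xmlns.com/foaf/0.1/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <foaf:name>Eve</foaf:name>
  <foaf:homepage rdf:resource="http://example.org/~eve"/>
</foaf:Person>"""


def uri_of(qname):
    """Turn ElementTree's '{namespace}local' form into a plain URI string."""
    return qname[1:].replace("}", "", 1)


def triples(xml_text):
    """Extract triples from ONE top-level node element (toy subset only)."""
    node = ET.fromstring(xml_text)
    base = node.get(f"{{{XML}}}base", "")
    subject = "_:eve"  # hypothetical label for the single blank node
    found = set()
    # A typed node element such as <foaf:Person> implies an rdf:type triple.
    if node.tag != f"{{{RDF}}}Description":
        found.add((subject, RDF + "type", uri_of(node.tag)))
    for name, value in node.attrib.items():
        if name == f"{{{XML}}}base":
            continue
        if name == f"{{{RDF}}}type":
            # Special case: the attribute value is a URI, not a literal.
            found.add((subject, RDF + "type", urljoin(base, value)))
        else:
            # Any other property attribute is a plain literal.
            found.add((subject, uri_of(name), f'"{value}"'))
    for child in node:
        resource = child.get(f"{{{RDF}}}resource")
        if resource is not None:
            # Relative URIs are resolved against xml:base when one is present.
            found.add((subject, uri_of(child.tag), urljoin(base, resource)))
        else:
            found.add((subject, uri_of(child.tag), f'"{child.text}"'))
    return found


print(triples(VARIANT_A) == triples(VARIANT_B))
```

The two documents differ as trees, yet the extracted sets of triples compare equal, which is why comparing RDF/XML at the XML level tells you little about the graph.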

My take: custom syntactic profiles, ones designed for some particular community and purpose, have a role. We can define them as RDF/XML subsets (in Relax-NG, Schematron or prose), or as non-RDF XML formats, transformable with GRDDL. Either approach makes life somewhat easier for those working with XML tools, at some cost to those in an RDF environment. But we should also stop looking over our shoulder at XML. RDF/XML is painful for XML developers because they find themselves lacking familiar tools when working with RDF.

This is not because of the particular charm of those tools, but because they exist. If the RDF programming environment were anywhere near as rich with tools as XML’s, this would not be such an issue. Developers are pragmatic, and will use what is available. If RDF tools feel less mature than XML tools, developers will naturally complain if their data formats force them to use only RDF tools. SPARQL matters here, and it might be accompanied by lightweight API standardisation and other steps to lessen the pain of those moving on from pure-XML toolsets.

I’d much rather see folk work on RDF tools and APIs (how about a SPARQL engine in .js or PHP?) than this endless navel-gazing on RDF’s XML syntax. This is not to downplay the excellent work that’s already out there (Redland, Jena, etc.), just to note that we’ve still some catching up to do with the XML world. We don’t yet have API portability between tools, beyond that offered by the (still draft) SPARQL spec. It’s that sort of thing that inspires confidence in developers, and that will give them the sense that maybe (just maybe…) they could work with a non-XML toolset. Given a choice between worrying about the flexibility of the RDF/XML syntax versus helping test and document SPARQL support in some RDF toolkit, I know how I’d rather be spending my time…

RDFAuthor does SPARQL

Damian is working on a new version of RDFAuthor that generates SPARQL queries (instead of the older Squish notation). It can also get results from a query service (I’m not sure over which protocol or protocols). Here’s a screenshot:

Example screenshot of RDFAuthor for query authoring and result display. Nodes and arcs at top of window, tabular results at bottom

The automatically generated (and hence not 100% user-friendly) query was as follows:

SELECT *
 WHERE
 {
      ?var_4 <http://xmlns.filsa.net/emir/0.2/#subject> ?var_1 .
      ?var_4 <http://xmlns.filsa.net/emir/0.2/#from> ?var_5 .
      ?var_4 <http://xmlns.filsa.net/emir/0.2/#date> ?var_2 .
      ?var_5 <http://xmlns.com/foaf/0.1/name> ?var_3 .
      ?var_5 <http://xmlns.com/foaf/0.1/mbox> <mailto:pldms@mac.com> .
 }

What does this say? Pretty much just the following:

Match where we have (in some default RDF graph) the foaf:name of any things with a foaf:mbox of mailto:pldms@mac.com,
whenever there is something that has an emir:subject and emir:date, and that is also emir:from the first thing. SELECT from that the name, subject and date.

This is based on a template graph whose structure can be seen in the screenshot, decorated with yellow variable markers where a node is marked as unknown. Actually I screwed up after the query executed and subsequently marked the mailbox node as unknown too, before taking the screenshot; if that query had been executed, the resultset would have been much larger. The target database is a Joseki service that has an RDF version of the jena-dev mail archives.
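For anyone wondering what a query like this asks of the store, here is a rough Python sketch of naive pattern matching over a list of triples. It is not how Joseki or any real SPARQL engine works, and it ignores the difference between URIs and literals. The triples in DATA (names, subjects, dates and the second mailbox) are invented purely for illustration and are not from the jena-dev archive; `?var_6` is likewise a made-up variable name.

```python
EMIR = "http://xmlns.filsa.net/emir/0.2/#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Invented sample data: two messages, each from a different sender.
DATA = [
    ("_:msg1", EMIR + "subject", "Hypothetical subject A"),
    ("_:msg1", EMIR + "from", "_:p1"),
    ("_:msg1", EMIR + "date", "2005-01-01"),
    ("_:p1", FOAF + "name", "Example Sender"),
    ("_:p1", FOAF + "mbox", "mailto:pldms@mac.com"),
    ("_:msg2", EMIR + "subject", "Hypothetical subject B"),
    ("_:msg2", EMIR + "from", "_:p2"),
    ("_:msg2", EMIR + "date", "2005-01-02"),
    ("_:p2", FOAF + "name", "Someone Else"),
    ("_:p2", FOAF + "mbox", "mailto:other@example.org"),
]


def solve(patterns, data, binding=None):
    """Yield one dict of variable bindings per way of matching all patterns."""
    binding = {} if binding is None else binding
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in data:
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind a new variable, or check it against its earlier binding.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break  # a constant in the pattern must match exactly
        else:
            # All three positions matched; carry on with the other patterns.
            yield from solve(rest, data, candidate)


# The generated query, one tuple per triple pattern.
QUERY = [
    ("?var_4", EMIR + "subject", "?var_1"),
    ("?var_4", EMIR + "from", "?var_5"),
    ("?var_4", EMIR + "date", "?var_2"),
    ("?var_5", FOAF + "name", "?var_3"),
    ("?var_5", FOAF + "mbox", "mailto:pldms@mac.com"),
]
results = list(solve(QUERY, DATA))

# The same template with the mailbox node marked as unknown.
WIDER = QUERY[:-1] + [("?var_5", FOAF + "mbox", "?var_6")]
wider_results = list(solve(WIDER, DATA))

print(len(results), len(wider_results))
```

With the mailbox fixed, only the message from that sender matches; once the mailbox becomes a variable, every message with a named sender matches, which is why the result set grows.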

A more readable version of the query (untested) might be:

PREFIX emir: <http://xmlns.filsa.net/emir/0.2/#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name ?date ?sub

WHERE
{
  [
    emir:subject ?sub ;
    emir:date ?date ;
    emir:from
      [
        foaf:name ?name ;
        foaf:mbox <mailto:pldms@mac.com>
      ]
  ]
}

Religious Technology (What’s in a link?)

I’ve just rediscovered this, while tidying: a letter I received last year, from Hodkin And Company, Solicitors:

We understand that you operate, control or manage a website on which one Mr Damien Steer has placed literally hundreds of pages of our clients’ copyrighted works without the authorization of our clients. Because of the enormity and volume of the infringements, we have broken these down under separate headings [...]

Damian removed the documents immediately, publishing a scan of (his version of) the letter in their place. See also Karin Spaink’s pages for more on the Scientology materials concerned. I’ve not followed the twists and turns of the whole thing, but here’s a note from Karin’s page:

This homepage is approved of by court. Twice, by now. It has thereby become the world’s first legal Fishman Homepage. Read the ruling of the February 1996 lawsuit, summary proceedings, in either English or Dutch. On June 10, 1999, there was a second ruling, this time in full procedure: my page can still stay up. Read the ruling in Dutch or in English. Scientology has appealed this ruling. It is not yet known when pleas will be held.

Although I won, there’s one thing that seriously bugs me, and other people. The court ruled that hyperlinks and url’s refering to pages that contain infringing material must in themselves be considered to be infringing. That cuts at the heart of the net. To name one example: it makes search engines illegal: they often refer to pages that contain infringing material.

More from the letter I received…

You should be aware that numerous permanent injunctions and awards of statutory damages and attorney’s fees have been entered regarding similar infringements. For instance, a jury in the United States District Court in San Jose, California awarded statutory damages in the amount of $75,000 against a Mr. Henson for posting only one of the NOTs works on the Internet.

Maybe I’m in the wrong business?