Remembering Aaron Swartz

“One of the things the Web teaches us is that everything is connected (hyperlinks) and we all should work together (standards). Too often school teaches us that everything is separate (many different ‘subjects’) and that we should all work alone.” –Aaron Swartz, April 2001.

So Aaron is gone. We were friends a decade ago, and drifted out of touch; I thought we’d cross paths again, but, well, no.

Update: MIT’s report is published.

I’ll remember him always as the bright kid who showed up in the early data sharing Web communities around RSS, FOAF and W3C’s RDF, a dozen years ago:

"Hello everyone, I'm Aaron. I'm not _that_ much of a coder, (and I don't know
much Perl) but I do think what you're doing is pretty cool, so I thought I'd
hang out here and follow along (and probably pester a bit)."

Aaron was from the beginning a powerful combination of smart, creative, collaborative and idealistic, and was drawn to groups of developers and activists who shared his passion for what the Web could become. He joined and helped the RSS 1.0 and W3C RDF groups, and more often than not the difference in years didn’t make a difference. I’ve seen far more childishness from adults in the standards scene than I ever saw from young Aaron. TimBL has it right: “we have lost one of our own”. He was something special that ‘child genius’ doesn’t come close to capturing. Aaron was a regular in the early ‘24×7 hack-and-chat’ RDF IRC scene, and it’s fitting that the first lines logged in that group’s archives are from him.

I can’t help but picture an alternate and fairer universe in which Aaron made it through and got to be the cranky old geezer at conferences in the distant shiny future. He’d have made a great William Loughborough; a mutual friend and collaborator with whom he shared a tireless impatience at the pace of progress, the need to ask ‘when?’, to always Demand Progress.

I’ve been reading old IRC chat logs from 2001. Within months of his ‘I’m not _that_ much of a coder’ Aaron was writing Python code for accessing experimental RDF query services (and teaching me how to do it, disclaiming credit, ‘However you like is fine… I don’t really care.’). He was writing rules in TimBL’s experimental logic language N3, applying this to modelling corporate ownership structures rather than as an academic exercise, and as ever sharing what he knew by writing about his work in the Web. Reading some old chats, we talked about the difficulties of distributed collaboration, debate and disagreement, personalities and their clashes, working groups, and the Web.

I thought about sharing some of that, but I’d rather just share him as I choose to remember him:

22:16:58 <AaronSw> LOL

Embedding queries in RDF – FOAF Group example

Is this crazy or useful? Am not sure yet.

This example uses FOAF vocabulary for groups and openid. So the basic structure here is that Agents (including persons) can have an :openid and can be a :member of a :Group.

From an openid-augmented WordPress, we get a list of all the openids my blog knows about. From an openid-augmented MediaWiki, we get a list of all the openids that contribute to the FOAF project wiki. I dumped each into a basic RDF file (not currently an automated process). But the point here is to explore enumerated groups using queries.

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://xmlns.com/foaf/0.1/">
<Group rdf:about='#both'>
<!-- enumerated membership -->
<member><Agent><openid rdf:resource='http://danbri.org/'/></Agent></member>
<member><Agent><openid rdf:resource='http://tommorris.org/'/></Agent></member>
<member><Agent><openid rdf:resource='http://kidehen.idehen.net/dataspace/person/kidehen'/></Agent></member>
<member><Agent><openid rdf:resource='http://www.wasab.dk/morten/'/></Agent></member>
<member><Agent><openid rdf:resource='http://kronkltd.net/'/></Agent></member>
<member><Agent><openid rdf:resource='http://www.kanzaki.com/'/></Agent></member>

<!-- rule-based membership -->

<constructor><![CDATA[
PREFIX : <http://xmlns.com/foaf/0.1/>
CONSTRUCT {
<http://danbri.org/yasns/danbri/both.rdf#thegroup> a :Group; :member [ a :Agent; :openid ?id ]
}
WHERE {
GRAPH <http://wiki.foaf-project.org/_users.rdf> { [ a :Group; :member [ a :Agent; :openid ?id ] ] }
GRAPH <http://danbri.org/yasns/danbri/_group.rdf> { [ a :Group; :member [ a :Agent; :openid ?id ] ] }
}
]]></constructor>
</Group>
</rdf:RDF>

This RDF description does it both ways. It enumerates (for simple clients) a list of members of a group whose members are those individuals that are both commentators on my blog, and contributors to the FOAF wiki. At least, to the extent they’re detectable via common use of OpenID URIs. But the RDF group description also embeds a SPARQL query, the kind which generates RDF rather than an SQL-like resultset. The RDF essentially regenerates the enumerated list, assuming the query is run against an RDF dataset with the data graphs appropriately populated.
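To make the "both ways" idea a little more concrete, here is a minimal Python sketch using only the standard library. It is not a SPARQL engine and not how the query would really be evaluated: it just pulls the enumerated OpenIDs and the embedded query text out of a cut-down description shaped like the one above, then recomputes the membership as a plain set intersection. The function names, the abbreviated query and the two sample OpenID lists are all made up for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"

# Cut-down stand-in for the group description above: two enumerated
# members and an abbreviated version of the embedded query.
GROUP_DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://xmlns.com/foaf/0.1/">
  <Group rdf:about="#both">
    <member><Agent><openid rdf:resource="http://danbri.org/"/></Agent></member>
    <member><Agent><openid rdf:resource="http://tommorris.org/"/></Agent></member>
    <constructor><![CDATA[
PREFIX : <http://xmlns.com/foaf/0.1/>
CONSTRUCT { <#both> :member [ :openid ?id ] }
WHERE {
  GRAPH <http://wiki.foaf-project.org/_users.rdf> { [ :member [ :openid ?id ] ] }
  GRAPH <http://danbri.org/yasns/danbri/_group.rdf> { [ :member [ :openid ?id ] ] }
}
    ]]></constructor>
  </Group>
</rdf:RDF>"""

# Invented sample data standing in for the two source graphs.
WIKI_OPENIDS = {"http://danbri.org/", "http://tommorris.org/",
                "http://wiki-only.example.org/"}
BLOG_OPENIDS = {"http://danbri.org/", "http://tommorris.org/",
                "http://blog-only.example.org/"}


def enumerated_openids(rdf_xml):
    """Return the set of OpenID URIs listed explicitly in the description."""
    root = ET.fromstring(rdf_xml)
    return {el.get(f"{{{RDF_NS}}}resource")
            for el in root.iter(f"{{{FOAF_NS}}}openid")}


def embedded_queries(rdf_xml):
    """Return the text of each embedded constructor query."""
    root = ET.fromstring(rdf_xml)
    return [el.text.strip()
            for el in root.iter(f"{{{FOAF_NS}}}constructor") if el.text]


def members_of_both(wiki_ids, blog_ids):
    """What the rule amounts to here: OpenIDs present in both sources."""
    return set(wiki_ids) & set(blog_ids)


if __name__ == "__main__":
    print(sorted(enumerated_openids(GROUP_DOC)))
    print(sorted(members_of_both(WIKI_OPENIDS, BLOG_OPENIDS)))
```

A simple client would only ever call something like `enumerated_openids`; a client with a SPARQL store to hand could instead pass the text from `embedded_queries` to its query engine and regenerate the list.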

Now I sorta like this, and I sorta don’t. It may be incredibly powerful, or it may be a bit too clever for its own good.

Certainly there’s scope overlap with the W3C RIF rules work, and with the capabilities of OWL. FOAF has long contained an experimental method for using OWL to do something similar, but it hasn’t found traction. The motivation I have for trying SPARQL here is that it has built-in machinery for talking about the provenance of data; so I could write a group description this way that says “members are anyone listed as a colleague in http://myworkplace.example.com/stafflist.rdf”. Or I could mix in arbitrary descriptive vocabularies; family tree stuff, XFN, language abilities (speaks-reads-writes) etc.

Where I think this could fall down is in the complexity of the workflow. The queries need executing against some SPARQL installation with a configured dataset, and the query lists URIs of data graphs. But I doubt database admins will want to randomly load any/every RDF file mentioned in these shared queries. Perhaps something like SparqlPress, attached to one’s weblog, and social filters to load only files in queries eg. from friends? Also, authoring these kinds of query isn’t something non-geek users are going to do often, and the sorts of queries that will work will depend of course on the data actually available. Sure I could write a query based on matching the openids of former colleagues, but the group will be empty unless the data listing people as former colleagues is actually out there and in the Web, and written in the terms anticipated by the query.
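One way to picture the "social filter" idea is a pre-flight check that lists the data graphs a shared query names, and keeps only those hosted somewhere you already trust. The Python sketch below is purely illustrative: the function names and the trusted-host set are hypothetical, and the regular expression only catches literal `GRAPH <uri>` clauses, not variables, prefixed names, or text inside comments and strings, so a real implementation would want a proper SPARQL parser.

```python
import re
from urllib.parse import urlparse

# Matches literal "GRAPH <uri>" clauses only (an assumption of this sketch).
GRAPH_URI = re.compile(r"GRAPH\s*<([^<>\s]+)>", re.IGNORECASE)


def graphs_mentioned(query):
    """Return the graph URIs named in GRAPH clauses, in order, without repeats."""
    seen = []
    for uri in GRAPH_URI.findall(query):
        if uri not in seen:
            seen.append(uri)
    return seen


def graphs_to_load(query, trusted_hosts):
    """Keep only the mentioned graphs whose host is in the trusted set."""
    return [uri for uri in graphs_mentioned(query)
            if urlparse(uri).hostname in trusted_hosts]


# Hypothetical shared query: two graphs from the example, one from a stranger.
SHARED_QUERY = """
PREFIX : <http://xmlns.com/foaf/0.1/>
CONSTRUCT { <#g> :member [ :openid ?id ] }
WHERE {
  GRAPH <http://wiki.foaf-project.org/_users.rdf> { [ :member [ :openid ?id ] ] }
  GRAPH <http://danbri.org/yasns/danbri/_group.rdf> { [ :member [ :openid ?id ] ] }
  GRAPH <http://stranger.example.com/everything.rdf> { [ :member [ :openid ?id ] ] }
}
"""

# Hypothetical whitelist, e.g. derived from a friends list.
TRUSTED_HOSTS = {"wiki.foaf-project.org", "danbri.org"}

if __name__ == "__main__":
    for uri in graphs_to_load(SHARED_QUERY, TRUSTED_HOSTS):
        print("would load", uri)
```

Filtering by host is only one possible policy; the same hook could just as well consult a list of friends' documents.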

On the other hand, this mechanism does appeal, and could go way beyond FOAF group definitions. We could see a model where people post data in the Web but also post queries, eg. revisiting the old work Libby and I explored around RSS query. On the other other hand, who wants to make their Web queries public? All that said, the same goes for the data being queried. And since this technique embeds queries inside ordinary RDF data, however we deal with the data visibility issue for RDF/FOAF should also work for the query stuff. Perhaps. Can’t blame me for trying…

I realise this isn’t the clearest of explanations. Let’s try again:

RDF is normally for publishing collections of simple claims about the world. This is an experiment in embedding data-generating-queries amongst these claims, where the query is configured to output more RDF claims (aka statements, triples etc), but only when executed against some appropriate body of RDF data. Since the query is written in SPARQL, it allows the data-generation rules to mention interesting things, such as properties of the source of the data being queried.

This particular experiment is couched in terms of FOAF’s “Group” construct, but the technique is entirely general. The example above defines a group of agents called the “both” group, by saying that an Agent is in that group if its OpenID URI is listed in each of two RDF documents specified, ie. both a commentator on my blog, and a contributor to the FOAF Wiki. Other examples could be “(fe)male employees” or “family members sharing a blood type” or in fact, any descriptive pattern that can match against the data to hand and be expressed in SPARQL.

planetplanet foafrolls

The PlanetPlanet feed reader (and the Venus variant) exposes its blogroll via RDF/FOAF, typically at “/foafroll.xml” URIs. I ran through the list of Planet installations from the main site, and found the following, which might be interesting for experimentation, crawling, whitelist work etc. Or you could just make a giant feedlist and install Venus yourself, composing your own meta selection from the feeds described in these files.


http://widgetarians.org/foafroll.xml
http://www.planetapache.org/foafroll.xml
http://www.beclan.org/aggregator/foafroll.xml
http://planet.classpath.org/foafroll.xml
http://www.debian.org.hk/planet/foafroll.xml
http://planet.hellug.gr/foafroll.xml
http://planet.freedesktop.org/foafroll.xml
http://planet.humbug.org.au/foafroll.xml
http://planet.gnome-ev.de/foafroll.xml
http://gstreamer.freedesktop.org/planet/foafroll.xml
http://planet.jabber.org/foafroll.xml
http://planet.mozilla.org/foafroll.xml
http://planet.foss.org.my/foafroll.xml
http://planet.go-oo.org/foafroll.xml
http://planet.perl.org/foafroll.xml
http://www.planetpython.org/foafroll.xml
http://planet.slug.org.au/foafroll.xml
http://planetsun.org/foafroll.xml
http://www.planetsuse.org/foafroll.xml
http://planet.twistedmatrix.com/foafroll.xml
http://advocacydev.org/blogs/foafroll.xml
http://planet.arslinux.com/foafroll.xml
http://fossplanet.com/foafroll.xml
http://indyblogs.protest.net/foafroll.xml
http://www.cs.princeton.edu/~mp/malayalam/blogs/foafroll.xml
http://planet.mozillazine.org/foafroll.xml
http://planetjava.org/foafroll.xml # bad xml
http://planetkde.org/foafroll.xml
http://www.planet-im.com/foafroll.xml # no feed urls
http://planet.linux.net.mk/foafroll.xml
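For anyone wanting to build that giant feedlist, here is a rough Python sketch of pulling feed URLs out of one of these files with the standard library. The sample document is hand-written and only my assumption of the usual foafroll shape (a `foaf:Group` of `foaf:Agent` members, each with a weblog whose `rdfs:seeAlso` points at an `rss:channel`); real files vary, and as noted above some are not well-formed or carry no feed URLs at all. The names `feeds_in_foafroll` and `SAMPLE_FOAFROLL` are made up for the example.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rss": "http://purl.org/rss/1.0/",
}

# Hand-written sample in the assumed foafroll shape; not copied from any site.
SAMPLE_FOAFROLL = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:rss="http://purl.org/rss/1.0/">
  <foaf:Group>
    <foaf:name>Planet Example</foaf:name>
    <foaf:member>
      <foaf:Agent>
        <foaf:name>Alice</foaf:name>
        <foaf:weblog>
          <foaf:Document rdf:about="http://alice.example.org/">
            <rdfs:seeAlso>
              <rss:channel rdf:about="http://alice.example.org/feed.rss"/>
            </rdfs:seeAlso>
          </foaf:Document>
        </foaf:weblog>
      </foaf:Agent>
    </foaf:member>
    <foaf:member>
      <foaf:Agent>
        <foaf:name>Bob</foaf:name>
      </foaf:Agent>
    </foaf:member>
  </foaf:Group>
</rdf:RDF>"""


def feeds_in_foafroll(xml_text):
    """Return (member name, feed URL) pairs found in a foafroll document.

    Raises xml.etree.ElementTree.ParseError if the XML is not well-formed.
    Members with no rss:channel simply contribute nothing.
    """
    root = ET.fromstring(xml_text)
    about = "{%s}about" % NS["rdf"]
    pairs = []
    for agent in root.iter("{%s}Agent" % NS["foaf"]):
        name = agent.findtext("foaf:name", default="", namespaces=NS)
        for channel in agent.iter("{%s}channel" % NS["rss"]):
            url = channel.get(about)
            if url:
                pairs.append((name, url))
    return pairs


if __name__ == "__main__":
    for name, url in feeds_in_foafroll(SAMPLE_FOAFROLL):
        print(name, url)
```

This treats the file as plain XML rather than as RDF, which is brittle; a proper RDF parser would cope better with the same data serialised differently.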

IM/RSS bot – BBC Persian News Flash

OK this is old news, but pretty cool so I’m happy to write it up belatedly.

I just logged into MSN chat, and was greeted by Mario Menti’s IM bot, which provides a text-chat UI for navigating the BBC’s news feeds from their Persian service. I’m pasting the output here, hoping it’ll display reasonably. I can’t read a word of it of course, but I remember Ian Forrester’s XTech talk a few years back about the headaches of getting I18N right for such feeds (and the varying performance of newsreader clients with right-to-left and mixed-direction text). This hack came out of a conversation with Mario and Ian around the BBC Backstage scene and, judging by comments from a couple of friends in Tehran, this sort of technology direction is much appreciated by those whose news access is restricted. The bot is called bbcpersian at hotmail.co.uk, and seems to still be running 18 months later. See also some more recent hacks from Mario that wire up BBC feeds to Twitter.

BBC Persian News Flash says: (23:01:02)

Hi, this is your hourly BBCPersian.com news flash with the 10 most recent new items
1 افزایش نیروها در عراق ‘درحال نتیجه دادن است’
2 انتقاد شدید کروبی از ‘مخالفان احزاب’
3 نواز شریف از پاکستان اخراج شد
4 بازداشت یکی از ‘قاچاقچیان بزرگ’ کلمبیا
5 ترکیه: کشورهای منطقه از اقدامات تنش زا دوری کنند
6 ‘عاشقان قلندر’ جشنواره ای دیگر برپا کردند
7 کاهش ساعت کار ادارات دولتی ایران در ماه رمضان
8 ‘عراقیها احساس امنیت بیشتری نمی کنند’
9 نواز شریف از پاکستان اخراج شد
10 شرکت مردم گواتمالا در انتخابات این کشور

Reply with number 1 to 10 to see more information, or any other message if you want to stop receiving these news flashes
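The reply convention in that flash (a number asks for more on that item, anything else stops the flashes) can be sketched in a few lines of Python. This is only a guess at the behaviour visible from the outside, not Mario's actual code; the function names and sample data are invented.

```python
def news_flash(headlines):
    """Number the headlines from 1, one per line, like the flash above."""
    return "\n".join(f"{n} {title}"
                     for n, title in enumerate(headlines, start=1))


def handle_reply(reply, summaries):
    """Interpret a user's reply to a flash.

    Returns ("more", summary) when the reply is a valid item number,
    otherwise ("stop", None), meaning the user no longer wants flashes.
    """
    text = reply.strip()
    # isdecimal() accepts only characters that int() can convert.
    if text.isdecimal() and 1 <= int(text) <= len(summaries):
        return ("more", summaries[int(text) - 1])
    return ("stop", None)


# Invented sample data.
HEADLINES = ["First headline", "Second headline", "Third headline"]
SUMMARIES = ["More on the first", "More on the second", "More on the third"]

if __name__ == "__main__":
    print(news_flash(HEADLINES))
    print(handle_reply("2", SUMMARIES))
    print(handle_reply("no thanks", SUMMARIES))
```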

Anyone know what the state of the art is with IM-based feed readers? Or have a wishlist?

Flickr’d

Just renewed my Flickr-Pro account for 2 years, ensuring an irregular supply of pigeon, fish and other misc depictions.

I wasn’t 100% happy with the wording of their terms though.

To participate in Flickr pro, you must have a valid Yahoo! ID and, solely if you have not received a free offer or gift for a specific number of days of Flickr pro (“Free pro Period”), you will also need to provide other information, such as your credit card and billing information (your “Registration Data”). If you do not have a Yahoo! ID, you will be prompted to complete the registration process for it before you can register for Flickr pro. In consideration of your use of Flickr pro, you agree to: (a) provide true, accurate, current and complete information about yourself and (b) maintain and promptly update the Registration Data to keep it true, accurate, current and complete. If you provide any information that is untrue, inaccurate, not current or incomplete, or Flickr has reasonable grounds to suspect that such information is untrue, inaccurate, not current or incomplete, Flickr has the right to suspend or terminate your account and delete any information or content therein without liability to Flickr.

The “provide true, accurate, current and complete information about yourself” clause is only contextually limited to “credit card” and “billing information”; it could also plausibly be read as covering the more general Flickr user profile, on which I’ve every right to omit various bits of information (Missing isn’t broken). The billing system also let me choose between storing credit card info and re-entering it next time it’s used. So it isn’t really clear what they’re asking for here. If my buddy icon doesn’t show enough grey hair, is that inaccurate? :) I guess they’re really focussed on contact details, in which case, it’s best to say so.

I signed up anyways. The Flickr API and the RDF-oriented Perl backup library make it a more reliable option for my photos than my own little Ruby scripts ever were. Back 2-3 years ago I maintained the fantasy that I’d manage my own photos and their metadata; the big reason I switched to Flickr was the commenting/social side. It’s just too hard for per-person sites to maintain that level of interactivity and community (unless you’re super famous or beautiful or both). And a photo site without comments and community, for me, is kind of boring. For decentralists … perhaps some combination of extended RSS feed plus OpenID for comments could come closer these days; but before OpenID, I couldn’t ever see a way for commenting and annotation to be massmarket-friendly in a decentralised manner. And, well, also no need to be grudging: Flickr is a great product. I’ve definitely had my money’s worth…

Open Source Flash Development and WorldKit

Handy article, “Towards Open Source Flash Development” by Carlos Rovira.

Background to looking at this is some great news: Mikel Maron is open-sourcing the WorldKit system, a lightweight Flash/SWF-based Web mapping application. So I’m interested to find some open source tools that would allow me to rebuild it from source.

I also wonder whether SVG hackers might be interested to port some of it to SVG/Javascript. WorldKit supports geo/rss location tagging, so I’m also curious about what it’d take to get full RDF support in there. Has anybody made an RDF parser for SWF/Flash yet?