Remembering Aaron Swartz

“One of the things the Web teaches us is that everything is connected (hyperlinks) and we all should work together (standards). Too often school teaches us that everything is separate (many different ‘subjects’) and that we should all work alone.” –Aaron Swartz, April 2001.

So Aaron is gone. We were friends a decade ago, and drifted out of touch; I thought we’d cross paths again, but, well, no.

Update: MIT’s report is published.

I’ll remember him always as the bright kid who showed up in the early data sharing Web communities around RSS, FOAF and W3C’s RDF, a dozen years ago:

"Hello everyone, I'm Aaron. I'm not _that_ much of a coder, (and I don't know
much Perl) but I do think what you're doing is pretty cool, so I thought I'd
hang out here and follow along (and probably pester a bit)."

Aaron was from the beginning a powerful combination of smart, creative, collaborative and idealistic, and was drawn to groups of developers and activists who shared his passion for what the Web could become. He joined and helped the RSS 1.0 and W3C RDF groups, and more often than not the difference in years didn’t make a difference. I’ve seen far more childishness from adults in the standards scene than I ever saw from young Aaron. TimBL has it right: “we have lost one of our own”. He was something special that ‘child genius’ doesn’t come close to capturing. Aaron was a regular in the early ‘24×7 hack-and-chat’ RDF IRC scene, and it’s fitting that the first lines logged in that group’s archives are from him.

I can’t help but picture an alternate and fairer universe in which Aaron made it through and got to be the cranky old geezer at conferences in the distant shiny future. He’d have made a great William Loughborough, a mutual friend and collaborator with whom he shared a tireless impatience at the pace of progress, the need to ask ‘when?’, to always Demand Progress.

I’ve been reading old IRC chat logs from 2001. Within months of his ‘I’m not _that_ much of a coder’ Aaron was writing Python code for accessing experimental RDF query services (and teaching me how to do it, disclaiming credit, ‘However you like is fine… I don’t really care.’). He was writing rules in TimBL’s experimental logic language N3, applying this to modelling corporate ownership structures rather than as an academic exercise, and as ever sharing what he knew by writing about his work in the Web. Reading some old chats, we talked about the difficulties of distributed collaboration, debate and disagreement, personalities and their clashes, working groups, and the Web.

I thought about sharing some of that, but I’d rather just share him as I choose to remember him:

22:16:58 <AaronSw> LOL

Mirrors and Prisms: robust site-specific browsers

Mozilla (amongst others, see Chris Messina’s writeup of the trend, also Matt’s) have been exploring site-specific browsers through their Prism project. These combine aspects of the Web and Desktop environments, allowing you to have a desktop app tuned for browsing just one specific Web site. Prism is an application which, when run, will generate new per-site desktop applications. It does not yet have fancy packaging or an installer, so users will need to install Prism plus the site files separately.

I have started to look at Prism as a basis for accessing robust, mirrored sites, so that a single point of failure (or censorship) might be avoided. With a lot of help from Matt and others in #prism IRC chat, I have something almost working. The idea is simple: hack Prism so that the running browser code intercepts clicks and (based on some as-yet-undefined logic and preferences) gets the page from a list of mirrors, which might also be fetched dynamically from the ‘net.

I should also mention that one motivation here is for anti-censorship tools, to give users an easy way to access sites which might be blocked by their IP address or URL otherwise. I looked at FoxyProxy as an option but for site-specific robustness, running a full proxy server seems a bit heavy, compared to simply duplicating a set of files. Here’s what the main Prism app looks like:

Screenshot showing Prism config settings for a site-specific browser.

Once you have Prism installed, you can hack a file named webrunner.js to intervene when links are clicked. In OSX, this can be found as /Applications/Prism.app/Contents/Resources/chrome/webrunner/content/webrunner.js.

Edit this: _domActivate : function(aEvent)

I added the following block to the start of this function:

// Intercept clicks on internal links and load them from the mirror instead.
var link = aEvent.target;
if (link instanceof HTMLAnchorElement && !WebRunner._isLinkExternal(link)) {
  aEvent.preventDefault();
  WebRunner._getBrowser().loadURI("http://example.org/mirrors/" + link.href, null, null);
}

The idea here is that we intercept clicks and rewrite them to point to equivalent http:// URIs elsewhere in the Web. As far as this goes, it works as advertised. But what I have is far from finished… it would still need some code to find the right mirror URLs to fetch from. Perhaps a list might be fetched on startup or the first time a link is followed. It could also do with some work on packaging, so that this hacked version of Prism plus some actual site-specific browser config can be made into an easy-install Windows .exe or OSX .app. For a Windows installer, I am told that NSIS is a good place to start. You could also imagine a version that hid the mirrored URLs from the user’s view. Since Prism has a built-in option to completely hide the URL navigation bar, I didn’t investigate this idea yet.
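As a rough sketch of the missing mirror-selection piece, here is one way the hard-coded prefix might be replaced with a choice from a list. Everything in it is an assumption for illustration: createMirrorRewriter is a hypothetical helper (not part of Prism), the mirror URLs are placeholders, and round-robin is just the simplest policy I could think of, not necessarily the right one.

```javascript
// Hypothetical helper, not part of Prism: returns a function that maps an
// original link href to a URL on one of the given mirrors.
function createMirrorRewriter(mirrorBases) {
  if (!mirrorBases || mirrorBases.length === 0) {
    throw new Error("need at least one mirror base URL");
  }
  var next = 0; // index of the mirror to use for the next click

  return function rewrite(href) {
    var base = mirrorBases[next % mirrorBases.length];
    next += 1; // simple round-robin; real logic might prefer mirrors that respond
    // Same scheme as the hack above: mirror base + original href, unescaped.
    return base + href;
  };
}

// Placeholder mirror list; in practice this might be fetched at startup.
var rewrite = createMirrorRewriter([
  "http://example.org/mirrors/",
  "http://example.net/mirrors/"
]);

console.log(rewrite("http://www.gutenberg.org/catalog/"));
console.log(rewrite("http://www.gutenberg.org/catalog/"));
```

Inside _domActivate, the string concatenation passed to loadURI would then become rewrite(link.href), leaving the rest of the hack unchanged.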

OK I think I’ve written up everything I learned from the helpful folks in IRC. I hope this repays some karma. If anyone cares to explore this further, or wants to help target student projects on exploring it, please get in touch.

Twitter Iran RT chaos

From Twitter in the last few minutes, a chaos of echoed posts about army moves. Just a few excerpts here by copy/paste, mostly without the all-important timestamps. Without tools to trace reports to their source, to claims about their source from credible intermediaries, or evidence, this isn’t directly useful. Even grassroots journalists need evidence. I wonder how Witness and Identi.ca fit into all this. I was thinking today about an “(person) X claims (person) Y knows about (topic) Z” notation, perhaps built from FOAF+SKOS. But looking at this “Army moving in…” claim, I think something couched in terms of positive claims (along lines of the old OpenID showcase site Jyte) might be more appropriate.

The following is from my copy/paste from Twitter a few minutes ago. It gives a flavour of the chaos. Note also that observations from very popular users (such as stephenfry) can echo around for hours, often chased by attempts at clarification from others.

(“RT” is Twitter notation for re-tweet, meaning that the following content is redistributed, often in abbreviated or summarised form)

plotbunnytiff: RT @suffolkinace: RT From Iran: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection
r0ckH0pp3r: RT .@AliAkbar: RT From Iran: CONFIRMED!! Army moving into Tehran against protesters! PLEASE RT! URGENT! #IranElection
jax3417: RT @ktyladie: RT @GennX: RT From Iran: CONFIRMED!! Army moving into Tehran against protesters! PLEASE RT! URGENT! #IranElection #iran
ktladie: RT @GennX: RT From Iran: CONFIRMED!! Army moving into Tehran against protesters! PLEASE RT! URGENT! #IranElection #iran
MellissaTweets: RT @AliAkbar: RT From Iran: CONFIRMED!! Army moving into Tehran against protesters! PLEASE RT! URGENT! #IranElection
GennX: RT @MelissaTweets: RT @AliAkbar: RT From Iran: CONFIRMED!! Army moving into Tehran against protesters! PLEASE RT! URGENT! #IranElection

The above all arrived at around the same time, and cite two prior “sources”:

suffolkinnace: RT From Iran: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection   18 minutes ago from web

Who is this? Nobody knows of course, but there’s a twitter bio:

http://twitter.com/suffolkinace # Bio Some-to-be Royal Military Policeman in the British Army. Also a massive Xbox geek and part-time comedian

The other “source” seems to be http://twitter.com/AliAkbar
AliAkbar: RT From Iran: CONFIRMED!! Army moving into Tehran against protesters! PLEASE RT! URGENT! #IranElection
about 1 hour ago from web
url http://republicmodern.com

This leads us to http://republicmodern.com/about where we’re told:
“Ali Akbar is the founder and president of Republic Modern Media. A conservative blogger, he is a contributor to Right Wing News, Hip Hop Republican, and co-host of The American Resolve online radio show. He was also the editor-in-chief of Blogs for McCain.”

I should also mention that a convention emerged in the last day to replace the names of specific local Twitter users in Tehran with a generic “from Iran”, to avoid getting anyone into trouble. This makes plenty of sense, but without anyone in the middle vouching for sources it becomes even harder to know which reports to take seriously.
More… back to twitter search, what’s happened since I started this post?

http://twitter.com/#search?q=iranelection%20army

badmsm: RT @dpbkmb @judyrey: RT From Iran: CONFIRMED!! Army moving into Tehran against protesters! PLZ RT! URGENT! #IranElection #gr88
SimaoC: RT @parizot: CONFIRMÉ! L’armée se dirige vers Téhéran contre les manifestants! #IranElection #gr88
SpanishClash: RT @mytweetnickname: RT From Iran:ARMY movement NOT confirmed in last 2:15, plz RT this until confrmed #IranElection #gr88
artzoom: RT @matyasgabor @humberto2210: RT CONFIRMED!! Army moving into Tehran against protesters! PLEASE RT! #IranElection #iranrevolution
sjohnson301: RT @RonnyPohl From Iran: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection #iran9
dauni: RT @withoutfield: RT: @tspe: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection
interdigi: RT @ivanpinozas From Iran: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection
PersianJustice: Once again, stop RT army movements until source INSIDE Iran verifies! Paramilitary is the threat anyway. #iranelection #gr88
Klungtveit: Anyone: What’s the origin of reports of “army moving in” on protesters? #iranelection
Eruethemar: RT @brianlltdhq: RT @lumpuckaroo: Only IRG moving, not national ARMY… this is confirmed for real #IranElection #gr88
SAbbasRaza: RT @bymelissa: RT @alexlobov: RT From Iran: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection
timnilsson: RT @Iridium24: CONFIRMED!! Army moving into Tehran against protesters! PLEASE RT! URGENT! #IranElection
edmontalvo: RT @jasona: RT @Marble68: RT From Iran: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection
stevelabate: RT army moving into Tehran against protesters. Please RT. #iranelection
ivanpinozas: From Iran: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection
bschh: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection (via @dlayphoto)
dlayphoto: RT From Iran: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection

In short … chaos!

Is this just a social / information problem, or can different tooling and technology help filter out what on earth is happening?
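As one very small illustration of the tooling side, here is a sketch that unpicks the “RT @name:” chain in a copied line and tallies which account each post ultimately names. The function names parseRetweet and tallySources are my own inventions, the “poster: text” line format is just how the excerpts above happen to be pasted, and the pattern only copes with the commonest retweet shapes seen here.

```javascript
// Split a pasted line "poster: RT @a: RT @b: text" into its parts.
// Returns null if the line does not start with "name:".
function parseRetweet(line) {
  var m = /^(\w+):\s*(.*)$/.exec(line);
  if (!m) return null;
  var rest = m[2];
  var chain = [];
  // Strip leading "RT @name:", "RT .@name:", "RT: @name:" or a bare "@name:".
  var rt = /^(?:RT:?\s+)?\.?@(\w+):?\s*/;
  var hit;
  while ((hit = rt.exec(rest)) !== null) {
    chain.push(hit[1]);
    rest = rest.slice(hit[0].length);
  }
  return { poster: m[1], chain: chain, text: rest };
}

// Count posts by the furthest-back account each one names
// (the poster itself when no one is cited).
function tallySources(lines) {
  var counts = {};
  lines.forEach(function (line) {
    var p = parseRetweet(line);
    if (!p) return;
    var src = p.chain.length ? p.chain[p.chain.length - 1] : p.poster;
    counts[src] = (counts[src] || 0) + 1;
  });
  return counts;
}

var sample = [
  "plotbunnytiff: RT @suffolkinace: RT From Iran: CONFIRMED!! Army moving into Tehran",
  "r0ckH0pp3r: RT .@AliAkbar: RT From Iran: CONFIRMED!! Army moving into Tehran",
  "jax3417: RT @ktyladie: RT @GennX: RT From Iran: CONFIRMED!! Army moving into Tehran"
];
console.log(tallySources(sample));
```

This only recovers what each post says about its own ancestry. Joining chains across posts (GennX, for instance, cites AliAkbar by way of MelissaTweets) and attaching timestamps would be the next step, and none of it tells you whether the named “source” knows anything.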

The House that Jack Built

<Farmer> <sowed> <Corn> <kept> <Cock> <woke> <Priest> <married> <Man> <kissed> <Maiden> <milked> <Cow> <tossed> <Dog> <worried> <Cat> <killed> <Rat> <ate> <Malt> <in> <House> <builtBy> <Person foaf:name="Jack" /> </builtBy> </House> </in> </Malt> </ate> </Rat> </killed> </Cat> </worried> </Dog> </tossed> </Cow> </milked> </Maiden> </kissed> </Man> </married> </Priest> </woke> </Cock> </kept> </Corn> </sowed> </Farmer>

FOAF super-connectivity daydreams from 2002.

“indirectly inspired by John Pilger’s ‘The New Rulers of the World’, FOAFCorp and www.theyrule.net/”

This is the man, all tattered and torn, ...

Cross-browsing and RDF

While cross-searching has been described and demonstrated through this paper and associated work, the problem of cross-browsing a selection of subject gateways has not been addressed. Many gateway users prefer to browse, rather than search. Though browsing usually takes longer than searching, it can be more thorough, as it is not dependent on the user’s terms matching keywords in resource descriptions (even when a thesaurus is used, it is possible for resources to be “missed” if they are not described in great detail).

As a “quick fix”, a group of gateways may create a higher level menu that points to the various browsable menus amongst the gateways. However, this would not be a truly hierarchical menu system, as some gateways maintain browsable resource menus in the same atomic (or lowest level) subject area. One method of enabling cross-browsing is by the use of RDF.

The World Wide Web Consortium has recently published a preliminary draft specification for the Resource Description Framework (RDF). RDF is intended to provide a common framework for the exchange of machine-understandable information on the Web. The specification provides an abstract model for representing arbitrarily complex statements about networked resources, as well as a concrete XML-based syntax for representing these statements in textual form. RDF relies heavily on the notion of standard vocabularies, and work is in progress on a ‘schema’ mechanism that will allow user communities to express their own vocabularies and classification schemes within the RDF model.

RDF’s main contribution may be in the area of cross-browsing rather than cross-searching, which is the focus of the CIP. RDF promises to deliver a much-needed standard mechanism that will support cross-service browsing of highly-organised resources. There are many networked services available which have classified their resources using formal systems like MeSH or UDC. If these services were to each make an RDF description of their collection available, it would be possible to build hierarchical ‘views’ of the distributed services offering a user interface organised by subject-classification rather than by physical location of the resource.

From Cross-Searching Subject Gateways, The Query Routing and Forward Knowledge Approach, Kirriemuir et al., D-Lib Magazine, January 1998.

I wrote this over 11 (eleven) years ago, as something of an aside during a larger paper on metadata for distributed search. While we are making progress towards such goals, especially with regard to cross-referenced descriptions of identifiable things (i.e. the advances made through linked data techniques lately), the pace of progress can be quite frustrating. Just as it seems like we’re making progress, things take a step backwards. For example, the wonderful lcsh.info site is currently offline while the relevant teams at the Library of Congress figure out how best to proceed. It’s also ten years since Charlotte Jenkins published some great work on auto-classification that used OCLC’s Dewey Decimal Classification. That work also ran into problems, since DDC wasn’t freely available for use in such applications. In the current climate, with Creative Commons, open source, Web 2.0 and suchlike all the rage, I hope we’ll finally see more thesaurus and classification systems opened up (e.g. with SKOS) and fully linked into the Web. Maybe by 2019 the Web really will be properly cross-referenced…
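To make the cross-browsing idea from the excerpt a little more concrete, here is a minimal sketch of merging resource descriptions from several gateways into one subject hierarchy. It is only an illustration under loud assumptions: plain JavaScript objects stand in for the RDF descriptions, buildSubjectTree is a made-up name, the gateway names and URLs are placeholders, and the subject paths are invented labels rather than real MeSH or UDC entries.

```javascript
// Merge records from many gateways into a tree keyed by shared subject terms.
// Each record: { gateway, title, url, subjectPath: ["Broad", "Narrower", ...] }
function buildSubjectTree(records) {
  var root = { children: {}, resources: [] };
  var has = Object.prototype.hasOwnProperty;
  records.forEach(function (rec) {
    var node = root;
    rec.subjectPath.forEach(function (term) {
      if (!has.call(node.children, term)) {
        node.children[term] = { children: {}, resources: [] };
      }
      node = node.children[term];
    });
    // The resource hangs off its subject, whichever gateway described it.
    node.resources.push({ title: rec.title, url: rec.url, gateway: rec.gateway });
  });
  return root;
}

// Placeholder descriptions, as if published by two separate gateways.
var tree = buildSubjectTree([
  { gateway: "GatewayA", title: "Heart basics", url: "http://a.example.org/1",
    subjectPath: ["Medicine", "Cardiology"] },
  { gateway: "GatewayB", title: "ECG primer", url: "http://b.example.org/7",
    subjectPath: ["Medicine", "Cardiology"] },
  { gateway: "GatewayB", title: "Bridge design", url: "http://b.example.org/9",
    subjectPath: ["Engineering", "Civil"] }
]);

console.log(JSON.stringify(tree, null, 2));
```

The point of the sketch is the one the excerpt makes: once every service describes its holdings against a shared classification, the browsable view is organised by subject and the physical location of each resource becomes a detail on the leaf.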

Problem statement

A Pew Research Center survey released a few days ago found that only half of Americans correctly know that Mr. Obama is a Christian. Meanwhile, 13 percent of registered voters say that he is a Muslim, compared with 12 percent in June and 10 percent in March.

More ominously, a rising share — now 16 percent — say they aren’t sure about his religion because they’ve heard “different things” about it.

When I’ve traveled around the country, particularly to my childhood home in rural Oregon, I’ve been struck by the number of people who ask something like: That Obama — is he really a Christian? Isn’t he a Muslim or something? Didn’t he take his oath of office on the Koran?

It was in the NYTimes, so it must be true. Will the last one to leave the Web please turn off the lights.