FOAF


Or: towards evidence-based ‘add a contact’ filtering…

This just in from LinkedIn:

Have a question? Zander Jules’s network will probably have an answer
You can use LinkedIn Answers to distribute your professional questions to Zander Jules and your extended network. You can get high-quality answers from experienced professionals.

Zander Jules requested to add you as a connection on LinkedIn:

Dan,

Dear
My name is Zander Jules a Banker and accountant with Bank Atlantique Cote Ivoire.I contacting u for a business transfer of a large sum of money from a dormant account. Though I know that a transaction of this magnitude will make any one apprehensive,
but I am assuring u all will be well at the end of the day.I am the personal accounts manager to Engr Frank Thompson, a National of ur country, who used to work with an oil servicing company here in Cote Ivoire. My client, his wife & their 3 children were involved in the ill fated Kenya Airways crash in the coasts of Abidjan in January 2000 in which all passengers on board died. Since then I have made several inquiries to ur embassy to locate any of my clients extended relatives but has been unsuccessful.After several attempts, I decided to trace his last name via internet,to see if I could locate any member of his
family hence I contacted u.Of particular interest is a huge deposit with our bank in our country,where the deceased has an account valued at about $16 million USD.They have issued me notice to provide the next of kin or our bank will declare the account unservisable and thereby send the funds to the bank treasury.Since I have been unsuccessful in locating the relatives for past 7 yrs now, I will seek ur consent to present you as the next of kin of the deceased since u have the same last names, so that the proceeds of this account valued at $16million USD can be paid to u and then u and I can share the money.All I require is your honest cooperation to enable us see this deal through. I guarantee that this will be executed under all legitimate arrangement that will protect you from any breach of the law. In your reply mail, I want you to give me your full names, address, D.O.B, tel& fax #.If you can handle this with me, reach me for more details.

Thanking u for ur coperation.
Regards,

I’m suprised we’ve not seen more of this, and sooner. Youtube contacts are pretty spammy, and twitter have also suffered. The other networks are relatively OK so far. But I don’t think they’re anything like as robust as they’ll need to get, particularly since a faked contact can get privileged access to personal details. Definitely an arms race…

Closing some tabs…

Stephen Fry writing on ’social network’ sites back in January (also in the Guardian):

…what an irony! For what is this much-trumpeted social networking but an escape back into that world of the closed online service of 15 or 20 years ago? Is it part of some deep human instinct that we take an organism as open and wild and free as the internet, and wish then to divide it into citadels, into closed-border republics and independent city states? The systole and diastole of history has us opening and closing like a flower: escaping our fortresses and enclosures into the open fields, and then building hedges, villages and cities in which to imprison ourselves again before repeating the process once more. The internet seems to be following this pattern.

How does this help us predict the Next Big Thing? That’s what everyone wants to know, if only because they want to make heaps of money from it. In 1999 Douglas Adams said: “Computer people are the last to guess what’s coming next. I mean, come on, they’re so astonished by the fact that the year 1999 is going to be followed by the year 2000 that it’s costing us billions to prepare for it.”

But let the rise of social networking alert you to the possibility that, even in the futuristic world of the net, the next big thing might just be a return to a made-over old thing.

McSweenys:

Dear Mr. Zuckerberg,

After checking many of the profiles on your website, I feel it is my duty to inform you that there are some serious errors present. […]

Lest-we-forget. AOL search log privacy goofup from 2006:

No. 4417749 conducted hundreds of searches over a three-month period on topics ranging from “numb fingers” to “60 single men” to “dog that urinates on everything.”

And search by search, click by click, the identity of AOL user No. 4417749 became easier to discern. There are queries for “landscapers in Lilburn, Ga,” several people with the last name Arnold and “homes sold in shadow lake subdivision gwinnett county georgia.”

It did not take much investigating to follow that data trail to Thelma Arnold, a 62-year-old widow who lives in Lilburn, Ga., frequently researches her friends’ medical ailments and loves her three dogs. “Those are my searches,” she said, after a reporter read part of the list to her.

Time magazine punditising on iGoogle, Facebook and OpenSocial:

Google, which makes its money on a free and open Web, was not happy with the Facebook platform. That’s because what happens on Facebook stays on Facebook. Google would much prefer that you come out and play on its platform — the wide-open Web. Don’t stay behind Facebook’s closed doors! Hie thee to the Web and start searching for things. That’s how Google makes its money.

So, last fall, Google rallied all the other major social networks (MySpace, Bebo, Hi5 and so on) and announced a new initiative called OpenSocial. OpenSocial wants to be like Facebook’s platform, only much bigger: Widget makers can write applications for it and they can run anywhere — on MySpace, Bebo and Google’s own social network, Orkut, which is very big in Brazil.

Google’s platform could actually dwarf Facebook — if it ever gets off the ground.

Meanwhile on the widget and webapp security front, we have “BBC exposes Facebook flaw” (information about your buddies is accessible to apps you install; information about you is accessible to apps they install). Also see Thomas Roessler’s comments to my Nokiana post for links to a couple of great presentations he made on widget security. This includes a big oopsie with the Google Mail widget for MacOSX. Over in Ars Technica we learn that KDE 4.1 alpha 1 now has improved widget powers, including “preliminary support for SuperKaramba and Mac OS X Dashboard widgets“. Wonder if I can read my Gmail there…

As Stephen Fry says,  these things are “opening and closing like a flower”. The big hosted social sites have a certain oversimplifying retardedness about them. But the ability for code to go visit data (the widget/gadget model), is I think as valid as the opendata model where data flows around to visit code. I am optimistic that good things will come out of this ferment.

A few weeks ago I had the pleasure of meeting several of the Google OpenSocial crew in London. They took my grumbling about accessibility issues pretty well, and I hope to continue that conversation. Industry politics and punditry aside, I’m impressed with their professionalism and with the tie-in to an opensource implementation through Apache’s ShinDig project. The OpenSocial specs list is open to the public, where Cassie has just announced that “all 0.8 opensocial and gadgets spec changes have been resolved” (after a heroic slog through the issue list). I’m barely tracking the detail of discussion there, things are moving fast. There’s now a proposed REST API, for example; and I learned in London about plans for a formatting/templating system, which might be one mechanism for getting FOAF/RDF out of OpenSocial containers.

If OpenSocial continues to grow and gather opensource mindshare, it’s possible Facebook will throw some chunks of their platform over the wall (ie. “do an Adobe“). And it’ll probably be left to W3C to clean up the ensuring mess and fragmentation, but I guess that’s what they’re there for. Meanwhile there’s plenty yet to be figured out, … I think we’re in a pre-standards experimentation phase, regardless of how stable or mature we’re told these platforms are.

The fundamental tension here is that we want open data, open platforms, … for data and code to flow freely, but to protect the privacy, lives and blushes of those it describes. A tricky balance. Don’t let anyone tell you it’s easy, that we’ve got it figured out, or that all we need to do is “tear down the walls”.

Opening and closing like flowers…

Spotted on the websemantique list, a Youtube playlist of videos from SemanticCamp Paris 1,  16 février. I think they’ve just had a second SemanticCamp already, but these five videos are from the earlier event. Lots of FOAF, RDFa etc.

 L’objectif du SemanticCamp Paris est de créer les conditions pour que les développeurs, les étudiants, les managers et les chercheurs se rencontrent sur le thème du Web Sémantique.


How it works: The Web
Originally uploaded by danbri

Or, “but what do all those links mean?”

Based on the 1994 slides by TimBL which inspired the SWAD-Europe graphics and shirt.

The twist here is just an emphasis that the giant global graph is a graph of idiosyncratic claims, and only sometimes do we all see the world the same way.

ordinary life is pretty complex stuff“ — Harvey Pekar

 My main OpenID provider is currently LiveJournal, delegated from my own danbri.org domain. I suspect it’s much more likely that danbri.org would go offline or be hacked again (sorry DreamHost) than LJ; but either could happen!

In such circumstances, what should a ‘relying party’ (aka consumer) site do? Apparently myopenid has been down today; these are not theoretical scenarios. And my danbri.org site was hacked last year, due to a DreamHost vulnerability. The bad guys merely added viagra adverts; they could easily have messed with my OpenID delegation URL instead.

I don’t know the OpenID 2.0 spec inside-out (to put it mildly!) but one model that strikes me as plausible: the relying party should hang onto FOAF and XFN ‘rel=me’ data that you’ve somehow confirmed (eg. those from http://danbri.org/foaf.rdf or my LJ FOAF) and simply offer to let you log in with another OpenID known to be associated with you. You might not even know in advance that these other accounts of yours offer OpenID; after all there are new services being rolled out on a regular basis. For a confirmed list of ‘my’ URLs, you can poke around to see which are OpenIDs.

danbri$ curl -s http://danbri.livejournal.com/ | grep openid
<link rel=”openid.server” href=”http://www.livejournal.com/openid/server.bml” />

danbri$ curl -s http://flickr.com/photos/danbri/ | grep openid
<link rel=”openid2.provider” href=”https://open.login.yahooapis.com/openid/op/auth” />

Sites do go down. It would be good to have a slicker user experience when this happens. Given that we have formats - FOAF and XFN at least - that allow a user to be associated with multiple (possibly OpenID-capable) URLs, what would it take to have OpenID login make use of this?

For many years, the 24×7 IRC chatrooms #swig and #foaf (and previously #rdfig) have been logged to HTML and RDF by Dave Beckett’s IRC logging code. The RDF idiom used here dates from 2001 or so, and looks like this:

 <foaf:ChatChannel rdf:about=”irc://irc.freenode.net/swig”>
<foaf:chatEventList>
<rdf:Seq>
<rdf:li>
<foaf:chatEvent rdf:ID=”T00-05-11″>
<dc:date>2008-04-03T00:05:11Z</dc:date>

<dc:description>
danbri: do you know of good scutter data for playing with codepiction? would be fun to get back into that (esp. parallelization).
</dc:description>
<dc:creator>
<wn:Person foaf:nick=”kasei”/>
</dc:creator>
</foaf:chatEvent>

Dave has offered to make a one-time search-and-replace fix to the old logs, if we want to agree a new idiom. The main driver for this is that the old logs have a class for ‘chat event’  but use an initial lowercase later for the term, ie. ‘chatEvent’ instead of ‘ChatEvent’. None of these properties are yet documented in the FOAF schema, and since the data for this term is highly concentrated, and its maintainer has offered to change it, I suggest we document these FOAF terms as used, with the fix of having a class foaf:ChatEvent instead of foaf:chatEvent.

Almost all  RDF vocabularies stick to the rule of using initial lowercase letters for properties, and initial capitals for classes. This makes RDF easy to read for those who know this trick; consequently a lowercase class name can be very confusing, for experts and beginners alike. I’d therefore rather not introduce one into FOAF if it can be avoided. But I would like to document the IRC logging data format, and continue to use FOAF for it.

The markup also uses the wordnet:Person class from my old Wordnet 1.6 namespace (currently offline, but will be repaired eventually, albeit with a later Wordnet installation). This follows early FOAF practice, where we minimised terms and used Wordnet a lot more. I suggest Dave updates this to use foaf:Person instead. The dc:creator property used here might also use the new DC Terms notion of ‘creator’, since that flavour of ‘creator’ has a range of DC Terms “Agent” that is a more modern and FOAF-compatible idiom. This btw is a candidate for using instead of foaf:maker, which I introduced with some regret only because the old dc:creator property had such weak usage rules. But then if we change the DC namespace used for ‘creator’ here, should we change the other ones too? Hmm hmm hmm etc.

The main known consumer of this IRC log data is XSLT created and maintained by the ubiquitous Dave. If you know of other downstream consumer apps please let us know by commenting in the blog or by email.

While there are other ways in which such chatlogs might be represented in RDF (eg. SIOC, or using something linked to XMPP), let’s keep the scope of this change quite small. The goal is to make a minimal fix to the current archived instance data  and associated tools, so that the FOAF vocabulary it uses can be documented.

Comments welcomed…

I just signed up to give a talk at the Microformats vEvent in London, May 27th; thanks to the organizers (Frances Berriman and Drew McLellen of microformats.org) for inviting me :)

I’ve called it “One Big Happy Family: Practical Collaboration on Meaningful Markup” and my goal really is to help make it easier for enthusiasts for both RDF and Microformats to say ‘we‘ rather than ‘they‘ a bit more often when discussing complementary efforts from this community. As I said on the foaf-dev list yesterday, “anything good for Microformats is good for FOAF”; vice-versa too, I hope. There’s only one Web and we’re all doing our bit, with the tools and techniques we know best.

Here’s the abstract:

This talk explores some ways in which the Microformat and RDF approaches can complement each other, and some ways in which we can share data, tools and experiences between these two technologies. It will outline the often-unarticulated history of the RDF design, the techniques used for parsing and querying RDF data, and the things made easy and hard through this approach. RDF techniques can be contrasted with the different choices made for Microformats. However these differences obscure an underlying similarity that comes from shared ‘Webby’ values.

Edit: it seems I’m incapable of spelling “compl[ie]mentary”. Freudian slip? :)

BTW the London Web Week site has just gone live; check it out…

Speaks, reads, writes
Stephanie Booth asks:

 I vaguely remember somebody telling me about some emerging “standard” (too big a word) for encoding language skills. Or was it a dream?

That would’ve been me, showing markup from the FOAFX beta from Paola Di Maio and friends, which explores the extension of FOAF with expertise information. This is part of the ExpertFinder discussions alongside the FOAF project (see also wiki, mailing list). FOAFX and the ExpertFinder community are looking at ways of extending FOAF to better describe people’s expertise; both self-described and externally accredited. This is at once a fascinating, important and terrifyingly hard to scope problem area. It touches on longstanding “Web of trust” themes, on educational metadata standards, and on the various ways of characterising topics or domains of expertise. In other words, in any such problem space, there will always be multiple ways of “doing it”. For example, here is how the Advogato community site characterises my expertise regarding opensource software: foaf.rdf (I’m in the Journeyer group, apparently; some weighted average of people’s judgements about me).

One thing FOAFX attempts is to describe language skills. For this, they extend the idiom proposed by Inkel some years ago in his “Speaks, Reads, Writesschema. In the original (which is Spanish, but see also English version), the classification was effectively binary: one could either speak, read, or write a language; or one couldn’t. You could also say you ‘mastered’ it, meaning that you could speak, read and write it. In FOAFX, this is handled differently: we get a 1-5 score. I like this direction, as it allows me to express that I have some basic capability in Spanish, without appearing to boast that I’m anything like “fluent”. But … am I a “1″ or a “2″? Should I poll my long-suffering Spanish-speaking friends? Take an online quiz? Introducing numbers gives the impression of mathematical precision, but in skill characterisation this is notoriously hard (and not without controversy).

My take here is that there’s no right thing to do. So progress and experimentation are to be celebrated, even if the solution isn’t perfect. On language skills, I’d love some way also to allow people to say “I’m learning language X”, or “I’m happy to help you practice your English/Spanish/Japanese/etc.”. Who knows, with more such information available, online Social Network sites could even prove useful…

Here btw is the current RDF markup generated by FOAFX:

<foaf:Person rdf:ID="me">
<foaf:mbox_sha1>6e80d02de4cb3376605a34976e31188bb16180d0</foaf:mbox_sha1>
<foaf:givenname>Dan</foaf:givenname>
<foaf:family_name>Brickley</foaf:family_name>
<foaf:homepage rdf:resource="http://danbri.org/" />
<foaf:weblog rdf:resource="http://danbri.org/words/" />
<foaf:depiction rdf:resource="http://danbri.org/images/me.jpg" />
<foaf:jabberID>danbrickley@gmail.com</foaf:jabberID>
<foafx:language>
<foafx:Language>
<foafx:name>English</foafx:name>
<foafx:speaking>5</foafx:speaking>
<foafx:reading>5</foafx:reading>
<foafx:writing>5</foafx:writing>
</foafx:Language>
</foafx:language>
<foafx:language>
<foafx:Language>
<foafx:name>Spanish</foafx:name>
<foafx:speaking>1</foafx:speaking>
<foafx:reading>1</foafx:reading>
<foafx:writing>1</foafx:writing>
</foafx:Language>
</foafx:language>
<foafx:expertise>
<foafx:Expertise>
<foafx:field>::</foafx:field>
<foafx:fluency>
<foafx:Language>
<foafx:name>English</foafx:name>
</foafx:Language>
</foafx:fluency>
</foafx:Expertise>
</foafx:expertise>
</foaf:Person>

The apparent redundancy in the markup (expertise, Expertise) is due to RDF’s so-called “striped” syntax. I have an old introduction to this idea; in short, RDF lets you define properties of things, and categories of thing. The FOAFX design effectively says, “there is a property of a person called “expertise” which relates that person to another thing, an “Expertise”, which itself has properties like “fluency”.

The FOAFX design tries to navigate between generic and specific, by including language-oriented markup as well as more generic skill descriptions. I think this is probably the right way to go. There are many things that we can say about human languages that don’t apply to other areas of expertise (eg. opensource software development). And there many things we can say about expertise in general (like expressions of willingness to learn, to teach, … indications of formal qualification) which are cross domain. Similarly, there are many things we might say in markup about opensource projects (picking up on my Advogato mention earlier) which have nothing to do with human languages. Yet both human language expertise and opensource skills are things we might want to express via FOAF extensions. For example, the DOAP project already lets us describe opensource projects and our roles in them.

The Semantic Web design challenge here is to provide a melting pot for all these different kinds of data, one that allows each specific problem to be solved adequately in a reasonable time-frame, without precluding the possibility for richer integration at a later date. I have a hunch that the Advogato design, which expresses skills in terms of group membership, could be a way to go here.

This is related to the idea of expressing group-membership criteria through writing SPARQL queries. For example, we can talk about the Group of people who work for W3C. Or we can talk about the Group of people who work for W3C as listed authoritatively on the W3C site. Both rules are expressible as queries; the latter a query that says things about the source of claims, as well as about what those claims assert. This notion of a group defined by a query allows for both flavours; the definition could include criteria relating to the provenance (ie. source) of the claims, but it needn’t. So we could express the idea of people who speak Spanish, or the idea of people who speak french according to having passed some particular test, or being certified by some agency. In either case, the unifying notion is “person X is in group Y”, where Y is a group identified by some URL. What I like about this model, is it allows for a very loose division of labour: skill-related markup is necessarily going to be widely varied. Yet the idea that such scattered evidence boils down to people falling into definable groups, gives some overall cohesion to this diversity. I could for example run a query asking for people with (foafx idiom) “Spanish skills of 2 or more”. I could add a constraint that the person be at least a “Journeyer” regarding their opensource skills, according to Advogato, or perhaps mix in data expressed in DOAP terms regarding their roles in opensource project work. These skills effectively define groups (loosly, sets) of people, and skill search can be pictured in venn diagram terms. Of course all this depends on getting enough data out there for any such queries to be worthwhile. Maybe a Facebook app that re-published data outside of Hotel Facebook would be a way of bootstrapping things here?

I’m in Cork, mainly for the excellent Social Network Portability event on Sunday, but am also staying through Blogtalk’08 which has been great. I’ve uploaded my slides from my talk (slideshare in Flash, included inline here, or a pdf). I have some rough speaking notes too,  maybe I’ll get those online. I have no idea how they relate to whatever actually came out of my mouth during the talk :) Apologies to those without PDF or Flash. I haven’t tried Keynote’s HTML output yet.

Basically much of what I was getting at in the talk, and my thoughts are only just congealing on this … is that the idea of a ‘claim’ is a useful bridge between Semantic Web and Social Networking concerns. Also that it helps us understand how technologies fit together. FOAF defines a dictionary of terms for making claims, as does xfn, hCard. RDF/XML, Microformats, RDFa, GRDDL define textual notations for publishing documents that encode claims, and SPARQL gives us a way of asking questions about the claims made in different documents.

Next Page »