A Penny for your thoughts: New Year wishes from mechanical turkers

I wanted to learn more about Amazon’s Mechanical Turk service (wikipedia), and perhaps also figure out how I feel about it.

Named after a historical faked chess-playing machine, it uses the Web to allow people around the world to work on short low-pay ‘micro-tasks’. It’s a disturbing capitalist fantasy come true, echoing Frederick Taylor’s ‘Scientific Management‘ of the 1880s. Workers can be assigned tasks at the touch of the button (or through software automation); and rewarded or punished at the touch of other buttons.

Mechanical Turk has become popular for outsourcing large scale data cleanup tasks, image annotation, and other topics where human judgement outperforms brainless software. It’s also popular with spammers. For more background see ‘try a week as a turker‘ or this Salon article from 2006. Turk is not alone, other sites either build on it, or offer similar facilities. See for example crowdflower, txteagle, or Panos Ipeirotis’ list of micro-crowdsourcing services.

Crowdflower describe themselves as offering “multiple labor channels…  [using] crowdsourcing to harness a round-the-clock workforce that spans more than 70 countries, multiple languages, and can access up to half-a-million workers to dispatch diverse tasks and provide near-real time answers.”

Txteagle focuses on the explosion of mobile access in the developing world, claiming that “txteagle’s GroundSwell mobile engagement platform provides clients with the ability to communicate and incentivize over 2.1 billion people“.

Something is clearly happening here. As someone who works with data and the Web, it’s hard to ignore the potential. As someone who doesn’t like treating others as interchangeable, replaceable and disposable software components, it’s hard to feel very comfortable. Classic liberal guilt territory. So I made an account, both as a worker and as a ‘requester’ (an awkward term, but it’s clear why ‘employer’ is not being used).

I tried a few tasks. I wrote 25-30 words for a blog on some medieval prophecies. I wrote 50 words as fast as I could on “things I would change in my apartment”. I tagged some images with keywords. I failed to pass a ‘qualification’ test sorting scanned photos into scratched, blurred and OK. I ‘like’d some hopeless Web site on Facebook for 2 cents. In all I made 18 US cents. As a way of passing the time, I can see the appeal. This can compete with daytime TV or Farmville or playing Solitaire or Sudoko. I quite enjoyed the mini creative-writing tasks. As a source of income, it’s quite another story, and the awful word ‘incentivize‘ doesn’t do justice to the human reality.

Then I tried the other role: requester. After a little more liberal-guilt navelgazing (“would it be inappropriate to offer to buy people’s immortal souls? etc.”), I decided to offer a penny (well, 2 cents) for up to 100 people’s new year wish thoughts, or whatever of those they felt like sharing for the price.

I copy the results below, stripped of what little detail (eg. time in seconds taken) each result came with. I don’t present this as any deep insight or sociological analysis or arty meditation. It’s just what 100 people somewhere else in the Web responded with, when asked what they wish for 2011. If you want arty, check out the sheep market. If you want more from ‘turkers’ in their own voice, do visit the ‘Turker Nation’ forum. Also Turkopticon is essential reading,  “watching out for the crowd in crowdsourcing because nobody else seems to be.”

The exact text used was “Make a wish for 2011. Anything you like, just describe it briefly. Answers will be made public.”, and the question was asked with a simple Web form, “Make a wish for 2011, … any thought you care to share”.


Here’s what they said:

When you’re lonely, I wish you Love! When you’re down, I wish you Joy! When you’re troubled, I wish you Peace! When things seem empty, I wish you Hope! Have a Happy New Year!

wish u a happy new year…………

happy new year 2011. may this year bring joy and peace in your life

My wish for 2011 is i want to mary my Girlfriend this year.

I wish I will get pregnant in 2011!

i wish juhi becomes close to me

wish you a wonderful happy new year

wish you happy new year

for new year 2011 I wish Love of God must fill each human heart
Food inflation must be wiped off quickly
corruption must be rooted out smartly
Terrorism must be curtailed quickly
All People must get love, care, clothes, shelter & food
Love of God must fill each human heart…

Happy life.All desires to be fulfilled.

wish to be best entrepreneur of the year 2011

dont work hard if it is possible to do the same smarter way..
Be happy!

New year is the time to unfold new horizons,realise new dreams,rejoice in simple pleasures and gear up for new challenges.wishing a fulfilling 2011.

Remember that the best relationship is one where your love for each other is greater than your need for each other. Happy New Year

To get a newer car, and have less car problems. and have more income

I wish that my son’s health problems will be answered

Be it Success & Prosperity, Be it Fun and Frolic…

A new year is waiting for you. Go and enjoy the New Year on New Thought,”Rebirth of My Life”.

Let us wish for a world as one family, then we can overcome all the problems man made and otherwise.

My wish is to gain/learn more knowledge than in 2010

My new years wish for 2011 is to be happier and healthier.

I wish that I would be cured of heartache.

I am really very happy to wish you all very happy new year…..I wish you all the things to be success in your life and career…….. Just try to quit any bad habit within you. Just forgot all the bad incidents happen within your friends and try to enjoy this new year with pleasant……

Wish you a happy and prosperous new year.

I wish for a job.

I would hope that people will end the wars in the world.

Discontinue smoking and restrict intake of alcohol

I wish that my retail store would get a bigger client base so I can expand.

I Wish a wish for You Dear.Sending you Big bunch of Wishes from the Heart close to where.Wish you a Very Very Happy New Year

I wish for 2011 to be filled with more love and happiness than 2010.

Everything has the solution Even IMPOSSIBLE Makes I aM POSSIBLE. Happy Journey for New Year.

May each day of the coming year be vibrant and new bringing along many reasons for celebrations & rejoices. Happy New year

I have just moved and want to make some great new friends! Would love to meet a special senior (man!!) to share some wonderful times with!!!

My wish is that i wanna to live with my “Pretty girl” forever and also wanna to meet her as well,please god please, finish my this wish, no more aspire from me only once.

that people treat each other more nicely and with greater civility, in both their private and public lives.

that we would get our financial house in order

Year’s end is neither an end nor a beginning but a going on, with all the wisdom that experience can instill in us. Wish u very happy new year and take care

Wish you a very happy And prosperous new year 2011

Tom Cruise
Angelina Jolie
Aishwarya Rai
Arnold
Jennifer Lopez
Amitabh Bachhan
& me..
All the Stars wish u a Very Happy New Year.

Oh my Dear, Forget ur Fear,
Let all ur Dreams be Clear,
Never put Tear, Please Hear,
I want to tell one thing in ur Ear
Wishing u a very Happy “NEW YEAR”!

May The Year 2011 Bring for You…. Happiness,Success and filled with Peace,Hope n Togetherness of your Family n Friends….

i want to be happy

Good health for my family and friends

I wish my husband’s children would stop being so mean and violent and act like normal children. I want to love my husband just as much as before we got full custody.

to get wonderful loving girl for me.. :))

Keep some good try. Wish u happy new year

happy new year to all

My wish is to find a good job.

i wish i get a big outsourcing contract this year that i can re-set up my business and get back on track.

I wish that I be firm in whatever I do. That I can do justice to all my endeavors. That I give my 100%, my wholehearted efforts to each and every minutest work I do.

My wish for 2011, is a little patience and understanding for everyone, empathy always helps.

To be able to afford a new house

“NEW YEAR 2011″
+NEW AIM + NEW ACHIEVEMENT + NEW DREAM +NEW IDEA + NEW THINKING +NEW AMBITION =NEW LIFE+SUCCESS HAPPY NEW YEAR!

let this year be terrorist free world

Wish the world walk forward in time with all its innocence and beauty where prevails only love, and hatred no longer found in the dictionary.

no

Wish u a very happy New Year Friends and make this year as a pleasant days…

I wish the economy would get better, so people can afford to pay their bills and live more comfortably again.

i wish, god makes life beautiful and very simple to all of us. and happy new year to world.

Be always at war with your vices, at peace with your neighbors, and let each new year find you a better man and I wish a very very prosperous new year.

i wish i would buy a house and car for my mom

I wish to have a new car.
This new year will be full of expectation in the field of investment.We concerned about US dollar. Hope this year will be a good for US dollar.

this year is very enjoyment life

Cheers to a New Year and another chance for us to get it right

to get married

Wishing all a meaningful,purposeful,healthier and prosperous New Year 2011.

WISH YOU A HAPPY NEW YEAR 2011 MAY BRING ALL HAPPINESS TO YOU

RAKKIMUTHU

In 2011 I wish for my family to get in a better spot financially and world peace.

Wish that economic conditions improve to the extent that the whole spectrum of society can benefit and improve themselves.

I want my divorce to be final and for my children to be happy.

This 2011 year is very good year for All with Health & Wealth.

I wish that things for my family would get better. We have had a terrible year and I am wishing that we can look forward to a better and brighter 2011.

This year bring peace and prosperity to all. Everyone attain the greatest goal of life. May god gives us meaning of life to all.

This new year will bring happy in everyone’s life and peace among countries.

I hope for bipartisanship and for people to realize blowing up other people isn’t the best way to get their point across. It just makes everyone else angry.

A better economy would be nice too

I wish that in 2011 the government will work together as a TEAM for the betterment of all. Peace in the world.

i wish you all happy new year. may god bless all……

no i wish for you

I wish that my family will move into our own house and we can be successful in getting good jobs for our future.

I wish my girl comes back to me

Wish You Happy New Year for All, especially to the workers and requester’s of Mturk.

Greetings!!!

Wishing you and your family a very happy and prosperous NEW YEAR – 2011

May this New Year bring many opportunities your way, to explore every joy of life and may your resolutions for the days ahead stay firm, turning all your dreams into reality and all your efforts into great achievements.

Wish u a Happy and Prosperous New Year 2011….

Wishing u lots of happiness..Success..and Love

and Good Health…….

Wish you a very very happy new year

WISHING YOU ALL A VERY HAPPY & PROSPEROUS NEW YEAR…….

I wish in this 2011 is to be happy,have a good health and also my family.

I pray that the coming year should bring peace, happiness and good health.

I wish for my family to continue to be healthy, for my cars to continue running, and for no 10th Anniversary attacks this upcoming September.

be a good and help full for my family .

Happy and Prosperous New Year

New day new morning new hope new efforts new success and new feeling,a new year a new begening, but old friends are never forgotten, i think all who touched my life and made life meaningful with their support, i pray god to give u a verry “HAPPY AND SUCCESSFUL NEW YEAR”.

Be a good person,as good as no one

wish this new year brings cheers and happiness to one and all.

For the year 2011 I simply wish for the ability to support my family properly and have a healthier year.

I wish I have luck with getting a better job.

Greater awareness of climate change, and a recovering US economy.

this new year 2011 brings you all prosperous and happiness in your life…….

happy newyear wishes to all the beautiful hearts in the world in the world.god bless you all.

wishing every happy new year to all my pals and relatives and to all my lovely countrymen

Disambiguating with DBpedia

Sketchy notes. Say you’re looking for an identifier for something, and you know it’s a company/organization, and you have a label “Woolworths”.

What can be done to choose amongst the results we find in DBpedia for this crude query?

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select distinct ?x where {
?x a <http://dbpedia.org/ontology/Organisation>;  rdfs:label ?l .
FILTER(REGEX(?l, “Woolworths*”)).
}

More generally, are the tweaks and tricks needed to optimise this sort of disambiguation going to be cross-domain, or do we have to hand-craft them, case by case?

Subject classification and Statistics

Subject classification and statistics share some common problems. This post takes a small example discussed at this week’s ODaF event on “Semantic Statistics” in Tilberg, and explores its expression coded in the Universal Decimal Classification (UDC). UDC supports faceted description, providing an abstract grammar allowing sentence-like subject descriptions to be composed from the “raw materials” defined in its vocabulary scheme.

This makes the mapping of UDC (and to some extent also Dewey classifications)  into W3C’s SKOS somewhat lossy, since patterns and conventions for documenting these complex, composed structures are not yet well established. In the NoTube project we are looking into this in a TV context, in large part because the BBC archives make extensive use of UDC via their Lonclass scheme; see my ‘investigating Lonclass‘ and UDC seminar talk for more on those scenarios. Until this week I hadn’t thought enough about the potential for using this to link deep into statistical datasets.

One of the examples discussed on Tuesday was as follows (via Richard Cyganiak):

“There were 66 fatal occupational injuries in the Washington, DC metropolitan area in 2008″

There was much interesting discussion in Tilburg about the proper scope and role of Linked Data techniques for sharing this kind of statistical data. Do we use RDF essentially as metadata, to find ‘black boxes’ full of stats, or do we use RDF to try to capture something of what the statistics are telling us about the world? When do we use RDF as simple factual data directly about the world (eg. school X has N pupils [currently; or at time t]), and when does it become a carrier for raw numeric data whose meaning is not so directly expressed at the factual level?

The state of the art in applying RDF here seems to be SDMX-RDF, see Richard’s slides. The SDMX-RDF work uses SKOS to capture code lists, to describe cross-domain concepts and to indicate subject matter.

Given all this, I thought it would be worth taking this tiny example and looking at how it might look in UDC, both as an example of the ‘compositional semantics’ some of us hope to capture in extended SKOS descriptions, but also to explore scenarios that cross-link numeric data with the bibliographic materials that can be found via library classification techniques such as UDC. So I asked the ever-helpful Aida Slavic (editor in chief of the UDC), who talked me through how this example data item looks from a UDC perspective.

I asked,

So I’ve just got home from a meeting on semweb/stats. These folk encode data values with stuff like “There were 66 fatal occupational injuries in the Washington, DC metropolitan area in 2008″. How much of that could have a UDC coding? I guess I should ask, how would subject index a book whose main topic was “occupational injuries in the Washington DC metro area in 2008″?

Aida’s reply (posted with permission):

You can present all of it & much more using UDC. When you encode a subject like this in UDC you store much more information than your proposed sentence actually contains. So my decision of how to ‘translate this into udc’ would depend on learning more about the actual text and the context of the message it conveys, implied audience/purpose, the field of expertise for which the information in the document may be relevant etc. I would probably wonder whether this is a research report, study, news article, textbook, radio broadcast?

Not knowing more then you said I can play with the following: 331.46(735.215.2/.4)”2008

Accidents at work — Washington metropolitan area — year 2008
or a bit more detailed:  331.46-053.18(735.215.2/.4)”2008
Accidents at work — dead persons – Washington metropolitan area — year 2008
[you can say the number of dead persons but this is not pertinent from point of view of indexing and retrieval]

…or maybe (depending what is in the content and what is the main message of the text) and because you used the expression ‘fatal injuries’ this may imply that this is more health and safety/ prevention area in health hygiene which is in medicine.

The UDC structures composed here are:

TIME “2008”

PLACE (735.215.2/.4)  Counties in the Washington metropolitan area

TOPIC 1
331     Labour. Employment. Work. Labour economics. Organization of  labour
331.4     Working environment. Workplace design. Occupational safety.  Hygiene at work. Accidents at work
331.46  Accidents at work ==> 614.8

TOPIC 2
614   Prophylaxis. Public health measures. Preventive treatment
614.8    Accidents. Risks. Hazards. Accident prevention. Persona protection. Safety
614.8.069    Fatal accidents

NB – classification provides a bit more context and is more precise than words when it comes to presenting content i.e. if the content is focused on health and safety regulation and occupation health then the choice of numbers and their order would be different e.g. 614.8.069:331.46-053.18 [relationship between] health & safety policies in prevention of fatal injuries and accidents at work.

So when you read  UDC number 331.46 you do not see only e.g. ‘accidents at work’ but  ==>  ‘accidents at work < occupational health/safety < labour economics, labour organization < economy
and when you see UDC number 614.8  it is not only fatal accidents but rather ==> ‘fatal accidents < accident prevention, safety, hazards < Public health and hygiene. Accident prevention

When you see (735.2….) you do not only see Washington but also United States, North America

So why is this interesting? A couple of reasons…

1. Each of these complex codes combines several different hierarchically organized components; just as they can be used to explore bibliographic materials, similar approaches might be of value for navigating the growing collections of public statistical data. If SKOS is to be extended / improved to better support subject classification structures, we should take care also to consider use cases from the world of statistics and numeric data sharing.

2. Multilingual aspects. There are plans to expose SKOS data for the upper levels of UDC. An HTML interface to this “UDC summary” is already available online, and includes collected translations of textual labels in many languages (see progress report) . For example, we can look up 331.4 and find (in hierarchical context) definitions in English (“Working environment. Workplace design. Occupational safety. Hygiene at work. Accidents at work”), alongside e.g. Spanish (“Entorno del trabajo. Diseño del lugar de trabajo. Seguridad laboral. Higiene laboral. Accidentes de trabajo”), CroatianArmenian, …

Linked Data is about sharing work; if someone else has gone to the trouble of making such translations, it is probably worth exploring ways of re-using them. Numeric data is (in theory) linguistically neutral; this should make linking to translations particularly attractive. Much of the work around RDF and stats is about providing sufficient context to the raw values to help us understand what is really meant by “66” in some particular dataset. By exploiting SDMX-RDF’s use of SKOS, it should be possible to go further and to link out to the wider literature on workplace fatalities. This kind of topical linking should work in both directions: exploring out from numeric data to related research, debate and findings, but also coming in and finding relevant datasets that are cross-referenced from books, articles and working papers. W3C recently launched a Library Linked Data group, I look forward to learning more about how libraries are thinking about connecting numeric and non-numeric information.

RDFa in Drupal 7: last call for feedback before alpha release

Stéphane has just posted a call for feedback on the Drupal 7 RDFa design, before the first official alpha release.

First reaction above all, is that this is great news! Very happy to see this work maturing.

I’ve tried to quickly suggest some tweaks to the vocab, by hacking his diagram in photoshop. All it really shows is that I’ve forgotten how to use photoshop, but I’ll upload it here anyway.

So if you click through to the full image, you can see my rough edits.

I’d suggest:

  1. Use (dcterms) dc:subject as the way of pointing from a document to it’s SKOS subject.
  2. Use (dcterms) dc:creator as the relationship between a document and the person that created it (note that in FOAF, we now declare foaf:maker to map as an equivalentProperty to (dcterms)dc:creator).
  3. Distinguish between the description of the person versus their account in the Drupal system; I would use foaf:Person for the human, and sioc:User (a kind of foaf:OnlineAccount) as the drupal account. The foaf property to link from the former to the latter is foaf:account (new name for foaf:holdsAccount).
  4. Focus on SIOC where it is most at-home: in modelling the structure of the discussion; threading, comments and dialog.
  5. Provide a generated URI for the person. I don’t 100% understand Stephane’s comment, “Hash URIs for identifying things different from the page describing them can be implemented quite easily but this case hasn’t emerged in core” but perhaps this will be difficult? I’d suggest using URIs ending “userpage#!person” so the fragment IDs can’t clash with HTML usage.

If the core release can provide this basic structure, including a hook for describing the human person rather than the site-specific account (ie. sioc:User) then extensions should be able to add their own richness. The current markup doesn’t quite work for that end, as the human user is only described indirectly (unless  I understand current reading of sioc:User).

Anyway, I’m nitpicking! This is really great, and a nice and well-deserved boost for the RDFa community.

WOT in RDFa?

(This post is written in RDFa…)

To the best of my knowledge, Ludovic Hirlimann‘s PGP fingerprint is 6EFBD26FC7A212B2E093 B9E868F358F6C139647C. You might also be interested in his photos on flickr, or his workplace, Mozilla Messaging. The GPG key details were checked over a Skype video call with me, Ludo and Kaare A. Larsen.

This blog post isn’t signed, the URIs it referenced don’t use SSL, and the image could be switched by evildoers at any time! But the question’s worth asking: is this kind of scruffy key info useful, if there’s enough of it? If I wrote it somehow in Thunderbird’s editor instead, would it be easier to sign? Will 99.9% of humans ever know enough of what’s going on to understand what signing a bunch of complex markup means?

For earlier discussion of this kind of thing, see Joseph Reagle’s Key-free Trust piece (“Does Google Show How the Semantic Web Could Replace Public Key Infrastructure?”). It’s more PKI-free trust than PK-free.

WordPress trust syndication revisited: F2F plugin

This is a followup to my Syndicating trust? Mediawiki, WordPress and OpenID post. I now have a simple implementation that exports data from WordPress: the F2F plugin. Also some experiments with consuming aggregates of this information from multiple sources.

FOAF has always had a bias towards describing social things that are shown rather than merely stated; this is particularly so in matters of trust. One way of showing basic confidence in others, is by accepting their comments on your blog or Web site. F2F is an experiment in syndicating information about these kinds of everyday public events. With F2F, others can share and re-use this sort of information too; or deal with it in aggregate to spread the risk and bring more evidence into their trust-related decisions. Or they might just use it to find interesting people’s blogs.

OpenID is a technology that lets people authenticate by showing they control some URL. WordPress blogs that use the OpenID plugin slowly accumulate a catalogue of URLs when people leave comments that are approved or rejected. In my previous post I showed how I was using the list of approved OpenIDs from my blog to help configure the administrative groups on the FOAF wiki.

This may all raise more questions than it answers. What level of detail is appropriate? are numbers useful, or just lists? in what circumstances is it sensible or risky to merge such data? is there a reasonable use for both ‘accept’ lists and ‘unaccept’ lists? What can we do with a list of OpenID URLs once we’ve got it? How do we know when two bits of trust ‘evidence’ actually share a common source? How do we find this information from the homepage of a blog?

If you install the F2F plugin (and have been using the OpenID plugin long enough to have accumulated a database table of OpenIDs associated with submitted comments), you can experiment with this. Basically it will generate HTML in RDFa format describing a list of people . See the F2F Wiki page for details and examples.

The script is pretty raw, but today it all improved a fair bit with help from Ed Summers, Daniel Krech and Morten Frederiksen. Ed and Daniel helped me get started with consuming this RDFa and SPARQL in the latest version of the rdflib Python library. Morten rewrote my initial nasty hack, so that it used WordPress Shortcodes instead of hardcoding a URL path. This means that any page containing a certain string – f2f in chunky brackets – will get the OpenID list added to it. I’ll try that now, right here in this post. If it works, you’ll get a list of URLs below. Also thanks to Gerald Oskoboiny for discussions on this and reputation-related aggregation ideas; see his page on reputation and trust for lost more related ideas and sites. See also Peter Williams’ feedback on the foaf-dev list.

Next steps? I’d be happy to have a few more installations of this, to get some testbed data. Ideally from an overlapping community so the datasets are linked, though that’s not essential. Ed has a copy installed currently too. I’ll also update the scripts I use to manage the FOAF MediaWiki admin groups, to load data from RDFa blogs; mine and others if people volunteer relevant data. It would be great to have exports from other software too, eg. Drupal or MediaWiki.

Comment accept list for http://danbri.org/words

WordPress, TinyMCE and RDFa editors

I’m writing this in WordPress’s ‘Visual’ mode WYSIWYG HTML editor, and thinking “how could it be improved to support RDFa?”

Well let’s think. Humm. In RDFa, every section of text is always ‘about’ something, and then has typed links or properties associated with that thing. So there are icons ‘B’ for bold, ‘I’ for italics, etc. And thingies for bulleted lists, paragraphs, fonts. I’m not sure how easily it would handle the nesting of full RDFa, but some common case could be a start: a paragraph that was about some specific thing, and within which the topical focus didn’t shift.

So, here is a paragraph about me. How to say it’s really about me? There could be a ‘typeof’ attribute on the paragraph element, with value foaf:Person, and an ‘about’ attribute with a URI for me, eg. http://danbri.org/foaf.rdf#danbri

In this UI, I just type textual paragraphs and then double newlines are interpreted by the publishing engine as separate HTML paragraphs. I don’t see any paragraph-level or even post-level properties in the user interface where an ‘about’ URI could be stored. But I’m sure something could be hacked. Now what about properties and links? Let’s see…

I’d like to link to TinyMCE now, since that is the HTML editor that is embedded in WordPress. So here is a normal link to TinyMCE. To make it, I searched for the TinyMCE homepage, copied its URL into the copy/paste buffer, then pressed the ‘add link’ icon here after highlighting the text I wanted to turn into a link.

It gave me a little HTML/CSS popup window with some options – link URL, target window, link title and (CSS) class; I’ve posted a screenshot below. This bit of UI seems like it could be easily extended for typed links. For example, if I were adding a link to a paragraph that was about a Film, and I was linking to (a page about) its director or an actor, we might want to express that using an annotation on the link. Due to the indirection (we’re trying to say that the link is to a page whose primary topic is the director, etc.), this might complicate our markup. I’ll come back to that later.

tinymce link edit screenshop

tinymce link edit screenshot

I wonder if there’s any mention of RDFa on the TinyMCE site? Nope. Nor even any mention of Microformats.

Perhaps the simplest thing to try to build into TinyMCE would be for a situation where a type on the link expressed a relationship between the topic of the blog post (or paragraph), and some kind of document.

For example, a paragraph about me, might want to annotate a link to my old school’s homepage (ie. http://www.westergate.w-sussex.sch.uk/ ) with a rel=”foaf:schoolHomepage” property.

So our target RDFa would be something like (assuming this markup is escaped and displayed nicely):

<p about=”http://danbri.org/foaf.rdf#danbri” typeof=”foaf:Person” xmlns:foaf=”http://xmlns.com/foaf/0.1/”>

I’m Dan, and I went to <a rel=”foaf:schoolHomepage” href=”http://www.westergate.w-sussex.sch.uk/”>Westergate School</a>

</p>.

See recent discussion on the foaf-dev list where we contrast this idiom with one that uses a relationship (eg. ‘school’) that directly links a person to a school, rather than to the school’s homepage. Toby made some example RDFa output of the latter style, in response to a debate about whether the foaf:schoolHomepage idiom is an ‘anti-pattern’. Excerpt:

<p about=”http://danbri.org/foaf.rdf#danbri” typeof=”foaf:Person” xmlns:foaf=”http://xmlns.com/foaf/0.1/”>

My name is <span property=”foaf:name”>Dan Brickley</span> and I spent much of the ’80s at <span rel=”foaf:school”><a typeof=”foaf:Organization” property=”foaf:name” rel=”foaf:homepage” href=”http://www.westergate.w-sussex.sch.uk/”>Westergate School</a>.

</p>

So there are some differences between these two idioms. From the RDF side they tell you similar things: the school I went to. The latter idiom is more expressive, and allows additional information to be expressed about the school, without mixing it up with the school’s homepage. The former simply says “this link is to the homepage of Dan’s school”; and presumably defers to the school site or elsewhere for more details.

I’m interested to learn more about how these expressivity tradeoffs relate to the possibilities for visual / graphical editors. Could TinyMCE be easily extended to support either idiom? How would the GUI recognise a markup pattern suitable for editing in either style? Can we avoid having to express relationships indirectly, ie. via pages? Would it be a big job to get some basic RDFa facilities into WordPress via TinyMCE?

For related earlier work, see the 2003 SWAD-Europe report on Semantic blogging from the team then at HP Labs Bristol.

Syndicating trust? Mediawiki, WordPress and OpenID

Fancy title but simple code. A periodic update script is setting user/group membership rules on the FOAF wiki based on a list of trusted (for this purpose) OpenIDs exported from a nearby blog. If you’ve commented on the blog using OpenID and it was accepted, this means you can also perform some admin actions (page deletes, moves, blocking spammers etc.) on the FOAF wiki without any additional fuss.

Both WordPress blogs and Mediawiki wikis have some support for OpenID logins.

The FOAF wiki until recently only had one Sysop and Bureaucrat account (a bureaucrat has the same privileges as a Sysop except for the ability to create new bureaucrat accounts). So I’ve begun an experiment exploring idea of pre-approving certain OpenIDs for bureaucrat activities. For now, I take a list of OpenIDs from my own blog; these appear to be just the good guys, but this might be because only real humans have commented on my blog via OpenID. With a bit of tweaking I’m sure I could write SQL to select out only OpenIDs associated with posts or comments I’ve accepted as non spammy, though.

So now there’s a script I can run (thanks tobyink and others in #swig IRC for help) which compares an externally supplied list of OpenID URIs with those OpenIDs known to the wiki, and upgrades the status of any overlaps to be bureaucrats. Currently the ‘syndication’ is trivial since the sites are on the same machine, and the UI is minimal; I haven’t figured out how best to convey this notion of ‘pre-approved upgrade’ to the people I’m putting in an admin group. Quite reasonably they might object to being misrepresented as contributors; who knows.

But all that aside, take a look and have a think. This kind of approach has a lot going for it. We will have all kinds of lists of people, groups of people, and in many cases we’ll know their OpenIDs. So why not pool what we know? If a blog or wiki has information about an OpenID that shows it is somehow trustworthy, or at least not obviously a spammer, there’s every reason to make notations (eg. FOAF/RDFa) that allow other such sites to harvest and integrate that data…

See also Dan Connolly’s DIG blog post on this, and the current list of Bureaucrats on the FOAF Wiki (and associated documentation). If your names on the list, it just means your OpenID was on a pre-approved list of folk who I trust based on their interactions with my own blog. I’d love to add more sources here and make it genuinely communal.

This is all part of the process of getting FOAF moving again. The brains of FOAF is in the IssueTracker page, and since the site was damaged by spammers and hackers recently I’m trying to make sure we have a happy / wholesome environment for maintaining shared documents. And that’s more than I can do as a solo admin, hence this design for opening things up…

Remote remotes

I’ve just closed the loop on last weekend’s XMPP / Apple Remote hack, using Strophe.js, a library that extends XMPP into normal Web pages. I hope I’ll find some way to use this in the NoTube project (eg. wired up to Web-based video playing in OpenSocial apps), but even if not it has been a useful learning experience. See this screenshot of a live HTML page, receiving and displaying remotely streamed events (green blob: button clicked; grey blob: button released). It doesn’t control any video yet, but you get the idea I hope.

Remote apple remote HTML demo

Remote apple remote HTML demo, screenshot showing a picture of handheld apple remote with a grey blob over the play/pause button, indicating a mouse up event. Also shows debug text in html indicating ButtonUpEvent: PLPZ.

This webclient needs the JID and password details for an XMPP account, and I think these need to be from the same HTTP server the HTML is published on. It works using BOSH or other tricks, but for now I’ve not delved into those details and options. Source is in the Buttons area of the FOAF svn: webclient. I made a set of images, for each button in combination with button-press (‘down’), button-release (‘up’). I’m running my own ejabberd and using an account ‘buttons@foaf.tv’ on the foaf.tv domain. I also use generic XMPP IM accounts on Google Talk, which work fine although I read recently that very chatty use of such services can result in data rates being reduced.

To send local Apple Remote events to such a client, you need a bit of code running on an OSX machine. I’ve done this in a mix of C and Ruby: imremoted.c (binary) to talk to the remote, and the script buttonhole_surfer.rb to re-broadcast the events. The ruby code uses Switchboard and by default loads account credentials from ~/.switchboardrc.

I’ve done a few tests with this setup. It is pretty responsive considering how much indirection is involved: but the demo UI I made could be prettier. The + and – buttons behave differently to the left and right (and menu and play/pause); only + and – send an event immediately. The others wait until the key is released, then send a pair of events. The other keys except for play/pause will also forget what’s happening unless you act quickly. This seems to be a hardware limitation. Apparently Apple are about to ship an updated $20 remote; I hope this aspect of the design is reconsidered, as it limits the UI options for code using these remotes.

I also tried it using two browsers side by side on the same laptop; and two laptops side by side. The events get broadcasted just fine. There is a lot more thinking to do re serious architecture, where passwords and credentials are stored, etc. But XMPP continues to look like a very interesting route.

Finally, why would anyone bother installing compiled C code, Ruby (plus XMPP libraries), their own Jabber server, and so on? Well hopefully, the work can be divided up. Not everyone installs a Jabber server. My thinking is that we can bundle a collection of TV and SPARQL XMPP functionality in a single install, such that local remotes can be used on the network, but also local software (eg. XBMC/Plex/Boxee) can also be exposed to the wider network – whether it’s XMPP .js running inside a Web page as shown here, or an iPhone or a multi-touch table. Each will offer different interaction possibilities, but they can chat away to each other using a common link, and common RDF vocabularies (an area we’re working on in NoTube). If some common micro-protocols over XMPP (sending clicks or sending commands or doing RDF queries) can support compelling functionality, then installing a ‘buttons adaptor’ is something you might do once, with multiple benefits. But for now, apart from the JQbus piece, the protocols are still vapourware. First I wanted to get the basic plumbing in place.

Update: I re-discovered a useful ‘which bosh server do you need?’ which reminds me that there are basically two kinds of BOSH software offering; those that are built into some existing Jabber/XMPP server (like the ejabberd installation I’m using on foaf.tv) and those that are stand-alone connection managers that proxy traffic into the wider XMPP network. In terms of the current experiment, it means the event stream can (within the XMPP universe) come from addresses other than the host running the Web app. So I should also try installing Punjab. Perhaps it will also let webapps served from other hosts (such as opensocial containers) talk to it via JSON tricks? So far I have only managed to serve working Buttons/Strophe HTML from the same host as my Jabber server, ie. foaf.tv. I’m not sure how feasible the cross-domain option is.

Update x2: Three different people have now mentioned opensoundcontrol to me as something similar, at least on a LAN; it clearly deserves some investigation