Flickr & MusicBrainz Machine tags: If you’ve got it, flaunt it

From Sander van Zoest at Uncensored Interview, a convention for representing MusicBrainz identifiers using Flickr’s Machine Tag mechanism.

Example:

A photo of Matthew Dear, tagged as follows:

It also includes a Wikipedia identifier which could be used to link to DBpedia (though this might duplicate information also available within MusicBrainz’s advanced relationships system). There must be many thousands of artist photos on Flickr; perhaps we’ll see tools to improve their tagging so they can be reused more easily…
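For anyone wanting to harvest such tags, here is a minimal sketch of parsing Flickr's machine tag syntax (namespace:predicate=value). The tag names and values below are hypothetical placeholders, not the actual tags from the photo or the exact names used in Sander's convention, and quoted values containing spaces are not handled.

```ruby
# Flickr machine tags have the shape "namespace:predicate=value".
MACHINE_TAG = /\A([a-z][a-z0-9_]*):([a-z][a-z0-9_]*)=(.+)\z/i

# Returns a hash for a machine tag, or nil for an ordinary tag.
def parse_machine_tag(tag)
  match = MACHINE_TAG.match(tag.strip)
  return nil unless match

  { namespace: match[1].downcase, predicate: match[2].downcase, value: match[3] }
end

# Hypothetical tags: the namespace/predicate names and the all-zero ID
# are placeholders for illustration only.
tags = [
  "musicbrainz:artist=00000000-0000-0000-0000-000000000000",
  "wikipedia:en=Example_Artist",
  "concert"
]

tags.map { |t| parse_machine_tag(t) }.compact.each do |mt|
  puts "#{mt[:namespace]} / #{mt[:predicate]} -> #{mt[:value]}"
end
```

Plain tags such as "concert" simply come back as nil, so a tool could fall through to ordinary tag handling for those.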

Nearby: Matthew Dear in DBpedia (RDF), in Freebase (RDF), …

Google Data APIs (and partial YouTube) supporting OAuth

Building on last month’s announcement of OAuth for the Google Contacts API, this from Wei on the oauth list:

Just want to let you know that we officially support OAuth for all Google Data APIs.

See blog post:

You’ll now be able to use standard OAuth libraries to write code that authenticates users to any of the Google Data APIs, such as Google Calendar Data API, Blogger Data API, Picasa Web Albums Data API, or Google Contacts Data API. This should reduce the amount of duplicate code that you need to write, and make it easier for you to write applications and tools that work with a variety of services from multiple providers. [...]

There’s also a footnote, “* OAuth also currently works for YouTube accounts that are linked to a Google Account when using the YouTube Data API.”

See the documentation for more details.
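To give a feel for what a standard OAuth library does on your behalf, here is a rough sketch of the OAuth 1.0 HMAC-SHA1 signing step. The endpoint URL, consumer key, token and secrets are all made-up placeholders, and nothing here is specific to Google's implementation. A real client would also handle the request-token and access-token exchange, which is exactly the duplicate code a library saves you from writing.

```ruby
require "openssl"

# Percent-encode everything except RFC 3986 unreserved characters.
def oauth_escape(value)
  value.to_s.gsub(/[^A-Za-z0-9\-._~]/) do |ch|
    ch.bytes.map { |b| format("%%%02X", b) }.join
  end
end

# Signature base string: METHOD & encoded URL & encoded, sorted parameters.
def signature_base_string(http_method, url, params)
  normalized = params.map { |k, v| [oauth_escape(k), oauth_escape(v)] }
                     .sort
                     .map { |k, v| "#{k}=#{v}" }
                     .join("&")
  [http_method.upcase, oauth_escape(url), oauth_escape(normalized)].join("&")
end

# HMAC-SHA1 over the base string, keyed by "consumer_secret&token_secret".
def hmac_sha1_signature(base_string, consumer_secret, token_secret = "")
  key = "#{oauth_escape(consumer_secret)}&#{oauth_escape(token_secret)}"
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new("SHA1"), key, base_string)
  [digest].pack("m0") # Base64 without line breaks
end

# Placeholder values only; none of these are real credentials or endpoints.
params = {
  "oauth_consumer_key" => "example-consumer-key",
  "oauth_token" => "example-token",
  "oauth_signature_method" => "HMAC-SHA1",
  "oauth_timestamp" => "1214000000",
  "oauth_nonce" => "example-nonce",
  "oauth_version" => "1.0"
}
base = signature_base_string("GET", "https://example.com/feeds/contacts", params)
puts hmac_sha1_signature(base, "example-consumer-secret", "example-token-secret")
```

The resulting value would be sent as the oauth_signature parameter alongside the others.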

On the YouTube front, I have no idea what percentage of their accounts are linked to Google; lots, I guess. Some interesting parts of the YouTube API: retrieve user profiles, access/edit contacts, and find videos uploaded or favourited by a particular user, plus of course per-video metadata (categories, keywords, tags, etc). There’s a lot you could do with this; in particular, it should be possible to find out more about a user by looking at the metadata for the videos they favourite.

Evidence-based profiles are often better than those that are merely asserted, without being grounded in real activity. The list of people I actively exchange mail or IM with is more interesting to me than the list of people I’ve added on Facebook or Orkut; the same applies with profiles versus tag-harvesting. This is why the combination of last.fm’s knowledge of my music listening behaviour with the BBC’s categorisation of MusicBrainz artist IDs is more interesting than asking me to type my ‘favourite band’ into a box. Finding out which bands I’ve friended on MySpace would also be a nice piece of evidence to throw into that mix (and possible, since MusicBrainz also notes MySpace URIs).

So what do these profiles look like? The YouTube ‘retrieve a profile’ API documentation has an example. It’s Atom-encoded, and beyond the video stuff mentioned above has fields like:

  <yt:age>33</yt:age>
  <yt:username>andyland74</yt:username>
  <yt:books>Catch-22</yt:books>
  <yt:gender>m</yt:gender>
  <yt:company>Google</yt:company>
  <yt:hobbies>Testing YouTube APIs</yt:hobbies>
  <yt:location>US</yt:location>
  <yt:movies>Aqua Teen Hungerforce</yt:movies>
  <yt:music>Elliott Smith</yt:music>
  <yt:occupation>Technical Writer</yt:occupation>
  <yt:school>University of North Carolina</yt:school>
  <media:thumbnail url='http://i.ytimg.com/vi/YFbSxcdOL-w/default.jpg'/>
  <yt:statistics viewCount='9' videoWatchCount='21' subscriberCount='1'
    lastWebAccess='2008-02-25T16:03:38.000-08:00'/>
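
As a rough sketch of how those fields might be pulled into a flat structure, the following uses Ruby's REXML. The wrapping entry element and the yt namespace URI are my assumptions about the surrounding Atom document (the excerpt above omits them), and only a few of the fields are reproduced.

```ruby
require "rexml/document"

# Assumed namespace URI for the yt: prefix; the excerpt does not show it.
YT_NS = "http://gdata.youtube.com/schemas/2007"

# A trimmed, assumed wrapper around a few of the fields from the example.
PROFILE_XML = <<~XML
  <entry xmlns="http://www.w3.org/2005/Atom"
         xmlns:yt="http://gdata.youtube.com/schemas/2007">
    <yt:age>33</yt:age>
    <yt:username>andyland74</yt:username>
    <yt:music>Elliott Smith</yt:music>
    <yt:statistics viewCount='9' videoWatchCount='21' subscriberCount='1'
      lastWebAccess='2008-02-25T16:03:38.000-08:00'/>
  </entry>
XML

# Collect the text of every yt: child element, keyed by its local name.
# Attribute-only elements such as yt:statistics are skipped.
def profile_fields(xml)
  doc = REXML::Document.new(xml)
  fields = {}
  doc.root.each_element do |el|
    next unless el.namespace == YT_NS

    text = el.text.to_s.strip
    fields[el.name] = text unless text.empty?
  end
  fields
end

profile_fields(PROFILE_XML).each { |name, value| puts "#{name}: #{value}" }
```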

Not a million miles away from the OpenSocial schema I was looking at yesterday, btw.

I haven’t yet found where it says what I can and can’t do with this information…

Mashed remote contrib: BBC music genres meet last.fm (meets OAuth)

I’m not at the BBC’s 2008 hackday-like event, Mashed. But here’s a quick hack based on the data the BBC audio and music team have made available. The data that caught my eye was “Genres for set of MusicBrainz Artists”, based on editorial data entered for bbc.co.uk/music. This is a simple file:

0039c7ae-e1a7-4a7d-9b49-0cbc716821a6    Rock and Indie
003abc43-e2bb-40e5-a080-3c4b9e56ea63    Classical
0053dbd9-bfbc-4e38-9f08-66a27d914c38    Classic Pop and Rock

It maps a MusicBrainz artist ID (increasingly the de facto open standard for identifying artists, at least in popular western music) to a simple genre label.
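
Reading that file into a lookup table is straightforward. This is a minimal sketch that assumes each line is an artist ID followed by whitespace and then the genre label, as in the three sample lines; it is not the code from my script.

```ruby
# MusicBrainz IDs are UUIDs: 8-4-4-4-12 hexadecimal digits.
MBID = /\A\h{8}-\h{4}-\h{4}-\h{4}-\h{12}\z/

# Build a hash of artist ID => genre label, ignoring malformed lines.
def parse_genres(text)
  genres = {}
  text.each_line do |line|
    mbid, genre = line.strip.split(/\s+/, 2)
    next unless mbid && genre && mbid.match?(MBID)

    genres[mbid] = genre
  end
  genres
end

SAMPLE_GENRES = <<~DATA
  0039c7ae-e1a7-4a7d-9b49-0cbc716821a6    Rock and Indie
  003abc43-e2bb-40e5-a080-3c4b9e56ea63    Classical
  0053dbd9-bfbc-4e38-9f08-66a27d914c38    Classic Pop and Rock
DATA

parse_genres(SAMPLE_GENRES).each { |mbid, genre| puts "#{mbid} => #{genre}" }
```

Splitting with a limit of 2 keeps multi-word labels such as "Classic Pop and Rock" intact.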

I haven’t yet found corresponding pages on the BBC music site for each of these genres.

Since last.fm expose my last 12 months’ most commonly played artists for all to mock, it is quite easy to cross-reference these sources to get a summary of my alleged musical interests.
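
The cross-referencing itself amounts to a lookup and a tally. In this sketch the list of artist IDs is a hard-coded stand-in for whatever last.fm returns (fetching it is not shown), and the genre table holds just the three sample entries from the BBC file; the final all-zero ID is a made-up artist with no genre entry.

```ruby
# Count genres for a list of artist IDs, skipping artists with no genre entry.
def genre_counts(artist_ids, genres_by_mbid)
  counts = Hash.new(0)
  artist_ids.each do |mbid|
    genre = genres_by_mbid[mbid]
    counts[genre] += 1 if genre
  end
  counts
end

# The three sample rows from the BBC genre file.
GENRES_BY_MBID = {
  "0039c7ae-e1a7-4a7d-9b49-0cbc716821a6" => "Rock and Indie",
  "003abc43-e2bb-40e5-a080-3c4b9e56ea63" => "Classical",
  "0053dbd9-bfbc-4e38-9f08-66a27d914c38" => "Classic Pop and Rock"
}.freeze

# Hypothetical stand-in for a user's top artists from last.fm.
TOP_ARTIST_IDS = [
  "0039c7ae-e1a7-4a7d-9b49-0cbc716821a6",
  "003abc43-e2bb-40e5-a080-3c4b9e56ea63",
  "00000000-0000-0000-0000-000000000000"
].freeze

genre_counts(TOP_ARTIST_IDS, GENRES_BY_MBID).each do |genre, count|
  puts "#{genre}: #{count}"
end
```

Artists missing from the BBC data are silently dropped here; a fuller version might report how many were unmatched.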

A command-line Ruby script, online for now:

Airbag:mashed danbri$ ruby lastfm-genres.rb
Classic Pop and Rock: 13
Rock and Indie: 17
Hip Hop; RnB and Dance Hall: 1
World: 1
Dance and Electronica: 12

It’s a while since I wrote any code, clearly: this should at least be sorted and trimmed to the top 3 or so. We’d need to look at a few people’s profiles to figure out the best approach to summarising someone’s interests, and a little thought is needed for representing this in RDF/FOAF.
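
The sorting and trimming is a small change. A minimal sketch, using the counts from the output above and an arbitrary cut-off of three; ties are broken alphabetically, which is my choice rather than anything principled.

```ruby
# Return the highest-count genres, largest first; ties break alphabetically.
def top_genres(counts, limit = 3)
  counts.sort_by { |genre, count| [-count, genre] }.first(limit)
end

# Counts copied from the script output above.
GENRE_TOTALS = {
  "Classic Pop and Rock" => 13,
  "Rock and Indie" => 17,
  "Hip Hop; RnB and Dance Hall" => 1,
  "World" => 1,
  "Dance and Electronica" => 12
}.freeze

top_genres(GENRE_TOTALS).each { |genre, count| puts "#{genre}: #{count}" }
# Prints:
#   Rock and Indie: 17
#   Classic Pop and Rock: 13
#   Dance and Electronica: 12
```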

Now where I see OAuth fitting into this picture is the “what do we do next” step. OAuth potentially addresses a problem we’ve had in the FOAF scene, whereby FOAF generators and adaptors produce a chunk of markup, but there’s no easy/natural way to post this back into the Web. I’m hoping that blogs and hosting sites will allow external FOAF sources (like this script) to update/augment the FOAF descriptions we host in our existing Web sites and profiles. I sent some notes on this to the OAuth list (albeit to a deafening silence).

See also:  mashed last.fm / bbc genres ruby script