As ever, I write one post that perhaps should’ve been two. This is about the use and linking of datasets that aid ‘second screen’ (smartphone, tablet) TV remotes, and it takes as a quick example a navigation widget and underlying dataset that show us how we might expect to navigate TV archives, in some future age when TV lives more fully in the World Wide Web. I argue that access to the ‘raw data’ and frameworks for embedding visualisation apps are of equal importance when thinking about innovative ways of exploring the ever-growing archives. All of this comes from many discussions with my NoTube colleagues and other collaborators; rambling scribblyness is all my own.
Ben Hammersley points us at a lovely Flash visualization, “Mapping the Republic of Letters” (http://www.stanford.edu/group/toolingup/rplviz/).
From the YouTube overview, “Researchers map thousands of letters exchanged in the 18th century’s “Republic of Letters” and learn at a glance what it once took a lifetime of study to comprehend.”
Mapping the Republic of Letters has at its center a multidimensional data set which spans 300 years and nearly 100,000 letters. We use computing tools that help us to measure and analyze data quantitatively, though that will not take us to our goal. While we use software and computing techniques that were designed for scientific and statistical methods, we are seeking to develop computing tools to enhance humanistic methods, to help us to explore qualitative aspects of the Republic of Letters. The subject of our study and the nature of the material require it. The collections of correspondence and records of travel from this period are incomplete. Of that incomplete material only a fraction has been digitized and is available to us. Making connections and resolving ambiguities in the data is something that can only be done with the help of computing, but cannot be done by computing alone. (from ‘methods and philosophy‘)
See their detailed writeup for more on this fascinating and quite beautiful work. As I’m working lately on linking TV content more deeply into the Web, and on ‘second screen’ navigation, this struck me as just the kind of interface which it ought to be possible to re-use on a tablet PC to explore TV archives. Forgetting for the moment difficulties with Flash on iPads and so on, the idea roughly is that it would be great to embed such a visualization within a TV watching environment, such that when the ‘republic of letters’ widget is focussed on some person, place, or topic, we should have the opportunity to scan the available TV archives for related materials to show.
So a glance at Chrome’s ‘developer tools’ panel gave me a link to the underlying data used by the visualisation. I don’t know exactly whose it is, nor how they want it used, so please treat it with respect. Still, there it is, sat in the Web, in tab-separated format, begging to be used. There’s a lot you can do with the Flash application that I’ve barely touched, but I’m intrigued by the underlying dataset. In particular, where they have the string “Tonson, Jacob”, the data linker in me wants to see a Wikipedia or DBpedia link, since they provide explanation, context, related people, places and themes; all precious assets when trying to scrape together related TV materials to inform, educate or entertain someone with. From a few test searches, it turns out that (many? most?) of the correspondents are quite easily matched to Wikipedia: William Congreve; Montagu, 1st earl of Halifax, Charles; Hough, bishop of Worcester, John; Stanyan, Abraham; … Voltaire and others. But what about the data?
Lately I’ve been learning just a little about R, a language used mainly for statistics and related analysis. Here’s what it’ll do ‘out of the box’, in untrained hands:
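# Read the tab-separated data; stringsAsFactors=TRUE is needed on R 4.0+ so summary() counts values
letters <- read.csv('data.txt', sep='\t', header=TRUE, stringsAsFactors=TRUE)
# Keep only the rows where Voltaire is the author
v_author <- letters$Author == "Voltaire"
v_letters <- letters[v_author, ]
# Count Voltaire's letters by destination country, shown as a one-column table
cbind(summary(v_letters$dest_country))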
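As a rough sketch of the linking idea raised earlier, the same sort of untrained R can turn an inverted name such as “Tonson, Jacob” into a candidate DBpedia URI. The helper name candidate_dbpedia_uri and the sample vector are made up for illustration. The code only guesses a URI from the usual http://dbpedia.org/resource/Forename_Surname pattern; it does not check that the resource exists or that it is the right person. Names carrying titles (“Montagu, 1st earl of Halifax, Charles”) are deliberately left as NA for a human or a smarter matcher.

```r
# Hypothetical helper: turn "Surname, Forename" into a candidate DBpedia URI.
# This is a guess at a URI only; nothing here checks that the page exists.
candidate_dbpedia_uri <- function(name) {
  parts <- trimws(strsplit(name, ",", fixed = TRUE)[[1]])
  if (length(parts) == 1) {
    label <- parts[1]                    # single name, e.g. "Voltaire"
  } else if (length(parts) == 2) {
    label <- paste(parts[2], parts[1])   # "Tonson, Jacob" -> "Jacob Tonson"
  } else {
    return(NA_character_)                # titles and the like: too ambiguous to guess
  }
  paste0("http://dbpedia.org/resource/", gsub(" ", "_", label, fixed = TRUE))
}

# Sample strings taken from the correspondents mentioned above
correspondents <- c("Tonson, Jacob", "Stanyan, Abraham", "Voltaire",
                    "Montagu, 1st earl of Halifax, Charles")
uris <- vapply(correspondents, candidate_dbpedia_uri, character(1),
               USE.NAMES = FALSE)
print(uris)
```

Anything that comes back as NA, or that resolves to the wrong person, would still need checking by hand.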
The requirements of our project are very much in sync with current work being done in the linked-data/ semantic web community and in the data visualization community, which is why collaboration with computer science has been critical to our project from the start.
Interesting thoughts and observations as always. Regarding your comment:
“In particular, where they have the string “Tonson, Jacob”, the data linker in me wants to see a Wikipedia or DBpedia link, since they provide explanation, context, related people, places and themes; all precious assets when trying to scrape together related TV materials to inform, educate or entertain someone with. From a few test searches, it turns out that (many? most?) of the correspondents are quite easily matched to Wikipedia: William Congreve; Montagu, 1st earl of Halifax, Charles; Hough, bishop of Worcester, John; Stanyan, Abraham; … Voltaire and others.”
I share your basic instinct, but might also suggest VIAF (The Virtual International Authority File) as a possible source of links, potentially some assistance with disambiguation of names, and potentially over time a far better mechanism for leading scholars to additional scholarly resources (assuming that VIAF identifiers become part of the library/archives/museum discovery space).
Thanks Eric! Yes, I should’ve mentioned VIAF as well; I’ve enjoyed the discussions on the Linked Library Data W3C group about VIAF, SKOS, Dublin Core, FOAF and suchlike. And I definitely feel it’s better to have the Library world handle this kind of cultural heritage authority control, rather than leaving it to chance, i.e. Wikipedia. Wikipedia probably isn’t the best mechanism for deciding whether some of these people were sufficiently ‘notable’ or not, after all.