Twitter Iran RT chaos

From Twitter in the last few minutes, a chaos of echo’d posts about army moves. Just a few excerpts here by copy/paste, mostly without the all-important timestamps. Without tools to trace reports to their source, to claims about their source from credible intermediaries, or to evidence, this isn’t directly useful. Even grassroots journalists need evidence. I wonder how Witness and Identi.ca fit into all this. I was thinking today about an “(person) X claims (person) Y knows about (topic) Z” notation, perhaps built from FOAF+SKOS. But looking at this “Army moving in…” claim, I think something couched in terms of positive claims (along the lines of the old OpenID showcase site Jyte) might be more appropriate.

The following is from my copy/paste from Twitter a few minutes ago. It gives a flavour of the chaos. Note also that observations from very popular users (such as stephenfry) can echo around for hours, often chased by attempts at clarification from others.

(“RT” is Twitter notation for re-tweet, meaning that the following content is redistributed, often in abbreviated or summarised form)

plotbunnytiff: RT @suffolkinace: RT From Iran: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection
r0ckH0pp3r: RT .@AliAkbar: RT From Iran: CONFIRMED!! Army moving into Tehran against protesters! PLEASE RT! URGENT! #IranElection
jax3417: RT @ktyladie: RT @GennX: RT From Iran: CONFIRMED!! Army moving into Tehran against protesters! PLEASE RT! URGENT! #IranElection #iran
ktladie: RT @GennX: RT From Iran: CONFIRMED!! Army moving into Tehran against protesters! PLEASE RT! URGENT! #IranElection #iran
MellissaTweets: RT @AliAkbar: RT From Iran: CONFIRMED!! Army moving into Tehran against protesters! PLEASE RT! URGENT! #IranElection
GennX: RT @MelissaTweets: RT @AliAkbar: RT From Iran: CONFIRMED!! Army moving into Tehran against protesters! PLEASE RT! URGENT! #IranElection

The above all arrived at around the same time, and cite two prior “sources”:

suffolkinace: RT From Iran: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection
18 minutes ago from web

Who is this? Nobody knows of course, but there’s a Twitter bio:

http://twitter.com/suffolkinace # Bio Some-to-be Royal Military Policeman in the British Army. Also a massive Xbox geek and part-time comedian

The other “source” seems to be http://twitter.com/AliAkbar
AliAkbar: RT From Iran: CONFIRMED!! Army moving into Tehran against protesters! PLEASE RT! URGENT! #IranElection
about 1 hour ago from web
url http://republicmodern.com

This leads us to http://republicmodern.com/about where we’re told:
“Ali Akbar is the founder and president of Republic Modern Media. A conservative blogger, he is a contributor to Right Wing News, Hip Hop Republican, and co-host of The American Resolve online radio show. He was also the editor-in-chief of Blogs for McCain.”

I should also mention that a convention emerged in the last day or two to replace the names of specific local Twitter users in Tehran with a generic “from Iran”, to avoid getting anyone into trouble. This makes plenty of sense, but with nobody in the middle vouching for sources, it becomes even harder to know which reports to take seriously.
More… back to Twitter search: what’s happened since I started this post?

http://twitter.com/#search?q=iranelection%20army

badmsm: RT @dpbkmb @judyrey: RT From Iran: CONFIRMED!! Army moving into Tehran against protesters! PLZ RT! URGENT! #IranElection #gr88
SimaoC: RT @parizot: CONFIRMÉ! L’armée se dirige vers Téhéran contre les manifestants! #IranElection #gr88
SpanishClash: RT @mytweetnickname: RT From Iran:ARMY movement NOT confirmed in last 2:15, plz RT this until confrmed #IranElection #gr88
artzoom: RT @matyasgabor @humberto2210: RT CONFIRMED!! Army moving into Tehran against protesters! PLEASE RT! #IranElection #iranrevolution
sjohnson301: RT @RonnyPohl From Iran: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection #iran9
dauni: RT @withoutfield: RT: @tspe: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection
interdigi: RT @ivanpinozas From Iran: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection
PersianJustice: Once again, stop RT army movements until source INSIDE Iran verifies! Paramilitary is the threat anyway. #iranelection #gr88
Klungtveit: Anyone: What’s the origin of reports of “army moving in” on protesters? #iranelection
Eruethemar: RT @brianlltdhq: RT @lumpuckaroo: Only IRG moving, not national ARMY… this is confirmed for real #IranElection #gr88
SAbbasRaza: RT @bymelissa: RT @alexlobov: RT From Iran: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection
timnilsson: RT @Iridium24: CONFIRMED!! Army moving into Tehran against protesters! PLEASE RT! URGENT! #IranElection
edmontalvo: RT @jasona: RT @Marble68: RT From Iran: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection
stevelabate: RT army moving into Tehran against protesters. Please RT. #iranelection
ivanpinozas: From Iran: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection
bschh: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection (via @dlayphoto)
dlayphoto: RT From Iran: CONFIRMED!! Army moving into Tehran against protestors! PLEASE RT! URGENT! #IranElection

In short … chaos!

Is this just a social / information problem, or can different tooling and technology help filter out what on earth is happening?
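
As a small illustration of what tooling might do here, the following is a minimal Python sketch (not an existing tool) that pulls the chain of cited accounts out of pasted lines like those above and counts which account each chain ultimately points at. The function names and the regular expression are my own assumptions. It only sees the names people chose to type after “RT”, so it traces claimed sources, not real ones.

```python
import re
from collections import Counter

# Matches "RT @user", "RT .@user" and "RT: @user", as seen in the pasted tweets.
# A bare "RT From Iran" or "PLEASE RT!" does not match, since no @name follows.
RT_MENTION = re.compile(r"\bRT:?\s+\.?@(\w+)")

def rt_chain(line):
    """Split 'poster: text' and return (poster, [cited users, nearest first])."""
    poster, _, text = line.partition(": ")
    return poster.strip(), RT_MENTION.findall(text)

def claimed_origins(lines):
    """Count the furthest-back account named in each retweet chain."""
    origins = Counter()
    for line in lines:
        poster, chain = rt_chain(line)
        # With no "RT @user" at all, the poster is the only name we have.
        origins[chain[-1] if chain else poster] += 1
    return origins

# The repeated message text, copied from the excerpts above.
MSG = ("RT From Iran: CONFIRMED!! Army moving into Tehran against protesters! "
       "PLEASE RT! URGENT! #IranElection")

sample = [
    f"r0ckH0pp3r: RT .@AliAkbar: {MSG}",
    f"jax3417: RT @ktyladie: RT @GennX: {MSG} #iran",
    f"ktladie: RT @GennX: {MSG} #iran",
    f"MellissaTweets: RT @AliAkbar: {MSG}",
    f"GennX: RT @MelissaTweets: RT @AliAkbar: {MSG}",
]

if __name__ == "__main__":
    for name, count in claimed_origins(sample).most_common():
        print(name, count)
```

Even this toy version shows the limits: GennX appears as an “origin” in two chains while itself citing AliAkbar, and near-identical account names cannot be reconciled without timestamps or stable message identifiers.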

Family trees, Gedcom::FOAF in CPAN, and provenance

Ever wondered who the mother(s) of Adam and Eve’s grand-children were? Me too. But don’t expect SPARQL or the Semantic Web to answer that one! Meanwhile, …

You might nevertheless care to try the Gedcom::FOAF CPAN module from Brian Cassidy. It can read Gedcom, a popular ‘family history’ file format, and turn it into RDF (using FOAF and the relationship and biography vocabularies). A handy tool that can open up a lot of data to SPARQL querying.

The Gedcom::FOAF API seems to focus on turning the people or family Gedcom entries into their own FOAF XML files. I wrote a quick and horrid Perl script that runs over a Gedcom file and emits a single flattened RDF/XML document. The output still contains generated URIs for XML files that don’t exist, but this isn’t a huge problem.

Perhaps someone would care to take a look at this code and see whether a more RDFa- and linked-data-friendly script would be useful?

Usage: perl gedcom2foafdump.pl BUELL001.GED > _sample_gedfoaf.rdf

The sample data I tested it on is intriguing, though I’ve not really looked around it yet.

It contains over 9800 people including the complete royal lines of England, France, Spain and the partial royal lines of almost all other European countries. It also includes 19 United States Presidents descended from royalty, including Washington, both Roosevelts, Bush, Jefferson, Nixon and others. It also has such famous people as Brigham Young, William Bradford, Napoleon Bonaparte, Winston Churchill, Anne Bradstreet (Dudley), Jesus Christ, Daniel Boone, King Arthur, Jefferson Davis, Brian Boru King of Ireland, and others. It goes all the way back to Adam and Eve and also includes lines to ancient Rome including Constantine the Great and ancient Egypt including King Tutankhamen (Tut).

The data is credited to Matt & Ellie Buell, “Uploaded By: Eochaid”, 1995-05-25.

Here’s an extract to give an idea of the Gedcom form:

0 @I4961@ INDI
1 NAME Adam //
1 SEX M
1 REFN +
1 BIRT
2 DATE ABT 4000 BC
2 PLAC Eden
1 DEAT
2 DATE ABT 3070 BC
1 FAMS @F2398@
1 NOTE He was the first human on Earth.
1 SOUR Genesis 2:20 KJV
0 @I4962@ INDI
1 NAME Eve //
1 SEX F
1 REFN +
1 BIRT
2 DATE ABT 4000 BC
2 PLAC Eden
1 FAMS @F2398@
1 SOUR Genesis 3:20 KJV

It might not directly answer the great questions of biblical scholarship, but it could be a fun dataset to explore Gedcom / RDF mappings with. I wonder how it compares with Freebase, DBpedia etc.
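
For anyone who wants to poke at the raw form before mapping it to RDF, here is a minimal Python sketch of reading level-numbered Gedcom lines into nested records. It is not how Gedcom::FOAF works internally; the `parse_gedcom` and `child_value` names are my own, and it ignores parts of the format such as continuation lines. The sample text is the extract above.

```python
def parse_gedcom(text):
    """Parse level-numbered GEDCOM lines into a list of nested record dicts."""
    records = []
    stack = []  # stack[i] is the currently open node at level i
    for raw in text.splitlines():
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        level = int(parts[0])
        rest = parts[2] if len(parts) > 2 else ""
        if parts[1].startswith("@"):
            # Record lines look like "0 @I4961@ INDI": id first, then tag.
            node = {"id": parts[1], "tag": rest, "value": "", "children": []}
        else:
            node = {"id": None, "tag": parts[1], "value": rest, "children": []}
        del stack[level:]  # close any nodes at this level or deeper
        if stack:
            stack[-1]["children"].append(node)
        else:
            records.append(node)
        stack.append(node)
    return records

def child_value(node, *path):
    """Follow a tag path such as ("BIRT", "DATE"); return the value or None."""
    for tag in path:
        node = next((c for c in node["children"] if c["tag"] == tag), None)
        if node is None:
            return None
    return node["value"]

SAMPLE = """\
0 @I4961@ INDI
1 NAME Adam //
1 SEX M
1 REFN +
1 BIRT
2 DATE ABT 4000 BC
2 PLAC Eden
1 DEAT
2 DATE ABT 3070 BC
1 FAMS @F2398@
1 NOTE He was the first human on Earth.
1 SOUR Genesis 2:20 KJV
0 @I4962@ INDI
1 NAME Eve //
1 SEX F
1 REFN +
1 BIRT
2 DATE ABT 4000 BC
2 PLAC Eden
1 FAMS @F2398@
1 SOUR Genesis 3:20 KJV
"""

if __name__ == "__main__":
    for person in parse_gedcom(SAMPLE):
        print(person["id"], child_value(person, "NAME"),
              child_value(person, "BIRT", "DATE"), child_value(person, "SOUR"))
```

Note that the SOUR value comes through as a plain string attached to the whole person record, which is exactly the kind of loosely structured provenance discussed below.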

The Perl module is a good start for experimentation but it only really scratches the surface of the problem of representing source/provenance and uncertainty. On which topic, Jeni Tennison has a post from a year ago that’s well worth (re-)reading.

What I’ve done in the above little Perl script is implement a simplification: instead of each family description being its own separate XML file, they are all squashed into a big flat set of triples (‘graph’). This may or may not be appropriate, depending on the sourcing of the records. It seems Gedcom offers some basic notion of ‘source’, although not one expressed in terms of URIs. If I look in the SOUR(ce) field in the Gedcom file, I see information like this (which currently seems to be ignored in the Gedcom::FOAF mapping):

grep SOUR BUELL001.GED | sort | uniq

1 NOTE !SOURCE:Burford Genealogy, Page 102 Cause of Death; Hemorrage of brain
1 NOTE !SOURCE:Gertrude Miller letter “Harvey Lee lived almost 1 year. He weighed
1 NOTE !SOURCE:Gertrude Miller letter “Lynn died of a ruptured appendix.”
1 NOTE !SOURCE:Gertrude Miller letter “Vivian died of a tubal pregnancy.”
1 SOUR “Castles” Game Manuel by Interplay Productions
1 SOUR “Mayflower Descendants and Their Marriages” pub in 1922 by Bureau of
1 SOUR “Prominent Families of North Jutland” Pub. in Logstor, Denmark. About 1950
1 SOUR /*- TUT
1 SOUR 273
1 SOUR AHamlin777.  E-Mail “Descendents of some guy
1 SOUR Blundell, Sherrie Lea (Slingerland).  information provided on 16 Apr 1995
1 SOUR Blundell, William, Rev. Interview on Jan 29, 1995.
1 SOUR Bogert, Theodore. AOL user “TedLBJ” File uploaded to American Online
1 SOUR Buell, Barbara Jo (Slingerland)
1 SOUR Buell, Beverly Anne (Wenge)
1 SOUR Buell, Beverly Anne (Wenge).  letter addressed to Kim & Barb Buell dated
1 SOUR Buell, Kimberly James.
1 SOUR Buell, Matthew James. written December 19, 1994.
1 SOUR Burnham, Crystal (Harris).  Leter sent to Matt J. Buell on Mar 18, 1995.
1 SOUR Burnham, Crystal Colleen (Harris).  AOL user CBURN1127.  E-mail “Re: [...etc.]

Some of these sources could be tied to cleaner IDs (eg. for books c/o Open Library, although see ‘in search of cultural identifiers’ from Michael Smethurst).

I believe RDF’s SPARQL language gives us a useful tool (the notion of ‘GRAPH’) that can be applied here, but we’re a long way from having worked out the details when it comes to attaching evidence to claims. So for now, we in the RDF scene have a fairly coarse-grained approach to data provenance. Databases are organized into batches of triples, ie. RDF statements that claim something about the world. And while we can use these batches – aka graphs – in our queries, we haven’t really figured out what kind of information we want to associate with them yet. Which is a pity, since this could have uses well beyond family history, for example in online journalistic practices and blog-mediated fact checking.
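
To make the ‘batches of triples’ idea concrete, here is a toy Python sketch of statements stored as quads (graph, subject, predicate, object), with a separate place to hang information about each graph. It is not an RDF library or a SPARQL engine; the `QuadStore` class, the `urn:example:` identifiers, the `birthDate` property name and the second graph’s date are all invented for illustration. Only the first graph’s date and source text come from the Gedcom extract above.

```python
from collections import defaultdict

class QuadStore:
    """Toy store: each statement remembers which graph (batch) asserted it."""

    def __init__(self):
        self.quads = set()
        self.graph_info = defaultdict(dict)  # per-graph provenance notes

    def add(self, graph, s, p, o):
        self.quads.add((graph, s, p, o))

    def describe_graph(self, graph, **info):
        """Attach free-form provenance information to a graph."""
        self.graph_info[graph].update(info)

    def match(self, s=None, p=None, o=None):
        """Yield (graph, s, p, o) for matching quads; None acts as a wildcard."""
        for g, qs, qp, qo in sorted(self.quads):
            if s in (None, qs) and p in (None, qp) and o in (None, qo):
                yield g, qs, qp, qo

# Hypothetical identifiers, made up for this sketch.
ADAM = "urn:example:person/I4961"
BUELL = "urn:example:graph/buell"
OTHER = "urn:example:graph/other"

store = QuadStore()
store.add(BUELL, ADAM, "birthDate", "ABT 4000 BC")  # from the Gedcom extract
store.add(OTHER, ADAM, "birthDate", "ABT 3760 BC")  # invented conflicting claim
store.describe_graph(BUELL, source="BUELL001.GED", cites="Genesis 2:20 KJV")
store.describe_graph(OTHER, source="(hypothetical second family tree)")

if __name__ == "__main__":
    # Roughly the question a SPARQL "GRAPH ?g { ... }" pattern lets us ask:
    # which batch says what, and what do we know about that batch?
    for g, s, p, o in store.match(s=ADAM, p="birthDate"):
        print(o, "according to", g, store.graph_info[g])
```

The open question from the paragraph above is what belongs in `graph_info`: here it is an unstructured dictionary, precisely because there is no agreed vocabulary yet for describing who asserted a batch of statements and on what evidence.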

Nearby in the Web: see also the SIOC/SWAN telecons, a collaboration in the W3C SemWeb lifescience community around the topic of modelling scientific discourse.