Obama Web jobs in Boston

How could this not be a fun way to spend 6 months?

Obama for America is looking for exceptionally talented web developers who want to play a key role in a historic political campaign and help elect Barack Obama as the next President of the United States.

This six-month opportunity will allow you to:

  • Create software tools which will enable an unprecedented nationwide voter contact and mobilization effort
  • Help build and run the largest online, grassroots fundraising operation in the history of American politics
  • Introduce cutting-edge social networking and online organizing to the democratic process by empowering everyday people to participate on My.BarackObama

They also have a security expert position open.

Successful candidates will join the development team in Boston, MA.

Almost makes me wish I were a US citizen. Sorry, ma’am.

OAuth support in Google Accounts and Contacts API

From Wei on the oauth list:

We are happy to announce that the Google Contacts Data API now supports OAuth. This is our first step towards OAuth enabling all Google Data APIs. Please note that this is an alpha release and we may make changes to the protocol before the official release.

See the announcement thread for endpoint URL details, supporting tools and implementation discussion.

For more on the Contacts API, see the developer’s guide (although bear in mind it may not be up to date w.r.t. OAuth).
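Out of curiosity, I sketched what the three-legged dance might look like from Ruby using the oauth gem. Treat this as a hedged sketch rather than gospel: the endpoint paths and scope parameter are my reading of the alpha setup, so defer to the announcement thread for the authoritative details, and the consumer key/secret placeholders are yours to fill in.

```ruby
require 'rubygems'
require 'oauth'

# Google's OAuth endpoints, as I understand the alpha announcement;
# check the thread for the current values.
consumer = OAuth::Consumer.new(
  'YOUR_CONSUMER_KEY', 'YOUR_CONSUMER_SECRET',
  :site               => 'https://www.google.com',
  :request_token_path => '/accounts/OAuthGetRequestToken',
  :authorize_path     => '/accounts/OAuthAuthorizeToken',
  :access_token_path  => '/accounts/OAuthGetAccessToken'
)

# Three-legged dance: request token, then user authorization, then
# access token. Google expects the API scope when requesting the token.
request_token = consumer.get_request_token({}, :scope => 'https://www.google.com/m8/feeds/')
puts "Visit to authorize: #{request_token.authorize_url}"
# ...after the user has approved access...
access_token = request_token.get_access_token

# A signed GET against the 'all contacts' feed.
response = access_token.get('/m8/feeds/contacts/default/full')
puts response.body
```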

Journals of Negative Results

Via the INDUCTIVE mailing list, I learned of the Journal of Interesting Negative Results in Natural Language Processing and Machine Learning:

It is becoming more and more obvious that the research community in general, and those who work in NLP and ML in particular, are biased towards publishing successful ideas and experiments. Insofar as both our research areas focus on theories “proven” via empirical methods, we are sure to encounter ideas that fail at the experimental stage for unexpected, and often interesting, reasons. Much can be learned by analysing why some ideas, while intuitive and plausible, do not work. The importance of counter-examples for disproving conjectures is already well known. Negative results may point to interesting and important open problems. Knowing directions that lead to dead-ends in research can help others avoid replicating paths that take them nowhere. This might accelerate progress or even break through walls!

That’s healthy thinking, although the site/project/journal seems very new; there’s not much up there yet. However, it does have a page of links to other such journals, events, forums and articles in favour of documenting scientific failures. Listed in there is an upcoming AAAI-08 Workshop, What Went Wrong and Why: Lessons from AI Research and Applications.

The workshop has its own Web site, with materials from an earlier 2006 event featuring intriguing abstracts from Douglas Lenat, John McCarthy and others.

The second workshop will continue our analysis of failures in research. In addition to examining the links between failure and insight, we would like to determine if there is a hidden structure behind our tendency to make mistakes that can be utilized to provide guidance in research.

Why might connecting with Zander Jules be a good idea?

Or: towards evidence-based ‘add a contact’ filtering…

This just in from LinkedIn:

Have a question? Zander Jules’s network will probably have an answer
You can use LinkedIn Answers to distribute your professional questions to Zander Jules and your extended network. You can get high-quality answers from experienced professionals.

Zander Jules requested to add you as a connection on LinkedIn:

Dan,

Dear
My name is Zander Jules a Banker and accountant with Bank Atlantique Cote Ivoire.I contacting u for a business transfer of a large sum of money from a dormant account. Though I know that a transaction of this magnitude will make any one apprehensive,
but I am assuring u all will be well at the end of the day.I am the personal accounts manager to Engr Frank Thompson, a National of ur country, who used to work with an oil servicing company here in Cote Ivoire. My client, his wife & their 3 children were involved in the ill fated Kenya Airways crash in the coasts of Abidjan in January 2000 in which all passengers on board died. Since then I have made several inquiries to ur embassy to locate any of my clients extended relatives but has been unsuccessful.After several attempts, I decided to trace his last name via internet,to see if I could locate any member of his
family hence I contacted u.Of particular interest is a huge deposit with our bank in our country,where the deceased has an account valued at about $16 million USD.They have issued me notice to provide the next of kin or our bank will declare the account unservisable and thereby send the funds to the bank treasury.Since I have been unsuccessful in locating the relatives for past 7 yrs now, I will seek ur consent to present you as the next of kin of the deceased since u have the same last names, so that the proceeds of this account valued at $16million USD can be paid to u and then u and I can share the money.All I require is your honest cooperation to enable us see this deal through. I guarantee that this will be executed under all legitimate arrangement that will protect you from any breach of the law. In your reply mail, I want you to give me your full names, address, D.O.B, tel& fax #.If you can handle this with me, reach me for more details.

Thanking u for ur coperation.
Regards,

I’m surprised we’ve not seen more of this, and sooner. YouTube contacts are pretty spammy, and Twitter have also suffered. The other networks are relatively OK so far, but I don’t think they’re anything like as robust as they’ll need to become, particularly since a faked contact can get privileged access to personal details. Definitely an arms race…
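Which makes me wonder what evidence-based ‘add a contact’ filtering might actually look like. Here’s a toy sketch, entirely hypothetical: score the note attached to a connection request against phrases typical of advance-fee scams. A real system would want trained classifiers, sender reputation and network-overlap signals rather than my hand-picked regexes.

```ruby
# Hand-picked phrases typical of advance-fee scams; in a real system
# these would come from training data, not from my intuitions.
SCAM_PHRASES = [
  /next of kin/i, /dormant account/i, /million usd/i,
  /business transfer/i, /\bur\b/i, /honest cooperation/i
]

def scam_score(message)
  SCAM_PHRASES.count { |phrase| message =~ phrase }
end

note = "I contacting u for a business transfer of a large sum of money..."
puts scam_score(note) > 1 ? "quarantine for review" : "deliver"
```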

Restarter martyr: sounds of Firefox 3b5

I use Firefox with a lot of tabs. I guess I’m a multi-tasker. Or I have surplus attention. Or they fixed enough memory leaks in Firefox 3 so that opening a new tab is almost cost-free. Until you restart your browser (is there a bug open for this? I couldn’t find one).

I’ve just made a quick movie from the unedited sound of Firefox 3 restarting this morning: Krafty Nacirema (multitasking mix).

Raw materials include: New Order’s ‘Krafty‘ and ‘Confusion‘; John S. Hall’s ‘America Kicks Ass‘ rant, and who knows what else. I liked how it all sounded, so here it is.

Nokiana: the one about the CIA, Syria, and the N95

Matt Kane resurfaced on Bristol‘s underscore mailing list with this intriguing snippet, after some travels around the Middle East: “… discovered N95s (not mine) cannot be taken into Syria”.

I asked for the backstory, which goes like this:

Quite a palaver. Got the train from Istanbul to Syria (amazing trip!). At the border they didn’t search the bags of “westerners” but asked us all to show our phones and cameras. They glanced at them all quickly, checking the brand (“Nikon, ok. SonyEricsson, ok”). One guy had an N95 and they led him off the train. His sister informed us that they’d said it wasn’t allowed in Syria, and that if she knew her brother he’d not give it up without a fight. Despite being on contract, he argued with them for an hour and a half, even calling the embassies in Damascus and Ankara. In the end he gave it up, with a promise that they’d send it on to the airport from where he was leaving. A few days later we’re chatting with a barman and spot his phone – an N95, and yes, he got it in Syria! A few days after that we found out the full story from our hotel owner in Damascus. Apparently the CIA gave a load of bugged N95s to high-ranking Kurdish officials in Iraq, many of which were then smuggled into Syria and given as gifts to various shady characters. After the Hezbollah guy was assassinated in Damascus a few months ago, the Syrians set about trying to root out spies, which led to this ban on bringing N95s into the country. Apparently.

This is the first I’ve heard of it, but searching throws up a few references to rigged N95s as “spy phones”.

Somewhat-unrelated aside: I don’t believe the relevant functionality is exposed in the N95’s widget APIs yet. I had trouble making it vibrate, let alone self-destruct after this message. But at least widget/gadget/app security is getting some attention lately. It can’t be too long before “spy widgets” on your phone become a real concern, particularly since exposing phone APIs to 3rd-party apps opens up such creative combinations. I should be clear that, AFAIK, Nokia’s N95 widget platform is currently free of such vulnerabilities, and any “spy phone” mischief so far has been achieved through other kinds of interference. But it does make me glad to see a Widgets 1.0: Digital Signature spec moving along at W3C…

(Back to) The Future of Interactive Media

The Internet is beginning a fundamental transition into the broadband, commercial information superhighway of the future. Today, the Internet offers immediate opportunities for commercial applications by connecting millions of PC, Macintosh and workstation users with businesses and organizations around the world. Tomorrow, as network capabilities and performance increase, this global link will deliver interactive services, information and entertainment into consumers’ homes. Mosaic Communications Corporation intends to support companies and consumers throughout this transition, and to accelerate the coming of this new era with tools that ease and advance online communications.

Mosaic Communications Corporation: Who We Are: Our Story: The Future of Interactive Media

jwz and friends have restored mcom.com to its former 1994-era glory, reminding us that the future’s always up for grabs.

RDF in Ruby revisited

If you’re interested in collaborating on Ruby tools for RDF, please join the public-rdf-ruby@w3.org mailing list at W3C. Just send a note to public-rdf-ruby-request@w3.org with a subject line of “subscribe”.

Last weekend I had the good fortune to run into Rich Kilmer at O’Reilly’s ‘Social Graph Foo Camp‘ gathering. In addition to helping decorate my tent, Rich told me a bit more about the very impressive Semitar RDF and OWL work he’d done in Ruby, initially as part of the DAML programme. Matt Biddulph was also there, and we discussed again what it would take to include FOAF import into Dopplr. I’d be really happy to see that, both because of Matt’s long history of contributions to the Semantic Web scene, and because Dopplr and FOAF share a common purpose. I’ve long said that a purpose of FOAF is to engineer more coincidences in the world, and Dopplr comes from the same perspective: to increase serendipity.

Now, the thing about engineering serendipity is that it doesn’t work without good information flow. And the thing about good information flow is that it benefits from data models that don’t assume the world around us comes nicely parceled into cleanly distinct domains. Borrowing from American Splendor: “ordinary life is pretty complex stuff“. No single Web site, service, document format or startup is enough; the trick comes when you hook things together in unexpected combinations. And that’s just what we did in the RDF world: created a model for mixed-up, cross-domain data sharing.

Dopplr, Tripit, Fire Eagle and other travel and location services may know where you and others are. Social network sites (and there are more every day) know something of who you are, and something of who you care about. And the big G in the sky knows something of the parts of this story that are on the public record.

Data will always be spread around. RDF is a handy model for ad-hoc data merging from multiple sources. But you can’t do much without an RDF parser and a few other tools. Minimally, an RDF/XML parser and a basic API for navigating the graph. There are many more things you could add. In my old RubyRdf work, I had in-memory and SQL-backed storage, with a Squish query interface to each. I had a donated RDF/XML parser (from Ruby4R) and a much-improved query engine (with support for optionals) from Damian Steer. But the system is code-rotted. I wrote it while learning Ruby, beginning 7 years ago, and I think it is “one to throw away”. I’m really glad I took the time to declare that project “closed” so as to avoid discouraging others, but it is time to revisit Ruby and RDF again now.
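For flavour, this is roughly what a Squish query looked like in RubyRdf, reconstructed from memory. Since the project is closed and code-rotted, treat the commented-out API call as a historical sketch, not working code:

```ruby
# A Squish query: find names and mailboxes in a FOAF graph. The syntax
# is from memory, and the method names below are approximate.
squish = <<-SQUISH
  SELECT ?name, ?mbox
  WHERE
    (foaf::name ?person ?name)
    (foaf::mbox ?person ?mbox)
  USING foaf FOR http://xmlns.com/foaf/0.1/
SQUISH

# graph = ...an in-memory or SQL-backed RubyRdf graph...
# graph.ask(squish) { |row| puts "#{row['name']} <#{row['mbox']}>" }
puts squish
```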

Other tools have other offerings: Dave Beckett’s Redland system (written in C) ships with a Ruby wrapper. Dave’s tools probably have the best RDF parsing facilities around and are fast, but require native code. Rena is a pure-Ruby library which looked like a great start, but doesn’t appear to have been developed further in recent years.

I could continue going through the list of libraries, but Paul Stadig has already done a great job of this recently (see also his conclusions, which make perfect sense). There has been a lot of creative work around RDF/RDFS and OWL in Ruby, and collectively we clearly have a lot of talent and code here. But collectively we also lack a finished product. It is a real shame when even an RDF-enthusiast like Matt Biddulph is not in a position to simply “gem install” enough RDF technology to get a simple job done. Let’s get this fixed. As I said above,

If you’re interested in collaborating on Ruby tools for RDF, please join the public-rdf-ruby@w3.org mailing list at W3C. Just send a note to public-rdf-ruby-request@w3.org with a subject line of “subscribe”.

In six months’ time, I’d like to see at least one solid, well-rounded and modern RDF toolkit packaged as a Gem for the Ruby community. It should be able to parse RDF/XML flawlessly, and in addition to the usual unit tests, it should be wired up to the RDF Test Cases (see download) so we can all be assured it is robust. It should allow for a fast C parser such as Raptor to be used if available, falling back on pure Ruby otherwise. There should be a basic API that allows me to navigate an RDF graph of properties and values using clear, idiomatic Ruby. Where available, it should hook up to external stores of data, and include at least a SPARQL protocol client, eventually a full SPARQL implementation. It should allow multiple graphs to be superimposed and disentangled. Some support for RDFS, OWL and rule languages would be a big plus. Support for other notations such as Turtle, RDFa, or XSLT-based GRDDL transforms would be useful, as would a plugin for microformat import. Transliterating Python code (such as the tiny Euler rule engine) should be considered. Divergence from existing APIs in Python (and Perl, Javascript, PHP etc.) should be minimised, and carefully balanced against the pull of the Ruby way. And (though I lack strong views here) it should be made available under a liberal opensource license that permits redistribution under the GPL. It should also be as I18N- and Unicode-friendly as is possible in Ruby these days.
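To make that wish-list concrete, here is the sort of usage I’d like to be able to write. Every class and method name below is hypothetical; this sketches the desired feel, not any existing library:

```ruby
require 'rubygems'
require 'rdf'   # hypothetical gem name; nothing like this exists yet

# Parse RDF/XML, with a fast C parser (e.g. Raptor) used when
# available and a pure-Ruby fallback otherwise.
graph = RDF::Graph.parse('http://example.org/people.rdf')

# Navigate properties and values in clear, idiomatic Ruby.
FOAF = RDF::Namespace.new('http://xmlns.com/foaf/0.1/')
graph.each_subject(FOAF[:knows]) do |person|
  puts person[FOAF[:name]]
end

# Superimpose graphs from multiple sources (and keep them separable).
merged = graph + RDF::Graph.parse('http://example.org/trips.rdf')

# Talk to a remote store via the SPARQL protocol.
store = RDF::SPARQL::Client.new('http://example.org/sparql')
store.select('SELECT ?name WHERE { ?x <http://xmlns.com/foaf/0.1/name> ?name }') do |row|
  puts row[:name]
end
```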

I’m not saying that all RDF toolkits should be merged, or that collaboration is compulsory. But we are perilously fragmented right now, and collaboration can be fun. In six months’ time, people who simply want to use RDF from Ruby ought to be pleasantly surprised rather than frustrated when they take to the ‘net to see what’s out there. If it takes a year instead of six months, sure, whatever. But not seven years! It would be great to see some movement again towards a common library…

How hard can it be?

Google Social Graph API, privacy and the public record

I’m digesting some of the reactions to Google’s recently announced Social Graph API. ReadWriteWeb ask whether this is a creeping privacy violation, and danah boyd has a thoughtful post raising concerns about whether the privileged tech elite have any right to experiment in this way with the online lives of those who lack status, knowledge of these obscure technologies, and who may be amongst the more vulnerable users of the social Web.

While I tend to agree with Tim O’Reilly that privacy by obscurity is dead, I’m not of the “privacy is dead, get over it” school of thought. Tim argues,

The counter-argument is that all this data is available anyway, and that by making it more visible, we raise people’s awareness and ultimately change their behavior. I’m in the latter camp. It’s a lot like the evolutionary value of pain. Search creates feedback loops that allow us to learn from and modify our behavior. A false sense of security helps bad actors more than tools that make information more visible.

There’s a danger here of technologists seeming to blame those we’re causing pain for. As danah says, “Think about whistle blowers, women or queer folk in repressive societies, journalists, etc.”. Not everyone knows their DTD from their TCP, or understands anything of how search engines, HTML or hyperlinks work. And many folk have more urgent things to focus on than learning such obscurities, let alone understanding the practical privacy, safety and reputation-related implications of their technology-mediated deeds.

Web technologists have responsibilities to the users of the Web, and while media education and literacy are important, those who are shaping and re-shaping the Web ought to be spending serious time on a daily basis struggling to come up with better ways of allowing humans to act and interact online without other parties snooping. The end of privacy by obscurity should not mean the death of privacy.

Privacy is not dead, and we will not get over it.

But it does need to be understood in the context of the public record. The reason I am enthusiastic about the Google work is that it shines a big bright light on the things people are currently putting into the public record. And it does so in a way that should allow people to build better online environments for those who do want their public actions visible, while providing immediate – and sometimes painful – feedback to those who have over-exposed themselves in the Web, and wish to backpedal.

I hope Google can put a user-support mechanism on this. I know from our experience in the FOAF community that, even with small-scale and obscure aggregators, people will find themselves and demand to be “taken down”. While any particular aggregator can remove or hide such data, unless it is tracked back to its source it’ll crop up elsewhere in the Web.

I think the argument that FOAF and XFN are particularly special here is a big mistake. Web technologies used correctly (posh – “plain old semantic html” in microformats-speak) already facilitate such techniques. And Google is far from the only search engine in existence. Short of obfuscating all text inside images, personal data from these sites is readily harvestable.

ReadWriteWeb comment:

None the less, apparently the absence of XFN/FOAF data in your social network is no assurance that it won’t be pulled into the new Google API, either. The Google API page says “we currently index the public Web for XHTML Friends Network (XFN), Friend of a Friend (FOAF) markup and other publicly declared connections.” In other words, it’s not opt-in by even publishers – they aren’t required to make their information available in marked-up code.

The Web itself is built from marked-up code, and this is a thing of huge benefit to humanity. Both microformats and the Semantic Web community share the perspective that the Web’s core technologies (HTML, XHTML, XML, URIs) are properly consumed both by machines and by humans, and that any efforts to create documents that are usable only by (certain fortunate) humans is anti-social and discriminatory.

The Web Accessibility movement have worked incredibly hard over many years to encourage Web designers to create well marked up pages, where the meaning of the content is as mechanically evident as possible. The more evident the meaning of a document, the easier it is to repurpose it or present it through alternate means. This goal of device-independent, well marked up Web content is one that unites the accessibility, Mobile Web, Web 2.0, microformat and Semantic Web efforts. Perhaps the most obvious case is for blind and partially sighted users, but good markup can also benefit those unable to use a mouse or keyboard. Beyond accessibility, many millions of Web users (many poor, and in poor countries) will have access to the Web only via mobile phones. My former employer W3C has just published a draft document, “Experiences Shared by People with Disabilities and by People Using Mobile Devices”. Last month in Bangalore, W3C held a Workshop on the Mobile Web in Developing Countries (see executive summary).

I read both Tim’s post, and danah’s post, and I agree with large parts of what they’re both saying. But not quite with either of them, so all I can think to do is spell out some of my perhaps previously unarticulated assumptions.

  • There is no huge difference in principle between “normal” HTML Web pages and XFN or FOAF. Textual markup is what the Web is built from.
  • FOAF and XFN take some of the guesswork out of interpreting markup. But other technologies (JavaScript, Perl, XSLT/GRDDL) can also transform vague markup into more machine-friendly markup. FOAF/XFN simply make this process easier, less heuristic and less error-prone (see the sketch after this list).
  • Google was not the first search engine, it is not the only search engine, and it will not be the last search engine. To obsess on Google’s behaviour here is to mistake Google for the Web.
  • Deeds that are on the public record in the Web may come to light months or years later; Google’s opening up of the (already public, but fragmented) Usenet historical record is a good example here.
  • Arguing against good markup practice on the Web (accessible, device independent markup) is something that may hurt underprivileged users (with disabilities, or limited access via mobile, high bandwidth costs etc).
  • Good markup allows content to be automatically summarised and re-presented to suit a variety of means of interaction and navigation (eg. voice browsers, screen readers, small screens, non-mouse navigation etc).
  • Good markup also makes it possible for search engines, crawlers and aggregators to offer richer services.
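To make the markup point concrete, here’s a minimal sketch of XFN extraction using the Nokogiri HTML parser (my choice for illustration; any decent HTML parser would do, and the URL is a placeholder). The explicit rel markup is what removes the guesswork:

```ruby
require 'rubygems'
require 'nokogiri'   # assumes the nokogiri gem is installed
require 'open-uri'

# Fetch a profile page and pull out its XFN-style relationship links.
doc = Nokogiri::HTML(open('http://example.org/profile'))

doc.css('a[rel]').each do |link|
  rels = link['rel'].split
  next unless rels.include?('me') || rels.include?('friend')
  puts "#{rels.join(',')} -> #{link['href']}"
end
```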

The difference between Google crawling FOAF/XFN from LiveJournal, versus extracting similar information via custom scripts from MySpace, is interesting and important solely to geeks. Mainstream users have no idea of such distinctions. When LiveJournal originally launched their FOAF files in 2004, the rule they followed was a pretty sensible one: if the information was there in the HTML pages, they’d also expose it in FOAF.

We need to be careful of taking a ruthless “you can’t make an omelette without breaking eggs” line here. Whatever we do, people will suffer. If the Web is made inaccessible, with information hidden inside image files or otherwise obfuscated, we exclude a huge constituency of users. If we shine a light on the public record, as Google have done, we’ll embarrass, expose and even potentially risk harm to the people described by these interlinked documents. And if we stick our heads in the sand and pretend that these folk aren’t exposed, I predict this will come back to bite us in the butt in a few months or years, since all that data is out there, being crawled, indexed and analysed by parties other than Google. Parties with less to lose, and more to gain.

So what to do? I think several activities need to happen in parallel:

  • Best practice codes for those who expose, and those who aggregate, social Web data
  • Improved media literacy education for those who are unwittingly exposing too much of themselves online
  • Technology development around decentralised, non-public record communication and community tools (eg. via Jabber/XMPP)

Any search engine at all, today, is capable of supporting the following bit of mischief:

Take as a starting point a collection of user profiles on a public site. Extract all the usernames. Find the ones that appear in the Web fewer than, say, 10,000 times, and on other sites. Assume these are unique user IDs and crawl the pages they appear in, do some heuristic name-matching… and you’ll have a pile of smushed identities, perhaps linking professional and dating sites, or drunken college photos to a respectable new life. No FOAF needed.
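Spelled out as deliberately naive code, with every helper hypothetical: search_hits and pages_for stand in for whatever search API the mischief-maker has to hand.

```ruby
RARITY_THRESHOLD = 10_000

# Hypothetical: how many Web pages mention this string? Stands in for
# a search engine's hit-count API; randomised here so the sketch runs.
def search_hits(username)
  rand(20_000)
end

# Hypothetical: pages on *other* sites mentioning the username.
def pages_for(username)
  []
end

def smush(usernames)
  identities = {}
  usernames.each do |u|
    next unless search_hits(u) < RARITY_THRESHOLD  # rare enough to be unique?
    # Naive heuristic: the same rare username elsewhere means the same
    # person, so link together whatever those pages reveal.
    identities[u] = pages_for(u)
  end
  identities
end

puts smush(%w[example_user zander_jules]).inspect
```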

The answer, I think, isn’t to beat up on the aggregators; it’s to improve the Web experience such that people can have real privacy when they need it, rather than the misleading illusion of privacy. This isn’t going to be easy, but I don’t see a credible alternative.