Bruce Schneier: Our Data, Ourselves

Via Libby; Bruce Schneier on data:

In the information age, we all have a data shadow.

We leave data everywhere we go. It’s not just our bank accounts and stock portfolios, or our itemized bills, listing every credit card purchase and telephone call we make. It’s automatic road-toll collection systems, supermarket affinity cards, ATMs and so on.

It’s also our lives. Our love letters and friendly chat. Our personal e-mails and SMS messages. Our business plans, strategies and offhand conversations. Our political leanings and positions. And this is just the data we interact with. We all have shadow selves living in the data banks of hundreds of corporations’ information brokers — information about us that is both surprisingly personal and uncannily complete — except for the errors that you can neither see nor correct.

What happens to our data happens to ourselves.

This shadow self doesn’t just sit there: It’s constantly touched. It’s examined and judged. When we apply for a bank loan, it’s our data that determines whether or not we get it. When we try to board an airplane, it’s our data that determines how thoroughly we get searched — or whether we get to board at all. If the government wants to investigate us, they’re more likely to go through our data than they are to search our homes; for a lot of that data, they don’t even need a warrant.

Who controls our data controls our lives. [...]

Increasingly, we’re going to be seeing this data flow through protocols like OAuth. SemWeb people should get their heads around how this is likely to work. It’s rather likely we’ll see SPARQL data stores with non-public personal data flowing through them; what worries me is that there’s not yet any data management discipline on top of this that’ll help us keep track of who is allowed to see what, and which graphs should be deleted or refreshed at which times.

I recently transcribed some notes from a Robert Scoble post about Facebook and data portability into the FOAF wiki. In it, Scoble reported some comments from Dave Morin of Facebook, regarding data flow. Excerpts:

For instance, what if a user wants to delete his or her info off of Facebook. Today that’s possible. But what about in a really data portable world? After all, in such a world Facebook might have sprayed your email and other data to other social networks. What if those other social networks don’t want to delete your data after you asked Facebook to?

Another case: you want your closest Facebook friends to know your birthday, but not everyone else. How do you make your social network data portable, but make sure that your privacy is secured?

Another case? Which of your data is yours? Which belongs to your friends? And, which belongs to the social network itself? For instance, we can say that my photos that I put on Facebook are mine and that they should also be shared with, say, Flickr or SmugMug, right? How about the comments under those photos? The tags? The privacy data that was entered about them? The voting data? And other stuff that other users might have put onto those photos? Is all of that stuff supposed to be portable? (I’d argue no, cause how would a comment left by a Facebook user on Facebook be good on Flickr?) So, if you argue no, where is the line? And, even if we can all agree on where the line is, how do we get both Facebook and Flickr to build the APIs needed to make that happen?

I’d like to see SPARQL stores that can police their data access behaviour, with clarity for each data graph in the store about the contexts in which that data can be re-exposed, and the schedule by which the data should be refreshed or purged. Making it easy for data to flow is only half the problem…
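As a rough sketch of the bookkeeping I have in mind, here is one way per-graph policy could be tracked alongside a store. Everything here is an assumption for illustration: the `GraphPolicy` and `PolicyRegistry` names, the idea of named "contexts", and the example graph URIs are all hypothetical, and nothing below talks to a real SPARQL engine.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class GraphPolicy:
    """Hypothetical policy record for one named graph (illustrative fields)."""
    source: str                        # where the graph came from
    allowed_contexts: FrozenSet[str]   # contexts in which it may be re-exposed
    fetched_at: datetime               # when this copy was obtained
    refresh_after: timedelta           # re-fetch once this much time has passed
    purge_after: Optional[timedelta] = None  # None means no purge deadline

    def may_expose(self, context: str) -> bool:
        return context in self.allowed_contexts

    def is_stale(self, now: datetime) -> bool:
        return now >= self.fetched_at + self.refresh_after

    def must_purge(self, now: datetime) -> bool:
        return (self.purge_after is not None
                and now >= self.fetched_at + self.purge_after)


class PolicyRegistry:
    """Maps graph URIs to policies; a stand-in for metadata kept by a store."""

    def __init__(self) -> None:
        self._policies: Dict[str, GraphPolicy] = {}

    def register(self, graph_uri: str, policy: GraphPolicy) -> None:
        self._policies[graph_uri] = policy

    def readable_graphs(self, context: str, now: datetime) -> List[str]:
        # Graphs that may be queried in this context and are not overdue for purging.
        return sorted(uri for uri, p in self._policies.items()
                      if p.may_expose(context) and not p.must_purge(now))

    def due_for_refresh(self, now: datetime) -> List[str]:
        # Stale graphs that are still allowed to exist.
        return sorted(uri for uri, p in self._policies.items()
                      if p.is_stale(now) and not p.must_purge(now))

    def due_for_purge(self, now: datetime) -> List[str]:
        return sorted(uri for uri, p in self._policies.items()
                      if p.must_purge(now))
```

The list returned by `readable_graphs` could, for instance, be used to build the `FROM` / `FROM NAMED` clauses of a query, so that a request made in a "friends" context never touches graphs that were only cleared for some other context. A periodic job could walk `due_for_refresh` and `due_for_purge` to keep the store honest about its promises.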

OpenID and Wireless sharing

via Makenshi in #openid chat on Freenode IRC:

<Makenshi>: I found a wireless captive portal solution that supports openid.

With the newest release of CoovaAP, some new features in Chilli are demonstrated in combination with RADIUS to allow OpenID based authentication.

I’m happy to see this. It’s very close to some ideas I was discussing with Schuyler Earle and Jo Walsh some years ago around NoCatAuth, FOAF and community wireless. Some semweb stories may yet come to life.

At the moment, the options available for wireless ‘net sharing are typically: let everyone in, have a widely known secret for accessing your network, or let more or less nobody in without individually approving them. Although the likes of Bruce Schneier argue the merits of open wireless, most 802.11 kit now comes out of the box closed by default, and usually stays that way. Having a standards-based and decentralised way of saying “you can use my network, but only if you log in with some identifiable public persona first” would be interesting.
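To make the comparison concrete, the three existing options and the OpenID-based one can be written down as a small admission check. This is only a sketch under stated assumptions: the `AccessPolicy`, `JoinRequest` and `admit` names are hypothetical, identifying approved devices by MAC address is my own simplification, and the OpenID verification itself is assumed to have been done already by the captive portal (this code does not implement the OpenID protocol).

```python
import hmac
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class AccessPolicy(Enum):
    OPEN = "open"                        # let everyone in
    SHARED_SECRET = "shared-secret"      # widely known secret
    APPROVED_ONLY = "approved-only"      # individually approved devices
    OPENID_REQUIRED = "openid-required"  # any verified public persona


@dataclass(frozen=True)
class JoinRequest:
    """Hypothetical request as seen by a captive portal."""
    mac_address: str
    offered_secret: Optional[str] = None
    # Assumed to be set only after the portal has verified the identifier.
    verified_openid: Optional[str] = None


def admit(policy: AccessPolicy,
          request: JoinRequest,
          *,
          shared_secret: str = "",
          approved_macs: FrozenSet[str] = frozenset()) -> bool:
    """Return True if the request should be let onto the network."""
    if policy is AccessPolicy.OPEN:
        return True
    if policy is AccessPolicy.SHARED_SECRET:
        offered = request.offered_secret or ""
        # Constant-time comparison; an unset secret admits nobody.
        return bool(shared_secret) and hmac.compare_digest(
            offered.encode(), shared_secret.encode())
    if policy is AccessPolicy.APPROVED_ONLY:
        return request.mac_address in approved_macs
    if policy is AccessPolicy.OPENID_REQUIRED:
        return request.verified_openid is not None
    raise ValueError(f"unknown policy: {policy!r}")
```

The last branch is deliberately the weakest possible version of the idea: any verified identifier gets in. More socially oriented policies (friends only, members of some community site) would replace that one line with a lookup against whatever is known about the identifier.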

OpenID takes away a significant part of the problem space, allowing experimentation with a whole range of socially oriented policies on top. Doubtless there are legal risks, big privacy issues, and lurking security concerns. But there is also potential for humanising interactions that are currently rather anonymous. In the city I live in, Bristol, there’s a community wireless effort, Bristol Wireless, as well as wireless Internet in countless local cafes. Plus commercial hotspots and whatever the city council are up to. Currently these are fragmented, and offer a variety of approaches. Could OpenID offer a common approach for Bristolians to connect? I like the idea that (for those who choose to ‘go public’) OpenIDs could link scattered presence across community sites. Having OpenID-based login used, e.g., for cafe-based access could be a nice step in that direction. But would people trust their local cafe to know what they’re doing online any more than they trust Google? Should they?