The Cloud

David Saintloth had a post about reasonable defaults when collecting EXIF data for photos (time/date, GPS, camera info, etc.). For example, cameras should just collect the data automatically instead of prompting, and we can decide at a later date, during publication/sharing, what data we want to go out into the world. The premise is to make that metadata more useful to us when we want to sort or organize our photos.

It ties in with a thought I've had about everyone having their own personal cloud of metadata. Take for instance every post we make here on Facebook, every picture, every comment. We see them as ours. But they're out here, and to some degree they are out here making money for Facebook. We get value out of them when others interact. But that interaction is also another channel for Facebook to build market and make money. Sometimes we have to question whether that's a reasonable exchange. We also have to question what's being done with our data that we don't know about, and what happens when we do know and don't agree. We've all seen posts about someone dying and the pain people go through in trying to either get into a loved one's account or have it shut down. Not only that, but what happens to the information that intersects with someone else's personal history when that person's data is removed?

I've had thoughts about people's data in general. Everything you do throughout a day creates some data point that could be linked back to you: EXIF data, phone logs, e-mails, browser history, medical records, GPS data, shopping history, card charges, etc. Some of it we share willingly; most of it we don't. Much of it may have value. At the same time we might be giving a lot of it away for free and helping others make a profit.

What I'd ultimately like to see is some sort of rights provisioning and more personal control of that cloud of information. For example, here on FB I can control what goes out and to whom (to some extent): Everyone, Friends, Friends of Friends, Not Acquaintances. But I can't control the behind-the-scenes data use. Why not the same level of control with all of my personal data? Not only that, but also the ability to revoke permission and be sure that the data is removed from whatever locations it was shared to.

It would certainly be a huge technical challenge to tag and license every bit of metadata that a person creates throughout a day, much less a lifetime. But we could also think about the benefits it would give to people and society as a whole especially on the medical data modeling front. If we could be sure that our medical data was being used in the manner we approved of, we might feel more comfortable about sharing it and scientists might find missing trends in diagnostics, treatments, and diseases.

The key would be in the rights provisioning and in having strong legal controls in place to avoid abuse by the government, corporations and even individuals.

Think about just the social media front. If all these posts on FB were licensed by you, every comment, every photo... and then you decided you were done with FB, where does the data go? Deleted? Kept but tagged as "removed"? How do you get it back? How do you visit something important to your history? What if we could keep the data ourselves in our own personal cloud, or revoke the temporary license we've given FB to our content and license it instead to a new social media site, or divvy it up among multiple outlets?

It would be super disruptive, but it might also make large service providers compete for top contributors, unlike the passive approach they have today. That licensing agreement might be something like Creative Commons, where we can get some simplified and standardized definitions that make clear to people how their data will be used and whether they will be compensated in some way (no compensation, data for service, or data for cash/credit).

I imagine this big cloud of data having layers to it like an onion, where people choose what data they feel is private and what they want to share in what ways. Then people choose how they license themselves out to services like Facebook and Pinterest and services that haven't even come into existence yet. And then when they no longer want to license to a service, they send a revocation request. The service is legally obligated to revoke/return the information per the license agreement.
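To make the license/revoke idea a little more concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the names `PersonalCloud`, `License`, and `Compensation` are made up for illustration, and the compensation options are just the three mentioned above. It only models the bookkeeping on the owner's side; actually forcing a service to delete or return data is the legal part, which code can't solve.

```python
from dataclasses import dataclass, field
from enum import Enum


class Compensation(Enum):
    # The three compensation models mentioned in the post (names are illustrative).
    NONE = "no compensation"
    SERVICE = "data for service"
    CASH = "data for cash/credit"


@dataclass
class License:
    service: str
    item_ids: set
    compensation: Compensation
    revoked: bool = False


@dataclass
class PersonalCloud:
    owner: str
    items: dict = field(default_factory=dict)     # item_id -> content
    licenses: dict = field(default_factory=dict)  # service name -> License

    def grant(self, service, item_ids, compensation):
        """License a set of existing items to a service."""
        unknown = set(item_ids) - set(self.items)
        if unknown:
            raise KeyError(f"unknown items: {sorted(unknown)}")
        self.licenses[service] = License(service, set(item_ids), compensation)

    def revoke(self, service):
        """Mark the license revoked; return the item ids the service must give up."""
        lic = self.licenses[service]
        lic.revoked = True
        return sorted(lic.item_ids)

    def may_use(self, service, item_id):
        """True only while a non-revoked license covers this item."""
        lic = self.licenses.get(service)
        return lic is not None and not lic.revoked and item_id in lic.item_ids
```

In this sketch, revoking a license returns the list of items covered, which is roughly what a revocation request would have to carry so the service knows what to remove or return.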

There will be a few challenges beyond the technical (storage, keys, standards for licensing, education). Security is always going to be number one among those challenges. Another significant challenge will be putting legal controls in place so that the government doesn't have easy access to the massive amount of data. And another will be the point where two people's data intersects, and who ultimately has the right to that data.

An easy example of this might be if I take a picture of a friend and we stop being friends. If I've shared it on social media and then revoke it, does the friend lose access because it was taken with my camera and is considered mine? Or can the friend put in a notice of co-ownership and keep a copy? Or can the friend put in a notice of complete ownership, so that I lose my copy? The same question holds true of comments on a post or article, chat transcripts, e-mails, etc. How do we deal with companies or individuals that put holds on data because they claim co-ownership or because the data is needed for historical purposes?

I see this regularly where people are talking about a controversial issue and there's a lot of good conversation sharpening an argument back and forth. Then suddenly someone gets upset and they remove themselves and their comments from the discussion. This leaves a huge hole in the context of the conversation, making it difficult for the rest of the people in the conversation to fill in accurately. Should that still be allowed? Should people be allowed to rewrite their own digital history by removing their own offensive or poorly thought out commentary? Or can we mark it in a way that reflects the nature of the person's decision to retract it? Say I retreated from a discussion on dragging sheep across different surfaces and deleted my comments. Should I have to give a reason, like "removed due to lack of knowledge on subject" or "deleted because of change in position"? Or should the comments stay if there are replies and signs of a conversation? Or should we be able to flag them and validate that our opinions have changed with experience and maturity?
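One way to picture the "mark it instead of deleting it" option is a retraction marker that keeps the comment's place in the thread. This is only a sketch under my own assumptions: the `Comment` class and its fields are hypothetical and not how any real site stores comments.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Comment:
    author: str
    text: str
    retraction_reason: Optional[str] = None

    def retract(self, reason):
        """Remove the text but leave a marker so the thread keeps its shape."""
        self.text = ""
        self.retraction_reason = reason

    def render(self):
        """Show either the comment or a note explaining why it was pulled."""
        if self.retraction_reason is not None:
            return f"[{self.author} retracted this comment: {self.retraction_reason}]"
        return f"{self.author}: {self.text}"


# A tiny thread: the first comment is retracted, the reply still has context.
thread = [
    Comment("me", "Sheep slide best on wet grass."),
    Comment("friend", "I disagree, see my earlier point."),
]
thread[0].retract("removed due to lack of knowledge on subject")
rendered = [c.render() for c in thread]
```

The reply stays readable because the hole in the conversation is labeled rather than silently removed.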

I think about posts that I made earlier in my life and how someone might go back and see them and think that my feelings remain the same. It might impact job opportunities or friendships or political careers. But what if over time we could go back and tag and reflect on those comments and use them to build new narratives about who we are and the direction we ultimately want to go? I think there's power (and significant responsibility) in having those capabilities. I think there's also a great opportunity for disrupting how we use the internet and how we share information.

The last challenge is really anonymity. There may be many reasons why we want to interact with the world in an anonymous fashion: flying under the digital radar because we don't want people snooping, or because we feel that a particular activity needs privacy and respect (think online forums for rape counseling, seeking help with substance abuse, or things of a sexual nature). Or there might just be the need to remain anonymous for purely entertainment purposes (e.g. I don't want random people from Xbox Live sending me Facebook requests because we played a game of Halo 2 years ago). So we'll need the ability to make data private and/or create aliases under which certain data lives. This data would still ultimately need to be traceable back to a primary key for licensing, media, and legal issues, but it shouldn't be easily tracked.
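As a rough illustration of aliases that trace back to a primary key without being easy to track, one option (my assumption, not the only way to do it) is to derive each alias from a secret that only the owner holds, using a keyed hash (HMAC) from Python's standard library. The function names `make_alias` and `owns_alias` and the context strings are hypothetical.

```python
import hashlib
import hmac


def make_alias(master_secret: bytes, context: str) -> str:
    """Derive a stable alias for one context (e.g. a gaming service) from the owner's secret."""
    digest = hmac.new(master_secret, context.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability; an assumption, not a requirement


def owns_alias(master_secret: bytes, context: str, alias: str) -> bool:
    """Whoever holds the secret can prove (or check) that an alias belongs to it."""
    return hmac.compare_digest(make_alias(master_secret, context), alias)
```

Without the secret, two aliases from the same person look unrelated, which is the "not easily tracked" part. With the secret (say, under a court order), an alias can be tied back to its owner, which is the "traceable for licensing and legal issues" part.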

That's again where strict controls come in. The government (local, state or federal) needs to have strict limits on how and when it can access the data. Our private data in the cloud should be like an onion: they have to ask permission to peel back successive layers, and each layer requires a warrant and strong proof of need (more so than the warrants they give out like candy today). However, the data itself might make this easier. If we have all this data in a cloud, it might allow a judicial search in some true/false way, e.g. "was suspect within 100 yards of 1234 Dove Lane on Friday" or "does suspect have pictures of Bo Bradley" or "has suspect searched for 'how to kill wife'", and get back a simple "yes" or "no" or some statistical likelihood before a warrant is issued to peel back the next layer.
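Here is a small sketch of that onion, assuming a made-up `LayeredVault` class where layer 0 is the outermost layer. A yes/no question can be asked of a layer without exposing its records, while actually reading a layer requires a warrant for it and for every layer outside it. The record fields and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LayeredVault:
    layers: list                                # list of record lists, outermost first
    warrants: set = field(default_factory=set)  # indexes of layers unlocked so far

    def ask(self, layer, predicate):
        """Answer a true/false question about a layer without revealing any records."""
        return any(predicate(record) for record in self.layers[layer])

    def grant_warrant(self, layer):
        """Unlock one layer; inner layers require the layer outside them to be unlocked first."""
        if not 0 <= layer < len(self.layers):
            raise IndexError("no such layer")
        if layer > 0 and (layer - 1) not in self.warrants:
            raise PermissionError("the outer layer must be unlocked first")
        self.warrants.add(layer)

    def read(self, layer):
        """Return the records of a layer, but only if a warrant covers it."""
        if layer not in self.warrants:
            raise PermissionError("a warrant is required for this layer")
        return list(self.layers[layer])


vault = LayeredVault(layers=[
    [{"kind": "search", "query": "weather"}],
    [{"kind": "location", "place": "1234 Dove Lane", "day": "Friday"}],
])
# The yes/no check happens before any warrant exists for layer 1.
was_there = vault.ask(1, lambda r: r.get("place") == "1234 Dove Lane" and r.get("day") == "Friday")
```

The point of the design is that `ask` only ever returns a boolean, so a judge could weigh that answer before deciding whether a warrant for the full records is justified.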