Does every app and web service we use need to keep all the data it collects until long after we’re all dead? I believe more services should be designed to auto-delete old junk by default, and only optionally store things for longer. Since the introduction of the GDPR, the European Parliament agrees with me.
Last month, I wrote about how Flattr automatically deletes the data they’ve collected from your web browser history after three months. You get little to no value from them holding on to it for any longer, and they don’t get much value from it either.
Since the introduction of the General Data Protection Regulation (GDPR), this should have been the new default. Yet it’s hard to come by examples of any other company that voluntarily deletes the data it has collected about you when it’s no longer necessary to retain it. Let’s have a quick refresher on what the GDPR says about privacy and data erasure by default:
The following excerpt from Recital 39 has a clearer plain-English explanation:
The GDPR spells out in fairly clear terms that personal data should be deleted when it’s no longer strictly necessary to retain it. You can argue to no end over what types of data need to be retained and for how long. Those discussions over data retention policies are what the European Parliament wants companies to have. However, even the biggest players in the data harvesting business appear to still store most data forever by default. I’ll look at two examples next.
Microsoft Account/Graph
Microsoft collects a huge amount of data on consumers via its products and services. The Microsoft Account Privacy Dashboard lists an enormous amount of data collected through Microsoft’s many services, including which apps and programs you open in Windows, what you do in some of these programs and apps, what websites you use, and more.
Separate from the Privacy Dashboard, you’ll also find individual export options for specific Microsoft products like Outlook, Skype, and To-Do at their respective websites.
A quick look at the exported data reveals that Microsoft Edge collects everything you do online, including the web addresses (URLs) of advertisements loaded in frames on other pages, and login and session data passed in URL redirects. It’s interesting to note that these URLs don’t even appear in your browsing history inside Microsoft Edge, because they’d hold no real value to end users. Microsoft still collects and stores this data on you.
Microsoft doesn’t offer you any tools to delete all data older than a certain date, or to filter your data by keywords. You can download a copy of or delete all your activity (for some services) with a single click, or you can explore data listed chronologically and delete one item at a time. Notably, your data export doesn’t contain data from Skype and Outlook. You can’t set up any policies to blocklist keywords from appearing in your history, nor set up automatic deletion of old data.
Google Account
Depending on how many Google products you use, your Google My Activity history will contain your entire life. Google stores information about everything you do on Search, Mail, Chrome, Android, YouTube, and all its other products and platforms.
Update: Google now offers to auto-delete collected location and search history, but users must turn this on manually.
The Takeout dataset isn’t entirely complete; it’s missing data from some apps, like Podcasts, as well as music you’ve uploaded to Google Play Music.
Google records pretty much everything you ever do in their products and on their platforms, including how much time you spend with individual apps and which ads you click on around the web.
You can go into My Activity and search for and delete data by keywords, or purge data older than whatever time you select. This is a big improvement over what Microsoft offers you, but you still can’t set up any kind of data retention policies. All data associated with your account is stored forever by default. Unless you set up a calendar event to remind you to do this every six months, you’re not going to get around to periodic cleanups of old data.
Data needs expiration dates
I understand and even value that some data collection can lead to better products and experiences. When I type “windows installation” into a web search, I’m more likely looking for information about the operating system than carpentry services or glass insulation fact sheets. However, you don’t need to store ten years’ worth of data on me to figure that out. Six, three, or even a single month’s worth of my online habits should suffice! This holds even more true for targeted ads than it does for personalized search results.
Personal data, especially usage and trends data, should have a fairly short shelf life. There’s no need to store everything forever just because it has become increasingly feasible for companies to do just that. I called out Microsoft and Google specifically in this article but they’re not alone in this.
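As a rough illustration of what an expiration date on data could look like in practice, here’s a minimal Python sketch. The record shape, the category names, and the retention periods are all made up for this example; it isn’t based on how Microsoft, Google, or anyone else actually stores data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Record:
    category: str           # hypothetical label, e.g. "search" or "location"
    collected_at: datetime  # when the data point was collected (UTC)
    payload: str


# Hypothetical retention periods per category; the numbers are illustrative only.
RETENTION = {
    "search": timedelta(days=90),
    "location": timedelta(days=30),
}
# Fallback for categories without an explicit policy (also an assumption).
DEFAULT_RETENTION = timedelta(days=180)


def purge_expired(records, now=None):
    """Return only the records that are still within their retention period."""
    now = now or datetime.now(timezone.utc)
    return [
        r for r in records
        if now - r.collected_at <= RETENTION.get(r.category, DEFAULT_RETENTION)
    ]
```

Run on a schedule, something like `purge_expired` would make deletion the default, and keeping data for longer would become the deliberate choice that someone has to justify.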
Like with so much else surrounding the GDPR, we’re all still waiting with bated breath to see how it’s going to be enforced. I’m curious to see whether any data protection agencies in Europe will go after any company over never-expiring data retention policies by default. Even if the default doesn’t change, people should at least be offered the option to set up their own retention policies.