visualizing personal data takeouts

Wouter Groeneveld 2022-02-15 14:33:32 +01:00
parent aca3bc8d6b
commit 704e22fe4d
5 changed files with 61 additions and 0 deletions


@ -5,6 +5,7 @@ categories:
- braindump
tags:
- Hasselt
- datavis
---
Last December, our energy provider, the [Vlaamse Energieleverancier](https://www.vlaamseenergieleverancier.be/), went bust. The problem wasn't (only) the electricity, but the steady increase in gas prices. At the end of 2020, we were fed up with our then-current provider and switched to the Vlaamse. A bad move, apparently. Our new contract involved a fixed price of `2.25` c€/kWh, which in hindsight was ridiculously low for 2021. Our new energy provider charged `7.14` c€/kWh in December, and it looks like for January, it's going to climb to `12.50`. No wonder the company went bust. Help! What's going on here?


@ -0,0 +1,58 @@
---
title: Visualizing Personal Data Takeouts
date: 2022-02-15T13:49:00+01:00
categories:
- webdesign
tags:
- privacy
- datavis
---
Erik Kemp's [Campus Talk S02E04](https://www.youtube.com/watch?v=Y-rfmiu-_XU)---_Show me the data!_---demonstrated how to visualize your data takeout using the [TransparencyVis tool](https://transparency-vis.vx.igd.fraunhofer.de/) from Fraunhofer and the University of Darmstadt, Germany. Ever since the introduction of the GDPR in Europe, we as consumers have been able to request a so-called "takeout" of our own data: the data that companies hold and track about us. This ranges from a simple name + address + email list to a complicated zip file containing gigabytes of scary things, like Google's takeout, available at https://takeout.google.com/.
The problem is, what do we learn from that data? Which patterns emerge? How serious is this data tracking thing, really? You'd have to dig through log files, photos, e-mails, conversations, telephone records, etc., and categorize/group timestamps to get an idea of how bad the situation is. That is where TransparencyVis comes in: drop in your takeout zip file from services such as Google, Facebook, Twitter, etc., and it'll render a nice visualization for you:
![](../takeout.jpg "My Google Takeout, visualized.")
This is about `2 GiB` of information, excluding photos, but including Google Drive documents. At first, the scatter plot is a bit confusing. The easiest way to look at it is this: the vertical bands are timestamps, and each dot is a recorded activity; denser smears mean more activity. You can see that from between 7AM and 8AM until 11PM, a _lot_ of stuff was gathered. But what's scarier: between 2015 and 2018, even during the night, when I generally turn my smartphone OFF, it was somehow still sending data? At least those events are timestamped at nighttime. The orange dots are _location trackers_ (Google Location tracking). The pink ones are _conversation trackers_ (Google Hangouts chat history). The (hard to see) green ones are _browsing history_ (Google Chrome).
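
To give a rough idea of what "grouping timestamps" by hand could look like, here is a minimal Python sketch. It assumes the events have already been pulled out of the takeout into `(timestamp, category)` pairs; the sample records, the category names, and the midnight-to-6AM definition of "night" are all made up for illustration, and none of this reflects the actual takeout file formats or how TransparencyVis works internally.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample records: (ISO 8601 timestamp, category).
# A real takeout spreads this over provider-specific files and formats.
records = [
    ("2016-03-01T02:14:00", "location"),
    ("2016-03-01T07:45:00", "location"),
    ("2016-03-01T07:50:00", "chat"),
    ("2016-03-01T21:05:00", "browsing"),
    ("2016-03-02T03:30:00", "location"),
]


def count_by_hour(records):
    """Count recorded events per (category, hour of day)."""
    counts = Counter()
    for stamp, category in records:
        hour = datetime.fromisoformat(stamp).hour
        counts[(category, hour)] += 1
    return counts


def night_events(records, start=0, end=6):
    """Return the records whose hour of day falls in [start, end)."""
    return [
        record
        for record in records
        if start <= datetime.fromisoformat(record[0]).hour < end
    ]


counts = count_by_hour(records)
night = night_events(records)

# Print a tiny text summary, sorted by category and hour.
for (category, hour), total in sorted(counts.items()):
    print(f"{category:>10} @ {hour:02d}h: {total}")
print(f"events recorded at night: {len(night)}")
```

Even a crude count like this would surface the nighttime activity, but it says nothing about trends over the years, which is where the scatter plot helps.
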
Since the beginning of 2021, I've been [gradually getting rid of my Google life](/post/2021/03/getting-rid-of-tracking-using-lineageos/). Before requesting the Google Takeout, I had already:
- Deleted all Google Photos;
- Deleted all Google Maps locations;
- Deleted most critical e-mail conversations.
However, I decided to keep the e-mail account as a spam catch-all. My efforts to reclaim my privacy are clearly visible in the above scatter plot:
- In 2018, I stopped using Nexus phones and disabled Google's location tracking service (fewer orange dots, but they are still present);
- In mid-2019, I stopped using Google Hangouts (no more pink dots);
- In 2020, I stopped using Google Chrome (no more green dots);
- In 2021, I stopped using Android altogether.
My first Google-enabled phone, a Nexus 4, was bought in 2014---that's no coincidence, judging from the graph. Things got worse from there. To me, it is really, _really_ scary to analyze this takeout, especially because of the location tracking I was blissfully unaware of back in the day. Sure, you can turn that down (somewhat), but the point is that it's enabled by default, and if you are simply a casual Android user, you have no idea!
Furthermore, all Google Hangouts conversations, even from back in 2014, are unencrypted and accounted for. This ranges from casual chats with friends to serious matters with my wife that I am very uncomfortable sharing with Google---or any online service, for that matter, even if it were my own. I even found Google Chrome history from 2012 with loads of eBay searches for jewelry (not mine, haha, but still!). If all this doesn't wake you up, I'm not sure what will. I have a few colleagues who are very fatalistic in this matter: "They'll track you anyway, I say let them have it!". I personally think that's a very dangerous statement to make.
## What to do about it?
Simple. If you want to be aggressive, close your (insert social media or tech giant here) account.
Is it that simple? No. If you read about [that hell called LinkedIn](https://blog.sugoi.be/posts/that-hell-called-linkedin/), the conclusion is that while it sucks, closing the account isn't always an option:
> Why don't I close my account instead of complaining? Because LinkedIn is unfortunately the go-to place to check someone's professional profile. Same reason why I have a Facebook account: it's because everybody's there. Not because I enjoy their product.
I sympathize. While I did get rid of my LinkedIn account (seriously, your website is your CV), I feel very much the same about things like WhatsApp. Roy Tang wrote about [instant messaging apps](https://roytang.net/2022/02/im-apps/) and published an interesting chart containing recommendations for use, privacy-wise and open-source-wise. The conclusion? We should rely on decentralized solutions such as Jabber/XMPP and Matrix and avoid WhatsApp/Facebook (Messenger). I'd love to self-host these things, but convincing my family to use them is simply not going to happen. They didn't even understand the appeal of Signal when WhatsApp changed its privacy policy last year.
Roy's conclusion?
> I still use Messenger regularly because my family's group chats are there and some oldies would have trouble migrating. I have a couple of other friend group chats there because that's most convenient for everybody.
*Sigh*.
I've been relying on ProtonMail for a year now, and while I'm happy to finally be using my own `@brainbaking.com` domain, I question its relevance, privacy-wise. E-mail is inherently unencrypted, and while I'm fine with that, I'm not fine with Google collecting information on my purchase and vacation behavior. But what is the point of shunning Google services if 80% of the internet does not follow along? Most of my written e-mails _still_ go through Google's SMTP servers, and it would not take a smart AI system to link my new e-mail address to my old Google user account.
I do feel better and do not regret the move. But all this talk about privacy doesn't seem to change much in the behavior of people in general. Furthermore, it's _involved_: installing LineageOS took me more than a day to get right, and I'm an "IT guy". Most people interested in reading posts such as this very one are already pretty privacy-conscious (and IT nerds) anyway.
Reclaiming your privacy shouldn't be hard or obscure, yet I have the feeling it still is. Until that changes, take a look at the TransparencyVis tool, and try not to be too scared.


@ -5,6 +5,7 @@ categories:
- braindump
tags:
- Hasselt
- datavis
---
After plotting the [natural gas prices of each quarter the last five years](/post/2022/01/natural-gas-prices-and-the-energy-market), it was only logical to expect a future post on other utilities such as water. As expected, the trend line is going up, but luckily, not as badly as the gas prices. We've been steadily paying more and more for our yearly water bills, but unlike the natural gas plot (see link above), we've apparently also been using more and more water.


@ -54,6 +54,7 @@
<outline text="ancientelectronics" title="ancientelectronics" description="" type="rss" version="RSS" htmlUrl="https://ancientelectronics.wordpress.com/" xmlUrl="https://ancientelectronics.wordpress.com/feed/"/>
<outline text="Byte Cellar" title="Byte Cellar" description="" type="rss" version="RSS" htmlUrl="https://bytecellar.com/" xmlUrl="https://bytecellar.com/feed/"/>
<outline text="Digging the Digital" title="Digging the Digital" description="" type="rss" version="RSS" htmlUrl="https://diggingthedigital.com/" xmlUrl="https://diggingthedigital.com/feed/"/>
<outline text="foo::bar" title="foo::bar" description="" type="rss" version="RSS" htmlUrl="https://blog.sugoi.be/" xmlUrl="https://blog.sugoi.be/index.xml"/>
<outline text="FOSS Academic" title="FOSS Academic" description="" type="rss" version="RSS" htmlUrl="https://fossacademic.tech/" xmlUrl="https://fossacademic.tech/feed.xml"/>
<outline text="i need coffee" title="i need coffee" description="" type="rss" version="RSS" htmlUrl="https://ineed.coffee/" xmlUrl="https://ineed.coffee/feed.xml"/>
<outline text="KN100 | Kevin Norman" title="KN100 | Kevin Norman" description="" type="rss" version="RSS" htmlUrl="https://kn100.me/" xmlUrl="https://kn100.me/index.xml"/>

Binary file not shown.
