Google is the ultimate Big Brother, and Facebook is not far behind: with what it knows about me, one could reconstruct my life at any time. But how much does Amazon know about me?
That is what we set out to discover, now that the company lets us download all the data it collects when we use its services or products, such as its smart speakers. There is good news and bad news. Let’s go through both.
First the bad news
Amazon collects a lot of data, as anyone can confirm by downloading everything it holds on them. Obtaining this information is simple: just make the request by following the steps in the Amazon documentation.
To request the data, we go to the page linked from that documentation, which offers a drop-down menu for choosing which collected data to download. You can pick any single option if you only want that specific data: for example, only Alexa and Echo devices, only Kindle, or only Advertising.
I wanted to download everything, so I chose the “Request all your data” option at the bottom of that drop-down menu.
Then it is time to wait, and to wait a long time, because these requests can take weeks to process. In my case, the email telling me the data was ready to download arrived 22 days after I made the request.
Amazon does not make downloading the data especially fast or convenient either. The download page shows a huge list of small .ZIP files that must be downloaded one by one; nowhere is there an option to download them all at once. We have to click each “Download” button to save the files to our computer.
Now for the good news: data collection is reasonable and non-invasive.
Once those files were downloaded (about 130 in my case), I began to unzip them. Most were very small, under 10 KB, and on unzipping them you quickly see that the vast majority are text files in .CSV format, ready to be imported into a spreadsheet, that show tabulated information about our use of Amazon products and services.
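With around 130 small archives, unzipping them by hand gets tedious. As a minimal sketch, assuming all the archives were saved into a single folder (the function name and folder layout are my own illustration, not anything Amazon provides), a short Python script could unzip each one into its own subfolder and list the CSV files it finds:

```python
import zipfile
from pathlib import Path


def extract_all(zip_dir, out_dir):
    """Unzip every archive in zip_dir into its own subfolder of out_dir.

    Returns the sorted list of CSV files found after extraction.
    """
    zip_dir, out_dir = Path(zip_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for archive in sorted(zip_dir.iterdir()):
        # Match .zip and .ZIP alike; skip anything else in the folder.
        if not archive.is_file() or archive.suffix.lower() != ".zip":
            continue
        target = out_dir / archive.stem  # one subfolder per archive
        target.mkdir(exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
    return sorted(p for p in out_dir.rglob("*") if p.suffix.lower() == ".csv")
```

Calling `extract_all` with the download folder and an empty output folder returns the paths of the extracted CSV files, which can then be opened in a spreadsheet or processed further.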
Now comes the important part: what data does Amazon collect? After analyzing these files, we were able to verify that, except in very specific cases, the information it stores about us is exactly what its online store needs.
Thus, Amazon stores our shipping addresses (in one of the few PDFs in the download), partial data about the credit or debit cards we use to pay (last four digits, expiration date, associated bank) and, of course, tables covering our activity on the Amazon website: purchases, returns, wish lists and so on.
It seems logical that Amazon collects and stores this data, since we rely on it when buying items in its online store. There is nothing alarming here, or at least not if the information downloaded in this segment really is all that Amazon stores.
Amazon also records our use of, or engagement with, the apps we download through its app store. I have a Kindle Fire HD tablet that was mostly used by my kids to play games or watch videos, and Amazon logged the duration of those sessions. That lets Amazon know which apps or games are most popular among its users.
The rest of the files are mostly quite innocuous as well. There is a list of the “marketplaces” we have visited from any of our devices (for example, when downloading games), lists of entries indicating whether we have taken advantage of a promotion, and tables with information about notifications Amazon has sent us by email.
There is also a table, derived from the registration.csv file, that shows the devices on which we have used the Amazon account, though without clearly identifying them: it lists their serial numbers and the generic name Amazon generates (for example, “Javier’s 14th Android Device”), as well as the dates the account was activated and deactivated on each device. Again, this is reasonable data for Amazon to keep as part of our usage history.
Alexa is not a concern, but Amazon is very interested in how much we read and watch on Kindle and Prime Video
I found only two things relatively curious. The first is that Amazon knows what car I have. In fact that is entirely normal, since I have bought spare parts for it there, such as a windshield wiper.
The second, and probably the one that worries more users, is what kind of data Amazon collects from our use of its smart speakers and Prime Video. Its Echo family is very popular and gives convenient access to functions through voice commands, but to what extent does it invade our privacy?
Judging from the data collected, that invasion seems once again almost nil. In the files downloaded in my case there were no Alexa recordings, but the reason is simple: we used my wife’s Amazon account for this device, because hers is the one associated with Prime.
In the end it does not matter that I could not download them, because everything Amazon records when we talk to Alexa is accessible in the Alexa privacy section of our Amazon account. That web page lists the voice commands we have given, and expanding any of them lets us play back the audio clip stored on Amazon’s servers.
Reviewing those commands, I could only find audio clips in which we controlled music playback on the speaker or asked for the time.
There are no strange recordings of conversations caught in the background, for example, and once again the data collection seems to behave as expected. The same web page also allows all the recordings to be deleted, which gives the user control over those files.
Perhaps the most curious finding in all this data is how much information Amazon collects about our reading sessions on its e-book readers. Quite a few files relate to the Kindle, although most are once again innocuous.
However, some files clearly monitor our reading activity. The “Kindle.Devices.ReadingSession.csv” file is the most revealing: it shows the start and end time of each session, the e-book identified by its ASIN code, and two even more curious pieces of information, namely how long we were reading (in milliseconds) and how many pages we went through in that session.
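To get a feel for what that file reveals, the sessions can be added up per book. The sketch below is only an illustration: the column names `ASIN`, `total_reading_millis` and `number_of_page_flips` are assumptions on my part, so check the header row of your own file and pass the real names if they differ.

```python
import csv
from collections import defaultdict


def summarize_reading(csv_path, asin_col="ASIN",
                      millis_col="total_reading_millis",
                      pages_col="number_of_page_flips"):
    """Sum reading time and page turns per book.

    Column names are assumptions; pass the real header names if they differ.
    Returns {asin: (minutes_read, pages_turned)}.
    """
    totals = defaultdict(lambda: [0.0, 0])
    # utf-8-sig tolerates a byte-order mark at the start of the file.
    with open(csv_path, newline="", encoding="utf-8-sig") as fh:
        reader = csv.DictReader(fh)
        missing = {asin_col, millis_col, pages_col} - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"columns not found: {sorted(missing)}")
        for row in reader:
            try:
                millis = float(row[millis_col])
                pages = int(float(row[pages_col]))
            except (TypeError, ValueError):
                continue  # skip sessions with blank or non-numeric values
            entry = totals[row[asin_col] or "unknown"]
            entry[0] += millis / 60_000  # milliseconds -> minutes
            entry[1] += pages
    return {asin: (round(minutes, 1), pages)
            for asin, (minutes, pages) in totals.items()}
```

The result maps each ASIN to the total minutes read and pages turned, which is roughly the picture Amazon itself can build from these sessions.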
There is, of course, a particular obsession on Amazon’s part with knowing which books we have read and whether they interested us. These metrics are very similar to those kept for Prime Video, where CSV tables show which movie or series we watched and for how many seconds in each session.
Is that interest legitimate? One could certainly argue that this is how Amazon learns which books, series or movies work in its catalogs, so from that point of view the collection seems logical.
That, and all the spreadsheets on *how* I read (page turns, highlights, reading sessions), my Whole Foods purchases, my contacts lists, my video watching, my Amazon purchases, every time I’ve entered a physical store, etc etc etc etc pic.twitter.com/3JROpVBjkO
— Alina Utrata (@AlinaUtrata) January 23, 2022
Saving this data at that level of detail may seem somewhat excessive, and that is precisely why there is criticism on social networks of the collection of Alexa audio or of the pages we turn when reading a book on our Kindle.
The truth is that the data Amazon collects does not seem excessive. It is basically a history of our activity on its services, and in many cases part of it makes the service more convenient for users: it is useful, for example, to be able to look up our orders or not have to enter a shipping address for each purchase.
Suspicion may arise over how Amazon uses that data, but again, this collection of usage and preferences is usually meant to improve services and recommendation systems: if the Kindle or Prime Video collect data, it seems reasonable to think that (at least in large part) it is to help us choose our next book, series or movie.
It is true that our purchase history can be useful for other purposes, such as personalized advertising. Amazon itself acknowledges this in its privacy notice, confirming that “we work with third parties such as advertisers, publishers, social networks, search engines, advertising providers and advertising companies that work on their own, to improve the relevance of the advertisements we offer.” Even so, the data Amazon collects does not seem particularly invasive, nor a serious threat to our privacy.
Image: Jonathan Borba