Browsing Histories Are Enough To Identify Internet Users, Mozilla Research Found

When browsing the web, users leave digital footprints. These include browsing histories.

In a study published by three employees of Mozilla, the creator of Firefox, it was revealed that most people have unique browsing habits, meaning that they leave a unique web browsing history.

In the hands of those who know how to analyze it, this history can reveal many sensitive details.

The data in a browsing history is said to be enough to reliably track and re-identify its owner.

Even small samples of a user's browsing history are sometimes sufficient.

This is why tech companies can use browsing histories to accurately profile internet users, who can then be shown targeted advertisements.

The study debunks the online myth that browsing history, even when anonymized, isn't useful to online advertisers.

Profile size / n unique domains in browser history. (Credit: Sarah Bird, Ilana Segall, and Martin Lopatka / Mozilla)

According to Mozilla's research, even a short list of 50 to 150 of a user's most frequently visited domains is enough for advertisers to build a unique tracking profile.

The Mozilla research paper is named "Replication: Why We Still Can't Browse in Peace: On the Uniqueness and Reidentifiability of Web Browsing Histories":

"We examine the threat to individuals' privacy based on the feasibility of reidentifying users through distinctive profiles of their browsing history visible to websites and third parties. [...] The original work demonstrated that browsing profiles are highly distinctive and stable."

"We reproduce those results [...] to detail the privacy risk posed by the aggregation of browsing histories."

In the study, the researchers gathered a dataset consisting of two weeks of browsing data from ~52,000 Firefox users.

Using the browsing histories gathered from these users, the team identified 48,919 distinct browsing profiles, of which 99% were unique.
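
To make the idea of a "unique" profile concrete, here is a minimal Python sketch. It assumes a profile is simply the set of domains a user visited, and that a profile counts as unique when exactly one user has it. The function name and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def unique_fraction(profiles):
    """Return the share of distinct profiles that belong to exactly one user.

    `profiles` maps a user id to the set of domains that user visited.
    This is an illustrative definition, not the researchers' code.
    """
    # frozenset makes each domain set hashable so identical profiles group together.
    counts = Counter(frozenset(domains) for domains in profiles.values())
    if not counts:
        return 0.0
    unique = sum(1 for n in counts.values() if n == 1)
    return unique / len(counts)


# Toy data, invented for illustration only.
toy_profiles = {
    "user_a": {"news.example", "mail.example", "shop.example"},
    "user_b": {"news.example", "mail.example"},
    "user_c": {"news.example", "mail.example"},
    "user_d": {"video.example", "forum.example"},
}

print(unique_fraction(toy_profiles))
```

In the toy data, two users share an identical profile, so only two of the three distinct profiles are unique.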

"High uniqueness holds even when histories are truncated to just 100 top sites. We then find that for users who visited 50 or more distinct domains in the two-week data collection period, ~50% can be reidentified using the top 10k sites," the researchers said.
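
Reidentification of this kind can be sketched as matching a profile from one period against profiles from another. The Python sketch below is an illustration under stated assumptions: it uses Jaccard similarity as the matching score (the article does not say which metric the researchers used), it models truncation to "top sites" as intersecting each profile with an allowed set of domains, and all names and data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: size of overlap divided by size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def reidentification_rate(first, second, allowed=None):
    """Share of users in `first` whose best match in `second` is themselves.

    `first` and `second` map user ids to domain sets from two periods.
    `allowed`, if given, is the set of domains kept (e.g. a list of top sites).
    Ties go to the first candidate in iteration order.
    """
    if not first or not second:
        return 0.0

    def restrict(domains):
        return set(domains) & allowed if allowed is not None else set(domains)

    candidates = {user: restrict(domains) for user, domains in second.items()}
    hits = 0
    for user, domains in first.items():
        profile = restrict(domains)
        best = max(candidates, key=lambda u: jaccard(profile, candidates[u]))
        hits += best == user
    return hits / len(first)


# Toy data, invented for illustration only.
week1 = {
    "a": {"news.example", "mail.example", "shop.example"},
    "b": {"video.example", "forum.example"},
    "c": {"news.example", "code.example"},
}
week2 = {
    "a": {"news.example", "mail.example", "shop.example", "maps.example"},
    "b": {"video.example", "forum.example"},
    "c": {"news.example", "mail.example"},
}

print(reidentification_rate(week1, week2))
print(reidentification_rate(week1, week2, allowed={"news.example", "mail.example"}))
```

With the full toy histories every user is matched back to themselves, while restricting the profiles to two very common domains makes users harder to tell apart.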

Building on a 2012 study titled "Why Johnny Can't Browse in Peace: On the Uniqueness of Web Browsing History Patterns", the researchers wanted to re-evaluate whether browsing history was still a valid fingerprinting vector.

According to the findings, it is, and more so than ever.

"Finally, we observe numerous third parties pervasive enough to gather web histories sufficient to leverage browsing history as an identifier."

Whatever a person does on the web leaves a trail.

Even if a browser is set not to store browsing history, for example through private browsing mode, the websites a person visits still keep logs of their visitors' IP addresses.

These are essentially the network addresses of the devices people use to visit those websites.

Beyond that, the computers people use also store digital footprints of their users.

When a person does something on a computer, even something as simple as opening a file, deleting a file, or reading an email, that activity is also recorded on that computer.

When an online digital footprint moves from the internet to the user's hard drive, it should be noted that even if the data is deleted, it can still be recovered.

These different pieces of information, created at different times, can be combined to track and re-identify their owner.

The information is valuable not only to advertisers; it can also aid investigations by the authorities.

The findings by Mozilla's researchers should therefore alarm anyone who cares about privacy when browsing the web.