
Differential Privacy: what is it?

Protecting sensitive data is a concern that grows more pressing every day; Differential Privacy is the latest approach to safeguarding personal information.

The recent events involving Facebook and Cambridge Analytica have brought further prominence to a problem that individuals had felt only marginally, but that companies already knew very well.

Indeed, a breach of sensitive data can cause devastating financial losses and compromise an organization's reputation for years. From serious business and data losses to fines and remediation costs, data breaches have far-reaching consequences.

According to the 2019 Cost of a Data Breach Report, conducted by the Ponemon Institute and sponsored by IBM Security, the average total cost of a breach was nearly $4 million, based on data from 507 companies across 16 geographies and 17 industries.

 

Netflix case

 


 

Methods for anonymizing data have been devised over the years, but evolving attack techniques challenge even the most advanced of them. Just ask Netflix.

In 2007, Netflix promised a $1 million prize to anyone who could come up with the best collaborative filtering algorithm. Two researchers at the University of Texas-Austin, however, took the challenge in the opposite direction.

Netflix released a selection of its users' ratings for the contest, with all personally identifiable information removed: no names, no account details. Nevertheless, the researchers managed to de-anonymize a number of users from the nominally anonymized data set. All they had to do was scrape IMDb, a popular movie-rating site, and match its public rating patterns against those in the Netflix data. The result? The two researchers identified about 80 percent of the users in the anonymized Netflix database.

 

Differential Privacy in a nutshell. What is it and how does it work?

Simplifying before elaborating further: Differential Privacy serves to eliminate the possibility of such a reverse-engineering operation. It does this by adding background "noise" so that malicious attackers cannot work on clean data, only on information that is deliberately perturbed and therefore useless for re-identification.

Let us now look in detail at how this mechanism works and why it is the future of sensitive data protection.

Distracting, confusing and limiting: that's the secret of Differential Privacy

 


 

Differential Privacy works through a complex mathematical structure that uses two mechanisms to protect personal or confidential information within data sets:

  • A small amount of statistical "noise" is added to each result to mask the contribution of individual data points. This noise works to protect an individual's privacy without significantly affecting the accuracy of the answers extracted by analysts and researchers.
  • The amount of information revealed by each query is calculated and subtracted from an overall privacy budget, so that further queries that would definitively compromise personal privacy can be blocked.

Through these mechanisms, Differential Privacy makes it practically impossible to infer precise, accurate information about any particular person, because each individual's contribution to the released dataset is masked.
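The two mechanisms above can be sketched in a few lines of Python. This is an illustrative sketch only, not a production DP library: the class name `PrivateCounter`, the epsilon values, and the simple budget accounting are assumptions for the example. It adds Laplace noise (a standard choice for counting queries, whose sensitivity is 1) and refuses queries once the privacy budget is spent.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Sample from a Laplace(0, scale) distribution via the inverse CDF."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

class PrivateCounter:
    """Answers counting queries with noise and tracks a privacy budget."""

    def __init__(self, data, total_budget: float = 1.0):
        self.data = data
        self.remaining_budget = total_budget

    def count(self, predicate, epsilon: float = 0.1) -> float:
        # Mechanism 2: subtract this query's cost from the overall budget,
        # blocking further queries once the budget is exhausted.
        if epsilon > self.remaining_budget:
            raise RuntimeError("privacy budget exhausted: query refused")
        self.remaining_budget -= epsilon
        true_count = sum(1 for row in self.data if predicate(row))
        # Mechanism 1: a counting query has sensitivity 1 (one person can
        # change the count by at most 1), so Laplace noise with scale
        # 1/epsilon masks any individual's contribution.
        return true_count + laplace_noise(1.0 / epsilon)
```

An analyst still gets a usefully accurate count, but no single query, and no sequence of queries beyond the budget, pins down any one person's data.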

 

What is meant by noise and how does it "add" to our sensitive data?

 


 

This is the most technical and most fascinating part of the whole process. The following illustration comes from the journal Foundations and Trends in Theoretical Computer Science and is based on a simple question, any question, to which a person must answer "yes" or "no." Before the final answer is recorded, so-called "noise" is inserted: a random element that intervenes to shuffle the cards.

Let us imagine that we toss a coin before the answer is recorded: if it comes up heads, the person's actual answer is entered into the model. If it comes up tails, a second coin is tossed, and the recorded answer will be "yes" or "no" depending on whether that second toss comes up heads or tails, respectively.
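This coin-flip protocol, known as randomized response, can be simulated directly. The sketch below (function names are my own) shows both halves of the trick: each individual answer is deniable, yet the population-level truth is still recoverable, because the observed yes-rate equals 0.5 × p + 0.25, where p is the true yes-rate.

```python
import random

def randomized_response(true_answer: bool) -> bool:
    """Apply the two-coin protocol described above.

    First coin heads: report the truth.
    First coin tails: flip again and report "yes" on heads, "no" on tails.
    """
    if random.random() < 0.5:      # first coin came up heads
        return true_answer
    return random.random() < 0.5   # second coin decides the reported answer

def estimate_true_yes_rate(responses) -> float:
    """Recover the population-level truth from the noisy reports.

    Observed yes-rate = 0.5 * p + 0.25, so invert: p = 2 * (observed - 0.25).
    """
    observed = sum(responses) / len(responses)
    return 2 * (observed - 0.25)
```

A respondent who reports "yes" can always claim the coin made them say it, yet across thousands of responses the estimator converges on the true proportion.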

A friend has your back

Partnering with Bureau Veritas, and passionate about cybersecurity and everything that revolves around computers for more than two decades, we at Goodcode specialize in implementing Differential Privacy systems for companies that want an extra layer of protection for their own and their customers' sensitive data.

To do so, simply contact us here.

We have already assisted many businesses that suffered theft of data crucial to their operations, helping them recover and return to full activity. But we believe the safest way to take care of your business is to avoid taking the risk in the first place.

At Goodcode, we protect you from attackers. What else are friends for?