How to anonymize data?

According to the CNIL (Commission nationale de l'informatique et des libertés, the French data protection authority), anonymization refers to a process that aims to guarantee respect for an individual's privacy by preventing his or her identification from a set of data. It is one of the solutions that make it possible to use personal data without infringing the rights and freedoms of individuals. How can it be implemented?

What are the techniques for anonymizing personal data?

Anonymizing personal data allows it to be reused while protecting the privacy of the people it describes. It also makes it possible to store the data beyond the established retention period.

To effectively anonymize personal data, it is necessary to observe several rules:

Analyze the relevance of information to determine which information should be retained;
Remove elements that make the person concerned easily identifiable;
Rank the information by importance so that unnecessary information can be removed;
Define the acceptable level of granularity for each piece of stored information.

Only then is it possible to choose the appropriate anonymization technique. There are two main types.

Randomization

Randomization is an anonymization technique that consists in destroying the link between a piece of personal data and an individual. Specifically, it involves altering attributes in a dataset. The process makes each piece of information less precise without affecting the overall distribution of the data.

Adding a few centimeters to the real height of individuals and permuting their dates of birth are examples of randomization.
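The two examples above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names add_noise and permute, the sample values and the 3 cm bound are all hypothetical, and the fixed seed is there only to make the sketch reproducible.

```python
import random

def add_noise(values, max_shift, rng):
    """Shift each value by a random amount in [-max_shift, +max_shift]."""
    return [v + rng.uniform(-max_shift, max_shift) for v in values]

def permute(values, rng):
    """Return the same values in shuffled order, so a value no longer
    sits on the row of the person it belongs to."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled

# Fixed seed for reproducibility of this sketch only; a real
# anonymization run should not use a predictable seed.
rng = random.Random(0)

# Hypothetical sample data.
heights_cm = [162.0, 175.0, 181.0, 158.0]
birth_dates = ["1984-03-12", "1990-07-01", "1975-11-23", "2001-01-30"]

noisy_heights = add_noise(heights_cm, 3.0, rng)
permuted_dates = permute(birth_dates, rng)

print(noisy_heights)
print(permuted_dates)
```

Each height changes by at most 3 cm and the set of birth dates is unchanged, which is why the overall distribution stays close to the original even though individual rows no longer match real people.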

Generalization

Generalization dilutes personal data by changing its scale or order of magnitude. By giving the individuals concerned common attribute values, this anonymization technique prevents any one individual from being singled out in a dataset.

As an illustration, instead of indicating the date of birth of persons in a database, it is possible to indicate only the year.
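The date-of-birth example can be sketched as follows. The sample dates and the helper name generalize_to_year are hypothetical; the group-size check at the end is one simple way, among others, to see whether anyone remains isolated after generalization.

```python
from collections import Counter
from datetime import date

def generalize_to_year(birth_date):
    """Keep only the year of a date of birth."""
    return birth_date.year

# Hypothetical sample data: every full date is unique.
birth_dates = [
    date(1984, 3, 12),
    date(1984, 9, 2),
    date(1990, 7, 1),
    date(1990, 12, 25),
]

years = [generalize_to_year(d) for d in birth_dates]

# Count how many people share each generalized value. A group of
# size 1 would mean that person can still be singled out.
group_sizes = Counter(years)
smallest_group = min(group_sizes.values())

print(years)
print(smallest_group)
```

With full dates, every person in this sample is unique; after generalization, each year is shared by two people.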

It should be noted, however, that these data anonymization techniques are not infallible. It is therefore in everyone's interest to learn how to protect their personal data.

How to anonymize a database?

The anonymization of private databases must meet certain conditions. In particular, sensitive data must be replaced with fictitious information before it is disseminated. It is also necessary to remove all unnecessary personal data from the production process.

In addition, it is important to comply with the requirements of the General Data Protection Regulation (GDPR) for non-production environments.

The most common way to anonymize a database is simply to delete the fields containing personal data, such as an IP address or a social security number. Most of the time, however, this also deletes useful data such as geographical information.
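As a minimal sketch of field deletion, assuming rows are held as Python dictionaries (the field names, the sample values and the drop_fields helper are hypothetical):

```python
def drop_fields(rows, fields_to_drop):
    """Return copies of the rows without the listed fields."""
    return [
        {key: value for key, value in row.items() if key not in fields_to_drop}
        for row in rows
    ]

# Hypothetical sample rows; the IP address is from a documentation range
# and the social security number is a placeholder.
rows = [
    {"ip_address": "203.0.113.7", "ssn": "000-00-0000", "page": "/home"},
    {"ip_address": "198.51.100.23", "ssn": "000-00-0001", "page": "/pricing"},
]

anonymized = drop_fields(rows, {"ip_address", "ssn"})
print(anonymized)
```

Once the ip_address field is gone, any approximate location that could have been derived from it is gone too, which is the loss of useful data mentioned above.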

Another solution is to replace personal data fields with new information. In this case, however, it is still possible to re-identify the individuals concerned by combining several databases.
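The re-identification risk can be illustrated with a small sketch. Everything here is hypothetical: two invented datasets, a link helper, and the assumption that both datasets share a postcode and a birth year. The idea is that a row whose combination of shared attributes matches exactly one named person can be re-identified, even though the name was replaced.

```python
def link(pseudonymized, public, keys):
    """Map each pseudonym to a name when the shared attributes
    match exactly one person in the public dataset."""
    index = {}
    for person in public:
        signature = tuple(person[k] for k in keys)
        index.setdefault(signature, []).append(person["name"])

    matches = {}
    for row in pseudonymized:
        candidates = index.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:  # unambiguous match
            matches[row["pseudonym"]] = candidates[0]
    return matches

# Hypothetical database in which names were replaced with new values.
pseudonymized = [
    {"pseudonym": "P1", "postcode": "75011", "birth_year": 1984, "diagnosis": "asthma"},
    {"pseudonym": "P2", "postcode": "69003", "birth_year": 1990, "diagnosis": "diabetes"},
]

# Hypothetical second database that still contains names.
public = [
    {"name": "Alice Example", "postcode": "75011", "birth_year": 1984},
    {"name": "Bob Example", "postcode": "69003", "birth_year": 1990},
    {"name": "Carol Example", "postcode": "69003", "birth_year": 1990},
]

reidentified = link(pseudonymized, public, ["postcode", "birth_year"])
print(reidentified)  # {'P1': 'Alice Example'}
```

P1 is re-identified because only one named person shares that postcode and birth year. P2 is not, because two people share the same values, which is the kind of protection generalization aims for.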

For data in non-production environments such as testing or development in particular, masking, also called static anonymization, is the approach recommended for GDPR compliance. This process prevents access to sensitive data by replacing it with information that looks similar but is of no use.
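A minimal sketch of static masking is shown below, assuming string fields held in Python dictionaries. The mask_value and mask_rows helpers, the field names and the sample values are hypothetical. Each letter is replaced with a random letter and each digit with a random digit, so the masked copy keeps the shape of the original (length, separators) without keeping its content.

```python
import random
import string

def mask_value(value, rng):
    """Replace letters and digits with random ones, keeping
    punctuation, spacing and length unchanged."""
    masked = []
    for ch in value:
        if ch.isdigit():
            masked.append(rng.choice(string.digits))
        elif ch.isalpha():
            masked.append(rng.choice(string.ascii_lowercase))
        else:
            masked.append(ch)
    return "".join(masked)

def mask_rows(rows, sensitive_fields, rng):
    """Return copies of the rows with the sensitive fields masked."""
    return [
        {k: mask_value(v, rng) if k in sensitive_fields else v for k, v in row.items()}
        for row in rows
    ]

# OS-provided randomness, so the masking cannot be replayed from a seed.
rng = random.SystemRandom()

# Hypothetical production rows to be copied into a test environment.
production_rows = [
    {"email": "alice@example.com", "phone": "+33 6 12 34 56 78", "plan": "premium"},
]

test_rows = mask_rows(production_rows, {"email", "phone"}, rng)
print(test_rows)
```

The masked rows still look like e-mail addresses and phone numbers, which keeps test code and screens working, while the non-sensitive plan field is copied as-is.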

Anonymization is not the only way to protect sensitive data. Encryption, for example, is another option, and it can be applied to a database as well as to a mobile device.

For more information, please read the article on how to protect your personal data on a smartphone.

How to anonymize cloud data?

In another article, we mentioned that the risk to personal data is one of the disadvantages of cloud computing. To limit it, anonymizing the data stored in the cloud is crucial.

In this case, data processing can also rely on masking. Because it makes the original data disappear, static anonymization essentially serves the right to be forgotten. Recital 26 of the GDPR states that the regulation's data protection principles do not apply to anonymous information, so data that has been truly anonymized falls outside the scope of this European regulation.

This is a particularly interesting solution for companies, as it will allow them to take full advantage of cloud computing.