Data Privacy

With the GDPR and other data governance regulations in effect, more and more organizations are giving data privacy the attention it needs. At its heart, this means making sure personal data is well monitored, kept private and classified accordingly.

But what do organizations that handle sensitive data need to know about data privacy? And how can you improve your data privacy practices?

In this guide, we’ll cover everything you need to know. Along the way, you’ll find information on:

  • What data privacy is
  • Why it’s important
  • Frequent data privacy issues
  • Techniques for maintaining data privacy (and getting a better grip on your data), including:
    • Discovery and classification
    • Creating a ‘Record of Processing Activity’ document
    • Data mapping
    • Anonymizing and pseudonymizing sensitive data

Ready? Let’s get down to the specifics.

What is Data Privacy?

Data privacy consists of the actions taken to protect and preserve personally identifiable information (PII) from unauthorized third-party access. PII, at its core, is information that distinguishes a specific person’s identity. This information can be sensitive or non-sensitive, meaning that your organization will need to understand what data you can process and share and what data you can’t.

Most sensitive PII can be split into different categories, including:

  • Online details. This consists of personal information given out via online interactions and website accounts. It can include email addresses, names and phone numbers.
  • Financial information. One of the most sensitive sets of data, financial data consists of credit/debit card details and bank information.
  • Geographic location. This is an individual’s address or physical whereabouts. (For example, the location settings on an application may update your location and share it with others.)
  • Medical history. Your medical history may contain information about previous or current diagnoses and treatment.
  • Political record. Political affiliations and personal opinions are also classified as sensitive information. Alongside race and ethnicity, sexual orientation and religious beliefs, there are strict regulations around processing political data under the GDPR.

Each of these personal data sets is sensitive and should be kept private and away from third-party access, unless specific circumstances apply or special permissions are given. Additionally, if you choose to analyse any of these data sets, your organization should take the necessary steps to anonymize or mask the data fields that could identify a person.

Why is Data Privacy Important?

Personal information must be kept personal.

When private data gets into the wrong hands, it can be misused or shared with people who do not have permission to view it. For your customers, this could lead to identity fraud, scams or a breach of trust.

For your organization, however, a data breach could result in reputational damage and a hefty fine. British Airways, for example, was told in 2019 that it faced a £183 million (roughly $229 million) fine after hackers stole sensitive customer data, although the final penalty issued in 2020 was reduced to £20 million.

Without an awareness of data privacy and the steps you need to take to ensure you’re processing and sharing data legally, you could find yourself in trouble.

The Most Challenging Data Privacy Issues

There are huge global issues surrounding data privacy and how organizations such as yours must navigate the handling of sensitive data.

Here are three of the biggest data privacy concerns:

  1. The rise of data. At our current pace, we create 5 quintillion bytes of data per day, globally. But even on a smaller scale, your business will only have more data to handle – and more private data to segment.
  2. The cost and time of managing data. Your data teams won’t be able to sift through your data, classify it, map it, keep it private and use it for innovation manually. With maintaining data privacy comes the need to use smarter, integrated data tools.
  3. The GDPR. Now that strict data protection regulations are in place for organizations that process the personal data of people in the EU, the need for data privacy has never been more pressing.

An awareness of these wider issues at play will help you avoid any data privacy slip-ups.

How Do You Overcome These Data Privacy Issues?

At the root of data privacy is the need for transparency and understanding. Fortunately, there are numerous practices you can adopt that’ll help you get to grips with your data pipelines. 

Ultimately, these practices will help you identify critical data sets and, in turn, better monitor and manage data privacy.

1. Data discovery: pinpoint your critical data

To protect the sensitive data in your pipelines, you must first know where it is. As a result, data discovery is a crucial part of any data management and data governance exercise.

However, many organizations still rely on manually discovering and classifying their sensitive data. Unfortunately, for those who do, it takes a lot of time and effort to sift through large amounts of data, particularly if it sits in multiple locations. This means that some of your PII may slip through the cracks.

It’s much easier and safer to conduct the discovery and classification process with the help of intuitive tools. For instance, with the CloverDX Harvester, you can dive into your databases quickly and pick out sensitive data in places you didn’t even think to look. The automated process scans data sources and uses matching algorithms to determine the type of data you have. From there, you can see where your PII is and make decisions on the data you would like to delete, migrate or anonymize.
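To give a feel for what pattern-based discovery involves, here is a minimal Python sketch. It is not how the CloverDX Harvester works internally; the patterns, the `detect_pii_columns` function and the 80% threshold are all assumptions made for illustration, and real tools use far richer matching rules.

```python
import re

# Hypothetical patterns for illustration only.
PII_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?[\d\s().-]{7,20}$"),
}


def detect_pii_columns(rows, threshold=0.8):
    """Return {column: pii_type} for columns whose non-empty values mostly match a pattern."""
    findings = {}
    columns = rows[0].keys() if rows else []
    for column in columns:
        # Ignore empty cells so sparse columns can still be detected.
        values = [str(r[column]) for r in rows if r.get(column) not in (None, "")]
        if not values:
            continue
        for pii_type, pattern in PII_PATTERNS.items():
            hits = sum(1 for v in values if pattern.match(v))
            if hits / len(values) >= threshold:
                findings[column] = pii_type
                break
    return findings


sample_rows = [
    {"id": 1, "contact": "ana@example.com", "notes": "call back Monday"},
    {"id": 2, "contact": "li@example.org", "notes": "prefers email"},
    {"id": 3, "contact": "sam@example.net", "notes": ""},
]
print(detect_pii_columns(sample_rows))  # {'contact': 'email'}
```

Scanning values rather than trusting column names is the point here: a column called "contact" or "notes" tells you little about what is actually stored in it.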

Classify and label sensitive data

Once sensitive data has been found, you must analyse and classify it. This is a way of segregating personal, private information from the data that requires less intervention.

Usually, organizations will classify data by tagging each piece of information not only by its type (name, date of birth, etc.), but also by whether its intended access level is ‘public’, ‘private’ or ‘restricted’, with public data being the least critical. The way you define each level should be set out for your employees in a thorough data classification policy.

Ultimately, classifying sensitive and non-sensitive data will help with your subsequent data protection and security practices, too. This is vital for following necessary data regulations and avoiding potential fines.
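As a rough illustration of tagging by type and access level, the sketch below labels fields against a small policy table. The `CLASSIFICATION_POLICY` entries, the `FieldLabel` structure and the `classify_field` helper are hypothetical; your own classification policy would define the real categories.

```python
from dataclasses import dataclass
from enum import Enum


class Sensitivity(Enum):
    PUBLIC = 1
    PRIVATE = 2
    RESTRICTED = 3


# Hypothetical policy: maps a data type to its sensitivity level.
CLASSIFICATION_POLICY = {
    "product_name": Sensitivity.PUBLIC,
    "email": Sensitivity.PRIVATE,
    "date_of_birth": Sensitivity.PRIVATE,
    "medical_history": Sensitivity.RESTRICTED,
}


@dataclass(frozen=True)
class FieldLabel:
    field_name: str
    data_type: str
    sensitivity: Sensitivity


def classify_field(field_name, data_type):
    """Label a field; unknown types default to RESTRICTED so nothing is exposed by omission."""
    level = CLASSIFICATION_POLICY.get(data_type, Sensitivity.RESTRICTED)
    return FieldLabel(field_name, data_type, level)


labels = [
    classify_field("contact", "email"),
    classify_field("diagnosis", "medical_history"),
    classify_field("mystery_column", "unknown"),
]
for label in labels:
    print(label.field_name, label.data_type, label.sensitivity.name)
```

Defaulting unknown types to the strictest level is a design choice in this sketch: it is usually safer to relax a label after review than to discover an unlabelled sensitive field later.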

2. Create a thorough ROPA

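Under the GDPR, it’s essential that organizations with 250 or more employees create a Record of Processing Activities (ROPA). Smaller organizations can also fall under this requirement, for example if they process special categories of data or if their processing is more than occasional. The record gives you (and supervisory authorities) a full track record of your data activities.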

This document is useful for monitoring data flows and ensuring you’re treating sensitive data appropriately.

It should contain:

  • A list of departments in your organization that process personal or sensitive data
  • The controller and processor(s) of the data in question
  • The contact details of your Data Protection Officer (DPO)
  • A complete rundown of each data processing activity, listing:
    • A description of the activity
    • The purpose and legal basis of the activity
    • What data you’re collecting and how you’re processing it (are you sharing it with third parties?)
    • Who the data belongs to (customer, employee etc.)
    • How long you plan to store the data and where you’re storing it
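If you keep your ROPA in a structured form, the checklist above could be modelled roughly as follows. This is only a sketch: the class names, field names and the `incomplete_entries` helper are assumptions for illustration, not a prescribed GDPR format.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ProcessingActivity:
    description: str
    purpose: str
    legal_basis: str
    data_categories: list
    shared_with_third_parties: bool
    data_subjects: str
    retention_period: str
    storage_location: str


@dataclass
class Ropa:
    departments: list
    controller: str
    processors: list
    dpo_contact: str
    activities: list = field(default_factory=list)


def incomplete_entries(ropa):
    """Return (activity description, field name) pairs where a value is still empty."""
    gaps = []
    for activity in ropa.activities:
        for f in fields(activity):
            value = getattr(activity, f.name)
            if value in ("", None, []):
                gaps.append((activity.description, f.name))
    return gaps


ropa = Ropa(
    departments=["Marketing"],
    controller="Example Ltd",
    processors=["Mail Provider Inc"],
    dpo_contact="dpo@example.com",
)
ropa.activities.append(
    ProcessingActivity(
        description="Newsletter delivery",
        purpose="Send marketing updates",
        legal_basis="Consent",
        data_categories=["name", "email"],
        shared_with_third_parties=True,
        data_subjects="customers",
        retention_period="",  # not yet decided, so it should be flagged
        storage_location="EU data centre",
    )
)
print(incomplete_entries(ropa))  # [('Newsletter delivery', 'retention_period')]
```

A structured record like this makes it straightforward to spot activities that are missing a retention period or legal basis before a regulator does.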

If you struggle to monitor your activities, you may benefit from a data integration tool.

As an example, CloverDX’s ‘Data Model Bridge’ gives you better transparency into your data sets. This helps you monitor your data flows, gain control over your data processes through recordable, repeatable data models, and create a thorough ROPA document. Because you can record and model your data structures in one place, you needn’t trace scattered processing footprints to piece together your historical activities.

Equally, the CloverDX Harvester can pick out sensitive data should you need to check you’ve documented every single critical processing activity in your ROPA.

3. Map your data  

To define it simply, data mapping is the process of linking source data fields to their corresponding target data fields. This is essential for migrating data, as it ensures that each type of data ends up in the right home (e.g. a set of business phone numbers should go to a corresponding field of business phone numbers).

But the data mapping process itself can be useful for data privacy, too.

If, for instance, a data subject requests access to their personal data, an effective data mapping system will make auditing and locating that data much simpler.
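A field map can be as simple as a lookup table. The sketch below uses made-up field names (`biz_phone`, `email_addr` and so on) and two hypothetical helpers: one applies the map to a record, the other answers the reverse question of which source fields feed a given target field, which is the kind of lookup an access request calls for.

```python
# Hypothetical source-to-target field map for a CRM migration.
FIELD_MAP = {
    "biz_phone": "business_phone",
    "email_addr": "email",
    "full_name": "name",
}


def map_record(source_record, field_map):
    """Rename mapped fields; collect unmapped ones so they are not silently dropped."""
    target, unmapped = {}, {}
    for source_field, value in source_record.items():
        if source_field in field_map:
            target[field_map[source_field]] = value
        else:
            unmapped[source_field] = value
    return target, unmapped


def source_fields_for(target_field, field_map):
    """Reverse lookup: which source fields feed a given target field?"""
    return [src for src, tgt in field_map.items() if tgt == target_field]


mapped, leftover = map_record({"biz_phone": "+1 555 0100", "fax": "n/a"}, FIELD_MAP)
print(mapped)    # {'business_phone': '+1 555 0100'}
print(leftover)  # {'fax': 'n/a'}
print(source_fields_for("email", FIELD_MAP))  # ['email_addr']
```

Returning the unmapped fields separately, rather than discarding them, keeps stray personal data visible so someone can decide whether it should be mapped, deleted or anonymized.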

4. Anonymize or pseudonymize sensitive data fields

Personal data doesn’t have to be useless or deleted. Anonymization and pseudonymization techniques can be used for maintaining data privacy, all the while giving your organization the ability to continue with your data innovation projects.

The first technique, anonymization, is self-explanatory. You simply disguise the private or sensitive data fields by:

  • Blurring data with approximate values so the exact figures can no longer be read
  • Scrambling letters to obscure names or addresses
  • Masking private data with random characters

However, as much as anonymization will protect your data subjects, it can also make your data unrealistic and hard to analyse. If you wanted to understand how many of your customers were named ‘John’, for instance, it would be impossible to do so if the name field was replaced with incoherent characters.
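The three techniques listed above can be sketched in a few lines of Python. The function names and the ten-year age bucket are assumptions for illustration; note too that this uses Python's `random` module for simplicity, which is not suitable where cryptographic strength matters.

```python
import random
import string


def blur_age(age, bucket=10):
    """Blurring: replace an exact age with the range it falls into, e.g. 37 -> '30-39'."""
    low = (age // bucket) * bucket
    return f"{low}-{low + bucket - 1}"


def scramble(text, rng=random):
    """Scrambling: shuffle the characters of a string. Weak on its own, as short
    values can often be unscrambled by hand."""
    chars = list(text)
    rng.shuffle(chars)
    return "".join(chars)


def mask(text, keep_last=0, rng=random):
    """Masking: replace characters with random letters/digits, optionally keeping a suffix."""
    cut = len(text) - keep_last
    alphabet = string.ascii_uppercase + string.digits
    masked = "".join(rng.choice(alphabet) for _ in text[:cut])
    return masked + text[cut:]


print(blur_age(37))                             # 30-39
print(scramble("Smith"))                        # same letters, random order
print(mask("4111111111111111", keep_last=4))    # 12 random characters, then 1111
```

The output of `mask` also shows the drawback described above: once a name is replaced with random characters, you can no longer count or group by it.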

This is where pseudonymization comes in. Pseudonymization is less about scrubbing data and more about creating consistent ‘pseudonyms’ for the personal data fields. Much like an author who doesn’t publish under their birth name, pseudonymization removes the direct link to the data subject without rendering the data unusable.
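One common way to create stable pseudonyms is a keyed hash, sketched below with Python's standard `hmac` module. This is one possible approach rather than the only one; the `pseudonymize` function, the `name_` prefix and the placeholder key are assumptions for illustration.

```python
import hashlib
import hmac
from collections import Counter

# Assumption: in practice this key is loaded from a secrets store and kept
# separate from the data, since anyone holding it can re-link pseudonyms.
SECRET_KEY = b"replace-with-a-securely-stored-key"


def pseudonymize(value, key=SECRET_KEY):
    """Map a value to a stable pseudonym; the same input always gives the same token."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"name_{digest[:12]}"


first_names = ["John", "Maria", "John", "Wei"]
tokens = [pseudonymize(n) for n in first_names]

# Both 'John' rows share one token, so counting still works on the pseudonymized data.
print(Counter(tokens)[pseudonymize("John")])  # 2
```

Because the mapping is deterministic, the ‘how many customers are named John’ question from earlier can still be answered by whoever holds the key, while everyone else sees only opaque tokens.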

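These techniques can tie into managing your ‘end of life’ data, too. Once you’re obliged to get rid of personal data, properly anonymizing it can let you keep the remaining records for testing and analysis. Bear in mind that pseudonymized data still counts as personal data under the GDPR, so it stays subject to the same retention rules.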

Managing Data Privacy with CloverDX

There are many cogs to data privacy. The act of protecting and preserving personal data takes time, thought and a fair amount of paperwork.

But with the right assisting technology and automation, the more time-consuming tasks can become much simpler. CloverDX’s Harvester, for example, can cut the manual time needed for discovering and classifying sensitive personal information, and make the process more accurate.

If you’d like help on your data privacy journey, or have any questions, please let our team know. We’d love to show you how CloverDX can help.
