Data Anonymization

Turn sensitive production data into anonymized,
freely usable data sets. At enterprise scale.

Production Data

The best data you could imagine for development, visualization, testing, analytics or prototyping.
Yet, production data is heavily guarded because of privacy, regulations, licensing costs and system resources.
As a result, you simply can’t use or share production data as you’d want to.

Anonymized Data

Anonymization strips production data of its sensitive parts while retaining important characteristics such as statistical distribution, relationships and anomalies.
Anonymization can be automated to create freely usable copies of production systems. Plus you don’t need to pay for extra production licenses.
You can safely provide life-like data without breaching compliance.

Of course, developers can generate synthetic data, but that is only a rough approximation.

Synthetic Data

  • Random numbers, dates or strings
  • Random lookups for names, cities, ...
  • Basic domain rules (e.g. age between 0-99)

Life-like Data

  • Resembles original data as best as possible
  • Tracks relationships among multiple systems
  • Retains statistical distribution & real values (bad data too!)
  • Retains business logic in data
  • Yet with all this, anonymization obscures sensitive information and relationships

Close to the Real Thing

Anonymization (strictly speaking, “pseudonymization”) is an advanced technique that produces data whose relationships and properties stay as close to the real thing as possible. It obscures the sensitive parts and works consistently across multiple systems.
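To make the idea of consistent pseudonymization across systems concrete, here is a minimal Python sketch. It is not the CloverDX implementation; it assumes a keyed hash (HMAC-SHA-256) as the pseudonymization function, and the key, the `pseudonymize` helper and the two sample rows are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be kept outside the data sets.
SECRET_KEY = b"example-secret-key"

def pseudonymize(value: str, prefix: str = "cust") -> str:
    """Map a sensitive value to a stable pseudonym using a keyed hash."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{prefix}_{digest[:12]}"

# The same e-mail address appears in two hypothetical systems.
crm_row = {"email": "jane.doe@example.com", "segment": "premium"}
billing_row = {"email": "jane.doe@example.com", "balance": 120.50}

# Only the sensitive field is replaced; the other attributes are kept as-is.
crm_anon = {**crm_row, "email": pseudonymize(crm_row["email"])}
billing_anon = {**billing_row, "email": pseudonymize(billing_row["email"])}

# Both systems end up with the same pseudonym, so joins on this field still work.
print(crm_anon["email"] == billing_anon["email"])
```

Because the same input always yields the same pseudonym under a given key, records that belonged together before anonymization can still be matched afterwards, without exposing the original value.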

Data Anonymization in Software Testing

See how data anonymization can help improve software release quality

with Pavel Svec, Senior Consultant.

Anonymization at Enterprise Scale

Thousands of database tables and dozens of systems.
That’s where we come in with the CloverDX Data Anonymization Solution.

Complete Sensitive Data Map

Automated Discovery

CloverDX Harvester is the first component in the solution. It crawls all your data (not just metadata structures) and scans for patterns representing sensitive information. The resulting map is a true and complete representation of occurrences of sensitive information in your data.
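As a rough illustration of scanning data values rather than column names, the Python sketch below checks every value against a couple of regular expressions and records where matches occur. It does not reflect how CloverDX Harvester works internally; the patterns, the `scan_rows` function and the sample table are assumptions made for this example.

```python
import re

# Illustrative patterns only; a real scanner would need many more, plus validation.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scan_rows(table: str, rows: list) -> dict:
    """Return {(table, column): set of pattern names} for columns with sensitive-looking values."""
    found = {}
    for row in rows:
        for column, value in row.items():
            for name, pattern in PATTERNS.items():
                if pattern.search(str(value)):
                    found.setdefault((table, column), set()).add(name)
    return found

# Hypothetical sample data: the column names give no hint of what they contain.
rows = [
    {"id": 1, "note": "call back jane.doe@example.com", "tax_ref": "123-45-6789"},
    {"id": 2, "note": "no contact details", "tax_ref": "987-65-4321"},
]
sensitive_map = scan_rows("customers", rows)
print(sensitive_map)
```

Here the free-text `note` column is flagged because one of its values contains an e-mail address, which a scan of metadata alone would likely miss.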

Anonymization Policy

Applying Policies

Each piece of sensitive data can have its own rules and anonymization policy. This gives you fine-grained control over the level of treatment required.
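One way to picture per-field policies is a simple mapping from field name to treatment, as in the Python sketch below. The rule functions (`mask`, `hash_value`, `keep`), the `POLICY` table and the choice to drop unlisted fields are hypothetical and only illustrate the idea; they are not the CloverDX policy format.

```python
import hashlib

def mask(value: str) -> str:
    """Keep the last four characters and mask the rest (short values pass through unchanged)."""
    return "*" * max(len(value) - 4, 0) + value[-4:]

def hash_value(value: str) -> str:
    """Replace the value with a truncated SHA-256 digest (unkeyed, for illustration only)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:10]

def keep(value):
    """Leave a non-sensitive value untouched."""
    return value

# Hypothetical policy: each field gets its own level of treatment.
POLICY = {"card_number": mask, "email": hash_value, "country": keep}

def apply_policy(row: dict, policy: dict) -> dict:
    # Fields without an explicit rule are dropped, as a conservative default.
    return {field: policy[field](value) for field, value in row.items() if field in policy}

row = {
    "card_number": "4111111111111111",
    "email": "jane.doe@example.com",
    "country": "CZ",
    "notes": "VIP",
}
anonymized = apply_policy(row, POLICY)
print(anonymized)
```

Keeping the policy as data, separate from the code that applies it, means the treatment of a single field can be tightened or relaxed without touching the rest.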

Anonymization Engine

Repeatable Engine

Based on the previous steps, we produce the second key component of the solution, the anonymization engine. It runs on top of the CloverDX Data Integration Platform, and can be re-generated easily upon changes. This engine has the policies built in and produces the anonymized copy of your production data on demand.

