A Second Wind: New Features Strengthen CloverDX Data Profiler

In Data Profiling with CloverDX, we discussed the value of CloverDX Data Profiler, the new product that will soon join our portfolio. Beta testing is under way, and the feedback we have received on the first beta build has already helped us improve the features and usability of CloverDX Data Profiler. Let’s see what the new build offers.

Integration with CloverDX Designer

The new build of the CloverDX Data Profiler introduces a key feature - the first step toward integration with the CloverDX suite. If you are interested in taking part in the development and feedback process of CloverDX products or the beta testing, you can register and get the new build.

The CloverDX Data Profiler has several important use cases, one being analysis in the early stages of your projects. The insight into your data gained from CloverDX Data Profiler is valuable for project planning: it illustrates the importance of clean data and offers some direction for the design of your ETL process.

With this build’s new feature, defining a profiling job does more than gather information: you are also preparing resources to use later in CloverDX Designer, namely the database connection and the metadata that describe the schema of your data source. The profiler can assist you in this process, which is especially useful for data sources that do not carry their own schema, such as CSV files.
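To make the CSV case concrete, here is a minimal sketch of what deriving metadata for a schema-less file can look like. It is not how CloverDX Data Profiler works internally; the function names (`infer_type`, `infer_metadata`) and the type labels are assumptions chosen purely for illustration.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest illustrative type label that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "string"
    # Try the strictest parser first, then fall back to looser ones.
    for name, cast in (("integer", int), ("number", float)):
        try:
            for v in non_empty:
                cast(v)
            return name
        except ValueError:
            continue
    return "string"


def infer_metadata(csv_text):
    """Map each header field of a CSV document to an inferred type label."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = list(reader)
    # Transpose rows into columns so each field can be inspected as a whole.
    columns = list(zip(*rows)) if rows else [()] * len(header)
    return {name: infer_type(col) for name, col in zip(header, columns)}


# Hypothetical sample data, not taken from any real source.
sample = "id,price,city\n1,9.5,Prague\n2,12,Brno\n"
metadata = infer_metadata(sample)
print(metadata)
```

A CSV file gives you only field names and text values, so any type information has to be inferred from the data itself. That inference is the kind of work a profiling tool can take off your hands.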

Exporting Metadata and Transformation

The second beta build of CloverDX Data Profiler includes the option to export both database connection metadata and a sample CloverDX transformation graph containing a reader that accesses the data source you previously defined. These are ready to be used in CloverDX Designer. See the official CloverDX Data Profiler documentation to learn more about this feature.

Other Improvements in the New Build

Here is a short summary of the most notable improvements in the second build:

  • export to CloverDX Designer graph
  • visual enhancement of CloverDX Data Profiler job
  • removed dependency on internal database for metrics calculation
    • this results in a performance boost: profiling 4 GB of data with 30 fields and all metrics enabled takes 30 minutes
  • metrics that might be time-consuming are visually distinguished in the profiler
  • other tweaks and changes that enhance the overall performance and stability


The Next Steps

This is our first step in providing integration of CloverDX Data Profiler with the rest of the CloverDX suite. The next stages will be:

  • full integration with CloverDX Designer, which will let you profile any data flow inside your CloverDX graph, and
  • integration with CloverDX Server, where you will be able to set up events based on conditions detected in these data flows inside CloverDX graphs running on the server.
Posted on December 01, 2011
