Quick Tip - Organizing Executions History

Reusability is a very important topic when it comes to job design in CloverDX. We are strong advocates of the DRY principle, which can be a big help during development. However, there is a small catch when trying to make sense of multiple executions of the same, highly configurable job.

Child job tagging

This is extremely helpful during development but may create all kinds of problems for DevOps when not managed properly. For example, the following image shows multiple processes that read and process payment files. If one or more of these processes fails, it is virtually impossible to tell which file failed to load. It can get even worse when the executed process is just a thin wrapper around something more complex.

Execution tree without labels

As a developer, you can make browsing through the execution tree much more transparent. Labels can be changed either by properly naming a given Subgraph, or through the Execution Label property, configured directly in the component's configuration dialog or via the Input mapping of ExecuteGraph/ExecuteJobflow. The result is a better annotated execution structure that helps pinpoint issues more efficiently.
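As a minimal sketch of the Input mapping approach: an ExecuteGraph Input mapping is written in CTL2, and one of the mappable output fields controls the execution label. The field name `executionLabel` and the input field `fileName` are assumptions here for illustration, not taken from the example above.

```
// Input mapping of ExecuteGraph (CTL2) - a sketch.
// The output field name (executionLabel) is an assumption;
// check the component's mapping dialog for the exact field.
function integer transform() {
	// Label each child run after the file it processes,
	// so failures are identifiable in Execution History.
	$out.0.executionLabel = "Payments: " + $in.0.fileName;
	return ALL;
}
```

With a mapping like this, every child execution in the tree carries the name of the file it was started for.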

Execution tree with labels

Default job labels

When the same top-level job runs multiple times, Execution History will look very much like the following picture. This is not too helpful either when dataLoad.jbf behaves differently for each configuration, e.g. loading data for different regions.

It may not be straightforward to identify which region failed to refresh its data. But there is a way to fix it.

List of triggered jobs without labels defined

The Execution Label property can also be set in the job file itself and, as usual, it can be populated by any job parameter. In our example, dataLoad.jbf includes a parameter called REGION, which is used as the label.
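As a sketch of what this looks like, the Execution Label can simply reference the job parameter using the standard `${...}` parameter syntax. The surrounding "Data load" text is an illustrative choice, not taken from the screenshots:

```
Execution Label: Data load - ${REGION}
```

When the jobflow is triggered with REGION=EMEA, the run then appears in Executions History as "Data load - EMEA", making each regional run identifiable at a glance.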

Apply label to any CloverDX job

This allows the Executions History to be annotated a little better than usual. Each individual trigger provides REGION both as a means of identifying the data target and as the appropriate tag for the Executions History listing.

Making sense of execution history for labeled jobs is easier

Quite a different picture, right? I hope you will find this tip useful.
