Starting a new CloverDX project

Starting a new project and not sure how to prepare? Here are three things to keep in mind to get off on the right foot with projects of any scale.

We often get questions such as 'What is a best practice for project structure?', 'How do you work on a single project in parallel?', 'What's the best go-live strategy?', etc. To cover at least some of these FAQs, we plan to create an article series to help kick-start the setup stage of any CloverDX project. In this one, let's focus on questions you should ask yourself when starting a new project.

I'll address some observations from our in-house Consulting team as well, to outline which practices are better avoided. In upcoming articles, we will discuss how to work in CI/CD pipelines or how to set up your environment to be parallel development-ready.

I'll be mostly referring to Git, Jenkins and CloverDX 5.10.1, as this is our predominant software stack at this time, but the same principles apply to any CloverDX release and any modern VCS or development automation platform.

Common problems

I've seen my fair share of projects where developers join and leave as the project becomes active or inactive, or where multiple developers work on the same project in parallel. Almost always, the project accumulates problems that are very difficult to get rid of down the line:

  • Project structure that is difficult to make sense of
  • Inconsistent naming conventions
  • Hard-coded usernames and passwords
  • Copy-and-paste "version control" (SameFile1.grf, SameFile2.grf, ...)
  • Duplicated parameters in various PRM files

Strangely enough, all of these problems can be easily avoided from the very beginning by setting up and documenting coding standards, which should become a sort of project bible. A simple README.txt describing the most basic coding standards and the project structure usually does the trick. As CloverDX is a very visual tool, one just needs to know where to start looking when trying to make sense of the project, and "how to behave" when making changes.
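Some of the problems listed above can also be detected mechanically. As a rough sketch, the following reports parameter names that are declared in more than one PRM file. It assumes that PRM files are XML documents in which each parameter is a GraphParameter element with a name attribute; check this against your own files before relying on it. The file names and contents are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def parameter_names(prm_xml: str) -> set[str]:
    """Return the parameter names declared in one PRM file's XML text.

    Assumption: parameters are <GraphParameter name="..."/> elements.
    """
    root = ET.fromstring(prm_xml)
    return {p.get("name") for p in root.iter("GraphParameter") if p.get("name")}


def duplicated_parameters(prm_files: dict[str, str]) -> dict[str, list[str]]:
    """Map each parameter declared in two or more files to those files."""
    seen = defaultdict(list)
    for path, text in prm_files.items():
        for name in parameter_names(text):
            seen[name].append(path)
    return {name: sorted(paths) for name, paths in seen.items() if len(paths) > 1}


# Hypothetical file contents, for illustration only.
example = {
    "workspace.prm": (
        '<GraphParameters>'
        '<GraphParameter name="DB_URL" value="a"/>'
        '<GraphParameter name="DATAIN_DIR" value="b"/>'
        '</GraphParameters>'
    ),
    "sap.prm": (
        '<GraphParameters>'
        '<GraphParameter name="DB_URL" value="c"/>'
        '</GraphParameters>'
    ),
}

print(duplicated_parameters(example))
# prints {'DB_URL': ['sap.prm', 'workspace.prm']}
```

In a real project, the dictionary would be built by reading every *.prm file in the sandbox. Whether a duplicate is an actual problem still needs a human decision, since some overrides may be intentional.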

Set project standards

Think of standards as the foundation of a building. It is very difficult to build a skyscraper on a foundation laid for a garden shed. Such a building would have to be propped up from all sides on every floor just to keep it from toppling over. Chances are that, even if heavily supported, it would eventually sink.

It always helps to have a simple set of rules for all projects that everyone is familiar with (such as a naming convention, or always using parameters for passwords), but other rules may be difficult to carry over from project to project. It is a good idea to keep project documentation alongside the project; its format usually depends on project size, from one or a couple of Markdown files up to full HTML, PDF or Word documentation. In most of our projects, you will find a docs directory for precisely this purpose. Smaller ones may feature a single README.md file in the project root to outline the most important information, which includes:

  • Brief project description (how it works) and its purpose
  • Installation and configuration instructions
  • Project structure
  • Dependencies

Keeping a copy of the global standards alongside these docs may be helpful when the standards are still evolving and the project's conventions follow older guidelines.

It is important to mention that one should not be afraid of changing the default project layout. This layout is meant for smaller, single-purpose projects. When a multipurpose project is being developed, it may be more convenient to break it down into modules. Such a breakdown helps with testing, orientation and often even architecture design. So, consider changing the default layout from

sandbox://LargeProject/graph/SAP_DataValidation.grf
sandbox://LargeProject/graph/SAP_Loader.grf
sandbox://LargeProject/jobflow/SAP_Main.jbf
sandbox://LargeProject/meta/SAP_DimCustomer.fmt

to

sandbox://LargeProject/module/SAP/graph/DataValidation.grf
sandbox://LargeProject/module/SAP/graph/Loader.grf
sandbox://LargeProject/module/SAP/jobflow/Main.jbf
sandbox://LargeProject/module/SAP/meta/DimCustomer.fmt

Common project files can stay in the project root, which makes their shared nature obvious. For example, the metadata file sandbox://LargeProject/meta/MetadataSharedCrossSystems.fmt may be used across modules, so one should be careful when modifying it.
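The move from prefixed file names to module directories can be sketched as a small path-mapping helper. This is only an illustration under stated assumptions: the function name and the set of known modules are hypothetical, it expects the flat default layout (project, one directory, one file), and it only computes the new URL without moving anything.

```python
SCHEME = "sandbox://"


def to_module_path(url: str, known_modules: set[str]) -> str:
    """Map <dir>/PREFIX_Name.ext to module/PREFIX/<dir>/Name.ext.

    Files without a known module prefix are treated as shared and
    keep their location in the project root layout.
    """
    if not url.startswith(SCHEME):
        raise ValueError(f"not a sandbox URL: {url}")
    # Assumes exactly three path segments: project, directory, file name.
    project, directory, filename = url[len(SCHEME):].split("/")
    prefix, separator, rest = filename.partition("_")
    if not separator or prefix not in known_modules:
        return url
    return f"{SCHEME}{project}/module/{prefix}/{directory}/{rest}"


modules = {"SAP"}  # hypothetical list of modules in the project

print(to_module_path("sandbox://LargeProject/graph/SAP_Loader.grf", modules))
# prints sandbox://LargeProject/module/SAP/graph/Loader.grf

print(to_module_path("sandbox://LargeProject/meta/MetadataSharedCrossSystems.fmt", modules))
# prints sandbox://LargeProject/meta/MetadataSharedCrossSystems.fmt
```

Keeping the list of modules explicit avoids accidentally splitting shared files whose names merely contain an underscore.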

Last but not least, it all comes down to people (developers). You may have rigorous, perfectly balanced rules, but if developers do not adhere to them, the rules are useless.
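Rules that can be checked automatically are easier to adhere to. As a minimal sketch, the rule "always use parameters for passwords" could be checked as follows. It assumes that credentials appear in the job XML as password="..." attributes and that parameter references look like ${NAME}; the sample content is hypothetical, and this is a simple text scan, not a full parser.

```python
import re

# Assumption: credentials appear as password="..." attributes in the job XML.
PASSWORD_ATTR = re.compile(r'password\s*=\s*"([^"]*)"', re.IGNORECASE)
# Assumption: a parameterized value is a single ${NAME} reference.
PARAM_REF = re.compile(r"^\$\{[A-Za-z_][A-Za-z0-9_]*\}$")


def hard_coded_passwords(job_xml: str) -> list[int]:
    """Return 1-based line numbers that contain a literal password value."""
    hits = []
    for lineno, line in enumerate(job_xml.splitlines(), start=1):
        values = (m.group(1) for m in PASSWORD_ATTR.finditer(line))
        # Empty values are ignored; anything else must be a parameter reference.
        if any(v and not PARAM_REF.match(v) for v in values):
            hits.append(lineno)
    return hits


# Hypothetical graph fragment, for illustration only.
sample = """<Graph>
<Connection id="JDBC0" user="${DB_USER}" password="${DB_PASSWORD}"/>
<Connection id="JDBC1" user="etl" password="s3cret"/>
</Graph>"""

print(hard_coded_passwords(sample))
# prints [3]
```

A check like this could run before each commit or merge, so that the rule does not depend solely on everyone remembering it.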

Version control

Especially when working on a large project where multiple developers may be involved, it is crucial to keep files in a VCS. In the past, it has proved beneficial to have a separate branch for each (major) feature or client, and to merge them to master automatically for smoke testing in a CI environment (like Jenkins).

Even when your infrastructure does not yet support CI pipelines, having developers work on separate branches may be beneficial if they are working on the same part of a project. This way, they only need to reconcile changes to shared resources once (upon merge), and not throughout the development process, as would happen when working on a single branch.

Since CloverDX executable artefacts are simple text files, any VCS is compatible, so it does not matter whether your enterprise uses Git, Subversion, Perforce or another system.

Development environment

With small projects, the process is a fair bit simpler, as such projects can be implemented locally in CloverDX Designer. But even medium-sized projects often require Jobflows, which are strictly a Server feature. It is tempting to share one sandbox and collaborate in it, especially since the platform supports this approach and synchronizes any remote changes with Designer. Still, it usually causes more trouble than it is worth.

The reason is that, during development, some shared components may be adjusted, for example to allow debugging. One developer might disable a DatabaseWriter so that the database is not affected by their changes, while another relies on the data being uploaded because they are working on a reconciliation process. When multiple developers need to work in parallel, give each of them a dedicated sandbox. Where you use multiple branches for code development, this is even a prerequisite.

It is not uncommon to see the ACME_Ltd project appear in our sandboxes under names such as:

  • repcekb_ACME_Ltd
  • svecp_ACME_Ltd
  • ...

Of course, there are some occasions where sharing a single sandbox is valid, such as when developers are working on small, isolated features and resources.
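The per-developer naming shown above (for example repcekb_ACME_Ltd) can be generated rather than typed by hand. The following is a small sketch under the assumption that usernames are purely alphanumeric, so that the first underscore always separates the username from the project name; the function names are hypothetical.

```python
import re


def developer_sandbox(username: str, project: str) -> str:
    """Build a per-developer sandbox name such as 'repcekb_ACME_Ltd'."""
    # Assumption: usernames contain no underscore, so names stay parseable.
    if not re.fullmatch(r"[A-Za-z0-9]+", username):
        raise ValueError(f"unexpected username: {username!r}")
    return f"{username}_{project}"


def split_sandbox(name: str) -> tuple[str, str]:
    """Split a sandbox name back into (username, project)."""
    username, separator, project = name.partition("_")
    if not separator or not project:
        raise ValueError(f"not a per-developer sandbox name: {name!r}")
    return username, project


team = ["repcekb", "svecp"]
print([developer_sandbox(user, "ACME_Ltd") for user in team])
# prints ['repcekb_ACME_Ltd', 'svecp_ACME_Ltd']

print(split_sandbox("repcekb_ACME_Ltd"))
# prints ('repcekb', 'ACME_Ltd')
```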

Summary

We covered some basics that need to be decided and set up before development of any CloverDX project starts. In short: define standards and project structure, set up version control, and mirror the codebase for each developer.

Would you like to know more? Let me know in the comments.
