CloverDX Blog on Data Integration

How to Design Versatile Subgraphs Using Optional Ports

Written by Pavel Najvar | June 30, 2015

Optional ports, introduced in CloverDX 4.1, allow you to design generic and versatile subgraphs. A single such subgraph can replace several near-identical variations that differ only in which inputs and outputs they offer.

Example of a subgraph with optional ports

However, giving users such freedom of choice means you have to deal with numerous design challenges to handle the missing edge connections.

In this blog post, I will share three key concepts that will help you master the wizardry behind creating versatile subgraphs.

Demo Scenario

Sample subgraph with Optional ports

Let's assume we're building a subgraph called “Lean DataIntersection”: a “user friendly” version of the DataIntersection component that is forgiving in terms of what's connected to it (remember, the standard DataIntersection needs all ports connected) and that takes care of pre-sorting the inputs as a bonus (the two FastSort components).

Here are two sample scenarios showing how such a subgraph can be used:

Sample scenario 1: Notice only the middle output connected

Sample scenario 2: Using only one input connected

Setting Optional Ports

We want to set the second input and two output ports as optional so that users can freely use the “forgiving” component.

Setting up optional ports

You can choose between two modes of optional ports (right-click the port in the vertical bar, or in outline):

Optional port (edge receives zero records)

If we select this mode, the second FastSort will receive zero input records when no edge is connected. See the illustration below to view the results. Not exactly what we want, right?

Optional port set to receive zero records

Optional port (edge is removed)

Unfortunately, the second option is not much better at solving our dilemma either. In this case, instead of zero records, the edge would be removed completely at runtime and we would end up with a crippled subgraph illustrated below.

Removal of the edge caused error

Either way, merely setting ports to optional does not give you the results you'd expect. There's more to set than just the ports.
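To make the difference between the two modes concrete, here is a small toy model in Python. It is not CloverDX code: the function names and the mode strings are made up for illustration, and the stand-in component only mimics "a component whose input must be connected".

```python
from typing import List, Optional

Record = dict


def internal_edge(parent_records: Optional[List[Record]], mode: str) -> Optional[List[Record]]:
    """Toy model of what the subgraph sees on an optional input port.

    parent_records is None when the parent graph left the port unconnected.
    The mode names are hypothetical labels for the two options described above.
    """
    if parent_records is not None:
        return parent_records      # port connected: records flow as usual
    if mode == "zero_records":
        return []                  # the edge stays in place but carries nothing
    if mode == "edge_removed":
        return None                # the edge disappears from the subgraph
    raise ValueError(f"unknown mode: {mode}")


def sort_like(edge: Optional[List[Record]]) -> List[Record]:
    """Stand-in for a component that requires its input port to be connected."""
    if edge is None:
        raise RuntimeError("input port is not connected")
    return sorted(edge, key=lambda r: r["id"])
```

In this sketch, `sort_like(internal_edge(None, "zero_records"))` still runs and returns an empty list, while the `"edge_removed"` variant raises an error because the component is left without an input.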

Key Concept I: Dynamically Enabling Components

One key to versatile subgraphs is having CloverDX dynamically enable or disable the parts of the subgraph that are affected by the missing input. In this case we want to disable the second FastSort and DataIntersection completely whenever the optional input is not connected.

Dynamically enable/disable portions of a subgraph

To enable a component only if certain inputs or outputs are connected, right-click the component and go to its “Enable” menu. In our case we're using “When Input Port 1 Is Connected”.

Dynamically enable/disable component based on a condition

Instant indication of dynamically disabled component

The “?” icon indicates the component is set to enable/disable dynamically.

NOTE
Wondering why the top FastSort is also dynamically disabled? There's no optional port, so that part will always work, right? Well, yes. Leaving it always enabled would work just fine, but in the “one input port SimpleCopy” mode it would sort all the data for no real purpose, and a good design avoids such costly operations whenever possible.
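The idea of conditional enabling can be sketched in a few lines of Python. This is only a toy model, not how CloverDX implements it: the `Component` class and `active_components` helper are hypothetical, and the sketch assumes all three components use the same “When Input Port 1 Is Connected” condition.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Component:
    name: str
    # Condition evaluated against the set of connected subgraph input ports.
    enabled_when: Callable[[Set[int]], bool]


def port1_connected(connected_inputs: Set[int]) -> bool:
    """Toy version of the 'When Input Port 1 Is Connected' condition."""
    return 1 in connected_inputs


def active_components(components: List[Component], connected_inputs: Set[int]) -> List[str]:
    """Return the names of components that stay enabled for this usage."""
    return [c.name for c in components if c.enabled_when(connected_inputs)]


# Assumed layout of the sample subgraph: two sorters feeding DataIntersection.
subgraph = [
    Component("FastSort (top)", port1_connected),
    Component("FastSort (bottom)", port1_connected),
    Component("DataIntersection", port1_connected),
]
```

Calling `active_components(subgraph, {0, 1})` keeps all three components, while `active_components(subgraph, {0})` leaves none enabled, so the data would simply pass through.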

Key Concept II: Component Pass-through

User defined pass-through

When CloverDX disables a component, it needs to know how to bypass it. A disabled component is always replaced by a single edge, and often the route is obvious (e.g. disabling a sorter simply creates a “short-circuit”).

For more complicated components with multiple inputs and outputs, such as DataIntersection or subgraphs, you need to tell CloverDX which ports to connect.

Setting pass-through for components with multiple inputs and outputs

You can set pass-through for any component or subgraph by going to Edit and scrolling all the way down to Common properties and setting Pass Through Input Port and Pass Through Output Port.

Why do I need to set pass-through?

If we didn't set the DataIntersection pass-through properly, CloverDX would by default connect the first input port with the first output port (Port 0 > Port 0), which is not what we want.

However, you can typically ignore pass-through as the default behavior mostly works just fine.
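As a rough illustration of the pass-through idea, here is a toy Python model. It is not CloverDX's API: the class and field names merely echo the Pass Through Input Port and Pass Through Output Port properties, and the non-default port numbers used in the test values are arbitrary examples, not the settings from the sample subgraph.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class DisabledComponent:
    name: str
    # Both default to port 0, mirroring the default Port 0 > Port 0 behavior.
    pass_through_input: int = 0
    pass_through_output: int = 0


def bypass_edge(component: DisabledComponent) -> Tuple[str, str]:
    """Return the single edge that replaces a disabled component."""
    source = f"{component.name}:in{component.pass_through_input}"
    target = f"{component.name}:out{component.pass_through_output}"
    return (source, target)
```

A sorter with one input and one output needs nothing beyond the defaults, whereas a multi-port component would set both fields explicitly, for example `DisabledComponent("DataIntersection", pass_through_input=0, pass_through_output=1)` with port numbers chosen purely for illustration.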

Key Concept III: Metadata Propagation

Versatile generic subgraphs tend to depend heavily on metadata propagation rather than having everything predefined. For example, our Lean DataIntersection has no internal metadata whatsoever; everything is driven by what the parent graph provides by connecting edges with metadata to it.

This is where you can easily fall into a trap. Do not rely on metadata propagating from ports that are set as optional. Keep in mind that in some use cases there will be no edge connected, and thus no metadata!

The solution is to either use manually assigned metadata or use edge metadata (Select Metadata from Another Edge) so that your metadata is propagated from edges that are guaranteed to receive metadata at all times.
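The trap and its fix can be shown with a small Python toy model. Nothing here is CloverDX code: the edge names, the `resolve_metadata` helper and the idea of representing metadata as a list of field names are all assumptions made for the sketch.

```python
from typing import Dict, List, Optional

Metadata = List[str]  # toy representation: just the field names


def resolve_metadata(edge: str,
                     propagated: Dict[str, Optional[Metadata]],
                     borrow_from: Dict[str, str]) -> Optional[Metadata]:
    """Return the metadata an internal edge ends up with.

    propagated:  metadata arriving from the parent graph per input edge
                 (None when the port was left unconnected).
    borrow_from: internal edges configured to take metadata from another edge.
    """
    source = borrow_from.get(edge, edge)
    return propagated.get(source)


# Hypothetical usage: input0 is required, input1 is optional and unconnected.
propagated = {"input0": ["id", "name"], "input1": None}

fragile = resolve_metadata("sorted1", propagated, {"sorted1": "input1"})
robust = resolve_metadata("sorted1", propagated, {"sorted1": "input0"})
```

Here `fragile` ends up as `None` because it depends on the optional port, while `robust` gets `["id", "name"]` from the edge that is always connected.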

Dynamically disabling components connected to optional ports and setting correct pass-through will save you a lot of trouble with metadata propagation. In fact, if you do everything else right, you likely won't have problems with metadata propagation at all!

Conclusion

These three basic concepts will help you change the way you design versatile subgraphs. You can read about more advanced use cases for optional ports in our follow-up blog. Remember to watch for the final version of CloverDX 4.1!