AI in CloverDX

AI-powered data transformation in CloverDX

Our philosophy at CloverDX has always been to give our users maximum control and customizability, and to build features that help you work smarter and more productively - the integration of AI into CloverDX is no different.

AI in CDX

AI features in CloverDX

Transform data with the power of AI and NLP in CloverDX

PRIVACY AND CONTROL

Locally hosted AI text classification

Perform data classification and anonymization tasks - such as sentiment analysis or PII identification - using locally hosted NLP models. 

Plug in your own model, or grab one from the CloverDX Marketplace, and get full control over your data. 

Benefits

  • Secure and private: Using locally-hosted models means data never leaves your CloverDX Server, so you don’t have any extra worries about security or governance.

Need to know

  • Runs on your AI-capable infrastructure: For maximum performance we recommend hosting CloverDX Server on infrastructure equipped with NVIDIA GPU(s)
THE POWER OF LLMS

Transform data with OpenAI

A new OpenAI component for CloverDX enables you to integrate ChatGPT to your data transformations, with the ability to handle back-and-forth prompt chaining chat to process and refine responses.

Benefits

  • Stay in control: You’re always in full control over what data you send to OpenAI.
  • Bring your own key: Use your own OpenAI key and select which OpenAI model you want to use.

Need to know

  • Cost and governance: Keep in mind when using OpenAI integration, data might be leaving the CloverDX platform to a 3rd party service. 
Locally hosted AI

Privacy-first AI data transformation with local-run models

Four new components available now to use in your CloverDX Designer workflows.

These components are essentially wrappers around models - you plug in various machine learning models to implement different use cases.

Models run locally – either on your own hardware or in your own cloud environment, so your data stays in-house, for complete privacy and governance.

You can download models from the CloverDX Marketplace, or use your own. 

Data classification components

3 components that you can use for various data classification use cases. The use case is defined by the model you insert to the component.

  • AITextClassifier: scores input text field(s) against pre-trained set of classes.
  • AIZeroShotClassifier: allows you to define your own classes, and scores input text against them.
  • AITokenClassifier: breaks input text into sub-word units (tokens) and scores them against a pre-trained set of classes.

Anonymization component

Mask data identified by the model, without needing to send your data to a 3rd party for anonymization.

  • AIAnonymizer: allows you to run a token classification model, e.g. to identify PII, and then masks the identified tokens in the output.
.
illustration-cdx7--data-analyze@2x
All the power of OpenAI

Data transformation using OpenAI/ChatGPT

The CloverDX OpenAI component allows you to use OpenAI LLMs in your data workflows, with customizable response processing, for power and flexibility.

Define your custom prompt and the data you want to send, and then react to the LLM response – essentially allowing you to have a back-and-forth conversation with the LLM.

For example: if you ask for a JSON response from the LLM, you can detect if the response is valid, and if not, send instructions to OpenAI to fix the error.

  • OpenAIClient: compose and send queries to ChatGPT, and process the response
illustration-cdx7--platform@2x

Note: Incubation features in CloverDX

All the components listed here are currently in Incubation

What does Incubation mean?

Incubation features in CloverDX are tested, supported and available for use, but they’re under active development and will likely see changes.

AI in CloverDX: Frequently asked questions

Here are some questions we've been asked about data transformation with AI in CloverDX. If you have questions that aren't listed here, we're always happy to answer them - just get in touch.

You can choose. If you use the OpenAI component, then you define exactly what data you want to send to AI for analysis or transformation.

But if you don't want to send data to OpenAI (or any other 3rd party), you can run AI models locally, so your data all stays in-house.

 

AI features in CloverDX don't have any extra cost, and you don't need any extra licences to use them.

If you use the OpenAIClient component, you'll need your own OpenAI account, and you'll pay via that as you usually would for any other OpenAI processing.

 

Using any AI features in CloverDX is completely optional - no data is shared with 3rd parties unless you choose to send data to OpenAI with the OpenAI component.

You can still benefit from AI capabilities by using the locally-hosted AI models for specific data classification and anonymization tasks. Because these models run completely in your environment, data stays under your control.

Locally-hosted AI components allow you to 'plug in' AI models to perform specific tasks, such as data classification. These models live on your hardware and run locally - no data is sent to any 3rd parties.

The OpenAI component does use a 3rd party (OpenAI) to process data. The advantage of this approach is you get the full power and flexibility of OpenAI, and you don't need your own powerful hardware to run queries. The CloverDX OpenAI component allows you to always specify exactly what data you send to OpenAI.

No - the AI models available through the CloverDX Marketplace (designed to be plugged into the locally-run AI components) are publicly available models that have been adapted to work with CloverDX by wrapping into a CloverDX library.

Information on each model, including its source and license information, is available on each model's detail panel.

No, we're not developing any models ourselves right now.

But if you have an existing model of your own, we can help you wrap it to make it compatible with CloverDX so you can use it as part of your CloverDX data workflows. 

If you want to talk to us about this, just get in touch.

You can either download an optimized AI model from the CloverDX Marketplace, get your own models (e.g. from HuggingFace, GitHub, etc.), or use your own built and trained models.

Models can then be plugged into the CloverDX locally-run AI components to run on your own hardware. 

Running AI models locally can be slow and resource intensive (CPU variants). But you can use NVIDIA GPU for maximum performance.

To make this easier, we created a new Docker image available on DockerHub.

This image is pre configured with all the dependencies so that you can take advantage of GPU acceleration for your machine learning workloads.

The image can be deployed on any machine where all software and hardware requirements are met. The machine requires NVIDIA GPU with properly configured drivers, CUDA, container toolkit and more.

The easiest way to use this image is to run it on AWS in AWS Deep Learning Base GPU AMI (Ubuntu 24.04) which runs on AWS EC2 G4dn or AWS EC2 G5 instances.

Later this year we're planning on releasing the Clover Assistant, to help boost your productivity when building your data workflows. 

Sign up to our product information mailing list to be the first to hear about new feature releases, and get invites to our live release walkthroughs with our VP Product.

Be first to see what’s new

Sign up to our product info mailing list to be notified of all the new features, and get invites to our first-look live release walk-throughs.

Stay up to date with CloverDX