Learn How to Manage CouchDB with CloverDX

<span id=Learn How to Manage CouchDB with CloverDX" />

NoSQL databases are fast and used for high volumes of data and documents – and they are today's rising stars. When a customer asked us for help to manage CouchDB, we decided it would be helpful to others if we shared some tips in a blog post.

Apache’s CouchDB is an open source NoSQL document-oriented database, an alternative to the well known MongoDB. It's free and pretty functional, and because CloverDX can read and write JSON and has an HTTPconnector, we can work with and manage it pretty easily.

About Apache CouchDB

A brief introduction from the Apache CouchDB documentation describes the interface method and the way it stores data:

The CouchDB API is the primary method of interfacing to a CouchDB instance. Requests are made using HTTP and requests are used to request information from the database, store new data, and perform views and formatting of the information stored within the documents.

For nearly all operations, the submitted data, and the returned data structure, is defined within a JavaScript Object Notation (JSON) object.

CloverDX can actually both read and write JSON, and then use CloverDX HTTP connectors to control the database. The CouchDB returns HTTP Status Codes as a response to any operation, which can be used for later processing in CloverDX. Read more about CouchDB API in the documentation: http://docs.couchdb.org/en/latest/api/basics.html#errors.

Demo Project

Download sample project here: CouchDB_project.zip

Manage CouchDB Databases

In this example, we first delete the database created for demo purposes so we can show how CloverDX can easily control the CouchDB database via HTTPConnector using standard request methods. More information about this is available in CloverDX HTTPConnector documentation: http://doc.CloverDX.com/documentation/UserGuide/topic/com.CloverDX.gui.docs/docs/httpconnector.html

As you can see in the picture below, we delete the database with the name white_house, using the request method DELETE. The response is saved in the output file delete_db.out.

Manage CouchDB with CloverDX: Manage CouchDB

Right after, we create a new database for each different insert method using the request method PUT. This time, we get the output on an edge using standard CloverDX Output mapping.

Manage CouchDB with CloverDX

JSON File Creation in CloverDX

There are a couple of ways to insert JSON documents into CouchDB. The following example demonstrates these three different ones:

  • Single insert from CloverDX to the DB
  • Bulk insert directly from CloverDX to the DB
  • JSON file upload to the DB

The sample data used in the example is provided in a CSV file and sent to the JSONWriter component for generating the final JSON file. The JSONWriter component uses specific mapping.

Further information about the component can be found in CloverDX documentation: http://doc.CloverDX.com/documentation/UserGuide/topic/com.CloverDX.gui.docs/docs/jsonwriter.html

In the following steps, we’ll be using previously created JSON files in the format described below.

JSON file for Single Record Inserting Method

The “_id” field is unique ID field which can be provided on insert or is generated by CouchDB.

{
"_id":"1",
"NAMELAST":"LAST_NAME",
"NAMEFIRST":"FIRST_NAME",
.
.
.
"CALLER_NAME_LAST":"CALLER_LAST_NAME",
"CALLER_NAME_FIRST":"CALLER_FIRST_NAME",
"Description":"DESCRIPTION STRING",
"RELEASE_DATE":"MM/DD/YYYY\r"
}

JSON file for Bulk Insert Method

The bulk method requires a set of single documents to be wrapped in an array named “docs” as shown below:

More information about HTTP Bulk Document API is available here: http://wiki.apache.org/couchdb/HTTP_Bulk_Document_API

{
"docs":[
{
"_id":"1",
"NAMELAST":"LAST_NAME",
"NAMEFIRST":"FIRST_NAME",
.
.
.
"CALLER_NAME_LAST":"CALLER_LAST_NAME",
"CALLER_NAME_FIRST":"CALLER_FIRST_NAME",
"Description":"DESCRIPTION STRING",
"RELEASE_DATE":"MM/DD/YYYY\r"
},
{
"_id":"14208",
"NAMELAST":"LAST_NAME",
"NAMEFIRST":"FIRST_NAME",
.
.
.
"CALLER_NAME_LAST":"CALLER_LAST_NAME",
"CALLER_NAME_FIRST":"CALLER_FIRST_NAME",
"Description":"DESCRIPTION STRING",
"RELEASE_DATE":"MM/DD/YYYY\r"
}
]
}

JSON file Upload to the DB

The format of the JSON file used in the third method (uploading a JSON file into the DB) is the same as in bulk method, only saved in a local file.

Creating the JSON file

The following picture shows the part of the graph where the UniversalDataReader component reads the CSV file with data and copies it to three different edges each for the JSON file types described above. Two of them are processed further later, while the last one is saved to the local file.

Manage CouchDB with CloverDX

Inserting and Uploading JSON into CouchDB

For all of the methods we’ve mentioned, the Additional HTTP Headers property needs to be set to: Content-Type=application/json.

Read more about the HTTP headers: http://docs.couchdb.org/en/latest/api/basics.html#http-headers

The first used is actually the simplest one – inserting JSON documents one by one using POST request method. However, it’s also the slowest, due to multiple requests on the DB (for each single record).

Manage CouchDB with CloverDX

CouchDB provides bulk_docs API for inserting multiple documents using a single JSON docs array.

The bulk_docs API is called using URL suffix: _bulk_docs – i.e.: http://URL/DB_NAME/_bulk_docs

Read more about this here: http://docs.couchdb.org/en/latest/api/database/bulk-api.html?highlight=bulk#db-bulk-docs

Manage CouchDB with CloverDX

The third method is basically the same as the bulk method, only it uses the local file instead of data directly from the edge with defined Input file URL in CloverDX HTTPConnector.

Manage CouchDB with CloverDX

Reading from CouchDB

Reading an Entire Database using Temp View

Reading from a database is performed using CloverDX HTTPConnector.

In the case of reading the whole database, API method _temp_view needs to be called using POST and with Request Content specified in CloverDX HTTPConnector component i.e.: { "map" : "function(doc) { emit(doc._id, doc); }"}

Read more about CouchDB views and mapping function: http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views

Manage CouchDB with CloverDX

Reading a Single Document

The CloverDX HTTPConnector reads a single record from the database using the GET method and a defined ID suffix in URL – i.e. http://URL/DB_NAME/ID

Manage CouchDB with CloverDX

JSON Extract

After getting the JSON file from CouchDB, extract is desired for further data manipulation in CloverDX. For this, CloverDX has a special component – JSONExtract. Check out more on JSONExtract in the CloverDX documentation:
http://doc.CloverDX.com/documentation/UserGuide/topic/com.CloverDX.gui.docs/docs/jsonextract.html

Posted on May 20, 2014
Try CloverDX for 45-days  Full access to Tech Support as if you were a customer

Where to go next