Data Migration

Data migration can be a daunting prospect, despite being a very common process for organizations to undertake. Few people look forward to it, but it is usually a necessary part of a transition to something better.

This guide offers an overview of what a data migration is, some examples of when you might need to do one, and some tips for planning a successful project.

What is a Data Migration?

Data migration is the process of moving data from one place to another - typically between applications, storage systems or databases. Only one data set is active at any time: the source before the migration and the target after it. This contrasts with data integration, where data is active in two or more places at once.

Data migrations are one of the most common data processes and almost all companies will eventually undertake one. There can be many reasons for data migrations, including application replacements or upgrades, business process changes, data volume growth and performance requirements.

The data of course needs to be moved from one place to another, but it also needs to be made fit for purpose in the new system. This often requires data validation, correcting problems in the source or during transportation, converting data formats, and more complex data transformations such as merging values or calculating new ones.
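
As a rough illustration of what "fit for purpose" can involve, the sketch below validates a single record, converts a date format, merges two values and calculates a new one. It is a minimal example under stated assumptions: the field names (`first_name`, `signup_date`, `unit_price` and so on), the DD/MM/YYYY source date format and the `transform` function itself are hypothetical, not taken from any particular system.

```python
from datetime import datetime


def transform(record):
    """Validate one source record and reshape it for a hypothetical target.

    Returns (target_record, errors); target_record is None if validation fails.
    """
    errors = []

    # Validation: a very loose email check, purely for illustration.
    email = record.get("email")
    if not email or "@" not in email:
        errors.append("invalid email")

    # Format conversion: assumed DD/MM/YYYY in the source, ISO 8601 in the target.
    try:
        signup = datetime.strptime(record["signup_date"], "%d/%m/%Y").date().isoformat()
    except (KeyError, ValueError):
        errors.append("invalid signup_date")
        signup = None

    if errors:
        return None, errors

    target = {
        # Merging values: two source fields become one target field.
        "full_name": f"{record['first_name'].strip()} {record['last_name'].strip()}",
        "email": email.lower(),
        "signup_date": signup,
        # Calculating a new value from existing ones.
        "order_total": round(record["unit_price"] * record["quantity"], 2),
    }
    return target, []
```

Records that fail validation are returned with their list of errors rather than silently dropped, so they can be corrected in the source or routed for manual review.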


Data Migration Approaches

There are three main approaches when it comes to data migration:

  1. Merge the two systems into one brand new one.
  2. Migrate one of the systems to the other one.
  3. Leave the systems as they are but create a common view on top of them - a data warehouse.

Whichever approach you choose, there are several different methodologies for undertaking a data migration.

Data Migration Techniques

Big Bang Data Migration

A 'big bang' data migration is when you migrate all data in one operation. This may take a while, but for users there is a single point in time where they can no longer use the old data and the new system goes live. From their point of view, the change was made in a single 'big bang' event.

Big bang migrations typically have significant preparation periods and short periods of downtime, during which the system is unavailable. The ideal big bang migration has zero downtime, but you can't always guarantee this.

The overall process can be visualized like this:

Figure: Big bang data migration process

  1. Design phase
    This is where you plan your project scope and goals, analyze your data samples and create your data migration plan. Based on what you discover, you can propose an architecture, as well as a schedule (and a budget).
  2. Development and testing phase
    This is where you implement the proposed architecture, and where most of the time is generally spent: developing and testing the migration tools (whether you’re using traditional programming languages or more specialized tools). Testing is vital, and is usually done on samples of your data. No data is migrated during this phase.
  3. The big bang
    This is where all your data actually gets migrated. It usually requires downtime of source and target systems to ensure data consistency.
  4. User acceptance testing (UAT)
    Users verify the migration result. If everything is OK, the source system can be turned off.

Only after data owners and all other stakeholders confirm that the migration was successful can the whole process be considered complete.
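
Alongside user acceptance testing, an automated reconciliation check can give an early signal that records were not lost or altered in transit. The sketch below is one possible approach, not a replacement for sign-off by data owners. The function names `fingerprint` and `reconcile` are hypothetical, and it assumes both data sets fit in memory and that the compared fields are stored in the same form in source and target.

```python
import hashlib


def fingerprint(rows, fields):
    """Return (row_count, digest) for rows, independent of row order."""
    row_digests = sorted(
        hashlib.sha256(
            "|".join(str(row[f]) for f in fields).encode("utf-8")
        ).hexdigest()
        for row in rows
    )
    # Hash the sorted per-row digests so the result ignores row order.
    combined = hashlib.sha256("".join(row_digests).encode("utf-8")).hexdigest()
    return len(row_digests), combined


def reconcile(source_rows, target_rows, fields):
    """True if both sides have the same row count and the same content."""
    return fingerprint(source_rows, fields) == fingerprint(target_rows, fields)
```

If the migration deliberately transforms values, the comparison would need to run on the transformed form of the source data rather than the raw source.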

The fact that the migration itself only happens toward the end of the process is both a benefit and a drawback of a big bang migration. It can be beneficial, as users do not need to think about two different systems simultaneously and switch between two live systems.

On the other hand, migrating the data so late in the process increases the burden on the planning, development and testing phases, and shortcomings in these areas can lead to expensive failures.

It's important to remember, however, that a data migration is never a one-time thing.

Trickle Data Migration

A trickle migration can be likened to an agile approach to a data migration, breaking the migration down into many smaller sub-migrations, each with its own set of goals, data, deadlines and scope.

This approach allows stakeholders to verify the success of each individual phase, giving a stepwise indication of progress. Should any of the sub-processes fail, it is usually necessary to re-run only the failed process, and lessons learned from that failure can be applied to subsequent runs.
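
The idea of re-running only the failed sub-migration can be sketched as a small phase runner. This is a simplified illustration under stated assumptions: `run_phases` is a hypothetical helper, each phase is assumed to be a plain callable that raises an exception on failure, and a real project would persist the set of completed phases rather than keep it in memory.

```python
def run_phases(phases, completed=None):
    """Run each sub-migration that has not already completed.

    phases: list of (name, callable) pairs, run in order.
    completed: names of phases that succeeded in an earlier run.
    Returns (completed_names, failures) where failures maps name -> error text.
    """
    completed = set(completed or ())
    failures = {}
    for name, migrate in phases:
        if name in completed:
            continue  # already migrated and verified; do not repeat it
        try:
            migrate()
            completed.add(name)
        except Exception as exc:
            # Record the failure and carry on with the remaining phases.
            failures[name] = str(exc)
    return completed, failures
```

Passing the returned set of completed names back into the next call means only the failed phases run again. This sketch treats phases as independent; if one phase depends on another, the runner would need to stop or skip dependants after a failure.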

Trickle migrations, however, may require more complex planning. They also place a higher burden on users of the data since they have to keep working with two systems while the ongoing overall migration takes place.


Big Bang vs. Trickle Data Migrations

Each methodology has its own pros and cons. Regardless of the methodology, migrations can be expensive, especially if badly planned or executed.

Advantages of a big bang data migration:

  • Generally less costly: there’s less training required, and less managing of parallel systems
  • It can be less complex: there’s no need to consider parallel systems
  • All changes happen only once, in a relatively short space of time, so there’s a single defined cut-off point

Disadvantages of a big bang data migration:

  • A high risk of expensive failure: unexpected trouble, coupled with a lack of agile project management, means problems may only be discovered after the migration (when it can be too late to fix them)
  • If the migration fails, a complete rollback is usually required
  • Requires downtime: depending on the organization, and the systems you’re using, downtime might not be an option

Advantages of a trickle data migration:

  • Less prone to expensive surprises due to chunking of work and more frequent runs
  • Zero downtime required as the migration is incremental
  • If a single phase fails, it’s only that phase that needs to be rolled back and repeated

Disadvantages of a trickle data migration:

  • More expensive - you need to maintain multiple live environments
  • Needs effort to keep two systems running - not just for technical staff, but also for end users
  • It can be trickier to pull off due to complicated syncing issues

Choosing the Right Data Migration Methodology

The differences between a trickle and a big bang migration are considerable, so the choice between them should be made very early on. The decision is often driven by several key questions:

  • What is the migration deadline?
  • Can the system (or systems) experience downtime?
  • Do you fully understand your data so you can plan the whole process end to end?

There is no simple rule. However, big bang migrations are normally selected where the scope is well defined from the outset and where deadlines or other project constraints mandate it.

Conversely, trickle migrations are beneficial when the migration can be easily split into several different stages. They are also suitable when the scope is hard to define. In such cases, the phased approach of a trickle migration allows you to migrate “easier” data first while dealing with the more complex processes later.

Another major consideration is the experience of your team. Each migration can pose different technical and project management challenges. Ensuring you have the most suitable people with the most relevant experience on your team can play a significant role in the final outcome. Teams that prefer an agile approach typically favor trickle migrations and, conversely, teams that are more used to a waterfall methodology tend to favor the big bang approach.

Planning a Data Migration

The importance of good planning for your data migration can’t be overstated. A bad plan can result in project failure - and a data migration failure can often jeopardize bigger business initiatives, not to mention leading to a loss of trust and a large bill.

However, a data migration is often a chance for businesses to dive deeply into and rethink their data. Investing time and effort into the planning process can not only mean a more successful migration project, but can pay off in more efficient systems and greater business value.

Defining Scope

  • Clearly define your goals - what you want to do and, more importantly, what success looks like. Don’t assume you always have to move everything. In general, your main planning rule should be to find the smallest possible subset of the source data (smallest in terms of complexity rather than number of records) that will get you to your goal.

Estimating Effort

  • This requires looking at many things, including (but certainly not limited to) the volume, age, complexity and value of your data, the timing of the migration (and any other projects that depend on it), and the approach you’re taking. Automation can significantly speed up delivery.

Finding the Right People

  • It’s not just an IT project. The business users are the ones who understand the data, and they need to be involved.

Contracting External Experts

  • If you want external help, getting an expert in both your systems is ideal, but someone who at least understands your target system is valuable (after all, you probably already know a lot about your own source system). Take time evaluating potential partners, and make sure that they have not just the technical knowledge, but also the approach and manner you want to work with. Will they go the extra mile for you?

Finding the Right Way

  • Build a scalable, repeatable process from the outset. Believe us, it will save you time, effort and frustration later on.
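
One way to make a load step repeatable is to make it idempotent, so that re-running it after a partial failure does not create duplicates. The sketch below illustrates the idea with an in-memory dictionary standing in for the target system. The `load_batch` function and the `id` key field are hypothetical, and a real target would typically use its own upsert or merge mechanism instead.

```python
def load_batch(target, records, key="id"):
    """Upsert records into target (a dict keyed by record id).

    Safe to re-run: loading the same batch twice leaves target unchanged.
    Returns (inserted, updated) counts for this run.
    """
    inserted = updated = 0
    for record in records:
        record_key = record[key]
        if record_key not in target:
            inserted += 1
        elif target[record_key] != record:
            updated += 1
        # Store a copy so later changes to the input do not affect the target.
        target[record_key] = dict(record)
    return inserted, updated
```

Running the same batch a second time reports no inserts and no updates, which also makes the counts a cheap check that a re-run had no unintended effect.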

Data Migration Process

A data migration involves a number of stages:

  1. Defining the scope
  2. Finding teams & resources
  3. Planning
  4. Data discovery
  5. Budget
  6. Milestone start
  7. Implementation
  8. Reiteration
  9. Project sign-off
  10. Go-live
  11. Ongoing data integration
  12. Contingency

Each stage has its own considerations and needs to be planned properly to deliver a successful project. Read more details in the post: 13 Stages of a Successful Data Migration

Data Migration Examples

Typically, a data migration takes place during an upgrade of existing hardware or a move to a completely new system, and is driven by factors such as application replacements, business process changes, data volume growth, or the need for better performance.

Some examples of when a data migration might be needed:

    1. Mergers and acquisitions: the need to unify completely separate worlds when companies combine. IT stacks and data usually need to be joined to provide one unified system for the new entity.
    2. Modernizing for performance: when a current system is struggling to keep up with performance requirements. For example, running two legacy systems can create problems with performance, duplicate data and compatibility. One company saw operating costs reduced by over 40 percent when they moved to a new ERP system. 27 million records needed to be migrated from the incumbent systems to the new one, but the project delivered huge efficiency gains.
    3. Moving to the cloud: this is also often the driver for a data migration. It can be complex, as it usually means migrating multiple applications. Even with cloud providers’ migration tools, custom development is often required.
    4. Moving a data warehouse from one database to another: migrating data like this can be more complicated than it first appears, because the move is often not a 1:1 copy of the data. The migration often involves data cleansing exercises, or more complicated changes if the target database behaves differently from the source (e.g. if the target is a columnar database while the source is a traditional row-oriented database, the data may need to be transformed before it is loaded).

Data Migration with CloverDX

Read more about how the CloverDX Data Integration Platform can help with data migration projects - reducing time to delivery, automating complex processes, handling errors and managing data at scale.

