Major Checkpoints for Database Migration Testing

I had the opportunity to work on my very first data migration project a few days ago. The objective of our project was to ensure the correctness and completeness of data migrated from a source system with Oracle as its database to a target system with MS SQL as its database.  I learned a lot while working on this project. I got an opportunity to hone my SQL query writing skills and got well versed with the major checkpoints and activities performed during database migration testing. I wanted to share my learning’s and experience in this post.

23 million data sets had to be migrated from Oracle to SQL. In order to do this, we divided the activities into two phases:

Info1Our job was to test and verify the correctness and completeness of around 23 million data sets at the end of each activity. The real challenge was to perform this job manually without automation. What made it even more interesting was that there’s a chance of up to 5% data loss during data migration.

To achieve our goal, we divided our tasks into 4 major Phases:

Info2

The experience I gained while working on this project can be divided into 3 parts:

  1. Basic skills required for database migration testing
  2. Major checkpoints for database migration testing
  3. Major activities performed during database testing

Basic skills required for database migration testing

Other than testing skills required for all types of testing, one needs following skills for data migration testing

  1. SQL query writing for legacy and target database
  2. Knowledge of the data migration testing tool (if any) being used or advance knowledge of Excel features for data comparison

Major checkpoints for database migration testing, data profile matching

  1. Data profiling
  • Table count
  • Data type matching of columns
  • Identifying different classes of records from business point of view.
  • Performing Check Sum on column holding numerical data
  • Row count matching of legacy and target database
  • Column count matching of legacy and target database
  • Matching of total Null values per column of legacy and target database
  • Matching of Not Null count per column
  • Checking the Distinct count per column
  • Checking the Group By count per column
  • Matching of Summation of numeric fields
  • Checking the Min value per numeric column
  • Checking of Max value per numeric column
  • Matching of Average data value in numeric columns
  • Checking the biggest values in non-numeric columns
  • Checking the shortest values in non-numeric columns

2. Checking  data redundancy
3. Matching  control tables to verify the exact transfer of relationships among tables
4. Functional testing on migrated data
5. Random sampling- Picking and matching data from corresponding tables randomly.

Major activities performed during database testing

  1. Identification and matching of the number of tables in legacy and target databases.
  2. Identification and matching of the data types of columns in corresponding tables.
  3. Identification and matching of the relationship between tables in legacy database and target database.
  4. Identification and matching of business rules in legacy database.
  5. Identification of primary key attribute for each table
  6. Identification of columns with Not Null property
  7. Identification of numeric fields for data profiling
  8. Query writing for both databases to identify all data profiling checkpoints:
    1. Row count
    2. Column count
    3. Null values count per column
    4. Not Null count per column
    5. Distinct values count per column
    6. Group By count per column
    7. Summation of numeric fields
    8. Min value per numeric column
    9. Max value per numeric column
    10. Average value of numeric fields
    11. Biggest values in non-numeric columns
    12. Shortest values in non-numeric columns
  9. Matching the data profiling results of both databases
  10. Query writing for random sampling to ensure that data has been migrated correctly and completely
  11. Using migrated data with the related application to test its functional use

Migrating data is a challenging activity. Users expect it to be fast and efficient, without loss of data.  Thorough data migration testing has to be performed to reduce risk and ensure that data has been migrated as per requirements. I believe that using the points outlined above, you can refine your test approach and ensure a better data migration testing process.

Mayank Raj | Trainee Test Analyst | Zen Test Labs

Automating data migration testing

I had the opportunity to be a part of data migration project recently. I was involved in automated data migration testing, which I found it to be a very exciting and challenging form of testing.  I wanted to share my learning’s in this post.

During conversion or migration projects, data from the legacy or source systems is extracted, transformed and loaded into the target system. This is called as the ETL process. The data needs to be transformed as the data model of any two systems is different. As a standard practice the data transformations are managed in the data mapping document, which forms the basis for development as well as testing.

Testing the migrated data is usually a critical and challenging aspect. Testing the migrated data manually is a very time consuming process and so automated data validation is a good way to go ahead.

In my latest project, data from two source systems was migrated to target system. The data from the source systems’ UI was compared with the data from target system’s UI, since we did not have access to the database. Data migration was performed based on incremental loading approach to ensure cent per cent verification. The approach was to load small subsets of data every week for verification. This type of a process was a perfect solution to client’s challenges as in the event of any mismatch only that specific subset of data could be reversed.

I am also listing some of the key challenges we overcame during the course of the project

  1. We had to create scripts that could read source values and the use field level mapping rules to calculate the expected results at the destination. This had to be done because the mappings between the fields of the source system and the target system were different; i.e., both systems had their own structure
  2. We had to verify values at the target system as some extra fields were present in it leading to a mismatch with the source system
  3. We had to read the data on the target system to verify as some amount of data in the source system was in a CSV format with header changes for each customer column
  4. We also created a strong log generation mechanism that generated a result for every iteration. It also went onto ensure that when any mismatch occurs not only field name mismatches are captured but also values get captured
  5. The results also included the time taken to execute each record
  6. To counter the fact that most of the data migration was done in files of XML tab separated formats, we had to generate the input file for automation in excel format

We also went onto to create a customized data migration automation testing framework (illustrated below) to overcome these challenges which lead to a successful project.

Have any of you worked on such projects? Would love to hear some of your experiences.

Anand Gharge | Test Manager | Zen Test Labs