I had the opportunity to be a part of a data migration project recently. I was involved in automated data migration testing, which I found to be a very exciting and challenging form of testing. I wanted to share my learnings in this post.
During conversion or migration projects, data from the legacy or source systems is extracted, transformed and loaded into the target system. This is called the ETL process. The data needs to be transformed because no two systems share the same data model. As a standard practice, the data transformations are managed in a data mapping document, which forms the basis for development as well as testing.
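To make the idea concrete, here is a minimal Python sketch of applying field-level rules from a data mapping document to a source record. The field names and transformation rules are hypothetical, not from the actual project:

```python
# A minimal sketch: apply field-level mapping rules (as captured in a
# data mapping document) to a source record to produce the expected
# target record. All field names and rules here are hypothetical.

def transform(source_record, mapping_rules):
    """Apply each mapping rule to build the expected target record."""
    target_record = {}
    for target_field, rule in mapping_rules.items():
        target_record[target_field] = rule(source_record)
    return target_record

# Hypothetical rules: rename a field, concatenate two fields,
# and normalize a date from dd/mm/yyyy to yyyy-mm-dd.
mapping_rules = {
    "customer_id": lambda r: r["cust_no"],
    "full_name":   lambda r: f'{r["first_name"]} {r["last_name"]}',
    "join_date":   lambda r: "-".join(reversed(r["doj"].split("/"))),
}

source = {"cust_no": "C001", "first_name": "Asha",
          "last_name": "Rao", "doj": "15/08/2010"}
print(transform(source, mapping_rules))
```

Keeping each rule as a small, testable function mirrors the mapping document row by row, so the automation and the document stay in sync.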
Testing the migrated data is usually a critical and challenging aspect. Doing it manually is very time-consuming, so automated data validation is a good way forward.
In my latest project, data from two source systems was migrated to a target system. Since we did not have access to the database, data from the source systems' UI was compared with data from the target system's UI. Data migration was performed using an incremental loading approach to ensure 100% verification: small subsets of data were loaded every week for verification. This process was a perfect fit for the client's challenges, as in the event of any mismatch only that specific subset of data had to be reversed.
Here are some of the key challenges we overcame during the course of the project:
- We had to create scripts that could read source values and then use field-level mapping rules to calculate the expected results at the destination. This was necessary because the source and target systems had different structures, so their fields did not map one-to-one
- We had to verify values at the target system, as it contained some extra fields that led to mismatches with the source system
- We had to read the data on the target system for verification, as some of the source data was in CSV format with headers that changed for each customer column
- We also created a robust log generation mechanism that generated a result for every iteration. It ensured that when any mismatch occurred, not only the mismatched field names but also their values were captured
- The results also included the time taken to execute each record
- Since most of the data migration was done using XML and tab-separated files, we had to generate the input files for automation in Excel format
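The validation-and-logging loop described in the points above can be sketched as follows. This is a simplified illustration, not the project's actual framework; the field names, rules, and log layout are all hypothetical:

```python
# A minimal sketch of the validation loop: compute expected target values
# from field-level mapping rules, compare them with the actual target
# values, log field-name AND value mismatches, and record the time taken
# per record. All names here are hypothetical.
import time

def validate_record(source_record, actual_target, mapping_rules, log):
    start = time.perf_counter()
    mismatches = []
    for field, rule in mapping_rules.items():
        expected = rule(source_record)
        actual = actual_target.get(field)
        if expected != actual:
            # Capture the field name together with both values.
            mismatches.append((field, expected, actual))
    elapsed = time.perf_counter() - start
    log.append({
        "status": "PASS" if not mismatches else "FAIL",
        "mismatches": mismatches,
        "seconds": elapsed,  # time taken to validate this record
    })
    return not mismatches

log = []
rules = {"full_name": lambda r: f'{r["first"]} {r["last"]}'}
ok = validate_record({"first": "Asha", "last": "Rao"},
                     {"full_name": "Asha R."}, rules, log)
print(ok, log[0]["status"], log[0]["mismatches"])
```

Logging the values alongside the field names, as the loop does here, is what makes a mismatch actionable: the tester can immediately see whether the defect lies in the mapping rule, the source data, or the load.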
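For the CSV sources whose headers changed per customer, one workable approach is an alias table that maps each customer's header names onto canonical field names before comparison. The aliases below are hypothetical, for illustration only:

```python
# A minimal sketch for reading source CSV data whose header row differs
# per customer file: a hypothetical alias table normalizes each header
# to a canonical field name before validation.
import csv
import io

HEADER_ALIASES = {
    "CustNo": "customer_id", "Customer No": "customer_id",
    "Name": "full_name", "Customer Name": "full_name",
}

def read_source_csv(text):
    """Yield rows with headers normalized to canonical field names."""
    reader = csv.DictReader(io.StringIO(text))
    for row in reader:
        yield {HEADER_ALIASES.get(k, k): v for k, v in row.items()}

sample = "Customer No,Customer Name\nC001,Asha Rao\n"
print(list(read_source_csv(sample)))
```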
We also created a customized data migration automation testing framework (illustrated below) to overcome these challenges, which led to a successful project.
Have any of you worked on such projects? Would love to hear some of your experiences.
Anand Gharge | Test Manager | Zen Test Labs