Verifying 800 million data values in record time!

I was recently fortunate to be part of a unique project at Zen Test Labs. This was a post-merger scenario in which the acquirer (a bank) had to consolidate the customer information systems of the two banks into a single system. This meant mapping the acquired bank’s product, service and customer portfolio to a new and modified version of the acquirer’s products and services.

Among many other factors, ensuring seamless service to existing customers of the acquired bank meant that those customers should not face an undue increase in service charges. When customer data was processed on the enhanced systems, the service fees had to stay within the threshold that a customer would expect in the normal course of business. Testing for “Go Live” was tricky, since for each acquired customer the bank had to compare the “Go Live” results with that customer’s historical data. With hundreds of thousands of customers and millions of transactions a month, manual verification was a gigantic task, and potentially impossible to accomplish.

Zen Test Labs addressed this situation creatively by leveraging its Data Migration Testing framework and extending it to include customer-specific scenarios. Each data component of the source and target data files was mapped, and the mapping rules were integrated into the testing framework. A utility was then designed to pick each record from the source, apply the migration logic, and check whether the corresponding value in the target file was within the tolerance level defined by that logic. During execution, the selected components from the imported source and target data were compared, and any that did not meet the tolerance levels were flagged. Once all the records were compared, the utility reported:

  1. All transactions migrated as per the logic
  2. All transactions which did not meet the tolerance criteria
  3. Transactions in the target database which did not have any relation with the migration process
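
The compare-and-flag loop described above can be sketched roughly as follows. This is only an illustration in Python, not the actual Zen Test Labs utility. The record layout (fees keyed by a customer ID), the `migrate_fee` rule, the 5% tolerance, and the choice to count a record missing from the target as a failure are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    passed: list = field(default_factory=list)     # migrated as per the logic
    failed: list = field(default_factory=list)     # did not meet the tolerance criteria
    unrelated: list = field(default_factory=list)  # in target, no source counterpart


def migrate_fee(source_fee: float) -> float:
    """Hypothetical migration rule: the fee expected after migration."""
    return round(source_fee * 1.02, 2)


def compare(source: dict, target: dict, tolerance: float = 0.05) -> Report:
    """Check every source record against the target within a relative tolerance."""
    report = Report()
    for key, source_fee in source.items():
        expected = migrate_fee(source_fee)
        actual = target.get(key)
        # Assumption: a record missing from the target counts as a failure.
        if actual is not None and abs(actual - expected) <= tolerance * abs(expected):
            report.passed.append(key)
        else:
            report.failed.append(key)
    # Target records that the migration logic never produced.
    report.unrelated = [key for key in target if key not in source]
    return report


source = {"C001": 10.00, "C002": 25.00, "C003": 40.00}
target = {"C001": 10.20, "C002": 31.00, "C999": 5.00}
report = compare(source, target)
print(report.passed, report.failed, report.unrelated)
```

Here `C001` passes, `C002` (out of tolerance) and `C003` (missing from the target) fail, and `C999` is reported as unrelated to the migration, mirroring the three categories in the list above.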

Testing of the framework and the utility itself followed a three-layer approach:

  1. Utility testing using dummy data for source, target and the mapping
  2. Sampling of output files and manual verification with real data
  3. Verification against “Thumb Rules”. For example, the number of Pass records plus the number of Fail records should equal the count of primary keys in the source data.
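
The thumb rule in the last item is simple enough to express as a one-line check. The sketch below is a hedged illustration in Python; the function name `check_record_count` and the sample keys are hypothetical and not taken from the project.

```python
def check_record_count(source_keys, passed_keys, failed_keys) -> bool:
    """Thumb rule: Pass count + Fail count equals the count of source primary keys."""
    source_count = len(set(source_keys))  # primary keys are unique by definition
    return len(passed_keys) + len(failed_keys) == source_count


source_keys = ["C001", "C002", "C003"]

# Every source record was classified exactly once: the rule holds.
ok = check_record_count(source_keys, ["C001"], ["C002", "C003"])

# One source record was silently dropped: the rule is violated.
dropped = check_record_count(source_keys, ["C001"], ["C002"])

print(ok, dropped)
```

A check like this does not show which record went missing, but it is cheap to run after every execution and catches records that were neither passed nor failed.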

Overall, I found this project very challenging and interesting. Leveraging the data migration testing framework, we created a comprehensive utility in approximately three weeks. The utility performed well enough to compare one data component across 600,000 to 700,000 records in 10 to 12 minutes. In total, over 800 million data values were verified in a span of 30 days, which is comparable to verifying at least one data value for every person in the European Union! Along with our output files, we provided the bank with a great deal of ‘Data Profiled’ information on the migrated customers, which was used to understand their behavioral patterns and the performance of the products after migration.

Ravikiran Indore | Sr Consultant | Zen Test Labs

