Data Comparison: Complex Cases and Comparison Difficulties

How to compare incomparable columns? This article describes a few cases of complex comparisons with data conversion. The built-in database system functions and DTM Data Comparer conversion feature will be utilized in this process. In all examples, we'll deal with Microsoft SQL Server and Transact-SQL functions.

We'll examine three cases:

  • Compare date column with string column contains dates in the different format.
  • Compare codes with non-significant positions.
  • Compare phone number columns with and without are code and additional separators.

Case 1: different date format

In the first case, we have tow columns with 'datetime' and 'varchar' data types. The first of them contains the classic date but in the second dates stored in the short format (dd mon yyyy). As you can see at the screenshot direct comparison marks columns as different.

  DTM Data Comparer: compare datetime and varchar columns without conversion

We'll use CONVERT built-in function to make values comparable. In out case the conversion string is CONVERT(VARCHAR(20),<column name>,106). To apply this function we should use another comparison mode, "Query to Table" instead of "Table to Table". this mode allows us to enter mentioned expression into "select" list for the first table. The comparison complete successfully in this case.

  DTM Data Comparer: compare datetime and varchar columns after conversion

Case 2: product codes with extra symbols

In the second case we have two columns with product codes but actually code in the second table starts from two non-significant '0' symbols. Unfortunately, both codes are strings and we can't compare them as integer numbers.
For this case, we use internal conversion option of DTM Data Comparer.

  DTM Data Comparer: compare codes without conversion

At the begin, we should switch conversion feature on at the Settings->Comparison window. By default, the comparison tool uses conversion for synchronization only. This option allows us to use it for comparison as well.

Next, we open mapping window and use automatic map for all columns except column with product codes. For this column, we add SUBSTRING(%s,3,7) function that extracts the substring from the third symbol with seven symbols length and click "map column". The program replaces '%s' code string to actual column name during comparison project execution.

  DTM Data Comparer: switching on conversion option for comparison
 
DTM Data Comparer: code conversion string

After closing "mapping window" and rerun the data comparison process we got correctly compared product codes as shown at the screenshot.
As you can see the program shows converted value in the data grid for the second table now.

  DTM Data Comparer: compare product codes after conversion

Case 3: phone numbers with different format

The last case likes the second one but we use server side conversion as a part of the query. The first table contains 7-digit phone numbers but the second has a prefix with area code separated by space char. Also, the second table has dots as a group of digit separator.

  DTM Data Comparer: compare phone numbers without conversion

To make the first value comparable we should add are code with prefix and '.' insert signs as symbols separator. There is conversion expression that we use in the mentioned query instead of the first table:

'714 '+SUBSTRING(PhoneNo,1,3)+'.'+SUBSTRING(PhoneNo,4,2)+'.'+SUBSTRING(PhoneNo,6,2)

Now both strings have the same format and our comparison process can be completed. Like for case #2, the program shows converted values. But the reason is we use SQL statement (Query-to-Table comparison mode) instead of the original table.

  DTM Data Comparer: compare phone numbers after conversion