Project Description
The SSIS DQS Matching Transformation uses Data Quality Services (DQS) to find duplicate data within the SSIS data flow.
A note beforehand: The DQS API is not officially supported by Microsoft.
The SSIS DQS Matching component was developed and tested for SQL Server 2012 with DQS assembly version 11.0.3000.0. We try to support future versions of DQS and SQL Server, but we cannot guarantee that the component will work with all of them.
The Data Quality Services (DQS) data matching process helps reduce duplicate data. Matching analyzes the degree of duplication across all records and returns weighted probabilities of a match between each set of compared records.
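To illustrate the general idea of a weighted match score, here is a minimal Python sketch. It is not the DQS algorithm and does not call the DQS API. The field names, weights, threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions chosen for illustration only.

```python
from difflib import SequenceMatcher

# Hypothetical per-field weights that sum to 1.0 (illustrative, not taken from DQS).
WEIGHTS = {"name": 0.5, "city": 0.3, "email": 0.2}


def field_similarity(a: str, b: str) -> float:
    """Return a similarity in [0, 1], ignoring case and surrounding whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_score(rec_a: dict, rec_b: dict, weights: dict = WEIGHTS) -> float:
    """Weighted sum of per-field similarities for two records."""
    return sum(w * field_similarity(rec_a[f], rec_b[f]) for f, w in weights.items())


def find_matches(records: list, threshold: float = 0.8) -> list:
    """Compare every pair of records; return (i, j, score) for pairs at or above threshold."""
    matches = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            score = match_score(records[i], records[j])
            if score >= threshold:
                matches.append((i, j, round(score, 3)))
    return matches


records = [
    {"name": "John Smith", "city": "Seattle", "email": "john@example.com"},
    {"name": "john smith ", "city": "seattle", "email": "JOHN@example.com"},
    {"name": "Maria Garcia", "city": "Madrid", "email": "maria@example.org"},
]
matches = find_matches(records)
```

In this sketch, the first two records differ only in letter case and whitespace, so they are reported as a match, while the third record falls below the threshold. In DQS itself, the fields, weights, and minimum matching score are defined in the matching policy of the knowledge base.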
As with other data quality processes in DQS, you first have to build a knowledge base to perform the matching. With the SSIS DQS Matching component you can execute a matching activity in an SSIS package.
You will find additional information about creating a knowledge base and the matching policy in the following MSDN article: Data Matching.
With the SSIS DQS Matching component you can check for duplicate data from different data sources, whether the source is SQL Server, an Excel file, an Oracle database, or any other data source.
A very good description of how to use the component can be found on the Data Quality Services Blog:
Automating the data matching process in SQL Server Data Quality Services (DQS)
More Information:
- Automating the data matching process in SQL Server Data Quality Services (DQS)
- Matching Policy - A Closer Look into Data Quality Services Data Matching
- Data Matching
- Run a Matching Project
- Data Quality Matching in the MDS Add-in for Excel
- DQS Cleansing Transformation
- Overview of the DQS Cleansing Transform
- Using the SSIS DQS Cleansing Component