I was playing with Data Quality Services when I though that it would be nice if I could implement what I had learned and create a quick demo that shows how to standardize data using the technology. So here it goes:
Note: This is not a step by step Guide – I am assuming that you are aware of different pieces of DQS. I am going to post a brief outline of how did I go about implementing a quick demo.
1) SQL Server 2012 Data Quality Services
2) SQL Server Integration services to show the DQS cleansing component in action.
We want to standardize to Gender to have only two values M & F – but now data coming from two files creates data inconsistency issues:
Two files that look like this:
File 1: (Excel)
File 2: (CSV)
Let’s see how SQL server 2012’s data quality services would help us solve this problem:
1) I created a Knowledge base with a domain Gender and created domain values that looks like this:
2) Tested the Knowledge base quickly by creating a Data Quality Project. It worked!
3) Switched to SSIS and created a Data Flow which uses following components:
What would be interesting to you to see mapping tab in the DQS cleansing transformation component, so here it is:
5) And let me show you the Target File – note that you can choose to NOT have Gender_Source, Gender_Status & Record Status. The column of interest is Gender_Output
That’s about it for this post. I hope this helps!
- How to Delete a Knowledge Base in SQL Server 2012 Data Quality Services? (parasdoshi.com)
- How I think of what is KnowledgeBase in SQL Server 2012 Data Quality Services (parasdoshi.com)
- Guest Blog: How we use Fuzzy Lookup add-in in our company to solve data inconsistency problems: (parasdoshi.com)
- Master Data Services: Analogy to remember what are Models, Entities and Attriutes (parasdoshi.com)