This blog post shares how I think about what a Knowledge Base is in SQL Server 2012 Data Quality Services (DQS). In my mind, a Knowledge Base (KB) captures:
1. WHAT needs cleaning
2. HOW to clean what needs cleaning.
Let’s dive a little deeper. In DQS, a Knowledge Base lets you do three things: Knowledge Discovery, Domain Management and Matching Policy.
1. Knowledge Discovery:
This activity helps you find “WHAT” needs cleaning. DQS has built-in algorithms that help analyze errors, inconsistencies and other data quality issues in a sample data set.
2. Domain Management:
This activity helps you define the rules that describe “HOW” to clean the data.
3. Matching Policy:
This activity helps you identify “WHAT” needs to be de-duplicated (de-duped) and then helps you define “HOW” to de-dupe the data.
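To make the WHAT/HOW split a bit more concrete, here is a toy Python sketch of the three activities. It is only an analogy and not how DQS works internally: the function names (`discover`, `clean`, `find_duplicates`), the `corrections` dictionary and the 0.85 similarity threshold are all made up for illustration, and the fuzzy matching uses Python's standard `difflib` rather than anything from DQS.

```python
from collections import Counter
from difflib import SequenceMatcher


def discover(values):
    """Knowledge Discovery analogy: profile sample data to see WHAT values occur."""
    # Normalize lightly so "USA" and "usa " are counted together.
    return Counter(v.strip().lower() for v in values)


def clean(value, corrections):
    """Domain Management analogy: apply rules that say HOW to fix a value."""
    key = value.strip().lower()
    # 'corrections' is a hypothetical rule set: known-bad value -> corrected value.
    return corrections.get(key, key)


def find_duplicates(records, threshold=0.85):
    """Matching Policy analogy: flag pairs that are similar enough to de-dupe."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            score = SequenceMatcher(None, records[i], records[j]).ratio()
            if score >= threshold:
                pairs.append((records[i], records[j]))
    return pairs


sample = ["USA", "usa ", "U.S.A.", "Untied States"]
print(discover(sample))

rules = {"u.s.a.": "usa", "untied states": "united states"}
print([clean(v, rules) for v in sample])

print(find_duplicates(["john smith", "jon smith", "mary jones"]))
```

In this sketch, the profile from `discover` is what you would look at to decide which rules belong in `rules`, which mirrors how discovery feeds domain management in a Knowledge Base.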
In this short blog post, I shared how I think about what a Knowledge Base is in SQL Server 2012 Data Quality Services. Here’s the official resource if you want to continue learning: DQS Knowledge Bases and Domains
Awesome blog posts about DQS written by fine community folks:
- SQL SERVER – Step by Step Guide to Beginning Data Quality Services in SQL Server 2012 – Introduction to DQS (sqlauthority.com)
- SQL SERVER – Why Do We Need Data Quality Services – Importance and Significance of Data Quality Services (DQS) (sqlauthority.com)
- SQL SERVER – Advanced Data Quality Services with Melissa Data – Azure Data Market (sqlauthority.com)
- SQL Server 2012 – Data Quality Services (DQS) – Learning Resources (dattatreysindol.com)