Adding Knowledge to a Knowledge Base
This topic describes the ways in which you can add knowledge to a knowledge base in Data Quality Services (DQS). Before you can perform data quality operations, you must have knowledge about the data. You acquire that knowledge by building and maintaining a data quality knowledge base and adding knowledge to it that relates to a specific type of data source. The knowledge base is a repository of knowledge about your data that enables you to understand your data and maintain its integrity.
The knowledge base contains data domains that relate to the data source. For each data domain, the knowledge base stores all identified terms, spelling errors, validation and business rules, and reference data that can be used to perform data quality actions on the data source. DQS uses this knowledge to identify incorrect or invalid data, or perform matching.
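To make the idea of per-domain knowledge concrete, the following is a minimal conceptual sketch in Python. It is not a DQS API: the `Domain` and `KnowledgeBase` classes, their fields, and the sample city values are all hypothetical, and they model only two of the kinds of knowledge listed above (known valid terms and spelling errors).

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    """Hypothetical in-memory stand-in for one data domain (not a DQS type)."""
    name: str
    valid_values: set[str] = field(default_factory=set)
    # Maps a known misspelling to the value it should be corrected to.
    spelling_errors: dict[str, str] = field(default_factory=dict)

    def check(self, value: str) -> tuple[str, str]:
        """Return (status, output value) for a single source value."""
        if value in self.valid_values:
            return ("correct", value)
        if value in self.spelling_errors:
            return ("corrected", self.spelling_errors[value])
        # Nothing in the stored knowledge covers this value.
        return ("unknown", value)


@dataclass
class KnowledgeBase:
    """Hypothetical container that groups domains by name."""
    name: str
    domains: dict[str, Domain] = field(default_factory=dict)


# Sample knowledge, invented for illustration only.
kb = KnowledgeBase("Customers")
kb.domains["City"] = Domain(
    name="City",
    valid_values={"Seattle", "Redmond"},
    spelling_errors={"Seatle": "Seattle"},
)

for raw in ["Seattle", "Seatle", "Tacoma"]:
    print(raw, kb.domains["City"].check(raw))
```

In this sketch, a value that the stored knowledge does not cover is reported as unknown rather than rejected, which mirrors the idea that knowledge is built up over time rather than fixed at creation.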
You can add knowledge to a knowledge base in the following ways, some of which are computer-assisted and some interactive.
Knowledge discovery analyzes a sample of data for data quality criteria, and then adds the knowledge it has gained to the knowledge base. This is a computer-assisted process that identifies data inconsistencies and syntax errors, and proposes changes to the data. The knowledge discovery activity is a wizard that includes a page on which you can interactively manage domain values. For more information, see Perform Knowledge Discovery.
DQS enables you to interactively change and augment the metadata that is generated by the computer-assisted knowledge discovery activity. You do so in the Domain Management activity, where you can apply a change to a specific data value.
For more information, see Change Domain Values.
A video demonstration of domain management is also available. In the video, domain values are changed on the Managing Domain Values page of the Knowledge Discovery wizard. You can perform the same steps on the Domain Values page of the Domain Management activity.
You can import a domain from a .dqs data file into an existing knowledge base, or you can import an entire knowledge base from a .dqs file into a new knowledge base. To do so, you first need to export an existing domain or knowledge base to a .dqs file. A .dqs file that contains a domain includes all domain data; a .dqs file that contains a knowledge base includes all knowledge base information, including domains and the matching policy.
You can import domain values from an Excel spreadsheet file into an existing domain or knowledge base. To do so, first create an Excel spreadsheet that contains the domain values you want to import. Excel must be installed on the Data Quality Client computer before you can import values using Data Quality Client. You cannot export domain values from a domain or knowledge base to an Excel file.
For more information, see Import Values from an Excel File into a Domain or Import Domains from an Excel File in Knowledge Discovery.
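As a rough illustration of preparing values for import, the following Python sketch writes a small table of domain values as comma-separated text that Excel can open and save as a workbook. The layout is an assumption made for this example (a header row, the leading value in the first column, and any synonyms in the remaining columns); confirm the expected layout in the import topics referenced above. The sample values and the `write_domain_values` helper are hypothetical.

```python
import csv
import io

# Hypothetical domain values. Assumed layout: the first column holds the
# leading value and the remaining columns hold its synonyms.
rows = [
    ["Value", "Synonym 1", "Synonym 2"],  # assumed header row
    ["Washington", "WA", "Wash."],
    ["Oregon", "OR"],
]


def write_domain_values(rows, stream):
    """Write the rows to a text stream as comma-separated values."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerows(rows)


# Write to an in-memory buffer here; pass a file opened with
# open(path, "w", newline="") to produce a file on disk instead.
buffer = io.StringIO()
write_domain_values(rows, buffer)
print(buffer.getvalue())
```

Generating the file from a script is only a convenience when the values already exist in another system; a spreadsheet created by hand in Excel serves the same purpose.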
After you have run a cleansing data quality project using a knowledge base, you can import knowledge created during cleansing back into that knowledge base. This enables you to keep knowledge generated during the project, and to continuously build the knowledge in the knowledge base.
For more information, see Import Cleansing Project Values into a Domain.
DQS ships with a pre-built knowledge base called DQS Data that contains domains for United States company and address data. This knowledge base can be used to quickly start a project without creating a new knowledge base. The DQS Data knowledge base is read-only, but the data steward can create a new knowledge base based on it.
For more information, see Using the DQS Default Knowledge Base.