Data Integrity

Enforcing data integrity guarantees the quality of the data in the database. For example, if an employee is entered with an employee ID value of 123, the database should not permit another employee to have an ID with the same value. If you have an employee_rating column intended to have values ranging from 1 to 5, the database should not accept a value outside that range. If the table has a dept_id column that stores the department number for the employee, the database should permit only values that are valid for the department numbers in the company.

Two important steps in planning tables are to identify valid values for a column and to decide how to enforce the integrity of the data in the column. Data integrity falls into the following categories:

  • Entity integrity

  • Domain integrity

  • Referential integrity

  • User-defined integrity

Entity Integrity

Entity integrity defines a row as a unique entity for a particular table. Entity integrity enforces the integrity of the identifier columns or the primary key of a table, through UNIQUE indexes, UNIQUE constraints or PRIMARY KEY constraints.

Domain Integrity

Domain integrity is the validity of entries for a specific column. You can enforce domain integrity to restrict the type by using data types, restrict the format by using CHECK constraints and rules, or restrict the range of possible values by using FOREIGN KEY constraints, CHECK constraints, DEFAULT definitions, NOT NULL definitions, and rules.

Referential Integrity

Referential integrity preserves the defined relationships between tables when rows are entered or deleted. In SQL Server, referential integrity is based on relationships between foreign keys and primary keys or between foreign keys and unique keys, through FOREIGN KEY and CHECK constraints. Referential integrity makes sure that key values are consistent across tables. This kind of consistency requires that there are no references to nonexistent values and that if a key value changes, all references to it change consistently throughout the database.

When you enforce referential integrity, SQL Server prevents users from doing the following:

  • Adding or changing rows to a related table if there is no associated row in the primary table.

  • Changing values in a primary table that causes orphaned rows in a related table.

  • Deleting rows from a primary table if there are matching related rows.

For example, with the Sales.SalesOrderDetail and Production.Product tables in the AdventureWorks2008R2 database, referential integrity is based on the relationship between the foreign key (ProductID) in the Sales.SalesOrderDetail table and the primary key (ProductID) in the Production.Product table. This relationship makes sure that a sales order can never reference a product that does not exist in the Production.Product table.

Referential integrity using foreign/primary keys

User-Defined Integrity

User-defined integrity lets you define specific business rules that do not fall into one of the other integrity categories. All the integrity categories support user-defined integrity. This includes all column-level and table-level constraints in CREATE TABLE, stored procedures, and triggers.