Partitioning

Partitioning a database improves performance and simplifies maintenance. By splitting a large table into smaller, individual tables, queries that access only a fraction of the data can run faster because there is less data to scan. Maintenance tasks, such as rebuilding indexes or backing up a table, can run more quickly.

Partitioning can be achieved without splitting tables by physically putting tables on individual disk drives. Putting a table on one physical drive and related tables on a separate drive can improve query performance because, when queries that involve joins between the tables are run, multiple disk heads read data at the same time. SQL Server filegroups can be used to specify on which disks to put the tables.

Hardware Partitioning

Hardware partitioning designs the database to take advantage of the available hardware architecture. Following are examples of hardware partitioning:

  • Multiprocessors that enable multiple threads of operations, permitting many queries to run at the same time. Alternatively, a single query may be able to run faster on multiple processors by letting components of the query run at the same time. For example, each table referenced in the query can be scanned at the same time by a different thread.

  • RAID (redundant array of independent disks) devices that enable data to be striped across multiple disk drives, permitting faster access to the data, because more read and write heads read data at the same time. A table striped across multiple drives can typically be scanned faster than the same table stored on a single drive. Alternatively, storing tables on separate drives from related tables can significantly improve the performance of queries joining those tables. For more information, see RAID.

Horizontal Partitioning

Horizontal partitioning divides a table into multiple tables. Each table then contains the same number of columns, but fewer rows. For example, a table that contains 1 billion rows could be partitioned horizontally into 12 tables, with each smaller table representing one month of data for a specific year. Any queries requiring data for a specific month only reference the appropriate table.

Determining how to partition the tables horizontally depends on how data is analyzed. You should partition the tables so that queries reference as few tables as possible. Otherwise, excessive UNION queries, used to merge the tables logically at query time, can affect performance. For more information about querying horizontally partitioned tables, see Scenarios for Using Views.

Partitioning data horizontally based on age and use is common. For example, a table may contain data for the last five years, but only data from the current year is regularly accessed. In this case, you may consider partitioning the data into five tables, with each table containing data from only one year.

For more information, see Partitioned Tables and Indexes.

Vertical Partitioning

Vertical partitioning divides a table into multiple tables that contain fewer columns. The two types of vertical partitioning are normalization and row splitting:

  • Normalization is the standard database process of removing redundant columns from a table and putting them in secondary tables that are linked to the primary table by primary key and foreign key relationships.

  • Row splitting divides the original table vertically into tables with fewer columns. Each logical row in a split table matches the same logical row in the other tables as identified by a UNIQUE KEY column that is identical in all of the partitioned tables. For example, joining the row with ID 712 from each split table re-creates the original row.

Like horizontal partitioning, vertical partitioning lets queries scan less data. This increases query performance. For example, a table that contains seven columns of which only the first four are generally referenced may benefit from splitting the last three columns into a separate table.

Vertical partitioning should be considered carefully, because analyzing data from multiple partitions requires queries that join the tables. Vertical partitioning also could affect performance if partitions are very large.