Creating a Partitioned View
A partitioned view joins horizontally partitioned data from a set of member tables across one or more servers, making the data appear as if from one table. Microsoft® SQL Server™ 2000 distinguishes between local and distributed partitioned views. In a local partitioned view, all participating tables and the view reside on the same instance of SQL Server. In a distributed partitioned view, at least one of the participating tables resides on a different (remote) server. In addition, SQL Server 2000 differentiates between partitioned views that are updatable and views that are read-only copies of the underlying tables.
Distributed partitioned views can be used to implement a federation of database servers. A federation is a group of servers administered independently, but which cooperate to share the processing load of a system. Forming a federation of database servers by partitioning data is the mechanism that enables you to scale out a set of servers to support the processing requirements of large, multitiered Web sites. For more information, see Designing Federated Database Servers.
Before implementing a partitioned view, you must first partition a table horizontally. In designing a partitioning scheme, it must be clear what data belongs to each member table. The original table is replaced with several smaller member tables. Each member table has the same number of columns as the original table, and each column has the same attributes (such as data type, size, collation) as the corresponding column in the original table. If you are creating a distributed partitioned view, each member table is on a separate member server. For the greatest location transparency, the name of the member databases should be the same on each member server, although this is not a requirement. For example: Server1.CustomerDB, Server2.CustomerDB, Server3.CustomerDB.
You design the member tables so that each table stores a horizontal slice of the original table based on a range of key values. The ranges are based on the data values in a partitioning column. The range of values in each member table is enforced by a CHECK constraint on the partitioning column, and ranges cannot overlap. For example, you cannot have one table with a range from 1 through 200000, and another with a range from 150000 through 300000 because it would not be clear which table contains the values from 150000 through 200000.
For example, you are partitioning a Customer table into three tables. The CHECK constraint for these tables is:
-- On Server1: CREATE TABLE Customers_33 (CustomerID INTEGER PRIMARY KEY CHECK (CustomerID BETWEEN 1 AND 32999), ... -- Additional column definitions) -- On Server2: CREATE TABLE Customers_66 (CustomerID INTEGER PRIMARY KEY CHECK (CustomerID BETWEEN 33000 AND 65999), ... -- Additional column definitions) -- On Server3: CREATE TABLE Customers_99 (CustomerID INTEGER PRIMARY KEY CHECK (CustomerID BETWEEN 66000 AND 99999), ... -- Additional column definitions)
After creating the member tables, you define a distributed partitioned view on each member server, with each view having the same name. This allows queries referencing the distributed partitioned view name to run on any of the member servers. The system operates as if a copy of the original table is on each member server, but each server has only a member table and a distributed partitioned view. The location of the data is transparent to the application.
You build the distributed partitioned views by:
- Adding linked server definitions on each member server containing the connection information needed to execute distributed queries on the other member servers. This gives a distributed partitioned view access to data on the other servers.
- Setting the lazy schema validation option, using sp_serveroption, for each linked server definition used in distributed partitioned views. This optimizes performance by ensuring the query processor does not request meta data for any of the linked tables until data is actually needed from the remote member table.
- Creating a distributed partitioned view on each member server. The views use distributed SELECT statements to access data from the linked member servers, and merges the distributed rows with rows from the local member table.
To create distributed partitioned views for the preceding example, you must:
- Add a linked-server definition named Server2 with the connection information for Server2, and a linked server definition named Server3 for access to Server3.
- Create this distributed partitioned view:
CREATE VIEW Customers AS SELECT * FROM CompanyDatabase.TableOwner.Customers_33 UNION ALL SELECT * FROM Server2.CompanyDatabase.TableOwner.Customers_66 UNION ALL SELECT * FROM Server3.CompanyDatabase.TableOwner.Customers_99
- Perform the same steps on Server2 and Server3.
Updatable Partitioned Views
If a local or distributed partitioned view is not updatable, it can serve only as a read-only copy of the original table. An updatable partitioned view can exhibit all the capabilities of the original table.
A view is considered an updatable partitioned view if:
- The view is a set of SELECT statements whose individual result sets are combined into one using the UNION ALL statement. Each individual SELECT statement references one SQL Server base table. The table can be either a local table or a linked table referenced using a four-part name, the OPENROWSET function, or the OPENDATASOURCE function (you cannot use an OPENDATASOURCE or OPENROWSET function that specifies a pass-through query).
The view will not be updatable if a trigger or cascading update or delete is defined on one or more member tables.
Member tables are defined in the FROM clause in each SELECT statement in the view definition. Each member table must adhere to these rules:
- Member tables cannot be referenced more than once in the view.
- Member tables cannot have indexes created on any computed columns.
- Member tables must have all PRIMARY KEY constraints on an identical number of columns.
- Member tables must have the same ANSI padding setting. For more information about the ANSI padding setting, see SET ANSI_PADDING.
Columns are defined in the select list of each SELECT statement in the view definition. The columns must follow these rules.
- All columns in each member table must be included in the select list. SELECT * FROM <member table> is acceptable syntax.
- Columns cannot be referenced more than once in the select list.
- The columns must be in the same ordinal position in the select list
- The columns in the select list of each SELECT statement must be of the same type (including data type, precision, scale, and collation). For example, this view definition fails because the first column in both SELECT statements does not have the same data type:
CREATE VIEW NonUpdatable AS SELECT IntPrimaryKey, IntPartNmbr FROM FirstTable UNION ALL SELECT NumericPrimaryKey, IntPartNmbr FROM SecondTable
Partitioning Column Rules
A partitioning column exists on each member table and, through CHECK constraints, identifies the data available in that specific table. Partitioning columns must adhere to these rules:
- Each base table has a partitioning column whose key values are enforced by CHECK constraints. The key ranges of the CHECK constraints in each table do not overlap with the ranges of any other table. Any given value of the partitioning column must map to only one table. The CHECK constraints can only use these operators: BETWEEN, AND, OR, <, <=, >, >=, =.
- The partitioning column cannot be an identity, default or timestamp column.
- The partitioning column must be in the same ordinal location in the select list of each SELECT statement in the view. For example, the partitioning column is always the first column in each select list, or the second column in each select list, and so on.
- Partitioning columns cannot allow nulls.
- Partitioning columns must be a part of the primary key of the table.
- Partitioning columns cannot be computed columns.
- There must be only one constraint on the partitioning column. If there is more than one constraint, SQL Server ignores all the constraints and will not consider them when determining whether or not the view is a partitioned view.
- There are no restrictions on the updatability of the partitioning columns.
A partitioned column that meets all these rules will support all of the optimizations that are supported by the SQL Server 2000 query optimizer. For more information, see Resolving Distributed Partitioned Views.
Data Modification Rules
In addition to the rules defined for updatable partitioned views, data modification statements referencing the view must adhere to the rules defined for INSERT, UPDATE and DELETE statements.
Note You can modify data through a partitioned view only if you install Microsoft SQL Server 2000 Enterprise Edition or Microsoft. SQL Server 2000 Developer Edition.
INSERT statements add data to the member tables through the partitioned view. The INSERT statements must adhere to these rules:
- All columns must be included in the INSERT statement even if the column can be NULL in the base table or has a DEFAULT constraint defined in the base table.
- The DEFAULT keyword cannot be specified in the VALUES clause of the INSERT statement.
- INSERT statements must supply a value that satisfies the logic of the CHECK constraint defined on the partitioning column for one of the member tables.
- INSERT statements are not allowed if a member table contains a column with an identity property.
- INSERT statements are not allowed if a member table contains a timestamp column.
- INSERT statements are not allowed if there is a self-join with the same view or any of the member table.
UPDATE statements modify data in one or more of the member tables through the partitioned view. The UPDATE statements must adhere to these rules:
- UPDATE statements cannot specify the DEFAULT keyword as a value in the SET clause even if the column has a DEFAULT value defined in the corresponding member table
- The value of a column with an identity property cannot be changed: however, the other columns can be updated.
- The value of a PRIMARY KEY cannot be changed if the column contains text, image or ntext data.
- Updates are not allowed if a base table contains a timestamp column.
- Updates are not allowed if there is a self-join with the same view or any of the member tables.
- The DEFAULT keyword cannot be specified in the SET clause of the UPDATE statement.
DELETE statements remove data in one or more of the member tables through the partitioned view. The DELETE statements must adhere to this rule:
- DELETE statements are not allowed if there is a self-join with the same view or any of the member tables.
Distributed Partition View Rules
In addition to the rules defined for partitioned views, distributed (remote) partition views have these additional conditions:
- A distributed transaction will be started to ensure atomicity across all nodes affected by the update.
- The XACT_ABORT SET option must be set to ON.
- smallmoney and smalldatetime columns in remote tables are mapped as money and datetime respectively. Consequently, the corresponding columns in the local tables should also be money and datetime.
- Any linked server cannot be a loopback linked server, that is, a linked server that points to the same instance of SQL Server.
A view that references partitioned tables without following all these rules may still be updatable if there is an INSTEAD OF trigger on the view. The query optimizer, however, may not always be able to build execution plans for a view with an INSTEAD OF trigger that are as efficient as the plans for a partitioned view that follows all of the rules.