定义了索引视图时的分区切换

将数据分区可以快速、有效地管理和访问数据的子集,同时又能维护整个数据集合的完整性。为分区数据定义索引视图可以进一步提高查询的速度和效率。所定义的这些视图称为“分区对齐索引视图”。

如果满足以下条件,则索引视图将与它引用的表实现分区对齐:

  • 索引视图和表的索引的分区函数:

    • 定义了相同数量的分区。

    • 定义了相同的分区边界值。

    • 分区函数的参数必须是相同的列。

  • 视图定义的投影列表包含已分区表的分区依据列(与包含分区依据列的表达式相对)。

  • 当视图定义执行分组时,分区依据列是包含在视图定义中的组合列之一。

  • 当视图引用多个表(使用联接、子查询、函数等)时,索引视图只与其中一个已分区表进行分区对齐。

在 SQL Server 2008 中,分区对齐索引视图可与为之定义这些视图的已分区表一起切换。现在,对数据分区或子集执行的维护操作将更有效,因为这些操作只针对所需的数据,而不是整个表。使用索引视图将数据分区的其他好处包括:

  • 自动维护。运行 INSERT、UPDATE 或 DELETE 语句时,SQL Server 2008 将自动维护索引视图。

  • 改进聚合。如果索引视图通过聚合减少了足够多的行,则会显著提高聚合查询的性能。在 SQL Server 2008 Enterprise 中,当查询优化器自动将聚合查询与相应的索引视图进行匹配或者查询使用 NOEXPAND 提示直接引用索引视图时,查询性能会比查询访问整个表时更有效。

将 ALTER TABLE...SWITCH 语句用于索引视图时的限制条件

使用 Transact-SQL ALTER TABLE...SWITCH 语句在索引视图引用的源表和目标表之间传输数据子集时,表和索引必须满足使用分区切换高效传输数据中所列的条件。请参阅该主题的“表和索引的要求”部分。

有关随定义的索引视图一起移动分区的详细信息,请参阅使用分区切换高效传输数据中的“移动分区的额外要求”部分。

可以使用分区对齐索引视图的一种典型情况是大型数据仓库。在大型数据仓库中,通常会按日期将较大的事实数据表分区,并且经常会对与多个维度表相联接的事实数据表运行聚合查询。通过使用 ALTER TABLE … SWITCH 语句来换入/换出某个分区有以下好处:

  • 迅速将事实数据表最早分区中的陈旧数据存档。

  • 迅速将新数据送入大型事实数据表的最新分区。

下图显示了将新数据送入事实数据表的最新分区时分区对齐索引视图的工作原理。切换到事实数据表中的新分区带有为新分区创建的聚合。

分区对齐索引视图切换

示例

下例显示如何使用 ALTER TABLE … SWITCH 语句切换出表的旧分区(其中带有部分分区对齐索引视图)。

在本例中,销售事实数据表 [f_sales] 包含 2006 年的全年销售数据。销售数据按季度进行了分区。频繁地运行按日期和商店汇总销售量的查询,并创建一个索引视图以提高针对该事实数据表所定义的聚合查询的速度。现在将 2006 年第一季度的数据存档到一个存档表中,事实数据表的索引视图仍保持正确和可用。

USE MASTER
GO

DROP DATABASE test_partition
GO

--Create database with the file groups that will be used by the partition schemes.
CREATE DATABASE test_partition
ON PRIMARY (NAME = 'paiv_Dat0', FILENAME='C:\temp\test_partition_DataFile0.mdf'),
FILEGROUP FG1 (NAME = 'paiv_Dat1', FILENAME = 'C:\temp\test_partition_DataFile1.ndf'),
FILEGROUP FG2 (NAME = 'paiv_Dat2', FILENAME = 'C:\temp\test_partition_DataFile2.ndf'),
FILEGROUP FG3 (NAME = 'paiv_Dat3', FILENAME = 'C:\temp\test_partition_DataFile3.ndf'),
FILEGROUP FG4 (NAME = 'paiv_Dat4', FILENAME = 'C:\temp\test_partition_DataFile4.ndf'),
FILEGROUP FG5 (NAME = 'paiv_Dat5', FILENAME = 'C:\temp\test_partition_DataFile5.ndf')
LOG ON (NAME = 'paiv_log', filename='C:\temp\test_partition_log.ldf')
GO
USE test_partition
GO

-- Create partition function and partition scheme.
CREATE PARTITION FUNCTION [PF1] (int)
AS RANGE LEFT FOR VALUES (20060331, 20060630, 20060930, 20061231);
GO
CREATE PARTITION SCHEME [PS1]
AS PARTITION [PF1] 
TO ([FG1], [FG2], [FG3], [FG4], [FG5]
, [PRIMARY]);
GO

-- Create fact table.
CREATE TABLE dbo.f_sales (date_key INT NOT NULL, cust_key INT, store_key INT, amt MONEY) ON PS1(date_key);
GO

-- Populate data into table f_sales. 
SET NOCOUNT ON
GO
DECLARE @d INT, @ds INT, @cs INT, @s INT
SET @d = 20060101
SET @ds = 7  -- date_key increment step

WHILE (@d <= 20061231) 
BEGIN
WHILE @d%100 > 0 AND @d%100 < 29
BEGIN
SET @cs = 10  -- # of records for customer/store for that date
SET @s = CAST ( RAND() * 1000 as INT )
WHILE (@cs > 0)
BEGIN
INSERT dbo.f_sales (date_key, cust_key, store_key, amt)
VALUES (@d, CAST ( RAND() * 1000 as INT ), @s, CAST ( RAND() * 1000 as MONEY ) )
SET @cs = @cs - 1
END
SET @d = @d + @ds
END
SET @d = @d + @ds
END
GO


-- The table with clustered index is partitioned using the partition scheme specified.
CREATE CLUSTERED INDEX UCIdx_f_sales on dbo.f_sales (date_key, cust_key, store_key) ON PS1(date_key)
GO

--Create indexed view, which aggregates on the date and store.
CREATE VIEW dbo.v_f_sales_sumamt WITH SCHEMABINDING AS
(
SELECT date_key, store_key, sum(ISNULL(amt,0)) AS amt, count_big(*) AS cnt
FROM dbo.f_sales AS sales
GROUP BY date_key, store_key
)
GO

-- Materialize the view. The indexed view is now partition-aligned with table f_sales.
CREATE UNIQUE CLUSTERED INDEX ucidx_v_sales_sum
ON dbo.v_f_sales_sumamt (date_key) ON PS1(date_key)
GO

-- Check data distribution in various partitions of the table & the indexed view.
SELECT OBJECT_NAME(p.object_id) as obj_name, p.index_id, p.partition_number, p.rows, a.type, a.filegroup_id 
FROM sys.system_internals_allocation_units a
JOIN sys.partitions p
ON p.partition_id = a.container_id
WHERE p.object_id IN (OBJECT_ID(N'dbo.f_sales'), OBJECT_ID(N'dbo.v_f_sales_sumamt '))
ORDER BY obj_name, p.index_id, p.partition_number

-- Create archive table to receive the partition that will be switched out of table f_sales. 
CREATE TABLE dbo.sales_archive (date_key INT NOT NULL, cust_key INT, store_key INT, amt MONEY) ON FG1
GO
CREATE CLUSTERED INDEX UCIdx_sales_archive on dbo.sales_archive (date_key, cust_key, store_key) ON FG1
GO
--Create indexed view with view definition matching v_f_sales_sumamt on table f_sales.
CREATE VIEW dbo.v_sales_archive_sumamt WITH SCHEMABINDING AS
(
SELECT date_key, store_key, sum(ISNULL(amt,0)) AS amt, count_big(*) AS cnt
FROM dbo.sales_archive AS sales
GROUP BY date_key, store_key
)
GO

-- Materialize the view. The indexed view is partition-aligned with table sales_archive.
CREATE UNIQUE CLUSTERED INDEX ucidx_v_sales_sum
ON dbo.v_sales_archive_sumamt(date_key) ON FG1
GO

-- Check data distribution in various partitions of the table and the indexed view. 
SELECT OBJECT_NAME(p.object_id) as obj_name, p.index_id, p.partition_number, p.rows, a.type, a.filegroup_id 
FROM sys.system_internals_allocation_units a
JOIN sys.partitions p
ON p.partition_id = a.container_id
WHERE p.object_id IN (OBJECT_ID(N'dbo.sales_archive'), OBJECT_ID(N'dbo.v_sales_archive_sumamt '))
ORDER BY obj_name, p.index_id, p.partition_number

-- Data associated with the old partition of the source table - [f_sales] and the indexed view [v_f_sales_sumamt] - 
-- is switched out to archive table [sales_archive] and the indexed view [v_sales_archive_sumamt].
ALTER TABLE dbo.f_sales SWITCH PARTITION 1 TO dbo.sales_archive

-- Data distribution in various partitions shows that 
-- partition 1 of [f_sales] and the indexed view [v_f_sales_sumamt] are now empty 
-- and these rows are now in [sales_archive] and [v_sales_archive_sumamt], respectively.
SELECT OBJECT_NAME(p.object_id) as obj_name, p.index_id, p.partition_number, p.rows, a.type, a.filegroup_id 
FROM sys.system_internals_allocation_units a
JOIN sys.partitions p
ON p.partition_id = a.container_id
WHERE p.object_id IN (OBJECT_ID(N'dbo.sales_archive'), OBJECT_ID(N'dbo.v_sales_archive_sumamt '), 
OBJECT_ID(N'dbo.f_sales'), OBJECT_ID(N'dbo.v_f_sales_sumamt '))
ORDER BY obj_name, p.index_id, p.partition_number

如上例所示,切换分区后,表 [f_sales] 的分区 1 和索引视图 [v_f_sales_sumamt] 中的所有数据均被移到对应的表 [sales_archive] 和索引视图 [v_sales_archive_sumamt] 中。

若要获取 2006 年 7 月按日期和商店汇总的销售量,可以查询表 [f_sales] 或索引视图 [v_f_sales_sumamt],如以下两个查询所示。这两种情况下的结果相同,但是运行索引视图查询会显著提高查询性能,因为索引视图将预先计算的聚合进行了实体化并将行数减至原来的 1/10,如下例所示。

-- This query runs against the table [f_sales]
SELECT date_key, store_key AS [Store Number], sum(ISNULL(amt,0)) AS Sales_Amount
FROM dbo.f_sales
WHERE date_key >= '20060701' AND date_key < '20060801'
GROUP BY date_key, store_key
ORDER BY date_key, store_key
OPTION (EXPAND VIEWS)

-- This query runs against the indexed view [v_f_sales_sumamt]
-- the result of this query is the same as the one against the table
-- the indexed view materializes the pre-calculated aggregate, resulting in significant improvements in query performance   
SELECT date_key, store_key AS [Store Number], sum(ISNULL(amt,0)) AS Sales_Amount
FROM dbo.v_f_sales_sumamt WITH (NOEXPAND)
WHERE date_key >= '20060701' AND date_key < '20060801'
GROUP BY date_key, store_key

使用 ALTER TABLE…SWITCH 语句进行分区切换操作会非常迅速,并且只涉及元数据。当索引视图与源表和目标表进行分区对齐时,通过分区切换可以将源表中的数据子集送入目标表,同时还引入对齐索引视图的相关部分。数据集合的完整性得以维护,因为与源表关联的所有对象(例如表、索引和索引视图)都包含在分区切换中。通过使用已分区表的索引视图,可以显著提高引用该表的聚合查询的性能。