Designing Indexed Views

Views are also known as virtual tables because the result set returned by the view has the same general form as a table with columns and rows, and views can be referenced just like tables in SQL statements. The result set of a standard view is not stored permanently in the database. Every time a query references a standard view, SQL Server substitutes the definition of the view into the query internally until a modified query is formed that only references base tables. It then runs the resulting query as usual. For more information, see View Resolution.

For a standard view, the overhead of dynamically building the result set for each query that references a view can be significant for views that involve complex processing of large numbers of rows, such as aggregating lots of data, or joining many rows. If such views are frequently referenced in queries, you can improve performance by creating a unique clustered index on the view. When a unique clustered index is created on a view, the result set is stored in the database just like a table with a clustered index is stored.

Another benefit of creating an index on a view is that the optimizer starts using the view index in queries that do not directly name the view in the FROM clause. Existing queries can benefit from the improved efficiency of retrieving data from the indexed view without having to be recoded. For more information, see Resolving Indexes on Views.

As modifications are made to the data in the base tables, the data modifications are reflected in the data stored in the indexed view. The requirement that the clustered index of the view be unique improves the efficiency with which SQL Server can find the rows in the index that are affected by any data modification.

The capability of the query optimizer to take advantage of indexed views when it processes queries has improved over earlier versions when both the query and the view definition contain the following matching elements:

  • Scalar expressions. For example, the query optimizer can match the following query with a scalar expression in its predicate:

    SELECT ColA, ColB FROM TableT WHERE ColC * (ColD + 10) > 50
    

    To an index created on this view:

    CREATE VIEW V1 WITH SCHEMABINDING AS
    SELECT ColA, ColB, ColC * (ColD + 10) AS ExpCol
    FROM dbo.TableT 
    

    Scalar expressions that include user-defined functions can also be matched in a similar manner.

  • Scalar aggregate functions. For example, the following query that contains a scalar aggregate function in its SELECT list:

    SELECT COUNT_BIG (*) FROM dbo.TableT
    

    Can be matched with an index created on this view:

    CREATE VIEW V2 WITH SCHEMABINDING AS
    SELECT COUNT_BIG (*) AS Cnt 
    FROM dbo.TableT 
    

The query optimizer will also consider the following when it selects a query plan:

  • Whether an interval of values defined in a query predicate falls within an interval defined in an indexed view. For example, consider an index created on the following view:

    CREATE VIEW V1 WITH SCHEMABINDING AS
    SELECT ColA, ColB, ColC FROM dbo.TableT
    WHERE ColA > 1 and ColA < 10
    

    Now consider the following query:

    SELECT ColB, ColC FROM dbo.TableT
    WHERE ColA > 3 and ColA < 7
    

    The query optimizer would match this query with view V1 because the interval between 3 and 7 defined in the query falls within the interval between 1 and 10 defined in the indexed view.

  • How much an expression defined in a query is equivalent to that defined in an indexed view. SQL Server tries to match expressions by considering their column references, literals, the logical operators AND, OR, NOT, BETWEEN and IN, and the comparison operators =, <>, >, <, >=, and <=. Arithmetic operators, such as + and %, and parameters are not considered.

    For example, the query optimizer would match the following query:

    SELECT ColA, ColB from dbo.TableT
    WHERE ColA < ColB 
    

    With an index created on this view:

    CREATE VIEW V1 WITH SCHEMABINDING AS
    SELECT ColA, ColB FROM dbo.TableT
    WHERE ColB > ColA 
    

Remember, as is the case with all indexes, SQL Server chooses to use an indexed view in its query plan only if the query optimizer determines whether it helps to do this.

Indexed views can be created in any edition of SQL Server 2008. In SQL Server 2008 Enterprise, the query optimizer automatically considers the indexed view. To use an indexed view in all other editions, the NOEXPAND table hint must be used.

Guidelines for Designing an Indexed View.

Indexed views work best when the underlying data is infrequently updated. The maintenance of an indexed view can be greater than the cost of maintaining a table index. If the underlying data is updated frequently, the cost of maintaining the indexed view data may outweigh the performance benefits of using the indexed view. If the underlying data is updated periodically in batches but treated primarily as read-only between updates, consider dropping any indexed views before updating, and rebuilding them afterward. Doing this may improve performance of the updates.

Indexed views improve the performance of the following types of queries:

  • Joins and aggregations that process many rows.

  • Join and aggregation operations that are frequently performed by many queries.

    For example, in an online-transaction-processing (OLTP) database that is recording inventories, many queries would be expected to join the ProductMaster, ProductVendor, and VendorMaster tables. Although each query that performs this join may not process many rows, the overall join processing of hundreds of thousands of such queries can be significant. Because these relationships are not likely to be updated frequently, the overall performance of the whole system could be improved by defining an indexed view that stores the joined results.

  • Decision support workloads.

    Analysis systems are characterized by storing summarized, aggregated data that is infrequently updated. Additionally aggregating the data and joining many rows characterizes many decision support queries. Also, decision support systems sometimes contain wide tables with many columns or columns that are large, or both. Queries that reference a narrow subset of these columns can benefit from an indexed view that includes only the columns in the query, or a narrow superset of those columns. Creating narrow indexed views that contain a subset of the columns of a single table is known as a vertical partitioning strategy because it splits tables vertically. For example, consider the following table and indexed view:

    CREATE TABLE wide_tbl(a int PRIMARY KEY, b int, ..., z int)
    CREATE VIEW v_abc WITH SCHEMABINDING AS
    SELECT a, b, c
    FROM dbo.wide_tbl
    WHERE a BETWEEN 0 AND 1000
    CREATE UNIQUE CLUSTERED INDEX i_abc ON v_abc(a)
    

    The following query can be answered just using v_abc:

    SELECT b, count_big(*), SUM(c)
    FROM wide_tbl 
    WHERE a BETWEEN 0 AND 1000
    GROUP BY b
    

    View v_abc occupies many fewer pages than table wide_tbl. Therefore, it may be better for the optimizer to choose it as an access path to solve the previous query.

    If you want to split a complete table vertically, instead of a subset of it, we recommend that you use a nonclustered index on the table that uses an INCLUDE clause to include only the columns that you want, instead of an indexed view. For more information, see CREATE INDEX (Transact-SQL).

Indexed views typically do not improve the performance of the following types of queries:

  • OLTP systems that have many writes.

  • Databases that have many updates.

  • Queries that do not involve aggregations or joins.

  • Aggregations of data with a high degree of cardinality for the GROUP BY key. A high degree of cardinality means the key contains many different values. A unique key has the highest possible degree of cardinality because every key has a different value. Indexed views improve performance by reducing the number of rows a query has to access. If the view result set has almost as many rows as the base table, there is little performance benefit from using the view. For example, consider the following query on a table that has 1,000 rows:

    SELECT PriKey, SUM(SalesCol)
    FROM ExampleTable
    GROUP BY PriKey
    

    If the cardinality of the table key is 100, an indexed view built using the result of this query would only have 100 rows. Queries that use the view would on average require one tenth of the reads required against the base table. If the key is a unique key, the cardinality of the key is 1000 and the view result set returns 1000 rows. If the view and the ExampleTable base table have rows of equal size, a query has no performance gain by using this indexed view instead of directly reading the base table.

  • Expanding joins. These are views that have result sets that are larger than the original data in the base tables.

Combining Indexed Views with Queries

Although the restrictions on the types of views that can be indexed may prevent you from designing a view that solves a complete problem, you may be able to design multiple smaller indexed views that speed parts of the process.

Consider the following examples:

  • A frequently executed query aggregates data in one database, aggregates data in another database, and then joins the results. Because an indexed view cannot reference tables from more than one database, you cannot design a single view to perform the whole process. However, you can create an indexed view in each database that does the aggregation for that database. If the optimizer can match the indexed views against existing queries, at least the aggregation processing will be speeded up without the requirement to recode existing queries. Although the join processing is not faster, the overall query is faster because it uses the aggregations stored in the indexed views.

  • A frequently executed query aggregates data from several tables, and then uses UNION to combine the results. UNION is not allowed in an indexed view. You can again design views to do each of the individual aggregation operations. The optimizer can then select the indexed views to speed up queries without a requirement to recode the queries. While the UNION processing is not improved, the individual aggregation processes are.

Design indexed views that can satisfy multiple operations. Because the optimizer can use an indexed view even when it is not specified in the FROM clause, a well-designed indexed view can speed the processing of many queries.

For example, consider creating an index on the following view:

CREATE VIEW ExampleView WITH SCHEMABINDING
AS
SELECT GroupKey, SUM(Colx) AS SumColx, COUNT_BIG(Colx) AS CountColx
FROM MyTable
GROUP BY GroupKey

Not only can this view satisfy queries that directly reference the view columns, it can also be used to satisfy queries that query the base table and contain expressions such as SUM(Colx), COUNT_BIG(Colx), COUNT(Colx), and AVG(Colx). All such queries will be faster because they only have to retrieve the small number of rows in the view instead of reading the full number of rows from the base tables.

Similarly, an indexed view that aggregates data and groups by day can be used to satisfy queries that aggregate over several different ranges of more than 1 day, such as 7, 30, or 90 days.