What's New (Analysis Services - Data Mining)

Note

For SQL Server 2008 R2, there have been no changes to the content that is listed in this topic.

This latest release of Microsoft SQL Server Analysis Services introduces new features and enhancements. For information about the new in-memory analysis features, and about Sharepoint integrated mode for Analysis Services, see PowerPivot for Sharepoint, see PowerPivot for SharePoint.

Compatibility with the Data Mining Add-Ins for Office 2007

SQL Server 2008 R2 supports the creation, management, and use of data mining models from Microsoft Excel when you use the SQL Server 2008 Data Mining Add-ins for Office 2007. The version of this popular free add-in can be used to connect to instances of Analysis Services that are using either SQL Server 2008 R2 or SQL Server 2008. You cannot directly use the in-memory multidimensional data sets that are created by PowerPivot for Excel.

Compatibility with PowerPivot for Excel 2010

You can install the Data Mining Add-ins on the same computer as the PowerPivot for Excel add-in, and use them within the same Excel 2010 workbook. However, to use the Data Mining add-ins you must have installed a 32-bit version of Excel 2010. The PowerPivot client can run on either a 32-bit or 64-bit version of Excel 2010.

For more information about the add-ins, see Data Mining Add-ins for Office 2007.

New Samples and Resources

Books Online no longer includes SQL Server sample databases and sample applications. These sample databases and sample applications are now available on the SQL Server Samples Web site. This Web site makes it easier for users to find these samples, and provides additional new samples that are related to Microsoft SQL Server and Business Intelligence. On the SQL Server Samples Web site, you can do the following:

  • Browse through samples contributed by developers, users, and the Microsoft Most Valuable Professional (MVP) community.

  • Download both sample databases and code projects.

  • View or participate in a discussion area where you can report issues and ask questions about the samples for each technology area.

  • Additional samples that use the PowerPivot client and the new Sharepoint Integrated instance of Analysis Services can be found on the site, PowerPivot.com.

Data Mining Features in SQL Server 2008

The R2 release of SQL Server 2008 supports the following features that were new in SQL Server 2008.

Creation of Holdout Test Sets

When you create a mining structure, you can now divide the data in the mining structure into training and testing sets. The definition of the partition is stored with the structure, so that you can reuse the training and testing sets with any mining models that are based on that structure.

For more information about how to use training and testing data sets, see Partitioning Data into Training and Testing Sets (Analysis Services - Data Mining).

For more information about all the model validation features in SQL Server 2008, see Validating Data Mining Models (Analysis Services - Data Mining).

Filtering on Model Cases

You can now attach filters to a mining model, and apply the filter during both training and testing. Applying a filter to the model lets you control the data that is used to train the model, and lets you more easily assess the performance of the model on subsets of the data.

For more information about how to create mining model filters, see Creating Filters for Mining Models (Analysis Services - Data Mining).

For information about filtering data for mining model testing, see Tools for Charting Model Accuracy (Analysis Services - Data Mining).

Cross-Validation of Multiple Mining Models

Cross-validation is an established method of assessing the accuracy of data mining models. In cross-validation, you iteratively partition the mining structure data into subsets, build models on the subsets, and then measure the accuracy of the model for each partition. By reviewing the returned statistics, you can determine how reliable the mining model is, and more easily compare models that are based on the same structure.

For more information, Cross-Validation (Analysis Services - Data Mining).

Enhancements to the Microsoft Time Series Algorithm

To improve the accuracy and stability of some predictions in time series models, a new algorithm has been added to the Microsoft Time Series algorithm. Based on the well-known ARIMA algorithm, the new algorithm provides better long-term predictions than the ARTxp algorithm that Analysis Services has been using. (ARTxp is an auto-regressive tree algorithm that is optimized for either a single time slice or short-term predictions.)

For more information about time series mining models, see Microsoft Time Series Algorithm and PredictTimeSeries (DMX).

Drillthrough to Structure Cases and Structure Columns

In SQL Server 2008, if you enable drillthrough on a mining structure, you can query the mining structure and return details about the cases used for both training and testing. You can create drillthrough queries on a structure by using Data Mining Extensions (DMX).

For more information, see Using Drillthrough on Mining Models and Mining Structures (Analysis Services - Data Mining).

For examples of DMX queries on a mining structure, see SELECT FROM <structure>.CASES.

For examples of drillthrough from a model to structure data, see SELECT FROM <model>.CASES (DMX).

Aliasing Mining Model Columns

You can now add aliases to columns in a mining model to make it easier to understand column content and reference the column in DMX statements.

For more information about how to manage and view aliases, see Setting Properties on a Mining Model, or How to: Create an Alias for a Model Column.

For information about how to create a column alias by using DMX, see ALTER MINING STRUCTURE (DMX).

Querying the Data Mining Schema Rowsets

In SQL Server 2008. many of the existing OLE DB data mining schema rowsets have been exposed as a set of system tables that you can easily query by using DMX statements. This makes it easy to retrieve metadata related to models and structures, to extract details from the mining model content, or to monitor an Analysis Services instance or service.

For more information, see Querying the Data Mining Schema Rowsets (Analysis Services - Data Mining).

Changes to Clustering Models

In SQL Server 2008, the default configuration of the Microsoft clustering algorithm was changed to use z-score normalization by default. The intent of this change is to minimize the effect of attributes that might have large magnitudes and many outliers. Typically, z-score normalization improves clustering results. However, it might alter the clustering results on non-normal distributions. Also, customers who migrate solutions from an earlier version of Analysis Services to SQL Server 2008 Analysis Services might notice that clustering models now produce different results. For more information, see Microsoft Clustering Algorithm Technical Reference

Enhancements to Analysis Services

If you use Analysis Services to create OLAP cubes that you also use for data mining, you might find it much easier to design dimensions and their related hierarchies and attributes. The Dimension Designer includes a new Attribute Relationship designer that helps you design attribute relationships and make sure that attribute relationships follow best practices.

For more information, see What's New (Analysis Services - Multidimensional Database).