How to: Create a Cross-Validation Report

To create a cross-validation report in SQL Server Analysis Services, you must complete steps that are slightly different from the procedures for other kinds of accuracy charts. This is because a cross-validation report does not use external data, but instead uses only the original data that was used to train the model. Therefore, you do not need to select a data source or map columns in the data source to the columns in the mining model.

The following procedure describes the steps to configure and generate a cross-validation report. For general information about cross-validation, see Cross-Validation (Analysis Services - Data Mining).

For information about how to create other kinds of accuracy charts, see How to: Create an Accuracy Chart for a Mining Model.

Note

Only models that are related to the currently selected structure are available for cross-validation. If the structure supports models that have different predictable attributes, or if the structure supports a combination of clustering and non-clustering models, you must create separate reports for each predictable output, or create separate reports for the clustering and non-clustering models.

To select a mining structure for cross-validation

  1. Open the Data Mining Designer in Business Intelligence Development Studio.

  2. In Solution Explorer, open the database that contains the structure or model for which you want to create a report.

  3. Double-click the mining structure to open the structure and its related models in Data Mining Designer.

  4. Click the Mining AccuracyChart tab.

  5. Click the Cross Validation tab.

To set cross-validation options

  1. On the Cross Validation tab, for Fold Count, click the down arrow to select a number between 1 and 10. The default value is 10.

    The Fold Count represents the number of partitions that will be created within the original data set. If you set Fold Count to 1, the training set will be used without partitioning.

  2. For Target Attribute, click the down arrow, and select a column from the list. If the model is a clustering model, select #Cluster to indicate that the model does not have a predictable attribute.

    You can select only one predictable attribute per report. By default, all related models that have the same predictable attribute are included in the report.

  3. For Max Cases, type a number that is large enough to provide a representative sample of data when the data is split among the specified number of folds. If the number is greater than the count of cases in the model training set, all cases will be used.

    If the training data set is very large, setting the value for Max Cases limits the total number of cases processed, and lets the report finish faster. However, you should not set Max Cases too low or there may be insufficient data for cross-validation.

  4. Optionally, for Target State, type the value of the predictable attribute that you want to model. For example, if the column Bike Buyer has two possible values, 1 (Yes) and 2 (No), you can enter the value 1 to assess the accuracy of the model for bike buyers.

    Note

    If you do not enter a value, the Target Threshold option is not available, and the model is assessed for all possible values of the predictable attribute.

  5. Optionally, for Target Threshold, type a decimal number between 0 and 1 to specify the minimum probability that a prediction must have to be counted as accurate.

    For more information about setting probability thresholds, see Cross-Validation Report (Analysis Services - Data Mining).

  6. Click Get Results.

To print the report

  1. Right-click the completed report on the Cross Validation tab.

  2. In the shortcut menu, select Print or Print Preview to review the report first.

To create a copy of the report in Microsoft Excel

  1. Right-click the completed report on the Cross Validation tab.

  2. In the shortcut menu, select Select All.

  3. Right-click the selected text, and select Copy.

  4. Paste the selection into an open Excel workbook. If you use the Paste option, the report is pasted into Excel as HTML, which preserves row and column formatting,. If you paste the report by using the Paste Special options for text or Unicode text, the report is pasted in row-delimited format.