Tools for Charting Model Accuracy (Analysis Services - Data Mining)
The Mining Accuracy Chart tab, which is available in both SQL Server Management Studio and Business Intelligence Development Studio, provides multiple tools for use in validating mining models:
Lift charts, profit charts, and scatter plots can all be viewed in the Lift Chart tab. Use the Input Selection tab to choose a model and set options, and then click the Lift Chart tab and select the type of chart that you want from the Chart Type list. A scatter plot is displayed automatically if the model represents a linear regression.
Classification matrices, sometimes called confusion tables, can be configured in the Input Selection tab and then displayed on the Classification Matrix tab.
Cross-validation reports can be configured and viewed on the Cross Validation tab.
The Mining Accuracy Chart tab cannot be used with time series models.
A lift chart plots the results of prediction queries from a testing dataset against known values for the predictable column that exist in the dataset. The chart displays the results of the mining model, together with a representation of the results that an ideal model would produce, and a representation of the results of random guessing. Any improvement over the random line is called lift. The more lift that the model demonstrates, the more effective the model is. Only mining models that contain discrete predictable attributes can be compared in a lift chart.
You can create a lift chart by using the Input Selection tab to configure the target model and choose a test data set. Then, click the Lift Chart tab to view the completed chart.
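The three lines in a lift chart (model, ideal model, random guess) can be illustrated with a small calculation. The following is a minimal sketch, not the computation that Analysis Services performs internally. It assumes each test case has already been scored with a probability for the target state. The function name, the sample data, and the "yes" target state are hypothetical.

```python
def lift_chart_points(scored_cases, target_value):
    """Return (pct_population, model, ideal, random) tuples for a lift chart.

    scored_cases: list of (predicted_probability, actual_value) pairs
    taken from a testing dataset (hypothetical input format).
    """
    # Rank cases from most likely to least likely to be the target state.
    ranked = sorted(scored_cases, key=lambda case: case[0], reverse=True)
    total_targets = sum(1 for _, actual in ranked if actual == target_value)
    if total_targets == 0:
        raise ValueError("the testing dataset contains no target cases")

    points = []
    found = 0
    for position, (_, actual) in enumerate(ranked, start=1):
        if actual == target_value:
            found += 1
        pct_population = position / len(ranked)
        model = found / total_targets                         # targets captured so far
        ideal = min(position, total_targets) / total_targets  # perfect ranking
        random_guess = pct_population                         # diagonal baseline
        points.append((pct_population, model, ideal, random_guess))
    return points


# Hypothetical scored test cases: (probability of "yes", actual value).
cases = [
    (0.90, "yes"), (0.80, "yes"), (0.70, "no"), (0.60, "yes"),
    (0.30, "no"), (0.20, "no"), (0.10, "no"), (0.05, "no"),
]
curve = lift_chart_points(cases, "yes")
```

In this sample, the model captures all target cases after half of the population has been examined, whereas random guessing would capture only half of them at that point. The gap between the two values is the lift.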
A profit chart is a variant of the lift chart that integrates information about the business cost of using the predictions generated by a model. After you enter facts related to costs, such as mailing fees, Analysis Services displays a curve that shows the lift provided by the model, and also calculates the return on the investment when the model is used.
You can create a profit chart by using the Input Selection tab to configure the target model and choose a test data set. Then, click the Lift Chart tab and select Profit Chart from the Chart Type list. The Profit Chart Settings dialog box opens automatically. After you have configured the parameters that are unique to profit charts, the chart that is displayed in the Mining Accuracy Chart tab changes to display profits and losses per unit.
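The idea behind a profit chart can be sketched as a running profit total over cases ranked by the model. This is an illustration only. The cost parameters (fixed_cost, cost_per_contact, revenue_per_response), the sample data, and the formula itself are assumptions for the example, not the exact settings or calculation used by Analysis Services.

```python
def profit_curve(scored_cases, target_value, fixed_cost,
                 cost_per_contact, revenue_per_response):
    """Return the profit after contacting the top 1, 2, ..., N ranked cases.

    scored_cases: list of (predicted_probability, actual_value) pairs
    (hypothetical input format).
    """
    # Contact the cases the model considers most likely to respond first.
    ranked = sorted(scored_cases, key=lambda case: case[0], reverse=True)
    profits = []
    responders = 0
    for contacted, (_, actual) in enumerate(ranked, start=1):
        if actual == target_value:
            responders += 1
        revenue = responders * revenue_per_response
        cost = fixed_cost + contacted * cost_per_contact
        profits.append(revenue - cost)
    return profits


# Hypothetical scored test cases and cost figures (for example, mailing fees).
cases = [
    (0.90, "yes"), (0.80, "yes"), (0.70, "no"), (0.60, "yes"),
    (0.30, "no"), (0.20, "no"), (0.10, "no"), (0.05, "no"),
]
profits = profit_curve(cases, "yes", fixed_cost=10,
                       cost_per_contact=2, revenue_per_response=20)

# Number of top-ranked cases to contact for the highest profit.
best_count = profits.index(max(profits)) + 1
```

With these sample figures, profit peaks after the four highest-ranked cases are contacted and declines afterward, because each additional contact adds cost without adding a response.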
A scatter plot charts the accuracy of a model that predicts a continuous attribute by comparing the actual value with the predicted value for each case. A scatter plot is generated instead of a lift chart whenever the predictable attribute has continuous values.
If your model supports the required predictable column and input columns, you can create a scatter plot on the Mining Accuracy Chart tab of Data Mining Designer. First, use the Input Selection tab to configure the target model and choose a test data set. Then, click the Lift Chart tab. The chart that is displayed in the Mining Accuracy Chart tab automatically changes to display a graph that plots the predicted values against the actual values.
For More Information: Scatter Plot (Analysis Services - Data Mining)
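The data behind a scatter plot of this kind can be sketched as one (actual, predicted) point per test case. The following is a minimal illustration with hypothetical values. The mean absolute error is added here only as a convenient one-number summary of how far the points fall from the diagonal; it is not described as part of the chart.

```python
def scatter_points(actuals, predictions):
    """Pair each actual value with the model's prediction for the same case."""
    if len(actuals) != len(predictions):
        raise ValueError("actuals and predictions must have the same length")
    return list(zip(actuals, predictions))


def mean_absolute_error(points):
    """Average vertical distance of the points from the ideal line y = x."""
    return sum(abs(actual - predicted) for actual, predicted in points) / len(points)


# Hypothetical continuous values from a testing dataset.
actual_values = [10.0, 20.0, 30.0, 40.0]
predicted_values = [12.0, 18.0, 33.0, 40.0]

points = scatter_points(actual_values, predicted_values)
residuals = [predicted - actual for actual, predicted in points]
mae = mean_absolute_error(points)
```

A point that lies exactly on the diagonal, such as (40.0, 40.0) here, represents a perfect prediction. The farther a point is from the diagonal, the larger the prediction error for that case.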
A classification matrix is another way of examining how accurately the mining models in a structure create predictions. To build a classification matrix, Analysis Services counts the number of good and bad predictions, using the actual values that exist in the testing dataset. The matrix is a valuable tool because it not only shows how frequently the model correctly predicted a value, but also shows which values the model predicted incorrectly. A classification matrix shows the actual count of true positives, false positives, true negatives, and false negatives for each predictable attribute.
You can create a classification matrix in the Mining Accuracy Chart tab of Data Mining Designer. First, use the Input Selection tab to configure the target model and choose a test data set. Then, click the Classification Matrix tab. The matrix is displayed automatically, with no further configuration required.
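The counting that produces a classification matrix can be sketched in a few lines. This is an illustrative example with hypothetical data. The layout (predicted values as rows, actual values as columns) is an assumption made for the sketch.

```python
from collections import Counter


def classification_matrix(actuals, predictions):
    """Count each (predicted, actual) combination.

    Returns a nested dict: matrix[predicted][actual] -> number of cases.
    """
    if len(actuals) != len(predictions):
        raise ValueError("actuals and predictions must have the same length")
    counts = Counter(zip(predictions, actuals))
    labels = sorted(set(actuals) | set(predictions))
    return {
        predicted: {actual: counts[(predicted, actual)] for actual in labels}
        for predicted in labels
    }


# Hypothetical binary outcomes from a testing dataset (1 = positive state).
actual_values = [1, 1, 0, 0, 1, 0, 0, 1]
predicted_values = [1, 0, 0, 0, 1, 1, 0, 1]

matrix = classification_matrix(actual_values, predicted_values)

true_positives = matrix[1][1]
false_positives = matrix[1][0]
false_negatives = matrix[0][1]
true_negatives = matrix[0][0]
```

The diagonal cells (predicted equals actual) hold the correct predictions. The off-diagonal cells show which values the model confused with which.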
Cross-validation is an advanced data mining technique that helps you measure the validity of your model. When you create a cross-validation report, Analysis Services divides your data set into multiple cross-sections, automatically creates and trains multiple models on the subsets, and then calculates accuracy for all the models. By reviewing the statistics that are generated, you can assess how well a model generalizes across different data sets, or determine which of several models on a structure performs the best.
You can create a cross-validation report in the Mining Accuracy Chart tab of Data Mining Designer by selecting a model or structure, and then using the Cross Validation tab to set options for the number of folds, the target attribute, and so forth.
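The general cross-validation procedure (partition the data into folds, train on all folds but one, test on the held-out fold, repeat) can be sketched as follows. This is not how Analysis Services partitions data or trains models. The round-robin fold assignment and the trivial majority-class "model" are stand-ins chosen to keep the example self-contained, and all names and data are hypothetical.

```python
from collections import Counter


def k_fold_indices(n_cases, n_folds):
    """Assign case i to fold i % n_folds (round-robin, no shuffling)."""
    folds = [[] for _ in range(n_folds)]
    for i in range(n_cases):
        folds[i % n_folds].append(i)
    return folds


def cross_validate(labels, n_folds):
    """Return one accuracy value per fold.

    The stand-in model always predicts the most common label seen in
    its training folds; a real mining model would be trained here instead.
    """
    accuracies = []
    for held_out in k_fold_indices(len(labels), n_folds):
        held_out_set = set(held_out)
        training = [labels[i] for i in range(len(labels)) if i not in held_out_set]
        majority = Counter(training).most_common(1)[0][0]  # "train" the model
        correct = sum(1 for i in held_out if labels[i] == majority)
        accuracies.append(correct / len(held_out))
    return accuracies


# Hypothetical target attribute values for nine cases, split into three folds.
labels = ["yes", "no", "no", "no", "yes", "no", "no", "no", "yes"]
fold_accuracies = cross_validate(labels, n_folds=3)
mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
```

Accuracy values that are similar across folds suggest that the model generalizes consistently. Large differences between folds suggest that results depend heavily on which cases were used for training.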