SystemGetClusterCrossValidationResults (Analysis Services - Data Mining)
Partitions the mining structure into the specified number of cross-sections, trains a model for each partition, and then returns accuracy metrics for each partition.
Note This stored procedure can be used only with a mining structure that contains at least one clustering model. To cross-validate non-clustering models, you must use SystemGetCrossValidationResults (Analysis Services - Data Mining).
The Return Type table contains scores for each individual partition and aggregates for all models.
The following table describes the columns returned.
Column Name | Description |
|---|---|
ModelName | The name of the model that was tested. |
AttributeName | The name of the predictable column. For cluster models, always null. |
AttributeState | A specified target value in the predictable column. For cluster models, always null. |
PartitionIndex | An 1-based index that identifies which partition the results apply to. |
PartitionSize | An integer that indicates how many cases were included in each partition. |
Test | The type of test that was performed. |
Measure | The name of the measure returned by the test. Measures for each model depend on the type of the predictable value. For a definition of each measure, see Cross-Validation (Analysis Services - Data Mining). For a list of measures returned for each predictable type, see Cross-Validation Report (Analysis Services - Data Mining). |
Value | The value of the specified test measure. |
To return accuracy metrics for the entire data set, use SystemGetClusterAccuracyResults (Analysis Services - Data Mining).
Also, if the mining model has already been partitioned into folds, you can bypass processing and return only the results of cross-validation by using SystemGetClusterAccuracyResults (Analysis Services - Data Mining).
The following example demonstrates how to partition a mining structure into three folds, and then test two clustering models that are associated with the mining structure.
Line three of the code lists the specific mining models that you want to test. If you do not specify the list, all clustering models associated with the structure are used.
Line four of the code specifies the number of folds, and line five specifies the maximum number of cases to use.
Because these are clustering models, you do not need to specify a predictable attribute or value.
CALL SystemGetClusterCrossValidationResults( [v Target Mail], [Cluster 1], [Cluster 2], 3, 10000 )
Sample results:
ModelName | AttributeName | AttributeState | PartitionIndex | PartitionSize | Test | Measure | Value |
|---|---|---|---|---|---|---|---|
Cluster 1 |
|
| 1 | 3025 | Clustering | Case Likelihood | 0.930524511864121 |
Cluster 1 |
|
| 2 | 3025 | Clustering | Case Likelihood | 0.919184178430778 |
Cluster 1 |
|
| 3 | 3024 | Clustering | Case Likelihood | 0.929651120490248 |
Cluster 2 |
|
| 1 | 1289 | Clustering | Case Likelihood | 0.922789726933607 |
Cluster 2 |
|
| 2 | 1288 | Clustering | Case Likelihood | 0.934865535691068 |
Cluster 2 |
|
| 3 | 1288 | Clustering | Case Likelihood | 0.924724595688798 |

Note