PredictHistogram (DMX)
Returns a table that represents a histogram for the prediction of a given column.
A histogram generates statistics columns. The column structure of the returned histogram depends on the type of column reference that is used with the PredictHistogram function.
Scalar Columns
For a <scalar column reference>, the histogram that the PredictHistogram function returns consists of the following columns:
The value that is being predicted.
$Support
$Probability
$ProbabilityVariance
Microsoft data mining algorithms do not support $ProbabilityVariance. This column always contains 0 for Microsoft algorithms.
$ProbabilityStdev
Microsoft data mining algorithms do not support $ProbabilityStdev. This column always contains 0 for Microsoft algorithms.
$AdjustedProbability
The $AdjustedProbability column is an Analysis Services extension to the Microsoft OLE DB for Data Mining specification.
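As a rough sketch of how these columns can be addressed individually, the following query uses a sub-select over the histogram to return only the predicted value, $Support, and $Probability. It assumes the [TM Decision Tree] model and the input columns used in the example later in this topic; the input values are illustrative only.

```
-- Sketch only: model name and input values are assumptions taken from
-- the example in this topic.
SELECT
  (SELECT [Bike Buyer], $Support, $Probability
   FROM PredictHistogram([Bike Buyer]))
FROM
  [TM Decision Tree]
NATURAL PREDICTION JOIN
(SELECT 28 AS [Age],
  '2-5 Miles' AS [Commute Distance],
  'Graduate Degree' AS [Education],
  0 AS [Number Cars Owned],
  0 AS [Number Children At Home]) AS t
```

The sub-select returns a nested table with one row per state of the predicted column.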
Cluster Columns
The histogram that the PredictHistogram function returns for a <cluster column reference> consists of the following columns:
$Cluster (represents the cluster name)
$Distance
$Probability
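For comparison, a cluster histogram could be requested as sketched below. The query assumes a clustering model named [TM Clustering] with an [Age] and a [Commute Distance] input column; these names are hypothetical and should be replaced with those of an actual clustering model.

```
-- Sketch only: [TM Clustering] and its input columns are assumed names.
SELECT
  Cluster(),
  PredictHistogram(Cluster())
FROM
  [TM Clustering]
NATURAL PREDICTION JOIN
(SELECT 28 AS [Age],
  '2-5 Miles' AS [Commute Distance]) AS t
```

Here Cluster() serves as the <cluster column reference>, so the nested table should contain the $Cluster, $Distance, and $Probability columns listed above.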
Examples
The following example returns the predicted state of the Bike Buyer column in a singleton query. The query also returns up to three of the most likely states of the Bike Buyer attribute, ranked by the adjusted probability obtained by using the PredictHistogram function.
SELECT
  [TM Decision Tree].[Bike Buyer],
  TopCount(PredictHistogram([Bike Buyer]), $AdjustedProbability, 3)
FROM
  [TM Decision Tree]
NATURAL PREDICTION JOIN
(SELECT 28 AS [Age],
  '2-5 Miles' AS [Commute Distance],
  'Graduate Degree' AS [Education],
  0 AS [Number Cars Owned],
  0 AS [Number Children At Home]) AS t
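Because the histogram is returned as a nested table, a client that cannot consume hierarchical rowsets may need a flat result. One possible variation, sketched here with the same model and inputs as the query above, adds the FLATTENED keyword so that each histogram row is returned as an ordinary row:

```
-- Sketch only: same assumed model and inputs as the preceding query.
SELECT FLATTENED
  [TM Decision Tree].[Bike Buyer],
  TopCount(PredictHistogram([Bike Buyer]), $AdjustedProbability, 3)
FROM
  [TM Decision Tree]
NATURAL PREDICTION JOIN
(SELECT 28 AS [Age],
  '2-5 Miles' AS [Commute Distance],
  'Graduate Degree' AS [Education],
  0 AS [Number Cars Owned],
  0 AS [Number Children At Home]) AS t
```

In the flattened result, the predicted Bike Buyer value is repeated on every row alongside the histogram columns.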
