Topic Status: Some information in this topic is preview and subject to change in future releases. Preview information describes new features or changes to existing features in Microsoft SQL Server 2016 Community Technology Preview 2 (CTP2).
This function returns the likelihood that an input case fits the existing model. It is used only with clustering models.
By default, the result of the PredictCaseLikelihood function is normalized. Normalized values are typically more useful as the number of attributes in a case increases and the differences between the raw probabilities of any two cases become much smaller.
The following equation is used to calculate the normalized values, given x and y:
x = likelihood of the case based on the clustering model
y = marginal case likelihood, calculated as the log likelihood of the case based on counting the training cases
z = Exp(Log(x) - Log(y))
Normalized = z / (1 + z)
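The normalization above can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the server's implementation: the function name normalized_likelihood is hypothetical, and the sketch assumes that x and y are both passed as plain positive likelihoods (not values that have already been log-transformed). Working with the difference of logs, as the formula does, keeps the ratio well behaved even when both likelihoods are extremely small.

```python
import math


def normalized_likelihood(x: float, y: float) -> float:
    """Return z / (1 + z), where z = exp(log(x) - log(y)).

    Assumption: x is the model-based case likelihood and y is the
    marginal case likelihood, both given as positive probabilities.
    """
    if x <= 0 or y <= 0:
        raise ValueError("x and y must be positive")

    log_z = math.log(x) - math.log(y)

    # z / (1 + z) is algebraically the same as 1 / (1 + exp(-log_z)).
    # Use whichever form keeps the exponent non-positive to avoid overflow.
    if log_z >= 0:
        return 1.0 / (1.0 + math.exp(-log_z))
    z = math.exp(log_z)
    return z / (1.0 + z)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the shape of the result.
    print(normalized_likelihood(1e-10, 1e-10))
    print(normalized_likelihood(1e-12, 1e-10))
```

Under these assumptions, a case whose model likelihood equals its marginal likelihood normalizes to 0.5, a case that the model explains better than the marginal counts do moves toward 1, and a case that the model explains worse moves toward 0.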
The following example returns the likelihood that the specified case will occur within the clustering model, which is based on the Adventure Works DW database.
SELECT
  PredictCaseLikelihood() AS Default_Likelihood,
  PredictCaseLikelihood(NORMALIZED) AS Normalized_Likelihood,
  PredictCaseLikelihood(NONNORMALIZED) AS Raw_Likelihood
FROM
  [TM Clustering]
NATURAL PREDICTION JOIN
(SELECT 28 AS [Age],
  '2-5 Miles' AS [Commute Distance],
  'Graduate Degree' AS [Education],
  0 AS [Number Cars Owned],
  0 AS [Number Children At Home]) AS t
The difference between these results demonstrates the effect of normalization. The raw value for CaseLikelihood suggests that the probability of the case is about 20 percent; however, when you normalize the results, it becomes apparent that the likelihood of the case is very low.