Structure and Usage of DMX Prediction Queries
In Microsoft SQL Server Analysis Services, you can use the prediction query in Data Mining Extensions (DMX) to predict unknown column values in a new dataset, based on the results of a mining model.
The type of query you use depends on what information you want to obtain from a model. If you want to create simple predictions in real time, for example to know if a potential customer on a Web site fits the persona of a bike buyer, you would use a singleton query. If you want to create a batch of predictions from a set of cases that are contained within a data source, you would use a regular prediction query.
To build a prediction query in DMX, you use a combination of the following elements:
-
SELECT [FLATTENED]
-
TOP
-
FROM <model> PREDICTION JOIN
-
ON
-
WHERE
-
ORDER BY
The SELECT element of a prediction query defines the columns and expressions that will appear in the result set, and can include the following data:
-
Predict or PredictOnly columns from the mining model.
-
Any column from the input data that is used to create the predictions.
-
Functions that return a column of data.
The FROM <model> PREDICTION JOIN element defines the source data to be used to create the prediction. For a singleton query, this is a series of values that are assigned to columns. For an empty prediction join, this is left empty.
The ON element maps the columns that are defined in the mining model to columns in an external dataset. You do not have to include this element if you are creating an empty prediction join query or a natural prediction join.
You can use the WHERE clause to filter the results of a prediction query. You can use a TOP or ORDER BY clause to select most likely predictions. For more information about using these clauses, see SELECT (DMX).
For more information about the syntax of a prediction statement, see SELECT FROM <model> PREDICTION JOIN (DMX) and SELECT FROM <model> (DMX).
