Understanding the Select Statement (DMX)

The SELECT statement is the basis for most queries that you create with Data Mining Extensions (DMX) in Microsoft SQL Server Analysis Services. It can perform many different kinds of tasks, such as browsing and predicting against data mining models.

Following are the tasks that you can complete by using the SELECT statement:

  • Browse a data mining model. The schema rowset defines the structure of a model.

  • Discover the possible values of a mining model column.

  • Browse the cases that are assigned to nodes in a mining model, or browse representations of those cases.

  • Perform predictions against a variety of input sources.

  • Copy mining models.

Each of these tasks uses a different data domain. You define the data domain in the FROM clause of the statement. For example, if you are browsing the data mining model object, your data domain is the columns that are defined by the schema rowset. Conversely, if you browse the cases of the model, your data domain is the actual column names in the model, such as Gender, Bike Buyer, and so on. In the first case, you are looking at the metadata that is stored in the schema rowset that defines the model; in the second case, you are actually looking at values, or representations of the values, that were used to train the mining model.

Anything that is included in the expression list or in the WHERE clause must come from the data domain that is defined by the FROM clause.

SELECT Types

You use the clauses in the SELECT statement to define the type of task that you want to perform. You can perform the following categories of tasks:

  • Predicting

  • Browsing

  • Copying

  • Drillthrough

Predicting

You can perform predictions based on a mining model by using the following query types.

Query Type

Traits

SELECT FROM [NATURAL] PREDICTION JOIN

Returns a prediction that is created by joining the columns in the mining model to the columns of an internal data source. The domain for this query type is the predictable columns from the model and the columns from the input data source.

SELECT FROM <model>

Returns the most likely state of the predictable column, based only on the mining model. This query type is a shortcut for creating a prediction with an empty prediction join. The domain for this query type is the predictable columns from the model.

You can include any one of the browsing or predicting SELECT statements within the FROM and WHERE clauses of a prediction join SELECT statement. For more information about imbedding a select statement, see SELECT FROM PREDICTION JOIN (DMX). For more information about prediction query types and structure, see Prediction Queries (DMX).

Back to Select Types

Browsing

You can browse the contents of a mining model by using the following query types.

Query Type

Traits

SELECT DISTINCT FROM <model>

Returns all the state values from the mining model for the specified column. The domain for this query type is the data mining model.

SELECT FROM <model>.CONTENT

Returns content that describes the mining model. The domain for this query type is the content schema rowset.

SELECT FROM <model>.DIMENSION_CONTENT

Returns content that describes the mining model. The domain for this query type is the content schema rowset.

SELECT FROM <model>.PMML

Returns the Predictive Model Markup Language (PMML) representation of the mining model, for algorithms that support this functionality. The domain for this query type is the PMML schema rowset.

Back to Select Types

Copying

You can copy a mining model and its associated mining structure into a new model, which you name within the statement, by using the following query type.

Query Type

Traits

SELECT INTO <new model>

Creates a copy of the mining model. The domain for this query type is the data mining model.

Back to Select Types

Drillthrough

You can browse the cases, or a representation of the cases, that were used to train the model, by using the following query types.

Query Type

Traits

SELECT FROM <model>.CASES or SELECT FROM <model>.SAMPLE_CASES

Returns cases, or a representation of cases, that were used to train the mining model. The domain for this query type is the data mining model.

Back to Select Types