SELECT FROM <model> PREDICTION JOIN (DMX)

Uses a mining model to predict the states of columns in an external data source. The PREDICTION JOIN statement matches each case from the source query to the model.

Syntax

SELECT [FLATTENED] [TOP <n>] <select expression list> 
FROM <model> | <sub select> [NATURAL] PREDICTION JOIN 
<source data query> [ON <join mapping list>] 
[WHERE <condition expression>]
[ORDER BY <expression> [DESC|ASC]]

Arguments

  • n
    Optional. An integer that specifies how many rows to return.
  • select expression list
    A comma-separated list of column identifiers and expressions that are derived from the mining model.
  • model
    A model identifier.
  • sub select
    An embedded select statement.
  • source data query
    The source query.
  • join mapping list
    Optional. A logical expression that compares columns from the model to columns from the source query.
  • condition expression
    Optional. A condition to restrict the values that are returned from the column list.
  • expression
    Optional. An expression that returns a scalar value.

Remarks

The ON clause defines the mapping between the columns from the source query and the columns from the mining model. This mapping is used to direct columns from the source query to columns in the mining model so that the columns can be used as inputs in creating the predictions. Columns in the <join mapping list> are related by using an equal sign (=), as shown in the following example:

[MiningModel].ColumnA = [source data query].Column1 AND 
[MiningModel].ColumnB = [source data query].Column2 AND
...

If you are binding a nested table in the ON clause, make sure that you bind the key column with any non-key columns so that the algorithm can correctly identify which case the record of the nested column belongs to.

The source query for the prediction join can either be a table or a singleton query.

You can specify prediction functions that do not return a table expression in the <select expression list> and the <condition expression>.

NATURAL PREDICTION JOIN automatically maps together column names from the source query that match column names in the model. If you use NATURAL PREDICTION, you should omit the ON clause. NATURAL implies that no ON clause is specified, and that the columns are mapped by name.

Singleton Query Example

The following example shows how to create a query to predict whether a specific person will buy a bicycle in real time, where the data is not stored, but instead is entered directly into the query. The person in the query has the following traits:

  • 35 years old
  • Owns a house
  • Owns two cars
  • Has two children at home

Using the TM Decision Tree mining model and the known characteristics about the subject, the query returns a Boolean value that describes whether the person bought the bike and a set of tabular values, pulled from the PredictHistogram function, that describe how the prediction was made.

SELECT
  [TM Decision Tree].[Bike Buyer],
  PredictHistogram([Bike Buyer])
FROM
  [TM Decision Tree]
NATURAL PREDICTION JOIN
(SELECT 35 AS [Age],
  '5-10 Miles' AS [Commute Distance],
  '1' AS [House Owner Flag],
  2 AS [Number Cars Owned],
  2 AS [Total Children]) AS t

Open Query Example

The following example shows how to create a query that predicts whether each person from an external dataset will buy a bike. Because the table exists within a data source view on the instance of Analysis Services, the query uses OPENQUERY to retrieve the data. Because the table differs from the column structure of the mining model, the ON statement is used to map the table to the model.

The query returns the first and last name of each person in the table, together with a Boolean column that describes whether each person is likely to buy a bike and the probability that each person will buy a bike.

SELECT
  t.[LastName],
  t.[FirstName],
  [TM Decision Tree].[Bike Buyer],
  PredictProbability([Bike Buyer])
From
  [TM Decision Tree]
PREDICTION JOIN
  OPENQUERY([Adventure Works DW],
    'SELECT
      [LastName],
      [FirstName],
      [MaritalStatus],
      [Gender],
      [YearlyIncome],
      [TotalChildren],
      [NumberChildrenAtHome],
      [Education],
      [Occupation],
      [HouseOwnerFlag],
      [NumberCarsOwned]
    FROM
      [dbo].[ProspectiveBuyer]
    ') AS t
ON
  [TM Decision Tree].[Marital Status] = t.[MaritalStatus] AND
  [TM Decision Tree].[Gender] = t.[Gender] AND
  [TM Decision Tree].[Yearly Income] = t.[YearlyIncome] AND
  [TM Decision Tree].[Total Children] = t.[TotalChildren] AND
  [TM Decision Tree].[Number Children At Home] = t.[NumberChildrenAtHome] AND
  [TM Decision Tree].[Education] = t.[Education] AND
  [TM Decision Tree].[Occupation] = t.[Occupation] AND
  [TM Decision Tree].[House Owner Flag] = t.[HouseOwnerFlag] AND
  [TM Decision Tree].[Number Cars Owned] = t.[NumberCarsOwned]

Association Algorithm Example

The following example shows how to create a prediction by using a model that is built from the Microsoft Association algorithm. The query returns three products that are most likely to exist in the same basket as the following three products:

  • Mountain Bottle Cage
  • Mountain Tire Tube
  • Mountain-200

A parameter is used with the prediction function in the SELECT list to limit the number of items that are returned by the query.

SELECT
  PREDICT([Association].[v Assoc Seq Line Items], 3)
FROM
  [Association]
NATURAL PREDICTION JOIN
(SELECT (SELECT 'Mountain Bottle Cage' AS [Model]
  UNION SELECT 'Mountain Tire Tube' AS [Model]
  UNION SELECT 'Mountain-200' AS [Model]) AS [products]) AS t

See Also

Reference

SELECT (DMX)
Data Mining Extensions (DMX) Data Definition Statements
Data Mining Extensions (DMX) Data Manipulation Statements
Data Mining Extensions (DMX) Statement Reference

Help and Information

Getting SQL Server 2005 Assistance