Lesson 4: Executing Market Basket Predictions

New: 5 December 2005

In this lesson, you will use the SELECT FROM <model> PREDICTION JOIN (DMX) form of the SELECT statement to create predictions based on the association models you created in Lesson 2: Adding Mining Models to the Market Basket Mining Structure. These predictions types are defined below.

The SELECT FROM <model> PREDICTION JOIN (DMX) form of the SELECT statement contains three parts:

  • A list of the mining model columns and prediction functions that are returned in the result set. This can also contain input columns from the source data.
  • The source query defining the data that is being used to create a prediction. For example, in a batch query this could be a list of customers.
  • A mapping between the mining model columns and the source data. If these names match, then you can use NATURAL syntax and leave out the column mappings.

You can further enhance the query by using prediction functions. Prediction functions provide additional information, such as the probability of a prediction occurring, and the support for the prediction in the training dataset. For more information about prediction functions, see Functions (DMX).

You can also use the prediction query builder in Business Intelligence Development Studio to create prediction queries. For more information, see Using the Prediction Query Builder to Create DMX Prediction Queries.

Lesson Tasks

You will perform the following tasks in this lesson:

  • Create a query that determines what other items a customer will likely purchase, based on items already existing in their shopping cart. You will create this query by using the mining model with the default MINIMUM_PROBABILITY.
  • Create a query that determines what other items a customer will likely purchase based on items already existing in their shopping cart. You will create this query by using the mining model a MINIMUM_PROBABILITY of 0.01.

Singleton PREDICTION JOIN Statement

The first step is to use the SELECT FROM <model> PREDICTION JOIN (DMX) in a singleton prediction query. The following is a generic example of the singleton statement:

SELECT <select list>
    FROM [<mining model>] NATURAL PREDICTION JOIN
(SELECT '<value>' AS [<column>], 
    (SELECT 'value' AS [<nested column>] UNION
        SELECT 'value' AS [<nested column>] ...) 
    AS [<nested table>])
AS [<input alias>]

The first line of the code defines columns from the mining model that the query returns, as well as the mining model used to generate the prediction:

SELECT <select list> FROM [<mining model>] 

The next lines of the code define the products in the shopping cart that will be used to predict additional products that a customer will add:

(SELECT '<value>' AS [<column>], 
    (SELECT 'value' AS [<nested column>] UNION
        SELECT 'value' AS [<nested column>] ...) 
    AS [<nested table>])

Create a Prediction by Using a Model with the Default MINIMUM_PROBABILITY

To create an association query

  1. In Object Explorer, right-click the instance of Analysis Services, point to New Query, and then click DMX.

    Query Editor opens and contains a new, blank query.

  2. Copy the generic example of the PREDICTION JOIN statement into the blank query.

  3. Replace the following:

    <select list> 
    

    with:

    PREDICT([Association].[Products],INCLUDE_STATISTICS,3)
    

    You could just include the column name [Products], but by using the Predict function, you can limit the number of products that are returned by the algorithm to three. You can also use INCLUDE_STATISTICS, which returns the support, probability, and adjusted probability for each product. These statistics help you rate the accuracy of the prediction.

  4. Replace the following:

    [<mining model>] 
    

    with:

    [Association]
    
  5. Replace the following:

    (SELECT '<value>' AS [<column>], 
        (SELECT 'value' AS [<nested column>] UNION
            SELECT 'value' AS [<nested column>] ...) 
        AS [<nested table>])
    

    with:

    (SELECT (SELECT 'Mountain Bottle Cage' AS [Model]
      UNION SELECT 'Mountain Tire Tube' AS [Model]
      UNION SELECT 'Mountain-200' AS [Model]) AS [Products]) AS t
    

    This statement uses the UNION statement to specify three products that must be included in the shopping cart along with the predicted products. The Model column in the SELECT statement corresponds to the model column contained in the nested products table.

    The complete statement should now be as follows:

    SELECT
      PREDICT([Association].[Products],INCLUDE_STATISTICS,3)
    From
      [Association]
    NATURAL PREDICTION JOIN
    (SELECT (SELECT 'Mountain Bottle Cage' AS [Model]
      UNION SELECT 'Mountain Tire Tube' AS [Model]
      UNION SELECT 'Mountain-200' AS [Model]) AS [Products]) AS t
    
  6. On the File menu, click Save DMXQuery1.dmx As.

  7. In the Save As dialog box, browse to the appropriate folder, and name the file Association Prediction.dmx.

  8. On the toolbar, click the Execute button.

    The query returns a table containing three products (HL Mountain Tire, Fender Set - Mountain, and ML Mountain Tire) in order of likelihood of existing in the same shopping cart as the three products specified in the query. The table also contains statistics describing the accuracy of the prediction.

Create a Prediction by Using a Model with a MINIMUM_PROBABILITY of 0.01

To create an association query

  1. In Object Explorer, right-click the instance of Analysis Services, point to New Query, and then click DMX.

    Query Editor opens and contains a new, blank query.

  2. Copy the generic example of the PREDICTION JOIN statement into the blank query.

  3. Replace the following:

    <select list> 
    

    with:

    PREDICT([Modified Association].[Products],INCLUDE_STATISTICS,3)
    
  4. Replace the following:

    [<mining model>] 
    

    with:

    [Modified Association]
    
  5. Replace the following:

    (SELECT '<value>' AS [<column>], 
        (SELECT 'value' AS [<nested column>] UNION
            SELECT 'value' AS [<nested column>] ...) 
        AS [<nested table>])
    

    with:

    (SELECT (SELECT 'Mountain Bottle Cage' AS [Model]
      UNION SELECT 'Mountain Tire Tube' AS [Model]
      UNION SELECT 'Mountain-200' AS [Model]) AS [Products]) AS t
    

    This statement uses the UNION statement to specify three products that must be included in the shopping cart along with the predicted products. The Model column in the SELECT statement corresponds to the model column contained in the nested products table.

    The complete statement should now be as follows:

    SELECT
      PREDICT([Modified Association].[Products],INCLUDE_STATISTICS,3)
    From
      [Modified Association]
    NATURAL PREDICTION JOIN
    (SELECT (SELECT 'Mountain Bottle Cage' AS [Model]
      UNION SELECT 'Mountain Tire Tube' AS [Model]
      UNION SELECT 'Mountain-200' AS [Model]) AS [Products]) AS t
    
  6. On the File menu, click Save DMXQuery1.dmx As.

  7. In the Save As dialog box, browse to the appropriate folder, and name the file Modified Association Prediction.dmx.

  8. On the toolbar, click the Execute button.

    The query returns a table containing three products (HL Mountain Tire, Water Bottle, Fender Set - Mountain) in order of likelihood of existing in the same shopping cart as the three products specified in the query. The table also contains statistics describing the accuracy of the prediction.

    You can see from the results of this query that the value of the MINIMUM_PROBABILITY parameter affects the results returned by the query.

This is the last step in the Market Basket tutorial. You now have a set of models that you can use to predict the types of products that customers tend to purchase at the same time.

To learn how to use DMX in another predictive scenario, see Bike Buyer DMX Tutorial.