Creating Predictions (Basic Data Mining Tutorial)

After you have tested the accuracy of your mining models and decided that you are satisfied with them, you can create Data Mining Extensions (DMX) prediction queries by using Prediction Query Builder on the Mining Model Prediction tab in Data Mining Designer.

Prediction Query Builder has three views. With the Design and Query views, you can build and examine your query. You can then run the query and view the results in the Result view.

For more information about how to use Prediction Query Builder, see Creating DMX Prediction Queries.

Creating the Query

The first step in creating a prediction query is to select a mining model and input table.

To select a model and input table

  1. On the Mining Model Prediction tab of Data Mining Designer, in the Mining Model box, click Select Model.

  2. In the Select Mining Model dialog box, navigate through the tree to the Targeted Mailing structure, expand the structure, select TM_Decision_Tree, and then click OK.

  3. In the Select Input Table(s) box, click Select Case Table.

  4. In the Select Table dialog box, in the Data Source list, select Adventure Works DW2008.

  5. In Table/View Name, select the ProspectiveBuyer (dbo) table, and then click OK.

    The ProspectiveBuyer table most closely resembles the vTargetMail case table.

Mapping the Columns

After you select the input table, Prediction Query Builder creates a default mapping between the mining model and the input table, based on the names of the columns. At least one column from the structure must match a column in the external data.

Important

The data that you use to determine the accuracy of the models must contain a column that can be mapped to the predictable column.

To map the structure columns to the input table columns

  1. Right-click the lines connecting the Mining Model window to the Select Input Table window, and select Modify Connections.

    Notice that not every column is mapped. We will add mappings for several Table Columns.

  2. Under Table Column, click the Bike Buyer cell and select ProspectiveBuyer.Unknown from the dropdown.

    This maps the predictable column, [Bike Buyer], to an input table column.

  3. Click OK.

  4. In Solution Explorer, right-click the Targeted Mailing data source view and select View Designer.

  5. Right-click the ProspectiveBuyer table title and select New Named Calculation.

  6. In the Column name box, type calcAge.

  7. In the Expression box, type DATEDIFF(YYYY,[BirthDate],getdate()) and click OK.

    The input table has no corresponding Age column. This expression will calculate customer age from the input table BirthDate column. Since Age was identified as the most influential column for predicting bike buying, it must exist in both the model and input table.

  8. In Data Mining Designer, select the Mining Model Prediction tab and re-open the Modify Connections window.

  9. Under Table Column, click the Age cell and select ProspectiveBuyer.calcAge from the dropdown.

  10. Click OK.

Designing the Prediction Query

To design the prediction query

  1. The first button on the toolbar of the Mining Model Prediction tab is the Switch to design view / Switch to result view / Switch to query view button. Click the down arrow on this button, and select Design.

  2. In the grid on the Mining Model Prediction tab, click the cell in the first empty row in the Source column, and then select Prediction Function.

    This specifies the target column for the PredictProbability function. For more information about functions, see Data Mining Extensions (DMX) Function Reference.

  3. In the Prediction Function row, in the Field column, select PredictProbability.

  4. From the Mining Model window above, select and drag [Bike Buyer] into the Criteria/Argument cell.

    When you let go, [TM_Decision_Tree].[Bike Buyer] appears in the Criteria/Argument cell.

  5. Click the next empty row in the Source column and then select TM_Decision_Tree**.**

  6. In the TM_Decision_Tree row, in the Field column, select Bike Buyer.

  7. In the TM_Decision_Tree row, in the Criteria/Argument column, type =1.

  8. Click the next empty row in the Source column and then select ProspectiveBuyer.

  9. In the ProspectiveBuyer row, in the Field column, select ProspectiveBuyerKey.

    This adds the unique identifier to the prediction query so that you can identify who is and who is not likely to buy a bicycle

  10. Add five more rows to the grid. Fore each row, select ProspectiveBuyer as the Source and then add the following columns in the Field cells:

    • calcAge

    • LastName

    • FirstName

    • AddressLine1

    • AddressLine2

Finally, run the query and browse the results.

To run the query and view results

  1. In the Mining Model Prediction tab, select the Result button.

  2. After the query runs and the results are displayed, you can review the results.

    The Mining Model Prediction tab displays contact information for potential customers who are likely to be bike buyers. The Expression column indicates the probability of the prediction being correct. You can use these results to determine which potential customers to target for the mailing.

  3. Click the Save button to save the results.