Testing a Filtered Model (Basic Data Mining Tutorial)
Applies To: SQL Server 2016 Preview
Now that you have determined that the TM_Decision_Tree model is the most accurate, you will customize the model to better suit the needs of the Adventure Works Cycles targeted mailing campaign. Specifically, the Marketing department would like to know if there are any differences between male and female customers. The information could help them decide which magazines to use for advertising and which products to feature in their mailings.
Filtering enables you to easily create models built on subsets of your data. The filter is applied only to the model and does not change the underlying data source.
In this lesson, you will create a model that is filtered on gender, to predict the characteristics that most influence buying behavior in males and female.
First you will make a copy of the TM_Decision_Tree model.
In SQL Server Data Tools (SSDT), in Solution Explorer, select BasicDataMining.
Click the Mining Models tab.
Right click the TM_Decision_Tree model, and select New Mining Model.
In the Model name field, type TM_Decision_Tree_Male.
Next, create a filter to select customers for the model based on their gender.
Right-click the TM_Decision_Tree_Male mining model to open the shortcut menu.
-- or --
Select the model. On the Mining Model menu, select Set Model Filter.
In the Model Filter dialog box, click the top row in the grid, in the Mining Structure Column text box.
The drop-down list displays only the names of the columns in that table.
In the Mining Structure Column text box, select Gender.
The icon at the left side of the text box changes to indicate that the selected item is a table or a column.
Click the Operator text box and select the equal (=) operator from the list.
Click the Value text box, and type M.
Click the next row in the grid.
Click OK to close the Model Filter dialog box.
The filter displays in the Properties window. Alternately, you can launch the Model Filter dialog from the Properties window.
Repeat the above steps, but this time name the model TM_Decision_Tree_Female and type F in the Value text box.
Models cannot be used until they have been deployed and processed. For more information on processing models, see Processing Models in the Targeted Mailing Structure (Basic Data Mining Tutorial).
Right-click the TM_Decision_Tree_Male model and select Process Mining Structure and all Models
Click Run to process the new models.
After processing is complete, click Close on both processing windows.
You now have two new models displayed in the Mining Models tab.
View the results and assess the accuracy of the filtered models in much the same way as you did for the previous three models. For more information, see:
Select the Mining Model Viewer tab in Data Mining Designer.
In the Mining Model box, select TM_Decision_Tree_Male.
Slide Show Level to 3.
Change the Background value to 1.
Place your cursor over the node labeled All to see the number of bike buyers versus non-bike buyers.
Repeat steps 1 - 5 for TM_Decision_Tree_Female.
Explore the results for the TM_Decision_Tree and the models filtered for gender. Compared to all bike buyers, male and female bike buyers share some of the same characteristics as the unfiltered bike buyers but all three have interesting differences as well. This is useful information that Adventure Works Cycles can use to develop their marketing campaign.
Switch to the Mining Accuracy Chart tab in Data Mining Designer in SQL Server Data Tools (SSDT) and select the Input Selection tab.
In the Select data set to be used for Accuracy Chart group box, select Use mining structure test cases.
On the Input Selection tab of Data Mining Designer, under Select predictable mining model columns to show in the lift chart, select the checkbox for Synchronize Prediction Columns and Values.
In the Predictable Column Name column, verify that Bike Buyer is selected for each model.
In the Show column, select each of the models.
In the Predict Value column, select 1.
Select the Lift Chart tab to display the lift chart.
You will now notice that all three Decision Tree models provide significant lift compared to the random guess model, and also outperform the Clustering and Naive-Bayes models.
For more information on filters, see Filters for Mining Models (Analysis Services - Data Mining).
For an example of how to apply filters to nested tables, see Intermediate Data Mining Tutorial (Analysis Services - Data Mining).