Creating a Targeted Mailing Mining Model Structure (Basic Data Mining Tutorial)
Applies To: SQL Server 2016 Preview
The first step in creating a targeted mailing scenario is to use the Data Mining Wizard in SQL Server Data Tools (SSDT) to create a new mining structure and decision tree mining model.
In this task, you will set up a new mining structure and add an initial mining model based on the Microsoft Decision Trees algorithm. To create the structure, you will first select tables and views and then identify which columns will be used for training and which for testing.
To create a mining structure for the targeted mailing scenario
In Solution Explorer, right-click Mining Structures and select New Mining Structure to start the Data Mining Wizard.
On the Welcome to the Data Mining Wizard page, click Next.
On the Select the Definition Method page, verify that From existing relational database or data warehouse is selected, and then click Next.
On the Create the Data Mining Structure page, under Which data mining technique do you want to use?, select Microsoft Decision Trees, and then click Next.
If you get a warning that no data mining algorithms can be found, the project properties might not be configured correctly. This warning occurs when the project attempts to retrieve a list of data mining algorithms from the Analysis Services server and cannot find the server. By default, SQL Server Data Tools will use localhost as the server. If you are using a different instance, or a named instance, you must change the project properties. For more information, see Creating an Analysis Services Project (Basic Data Mining Tutorial).
On the Select Data Source View page, in the Available data source views pane, select Targeted Mailing. To view the tables in the data source view, click Browse, and then click Close to return to the wizard. Click Next.
On the Specify Table Types page, select the check box in the Case column for vTargetMail to use it as the case table, and then click Next. You will use the ProspectiveBuyer table later for testing; ignore it for now.
On the Specify the Training Data page, you will identify at least one predictable column, one key column, and one input column for your model. Select the check box in the Predictable column in the BikeBuyer row.
Notice the warning at the bottom of the window. You will not be able to navigate to the next page until you select at least one Input and one Predictable column.
Click Suggest to open the Suggest Related Columns dialog box.
The Suggest button is enabled whenever at least one predictable attribute has been selected. The Suggest Related Columns dialog box lists the columns that are most closely related to the predictable column, and orders the attributes by their correlation with the predictable attribute. Columns with a significant correlation (confidence greater than 95%) are automatically selected to be included in the model.
Review the suggestions, and then click Cancel to ignore them.
If you click OK, all listed suggestions will be marked as input columns in the wizard. If you agree with only some of the suggestions, you must change the values manually.
Verify that the check box in the Key column is selected in the CustomerKey row.
If the source table from the data source view indicates a key, the Data Mining Wizard automatically chooses that column as a key for the model.
Select the check boxes in the Input column in the following rows. You can check multiple columns by highlighting a range of cells and pressing CTRL while selecting a check box.
In the leftmost column of the page, select the check boxes in the following rows.
Ensure that these rows are checked only in the leftmost column. These columns will be added to your structure but will not be included in the model. However, after the model is built, they will be available for drillthrough and testing. For more information about drillthrough, see Drillthrough Queries (Data Mining).
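The column roles assigned on the Specify the Training Data page can be pictured with a small Python sketch. This is only an illustration of the rules described above, not how the Data Mining Wizard is implemented. It assumes each column has a single role, which is a simplification. CustomerKey and BikeBuyer come from this tutorial. The names Role, validation_problems, model_columns, ExampleInputColumn, and ExampleDetailColumn are hypothetical and exist only for this example.

```python
from enum import Enum


class Role(Enum):
    """Simplified column roles (assumption: one role per column)."""
    KEY = "key"
    INPUT = "input"
    PREDICTABLE = "predictable"
    STRUCTURE_ONLY = "structure only"  # checked in the leftmost column only


def validation_problems(roles):
    """Return the reasons a selection is incomplete (empty list if none)."""
    assigned = list(roles.values())
    problems = []
    if Role.KEY not in assigned:
        problems.append("no key column")
    if Role.INPUT not in assigned:
        problems.append("no input column")
    if Role.PREDICTABLE not in assigned:
        problems.append("no predictable column")
    return problems


def model_columns(roles):
    """Columns used by the model; structure-only columns are left out."""
    return [name for name, role in roles.items()
            if role is not Role.STRUCTURE_ONLY]


# CustomerKey and BikeBuyer are named in the tutorial; the other two
# column names are hypothetical placeholders for rows you check yourself.
selection = {
    "CustomerKey": Role.KEY,
    "BikeBuyer": Role.PREDICTABLE,
    "ExampleDetailColumn": Role.STRUCTURE_ONLY,
}

print(validation_problems(selection))  # ['no input column']

selection["ExampleInputColumn"] = Role.INPUT
print(validation_problems(selection))  # []
print(model_columns(selection))
```

In this sketch, the first check fails for the same reason the wizard shows its warning: no input column has been selected yet. After an input column is added, the selection is complete, and the structure-only column stays in the structure but is excluded from the list of model columns.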