Creating a Market Basket Structure and Model (Intermediate Data Mining Tutorial)
Applies To: SQL Server 2016 Preview
Now that you have created a data source view, you will use the Data Mining Wizard to create a new mining structure. In this task, you will create a mining structure and a mining model that is based on the Microsoft Association algorithm.
In Solution Explorer in SQL Server Data Tools (SSDT), right-click Mining Structures and select New Mining Structure to open the Data Mining Wizard.
On the Welcome to the Data Mining Wizard page, click Next.
On the Select the Definition Method page, verify that From existing relational database or data warehouse is selected, and then click Next.
On the Create the Data Mining Structure page, under Which data mining technique do you want to use?, select Microsoft Association Rules from the list, and then click Next. The Select Data Source View page appears.
Select Ordersunder Available data source views, and then click Next.
On the Specify Table Types page, in the row for the vAssocSeqLineItems table, select the Nested check box, and in the row for the nested table vAssocSeqOrders, select the Case check box. Click Next.
On the Specify the Training Data page, clear any boxes that might be checked. Set the key for the case table, vAssocSeqOrders, by selecting the Key check box next to OrderNumber.
Because the purpose of the market basket analysis is to determine which products are included in a single transaction, you do not have to use the CustomerKey field.
Set the key for the nested table, vAssocSeqLineItems, by selecting the Key check box next to Model. The Input check box is also automatically selected when you do this. Select the Predictable check box for Model as well.
In a market basket model, you do not care about the sequence of products in the shopping basket, and therefore you should not include LineNumber as a key for the nested table. You would use LineNumber as a key only in a model where the sequence is important. You will create a model that uses the Microsoft Sequence Clustering algorithm in Lesson 4.
Select the check box to the left of IncomeGroup and Region,but do not make any other selections. Checking the leftmost column adds the columns to the structure for later reference, but the columns will not be used in the model. Your selections should look like the following:
On the Specify Columns' Content and Data Typepage, review the selections, which should be as shown in the following table, and then click Next.
Columns Content Type Data Type IncomeGroup Discrete Text Order Number Key Text Region Discrete Text vAssocSeqLineItems Model Key Text
On the Create testing set page, the default value for the option Percentage of data for testing is 30 percent. Change this to 0. Click Next.
Analysis Services provides different charts for measuring model accuracy. However, some accuracy chart types, such as the lift chart and cross-validation report, are designed for classification and estimation. They are not supported for associative prediction.
On the Completing the Wizard page, in Mining structure name, type Association.
In Mining model name, type Association.
Select the option Allow drill through, and then click Finish.
Data Mining Designer opens to display the Association mining structure that you just created.