Adding a Data Source View with Nested Tables (Intermediate Data Mining Tutorial)

To create the mining model that will be used for market basket analysis, you must create a new data source view. This data source view will also be used for the sequence clustering scenario.

This data source view is different from others that you may have worked with because it contains a nested table. A nested table is a table that contains multiple rows of information about a single row in the case table. For example, if your model analyzes the purchasing behavior of customers, you would typically use a table that has a unique row for each customer as the case table. However, each customer might make multiple purchases, and you might want to analyze the sequence of purchases, or products that are frequently purchased together. To logically represent these purchases in your model, you add another table to the data source view that lists the purchases for each customer. This nested purchases table has a many-to-one relationship with the customer table. The nested table might contain many rows for each customer, each row containing a single product that was purchased, perhaps with additional information about the order that the purchases were made, the price at the time of the order, or any promotions that applied. You can use the information in the nested table as inputs to the model, or as the predictable attribute.

In this lesson, you add a new data source view to the AdventureWorksDW2008 data source, add the case and nested tables to this view, and specify the many-to-one relationship between the case and nested table. There are two parts to this process:

  • First, you define the relationship between the case table and the nested table.

  • Second, you define how the columns of data are used in the model.

It is important that you correctly specify the relationship between the case table and the nested table, to avoid errors when you process the model. For more information about working with case and nested tables, and how to choose a nested table key, see Nested Tables (Analysis Services - Data Mining).

To add a data source view

  1. In Solution Explorer, right-click Data Source Views, and select New Data Source View.

    The Data Source View Wizard opens.

  2. On the Welcome to the Data Source View Wizard page, click Next.

  3. On the Select a Data Source page, under Relational data sources, select the Adventure Works DW2008 data source that you created in Basic Data Mining Tutorial. Click Next.

  4. On the Select Tables and Views page, select the following tables, and then click the right arrow to include them in the new data source view:

    • vAssocSeqOrders

    • vAssocSeqLineItems

  5. Click Next.

  6. On the Completing the Wizard page, by default the data source view is named Adventure Works DW2008. Change this to Orders, and then click Finish.

    Data Source View Designer opens to display the Orders data source view.

To create a new relationship between tables

  1. In Data Source View Designer, position the two tables so that the tables are aligned horizontally, with the vAssocSeqLineItems table on the left side and the vAssocSeqOrders table on the right side.

  2. Select the OrderNumber column in the vAssocSeqLineItems table.

  3. Drag the column to the vAssocSeqOrders table, and put it on the OrderNumber column.

    Important

    Make sure to drag the OrderNumber column from the vAssocSeqLineItemsnested table, which represents the many side of the join, to the vAssocSeqOrderscase table, which represents the one side of the join.

    A new many-to-one relationship now exists between the vAssocSeqLineItems and vAssocSeqOrders tables. If you have joined the tables correctly, the data source view should appear as follows:

    expected many-to-one join on nested and case table