Intermediate Data Mining Tutorial (Analysis Services - Data Mining)

Microsoft Analysis Services provides an integrated environment for creating and working with data mining models. You can easily bind to data sources, create and test multiple models on the same data, and deploy models for use in predictive analysis.

In the Basic Data Mining Tutorial, you learned how to use Business Intelligence Development Studio to create a data mining solution, and you built three models to support a targeted mailing campaign. Those models analyzed customer purchasing behavior and identified potential buyers.

To complete the following tutorial, you should be familiar with the data mining tools and the mining model viewers that were introduced in the Basic Data Mining Tutorial. This intermediate tutorial builds on that experience and introduces several new scenarios, including forecasting and market basket analysis. You will learn how to create a time series model, an association model, and a sequence clustering model. You will also learn how to use nested tables in a model, and how to create filters on nested tables.

All scenarios use the AdventureWorksDW2008R2 data source, but you will create different data source views for different scenarios. The lessons are otherwise independent: you can complete them separately and in any order, as long as you create the data source first.

Lesson Scenarios

After your success with the targeted mailing campaign, you have been asked to apply your knowledge of data mining to develop several new models for use in business planning. These include the following new model types:

  • Time series models, to forecast the sales of products in different regions around the world. You will develop individual models for each region and also a general model that can be used for cross-prediction.

  • Association model, to analyze groupings of products that are purchased during visits to the Adventure Works Cycles e-commerce site. Based on this market basket model, you might recommend products to customers.

  • Sequence clustering model, to analyze the order in which customers buy products. Based on this model, you can plan changes in Web site design or new product offerings.

  • Neural network and logistic regression models, to perform exploratory analysis of call center data. Based on the insights from the preliminary model, you will create a model to identify possible strategies for improving customer experience with the call center.

What You Will Learn

This tutorial teaches you how to create and work with models based on several types of data mining algorithms. It also introduces the following concepts:

  • Using nested tables to build models

  • Choosing a nested table key, time series key, or sequence key

  • Filtering nested tables when creating models or making predictions

  • Determining whether you have enough data to support a model

  • Creating a general model and applying it to multiple data sets
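As a conceptual illustration of the first three items above, the following minimal Python sketch shows what a nested table amounts to: each case (here, an order) owns a set of nested rows (the products on that order), and a filter keeps only the nested rows that match a condition. This is not Analysis Services code and does not reproduce how the product evaluates filters. The order numbers, product names, and the functions nest_by_case and filter_nested are all hypothetical, introduced only for this example.

```python
from collections import defaultdict

# Hypothetical flat order-line rows: (order_number, product_model).
# order_number plays the role of the case key; product_model plays
# the role of the nested table key.
order_lines = [
    ("SO1001", "Mountain-200"),
    ("SO1001", "Sport-100"),
    ("SO1002", "Road-250"),
    ("SO1002", "Sport-100"),
    ("SO1002", "Water Bottle"),
]


def nest_by_case(rows):
    """Group flat rows into one case per order, each holding its nested rows."""
    cases = defaultdict(list)
    for order_number, product_model in rows:
        cases[order_number].append(product_model)
    return dict(cases)


def filter_nested(cases, keep):
    """Keep only nested rows that satisfy `keep`; drop cases left with no rows."""
    filtered = {}
    for case_key, nested_rows in cases.items():
        kept = [row for row in nested_rows if keep(row)]
        if kept:
            filtered[case_key] = kept
    return filtered


cases = nest_by_case(order_lines)
helmet_orders = filter_nested(cases, lambda model: model == "Sport-100")
```

In this sketch, cases holds two orders with two and three nested rows respectively, and helmet_orders retains only the "Sport-100" row of each order. The choice of key matters: the nested key must identify a row uniquely within its case, which is why the product model, rather than a quantity or price, is used here.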

This tutorial is divided into the following lessons:

Requirements

Make sure that the following are installed:

  • Microsoft SQL Server 2008 R2

  • Microsoft SQL Server Analysis Services

  • The AdventureWorksDW2008R2 sample database

To enhance security, the sample databases are not installed by default. To install the official sample databases for Microsoft SQL Server, visit the Microsoft SQL Sample Databases page and select SQL Server 2008 R2.

Note

When you are working through a tutorial, you might find it easier to move back and forth between the steps if you add the Next topic and Previous topic buttons to the document viewer toolbar. For more information, see Adding Next and Previous Buttons to Help.