Fill from Example Video Tutorial (Data Mining Table Analysis Tool)

Video Summary

In this tutorial we will learn how to use the Fill From Example Table Analysis Tool for Excel.

Video Transcript

Introduction

Hi, my name is Mary Brennan. I'm a Technical Writer for Microsoft SQL Server.

This video will help you get started using the Fill From Example tool. The Fill From Example tool helps you quickly create new columns of data based on patterns found in an Excel table and samples that you provide for the new values.

In this tutorial, AdventureWorks wants to identify their High Value customers. We can’t possibly enter these values manually for all customers in the database, so we will use Fill From Example to train the model to understand patterns that exist in the data. We will then review the Pattern report and evaluate the new column of data added to our worksheet.

A few things to keep in mind before we get started:

  • You can only work with data in columns. If the series that you want to fill is stored in a row, you can use the Paste, Transpose function in Excel to change the data to a columnar format.

  • Make sure that there is at least one example for every value that you want to predict.

The Wizard

  1. To begin, open the Fill From Example worksheet.

  2. Scroll to the High Value Customer column. Notice that AdventureWorks has used their knowledge of their customers to classify 10 of them as either High Value or Not High Value. The Fill from Example tool will analyze existing patterns in the data together with the examples you entered, and fill in values for the rest of the column.

    If you are not satisfied with the results, you can refine the results by providing more examples.

  3. Select the Fill From Example tab and click anywhere inside the table to activate the Table Analysis Tools.

  4. Under the Table Tools menu select the Analyze tab to open the Table Analysis Tools ribbon.

  5. Click Fill From Example to launch the Wizard.

  6. In the Fill From Example dialog, the tool automatically selects the column that has the most null or missing values, in this case High Value Customer. If we wanted to use a different column, we would select it from the list.

  7. Click Choose columns to be used for analysis. Although the tool makes recommendations about which columns to use for analysis, you can override these recommendations and select different columns. For example, if you know from experience that there is a causal effect between Income and Age and the column that has missing values, you could deselect other columns in the list to get better results.

    If you are not sure of which columns to add as inputs, you can add all of the columns. This is what we will do, with the exception of ID.

  8. Click OK.

  9. Click Run. The report display in a new worksheet.

    If patterns were detected, missing values are automatically filled in on the Fill From Example worksheet.

The Reports

The Pattern report describes the rules used to fill in the missing values. Each influencer or rule is described as a combination of a column, the value in that column, and the relative impact of the rule on the prediction.

Relative Impact is shown as a shaded bar. The longer the bar, the greater the probability that this rule is predictive of the filled-in value.

This means that Pacific Region strongly predicts No and a 2-5 mile commute strongly predicts Yes (for High Value Customer).

The tool also adds a new column to the original data table, named High Value Customer_Extended.

  1. Return to the Fill From Example worksheet.

  2. Scroll over to the new High Value Customer_Extended column.

    Notice that predicted values have been filled in for all rows in the worksheet. These values are based on patterns discovered in the data. The original column that was used for training still remains.

  3. Let’s sort the rows based on Customer Value.

Conclusion

This concludes the Fill From Example video tutorial. For additional help with the Table Analysis Tools, I recommend viewing the other Table Analysis Tools video tutorials and the Help documentation included with the Data Mining Add-ins. Thank you for viewing this tutorial.