Customizing and Processing the Forecasting Model (Intermediate Data Mining Tutorial)

The Microsoft Time Series algorithm provides parameters that affect how a model is created, and how the time data is analyzed. You can change these properties to control how the mining model makes predictions.

For this task in the tutorial, you will make the following change in the parameters:

  • You will adjust the value of the PERIODICITY_HINT parameter for the Forecasting model. This parameter gives the algorithm information about how frequently the pattern in the data is repeated. The data in AdventureWorks DW2008 is patterned on a monthly basis, and the periodicity is at the yearly level. Therefore, you will set the PERIODICITY_HINT parameter to 12, to indicate that a pattern repeats itself after every 12 values.

You will also review the settings of two important parameters that were introduced in SQL Server 2008 as part of enhancements to the Microsoft Time Series algorithm.

  • The FORECAST_METHOD parameter controls whether the time series algorithm is optimized for short-term or long-term predictions. By default, the FORECAST_METHOD parameter is set to MIXED, which means that two different prediction algorithms are blended and balanced to perform both short-term and long-term prediction.

  • The PREDICTION_SMOOTHING parameter controls the mixture of long-term and short-predictions. By default, this parameter is set to 0.5, which generally provides the best balance for overall accuracy.

After the changes have been made, you will process the model.

To change the algorithm parameters

  1. On the Mining Models tab, right-click Forecasting, and select Set Algorithm Parameters.

  2. In the PERIODICITY_HINT row of the Algorithm Parameters dialog box, , click the Value column, then type {12}, including the braces.

  3. In the FORECAST_METHOD row, verify that the Value text box is either blank or set to MIXED. If a different value has been entered, type MIXED to change the parameter back to the default value.

  4. In the PREDICTION_SMOOTHING row, verify that the Value text box is either blank or set to 0.5. If a different value has been entered, click Value and type 0.5 to change the parameter back to the default value.

    Note

    The PREDICTION_SMOOTHING parameter is available only in SQL Server Enterprise. Therefore, you cannot view or change the value of the PREDICTION_SMOOTHING parameter in SQL Server Standard. However, the default behavior is the same.

  5. Click OK.

To process the forecasting model

  1. On the Mining Model menu of BI Development Studio, select Process Mining Structure and All Models.

  2. At the warning asking whether you want to build and deploy the project, click Yes.

  3. In the Process Mining Structure - Forecasting dialog box, click Run.

    The Process Progress dialog box opens to display information about model processing. Model processing may take some time.

  4. After processing is complete, click Close to exit the Process Progress dialog box.

  5. Click Close again to exit the Process Mining Structure - Forecasting dialog box.

Handling Missing Data (Optional)

In many cases, your sales data might have gaps that are filled with nulls, or a store might have failed to meet the reporting deadline, leaving an empty cell at the end of the series. In these scenarios, Analysis Services raises the following error and will not process the model.

"Error (Data mining): Time stamps not synchronized starting with series <series name>, of the mining model, <model name>. All time series must end at the same time mark and cannot have arbitrarily missing data points. Setting the MISSING_VALUE_SUBSTITUTION parameter to Previous or to a numeric constant will automatically patch missing data points where possible."

To avoid this error, you can specify that Analysis Services automatically provide new values to fill in the gaps by using the following methods:

  • Using an average value. The mean is calculated by using all valid values in the same data series.

  • Using the previous value. You can substitute previous values for multiple missing cells, but you cannot fill starting values.

  • Using a constant value that you supply.

To specify that gaps be filled by averaging values

  1. On the Mining Models tab, right-click the Forecasting column, and select Set Algorithm Parameters.

  2. In the Algorithm Parameters dialog box, in the MISSING_VALUE_SUBSTITUTION row, click the Value column, and type Mean.