Creating Predictions for the Call Center Models (Intermediate Data Mining Tutorial)
Now that you have learned something about the interactions between shifts, the number of operators, calls, and service grade, you are ready to create some prediction queries that can be used in business analysis and planning. You will first create some predictions on the exploratory model to test some assumptions. Next, you will create bulk predictions by using the logistic regression model.
This lesson assumes that you are already familiar with how to use the prediction query builder. For general information about how to use Prediction Query Builder, see Creating DMX Prediction Queries.
The following example demonstrates how to make a singleton prediction using the neural network model that was created for exploration. Singleton predictions are a good way to try out different values to see the effect in the model. In this scenario, you will predict the service grade for the midnight shift (no day of the week specified) if six experienced operators are on duty.
To create a singleton query by using the neural network model
In Business Intelligence Development Studio, open the solution that contains the model that you want to use.
In Data Mining Designer, click the Mining Model Prediction tab.
In the Mining Model pane, click Select Model.
The Select Mining Model dialog box shows a list of mining structures. Expand the mining structure to view a list of mining models associated with that structure.
Expand the mining structure Call Center Default, and select the neural network model, Call Center - LR.
From the Mining Model menu, select Singleton Query.
The Singleton Query Input dialog box appears, with columns mapped to the columns in the mining model.
In the Singleton Query Input dialog box, click the row for Shift, and then select midnight.
Click the row for Lvl 2 Operators, and type 6.
In the bottom half of the Mining Model Prediction tab, click the first row in the grid.
In the Source column, click the down arrow, and select Prediction function. In the Field column, select PredictHistogram.
A list of arguments that you can use with this prediction function automatically appears in the Criteria/Arguments box.
Drag the ServiceGrade column from the list of columns in the Mining Model pane to the Criteria/Arguments box.
The name of the column is automatically inserted as the argument. You can choose any predictable attribute column to drag into this text box.
Click the button Switch to query results view, in the upper corner of the Prediction Query Builder.
The expected results contain the possible predicted values for each service grade given these inputs, together with support and probability values for each prediction. You can return to design view at any time and change the inputs, or add more inputs.
Although you can create predictions by using a neural network model, more typically the neural network model is used for exploration of complex relationships. If you already know the attributes that are relevant to the business problem, you can use a logistic regression model to predict the effect of making changes in some attributes. Logistic regression is a statistical method that is commonly used to make predictions based on changes in independent variables: for example, it is used in financial scoring, to predict customer behavior based on customer demographics.
In this task, you will learn how to create a data source that will be used for predictions, and then make predictions to help answer several business questions.
Generating Data used for Bulk Prediction
In this lesson, you will first create an aggregated view of the source data that can be used for making bulk predictions, and then join that data to a mining model in a prediction query. There are many ways to provide input data: for example, you can import staffing levels from a spreadsheet, or provide the values programmatically. Here, you will use the Data Source View designer to create a named query. This named query is a custom T-SQL statement that creates aggregates for each shift, such as the maximum operators, minimum calls received, or the average number of issues generated.
To generate input data for a bulk prediction query
In Solution Explorer, right-click Data Source Views, and then select New Data Source View.
In the Data Source View wizard, select Adventure Works DW2008R2 as the data source, and then click Next.
On the Select Tables and Views page, click Next without selecting any tables.
On the Completing the Wizard page, type the name, Shifts.
This name will appear in Solution Explorer as the name of the data source view.
Right-click the empty design pane, then select New Named Query.
In the Create Named Query dialog box, for Name, type Shifts for Call Center.
This name will appear in Data Source View designer only as the name of the named query.
Paste the following query statement into the SQL text pane in the lower half of the dialog box.
SELECT DISTINCT WageType, Shift, AVG(Orders) as AvgOrders, MIN(Orders) as MinOrders, MAX(Orders) as MaxOrders, AVG(Calls) as AvgCalls, MIN(Calls) as MinCalls, MAX(Calls) as MaxCalls, AVG(LevelTwoOperators) as AvgOperators, MIN(LevelTwoOperators) as MinOperators, MAX(LevelTwoOperators) as MaxOperators, AVG(Issues) as AvgIssues, MIN(Issues) as MinIssues, MAX(Issues) as MaxIssues FROM dbo.FactCallCenter GROUP BY Shift, WageType
In the design pane, right-click the table, Shifts for Call Center, and select Explore Data to preview the data as returned by the T-SQL query.
Right-click the tab, Shifts.dsv (Design), and then click Save to save the new data source view definition.
Predicting Service Metrics for Each Shift
Now that you have generated some values for each shift, you will use those values as input to the logistic regression model that you built, to generate multiple predictions.
To use the new DSV as input to a prediction query
In Data Mining Designer, click the Mining Model Prediction tab.
In the Mining Model pane, click Select Model, and choose Call Cetner - LR from the list of available models.
From the Mining Model menu, clear the option, Singleton Query. A warning tells you that the singleton query inputs will be lost. Click OK.
The Singleton Query Input dialog box is replaced with the Select Input Table(s) dialog box.
Click Select Case Table.
In the Select Table dialog box, selectShifts from the list of data sources. In the Table/View name list, select Shifts for Call Center (it might be automatically selected), and then click OK.
The Mining Model Prediction design surface is updated to show mappings that are created by Analysis Services based on the names and data types of columns in the input data and in the model.
Right-click one of the join lines, and then select Modify Connections.
In this dialog box, you can see exactly which columns are mapped and which are not. The mining model contains columns for Calls, Orders, Issues, and LvlTwoOperators, which you can map to any of the aggregates that you created based on these columns in the source data. In this scenario, you will map to the averages.
Click the empty cell next to LevelTwoOperators, and select Shifts for Call Center.AvgOperators.
Click the empty cell next to Calls, select Shifts for Call Center.AvgCalls. and then click OK.
To create the predictions for each shift
In the grid at the bottom half of the Prediction Query Builder, click the empty cell under Source, and then select Shifts for Call Center.
In the empty cell under Field, select Shift.
Click the next empty line in the grid and repeat the procedure described above to add another row for WageType.
Click the next empty line in the grid. In the Source column, select Prediction Function. In the Field column, select Predict.
Drag the column ServiceGrade from the Mining Model pane down to the grid, and into the Criteria/Argument cell. In the Alias field, type Predicted Service Grade.
Click the next empty line in the grid. In the Source column, select Prediction Function. In the Field column, select PredictProbability.
Drag the column ServiceGrade from the Mining Model pane down to the grid, and into the Criteria/Argument cell. In the Alias field, type Probability.
Click Switch to query result view to view the predictions.
The following table shows sample results for each shift.
Predicted Service Grade
Predicting the Effect of Call Time on Service Grade
You generated some values for each shift, and used those values as input to the logistic regression model. However, given that the business objective is to keep abandon rate within the range 0.00-0.05, the results are not encouraging.
Therefore, based on the original model, which showed a strong influence of response time on service grade, the Operations team decides to run some predictions to assess whether reducing the average time for responding to calls might improve service grade. For example, if you cut the call response time to 90 percent or even to 80 percent of the current call response time, what would happen?
It is easy to create a data source view (DSV) that calculates the average response times for each shift. You can also easily add columns that contain times that represent the target values. You can then use the DSV as input to the model.
The following table shows the results of a prediction query that uses three different response times as inputs. The query also returns the probability of the predicted value, so that you can assess how likely it is that reducing the response time will have an impact on service grade.
In the following table, the first set of numbers represents the predicted service grade, and the second set of numbers (in parentheses) represents the probability of that predicted value. From these results you might conclude that it is worth trying to reduce the response time to 90 percent.
Average response time for shift
90 percent reduction in response time
80 percent reduction in response time
In addition to providing the input values via a data source view, as shown here, you could calculate the inputs programmatically and provide them to the model. By iterating through all possible values, you can find the smallest reduction in response time that guarantees the target service level for each shift.
There are a variety of other prediction queries that you can create on this model. For example, you could predict how many operators are required to meet a certain service level or to respond to a certain number of incoming calls. Because you can include multiple outputs in a logistic regression model, it is easy to experiment with different independent variables and outcomes without having to create many separate models.
The Data Mining Add-Ins for Excel 2007 provide logistic regression wizards that make it easy to answer complex questions, such as how many Level Two Operators would be required to improve service grade to a target level for a specific shift. The data mining add-ins are a free download, and include wizards that are based on the neural network and/or logistic regression algorithms. For more information, see the following links:
SQL Server 2005 Data Mining Add-Ins for Office 2007: Goal Seek and What If Scenario Analysis
SQL Server 2008 Data Mining Add-Ins for Office 2007: Goal Seek Scenario Analysis, What If Scenario Analysis, and Prediction Calculator
You have learned to create, customize, and interpret mining models that are based on the Microsoft Neural Network algorithm and the Microsoft Logistic Regression algorithm. These model types are sophisticated and permit almost infinite variety in analysis, and therefore can be complex and difficult to master. Tools such as the Excel-based charts and PivotTables provided in the Data Source View designer can provide support for the strongest trends detected by the algorithms, and can help you understand the trends that were discovered. However, to fully appreciate the insights from the model, you may want to explore the analysis provided by the model and review your data in some depth, going back and forth between the custom mining model viewer and other tools. By doing so, you can fully develop your understanding of the trends in the data.