Creating Predictions for the Call Center Models (Intermediate Data Mining Tutorial)
Now that you have learned something about the interactions between shifts, day of week, the number of calls and orders, and service grade, you are ready to create some prediction queries that can be used in business analysis and planning. You will first create some predictions on the exploratory model to test some assumptions. Next, you will create bulk predictions using the logistic regression model.
This lesson assumes that you are already familiar with how to use the prediction query builder. For general information about how to use Prediction Query Builder, see Creating DMX Prediction Queries.
The following example demonstrates how to make a singleton prediction using the neural network model that was created for exploration. Singleton predictions are a good way to try out different values to see the effect in the model. In this scenario, you will predict the service grade for the midnight shift (no day of the week specified) if six experienced operators are on duty.
To create a singleton query using the neural network model
In Business Intelligence Development Studio, open the solution that contains the model you want to use.
In Data Mining Designer, click the Mining Model Prediction tab.
In the Mining Model pane, click Select Model.
The Select Mining Model dialog box presents a list of mining structures. Expand the mining structure to view a list of mining models associated with that structure.
Expand the mining structure Call Center, and select the mining model, Call Center - NN.
From the Mining Model menu, select Singleton Query.
The Singleton Query Input dialog box is displayed, with columns mapped to the columns in the mining model.
In the Singleton Query Input dialog box, click the row for Shift, and select midnight.
Click the row for Lvl 2 Operators, and type 6.
In the bottom half of the Mining Model Prediction tab, click the first row in the grid.
Click the Source column, and then select Prediction function. In the Field column, select PredictHistogram.
A list of arguments that you can use with this prediction function automatically appears in the Criteria/Arguments box.
Drag the ServiceGrade column from the list of columns in the Mining Model pane and drop it in the Criteria/Arguments box.
The name of the column is automatically inserted as the argument. You can drop any predictable attribute column into this text box.
Click the button Switch to query results view, in the upper corner of the Prediction Query Builder.
The expected results contain the possible predicted values for each service grade given these inputs, together with support and probability values for each prediction. You can return to design view at any time and change the inputs, or add more inputs.
Although you can create predictions by using a neural network model, more typically the neural network model is used for exploration of complex relationships. If you already know the attributes that are relevant to the business problem, you can use a logistic regression model to predict the effect of changing specific independent variables. Logistic regression is commonly used in scenarios such as financial scoring, for example, to predict customer behavior based on customer demographics or other attributes.
In this task, you will learn how to create a data source that will be used for predictions, and then make predictions to help answer several business questions.
Generating Data used for Bulk Prediction
For this scenario, you will first create an aggregated view of the source data that can be used for making bulk predictions, and then join that data to a mining model in a prediction query. There are many ways to provide input data: for example, you could import staffing levels from a spreadsheet, or provide the values programmatically. Here, for the sake of simplicity, you will use the Data Source View designer to create a named query. This named query is a custom T-SQL statement that creates aggregates for each shift, such as the maximum operators, minimum calls received, or the average number of issues generated.
To generate input data for a bulk prediction query
In Solution Explorer, right-click Data Source Views, and select New Data Source View.
In the Data Source View wizard, select AdventureWorks DW 2008 as the data source, and click Next.
On the page Select Tables and Views, click Next without selecting any tables.
On the page, Completing the Wizard, type the name, Shifts.
This name will appear in Solution Explorer as the name of the data source view.
Right-click the empty design pane, then select New Named Query.
In the Create Named Query dialog box, for Name, type Shifts for Call Center.
This name will appear in Data Source View designer only as the name of the named query.
Paste the following query statement into the SQL text pane in the lower half of the dialog box.
SELECT DISTINCT WageType, Shift, AVG(Orders) as AvgOrders, MIN(Orders) as MinOrders, MAX(Orders) as MaxOrders, AVG(Calls) as AvgCalls, MIN(Calls) as MinCalls, MAX(Calls) as MaxCalls, AVG(LevelTwoOperators) as AvgOperators, MIN(LevelTwoOperators) as MinOperators, MAX(LevelTwoOperators) as MaxOperators, AVG(IssuesRaised) as AvgIssues, MIN(IssuesRaised) as MinIssues, MAX(IssuesRaised) as MaxIssues FROM dbo.FactCallCenter GROUP BY Shift, WageType
In the design pane, right-click the table, Shifts for Call Center, and select Explore Data to preview the data as returned by the T-SQL query.
Right-click the tab, Shifts.dsv (Design) and then click Save to save the new data source view definition.
Predicting Service Metrics for Each Shift
Now that you have generated some values for each shift, you will use those values as input to the logistic regression model that you built, to generate multiple predictions.
To use the new DSV as input to a prediction query
In Data Mining Designer, click the Mining Model Prediction button tab.
In the Mining Model pane, click Select Model, and choose Call Center - LR from the list of available models.
From the Mining Model menu, deselect the option, Singleton Query. A warning tells you that the singleton query inputs will be lost. Click OK.
The Singleton Query Input dialog box is replaced with the Select Input Table(s) dialog box.
Click Select Case Table.
In the Select Table dialog box, selectShifts from the list of data sources. In the Table/View name list, select Shifts for Call Center (it might be automatically selected), and then click OK.
The Mining Model Prediction design surface is updated to show mappings that are created by Analysis Services based on the names and data types of columns in the input data and in the model.
Right-click one of the join lines, and select Modify Connections.
In this dialog box, you can see exactly which columns are mapped and which are not. The mining model contains columns for Calls, Orders, IssuesRaised, and LvlTwoOperators, which you can map to any of the aggregates that you created based on these columns in the source data. For this scenario, you will map to the averages.
Click the empty cell next to LevelTwoOperators, and select Shifts for Call Center.AvgOperators.
Click the empty cell next to Calls, and select Shifts for Call Center.AvgCalls. Click OK.
To create the predictions for each shift
In the grid at the bottom half of the Prediction Query Builder, click the empty cell under Source and select Shifts for Call Center.
In the empty cell under Field, select Shift.
Click the next empty line in the grid and repeat the procedure just described to add another row for WageType.
Click the next empty line in the grid. For Source, select Prediction Function. For Field, select Predict.
Drag the column ServiceGrade from the Mining Model pane down to the grid, and drop the column into the Criteria/Argument cell. In the Alias field, type Predicted Service Grade.
Click the next empty line in the grid. For Source, select Prediction Function. For Field, select PredictProbability.
Again, drag the column ServiceGrade from the Mining Model pane down to the grid, and drop the column into the Criteria/Argument cell. In the Alias field, type Probability.
Click Switch to query result view to view the predictions.
The following table shows an example of results for each shift.
Predicted Service Grade
Predicting the Effect of Call Time on Service Grade
Your original business objective was to determine ways to keep abandon rate within the target range of 0.00-0.05. The neural network model that you developed for exploration indicated that call response time strongly influences service grade. Therefore, the Operations team decides to run some predictions to assess whether reducing the average call response time might improve service grade. For example, if you cut the call response time to 90% or even 80% of the current call response time, what would happen?
It is easy to create a data source view (DSV) that calculates the average response times for each shift, and add columns that calculate some percentage of that average response time. You can then use the DSV as input to the model.
For example, the following table shows the results of a prediction query that takes as input three different response times: the average from the actual data, a value representing 90% of the actual value, and a value representing 80% of the average call response time.
In these results, the first set of predictions in each column represents the predicted service grade, and the second set of numbers (in parentheses) represents the probability of that predicted value. From these results you might conclude that the most cost-effective solution would be to attempt to reduce the response time to 90%.
Average call response time per shift
Reduce response time to 90 percent of current time
Reduce response time to 80 percent of current time
In addition to providing the input values via a data source view, as shown here, you could calculate the inputs programmatically and provide them to the model. By iterating through all possible values, you can find the smallest reduction in response time that guarantees the target service level for each shift.
There are a variety of other prediction queries that you can create on this model. For example, you could predict how many operators are needed to meet a certain service level or to respond to a certain number of incoming calls. Because you can include multiple outputs in a logistic regression model, it is easy to experiment with different independent variables and outcomes without having to create many separate models.
The Data Mining Add-Ins for Excel 2007 provide logistic regression wizards that make it easy to answer complex questions, such as how many Level Two Operators would be needed to improve service grade to a target level for a specific shift. The data mining add-ins are a free download, and include wizards that are based on the neural network and/or logistic regression algorithms. For more information, see the following links:
SQL Server 2005 Data Mining Add-Ins for Office 2007: Goal Seek and What If Scenario Analysis
SQL Server 2008 Data Mining Add-Ins for Office 2007: Goal Seek Scenario Analysis, What If Scenario Analysis, and Prediction Calculator
You have learned to create, customize, and interpret mining models that are based on the Microsoft Neural Network algorithm and the Microsoft Logistic Regression algorithm. These model types are sophisticated and permit almost infinite variety in analysis, and therefore can be complex and difficult to master. Tools such as the Excel-based charts and PivotTables provided in the Data Source View designer can provide support for the strongest trends detected by the algorithms and can help you understand the trends that were discovered. However, to fully appreciate the insights from the model, you may need to explore the analysis provided by the model and review your data in some depth, going back and forth between the custom mining model viewer and other tools to fully develop your understanding of the trends in the data.