Specify a Column to Use as Regressor in a Model
Topic Status: Some information in this topic is preview and subject to change in future releases. Preview information describes new features or changes to existing features in Microsoft SQL Server 2016 Community Technology Preview 2 (CTP2).
A linear regression model represents the value of the predictable attribute as the result of a formula that combines the inputs so that the data fits an estimated regression line as closely as possible. The algorithm accepts only numeric values as inputs, and automatically detects the inputs that provide the best fit.
However, you can specify that a column be included as a regressor by adding the FORCE_REGRESSOR parameter to the model and specifying the regressors to use. You might want to do this in cases where the attribute has meaning even if the effect is too small to be detected by the model, or when you want to ensure that the attribute is included in the formula.
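To make the idea of fitting data to a regression line concrete, the following minimal sketch computes an ordinary least-squares line for a single numeric input in plain Python. It is an illustration only, not the implementation used by the Microsoft Linear Regression algorithm. The function name fit_line and the sample values are assumptions made up for this example and are not taken from the Call Center data.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of the input, and of input/output co-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical values chosen so that the output is exactly 2 * input + 1.
inputs = [1.0, 2.0, 3.0, 4.0]
outputs = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_line(inputs, outputs)
print(intercept, slope)  # 1.0 2.0
```

With several inputs, the same principle applies, but each input receives its own coefficient. An input whose coefficient is very small contributes little to the fit, which is the situation in which forcing a regressor becomes relevant.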
The following procedure describes how to create a simple linear regression model, using the same sample data that is used for the neural networks tutorial. The model is not necessarily robust, but it demonstrates how to use the Data Mining Designer to customize a linear regression model.
How to create a simple linear regression model
In SQL Server Data Tools (SSDT), in Solution Explorer, expand Mining Structures.
Double-click Call Center.dmm to open it in the designer.
From the Mining Model menu, select New Mining Model.
For the algorithm, select Microsoft Linear Regression. For the name, type Call Center Regression.
On the Mining Models tab, change the column usage as follows. Set all columns that are not in the following list to Ignore, if they are not already.
Total Operators: Input
From the Mining Model menu, select Set Model Parameters.
For the FORCE_REGRESSOR parameter, in the Value column, type the column names, each enclosed in brackets and separated by commas, as follows:
[Average Time Per Issue],[Total Operators]
The algorithm will automatically detect which columns are the best regressors. You only need to force regressors when you want to ensure that a column is included in the final formula.
From the Mining Model menu, select Process Model.
In the viewer, the model is represented as a single node containing the regression formula. You can view the formula in the Mining Legend, or you can extract the coefficients for the formula by using queries.
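The FORCE_REGRESSOR value typed in the procedure above follows a simple pattern: each column name in brackets, joined by commas. If you maintain the list of forced regressors elsewhere, for example in a script or a configuration file, a small helper such as the following sketch can produce the string to paste into the Value column. The helper name force_regressor_value is hypothetical, and rejecting names that already contain brackets is an assumption made for this example, not a documented rule of the parameter.

```python
def force_regressor_value(column_names):
    """Format column names as a bracketed, comma-separated list."""
    formatted = []
    for name in column_names:
        name = name.strip()
        if not name:
            raise ValueError("column name must not be empty")
        # Assumption for this sketch: names are passed without brackets.
        if "[" in name or "]" in name:
            raise ValueError(f"column name must not contain brackets: {name!r}")
        formatted.append(f"[{name}]")
    return ",".join(formatted)


value = force_regressor_value(["Average Time Per Issue", "Total Operators"])
print(value)  # [Average Time Per Issue],[Total Operators]
```

The helper only builds the text. The value must still be entered in the Set Model Parameters dialog box as described in the procedure.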