Data Mining (SSAS)
Updated: March 2, 2016
Applies To: SQL Server 2016
SQL Server has been a leader in predictive analytics since the 2000 release, by providing data mining in Analysis Services. The combination of Integration Services, Reporting Services, and SQL Server Data Mining provides an integrated platform for predictive analytics that encompasses data cleansing and preparation, machine learning, and reporting. SQL Server Data Mining includes multiple standard algorithms, including EM and K-means clustering models, neural networks, logistic regression and linear regression, decision trees, and naive bayes classifiers. All models have integrated visualizations to help you develop, refine, and evaluate your models. Integrating data mining into business intelligence solution helps you make intelligent decisions about complex problems.
Data mining (also called predictive analytics and machine learning) uses well-researched statistical principles to discover patterns in your data. By applying the data mining algorithms in Analysis Services to your data, you can forecast trends, identify patterns, create rules and recommendations, analyze the sequence of events in complex data sets, and gain new insights.
In SQL Server 2016, data mining is powerful, accessible, and integrated with the tools that many people prefer to use for analysis and reporting.
SQL Server Data Mining provides the following features in support of integrated data mining solutions:
Multiple data sources: You can use any tabular data source for data mining, including spreadsheets and text files. You can also easily mine OLAP cubes created in Analysis Services. However, you cannot use data from an in-memory database.
Integrated data cleansing, data management, and reporting: Integration Services provides tools for profiling and cleansing data. You can build ETL processes for cleaning data in preparation for modeling, and ssISnoversion also makes it easy to retrain and update models.
Multiple customizable algorithms: In addition to providing algorithms such as clustering, neural networks, and decisions trees, SQL Server Data Mining supports development of your own custom plug-in algorithms.
Model testing infrastructure: Test your models and data sets using important statistical tools as cross-validation, classification matrices, lift charts, and scatter plots. Easily create and manage testing and training sets.
Querying and drillthrough: SQL Server Data Mining provides the DMX language for integrating prediction queries into applications. You can also retrieve detailed statistics and patterns from the models, and drill through to case data.
Client tools: In addition to the development and design studios provided by SQL Server, you can use the Data Mining Add-ins for Excel to create, query, and browse models. Or, create custom clients, including Web services.
Scripting language support and managed API: All data mining objects are fully programmable. Scripting is possible through MDX, XMLA, or the PowerShell extensions for Analysis Services. Use the Data Mining Extensions (DMX) language for fast querying and scripting.
Security and deployment: Provides role-based security through Analysis Services, including separate permissions for drillthrough to model and structure data. Easy deployment of models to other servers, so that users can access the patterns or perform predictions
The topics in this section introduce the principal features of SQL Server Data Mining and related tasks.