Customizing a Data Mining Model (Analysis Services - Data Mining)
After you have selected an algorithm that meets your business needs, you can customize the mining model in the following ways to potentially improve results.
Use different columns of data in the model, or change the usage or content types of the columns.
Create filters on the mining model to restrict the data used in training the model.
Set algorithm parameters to control thresholds, tree splits, and other conditions.
Change the default algorithm that is used to analyze data or make predictions.
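As a rough illustration of the first two options, the following Python sketch assembles the text of a DMX ALTER MINING STRUCTURE ... ADD MINING MODEL statement that selects a subset of columns, flags one column as predictable, and attaches a filter. The structure, model, and column names are hypothetical, and the helper functions are not part of any Analysis Services API; they only build a string that you would still need to run against a server.

```python
def bracket(name):
    """Wrap an identifier in DMX square brackets (assumes no brackets inside)."""
    if "[" in name or "]" in name:
        raise ValueError(f"unsupported identifier: {name!r}")
    return f"[{name}]"


def add_model_statement(structure, model, columns, algorithm, filter_expr=None):
    """Build a DMX statement that adds a model to an existing mining structure.

    columns maps each column name to a usage flag: "" for an input column,
    or "PREDICT" / "PREDICT_ONLY" for a predictable column.
    """
    column_lines = [
        f"    {bracket(name)} {usage}".rstrip() for name, usage in columns.items()
    ]
    lines = [
        f"ALTER MINING STRUCTURE {bracket(structure)}",
        f"ADD MINING MODEL {bracket(model)}",
        "(",
        ",\n".join(column_lines),
        ")",
        f"USING {algorithm}",
    ]
    if filter_expr:
        # The filter restricts which structure cases are used to train the model.
        lines.append(f"WITH FILTER({filter_expr})")
    return "\n".join(lines)


# Hypothetical structure, model, and column names.
statement = add_model_statement(
    structure="New Mailing",
    model="Naive Bayes Women",
    columns={
        "CustomerKey": "",
        "Gender": "",
        "Number Cars Owned": "",
        "Bike Buyer": "PREDICT",
    },
    algorithm="Microsoft_Naive_Bayes",
    filter_expr="[Gender] = 'F' AND [Age] > 50",
)
print(statement)
```

Keeping the column list and the filter as separate arguments makes it easy to generate several model variants against the same mining structure and compare them.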
The decisions that you make about which columns of data to use in the model, and how to use and process that data, can greatly affect the results of analysis. The following topics provide information to help you understand these choices.
Mining Models (Analysis Services - Data Mining)
Provides an overview of the architecture of a mining model, including the underlying mining structure and the choice of mining columns.
Creating Filters for Mining Models (Analysis Services - Data Mining)
Explains how you can create filters that apply to a mining model, in order to create models based on a subset of the mining structure data.
Feature Selection in Data Mining
Explains how Analysis Services uses a process called feature selection to select only the most useful attributes for addition to a model. Reducing the number of columns and attributes can improve performance and the quality of the model. The feature selection methods that are available differ depending on the algorithm that you choose.
If you use the Data Mining Wizard, you can also have Analysis Services automatically select the data that is most useful for building a particular model.
The choice of algorithm determines what kind of results you will get. For general information about how a specific algorithm works, or the business scenarios where you would benefit from using a particular algorithm, see Data Mining Algorithms (Analysis Services - Data Mining).
The data mining algorithms provided in Analysis Services are also extensively customizable. You can control the behavior of the algorithm and how it processes data by setting algorithm parameters. The following topics provide detailed information about the parameters that each algorithm supports.
Microsoft Decision Trees Algorithm Technical Reference
Microsoft Clustering Algorithm Technical Reference
Microsoft Naive Bayes Algorithm Technical Reference
Microsoft Association Algorithm Technical Reference
Microsoft Sequence Clustering Algorithm Technical Reference
Microsoft Neural Network Algorithm Technical Reference
Microsoft Logistic Regression Algorithm Technical Reference
Microsoft Linear Regression Algorithm Technical Reference
Microsoft Time Series Algorithm Technical Reference
The topic for each algorithm type also lists the prediction functions that can be used with models based on that algorithm.
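To make the idea of setting algorithm parameters concrete, here is a minimal Python sketch that formats the USING clause of a DMX model definition with a parameter list. The function name is hypothetical and the parameter values are illustrative only; check the technical reference topic for each algorithm for the valid range and default of every parameter.

```python
def using_clause(algorithm, parameters=None):
    """Build a DMX USING clause with an optional algorithm parameter list."""
    if not parameters:
        return f"USING {algorithm}"
    formatted = []
    for name, value in parameters.items():
        # String values are quoted; numeric values are written as-is.
        text = f"'{value}'" if isinstance(value, str) else str(value)
        formatted.append(f"{name} = {text}")
    return f"USING {algorithm} ({', '.join(formatted)})"


# Parameter values here are illustrative, not recommendations.
clause = using_clause(
    "Microsoft_Decision_Trees",
    {"COMPLEXITY_PENALTY": 0.5, "MINIMUM_SUPPORT": 20},
)
print(clause)
```

Omitting the parameter dictionary yields a bare USING clause, in which case the algorithm runs with its default settings.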
List of Algorithm Parameters
Each algorithm supports parameters that you can use to customize the behavior of the algorithm and fine-tune the results of your model. For a description of how to use each parameter, see the topics listed in the following table.
Property name  Applies to 

AUTO_DETECT_PERIODICITY  
CLUSTER_COUNT  
CLUSTER_SEED  
CLUSTERING_METHOD  
COMPLEXITY_PENALTY  
FORCED_REGRESSOR  
FORECAST_METHOD  
HIDDEN_NODE_RATIO  
HISTORIC_MODEL_COUNT  
HISTORIC_MODEL_GAP  
HOLDOUT_PERCENTAGE  Microsoft Logistic Regression Algorithm Technical Reference; Microsoft Neural Network Algorithm Technical Reference. Note: This parameter is different from the holdout percentage value that applies to a mining structure.
HOLDOUT_SEED  Microsoft Logistic Regression Algorithm Technical Reference; Microsoft Neural Network Algorithm Technical Reference. Note: This parameter is different from the holdout seed value that applies to a mining structure.
INSTABILITY_SENSITIVITY  
MAXIMUM_INPUT_ATTRIBUTES  Microsoft Clustering Algorithm Technical Reference; Microsoft Decision Trees Algorithm Technical Reference; Microsoft Linear Regression Algorithm Technical Reference; Microsoft Naive Bayes Algorithm Technical Reference
MAXIMUM_ITEMSET_COUNT  
MAXIMUM_ITEMSET_SIZE  
MAXIMUM_OUTPUT_ATTRIBUTES  Microsoft Decision Trees Algorithm Technical Reference; Microsoft Linear Regression Algorithm Technical Reference; Microsoft Logistic Regression Algorithm Technical Reference
MAXIMUM_SEQUENCE_STATES  
MAXIMUM_SERIES_VALUE  
MAXIMUM_STATES  Microsoft Clustering Algorithm Technical Reference 
MAXIMUM_SUPPORT  
MINIMUM_IMPORTANCE  
MINIMUM_ITEMSET_SIZE  
MINIMUM_DEPENDENCY_PROBABILITY  
MINIMUM_PROBABILITY  
MINIMUM_SERIES_VALUE  
MINIMUM_SUPPORT  Microsoft Association Algorithm Technical Reference; Microsoft Clustering Algorithm Technical Reference; Microsoft Decision Trees Algorithm Technical Reference
MISSING_VALUE_SUBSTITUTION  
MODELLING_CARDINALITY  
PERIODICITY_HINT  
PREDICTION_SMOOTHING  
SAMPLE_SIZE  Microsoft Clustering Algorithm Technical Reference 
SCORE_METHOD  
SPLIT_METHOD  
STOPPING_TOLERANCE 
Additional Requirements
Choosing and preparing data is an important part of the data mining process. For example, the algorithms that Microsoft provides do not allow duplicate keys. The type of data that is required for each model differs depending on the algorithm. For more information, see the Requirements section of the topic for each algorithm.

After the model has been built and processed, you can view the information by using one of the viewers specific to each model type. Alternatively, you can write custom queries by using Data Mining Extensions (DMX) to obtain more advanced or detailed information about the patterns found in the data.
For information about how to create queries that return model content, see Querying Data Mining Models (Analysis Services - Data Mining).
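As a small illustration of a custom content query, the Python sketch below builds the text of a DMX SELECT statement against the content of a model. The model name is hypothetical, and the default column list is just one plausible selection of content columns; which columns carry useful values depends on the algorithm.

```python
def content_query(model, columns=("NODE_CAPTION", "NODE_TYPE", "NODE_SUPPORT")):
    """Build a DMX query that reads selected columns from a model's content."""
    return f"SELECT {', '.join(columns)}\nFROM [{model}].CONTENT"


# Hypothetical model name.
query = content_query("TM Decision Tree")
print(query)
```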
You can use functions to extend the results that a mining model returns. Some functions also return statistics that represent the probability of an outcome, or other scores. Individual algorithms also support additional functions. For example, if a mining model uses clustering, you can use special functions to find information about the clusters, whereas a model based on the Time Series algorithm offers a different set of functions for making predictions and querying model content. For more information, see the technical reference topic for each algorithm.
For examples of how to query a mining model and how to use the prediction functions that are designed for specific model types, see Querying Data Mining Models (Analysis Services - Data Mining).
For a list of prediction functions that are supported for all algorithm types, see Mapping Functions to Query Types (DMX).
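The following Python sketch shows how general prediction functions might be combined in a singleton prediction query, that is, a query that scores one hand-entered case. It only assembles the DMX text; the model, target column, and input values are hypothetical, and the helper is not part of any Analysis Services API.

```python
def singleton_prediction_query(model, target, inputs):
    """Build a DMX singleton prediction query for one hand-entered case."""

    def literal(value):
        # String values are quoted; numeric values are written as-is.
        return f"'{value}'" if isinstance(value, str) else str(value)

    case = ", ".join(f"{literal(v)} AS [{k}]" for k, v in inputs.items())
    return "\n".join([
        "SELECT",
        f"    Predict([{target}]) AS [Prediction],",
        f"    PredictProbability([{target}]) AS [Probability]",
        f"FROM [{model}]",
        "NATURAL PREDICTION JOIN",
        f"(SELECT {case}) AS t",
    ])


# Hypothetical model, target column, and input case.
prediction_query = singleton_prediction_query(
    "TM Decision Tree", "Bike Buyer", {"Age": 35, "Gender": "M"}
)
print(prediction_query)
```

Returning the probability alongside the predicted value lets you judge how much weight to give each prediction.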
When you experiment with different models to solve a business problem, or build variations on a model, you need to measure the accuracy of each model and also evaluate how well each model answers the business problem. For general information about evaluating data mining models, see Validating Data Mining Models (Analysis Services - Data Mining). For more information about how to chart the accuracy of different mining models, see Tools for Charting Model Accuracy (Analysis Services - Data Mining).
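As a simple way to picture comparing model variants, the Python sketch below computes the fraction of correct predictions for each model on the same set of test cases. The labels and predictions are made-up illustrative values, and plain accuracy is only one possible measure; it does not stand in for the validation tools that Analysis Services provides.

```python
def accuracy(actual, predicted):
    """Return the fraction of cases where the prediction matches the actual value."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and the same length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Made-up test labels and per-model predictions, for illustration only.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = {
    "Decision Tree": [1, 0, 1, 0, 0, 0, 1, 1],
    "Naive Bayes": [1, 1, 1, 0, 0, 1, 1, 1],
}

scores = {name: accuracy(actual, preds) for name, preds in predictions.items()}
best_model = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
print(f"Most accurate on this test set: {best_model}")
```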