Predictor Best Practices

06/07/2013

You can improve the performance of the Predictor resource in the following ways:

Control the sample size used to build an analysis model
Control how many attributes are to be included in an analysis model
Build your model on the same computer that contains your Data Warehouse

You can trade the accuracy of an analysis model against the time it takes to build it by varying the sample size. The larger the sample, the more accurate the analysis model and the longer it takes to build. However, accuracy does not improve significantly once the sample exceeds 30,000 to 50,000 cases. You control the number of cases used to build an analysis model by modifying Number of Cases in the Model Build Properties dialog box. You can also pre-sample your data using a view or table and build the model from that sample. If you use this method, sample the cases at random instead of choosing them sequentially.
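The difference between random and sequential pre-sampling can be sketched as follows. This is a minimal illustration in Python, not part of the Predictor resource: the function name `sample_cases`, the list of case IDs, and the fixed seed are all assumptions made for the example. In practice the sample would be materialized as a view or table in your database.

```python
import random


def sample_cases(case_ids, number_of_cases, seed=None):
    """Return a simple random sample of case IDs, without replacement.

    Hypothetical helper: case_ids stands in for the keys of the cases
    available in the Data Warehouse.
    """
    rng = random.Random(seed)
    if number_of_cases >= len(case_ids):
        # Fewer cases available than requested: use them all.
        return list(case_ids)
    return rng.sample(case_ids, number_of_cases)


# Hypothetical population of 200,000 cases, stored in insertion order.
all_cases = list(range(1, 200_001))

# Sequential choice: takes only the oldest 30,000 cases (what to avoid).
sequential = all_cases[:30_000]

# Random choice: draws 30,000 cases from the whole population.
sampled = sample_cases(all_cases, 30_000, seed=42)

print(max(sequential))  # highest case ID covered by the sequential choice
print(len(sampled))     # number of cases in the random sample
```

The sequential slice never sees any case after the first 30,000, so a model built from it reflects only the earliest data, whereas the random sample draws from the full range.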

You can also control the accuracy of an analysis model and the time it takes to build it by varying the fractions of attributes used as inputs and outputs. The higher the fraction, the more accurate the model, but the longer it takes to build. To control the number of attributes that are included in an analysis model, modify Input Attribute Fraction and Output Attribute Fraction in the Model Build Properties dialog box. A value of 1.0 indicates that the Predictor resource is to use all of the available attributes to build the model. If the database contains many attributes (for example, a product catalog with tens or hundreds of thousands of products), it is best to use a lower value. Unless you have a reason to do otherwise, it is generally best to use equal input and output fractions.
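As a rough illustration of what a fraction means in terms of attribute counts, the following Python sketch multiplies a total attribute count by a fraction. The function name `attributes_used` and the rounding rule are assumptions made for the example. How the Predictor resource actually rounds the count or chooses which attributes to keep is not specified here.

```python
def attributes_used(total_attributes, fraction):
    """Approximate how many attributes a given fraction selects.

    Assumption: the count is simply total * fraction, rounded to the
    nearest whole attribute, with at least one attribute kept.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be greater than 0 and at most 1.0")
    return max(1, round(total_attributes * fraction))


# Hypothetical catalog of 50,000 products, each treated as one attribute.
total = 50_000

print(attributes_used(total, 1.0))   # 50000: all attributes are used
print(attributes_used(total, 0.25))  # 12500: a quarter of the attributes
```

Using the same value for both fractions, as recommended above, keeps the input and output attribute counts equal.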

When you try to improve the results of your prediction model, consider the data sizes and limits and the performance criteria for the Predictor resource.

It is also recommended that you build your analysis models on the same computer that contains your Commerce Server Data Warehouse.

Data Sizes and Limits

The following table shows the average and maximum data sizes for the Predictor resource.

Parameter	Average value	Maximum value
Number of products in product catalog	50,000	500,000
Number of cases to build model	30,000	No limit other than the amount of available memory.
Number of transactions per user	20	Total number of attributes
Size of computed model	100 KB	Limited by your database limitations
Maximum number of attributes predicted per call	10	Total number of attributes

Performance Criteria

The following table shows the performance criteria for the Predictor resource. The criteria are based on the average numbers defined in Data Sizes and Limits.

All numbers were measured with the service running on a 400 MHz Pentium II computer with 512 MB of RAM. Client computers were 400 MHz Pentium II computers with 128 MB of RAM.

Criteria	Component	Minimum	Maximum	Average
Latency to make prediction	Client	0 milliseconds	200 milliseconds per prediction call	100 milliseconds per prediction call
Throughput of prediction requests	Client	30 predictions per second per CPU	Not applicable	50 predictions per second per CPU
New model build time	Service	Not applicable	8 hours	1.5 hours
Load model from database	Client	Not applicable	30 seconds	10 seconds
Service start time	Service	Not applicable	45 seconds	10 seconds
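The throughput figures in the table can be turned into a rough capacity estimate. The Python sketch below assumes that throughput scales linearly with the number of CPUs, which the table does not guarantee. The function name `cpus_needed` and the peak load of 200 requests per second are hypothetical.

```python
import math


def cpus_needed(peak_requests_per_second, predictions_per_second_per_cpu):
    """Estimate the CPUs required to serve a peak prediction load.

    Assumption: throughput scales linearly with the number of CPUs.
    """
    if predictions_per_second_per_cpu <= 0:
        raise ValueError("per-CPU throughput must be positive")
    return math.ceil(peak_requests_per_second / predictions_per_second_per_cpu)


MINIMUM_THROUGHPUT = 30  # predictions per second per CPU (table minimum)
AVERAGE_THROUGHPUT = 50  # predictions per second per CPU (table average)

# Hypothetical peak load of 200 prediction requests per second.
print(cpus_needed(200, MINIMUM_THROUGHPUT))  # 7, sizing for the minimum rate
print(cpus_needed(200, AVERAGE_THROUGHPUT))  # 4, sizing for the average rate
```

Sizing against the minimum rate is the more conservative choice. The figures apply to the 400 MHz Pentium II test hardware described above and the average data sizes, so treat the result as a starting point only.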
