Predictor Best Practices

You can improve the performance of the Predictor resource in the following ways:

  • Control the sample size used to build an analysis model

  • Control how many attributes are to be included in an analysis model

  • Build your model on the same computer that contains your Data Warehouse

You can trade off the accuracy of an analysis model against the time it takes to build by varying the sample size. The larger the sample, the more accurate the analysis model, and the longer it takes to build. However, accuracy does not improve significantly once the sample exceeds 30,000 to 50,000 cases. To control the number of cases used to build an analysis model, modify Number of Cases in the Model Build Properties dialog box. You can also pre-sample your data using a view or table and build the model from that sample. If you use this method, sample the cases at random instead of choosing them sequentially.
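
The difference between random and sequential pre-sampling can be sketched as follows. This is only an illustration, not part of the Predictor resource: the function name sample_cases and the idea of holding case IDs in a Python list are assumptions made for the example. In practice the sample would normally be drawn in the database when you define the view or table.

```python
import random


def sample_cases(case_ids, sample_size, seed=None):
    """Return up to sample_size case IDs chosen uniformly at random.

    Hypothetical helper: case_ids stands in for the keys of the cases
    (for example, user or transaction IDs) available for the model build.
    """
    rng = random.Random(seed)  # a seed makes the sample repeatable
    case_ids = list(case_ids)
    if sample_size >= len(case_ids):
        # Fewer cases than requested: use all of them.
        return case_ids
    # random.sample draws without replacement, so no case appears twice.
    return rng.sample(case_ids, sample_size)


# Example: draw 30,000 cases out of 100,000 instead of taking the first 30,000.
all_cases = range(1, 100001)
chosen = sample_cases(all_cases, 30000, seed=1)
```

Taking the first 30,000 rows instead would bias the model toward whatever order the cases happen to be stored in, such as the oldest users or the lowest IDs.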

You can also trade off the accuracy of an analysis model against the time it takes to build by varying the fraction of attributes used as inputs and outputs. The higher the fraction, the more accurate the model, but the longer it takes to build. To control the number of attributes that are included in an analysis model, modify Input Attribute Fraction and Output Attribute Fraction in the Model Build Properties dialog box. A value of 1.0 indicates that the Predictor resource uses all of the available attributes to build the model. If the database contains many attributes (for example, a product catalog with tens or hundreds of thousands of products), it is best to use a lower value. Unless you have a reason to do otherwise, it is generally best to use equal input and output fractions.

When you try to improve the results of your prediction model, consider the data sizes and limits and the performance criteria for the Predictor resource.

It is also recommended that you build your analysis models on the same computer that contains your Commerce Server Data Warehouse.

Data Sizes and Limits

The following table shows the average and maximum data sizes for the Predictor resource.

Parameter                                        Average value  Maximum value
Number of products in product catalog            50,000         500,000
Number of cases to build model                   30,000         No limit other than the amount of available memory
Number of transactions per user                  20             Total number of attributes
Size of computed model                           100 KB         Limited by your database limitations
Maximum number of attributes predicted per call  10             Total number of attributes

Performance Criteria

The following table shows the performance criteria for the Predictor resource. The criteria are based on the average numbers defined in Data Sizes and Limits.

All numbers are measured on a 400 MHz Pentium II with 512 MB of RAM for the service. Client computers are 400 MHz Pentium II with 128 MB of RAM.

Criteria                           Component  Minimum                            Maximum                               Average
Latency to make prediction         Client     0 milliseconds                     200 milliseconds per prediction call  100 milliseconds per prediction call
Throughput of prediction requests  Client     30 predictions per second per CPU  Not applicable                        50 predictions per second per CPU
New model build time               Service    Not applicable                     8 hours                               1.5 hours
Load model from database           Client     Not applicable                     30 seconds                            10 seconds
Service start time                 Service    Not applicable                     45 seconds                            10 seconds
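
The throughput figures in the table can be turned into a rough sizing estimate. The sketch below is illustrative only: the function name cpus_needed is hypothetical, and it assumes that throughput scales linearly with the number of CPUs and that your hardware performs like the reference computers described above. Neither assumption is stated in this topic.

```python
import math

# Per-CPU throughput figures from the Performance Criteria table
# (predictions per second per CPU).
MIN_THROUGHPUT_PER_CPU = 30
AVG_THROUGHPUT_PER_CPU = 50


def cpus_needed(peak_predictions_per_second,
                per_cpu_throughput=MIN_THROUGHPUT_PER_CPU):
    """Estimate the CPUs needed to sustain a target prediction rate.

    Assumes linear scaling across CPUs, which is a simplification.
    """
    if peak_predictions_per_second <= 0:
        return 0
    # Round up: a fractional CPU still requires a whole one.
    return math.ceil(peak_predictions_per_second / per_cpu_throughput)


# A site expecting 200 predictions per second at peak:
conservative = cpus_needed(200)                         # 7, using the minimum figure
typical = cpus_needed(200, AVG_THROUGHPUT_PER_CPU)      # 4, using the average figure
```

Sizing against the minimum figure gives a conservative estimate, and sizing against the average figure gives a more optimistic one.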

See Also

Analysis Model Effectiveness

