Predictor Best Practices
You can improve the performance of the Predictor resource in the following ways:
Control the sample size used to build an analysis model
Control how many attributes are included in an analysis model
Build your model on the same computer that contains your Data Warehouse
You can trade off the accuracy of an analysis model against the time it takes to build by varying the sample size. The larger the sample, the more accurate the analysis model, and the longer it takes to build. However, accuracy does not improve significantly once the sample exceeds roughly 30,000 to 50,000 cases. To control the number of cases used to build an analysis model, modify Number of Cases in the Model Build Properties dialog box. You can also pre-sample your data into a view or table and build the model from that sample. If you do, sample the cases at random instead of choosing them sequentially.
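The difference between random and sequential pre-sampling can be sketched as follows. This is only an illustration in Python, not part of the Predictor resource: the function name sample_cases, the integer case IDs, and the total of 200,000 cases are all hypothetical, and in practice the sample would be materialized as a view or table in your database.

```python
import random


def sample_cases(case_ids, number_of_cases, seed=None):
    """Return a uniform random sample of case IDs, without replacement.

    Hypothetical helper: if fewer cases exist than requested,
    all of them are returned unchanged.
    """
    case_ids = list(case_ids)
    if number_of_cases >= len(case_ids):
        return case_ids
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    return rng.sample(case_ids, number_of_cases)


# Assumed example data: 200,000 cases identified by consecutive integers.
all_cases = range(1, 200_001)

# Random sample: every case has the same chance of being chosen.
sample = sample_cases(all_cases, 30_000, seed=42)

# Sequential choice (what to avoid): only the first 30,000 cases are used,
# so any ordering in the data (for example, by date) biases the model.
sequential = list(all_cases)[:30_000]
```

A random sample draws from the whole range of cases, whereas the sequential slice never sees anything past the first 30,000 rows.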
You can also trade off the accuracy of an analysis model against its build time by controlling the fraction of attributes used as inputs and outputs. The higher the fraction, the more accurate the model, but the longer it takes to build. To control the number of attributes that are included in an analysis model, modify Input Attribute Fraction and Output Attribute Fraction in the Model Build Properties dialog box. A value of 1.0 indicates that the Predictor resource uses all of the available attributes to build the model. If the database contains many attributes (for example, a product catalog with tens or hundreds of thousands of products), it is best to use a lower value. Unless you have a reason to do otherwise, use equal input and output fractions.
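To get a feel for what a given fraction means for a large catalog, the arithmetic can be sketched as below. This is a rough illustration only: the function attributes_used is hypothetical, and it assumes the fraction translates directly into a share of the available attributes, rounded to the nearest whole number. How the Predictor resource actually selects the attributes is not described here.

```python
def attributes_used(total_attributes, fraction):
    """Estimate how many attributes a fraction setting corresponds to.

    Assumption: the count is simply fraction * total, rounded to the
    nearest integer. A fraction of 1.0 means all attributes are used.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be greater than 0 and at most 1.0")
    return round(total_attributes * fraction)


# Assumed example: a catalog with 100,000 products (attributes).
catalog_size = 100_000
all_attributes = attributes_used(catalog_size, 1.0)   # every attribute
quarter = attributes_used(catalog_size, 0.25)         # a lower fraction
```

With equal input and output fractions, the same estimate applies to both settings.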
When you try to improve the results of your prediction model, consider the data sizes and limits and the performance criteria for the Predictor resource.
Also build your analysis models on the same computer that contains your Commerce Server Data Warehouse.
Data Sizes and Limits
The following table shows the average and maximum data sizes for the Predictor resource.
Parameter | Average value | Maximum value |
Number of products in product catalog | 50,000 | 500,000 |
Number of cases to build model | 30,000 | No limit other than the amount of available memory. |
Number of transactions per user | 20 | Total number of attributes |
Size of computed model | 100 KB | Limited by your database limitations |
Maximum number of attributes predicted per call | 10 | Total number of attributes |
Performance Criteria
The following table shows the performance criteria for the Predictor resource. The criteria are based on the average numbers defined in Data Sizes and Limits.
All numbers were measured with the service running on a 400 MHz Pentium II computer with 512 MB of RAM. The client computers were 400 MHz Pentium II computers with 128 MB of RAM.
Criteria | Component | Minimum | Maximum | Average |
Latency to make prediction | Client | 0 milliseconds | 200 milliseconds per prediction call | 100 milliseconds per prediction call |
Throughput of prediction requests | Client | 30 predictions per second per CPU | Not applicable | 50 predictions per second per CPU |
New model build time | Service | Not applicable | 8 hours | 1.5 hours |
Load model from database | Client | Not applicable | 30 seconds | 10 seconds |
Service start time | Service | Not applicable | 45 seconds | 10 seconds |
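The throughput figures in the table can be turned into a rough sizing estimate. The sketch below is an illustration only: the function cpus_needed and the target load of 120 predictions per second are hypothetical, and it assumes that throughput scales linearly with the number of CPUs, which the table does not state. The per-CPU rates apply to the test hardware described above, so measure your own hardware before relying on such an estimate.

```python
import math

# Per-CPU throughput taken from the performance criteria table.
MINIMUM_PREDICTIONS_PER_SECOND_PER_CPU = 30
AVERAGE_PREDICTIONS_PER_SECOND_PER_CPU = 50


def cpus_needed(target_predictions_per_second, per_cpu_rate):
    """Estimate the CPU count for a target load.

    Assumption: throughput scales linearly with the number of CPUs.
    """
    return math.ceil(target_predictions_per_second / per_cpu_rate)


# Assumed example load: 120 predictions per second.
target = 120
typical = cpus_needed(target, AVERAGE_PREDICTIONS_PER_SECOND_PER_CPU)
conservative = cpus_needed(target, MINIMUM_PREDICTIONS_PER_SECOND_PER_CPU)
```

Sizing against the minimum rate instead of the average gives a more conservative estimate.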