Performance and Scalability of the Content Selection Framework

06/07/2013

This topic presents tips for ensuring that your custom extensions to the Content Selection Framework (CSF) maintain maximum performance.

Make pipeline components free-threaded and marked as poolable

Although you can create CSF pipeline components in Microsoft Visual Basic 6.0, doing so is not recommended. Visual Basic components are always apartment-threaded, so the pipeline has to create a new instance of your component for each content request. For maximum performance, implement your pipeline components in Microsoft Visual C++ using the Active Template Library (ATL). Make the components both-threaded, and indicate to the pipeline that it can pool them by adding the Component Object Model (COM) category CATID_POOLABLE to your component's registration information.
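The benefit of pooling can be illustrated with a minimal sketch in standard C++. It is not CSF or COM code: Component, ComponentPool, and ServeWithPool are hypothetical names introduced only for illustration, and the real pipeline manages pooling internally. The sketch assumes a component that is safe to reuse across requests.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pipeline component; not a real CSF or COM type.
struct Component {
    int requestsServed = 0;
    void Execute() { ++requestsServed; }
};

// Hands out idle instances and takes them back after each request,
// so an instance is constructed only when no idle one is available.
class ComponentPool {
public:
    std::unique_ptr<Component> Acquire() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (idle_.empty()) {
            ++created_;
            return std::make_unique<Component>();
        }
        std::unique_ptr<Component> component = std::move(idle_.back());
        idle_.pop_back();
        return component;
    }

    void Release(std::unique_ptr<Component> component) {
        assert(component != nullptr);  // callers must return a live instance
        std::lock_guard<std::mutex> lock(mutex_);
        idle_.push_back(std::move(component));
    }

    std::size_t CreatedCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return created_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::unique_ptr<Component>> idle_;
    std::size_t created_ = 0;
};

// Serves `requests` sequential requests and reports how many instances were built.
inline std::size_t ServeWithPool(int requests) {
    ComponentPool pool;
    for (int i = 0; i < requests; ++i) {
        std::unique_ptr<Component> component = pool.Acquire();
        component->Execute();
        pool.Release(std::move(component));
    }
    return pool.CreatedCount();
}
```

In this sketch, any number of sequential requests is served by a single instance, whereas a component that cannot be pooled would be constructed once per request.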

Avoid database access on a per-request basis

For performance reasons, CSF was designed to avoid database or disk access on each content request. Instead, it caches all of the content in in-memory ContentListFactory and Dictionary objects. It queues up campaign item event data and writes it out during the next cache refresh. When you write your own extensions to the CSF, try to retain this design wherever possible.
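The pattern described above (serve from memory, queue events, write them out at the next refresh) might look roughly like the following sketch. FakeStore and ContentCache are hypothetical stand-ins written for this example; they are not the ContentListFactory or Dictionary interfaces.

```cpp
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical backing store that counts round trips for illustration.
struct FakeStore {
    std::unordered_map<int, std::string> content;
    std::vector<int> recordedEvents;
    int roundTrips = 0;

    std::unordered_map<int, std::string> LoadAll() {
        ++roundTrips;
        return content;
    }
    void WriteEvents(const std::vector<int>& events) {
        ++roundTrips;
        recordedEvents.insert(recordedEvents.end(), events.begin(), events.end());
    }
};

class ContentCache {
public:
    explicit ContentCache(FakeStore& store) : store_(store) { Refresh(); }

    // Per-request path: reads memory only and queues the event.
    bool Serve(int itemId, std::string& body) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = items_.find(itemId);
        if (it == items_.end()) return false;
        body = it->second;
        pendingEvents_.push_back(itemId);
        return true;
    }

    // Periodic path: flush queued events, then reload the content.
    void Refresh() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!pendingEvents_.empty()) {
            store_.WriteEvents(pendingEvents_);
            pendingEvents_.clear();
        }
        items_ = store_.LoadAll();
    }

private:
    FakeStore& store_;
    std::mutex mutex_;
    std::unordered_map<int, std::string> items_;
    std::vector<int> pendingEvents_;
};

// Returns the number of store round trips after serving `requests` requests.
inline int RoundTripsAfterServing(int requests) {
    FakeStore store;
    store.content[1] = "banner";
    ContentCache cache(store);
    std::string body;
    for (int i = 0; i < requests; ++i) cache.Serve(1, body);
    return store.roundTrips;
}
```

Here the only round trip before the first refresh is the initial load, regardless of how many requests are served. The trade-off is that queued events are lost if the process stops before the next refresh.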

Too much content to fit in a single in-memory cache

If the amount of content being served by the CSF grows too large, it may not be possible to store all of the content in a single ContentListFactory object. For example, if there are more than 1,000 content items being considered, they should be segmented into multiple content caches. Find a dimension to segment on, such as page group or site name.
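As a rough illustration of segmenting, the sketch below splits one large content list into per-page-group caches so that each request consults only the segment for its page group. ContentItem, SegmentByPageGroup, and CandidatesFor are hypothetical names, not part of the CSF object model.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical content record; pageGroup is the segmentation dimension.
struct ContentItem {
    int id;
    std::string pageGroup;
};

using SegmentedCaches = std::map<std::string, std::vector<ContentItem>>;

// Builds one smaller cache per page group (done at cache-refresh time).
inline SegmentedCaches SegmentByPageGroup(const std::vector<ContentItem>& all) {
    SegmentedCaches caches;
    for (const ContentItem& item : all) {
        caches[item.pageGroup].push_back(item);
    }
    return caches;
}

// Returns only the segment relevant to the request's page group.
inline const std::vector<ContentItem>& CandidatesFor(const SegmentedCaches& caches,
                                                     const std::string& pageGroup) {
    static const std::vector<ContentItem> kEmpty;
    auto it = caches.find(pageGroup);
    if (it == caches.end()) return kEmpty;
    return it->second;
}
```

A request for one page group then considers only that group's items rather than the full set.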

Alternatively, in the Load Context stage of your CSF pipeline, you can make a query against a content store, such as a Microsoft SQL Server 2000 database. Use the results of the query to build a ContentListFactory object unique to the request that contains a small subset of the content. If you use this method, build filtering in as part of your query to avoid building a new index on each ContentListFactory object that is created.
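The per-request alternative can be sketched as follows, with the filter criteria pushed into the query so that the resulting list needs no index of its own. ContentStore here is an in-memory stand-in introduced for this example; in practice its Query method would correspond to a parameterized SQL query with a WHERE clause. Row, RequestContext, and LoadContext are likewise hypothetical names.

```cpp
#include <string>
#include <utility>
#include <vector>

struct Row {
    int id;
    std::string pageGroup;
    std::string size;
};

// Hypothetical content store; stands in for a database.
class ContentStore {
public:
    explicit ContentStore(std::vector<Row> rows) : rows_(std::move(rows)) {}

    // The filtering happens inside the query, not on the returned list.
    std::vector<Row> Query(const std::string& pageGroup, const std::string& size) const {
        std::vector<Row> matches;
        for (const Row& row : rows_) {
            if (row.pageGroup == pageGroup && row.size == size) matches.push_back(row);
        }
        return matches;
    }

private:
    std::vector<Row> rows_;
};

struct RequestContext {
    std::string pageGroup;
    std::string size;
};

// Load Context step: build a small, request-specific content list.
inline std::vector<Row> LoadContext(const ContentStore& store, const RequestContext& context) {
    return store.Query(context.pageGroup, context.size);
}
```

Note that this approach trades the no-database-per-request design for a smaller working set, so it is best reserved for content volumes that cannot reasonably be cached in memory.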

Prefer filtering to expression evaluation

When you have a dimension that is commonly used for targeting on your site, consider using a filter instead of expressions to target that dimension. Because filters use pre-built indexes on the ContentListFactory object, they are more efficient than expressions. However, filters generally require a site developer to set them up, whereas a business manager can create and manage expressions.
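The difference in cost can be seen in a small sketch that contrasts an index built once with a predicate evaluated against every item. AdList, FilterBySize, and Where are hypothetical names chosen for this example and do not reflect how ContentListFactory implements its indexes.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct Ad {
    int id;
    std::string size;
};

class AdList {
public:
    // The index on size is built once, when the list is created.
    explicit AdList(std::vector<Ad> ads) : ads_(std::move(ads)) {
        for (std::size_t i = 0; i < ads_.size(); ++i) {
            bySize_[ads_[i].size].push_back(i);
        }
    }

    // Filter: a single hash lookup, independent of how many ads do not match.
    std::vector<std::size_t> FilterBySize(const std::string& size) const {
        auto it = bySize_.find(size);
        if (it == bySize_.end()) return {};
        return it->second;
    }

    // Expression: the predicate runs once for every ad in the list.
    std::vector<std::size_t> Where(const std::function<bool(const Ad&)>& expression) const {
        std::vector<std::size_t> matches;
        for (std::size_t i = 0; i < ads_.size(); ++i) {
            if (expression(ads_[i])) matches.push_back(i);
        }
        return matches;
    }

private:
    std::vector<Ad> ads_;
    std::unordered_map<std::string, std::vector<std::size_t>> bySize_;
};
```

Both calls return the same positions for a size match, but the filter does work proportional to the number of matches, while the expression does work proportional to the size of the whole list.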

The "funnel effect"

Try to arrange your filtering and scoring operations in order of increasing performance cost. Filtering should be first (hence its order in the CSF pipeline), followed by rudimentary scoring, such as application of exposure limits. The last step should generally be expression evaluation because it is usually the most costly operation. Your goal should be to eliminate as much content as possible, as early as possible. For example, if your site uses multiple ad banner sizes, any given user request is usually for only one of those sizes, so a filter can quickly eliminate all ads whose size does not match the requested size. Look for other opportunities to apply such filtering.
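A minimal sketch of the funnel, assuming three stages ordered from cheapest to most expensive, is shown below. Candidate, FunnelResult, and RunFunnel are hypothetical names; the counter exists only to show how few items reach the costly expression stage.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

struct Candidate {
    int id;
    std::string size;
    int exposures;
    int exposureLimit;
};

struct FunnelResult {
    std::vector<int> selectedIds;
    int expressionEvaluations = 0;
};

inline FunnelResult RunFunnel(const std::vector<Candidate>& all,
                              const std::string& requestedSize,
                              const std::function<bool(const Candidate&)>& expression) {
    // Stage 1 (cheapest): filter on the requested banner size.
    std::vector<const Candidate*> survivors;
    for (const Candidate& candidate : all) {
        if (candidate.size == requestedSize) survivors.push_back(&candidate);
    }

    // Stage 2 (cheap): drop items that have reached their exposure limit.
    survivors.erase(
        std::remove_if(survivors.begin(), survivors.end(),
                       [](const Candidate* c) { return c->exposures >= c->exposureLimit; }),
        survivors.end());

    // Stage 3 (most costly): evaluate the expression only on what is left.
    FunnelResult result;
    for (const Candidate* candidate : survivors) {
        ++result.expressionEvaluations;
        if (expression(*candidate)) result.selectedIds.push_back(candidate->id);
    }
    return result;
}
```

With four candidates, of which one has the wrong size and one has exhausted its exposure limit, the expression runs only twice instead of four times.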