Performance considerations and approaches in BI planning solutions and scenarios


Applies to: SharePoint Server 2010 Enterprise

Topic Last Modified: 2011-01-28

In this article:

  • Keep dimension sizes as small as the planning process requires.

  • Keep the number of dimensions used in a cube to the minimum required for planning.

  • Avoid MdxScript calculations when possible, and make maximum use of client-side calculations, such as on-sheet Excel calculations, or relational calculations that are scheduled to run periodically (for example, currency conversion).

  • When using MdxScript rules, be thorough in checking not only their logical correctness but also their performance. Sometimes an MdxScript statement can be rewritten in a slightly different but logically equivalent form that yields large performance gains.

  • Avoid dimension hierarchies that are very deep.

  • Avoid complex MDX queries when you design reports and input forms. Queries that contain a WITH clause or other calculated members cause the Microsoft SQL Server Analysis Services (SSAS) server to use limited caching logic, and therefore result in reduced scale and performance.

  • If you are working with a large set of data, create multiple partitions to separate static data from volatile data and manage each appropriately.

  • Avoid cell-level security inside role security definitions, because caching logic is limited when cell-level security applies.

  • Design forms (PivotTables) that use filters and slicers to limit the number of cells queried for the form. That is, do not design the form with all possible data entry visible; instead, provide a filter or slicer so that the form layout (size) remains consistent and is pivoted by changing the slicer.

  • Having multiple PivotTables increases query and response time. Limit the number of PivotTables both on the same sheet and in the same workbook.

  • When you design forms and reports, set the default for the filters to a lowest-level member of the hierarchy. This allows the default queries to forgo unnecessary aggregation calculations on the cube.

  • Do not accumulate large sets of changes on the client side (for example, within the write-back changes of Excel PivotTables); instead, incrementally publish the changes to the server for optimal runtime performance of SQL Server Analysis Services.

  • Remote users on a WAN might have slower network connections and response times. To service remote users, it might be better to do the following:

  • Provide them with their own SQL/SharePoint instance closer to their physical location.

  • Provide them with Remote Desktop Services access to Excel on a computer that is closer to the SQL/SharePoint server.

For more information, see SQL Server 2008 White Paper: Analysis Services Performance Guide.
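To illustrate the WITH-clause guidance above, the following is a hypothetical MDX fragment showing the pattern that should be avoided in report and form queries (cube, dimension, and measure names are invented for illustration):

```mdx
-- A calculated member defined in a WITH clause is scoped to this one
-- query, which limits how much SSAS can reuse from its formula-engine
-- cache across queries and users.
WITH MEMBER [Measures].[Variance] AS
    [Measures].[Actual] - [Measures].[Budget]
SELECT [Measures].[Variance] ON COLUMNS,
       [Account].[Account].Members ON ROWS
FROM [Planning];
```

Where such a calculation is unavoidable, the bullets above suggest preferring client-side (on-sheet) or scheduled relational calculations over per-query calculated members.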

ETL considerations

ETL is the process of extracting data from source systems, transforming that data, and loading it into the data model. Microsoft SQL Server Integration Services (SSIS) is Microsoft's premier technology for ETL processes. Data integrators design SSIS packages by using Microsoft Business Intelligence Development Studio (BIDS), which provides the following benefits:

  • Access to a large library of built-in ETL logic for tasks such as merging datasets, column lookups, error handling, and so on.

  • Fast data transfers for data loading from source to destination.

  • UI to visualize the ETL process.

Planning ETL packages can be decomposed into the following areas:

  • Data import for dimensions, hierarchies, and facts.

  • Data export from fact tables back to source systems.

For our solution, we will create a staging table for each of the dimension, hierarchy, and fact tables that exist in our relational database. The staging tables will be used initially as the target table for loading data from the source system. It is possible to do ETL without using any staging tables at all, because SSIS has the capability of transforming most of the data in-memory and loading directly to the solution tables. However, the benefits of having staging tables include the following:

  • Snapshot the tables and relationships from the source system to facilitate working off a version of the data without risk of losing future access to the source system or being subject to unexpected data changes.

  • Easy auditing of the staging data before it is pushed to the solution tables. For instance, you might decide to bulk load all the solution tables only after someone checks the staging tables for correctness.
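As a sketch of the staging approach, a staging table can simply mirror the shape of the corresponding source-system table (the table and column names below are illustrative only):

```sql
-- Illustrative staging table that mirrors a source-system account
-- dimension. Rows are loaded here first, audited, and then pushed
-- to the solution table in a separate step.
CREATE TABLE Staging_DimAccount (
    SourceAccountId   int           NOT NULL,
    AccountName       nvarchar(100) NOT NULL,
    ParentAccountId   int           NULL,
    LoadedOn          datetime      NOT NULL DEFAULT (GETDATE())
);
```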

After data is loaded to the staging tables and the necessary transformations are completed, the process of loading from the staging tables to the solution tables can begin. By using SSIS, loading from staging to solution tables can be as simple as mapping the columns of the source table to the columns of the destination table. SSIS provides a very powerful set of ETL features; for more information, see the White Paper: Introduction to SQL Server 2008 Integration Services.
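The staging-to-solution load can be sketched as a plain INSERT…SELECT; all object names here are hypothetical:

```sql
-- Push audited staging rows into the solution dimension table,
-- skipping rows that already exist there.
INSERT INTO DimAccount (SourceAccountId, AccountName, ParentAccountId)
SELECT s.SourceAccountId, s.AccountName, s.ParentAccountId
FROM Staging_DimAccount AS s
WHERE NOT EXISTS (
    SELECT 1
    FROM DimAccount AS d
    WHERE d.SourceAccountId = s.SourceAccountId
);
```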

Turn the SQL recovery model to simple when performing ETL. This improves performance by reducing the overhead of unnecessary logging on the database.
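Assuming the planning database is named PlanningDB (a placeholder), the recovery model can be switched before a large load and restored afterward:

```sql
-- Reduce transaction-log overhead during bulk ETL.
ALTER DATABASE PlanningDB SET RECOVERY SIMPLE;

-- ... run the ETL packages ...

-- Restore full logging afterward (and take a fresh full backup,
-- because the log chain is broken while in simple recovery).
ALTER DATABASE PlanningDB SET RECOVERY FULL;
```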

It is typical in most scenarios that after a planning process is complete, the results must be collected and processed for export back to the source system where the data will be persisted and made available for reporting requirements.

In our solution, because the data is already stored in the fact tables in a normalized manner, preparing the data for export only requires some simple joins with the dimension tables.
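A hedged sketch of such an export query, with invented table and column names:

```sql
-- Resolve surrogate keys back to source-system identifiers
-- before exporting fact rows to the source system.
SELECT a.SourceAccountId,
       t.FiscalPeriod,
       f.Amount
FROM FactBudget AS f
JOIN DimAccount AS a ON a.AccountKey = f.AccountKey
JOIN DimTime    AS t ON t.TimeKey    = f.TimeKey;
```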

Data export, however, can become slightly trickier when the data that you want to export does not exist in your fact tables at all, and instead exists only as calculations on the cube. In this scenario, use ad hoc distributed queries against the OLAP cube. (For more information, see Planning modeling and reporting guide for BI planning solutions and scenarios.)
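One way to retrieve cube-only calculations into the relational engine is an ad hoc distributed query through the MSOLAP OLE DB provider. In this sketch the server, catalog, cube, and member names are placeholders, and the 'Ad Hoc Distributed Queries' option must be enabled on the SQL Server instance:

```sql
-- Pull a calculated measure out of the cube into the relational
-- engine by embedding an MDX query in OPENROWSET.
SELECT *
FROM OPENROWSET(
    'MSOLAP',
    'DATASOURCE=localhost; Initial Catalog=PlanningDB;',
    'SELECT [Measures].[Variance] ON COLUMNS,
            [Account].[Account].Members ON ROWS
     FROM [Planning]'
);
```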

Security considerations

Security should be defined on the SSAS database through security roles. For optimal performance, it is best to keep security as simple as possible.

The highest level of security is database security, and we recommend that each information worker (IW) who requires access to the data model have at least Read Definition permission on the database.

The second broadest level of security is defined on the cube; it controls whether individual cubes are read-only, read/write, or inaccessible.

The next level of security is defined on dimensions. You can choose to make particular members of a dimension visible or hidden. We recommend that you stop at dimension security for optimal performance from SQL Server Analysis Services.
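In a role's dimension-data security, visibility is controlled with MDX member sets; for example, an allowed set might look like the following (the dimension and member names are illustrative):

```mdx
-- Allowed set for a role: the members of the Department attribute
-- hierarchy that this role is permitted to see.
{ [Department].[Department].&[Sales],
  [Department].[Department].&[Marketing] }
```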

The lowest level at which you can define security is the cell level. Performance will be adversely affected, because caching logic is severely limited on queries from any user who has cell-level security defined.

You can create security roles on your SQL Server Analysis Services database by using Microsoft SQL Server Business Intelligence Development Studio (BIDS).

Be aware that BIDS can define highly complex security configurations, which may require more technical expertise for an administrator to set up and maintain. However, this can be a good area for customization: security may be defined elsewhere, for example in a structured spreadsheet or a linked table from SharePoint, and then fed into a translation module that ultimately updates the OLAP security accordingly.

Dynamic security for SQL Server 2008 Analysis Services can also be set up for cases in which standard SQL Server Analysis Services roles are insufficient to cover all the complex relationships that exist. This scenario occurs when:

  • Each IW requires access to a specific set of dimension members.

  • There are few overlaps in dimensional security requirements; that is, there are many unique combinations of dimension members assigned to different IWs.
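A common dynamic-security pattern is an allowed set driven by the MDX UserName function. The following is a sketch that assumes a hypothetical Security Permissions measure group mapping user logins to dimension members:

```mdx
-- Allowed set evaluated per connected user: keep only the departments
-- that have a row for this user in the (hypothetical) permissions
-- measure group.
NonEmpty(
    [Department].[Department].Members,
    ( StrToMember("[Users].[Login].&[" + UserName() + "]"),
      [Measures].[Permission Count] )
)
```

Because the set is evaluated per user at connection time, one role can serve many IWs with unique member combinations, at the cost of maintaining the permissions fact data.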