This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.

Calculating Average in a Multi-Dimensional Cube

Peter Chen

Averages are among the most requested form of aggregated data in OLAP cubes, but coming up with correct formulas can be surprisingly tricky. In this article, Peter Chen shows you how.

Multi-dimensional cubes such as those provided by SQL Server 7's OLAP Services and SQL Server 2000's OLAP Services provide you with the mechanism for rapid response times to queries of pre-calculated aggregated data such as count, sum, and average. However, the AVG aggregate function is missing from the drop-down list box of the measurement properties in the cube editor (which only includes SUM, COUNT, MIN, MAX, and DISTINCT COUNT.) Furthermore, although Analysis Services provides an AVG function, you can only use it with a "set." Here's the definition of the AVG function from BOL:

  • AVG: average value of a numeric expression evaluated over a set.

    avg(«Set»[, «Numeric Expression»])
    

The AVG function is used when you want to calculate average by dividing the total amount by the member count. Take the Sales cube in the FoodMart2000 database, for example. If you create a calculated measure with the following value expression, it will show you the average unit sales at different levels in the time dimension:

Avg([time].currentmember.children,[Measures].[Unit Sales])

In other words, you can't use the AVG function in cases where you need to calculate average by dividing the total amount by a specific count varying at different levels or different members. For example, how can you calculate an amount average by work days? The total work days vary each month.

Before you attempt to solve this task, you need to understand the difference of the processing sequence order between the derived measurement and calculated measurement in a cube. Derived measurement such as count and sum is done during the cube process, while calculated measurement is done after the cube process is completed. Therefore, the average calculation should be defined in the calculated measurement.

Average by the work days

It's not uncommon—especially in the manufacturing or financial industries—to need to calculate the weekly, monthly, quarterly, or yearly average by the actual work days instead of calendar elapsed days. Here's how to do it: You need to add the number of work days attributes in the time dimension table on SQL Server, and you need to add the corresponding time dimension property—"num_work_days"—in the cube.

For example, let's assume that users want to view the average funded loan count by work days for the past six months. In this case, the smallest granularity is month, and we only need to calculate the work days each month.

CREATE TABLE [work_day] 
(
  [work_day_key] [smallint] IDENTITY (0, 1) NOT NULL ,
  [work_date] [datetime] NULL ,
  [mtd_work_days] [tinyint] NULL ,
  CONSTRAINT [work_day_pk] PRIMARY KEY  CLUSTERED 
  (
        [work_day_key]
  )  ON [PRIMARY] 
)

Here's representative sample data for work_day:

work_day_key

work_date

mtd_work_days

1892

2001-05-29 00:00:00.000

20

1893

2001-05-30 00:00:00.000

21

1894

2001-05-31 00:00:00.000

22

1895

2001-06-01 00:00:00.000

1

1896

2001-06-04 00:00:00.000

2

1897

2001-06-05 00:00:00.000

3

CREATE TABLE [work_month]
(
  [work_month_key] [smallint] IDENTITY (0, 1) NOT NULL ,
  [Year_month] [varchar] (7) NULL ,
  [num_work_days] [tinyint] NULL ,
  CONSTRAINT [work_month_pk] PRIMARY KEY  CLUSTERED 
  (
        [work_month_key]
  )  ON [PRIMARY] 
)

You need to populate the work_month table from a work_day table and create a view (vw_past6_work_month, in this example) only for the past six months. The data from the view will look something like this as of 6/9/01. The view is included in the accompanying Download file.

work_month_key

year_month

num_work_days

85

2000-12

20

86

2001-01

21

87

2001-02

19

88

2001-03

22

89

2001-04

21

90

2001-05

22

91

2001-06

6

After you create a past_6month shared dimension with the vw_past6_work_month view through the dimension wizard, you need to add a [num days month] to this dimension property of the year_month level. In the dimension editor, add a new member property by right-clicking on the [member properties] as shown in Figure 1.

The rest is simple. Create a calculated dimension called AvgLoanCount in the calculated member builder and input the following MDX code in the value expression box:

iif([past6_month].CurrentMember.level.name = 
"Year Month", [Measures].[funded count]/
cdbl([past6_month].currentmember.properties
("num work days")), 0)

Note that the average is done only at the "Year Month" level. Because the member property is saved as text/string data type, cdbl(), a VBA function, is used to convert the [num work days] member property to numeric.

Weighted average

The average is calculated by the contributing members. All you need to do is create two derived measurements and one calculated measurement. The derived measurements are shown here:

SumAmount: sum of the numeric measurement
CountMember: count of the contributing members

And this is the calculated measurement:

SumAmount/CountMember

To avoid a divided-by-zero error, use the iif function to check whether the denominator is zero:

iif([Measures].[ CountMember]=0,0,
[Measures].[ SumAmount]/[Measures].[ CountMember])

Because the CountMember might not be useful to the client, you should set the [visible] property in the advanced folder to false.

[Related articles: Amir Netz's April 1999 white paper on "OLAP Services: DISTINCT COUNT and Basket Analysis" at https://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnolap/html/distinct2.asp and Carl Nolan's "Introduction to Multidimensional Expressions (MDX)" at https://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnolap/html/intromdx.asp.—Ed.]

In conclusion, although there might be other kinds of averages that I haven't covered in this article, by knowing that calculated measurement is calculated after cube processing, it should be very easy to implement your average.

Download CUBEAVG.SQL

To find out more about Microsoft SQL Server Professional and Pinnacle Publishing, visit their website at http://www.pinpub.com/html/main.isx?sub=57

Note: This is not a Microsoft Corporation website. Microsoft is not responsible for its content.

This article is reproduced from the September 2001 issue of Microsoft SQL Server Professional. Copyright 2001, by Pinnacle Publishing, Inc., unless otherwise noted. All rights are reserved. Microsoft SQL Server Professional is an independently produced publication of Pinnacle Publishing, Inc. No part of this article may be used or reproduced in any fashion (except in brief quotations used in critical articles and reviews) without prior consent of Pinnacle Publishing, Inc. To contact Pinnacle Publishing, Inc., please call 1-800-493-4867 x4209.