Understanding Aggregations in Formulas
This topic introduces aggregations and provides an overview of the types of aggregations that are possible with PowerPivot for Excel. PowerPivot for Excel contains these tools for building aggregations:
You can build PivotTables and PivotCharts that are based on PowerPivot data. Excel PivotTables are a popular tool for grouping and summarizing data in worksheets. PowerPivot is integrated with the PivotTable features in Excel and provides many enhancements.
You can use the DAX formula language to design custom aggregations. DAX can be used to create calculated columns in PowerPivot tables and to create measures in PivotTables and PivotCharts.
The final section of this topic provides links to more detailed information on how to build aggregations.
Aggregations are a way of collapsing, summarizing, or grouping data. When you start with raw data from tables or other data sources, the data is often flat, meaning there is lots of detail but it has not been organized or grouped in any way. This lack of summaries or structure might make it difficult to discover patterns in the data. Therefore, an important part of the analyst's job is to define aggregations that simplify, abstract, or summarize patterns in answer to a specific business question.
Choosing Groups for Aggregation
When you aggregate data, you group data by attributes such as product, price, region, or date and then define a formula that works on all the data in the group. For example, when you create a total for a year, you are creating an aggregation. If you then create a ratio of this year over the previous year and present those as percentages, it is a different type of aggregation.
The decision of how to group the data is driven by the business question. For example, aggregations can answer the following questions:
Counts How many transactions were there in a month?
Averages What were the mean sales in this month, by salesperson?
Minimum and maximum values Which sales districts were the top five in terms of units sold?
To create a calculation that answers these questions, you must have detailed data that contains the numbers to count or sum, and that numeric data must be related in some way to the groups that you will use to organize the results.
If the data does not already contain values that you can use for grouping, such as a product category or the name of the geographical region where the store is located, you might want to introduce groups to your data by adding categories. When you build groups in Excel, you must manually type or select the groups you want to use from among the columns in your worksheet.
However, in a relational system, hierarchies such as categories for products are often stored in a different table than the fact or value table. Usually the category table is linked to the fact data by some kind of key. For example, suppose you find that your data contains product IDs but not the names of products or their categories. To add the category to a flat Excel worksheet, you would have to copy in the column that contained the category names. However, in a PowerPivot workbook, you can import the product category table to your workbook, create a relationship between the table with the number data and the product category list, and then use the categories to group data. For more information, see Creating Relationships Between Tables.
Choosing a Function for Aggregation
After you have identified and added the groupings to use, you must decide which mathematical functions to use for aggregation. Often the word aggregation is used as a synonym for the mathematical or statistical operations that are used in aggregations, such as sums, averages, minimum, or counts. However, PowerPivot for Excel enables you to create custom formulas for aggregation, in addition to the standard aggregations found in Excel.
For example, given the same set of values and groupings that were used in the preceding examples, you could create custom aggregations that answer the following questions:
Filtered counts How many transactions were there in a month, excluding the end-of-month maintenance window?
Ratios using averages over time What was the percentage growth or decline in sales compared to the same period last year?
Grouped minimum and maximum values Which sales districts were ranked top for each product category, or for each sales promotion?
When you have a general idea of how your data should be grouped to be meaningful, and the values that you want to work with, you can decide whether to build a PivotTable or create calculations within a table. PowerPivot for Excel extends and improves the native ability of Excel to create aggregations such as sums, counts, or averages. You can create custom aggregations in PowerPivot either within the PowerPivot window, or within the Excel PivotTable area.
In a calculated column, you can create aggregations that take into account the current row context to retrieve related rows from another table, and then sum, count, or average those values in the related rows.
In a measure, you can create dynamic aggregations that use both filters defined within the formula, and filters imposed by the design of the PivotTable and the selection of Slicers, column headings, and row headings.
For more information, see Building Formulas for Calculated Columns and Measures.
Adding Groupings to a PivotTable
When you design a PivotTable, you drag fields that represent groupings, categories, or hierarchies, to the columns and rows section of the PivotTable to group the data. You then drag fields that contain numeric values into the values area so that they can be counted, averaged, or summed.
If you add categories to a PivotTable but the category data is not related to the fact data, you might get an error or peculiar results. Usually PowerPivot for Excel will try to correct the problem, by automatically detecting and suggesting relationships. For more information, see Work with Relationships in PivotTables.
You can also drag fields into Slicers, to select certain groups of data for viewing. Slicers are a new feature in Excel and in PowerPivot for Excel that let you interactively group, sort, and filter the results in a PivotTable.
Working with Groupings in a Formula
You can also use groupings and categories to aggregate data that is stored in tables by creating relationships between tables, then creating formulas that leverage those relationships to look up related values.
In other words, if you want to create a formula that groups values by a category, you would first use a relationship to connect the table containing the detail data and the tables containing the categories, and then build the formula.
For more information about how to build formulas that use lookups, see Understanding the Use of Relationships and Lookups in Formulas.
Using Filters in Aggregations
A new feature in PowerPivot is the ability to apply filters to columns and tables of data, not only in the user interface and within a PivotTable or chart, but also in the very formulas that you use to calculate aggregations. Filters can be used in formulas both in calculated columns and in measures.
For example, in the new DAX aggregation functions, instead of specifying values over which to sum or count, you can specify an entire table as the argument. If you did not apply any filters to that table, the aggregation function would work against all the values in the specified column of the table. However, in DAX you can create either a dynamic or static filter on the table, so that the aggregation operates against a different subset of data depending on the filter condition and the current context.
By combining conditions and filters in formulas you can create aggregations that change depending on the values supplied in formulas, or that change depending on the selection of rows headings and column headings in a PivotTable.
For more information, see Filtering Data in Formulas.
The following table lists some of the standard aggregation functions provided by Excel, and provide links to the implementation of these functions in PowerPivot for Excel. The DAX version of these functions behaves much the same as the Excel version, with some minor differences in syntax and handling of certain data types.
Standard Aggregation Functions
Returns the average (arithmetic mean) of all the numbers in a column.
Returns the average (arithmetic mean) of all the values in a column. Handles text and non-numeric values.
Counts the number of numeric values in a column.
Counts the number of values in a column that are not empty.
Returns the largest numeric value in a column.
Returns the largest value from a set of expressions evaluated over a table.
Returns the smallest numeric value in a column.
Returns the smallest value from a set of expressions evaluated over a table.
Adds all the numbers in a column.
DAX Aggregation Functions
DAX includes aggregation functions that let you specify a table over which the aggregation is to be performed. Therefore, instead of just adding or averaging the values in a column, these functions let you create an expression that dynamically defines the data to aggregate.
The following table lists the aggregations functions that are available in DAX.
Averages a set of expressions evaluated over a table.
Counts a set of expressions evaluated over a table.
Counts the number of blank values in a column.
Counts the total number of rows in a table.
Counts the number of rows returned from a nested table function, such as filter function.
Returns the sum of a set of expressions evaluated over a table.
Differences between DAX and Excel Aggregation Functions
Although these functions have the same names as their Excel counterparts, they utilize the PowerPivot VertiPaq engine and have been rewritten to work with tables and columns. You cannot use a DAX formula in an Excel workbook, and vice versa. They can only be used in the PowerPivot window and in PivotTables that are based on PowerPivot data. Also, although the functions have identical names, the behavior may be slightly different. For more information, see the individual function reference topics.
The way that columns are evaluated in an aggregation is also different from the way that Excel handles aggregations. An example may help illustrate.
Suppose you want to get a sum of the values in the Amount column in the table Sales, so you create the following formula:
In the simplest case, the function gets the values from a single unfiltered column, and the result is the same as in Excel, which always just adds up the values in the column, Amount. However, in PowerPivot, the formula is interpreted as "Get the value in Amount for each row of the Sales table, and then add up those individual values. PowerPivot evaluates each row over which the aggregation is performed and calculates a single scalar value for each row, and then performs an aggregation on those values. Therefore, the result of a formula can be different if filters have been applied to a table, or if the values are calculated based on other aggregations that might be filtered. For more information, see Context in DAX Formulas.
In addition to the new table aggregation functions described in the previous section, DAX has aggregation functions that work with dates and times you specify, to provide built-in time intelligence. These functions use ranges of dates to get related values and aggregate the values. You can also compare values across date ranges.
The following table lists the time intelligence functions that can be used for aggregation.
Calculates a value at the calendar end of the given period.
Calculates a value at the calendar end of the period prior to the given period.
Calculates a value over the interval that starts at the first day of the period and ends at the latest date in the specified date column.
The other functions in the Time Intelligence function section (Time Intelligence Functions (DAX)) are functions that can be used to retrieve dates or custom ranges of dates to use in aggregation. For example, you can use the DATESINPERIOD function to return a range of dates, and use that set of dates as an argument to another function to calculate a custom aggregation for just those dates.