Data Analysis Expressions (DAX) Overview
The Data Analysis Expressions (DAX) language is a formula language that allows users to define custom calculations in PowerPivot tables (calculated columns) and in Excel PivotTables (measures). DAX includes some of the functions that are used in Excel formulas, and additional functions that are designed to work with relational data and perform dynamic aggregation.
This section explains the following concepts:
Where to use DAX formulas
How to create DAX formulas
Types of operations you can perform with DAX
DAX formulas are very similar to Excel formulas. To create one, you type an equal sign, followed by a function name or expression, and any required values or arguments. Like Excel, DAX provides a variety of functions that you can use to work with strings, perform calculations using dates and times, or create conditional values.
However, DAX formulas are different in the following important ways:
A DAX function always references a complete column or a table. If you want to use only particular values from a table or column, you can add filters to the formula.
If you want to customize calculations on a row-by-row basis, PowerPivot provides functions that let you use the current row value or a related value to perform calculations that vary by context.
DAX includes a type of function that returns a table as its result, rather than a single value. These functions can be used to provide input to other functions, thus calculating values for entire tables or columns.
Some DAX functions provide time intelligence, which lets you create calculations using meaningful ranges of dates, and compare the results across parallel periods.
Where to Use Formulas
You can use DAX formulas either in PowerPivot tables, or in PivotTables in Excel:
You can use formulas in calculated columns, by adding a column and then typing an expression in the formula bar. You create these formulas in the PowerPivot window. For more information, see Create a Calculated Column.
You can use formulas in measures. You create these formulas in Excel, by clicking Add Measure in an existing PowerPivot PivotTable or PivotChart. For more information, see Create a Measure.
The same formula can behave differently depending on whether the formula is used in a calculated column or a measure. In a calculated column, the formula is always applied to every row in the column, throughout the table. Depending on the row context, the value might change. In a measure, however, the calculation of results is strongly dependent on context. That is, the design of the PivotTable and the choice of row and column headings affects the values that are used in calculations. For more information, see Context in DAX Formulas.
Creating Formulas by Using the Formula Bar
PowerPivot, like Excel, provides a formula bar to make it easier to create and edit formulas, and AutoComplete functionality, to minimize typing and syntax errors.
To enter a name of a table Begin typing the name of the table. Formula AutoComplete provides a dropdown list containing valid names that begin with those letters.
To enter the name of a column Type a bracket, and then choose the column from the list of columns in the current table. For a column from another table, begin typing the first letters of the table name, and then choose the column from the AutoComplete dropdown list.
For a walkthrough of how to build formulas, see Build Formulas for Calculations.
Tips for Using AutoComplete
You can use Formula AutoComplete in the middle of an existing formula with nested functions. The text immediately before the insertion point is used to display values in the drop-down list, and all of the text after the insertion point remains unchanged.
Defined names that you create for constants do not display in the AutoComplete drop-down list, but you can still type them.
PowerPivot does not add the closing parenthesis of functions or automatically match parentheses. You must make sure that each function is syntactically correct or you cannot save or use the formula.
Using Multiple Functions in a Formula
You can nest functions, meaning that you use the results from one function as an argument of another function. You can nest up to 64 levels of functions in calculated columns. However, nesting can make it difficult to create or troubleshoot formulas.
Many PowerPivot functions are designed to be used solely as nested functions. These functions return a table, which cannot be directly saved as a result to the PowerPivot workbook; it must be provided as input to a table function. For example, the functions SUMX, AVERAGEX, and MINX all require a table as the first argument.
Some limits on nesting of functions exist within measures, to ensure that performance is not affected by the many calculations required by dependencies among columns.
The DAX function library is based on the Excel function library, but the libraries have many differences. This section summarizes the differences and similarities between Excel functions and DAX functions.
Many DAX functions have the same name and the same general behavior as Excel functions but have been modified to take different types of inputs, and in some cases, might return a different data type. Generally, you cannot use DAX formulas in an Excel workbook or use Excel formulas in a PowerPivot workbook without some modification.
DAX functions never take a range of cells or a range as reference, but instead DAX functions take a column or table as reference.
DAX date and time functions return a datetime data type. In contrast, Excel date and time functions return an integer that represents a date as a serial number.
Many of the new DAX functions either return a table of values or make calculations based on a table of values as input. In contrast, Excel has no functions that return a table, but some functions can work with arrays. The ability to easily reference complete tables and columns is a new feature in PowerPivot.
DAX provides new lookup functions that are similar to the array and vector lookup functions in Excel. However, the DAX functions require that a relationship is established between the tables.
DAX does not support the variant data type found in Excel. The data in a column is expected to always be of the same data type. If the data is not the same type, DAX changes the entire column to the data type that best accommodates all values.
You can import data into a PowerPivot worksheet from many different data sources that might support different data types. When you import or load the data into a workbook and then use the data in calculations or in PivotTables, the data is converted to one of the PowerPivot data types. For a list of the data types, see Data Types Supported in PowerPivot Workbooks.
The table data type is a new data type in DAX that is used as the input or output to many new functions. For example, the FILTER function takes a table as input and outputs another table that contains only the rows that meet the filter conditions. By combining table functions with aggregation functions, you can perform complex calculations over dynamically defined data sets. For more information, see Aggregations in Formulas.
The PowerPivot window is an area where you can work with multiple tables of data and connect the tables in a relational model. Within this model, tables are connected to each other by relationships, which let you create correlations with columns in other tables and create more interesting calculations. For example, you can create formulas that sum values for a related table and then save that value in a single cell. Or, to control the rows from the related table, you can apply filters to tables and columns. For more information, see Relationships Overview.
Because you can link tables by using relationships, your PivotTables can also include data from multiple columns that are from different tables.
However, because formulas can work with entire tables and columns, you need to design calculations differently than you do in Excel.
In general, a DAX formula in a column is always applied to the entire set of values in the column (never to only a few rows or cells).
Tables in PowerPivot must always have the same number of columns in each row, and all rows in a column must contain the same data type.
When tables are connected by a relationship, you are expected to make sure that the two columns used as keys have values that match, for the most part. Because PowerPivot does not enforce referential integrity, it is possible to have non-matching values in a key column and still create a relationship. However, the presence of blank or non-matching values might affect the results of formulas and the appearance of PivotTables.
When you link tables in your workbook by using relationships, you enlarge the scope, or context, in which your formulas are evaluated. For example, formulas in a PivotTable can be affected by any filters or column and row headings in the PivotTable. You can write formulas that manipulate context, but context can also cause your results to change in ways that you might not anticipate. For more information, see Context in DAX Formulas.
You can create formulas in PowerPivot either in calculated columns or in measures.
A calculated column is a column that you add to an existing PowerPivot table. Instead of pasting or importing values in the column, you create a DAX formula that defines the column values. If you include the PowerPivot table in a PivotTable (or PivotChart), the calculated column can be used as you would any other data column.
The formulas in calculated columns are much like the formulas that you create in Excel. Unlike in Excel, however, you cannot create a different formula for different rows in a table; instead, the DAX formula is automatically applied to the entire column.
When a column contains a formula, the value is computed for each row. The results are calculated for the column as soon as you create the formula. Column values are only recalculated if the underlying data is refreshed or if manual recalculation is used.
You can create calculated columns that are based on measures and other calculated columns. However, avoid using the same name for a calculated column and a measure, as this can lead to confusing results. When referring to a column, it is best to use a fully qualified column reference, to avoid accidentally invoking a measure.
A measure is a formula that is created specifically for use in a PivotTable (or PivotChart) that uses PowerPivot data. Measures can be based on standard aggregation functions, such as COUNT or SUM, or you can define your own formula by using DAX. A measure is used in the Values area of a PivotTable. If you want to place calculated results in a different area of a PivotTable, use a calculated column instead.
To create a measure, you must first add a PivotTable or Pivot Chart to your PowerPivot workbook. When you define a formula for a measure, nothing happens until you drop the measure into a PivotTable. When you add the measure, the formula is evaluated for each cell in the Values area of the PivotTable. Because a result is created for each combination of row and column headers, the result for the measure can be different in each cell.
The definition of the measure that you create is saved with its source data table. It appears in the PowerPivot Field List and is available to all users of the workbook.
Data refresh and recalculation are two separate but related operations that you should understand when designing a data model that contains complex formulas, large amounts of data, or data that is obtained from external data sources.
Refreshing data is the process of updating the data in your workbook with new data from an external data source. You can refresh data manually at intervals that you specify. Or, if you have published the workbook to a SharePoint site, you can schedule an automatic refresh from external sources.
Recalculation is the process of updating the results of formulas and calculated columns in your workbook to reflect any changes to the formulas and to reflect changes in the underlying data. Recalculation can affect performance in the following ways:
For a calculated column, the result of the formula must always be recalculated for the entire column, whenever you change the formula.
For a measure, however, the results of a formula are not calculated until the measure is placed in the context of the PivotTable or PivotChart. The formula will also be recalculated when you change any row or column heading that affects filters on the data or when you manually refresh the PivotTable.
For more information, see the following topics: