Key Concepts in DAX
This section provides you with a basic understanding of the concepts and terminology needed to succeed with Data Analysis Expressions (DAX) formulas. If you are already familiar with Excel functions, we recommend that you read the final section, which explains concepts that are new to DAX, such as filtering and context.
A DAX formula is comprised of an equal sign followed by a function or expression.
Functions perform operations such as concatenating or adding values, calculating sums or averages, or performing logical tests. Functions usually take some kind of argument, which might be a reference to a column or table. Functions can be nested inside other functions.
An expression can be used to define a value that can be a literal value or constant, a Boolean test, or a reference to a column containing values. Boolean expressions can be used to define a filter condition, such as [Sales] > 100.
Operators within expressions, such as a plus or minus sign, indicate how the values are to be compared or processed.
Values that you use in formulas and expressions can be typed directly into the formula bar as part of an expression, or they can be obtained from other columns, tables, or formulas. However, you cannot reference only a few cells or a range of cells; DAX always works with complete columns or tables.
For example, the following formulas are all valid:
The value of Amount in the All Sales table.
110% of the value in the Amount column of the current table.
The value of the constant pi.
='FALSE' = 0
A formula with nested functions that returns the sum of the SalesAmount for only the rows in the Sales table where the region is Europe.
For more information, see the following topics:
The DAX function library is based on the Excel function library, but the libraries have many differences. This section summarizes the differences and similarities between Excel functions and DAX functions.
Many DAX functions have the same name and the same general behavior as Excel functions but have been modified to take different types of inputs, and in some cases, might return a different data type. Generally, you cannot use DAX formulas in an Excel workbook or use Excel formulas in a PowerPivot workbook without some modification.
DAX functions never take a range of cells or a range as reference, but instead DAX functions take a column or table as reference.
DAX date and time functions return a datetime data type. In contrast, Excel date and time functions return an integer that represents a date as a serial number.
Many of the new DAX functions either return a table of values or make calculations based on a table of values as input. In contrast, Excel has no functions that return a table, but some functions can work with arrays. The ability to easily reference complete tables and columns is a new feature in PowerPivot.
DAX provides new lookup functions that are similar to the array and vector lookup functions in Excel. However, the DAX functions require that a relationship is established between the tables.
DAX does not support the variant data type found in Excel. The data in a column is expected to always be of the same data type. If the data is not the same type, DAX changes the entire column to the data type that best accommodates all values.
You can import data into a PowerPivot worksheet from many different data sources that might support different data types. When you import or load the data into a workbook and then use the data in calculations or in PivotTables, the data is converted to one of the PowerPivot data types. For a list of the data types, see Data Types Supported in PowerPivot Workbooks.
The table data type is a new data type in DAX that is used as the input or output to many new functions. For example, the FILTER function takes a table as input and outputs another table that contains only the rows that meet the filter conditions. By combining table functions with aggregation functions, you can perform complex calculations over dynamically defined data sets. For more information, see Understanding Aggregations in Formulas.
There are many new DAX functions that output tables, take tables as input, or do both. Because a table can have a single column, table-valued functions also take single columns as inputs. DAX includes the following types of table-valued functions:
Filter functions Return a column, table, or values related to the current row.
Aggregation functions Aggregate any expression over the rows of a table.
Time intelligence functions Return a table of dates, or use a table of dates to calculate an aggregation.
Understanding how to use these table-valued functions is important for fully utilizing DAX formulas. For more information, see Function Reference (DAX).
The PowerPivot window is an area where you can work with multiple tables of data and connect the tables in a relational model. Within this model, tables are connected to each other by relationships, which let you create correlations with columns in other tables and create more interesting calculations. For example, you can create formulas that sum values for a related table and then save that value in a single cell. Or, to control the rows from the related table, you can apply filters to tables and columns. For more information, see Understanding Relationships.
Because you can link tables by using relationships, your PivotTables can also include data from multiple columns that are from different tables.
However, because formulas can work with entire tables and columns, you need to design calculations differently than you do in Excel.
In general, a DAX formula in a column is always applied to the entire set of values in the column (never to only a few rows or cells).
Tables in PowerPivot must always have the same number of columns in each row, and all rows in a column must contain the same data type.
When tables are connected by a relationship, you are expected to make sure that the two columns used as keys have values that match, for the most part. Because PowerPivot does not enforce referential integrity, it is possible to have non-matching values in a key column and still create a relationship. However, the presence of blank or non-matching values might affect the results of formulas and the appearance of PivotTables. For more information, see Referential Integrity, PowerPivot Relationships, and the Unknown Member.
When you link tables in your workbook by using relationships, you enlarge the scope, or context, in which your formulas are evaluated. For example, formulas in a PivotTable can be affected by any filters or column and row headings in the PivotTable. You can write formulas that manipulate context, but context can also cause your results to change in ways that you might not anticipate. For more information, see Context in DAX Formulas.
You can create formulas in PowerPivot either in calculated columns or in measures.
A calculated column is a column that you add to an existing PowerPivot table. Instead of pasting or importing values in the column, you create a DAX formula that defines the column values. If you include the PowerPivot table in a PivotTable (or PivotChart), the calculated column can be used as you would any other data column.
The formulas in calculated columns are much like the formulas that you create in Excel. Unlike in Excel, however, you cannot create a different formula for different rows in a table; instead, the DAX formula is automatically applied to the entire column.
When a column contains a formula, the value is computed for each row. The results are calculated for the column as soon as you create the formula. Column values are only recalculated if the underlying data is refreshed or if manual recalculation is used.
You can create calculated columns that are based on measures and other calculated columns.
A measure is a formula that is created specifically for use in a PivotTable (or PivotChart) that uses PowerPivot data. Measures can be based on standard aggregation functions, such as COUNT or SUM, or you can define your own formula by using DAX. A measure is used in the Values area of a PivotTable. If you want to place calculated results in a different area of a PivotTable, use a calculated column instead.
To create a measure, you must first add a PivotTable or Pivot Chart to your PowerPivot workbook. When you define a formula for a measure, nothing happens until you drop the measure into a PivotTable. When you add the measure, the formula is evaluated for each cell in the Values area of the PivotTable. Because a result is created for each combination of row and column headers, the result for the measure can be different in each cell.
The definition of the measure that you create is saved with its source data table. It appears in the PowerPivot Field List and is available to all users of the workbook.
Formulas in PowerPivot can be affected by the filters applied in a PivotTable, by relationships between tables, and by filters used in formulas. Context is what makes it possible to perform dynamic analysis. Understanding context is important for building and for troubleshooting formulas.
There are different types of context: row context, query context, and filter context.
Row context can be thought of as "the current row.” If you have created a calculated column, the row context consists of the values in each individual row and values in columns that are related to the current row. There are also some functions (EARLIER and EARLIEST) that get a value from the current row and then use that value while performing an operation over an entire table.
Query context refers to the subset of data that is implicitly created for each cell in a PivotTable, depending on the row and column headers.
Filter context is the set of values allowed in each column, based on filter constraints that were applied to the row or that are defined by filter expressions within the formula.
The following topic explains the different types of context in more detail: Context in DAX Formulas.
Data refresh and recalculation are two separate but related operations that you should understand when designing a data model that contains complex formulas, large amounts of data, or data that is obtained from external data sources.
Refreshing data is the process of updating the data in your workbook with new data from an external data source. You can refresh data manually at intervals that you specify. Or, if you have published the workbook to a SharePoint site, you can schedule an automatic refresh from external sources.
Recalculation is the process of updating the results of formulas and calculated columns in your workbook to reflect any changes to the formulas and to reflect changes in the underlying data. Recalculation can affect performance in the following ways:
For a calculated column, the result of the formula must always be recalculated for the entire column, whenever you change the formula.
For a measure, however, the results of a formula are not calculated until the measure is placed in the context of the PivotTable or PivotChart. The formula will also be recalculated when you change any row or column heading that affects filters on the data or when you manually refresh the PivotTable.
For more information, see the following topics: