Adatis BI Blogs

Understanding DAX through the Power of Evaluation Context

If you really want to understand the behaviour of your DAX and be able to write more complex DAX, you need to understand Evaluation Context. The two really do go hand in hand. I have been asked many times to explain what Evaluation Context is, which is why I put this blog post together. To get the most out of this read, here is a list of questions you should be able to answer after reading this post:

What is Evaluation Context?
What is the difference between Filter Context and Row Context?
What elements are considered in the Filter Context?
What is Initial Filter Context (IFC) and how does it differ from Filter Context?
How do we read the Initial Filter Context?
Does Evaluation Context apply before the DAX formula evaluates?
Are Grand Totals/Sub Totals treated differently with Initial Filter Context?
What is Context Transition?
How is Context Transition applied?

What is Evaluation Context?

Evaluation Context is the environment in which a DAX formula is evaluated. So, when we write a DAX formula, it gets evaluated within an environment, and the result returned is directly dependent on that environment. This means the result can be different depending on the context, even if the DAX formula used is the same.

Same DAX formula, different result… huh?

If this is your first read on Evaluation Context, you may be wondering how it is possible to get different results from the same DAX formula. Let's make this clear before moving on. Take a look at the measure below:

Total Sales = SUM(Sales[ExtendedAmount])

When reading this, we interpret it as: "Show me the Total Sales Amount". But when we add this measure to either of the matrices below, we get a breakdown by Country and by Category. How is this possible? We never specified anything in the DAX formula relating to Country or Category, right? All we asked for was the "Total Sales Amount".
So, how do we get different values on each row?

It is important to understand why this happens in order to really understand DAX. The reason is the Evaluation Context: the environment in which a DAX formula is evaluated.

What types of Evaluation Context exist?

We have two types of Evaluation Context: Filter Context and Row Context.

Filter Context

Filter Context refers to the filtering that is applied to the tables within the data model. If the tables are joined, the filters propagate from the "one" side of the relationship to the "many" side. The Filter Context is applied by:

Rows in a Visual
Columns in a Visual
Slicers in a Report
Filters in a Report
Other on-screen visuals acting as filters in a Report
The CALCULATE Function

A DAX formula evaluated within the Filter Context is only evaluated once all filters have been applied to the data model. The deeper you go into Power BI, the more you will hear the term Filter Context. An additional term to be aware of is Initial Filter Context (IFC). This is the same as Filter Context, with the only difference being that it does not consider the CALCULATE function.

Filter Context in Action

Okay, so we "know" what Filter Context is from what we have read so far! But I believe we can do better, and the best way to do better is by looking at Filter Context in action. Below you will find four cases of Filter Context being applied which, once understood, are sure to boost your confidence in understanding this type of Evaluation Context. One thing worth noting: for each case below we will identify the Initial Filter Context, meaning we do not consider the CALCULATE function. The objective is to interpret all the elements on the report that make up the Initial Filter Context.
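To preview why the CALCULATE function is the odd one out in that list, below is a minimal sketch of a measure that adds its own filter on top of whatever the visual supplies. It reuses the Sales and Territory tables from the examples in this post; the measure name is my own.

```dax
-- [Total Sales] is the measure defined earlier: SUM ( Sales[ExtendedAmount] ).
-- CALCULATE modifies the filter context before the expression evaluates,
-- so this returns UK sales in every cell, regardless of rows, columns or slicers.
Total Sales UK =
CALCULATE ( [Total Sales], Territory[Country] = "United Kingdom" )
```

This is why the Initial Filter Context is defined as the filter context before CALCULATE has its say.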
1. Filter Context in Action: Row

Looking at the cell highlighted in green within the matrix below, let's understand how this value is evaluated. We must ask ourselves: what is the Initial Filter Context for this particular cell?

At the start of my Power BI journey, I used the below matrix to assist me in identifying the Initial Filter Context for a particular cell. I quickly grew out of this, but found it useful in triggering the right thinking. The Initial Filter Context for the highlighted cell is coming only from the Rows of the matrix, Territory[Country] = United Kingdom, as the below matrix displays.

We now know how to read the Initial Filter Context of a particular cell. Slowly you will start to mentally visualise how the Initial Filter Context is applied to the underlying model. To make more sense of this now, let's walk through the steps taken to propagate the Initial Filter Context:

1. All rows in the Territory table (dimension) are filtered to display rows for 'United Kingdom'.
2. Those filtered records are propagated down the relationship from the 'one' side to the 'many' side.
3. The Sales table (fact) is left with only those filtered records.

It is very important to understand that the above three steps always take place for each individual cell, and only once they do does the DAX formula evaluate.

2. Filter Context in Action: Row & Slicer

We must ask ourselves once again: "What is the Initial Filter Context for this particular cell?"

To read the Initial Filter Context for the cell highlighted above in green, this time we go beyond the Rows of the matrix. Use the below matrix to identify the Initial Filter Context. The Initial Filter Context is propagated through the below steps.
Once again, these steps always take place for each cell individually, and only once they do does the DAX formula evaluate:

1. All rows in the Territory table (dimension) are filtered to display rows for 'United Kingdom', and all rows in the Calendar table (dimension) are filtered to display rows for 'February'.
2. Those filtered records are propagated down the relationship from the 'one' side to the 'many' side.
3. The Sales table (fact) is left with only those filtered records.

3. Filter Context in Action: Slicer

Let's start off with the most important question: "What is the Initial Filter Context for this particular cell?"

The purpose of this example is to understand how Grand Totals and Sub Totals work with the Initial Filter Context. In the above example, the IFC is not being filtered by an individual row of Territory[Country]. In fact, the Grand Total is unfiltered by Territory[Country], but it does get filtered by a slicer in the report. It is important to understand that the IFC for the highlighted cell is only the slicer containing Calendar[MonthName], and not the Rows of the matrix with Territory[Country].

As always, the Initial Filter Context propagates as the below steps explain:

1. All rows in the Calendar table (dimension) are filtered to display rows for 'February'.
2. Those filtered records are propagated down the relationship from the 'one' side to the 'many' side.
3. The Sales table (fact) is left with only those filtered records.
4. Filter Context in Action: Rows, Columns, Slicer & Filter

For the last time, we need to ask the all-famous question: "What is the Initial Filter Context for this particular cell?"

The Initial Filter Context for the highlighted cell above comes from multiple elements, which are defined in the matrix below. Even though much more is happening in this example compared to the previous ones, the propagation of the Initial Filter Context still works through the same three steps, resulting in the below:

Tip: Filter Icon

A recent feature in Power BI called the 'Filter Icon' is a great way to identify the Initial Filter Context of a visual. Through a simple click, you can identify:

Slicers affecting the visual
Filters affecting the visual
Other on-screen visuals affecting the visual

But not the Rows and Columns of the matrix itself. Regardless, it is a very useful feature which you can use to identify the IFC.

Summary of Filter Context

What is the difference between Evaluation Context and Filter Context?
Answer: Filter Context is a type of Evaluation Context.

What is applied as part of the Filter Context?
Answer: Rows, Columns, Slicers, Filters, on-screen visuals and the CALCULATE function.

What is Initial Filter Context (IFC) and how does it differ from Filter Context?
Answer: Initial Filter Context (IFC) does not consider the CALCULATE function.

How do we read the Initial Filter Context?
Answer: "The IFC for this cell is Table[Column] = Value".

Does Evaluation Context apply before the DAX formula evaluates?
Answer: Yes.

Are Grand Totals/Sub Totals treated differently with Initial Filter Context?
Answer: No, they are not an exception. Remember, always 'read' the cell.

Does the Filter Icon display the IFC?
Answer: Partially; it displays everything apart from filters coming from Rows/Columns.

Row Context

Row Context is a special iterating mechanism which takes into consideration the row it is currently on at the time the DAX formula is evaluated. When thinking of Row Context, imagine each row in a table being a house.
The Row Context will be the postman. The evaluation of the DAX formula is the delivery being made. The postman (Row Context) must visit each house (row) in order to complete a delivery (evaluation). This is how I initially started thinking of Row Context. Row Context is applied when we use Calculated Columns, X-functions and the FILTER() function.

Row Context: Calculated Column

To make more sense of Row Context, let's start off with Calculated Columns, which apply Row Context. Take a look at the below Calculated Column:

Margin £ = Sales[SalesExcVAT] - Sales[SalesCost]

From our current understanding of Row Context, and looking at the above DAX formula, the first question that might come to mind is: "how does it know which row to start on?" The answer is simple. Row Context within a Calculated Column always starts with the first row and finishes on the last row. It iterates through the entire table the Calculated Column has been placed in, starting with the first row and evaluating the expression, then moving on to the second row and evaluating the expression, until it has repeated the iteration for all records in the table.

Looking at the below screenshot, we can see that the Calculated Column 'Margin £' has produced a value for each row. As this Calculated Column has been placed in the Sales table, the Row Context is applied against this table. The Sales table has a total of 55,327 rows, therefore the Row Context (postman) will visit each of the 55,327 records (houses) and evaluate the expression (make the delivery).

Row Context: X-Functions

We have spoken about Row Context and how this type of Evaluation Context is applied within Calculated Columns. Now let's move on to another way Row Context is applied: X-functions such as SUMX, RANKX, COUNTX, etc. Before we jump into an example, let me highlight this now.
The analogy used above for Calculated Columns, with the postman (Row Context) having to visit each and every house (record) in order to complete a delivery (evaluation), is the same with X-functions. There are two differences I am aware of. Firstly, the X-function requires us to specify the table it will iterate through; we therefore need to map the houses for the postman. Secondly, a Calculated Column is stored in your model, whilst X-functions are measures that are not physically stored and are only calculated at usage. If you are deciding between a Calculated Column and a measure and don't know which to choose, always remember that memory is your biggest asset when working with Power BI. For a more in-depth understanding, please check out the sources supplied below.

Margin £ SUMX = SUMX(Sales, Sales[SalesExcVAT] - Sales[SalesCost])

Take a look at the DAX formula above. The first thing worth noting is that we declared the 'Sales' table; we therefore pinpoint each row (house) that needs to be visited. Looking at the table above, the DAX formula will visit the first record and evaluate the expression:

Record 1 = 29.99 - 11.2163

Then it will move to the second record and repeat the process:

Record 2 = 28.99 - 10.8423

Then it will move to the third record and repeat the process:

Record 3 = 28.99 - 10.8423

…until it has repeated the same process for every record.

Summary of Row Context

When is Row Context applied?
Answer: Calculated Columns, X-functions and the FILTER function.

From which row does Row Context begin?
Answer: In a Calculated Column, the first record of its table; otherwise, the first record of the table specified.

What is the core difference between Row Context and Filter Context?
Answer: Row Context works on a single row at a time and is aware of each individual row, whilst Filter Context refers to the filters applied when a DAX formula is evaluated.

Why can't I use an aggregator such as SUM within Row Context?
Answer: Row Context does not apply Filter Context by default; it must be enabled.

What is Context Transition?
Answer: The process of transforming Row Context into Filter Context, so that filters propagate down from the 'one' side to the 'many' side of related tables.

Row Context does not consider Filter Context

A very important part of Row Context is understanding that it does not automatically apply Filter Context. When I first started to write DAX, I remember using a SUM within a Calculated Column. The values in the column were clearly incorrect. Since then I have seen many people experience the same outcome at least once, until figuring out that a measure does the trick. But this is not a trick, and we should look to understand why it happens.

To better understand Context Transition, let's work through an example. Below is the model we will be working with, a simplified version of the earlier examples. If we add the below DAX formula as a Calculated Column within the Customers table, what do you think we would get back as a result?

Total Sales Exc. VAT Calculated Column = SUM(Sales[Sales Exc. VAT])

Here is the result: all rows repeat the value 5435. This value is the total of 'Sales Excluding VAT' across the entire Sales table. The above result proves that Row Context does not automatically apply Filter Context. This occurs because the Row Context iterates (visits) every record in the Customers table and asks, "show me the Total Sales Amount", but no Filter Context exists. This means no filters are applied to the Sales table, at all!
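One way to see what the plain SUM is missing is to write the per-row filter by hand with an X-function. The sketch below assumes a CustomerKey column joining the Customers and Sales tables; that column is my assumption about the model rather than something shown in the post.

```dax
-- Hypothetical: CustomerKey is an assumed key column on both tables.
-- FILTER keeps only the sales rows belonging to the customer on the current row,
-- which is exactly the filtering the plain SUM never received.
Total Sales Exc. VAT Explicit =
SUMX (
    FILTER ( Sales, Sales[CustomerKey] = Customers[CustomerKey] ),
    Sales[Sales Exc. VAT]
)
```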
To ensure we understand this behaviour, let's take it step by step:

Go to record 1 in the Customers table. What is the Total Sales Amount for record 1? The Sales table has no filters, therefore return all sales.
Go to record 2 in the Customers table. What is the Total Sales Amount for record 2? The Sales table has no filters, therefore return all sales.
Go to record 3 in the Customers table. What is the Total Sales Amount for record 3? The Sales table has no filters, therefore return all sales.

The all-important question here is: how do we apply Filter Context in the Row Context? This is done through Context Transition. Context Transition is outside the scope of this read; however, we will scratch the surface of the concept. I would highly recommend you check out the sources below for a deeper dive into Context Transition.

How do we apply Filter Context to Row Context?

Row Context does not consider Filter Context. This is so important to remember that it is worth repeating. To convert Row Context to Filter Context we must use the power of the CALCULATE function! The CALCULATE function with no filter parameters has the job of applying Context Transition. Look at the same DAX formula from above, only this time wrapped within the CALCULATE function:

Total Sales Exc. VAT (Calculated Column) = CALCULATE(SUM(Sales[Sales Exc. VAT]))

As you can see below, we get the correct result. The CALCULATE function enables Filter Context; it converts the Row Context into Filter Context. The power of the CALCULATE function! In more detail, the filters on the Customers table propagate down the relationship to the Sales table for each record.

Summary

If someone truly wants to learn Power BI, DAX is something that should not be avoided, and for this reason Evaluation Context should be high on the to-learn list. This article should have armed you with the knowledge needed to interpret DAX behaviour, as well as to more easily digest in-depth articles on Evaluation Context. @Data_Laz

Sources:
Matt Allington (2018). Supercharge Power BI. Holy Macro! Books.
Marco Russo & Alberto Ferrari (2015). The Definitive Guide to DAX. Microsoft Press.

SSRS and DAX

Tools like Power BI have changed reporting, allowing power users to leverage tabular cubes to present information quicker and without the (perceived) need for developers. However, experience tells us many users still want data in tables with a myriad of formatting and display rules. Power BI is not quite there yet in terms of providing all this functionality in the same way that SSRS does. For me, SSRS's great value, and at the same time its curse, is the sheer amount of customisation a developer can do. I have found that almost anything a business user demands in terms of formatting and display is possible. But you have invested your time and money in a tabular SSAS model which plays nicely with Power BI, and your users want SSRS reports. So how do you get to your data? Using DAX, of course. Using EVALUATE, SUMMARIZECOLUMNS and SELECTCOLUMNS you can return data from a tabular model in a tabular format, ready to be read as a dataset in SSRS.

If you had the following data in your tabular model:

ProductName       | ProductCategory | Country        | Sales
Surly Steamroller | Complete bike   | France         | £46,575
Genesis Day One   | Frame           | France         | £47,987
Genesis Day One   | Frame           | France         | £47,987
Genesis Vapour 20 | Complete bike   | France         | £24,867
Genesis Vapour 20 | Complete bike   | United Kingdom | £21,856
Genesis Day One   | Complete bike   | United Kingdom | £47,875
Surly Steamroller | Complete bike   | United Kingdom | £27,969
Surly Steamroller | Complete bike   | United States  | £46,575
Genesis Day One   | Frame           | United States  | £47,987
Genesis Day One   | Complete bike   | United States  | £47,987

A pattern for returning data for SSRS could be:

DEFINE
    VAR FilterCountry = @country
EVALUATE
SELECTCOLUMNS (
    SUMMARIZECOLUMNS (
        'Product'[ProductName],
        'ProductCategory'[ProductCategory],
        'Geography'[Country],
        FILTER (
            VALUES ( 'Geography'[Country] ),
            OR (
                FilterCountry = "All",
                PATHCONTAINS ( FilterCountry, 'Geography'[Country] )
            )
        ),
        "Sales", [Sales]
    ),
    "Product Name", [ProductName],
    "Product Category", [ProductCategory],
    "Country", [Country],
    "Sales", [Sales],
    "Estimated VAT", [Sales] * 0.2
)

To step through this:

The DEFINE part declares a variable which will take a value from an SSRS parameter.
SUMMARIZECOLUMNS returns a cross join of all the dimension columns entered; by passing in a measure or fact column as the last part of the block (here [Sales]), it returns only the combinations of dimension values where there is a value for that measure or fact item, which is particularly useful if you are dealing with a date table stretching into the future.
FILTER filters the SUMMARIZECOLUMNS output based on the SSRS parameter value. By adding the OR and PATHCONTAINS sections you can handle multi-value parameters in SSRS. This is a good blog post explaining this approach.
SELECTCOLUMNS allows you both to provide friendly names for any dimensions (this is better handled in the tabular model, but think of the number of times it isn't) and also to perform calculations within your DAX query.

If your SSRS parameter was set to France and United Kingdom, the results would be:

Product Name      | Product Category | Country        | Sales   | Estimated VAT
Surly Steamroller | Complete bike    | France         | £46,575 | £9,315.00
Genesis Day One   | Frame            | France         | £47,987 | £9,597.40
Genesis Day One   | Frame            | France         | £47,987 | £9,597.40
Genesis Vapour 20 | Complete bike    | France         | £24,867 | £4,973.40
Genesis Vapour 20 | Complete bike    | United Kingdom | £21,856 | £4,371.20
Genesis Day One   | Complete bike    | United Kingdom | £47,875 | £9,575.00
Surly Steamroller | Complete bike    | United Kingdom | £27,969 | £5,593.80

If you set up a connection to your tabular cube in SSRS, you can paste in your DAX code, configure your parameters and you are good to go!
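To see the multi-value handling in isolation, here is a hedged sketch with the parameter value hard-coded. PATHCONTAINS treats its first argument as a pipe-delimited path, so the SSRS parameter needs to arrive as a single pipe-separated string (for example, via a Join expression on the multi-value parameter); the values below are mine, not from the post.

```dax
EVALUATE
VAR FilterCountry = "France|United Kingdom"  -- SSRS would supply this string
RETURN
    FILTER (
        VALUES ( 'Geography'[Country] ),
        OR (
            FilterCountry = "All",
            PATHCONTAINS ( FilterCountry, 'Geography'[Country] )
        )
    )
```

With the sample data above, this would return France and United Kingdom but not United States, while a parameter value of "All" would return every country.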

Working with Manual and Automated KPIs in a Tabular Cube

In a recent project, we had to produce a scorecard using a set of manual and calculated KPIs. To obtain the manual KPI figures we used Master Data Services (MDS), where the users could insert the values, while for the calculated KPIs we used the base measures created in the tabular cube. So far, this requirement does not look very complicated. However, what if I tell you that the same KPI can either be manually entered or automatically calculated by the cube? And that we have to present the values for different levels of a hierarchy? And that some of the KPIs are not absolute values but ratios? Now that I have your attention, let's have a look at the solution we implemented.

How to join manual and automated KPIs?

Because the client couldn't always provide the data to calculate the base measures, we delivered an MDS model to, among other functionalities, manually insert the numeric values. You can check this blog if you want to know more about the Survey model. Because we were working with different markets, the same KPI could either be manually or automatically calculated, which means the cube had to select the appropriate scenario depending on the selected market. In order to achieve this requirement, we created three measures:

AutKPI – Using base measures from multiple tables, we defined the DAX code to calculate the KPI.
ManKPI – Knowing all the MDS values were in one table, we defined a simple DAX query to sum the values.
Actual – This measure was implemented with an IF statement. E.g.:

Actual := IF ( ISBLANK ( [AutKPI] ), [ManKPI], [AutKPI] )

How to aggregate ratio KPIs?

Let's have a look at the example below, where we are calculating the KPI for two levels of a geography hierarchy.

Automated KPI 1 – Europe is naturally aggregating the values from Great Britain and Ireland.
Automated KPI 2 – Considering we are using base measures, the cube can properly calculate the KPI at Europe level.
Manual KPI 1 – All manual entries were aggregated with a SUM. Because those are absolute values, the figure for Europe is correct.
Manual KPI 2 – Following the same logic as Manual KPI 1, we can see the Europe value is incorrect. Because this is a ratio, we can't simply aggregate the values from the lower levels. The simplest approach to resolve this problem would have been a new calculation using an AVERAGE function; however, considering the requirements, we had to introduce a weighted average.

Implementing weighted averages

The first step of this approach is to define a weight for each market. Since the values can change according to the users' needs, we added a new entity to the MDS model. Now let's consider the example below showing the weighted approach. Following is the formula to calculate the KPI at Europe level; for a better understanding, I split it into different steps:

C1 GB: GB Manual KPI x GB Weight
C1 Ireland: Ireland Manual KPI x Ireland Weight
C2: C1 GB + C1 Ireland
C3: GB Weight + Ireland Weight
Europe KPI: C2 / C3

For example, if GB scores 80% with a weight of 0.7 and Ireland scores 60% with a weight of 0.3, the Europe KPI is (0.80 x 0.7 + 0.60 x 0.3) / (0.7 + 0.3) = 74%.

The scope of the project stated we had to implement the following logic:

· When presenting the KPIs at market level, don't apply the weighting.
· When presenting the KPIs at region level, apply the weighting, but only for the ratio KPIs.

The biggest challenge of this requirement was to overwrite the aggregation logic of the geography hierarchy. To achieve that, we implemented a dynamic segmentation pattern on the ratio KPIs (more details on this link). This approach can be split into four steps.

The first step is the calculation of our numerator.

Num Weighted AutKPI – Because the base measures of our automated KPIs come from different tables, we had to first group our data by market and region level and only then apply the calculation. E.g.:

Num Weighted AutKPI :=
CALCULATE (
    SUMX (
        SUMMARIZE ( 'KPI Value', Market[Region], Market[Market] ),
        [AutKPI] * [KPI Weight]
    )
)

Num Weighted ManKPI – In this instance, the grouping was not necessary because we only had one measure to consider. E.g.:

Num Weighted ManKPI :=
CALCULATE (
    SUMX (
        'KPI Value',
        CALCULATE ( SUM ( 'KPI Value'[KPIActual] ) * [KPI Weight] )
    ),
    'KPI'[KPI] = "Manual KPI"
)

The second step is the calculation of our denominator.

Den Weighted AutKPI – Once again, because the weights were stored in a single table, no grouping was necessary.

Den Weighted AutKPI :=
CALCULATE (
    SUMX ( 'KPI Value', CALCULATE ( [KPI Weight] ) ),
    'KPI'[KPI] = "Automated KPI"
)

Den Weighted ManKPI – The same logic applies in this instance.

Den Weighted ManKPI :=
CALCULATE (
    SUMX ( 'KPI Value', CALCULATE ( [KPI Weight] ) ),
    'KPI'[KPI] = "Manual KPI"
)

The third step is the division of our measures.

Weighted AutKPI := DIVIDE ( [Num Weighted AutKPI], [Den Weighted AutKPI] )

Weighted ManKPI := DIVIDE ( [Num Weighted ManKPI], [Den Weighted ManKPI] )

The fourth step is the calculation of our Weighted Actual measure by, once again, using an IF function:

Weighted Actual := IF ( ISBLANK ( [Weighted AutKPI] ), [Weighted ManKPI], [Weighted AutKPI] )

Finally, considering we only wanted to use the weighted measures for a subset of our KPIs, we created a new measure using a SWITCH function. E.g.:

Actuals :=
SWITCH (
    VALUES ( 'KPI'[KPI] ),
    "Percentage KPI", [Weighted Actual],
    "Percentage KPI2", [Weighted Actual],
    "Absolute KPI", [Actual],
    "Absolute KPI2", [Actual]
)

Hopefully, I was able to clearly demonstrate our problem and how we managed to implement a solution to solve it. As always, if you have any questions or comments, do let me know.

DAX Calculated Tables in Power BI

Calculated Tables have been around for a while in Power BI, but today I found a real-life scenario for using them. I connected to Manchester United's Facebook page and plugged the data into Microsoft's Cognitive Services. In essence, I want to measure sentiment and find out how many times a supporter has mentioned one of Manchester United's rival football teams. You are probably wondering what this has to do with a DAX Calculated Table, so let me explain. I have pulled down the Facebook data (from the API), but when trying to undertake a GROUP BY for Likes, Loves, etc. in the Query Editor, Power BI hangs and the query never resolves. Whilst I cannot pinpoint exactly why this happens, I would guess that the number of API calls to Facebook is exceeded and some form of timeout occurs. This blog will walk you through how to create a DAX Calculated Table and apply a Group By to get the count of reaction types. There are a number of articles already out there showing examples of a Calculated Table, and I have provided the links at the bottom of the post.

Existing Query

Currently, my query looks like the below. The only remaining task is to apply a COUNT of all records, GROUPED BY Reactions.Type and id. If I try to use the Query Editor functionality within the UI, the transformation step never completes. I am left with the following message in the bottom right-hand side of the Query Editor. After waiting two hours for the GROUP BY query to resolve, I gave up. The alternative is to use a DAX Calculated Table, and I will show you how I achieved this.

Calculated Table

In order to create a Calculated Table, come out of the Query Editor, navigate to the Modeling tab and select New Table. Now we can write some DAX. Pasting the below syntax into the new table will achieve the Group By on the 'ReactionsManUnited' query.
ReactionTotalsManUnited =
GROUPBY (
    ReactionsManUnited,
    ReactionsManUnited[id],
    ReactionsManUnited[reactions.type],
    "TotalReactions", COUNTX ( CURRENTGROUP (), ReactionsManUnited[reactions.type] )
)

Let me break down the code:

o   The Calculated Table is named 'ReactionTotalsManUnited'.
o   The GROUPBY function groups all reaction Ids ('id') and types ('reactions.type').
o   The COUNTX function is applied over reaction type, using the CURRENTGROUP() function to ensure the count is made by Id and Type within the 'ReactionsManUnited' table.

Finally, to test that the new DAX table works, I created a basic KPI Card. It is aggregating exactly as expected.

Conclusion

Let's recap. I have shown you how to use three DAX expressions, albeit nested together in one statement, which demonstrates how powerful and flexible the language is:

o   GROUPBY
o   COUNTX
o   CURRENTGROUP

I made use of the Calculated Table functionality due to poorly performing queries made to the Facebook API. There are many other reasons for using them, with some good examples provided in Chris Webb's blog. Where possible, you should always use the Query Editor (and the M language) for ad hoc transformations, although a DAX expression can sometimes get around slow-performing queries. DAX measures are evaluated at run time and in memory, whereas the Query Editor needs to pull down and refresh data after every applied step. I would strongly recommend that all budding Power BI developers learn DAX, in order to get the most out of your Power BI reports. The Calculated Table function is just one of over 200 different expressions within Power BI.

Further Reading

o   Microsoft MSDN
o   Power BI Blog
o   Reza Rad's Blog
o   Chris Webb's blog
o   List of DAX Expressions (Paul Turley's blog)

Contact Me

If you would like a copy of the workbook or have any questions about this blog, please leave a comment below or contact me on Twitter (@DataVizWhizz).
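For comparison, the same grouping can be sketched with SUMMARIZE and COUNTROWS instead of GROUPBY/CURRENTGROUP. This is an alternative I would expect to behave equivalently for a simple count, not something from the original post.

```dax
-- Alternative sketch: group by id and reactions.type, counting the rows in each group.
-- SUMMARIZE evaluates the "TotalReactions" expression in the filter context of each group.
ReactionTotalsAlternative =
SUMMARIZE (
    ReactionsManUnited,
    ReactionsManUnited[id],
    ReactionsManUnited[reactions.type],
    "TotalReactions", COUNTROWS ( ReactionsManUnited )
)
```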

New Features in SQL Server 2016 – Part 2: Analysis Services

This blog focuses solely on the new Analysis Services features of SQL Server 2016 CTP2. For anyone who may have missed it, click here to view my opening blog on the Database Engine. Although there have not been any major Analysis Services (SSAS) enhancements in CTP2, it is pleasing that Microsoft are still looking to improve this part of the BI stack. The majority of the enhancements seem to be geared towards Tabular and DAX, although there are plans to release further Multidimensional functionality in CTP2 and beyond. There are five key enhancements for SSAS:

1.      Tabular model partitions (Tabular).
2.      Many to Many relationships (Tabular).
3.      Analysis Services PowerPivot mode.
4.      New DAX functions (Tabular).
5.      Enable/disable attribute hierarchies.

Tabular Model Partitions

SQL Server SSAS 2016 CTP2 includes new parallel processing functionality for tables with two or more partitions, increasing processing performance. There are no configuration settings for this. More information on this feature is displayed below:

1.      In SSAS Tabular, partitioning big data marts will help drastically when processing new data. Instead of having to process all data, you can just process the relevant partitions.
2.      In order to create, manage and process partitions in SQL Server Management Studio, you must have the appropriate Analysis Services permissions defined in a security role.
3.      Parallel processing happens automatically when processing a table with more than one partition, although you can choose to independently process a partition if required.

Many to Many Dimensions (Tabular)

There isn't too much to show or say here yet, apart from the fact that this feature is FINALLY HERE! Gone are the complications of workarounds (thanks to Marco Russo and Alberto Ferrari); we can now just use the Many to Many functionality in Tabular.
Multidimensional already has this feature, which is one of the reasons why developers have yet to buy in to Tabular. Unfortunately, Microsoft have yet to reveal much detail on how this feature will work. I just hope it performs as well as the current workarounds.

Analysis Services PowerPivot Mode

Installing PowerPivot for SharePoint is now a lot easier. If you already know how to configure SharePoint 2013, this will help you greatly. Taken from the MSDN Microsoft website, you simply use the Install Wizard and do the following:

1.      Select 'Analysis Services' from the standard instance features.
2.      Choose 'Analysis Services server mode' and configure administrators (screenshot below).

New DAX Functions

There are many new DAX functions, which are all detailed on the MSDN website – click here for further detail. I have picked out five functions that will be highly beneficial when creating an SSAS Tabular cube. I am looking at this from a very BI-focused background, so I would recommend looking at all of the new functions to see if they benefit your needs.

1.      TOPN
        a.      Not actually a new function, but it has been updated.
        b.      Now allows TRUE/FALSE/ASC/DESC to specify the sorting direction.
2.      CALENDAR
        a.      Returns a table with a single column named "Date" that contains a contiguous set of dates, from the specified start date to the specified end date.
        b.      See similar – CALENDARAUTO.
3.      MEDIAN
        a.      Very simple – returns the median of the numbers in a column.
        b.      See similar – MEDIANX (uses an expression).
4.      NATURALINNERJOIN
        a.      Inner join of a table with another table, joined on common columns in the two tables. If the two tables have no common column names, an error is returned.
        b.      See similar – NATURALLEFTOUTERJOIN.
5.      SUMMARIZECOLUMNS
        a.      Enables you to group data together and return a summary table.
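To make a couple of these functions concrete, here is a short hedged sketch; the Sales[Amount] and Geography[Country] columns are invented for the example and are not part of the post.

```dax
-- CALENDAR: a calculated table with a single "Date" column covering 2016.
DateTable = CALENDAR ( DATE ( 2016, 1, 1 ), DATE ( 2016, 12, 31 ) )

-- MEDIAN: the median of a numeric column (hypothetical Sales[Amount]).
Median Sales := MEDIAN ( Sales[Amount] )

-- SUMMARIZECOLUMNS: group and summarise in one query.
EVALUATE
SUMMARIZECOLUMNS ( Geography[Country], "Total Sales", SUM ( Sales[Amount] ) )
```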
Enable/Disable Attribute Hierarchies

Again, this functionality is tailored towards SSAS Tabular. There is a new setting that ties back to the Tabular mode metadata:

1. ColumnUsage property
   a. DAXUsage
      i. Attribute hierarchies can only be used in measures.
   b. UnrestrictedUsage
      i. Can be set in the XMLA (in SSDT) or by using an ALTER statement against the Tabular model.
      ii. An example taken from MSDN:

<Alter>
  <ObjectDefinition ...>
    <Database>
      <Dimensions>
        <Dimension>
          <Attributes>
            <Attribute>
              <ddl500:ColumnUsage value="ddl500_500">
                DAXUsage | UnrestrictedUsage
              </ddl500:ColumnUsage>

Multidimensional Features

The full SQL Server 2016 release will include some key enhancements to Multidimensional SSAS. These include:

1. Netezza as a data source (Netezza Data Warehouse | IBM - NDM Technologies).
2. General performance improvements.
3. Unnatural hierarchies.
4. Distinct counts.
5. DBCC support – checks the logical and physical integrity of objects in the specified database.
6. Exposing on-premises Multidimensional cubes in the cloud with Power BI.

I will provide further information on the improvements as and when Microsoft announce them.

References

For more information on all of the new SSAS SQL Server 2016 features, the below resources/blogs are highly recommended:

- Official Microsoft Page
- Brew your own Business Intelligence Blog
- Jorg Klein SQL Server Blog

SQL PASS Summit 2014 – Kick Off

Day 1 has kicked off in Seattle, a remarkable city. Having arrived a week early, I’ve had plenty of time to check out the sights and the food and have enjoyed it immensely - a brilliant venue for PASS!

There were a few announcements at this morning’s Keynote, mostly regarding Azure. Azure SQL Databases are gaining larger index handling, parallel queries, extended events and in-memory columnstore for data marts. Joseph Sirosh gave a talk about Machine Learning & Information Management, showing a cool example of Azure Stream Analytics using Microsoft Kinect sensor information of customer interactions in a shop being uploaded straight into Power Map! I am looking forward to hearing more on Machine Learning. There are also a handful of great improvements for Power BI. I am most looking forward to the new Live Ops Dashboards and drill-through actions! Combo Reports also look promising…

Moving onto the first session, I chose to attend ‘What’s new in Microsoft Power Query for Excel’. As it turns out, there’s not a massive amount of new stuff – some new data sources and a tick box when you do a merge to remove name prefixes. However, one of these new sources is the long-awaited Analysis Services data source. The new ODBC data source is a great addition too. There was a mention of the possibility of a decoupled M-query SSIS component! We probably won’t hear anything until later in 2015, unfortunately. I would say this was not a level 300 session – more like 100/200.

The second session was ‘SQL Server 2014 Always On (High Availability and Disaster Recovery)’: a nice overview of what was brought in in 2012 and the newer 2014 features, including an increased maximum number of secondary replicas, increased availability of readable secondaries and the Add Azure Replica Wizard. Despite not being a DBA and it being a level 300 session, I found it easy to follow and absorbable.
I feel many of the DBAs in the room may not have taken away any information they did not already know, however.

Niko Neugebauer gave a fantastic, in-depth session on ‘ETL Patterns with Clustered Columnstore Indexes’. It was a fast-moving talk, despite the time spent waiting for some executions, and it demanded your full attention! Definitely worthy of its 400 level. It left me a little tired for Marco Russo’s ‘DAX Patterns’ session, which showed some examples of workarounds for common Tabular weaknesses, like distinct counts for Type 2 slowly changing dimensions and cumulative quantities.

Overall it was a strong day. I am looking forward to tomorrow. More to follow…

KPIs in Power View and DAX Queries for KPIs

I was recently approached by a client where we have assisted in implementing a Microsoft Analysis Services Tabular solution and a supporting management pack of SSRS reports. They asked about utilising some of the KPIs defined in their Tabular cube in the SSRS reports using DAX queries. I was at a bit of a loss, and most web help seemed to suggest it couldn’t be done: Blog comment from 2012

However, with a bit of time to investigate, it appears Power View does now support Tabular KPIs, and as such a DAX query should be able to pull them out. My KPI base measure is named “Highest Ranking” and I am able to view its value, status or goal in Power View. A little work with every SQL developer’s best friend, Profiler, and I had what I was looking for. It appears that when you create a KPI, SSAS defines some measures in the background which do some of the work of the MDX KPI functions. Therefore the following DAX query against my cycling demo cube returns the expected results and could be used in SSRS reporting:

EVALUATE
ADDCOLUMNS(
    VALUES('DimRider'[Name]),
    "Highest_Ranking", 'FactRanking'[Highest Ranking],
    "Highest_Ranking_Goal", 'FactRanking'[_Highest Ranking Goal],
    "Highest_Ranking_Status", 'FactRanking'[_Highest Ranking Status]
)
ORDER BY 'DimRider'[Name]

I also tested for the existence of other KPI functions such as Trend and Weight, but these do not appear to be present. It is also interesting that creating a KPI over a measure does not change the measure name but just groups it in the field list, so there is no need for a value function. For more info on KPIs in Power View there is additional documentation here. I am currently unsure of the required versions for this functionality, so if it is missing from your environment I would love to hear from you.
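As an aside, the kind of three-state value the hidden status measure returns can be illustrated outside SSAS. The Python sketch below is purely illustrative - in SSAS the status comes from the KPI's own status expression, and the 95% tolerance band here is invented for the example:

```python
def kpi_status(value, goal, tolerance=0.95):
    """Illustrative three-state KPI status, similar in spirit to the
    hidden '_<measure> Status' measure: 1 = on target, 0 = close to
    target, -1 = off target. The 'tolerance' band is invented."""
    if value >= goal:
        return 1
    if value >= goal * tolerance:
        return 0
    return -1

print(kpi_status(105, 100))  # on target -> 1
print(kpi_status(96, 100))   # within the band -> 0
print(kpi_status(50, 100))   # off target -> -1
```

A real KPI's status expression may use different thresholds entirely; the point is only that status is a small derived measure alongside value and goal, which is why Profiler shows it as a separate background measure.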

RANKX–Analysis Services 2012 Tabular

RANKX was a new function made available in DAX at the launch of SQL Server 2012; ranking is a widely used function in most BI projects, and it was very much anticipated. Having worked extensively with Tabular on a recent project, we had the need to rank over large sets of data, with models based on billions of rows. Whilst we were extremely impressed with the performance of Tabular, we found something not quite right with RANKX.

Suppose we had the following very simple format of data, and wanted to perform a simple rank over it:

CountryKey  Date        Units  SalesValue
53          24/01/2012  62     81.290619
156         03/01/2012  3575   5186.565208
48          07/05/2012  321    362.836524
157         16/04/2012  7      8.005637
134         19/08/2012  20     44.12
232         29/07/2012  40     49.313156
178         05/03/2012  47     61.25
63          26/03/2012  51     73.369509
153         17/06/2012  3061   6026.4
31          08/07/2012  3622   5005.038257
140         13/02/2012  9870   7713.968564
14          27/05/2012  29220  46728.26302
98          20/08/2012  297    377.653459
131         30/07/2012  172    182.591931
48          27/03/2012  348    411.801792
157         06/03/2012  3      8.01422

In order to rank the above data by country over all time, we would probably write a simple DAX query as follows:

DEFINE
  MEASURE FactSales[UnitCount] = SUM(FactSales[Units])
  MEASURE FactSales[Rank] = RANKX(ALL(FactSales[CountryKey]), FactSales[UnitCount])
EVALUATE
  SUMMARIZE(
    FactSales,
    FactSales[CountryKey],
    "ProductSalesUnits", FactSales[UnitCount],
    "Rank", FactSales[Rank]
  )
ORDER BY
  FactSales[UnitCount] DESC

You can see from the query above that, to avoid any doubt, we only rank over a single table and have no joins to other tables at all. This works fine, resulting in the data below.
FactSales[CountryKey]  [ProductSalesUnits]  [Rank]
227  91818113  1
226  21487836  2
39   9033389   3
14   7597054   4
82   4656585   5
109  3678302   6
75   3371681   7
140  2511239   8
107  1850520   9
208  1443241   10
31   1092047   11
153  997860    12
201  914827    13
59   870366    14
163  840927    15

Now assume we rank over a lot more data and wish to apply a simple CALCULATETABLE in order to filter some data out. We may write a DAX query as follows:

DEFINE
  MEASURE FactSales[UnitCount] = SUM(FactSales[Units])
  MEASURE FactSales[Rank] = RANKX(ALL(FactSales[CountryKey]), FactSales[UnitCount])
EVALUATE
  CALCULATETABLE(
    SUMMARIZE(
      FactSales,
      FactSales[CountryKey],
      "ProductSalesUnits", FactSales[UnitCount],
      "Rank", FactSales[Rank]
    ),
    DATESBETWEEN('Date'[CalDate], DATE(2012,1,1), DATE(2012,8,31))
  )
ORDER BY
  FactSales[UnitCount] DESC

In this instance we are joining to a Date dimension, but that is it. The above query yields the below result:

FactSales[CountryKey]  [ProductSalesUnits]  [Rank]
227  83147875  1
226  19446567  2
39   8137335   3
14   6769390   4
82   4100168   5
109  3321496   6
75   3010890   7
140  2165554   8
107  1655472   9
208  1274740   10
31   987122    11
153  908938    12
201  853322    13
59   771677    14

That works; we get the results we would expect within seconds. Now, instead of ranking over an integer field, let's apply the RANKX function to a real number.
In this example we get a worrying result set, using the DAX as follows:

DEFINE
  MEASURE FactSales[SalesValue] = FactSales[Sales Value]
  MEASURE FactSales[Rank] = RANKX(ALL(FactSales[CountryKey]), FactSales[SalesValue])
EVALUATE
  CALCULATETABLE(
    SUMMARIZE(
      FactSales,
      FactSales[CountryKey],
      "ProductSalesValue", FactSales[SalesValue],
      "Rank", FactSales[Rank]
    ),
    DATESBETWEEN('Date'[CalDate], DATE(2012,1,1), DATE(2012,8,31))
  )
ORDER BY
  FactSales[SalesValue] DESC

Which outputs:

FactSales[CountryKey]  [ProductSalesValue]  [Rank]
227  84074007.25   1
226  29928143.25   3
14   10859628.74   4
39   8451588.111   4
109  7964922.769   6
82   6254219.85    6
75   4730390.37    7
107  2466064.97    9
208  1904009.18    10
140  1862708.961   11
153  1311217.35    11
22   1207366.72    13
59   1182179.95    15

Now let's be clear: all we have done is change the measure from an integer to a float; the rest of the data is the same. You will notice that there are ties in the data where there should not be. Having scratched our heads for hours, rebuilt the model, rewritten the DAX, and had a number of colleagues check it over, we found that when no CALCULATETABLE is applied we get the correct answer again, as follows:

DEFINE
  MEASURE FactSales[SalesValue] = FactSales[Sales Value]
  MEASURE FactSales[Rank] = RANKX(ALL(FactSales[CountryKey]), FactSales[SalesValue])
EVALUATE
  SUMMARIZE(
    FactSales,
    FactSales[CountryKey],
    "ProductSalesValue", FactSales[SalesValue],
    "Rank", FactSales[Rank]
  )
ORDER BY
  FactSales[SalesValue] DESC

FactSales[CountryKey]  [ProductSalesValue]  [Rank]
227  92885561.31   1
226  33237033.3    2
14   12253005.68   3
39   9414266.358   4
109  8928147.606   5
82   7120811.54    6
75   5296490.13    7
107  2756500.54    8
208  2146627.18    9
140  2115750.609   10
153  1479858.53    11
22   1339255.82    12
59   1324799.98    13
105  1320651.83    14
31   1277065.779   15

So what we have learnt here is that RANKX seems to give the wrong answer, but only when ranking over real numbers, and only when we filter the data set in some
way using CALCULATETABLE. Not being able to find a clear reason for this behaviour, we eventually gave in and raised it with Microsoft. Having spent a week or so working it through with the great support team at Microsoft, it seems that this is a current “feature” of RANKX. It is believed to be a floating-point arithmetic issue driven by how floating-point numbers are stored. It is documented here: There is also some further information documented here: I have been told by Microsoft that this is the same as it is in Excel and Analysis Services; however, this still does not explain why the CALCULATETABLE makes a difference. I am still working with Microsoft to see if we can get to the bottom of it.

However, if precision beyond 15 significant figures within the rank is not important to you (up to 99 billion with 2DP), this issue is very easily worked around. Take the following DAX query:

DEFINE
  MEASURE FactSales[UnitCount] = FactSales[Royalty Value]
  MEASURE FactSales[Rank] = RANKX(ALL(FactSales[CountryKey]), ROUND(FactSales[UnitCount], 2))
EVALUATE
  CALCULATETABLE(
    SUMMARIZE(
      FactSales,
      FactSales[CountryKey],
      "ProductSalesValue", FactSales[UnitCount],
      "Rank", FactSales[Rank]
    ),
    DATESBETWEEN('Date'[CalDate], DATE(2012,1,1), DATE(2012,8,31))
  )
ORDER BY
  FactSales[UnitCount] DESC

Which now correctly outputs:

FactSales[CountryKey]  [ProductSalesValue]  [Rank]
227  84074007.25   1
226  29928143.25   2
14   10859628.74   3
39   8451588.111   4
109  7964922.769   5
82   6254219.85    6
75   4730390.37    7
107  2466064.97    8
208  1904009.18    9
140  1862708.961   10
153  1311217.35    11
22   1207366.72    12
59   1182179.95    14
105  1196551.61    13
31   1132926.109   15

You will notice that all I have done is ROUND the measure in the RANKX function to two decimal places. This stops the calculation engine from ranking over an approximated value and forces it to rank on a decimal with fewer than 15 significant figures, which in turn fixes the ranking.
Further to this, rounding the values before they are loaded into the model will also work; just be sure to keep below 15 significant figures if possible. So, in summary, it seems that the calculation engine, when using the RANKX function, does not handle the float data type as we might expect. This can easily be rectified by losing some precision on the rank measure you define. We have been informed that using the CURRENCY function will also fix the problem. I hope this stops others going through the same pain. As and when I get any further information on this issue, I will post it up.
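The underlying floating-point behaviour is easy to reproduce outside the Tabular engine. In this Python sketch (illustrative values, not the model above), two ways of arriving at "the same" total produce numbers that are not bit-identical - the kind of discrepancy a change of evaluation path can surface - so a rank built on exact comparison sees spurious distinct ranks, while rounding before ranking restores the expected result:

```python
# Two computations of 'the same' business value that differ in the
# last binary digit - the root cause of the unstable RANKX results.
total_a = 0.1 + 0.2   # 0.30000000000000004
total_b = 0.3
print(total_a == total_b)  # False

def rank(measure_by_key, decimals=None):
    """Dense rank, highest value first. Rounding before comparing is
    the same trick as wrapping the RANKX measure in ROUND(..., 2)."""
    key = (lambda v: round(v, decimals)) if decimals is not None else (lambda v: v)
    ordered = sorted({key(v) for v in measure_by_key.values()}, reverse=True)
    return {k: ordered.index(key(v)) + 1 for k, v in measure_by_key.items()}

print(rank({"A": total_a, "B": total_b}))              # {'A': 1, 'B': 2} - spurious split
print(rank({"A": total_a, "B": total_b}, decimals=2))  # {'A': 1, 'B': 1} - tie restored
```

This also shows why the problem only appears with floats: integer arithmetic in doubles below 2^53 is exact, so the integer Units measure never produces two almost-equal totals.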

Allocations in PowerPivot Using DAX

Although I didn't mention it, the inspiration for my last post was the excellent MDX Solutions 2nd Edition. Whilst flicking through the book a while back, I thought it would be interesting to see how DAX and PowerPivot could handle the so-called 'Common Calculations and Selections in MDX'. This post continues that theme, focusing on one calculation in particular from the aforementioned MDX book, namely 'Unweighted Allocations down the Hierarchy'.

Essentially this business problem deals with allocating data that is entered at a higher level (e.g. Quarter) down to a lower level (e.g. Day). It varies from business to business what the users actually want to see at the day level in this situation. Some expect to see the quarter amount, some want to see a blank value, and some users want the amount in the quarters allocated down to the day level. If the expectation is that data should be allocated, then one way of doing the allocation is to use a ratio. In the MDX book, this is achieved by the following MDX:

1.0 /
Descendants(
    Ancestor(
        [Date].[Calendar].CurrentMember,
        [Date].[Calendar].[Calendar Quarter]
    ),
    [Date].[Calendar].CurrentMember.Level,
    SELF
).Count

If we put this into a query and run it against the Calendar hierarchy in AdventureWorksDW, we get the following results: Therefore our quarter-entered value can simply be multiplied by this ratio to get the correct value at each level in the Calendar hierarchy.

PowerPivot

So the challenge is how to do this in PowerPivot. I started off by looking for some data in AdventureWorksDW that would fit the bill. A good candidate seems to be the FactSalesQuota table, which I imported into PowerPivot along with its related dimensions. Once in the pivot table, I wanted to be able to calculate the correct ratio at the daily and monthly levels. For the daily level this is 1/90, as there are 90 days in Q1 of that year, and for the month level it is 1/3, as there are 3 months in the quarter.
Given that there's no MDX-style Ancestor() function in DAX, I ended up having to have a different calculation at each level. Something that I learnt from this forum post is that you need to use IF() in order to isolate a calculation at a particular level. Therefore the DAX that I came up with was:

=IF(
    COUNTROWS(VALUES('DimDate'[FullDateAlternateKey])) = 1,
    1 / CALCULATE( COUNTROWS('DimDate'), ALL('DimDate'[FullDateAlternateKey], 'DimDate'[EnglishMonthName]) ),
    1 / CALCULATE( COUNTROWS( DISTINCT('DimDate'[EnglishMonthName]) ), ALL('DimDate'[EnglishMonthName]) )
)

The COUNTROWS(VALUES('DimDate'[FullDateAlternateKey])) = 1 allows me to check that we're at the day level, and if we are, we count the number of days in the quarter. Otherwise, if we're at the month level, we calculate the number of months in the current quarter. When this DAX is put into a new measure we get the following results: This produces the same results as the MDX statement.

The next step was then to apply this ratio to the Sales Amount Quota for the current quarter. This was achieved using the following DAX:

=IF(
    COUNTROWS( VALUES('DimDate'[FullDateAlternateKey]) ) < 90,
    CALCULATE( SUM('FactSalesQuota'[SalesAmountQuota]),
        ALL('DimDate'[FullDateAlternateKey], 'DimDate'[EnglishMonthName]) ) * 'FactSalesQuota'[Ratio],
    SUM('FactSalesQuota'[SalesAmountQuota])
)

The < 90 at the start of the statement allows me to check if we're at the day level or the month level. If we're at either of these levels, then we want to pick up the quarter-level SalesAmountQuota and multiply it by the ratio. Otherwise, at the Quarter and Year levels, we just apply the original SalesAmountQuota value. This produces the following results:

Conclusion

The use of the MDX Scope statement would have been nice to pinpoint exactly where I wanted the calculation to occur, but then again, the use of IF() keeps it quite simple, which is better for an Excel power user.
Whilst MDX is very powerful, can you imagine a business user getting to grips with MDX statements such as Scope or Freeze? The bottom line is that the DAX calculation produces exactly what I want - the new AllocatedSalesQuota measure is correct at all levels in the Calendar hierarchy. Even with an IF() statement, like everything else in PowerPivot, the calculation performs very quickly.
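The allocation itself is simple arithmetic, and can be sanity-checked outside PowerPivot. The Python sketch below uses invented figures, with a 90-day quarter of 3 months as in the post, and applies the same unweighted 1/n ratio at each level:

```python
def allocate_down(parent_amount, children):
    """Unweighted allocation: split an amount entered at a higher level
    equally across its descendants at a lower level (ratio = 1/n)."""
    ratio = 1.0 / len(children)
    return {child: parent_amount * ratio for child in children}

quota = 90000.0  # invented quarter-entered quota
months = allocate_down(quota, ["Jan", "Feb", "Mar"])            # ratio 1/3
days = allocate_down(quota, [f"day{d}" for d in range(1, 91)])  # ratio 1/90
print(months["Jan"], days["day1"])  # 30000.0 1000.0
```

Note that the allocated amounts sum back to the original quarter figure at every level, which is the property the DAX measure above preserves.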

DAX Closing Balances

One of the often-required calculations for users in reports is a point-in-time closing balance, which is especially relevant when dealing with stock levels or financial data. In the Analysis Services world, at least in Enterprise Edition, we're lucky that we have a variety of aggregation functions that can deal with semi-additive measures such as a closing balance. Or we can always turn to MDX if we need to, which will achieve the same thing. In PowerPivot we don't have aggregation functions, but there is DAX, which has a variety of time intelligence functions up its sleeve. The idea behind this post is to see how PowerPivot would deal with the requirement of reporting a closing balance measure.

DAX Time Intelligence Functions

One of the DAX functions that we can use for our closing balance is called ClosingBalanceMonth(), which will simply evaluate an expression at the end of a given month. There are also the similar ClosingBalanceQuarter() and ClosingBalanceYear(). Having noticed these functions within PowerPivot, my first port of call was to set up a simple PowerPivot model by taking data from the FactInternetSales table in AdventureWorksDW, plus all its related dimensions of course. What I ended up with was the following simple pivot table in Excel 2010:

Sales Amount obviously isn't a semi-additive measure, but I'm treating it as one in this example. I started by using ClosingBalanceMonth() to try to get the value for the month to be equal to the value for the last day in the month. I managed to come up with the following formula:

=CLOSINGBALANCEMONTH(
    SUM('FactInternetSales'[SalesAmount]),
    'DimDate'[FullDateAlternateKey],
    ALL('DimDate'[FullDateAlternateKey])
)

This produces the results that are shown below, in the measure called ClosingBalanceFormula: It does a good job of calculating a closing balance for each month, but it presents that monthly closing balance on each day.
Therefore, I started looking for an alternative approach and found the LastDate() function. It returns the last date that's in context for the passed date column. The last date in context at the month level will be the last date in the month, but thankfully the last date in context at the day level will be the current day. Therefore I came up with the following formula:

='FactInternetSales'[Sum of SalesAmount](LASTDATE('DimDate'[FullDateAlternateKey]))

This produced the results that I wanted - the daily closing balance at the day level and the monthly closing balance at the month level. A bit more digging led me to find that a combination of the Calculate() and LastDate() functions gets to the same result, as an alternative. That formula is:

=CALCULATE(
    SUM('FactInternetSales'[SalesAmount]),
    LASTDATE('DimDate'[FullDateAlternateKey])
)

Conclusion

It's a shame that the ClosingBalanceMonth() formula didn't quite work, but perhaps getting that approach to work will come out in the wash with more DAX experience. However, as with MDX or any kind of expression language, there are always going to be several different ways to get to the same result. It's much more efficient to work knowing that you've got several ways around a particular problem, rather than being forced down a particular path, and in fact I think the range of calculation functions available shows the richness of the DAX language.
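The LastDate() idea boils down to "evaluate the measure only on the last date visible in the current context", which can be sketched in plain Python (the daily balances are invented for illustration):

```python
# Invented day -> closing balance figures
balances = {
    "2024-01-30": 120.0,
    "2024-01-31": 150.0,
    "2024-02-01": 90.0,
}

def closing_balance(dates_in_context):
    """Semi-additive aggregation: instead of summing over the visible
    dates, take the value on the last visible date - the same idea as
    CALCULATE(SUM(...), LASTDATE(...)) in the formula above."""
    return balances[max(dates_in_context)]

# Month-level context: all visible days of January -> balance on the 31st.
print(closing_balance(["2024-01-30", "2024-01-31"]))  # 150.0
# Day-level context: a single visible day -> that day's balance.
print(closing_balance(["2024-02-01"]))  # 90.0
```

This is why the same formula works at both levels: the "last date in context" collapses to the month end when a whole month is visible, and to the day itself when only one day is.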