Adatis BI Blogs

Setting Up The Power BI Analysis Services Connector

The Power BI Analysis Services Connector is used in order to expose a Tabular model to Power BI allowing end users to consume data from the model directly for building of reports and ad-hoc analysis. The setup of the connector is very straightforward however you will should bear the following in mind - Only Tabular models are supported with the connectors – you will not be able to use this to enable reporting from a multidimensional database. - The Analysis Services Connector performs best if it is hosted on the same server that hosts the Tabular model. - The speed of the internet connection between the server running the Analysis Server Connector and the Power BI service is crucial to performance. - You can’t run the Analysis Services Connector on a server also running either the Data Management Gateway or the Power BI Personal Gateway. Installation Steps 1. Download the connector from 2. Run the Analysis Services Connector Setup Program on the machine hosting the Tabular model. 3. Once the installation has completed you will be given the option to launch the connector     4. Enter the login details required to connect to Power BI 5. If the details are correct you will see the below screen: 6. Enter the account details required to connect to the Tabular Instance – clearly this needs to be an account with access to read from the Tabular model. 7. Give the connection a suitable name and a friendly error message to be displayed to users in the case that the connection fails This should complete the wizard, the next step is to log onto the Power BI site – if all has gone well you should see the model as per the below.

Prototyping Analysis Services Cubes

After running a modelling workshop with end users, my next activity on a project is generally to produce a prototype model with some test data, in order to check with users that the planned star schema design will work. My tool of choice for this job tends to be Power Pivot, but I was presented with a slightly different solution when working with a client recently. This got me thinking it would be worth contrasting the two approaches…. Power Pivot First of all, lets start off with Power Pivot. The goal here is to quickly produce a prototype to verify the star schema design. I tend to just request that a business user brings along an Excel workbook with samples of each dimension and facts pre-extracted in Excel, which can then be loaded into Power Pivot. I find that this saves time, although as an alternative of course Power Pivot could extract directly from SQL Server, Oracle etc if needed. As an example, we can quickly walk through creating a very simple model to create a star schema: I’ve copied the data that represents my single fact table and related dimensions into Excel. In this case it’s to model a simple Sales Order Transactions fact, therefore my Excel workbook has dimensions such a Product, Customer, Sales Territory etc, as we well as the data that represents the fact table: After completing and closing the workbook, we now create a new workbook and open the Power Pivot window. Within the Power Pivot window, the next step is to import the data from the Excel workbook created in step 1. This can be done by choosing From Other Sources->Excel workbook and then browsing to the file. Power Pivot will now present us with all of the sheets that it finds in the workbook, so we select the sheets that represent the dimensions and facts: The next step is to setup relationships between the various tables. Once this is done, we have a simple prototype data model, as shown below: After a bit of tidying up (e.g. creating some hierarchies and removing unwanted columns), we can now connect to a Pivot Table in Excel that will help verify the star schema: They key point here is that we have rapidly built a prototype, without the need for any ETL. This means we can quickly cover the design issues with the users, but this time with their actual data, rather than just on a whiteboard. Analysis Services The alternative approach that was presented to me recently is to develop the Analysis Services cube/Tabular model before carrying out the ETL. Essentially this means that the users connect to cube, as a production system, unaware that under the hood the ETL is not yet complete. How is this achieved? Essentially by putting logic in the SQL view layer. Its a well known Analysis Services best practice to bind your SSAS objects (e.g. dimensions, measure groups) to SQL Server views, rather than using objects within the DSV. Rather than each view pulling its data from a complete dimension or fact table, instead each view would pull its data from a staging area or a copy of the source database. The idea being that, over time, the contents of each view would be updated to point to the actual dimensions and facts, once they are built. Therefore a normal view (in this case for a product dimension) that feeds Analysis Services might look like: SELECT ProductKey, ProductName, ProductCategory, ProductSubCategory FROM Dim.Product With the ‘No ETL’ approach, the view looks very different, as any transformation and cleaning will have to be carried out from the view: SELECT ROW_NUMBER() OVER(ORDER BY PROD.ProductID) AS ProductKey, PROD.Name AS ProductName, CAST(ISNULL(SUB.Name, 'Unknown') AS VARCHAR(100)) AS ProductSubCategory, CAST(ISNULL(CAT.Name, 'Unknown') AS VARCHAR(100)) AS ProductCategory FROM [$(AdventureWorks)].Production.Product PROD LEFT JOIN [$(AdventureWorks)].Production.ProductSubcategory SUB ON SUB.ProductSubcategoryID = PROD.ProductSubcategoryID LEFT JOIN [$(AdventureWorks)].Production.ProductCategory CAT ON CAT.ProductCategoryID = SUB.ProductCategoryID This therefore incurs some technical debt, as the cube gets built before the ETL or even the physical dimension table. But the idea is that you can get the users using the cube, on production even, then gradually replace the views with proper ETL and dimension tables once you have some more feedback. Conclusion The Power Pivot method is tried and tested for me many times – I find that its very quick to get up and running. The Analysis Services approach is not quite as quick to get up and running, but offers the benefit that the cube can continually evolve in a fairly agile manner. Its early days for me using this approach, but the a disadvantage is that the views can get quite complex, impacting performance depending on the data volumes. Both methods probably have their place. If I wanted to validate my design, I think I would continue to produce a quick Power Pivot model. On the other hand, if quick delivery to production is a must, and the complexity/data volumes are not an issue, then I would use the view approach.

Analysis Services Dimension Processing Errors

I’ve seen a few data quality issues recently in a some client data warehouses/marts where I’ve been asked to carry out some Analysis Services work. These data quality issues have often led to the classic SSAS dimension processing error, namely: “Errors in the OLAP storage engine: A duplicate attribute key has been found when processing: Table: 'dbo_DimProduct', Column: 'SubCategory', Value: 'Other'. The attribute is 'Sub Category'” Normally this error is easy to track down, but there are a few scenarios that I’ve encountered where SSAS attribute properties actually make it harder for you to troubleshoot. Scenario 1 – Trimming When hitting the above error, the first thing to do would be to run a query to see where the duplicate attribute is. In my example, I’ve just produced a very simple dimension that’s loosely based on Adventure Works – here are the attribute relationships: Therefore the first place to start would be to write a query that checks if we have any Sub Categories that are duplicated across Categories, such as: SELECT SubCategory FROM DimProduct GROUP BY SubCategory HAVING COUNT(DISTINCT Category) > 1 In my case, this actually yields no results. So what’s happening here? We can’t find any duplicates in SQL, but SSAS thinks there are duplicates. Well….Analysis Services dimension attributes contain a property called Trimming, which, by default, will remove spaces at the end of an attribute key or name. Needless to say, a dimension shouldn’t actually contain leading or trailing spaces, but in this specific scenario, the ETL wasn’t as robust as it could be. In the actual example I encountered, there was a tab at the end of one of the attribute names. To visualise that in the DimProduct example that I’ve created, the highlighted attribute has a tab at the end: Scenario 2 – NullProcessing The next scenario I want to highlight is the same error, but a different dimension. Again I’m going to create a simple example to illustrate what actually happened. This time I’ve got a customer dimension table, as follows: The error is: A duplicate attribute key has been found when processing: Table: 'dbo_DimCustomer', Column: 'CustomerCountryKey', Value: '0'. The attribute is 'Customer Country' A quick query on the above data will reveal that a CustomerCountryKey of 0 doesn’t actually exist twice for the same country, so what’s happened here? Again its due to a dimension property, this time on the attribute key, called NullProcessing. As its default is Automatic, it means that Nulls will get converted to zero during dimension processing. As we already have a member with CustomerCountryKey of 0, then we get an error. ETL and Data Quality These sort of issues highlight just why handling data quality during the ETL is so important! For the first scenario, some basic trimming/cleaning can fix the issue before it hits Analysis Services. As for the last issue, this is partly due to data warehouse/mart design. Personally I would never allow a dimension attribute to be null, partly because you shouldn’t present null attributes to a user, but also due to the lack of control that can result as shown above. Therefore a bit of design work up front to ensure that you have a robust data model is essential. Interestingly the properties that I’ve mentioned don’t exist for tables in Analysis Services Tabular. Whereas I think there are a few SSAS MD features that need to make it over into the next version of Tabular, I can personally live without the two properties outlined above.

Tabular Analysis Services Perspectives and Measure Groups

I’ve always liked how Excel and other client tools deal with measure groups in a multi dimensional SSAS cube. Once you connect to the cube, you get a drop down in the Pivot Table fields pane that lets you choose which measure group you want. From there, you get a filtered list of measures and dimensions. For example, if I connect to the Adventure Works 2012 sample cube, I get the following: By picking Internet Orders, I will see only the measures dimensions that relate to Internet Orders, which provides me with a good way of navigating a large cube. Tabular Unfortunately though, if I connect to a Tabular model (e.g. the sample Adventure Works 2012 model that I got from codeplex), then if I click the same drop down, then I will see every table in the entire model, whether the table  is a dimension, a fact or something else. E.g. here for a tabular model than contains 3 fact tables, but 15 tables in total, then I see all 15 tables: This isn’t ideal in my opinion, as I’d much rather that users had the ability to quickly jump to an area of interest, which measure groups achieve fairly well. As I have 3 fact tables, I would expect to see those 3 fact tables/measure groups in the drop down. I did accept this as a quirk of tabular until I stumbled across something recently. If you connect to a tabular perspective, then Excel behaves exactly as Multi Dimensional does, i.e. it shows you only measure groups in the drop down. To illustrate I’ve added a new perspective to the model that contains all tables and I’ve called this perspective ‘Adventure Works’: Now I can connect to this new perspective via Excel. Remember the perspective contains all my tables, so it shouldn’t be any different than connecting the model itself: The perspective does give a different result though – now only the actual measure groups are displayed to me in the drop down, which is much more user friendly: Summary This is completely different to the way that perspectives work in multi dimensional. When making MD cubes, I wouldn’t always need perspectives, just because the measure groups provided a ‘natural’ way of the users picking the subset of the cube that was of interest to them. Now with tabular, it seems that using a perspective will actually improve the user experience.

Debug MDX with Generate

On occasion you may find that your MDX queries don’t quite give you the result that you expect, which sometimes may be due to the way in which a certain bit of MDX has changed the context of part of your query. This can sometimes be tough to figure out, especially with functions like Sum() or Filter(), whereby we are looping over a set and then only returning the end product. For example, take the following query, where I’m trying to find out the count of UK male customers who have bought Road Bikes in both the current and the previous year: WITH MEMBER [Measures].[Repeat Customer Count] AS SUM( [Customer].[Customer].[Customer].Members, IIF([Measures].[Internet Sales Amount] <> 0 AND ([Measures].[Internet Sales Amount], [Date].[Calendar Year].PrevMember) <> 0, 1, NULL) ) SELECT {[Measures].[Repeat Customer Count]} ON 0, [Date].[Calendar Year].[Calendar Year] ON 1 FROM [Adventure Works] --Road Bikes, UK, Male WHERE ([Product].[Subcategory].&[2], [Sales Territory].[Sales Territory Country].&[United Kingdom], [Customer].[Gender].&[M]) It returns the following result set, which I know to be incorrect: Unlike debugging either of the axes, where of course we can see what gets returned, debugging examples such as the above might be difficult. The MDX Generate function can help here, as it works in a similar way to Sum(), by iterating over a set and then evaluating an argument once per iteration. The idea is that Generate will build us a new set, by using the looping context of the first set that you pass it. In this case, however, we can use it to return us a concatenated string based on our input set. What I’ve done here is to use the CurrentMember of attributes that I think might be causing problems (Customer, Gender) and then just print them out as a string in order to see if the attributes have the context that I expect. Therefore I’ve replaced the ‘1’ in my original query with the string of: " | " + [Customer].[Customer].CurrentMember.MEMBER_CAPTION + ", " + [Customer].[Gender].CurrentMember.MEMBER_CAPTION The full query and results then become: GENERATE( [Customer].[Customer].[Customer].Members, IIF([Measures].[Internet Sales Amount] <> 0 AND ([Measures].[Internet Sales Amount], [Date].[Calendar Year].PrevMember) <> 0, " | " + [Customer].[Customer].CurrentMember.MEMBER_CAPTION + ", " + [Customer].[Gender].CurrentMember.MEMBER_CAPTION, NULL) ) SELECT {[Measures].[Repeat Customer Count]} ON 0, [Date].[Calendar Year].[Calendar Year] ON 1 FROM [Adventure Works] --Road Bikes, UK, Male WHERE ([Product].[Subcategory].&[2], [Sales Territory].[Sales Territory Country].&[United Kingdom], [Customer].[Gender].&[M]) I can spot straight away that female customers have been returned in 2007, which is due to the fact that sets specified inside the WITH MEMBER section are not evaluated in the context of the slicer. In this case the problem can be solved by using the EXISTING keyword inside the Sum, so that the customers get sliced by the gender attribute: SUM( EXISTING [Customer].[Customer].[Customer].Members, IIF([Measures].[Internet Sales Amount] <> 0 AND ([Measures].[Internet Sales Amount], [Date].[Calendar Year].PrevMember) <> 0, 1, NULL) ) This won’t be applicable in all MDX debugging scenarios, but hopefully it will help someone out.

Analysis Services Member Properties and Excel

When working with users who are browsing a cube using Excel 2010, a common requirement that I find is that the users want to create tabular or grid based reports. E.g. “I have a list of currencies, I want to display the currency code in an adjacent column”. Or “Here are my account codes, I want to also see the account type”. The common way that I see users attempt this is to stack attribute hierarchies on top of one another. For example, using Adventure Works, to get the currency names and codes together, I can stack the two attribute hierarchies: This produces the following report: Not exactly what we want – ideally we want the currency codes in separate columns, on the same line as the name. An Alternative Approach One well known way to achieve this layout is to right click on the relevant member and display its member properties, whereas another is to set the layout to Tabular Form. I tend to prefer the member properties route as I think it produces slightly better reports, so I’ll be covering a simple member properties design tip shortly. First of all, here’s the sort of report that I’m able to create using member properties: I’m able to do this because the Adventure Works Source Currency Code Attribute Hierarchy has the Source Currency (names) as a member property. Therefore, I get the following option when I right click on the currency code: Displaying member properties will only work if the attribute relationships are set up correctly. The reason that I’m able to see the member property ‘Source Currency’ is because of the following attribute relationship: However, if we just display the currency name, using the Source Currency attribute hierarchy and then try and right click to see the currency code, we’ll unfortunately see nothing in the ‘Show Properties in Report’ submenu. This is because the currency name is a property of the code, rather than the other way around. I’d argue that for a user it’s intuitive to get the name from the code, or vice versa, as they sit at the same level. With a few simple changes we can achieve this, essentially by adding an extra attribute to the dimension… Within the dimension editor in BIDS, the first step is to add the new attribute. In this case it should be based on the Currency Code column that is, of course, already in use. That doesn’t matter, we want to use it twice. Therefore, I’ve created a new attribute called Currency Code and have set AttributeHierarchyEnabled = False, as there’s no need for anyone to browse this attribute. Now onto the all important attribute relationships. The attribute hierarchy that contains the names is called Source Currency. As it’s not possible to get the currency codes by right clicking on this attribute hierarchy, Source Currency is the attribute needs to be related to the new Currency Code attribute: Finally, the end result for the user is that they now get the ability to easily get access to the currency codes by right clicking on the names:

Power View Default Field Set

Here's another quick tip when creating data models for use with Power View - Default Field Sets can be created in both Tabular SSAS and PowerPivot that give the user a shortcut for automatically adding multiple fields to a report. As an example, here's a screenshot of the Default Field Set in PowerPivot - this will tell Power view that the following fields should be automatically selected when the table is added to a Power View report: I'd seen this in the documentation a while back, but hadn't actually used it, nor could I see a way to select the table. In fact, although there's no checkbox, it's just a simple single click on the actual table name, as I’ve highlighted below: When you click on the table name Product above, you will now get the following fields automatically added to a table visualisation: So not exactly rocket science, but worth doing! Here's how to set up the Default Field set for PowerPivot and Tabular Analysis Services.

Working with Images in Power View

Power View includes several ways to create visualisations with images, ranging from a simple table, to the scrollable tile visualisation. In this post I’m going to explain a bit about the tile visualisation, showing how it can be used in Power View, before showing a couple of tips for working with images in both PowerPivot and Analysis Services Tabular models. Power View Tiles The Power View Tiles visualisation allows you to display a scrollable series of images, which, when clicked on, will display data of your choosing. The report below is an example of this: What’s nice about the View above is that the names of the products are automatically displayed underneath the images. This is a sensible place to display the product names, as it means the area below can focus on whatever product level data you want, in this case Revenue by Month for the selected product. This works fine in the sample models (in my case the Tailspin Toys model that I think came with CTP3) but it wont work in vanilla PowerPivot / Tabular models unless you configure a few model settings. Trying to reproduce the report above without these settings will give the following tile visualisation, minus the product names: PowerPivot Advanced Settings To address this in PowerPivot, you first of all need to switch to Advanced Mode, which you can do by selecting this option from the File menu in PowerPivot: This gives you the following advanced tab: Clicking on the highlighted Table Behaviour button in the ribbon gives you the following Table Behaviour window: Here you can “change the default behaviour for different visualisation types and default grouping behaviour in client tools for this table”. The following properties should be set in order to get the tiles working correctly: Row Identifier – Specifies a column that only contains unique values, which allows the column to be used as an internal grouping key in client tools. Default Label – Specifies which column gives a row level display name, e.g. the product name in a Product table. This is key, as when a row level image is used, this property specifies which value to display alongside the image in tiles and other image-based visualisation. Default Image – Specifies which column contains images representing the row level data, e.g. pictures of products. The full details for all these properties can be found here. Tabular Models Tabular Models contain the same properties, which can be edited when you open a tabular project in the new SQL Server Data Tools. The image below shows how the properties have been configured for the column in my model called Image: Back to Power View If we now build a tile visualisation using the image column, we will see that we get the names of the products returned along with the image. It’s also worth noting that the Image and Product Name columns have now have an icon next to them, indicating that the field exhibits “row level” behaviour: Finally, for some examples of what’s new in Power View SQL 2012 RC0 take a look at this post on the Reporting Services Team Blog.

Managing SSAS Named Sets with Master Data Services Collections

Master Data Services Collections are probably not the first feature that come to mind when you think of the MDS product. The hierarchies and member grids tend to be the core functionality, but as this post will hopefully show, MDS Collections are useful also. Collections are essentially managed lists of members that can come from multiple different MDS explicit hierarchies, or also from another collection. The idea is that this “subset of members” can be maintained in MDS by a business user and then fed to external applications for reporting or other purposes. Analysis Services Named Sets One example of how collections can be used is to maintain Analysis Services named sets. Some named sets, such as a Top 50 Customers, don't require any maintenance, as it’s just the top 50 of all customers, based on a particular measure. On the other hand, sometimes named sets can be static lists, e.g. a list of core products that the user wants to see for reporting. In the latter example, if a user wants the definition of a named set to change, they have to get IT to change the MDX script. MDS collections can help by allowing the user to control the named set definition, reducing the burden on IT. Example Scenario  Here’s an example of how this is done. First of all, the end game is that I have a named set in Analysis Services that is currently just for 3 products. Therefore, a user can easily drop this set into an Excel report to get quick access to the products that are important to them: So the challenge is that we need to find some way of extending this named set without doing it manually. This is where MDS starts to come in, so using the sample Product model that comes with MDS, I’ve created a new collection called Favourite Products, as shown in the image below: If I go and edit the Favourite Products collection in MDS, then I can drag and drop more products into this collection, or remove some existing members. In this case, I’ve chosen to add two new products: C#, AMO and the MDS API So the named set represents the target that we need to get to, whereas the MDS collection that’s shown is the source. To get the contents of the MDS collection to update the named set, one way of doing it is to use the MDS API to pick up the collection members, and then to use AMO in order to write the named set into the MDX script. I’m just doing this in a C# windows application, but you could do it via a batch process, such as SSIS. For this post I’m just going to show the code, so here goes: This post is already starting to feel too long so I’m not going to show the basics of the MDS API. For that, take a look at a good posting by the MDS Team blog here. Also, as anyone who knows me will no doubt agree, I don’t tend to get involved in doing C#, so don’t consider this to be production ready! It should give you an idea of the basics though. Anyway, assuming that we’re now connected to the MDS Web Service, I’ve created the following method that will return the members from the collection: private HierarchyMembers ObtainHierarchyMembers(string entityId, string hierarchyId, string modelId, string versionId) { HierarchyMembersGetRequest request = new HierarchyMembersGetRequest(); HierarchyMembersGetResponse response = new HierarchyMembersGetResponse(); request.HierarchyMembersGetCriteria = new HierarchyMembersGetCriteria(); //State that our hierarhcy type is a collection and that we want collection members request.HierarchyMembersGetCriteria.HierarchyType = HierarchyType.Collection; //Pass in the key search criteria to identify the correct collection in MDS request.HierarchyMembersGetCriteria.ParentEntityId = new Identifier { Name = entityId }; request.HierarchyMembersGetCriteria.HierarchyId = new Identifier { Name = hierarchyId }; request.HierarchyMembersGetCriteria.ModelId = new Identifier { Name = modelId }; request.HierarchyMembersGetCriteria.VersionId = new Identifier { Name = versionId }; request.HierarchyMembersGetCriteria.RowLimit = 50; request.International = new International(); OperationResult result = new OperationResult(); //Return the hierarchy members from the service return service.HierarchyMembersGet(request.International, request.HierarchyMembersGetCriteria, out result); } Before we use the above method, we need to connect to SSAS using AMO. That can be done quite easily with the following code: Server server = new Server(); string connection = "Data Source=.;Catalog=Adventure Works DW 2008R2;"; server.Connect(connection); After we’ve done all the usual error handling associated with connecting to a database, we need to pick up the SSAS database and cube: Database database = server.Databases["Adventure Works DW 2008R2"]; Cube cube = database.Cubes["Adventure Works"]; Now we’re ready to call our ObtainHierarchyMembers method, before looping through it to build the named set: StringBuilder mdx = new StringBuilder("\n//Auto generated named set at " + DateTime.Now.ToString() + "\nCREATE SET CurrentCube.[Favourite Products] AS {"); int count = 1; //Loop through the collection to build the mdx foreach (ParentChild pc in hm.Members) { //Add the members to the MDX string //This references the member by name //It would be possible to reference by member key, but would require more work mdx.Append("[Product].[Product].[" + pc.Child.Name + "]"); if (count < hm.Members.Count()) { mdx.Append(", "); } count++; } mdx.Append("};"); Now we need to insert the named set in the correct place within the existing MDX script, bearing in mind it could already exist: string currentScript = cube.MdxScripts[0].Commands[0].Text; //Find the correct place to insert the named set within the MDX script: int start = currentScript.IndexOf("\n//Auto generated named set at"); int end = 0; StringBuilder newScript = new StringBuilder(currentScript); if (start != -1) { end = currentScript.IndexOf(";", start); //If the named set already exists, remove it newScript.Remove(start, end - start + 1); } else { start = currentScript.Length; } //Insert the named set in the correct place newScript.Insert(start, mdx.ToString()); //Update the cube's MDX script cube.MdxScripts[0].Commands[0].Text = newScript.ToString(); Finally we just need to update the cube in order to write the MDX back to the cube: //Call the update methods to update the cube cube.MdxScripts[0].Update(); cube.Update(); User Reports Now that the cube has been updated, if the Excel report is refreshed, then the two new products will appear in the list: Summary Controlling SSAS named sets like this won’t be for everybody. I’ve certainly worked in a few clients where strict process has to be followed to update this sort of thing, but I can think of other companies that I know where this would be really useful. Managing Analysis Services named sets is just one use for collections. Another example might be managing default multi-select parameters for SSRS reports. As collections are just lists of members that can be extracted easily, what you do with them is up to you!

Master Data Services Training in the UK

This is just a quick post to announce a range of SQL Server training courses, organised (and in some cases delivered) by Chris Webb. To start off there’s a SQL Server course delivered by Christian Bolton in December, followed by an Analysis Services course delivered by Chris in February. I’ll be delivering a Master Data Services course in February, before Chris delivers an MDX course in March. The details for all the courses are: SQL Server Internals and Troubleshooting Workshop - Christian Bolton – 6th – 7th December 2011 The Advanced Troubleshooting Workshop for SQL Server 2005, 2008 and R2 provides attendees with SQL Server internals knowledge, practical troubleshooting skills and a proven methodical approach to problem solving. The workshop will enable attendees to tackle complex SQL Server problems with confidence. Full details and registration here. Real World Cube Design and Performance Tuning with Analysis Services – Chris Webb – February 2012 A two day course that takes real world experience in showing you how to build a best practice Analysis Services cube, covering design issues such as data warehouse design and complex cube modelling. Day two then covers performance optimisation for Analysis Services, including MDX optimisation and cube processing. Full details and registration here. Introduction to Master Data Services with Jeremy Kashel – February 2012 An end to end look inside Master Data Services, this full day course will begin with a synopsis of Master Data Management before moving on to an overview of Microsoft SQL Server 2008 R2 Master Data Services (MDS). The remainder of the course will cover the major MDS topics, such as modelling and business rules, which will include a number of practical exercises. More details and registration for here. Introduction to MDX with Chris Webb – March 2012 The Introduction to MDX course aims to take you from the point of being a complete beginner with no previous knowledge of MDX up to the point where you can write 90% of the MDX calculations and queries you’ll ever need to write. The three day course covers the basics, such as sets, tuples, members to more advanced concepts such as scoped assignments and performance tuning. Full details and registration here.

MDX with Microsoft SQL Server 2008 R2 Analysis Services Cookbook - Book Review

As you may have already seen, Packt have released a new MDX book, namely MDX with Microsoft SQL Server 2008 R2 Analysis Services Cookbook by Tomislav Piasevoli. The book is written as cookbook, meaning that you can choose the individual ‘recipes’ that apply to whatever problem you’re trying to solve. Each recipe starts off with a section called ‘Getting Ready’, essentially giving you the base query to use, before moving onto ‘How to do it', which covers the main instructions for the recipe. There are then further sections, which explain how the example works and also suggest other functions/concepts that you may want to consider. This sort of cookbook style makes it really easy to follow, each recipe is displayed very clearly in the book. A wide range of MDX problems are covered, from time calculations to context-aware calculations. Not every piece of the MDX functionality is covered, which is to be expected, given the size of the book. It also doesn’t cover the usual introduction to members/sets etc that MDX books tend to cover, but it’s clearly stated that having a working knowledge of MDX is a pre-requisite for the book. I found the copy that I’ve been reading in the Adatis office, but I really do like it, so I’ll definitely be ordering my own copy!

Adding a Calculated Field to an Excel Pivot Table With Analysis Services

One question that I get asked from time to time is how to add a calculated field to an Excel Pivot Table that’s connected to an Analysis Services cube. I’ve been aware of a workaround to do this for a while, but a quick Bing :-) revealed that the common answer given to this question is that it’s not possible. It definitely is possible, it’s just not that obvious and is perhaps a little limited. Here’s how it can be done in Excel 2010: Sample Cube First of all, connect to a cube and produce a pivot table report. I’ve connected to the 2008 R2 version of the Adventure Works DW cube and have produced the following report, using the Internet Sales Amount measure: Adding a Calculated Field If you now open the Field List, you will see that Internet Sales Amount is selected of course. Here’s the trick – you can select the measure twice by dragging it into the Values area, which will give the following: Now click the drop down next to Internet Sales Amount2, which will display the following window: You can change the name of the field – I’ve called mine Percent Share. In the Show Values as drop-down, I’ve picked % of Grand Total, so that I can see the share of the sales in each country across all years. So now the report is as follows: Summary A better approach is to put as many calculations in the cube as possible, but as some calculations requested by users are report-specific, this feature of Pivot Tables allows users to produce the reports that they want. The calculations that you can carry out are only those in the ‘Show values as’ drop-down. There’s quite a few in there, especially in Excel 2010, as mentioned here. If that’s not enough, then an alternative is to use the OLAP Pivot Table Extensions, which is available to download on Codeplex.

Analysis Services KPIs and Reporting Services 2008 R2

There was a comment on my last blog post by Peter Eb asking about whether Reporting Services 2008 R2 knows how to render SSAS KPIs automatically. The short answer is that it doesn't unfortunately, but there are only a few steps needed to get it working. For this blog post I've built a simple report that uses the Adventure Works DW 2008 R2 Analysis Services database as a data source. The first step is to build a data set that contains a query that references the Month attribute and the Channel Revenue KPI: I'm interested in reporting on the status and trend part of the KPI, so I've added the month and value parts of the KPI, plus blank columns for the status and the trend: For the status and the trend columns, the next step is then to drag on (or right click and insert) an indicator in each of the cells. This is when the manual steps come in: Right click on the indicator to open the properties window; Set the Value of the indicator to be the Status part of the Analysis Services KPI; Change the States Measurement Unit to Numeric; Configure the indicator Start and End properties to be -1, 0 and 1. SSAS KPIs can have an unlimited number of status values, so it may be necessary to add more states here if SSAS returns more than 3 values for the Status. This changes that are needed to the indicator properties window is shown below: Applying the same method to the Trend part of the KPI produces the following simple report: This post wasn't really supposed to compare SSRS Vs PerformancePoint Services, but I've found out that PerformancePoint Services does pick up the SSAS KPI meta data (slight mismatch on the colours only), saving you having to set up the indicator yourself: It's a shame that the KPIs aren't picked up automatically by SSRS, but then again the above process isn't too difficult to master.

Allocations in PowerPivot Using DAX

Although I didn't mention it, the inspiration for my last post was the excellent MDX Solutions 2nd Edition. Whilst flicking through the book a while back, I though it would be interesting to see how DAX and PowerPivot could handle the so called 'Common Calculations and Selections in MDX'. This post continues that theme, focusing on one calculation in particular that's in the aforementioned MDX book, namely 'Unweighted Allocations down the Hierarchy'. Essentially this business problem is dealing with allocating data that is entered at a higher level (e.g. Quarter) down to a lower level (e.g. Day). It varies from business to business what the users actually want to see at the day level in this situation. Some expect to see the Quarter amount, some want to see a blank value and then some users want the amount in the quarters allocated down to the day level. If the expectation is that data should be allocated, then one way of doing the allocation is to use a ratio. In the MDX book, this is achieved by the following MDX: 1.0 / Descendants( Ancestor( [Date].[Calendar].CurrentMember, [Date].[Calendar].[Calendar Quarter] ), [Date].[Calendar].CurrentMember.Level, SELF ).Count If we put this into a query and run it against the Calendar hierarchy in AdventureWorksDW we get the following results: Therefore our quarter-entered value can simply be multiplied by this ratio to get the correct value at each level in the Calendar hierarchy. PowerPivot So the challenge is how to be able to do this in PowerPivot. I started off by looking for some data in AdventureWorksDW that would fit the bill. A good candidate seems to be the FactSalesQuota table, which I imported into PowerPivot along with its related dimensions: Once in the pivot table, I wanted to be able to calculate the correct ratio at the daily and monthly levels. For the daily level this is 1 / 90, as there are 90 days in the Q1 that year, and for the month level it is 1/3 as there are 3 months in the quarter. Given that there's no MDX style Ancestor() function in DAX, I ended up having to have a different calculation at different levels. Something that I learnt from this forum post is that you need to use IF() in order to isolate a calculation at a different level. Therefore the DAX that I came up with was: =IF( COUNTROWS(VALUES('DimDate'[FullDateAlternateKey])) = 1,     1 / CALCULATE( COUNTROWS('DimDate'), ALL('DimDate'[FullDateAlternateKey], 'DimDate'[EnglishMonthName]) ),     1 / CALCULATE( COUNTROWS( DISTINCT( 'DimDate'[EnglishMonthName]) ), ALL('DimDate'[EnglishMonthName]) )    ) The COUNTROWS(VALUES('DimDate'[FullDateAlternateKey])) = 1 allows me to check that we're at the day level, and if we are at the day level, then count the number of days in the quarter. Otherwise, if we're at the month level we calculate the number of months in the current quarter. When this DAX is put into a new measure we get the following results: This produces the same results as the MDX statement. The next step was then to apply this ratio to the Sales Amount Quota for the current quarter. This was achieved using the following DAX: =IF(       COUNTROWS( VALUES('DimDate'[FullDateAlternateKey]) ) < 90,       CALCULATE( Sum('FactSalesQuota'[SalesAmountQuota]),        All('DimDate'[FullDateAlternateKey], 'DimDate'[EnglishMonthName]) ) * 'FactSalesQuota'[Ratio],       Sum('FactSalesQuota'[SalesAmountQuota])    ) The < 90 at the start of the statement allows me to check if we're at the day level or the month level. If we're at either of these levels, then we want to pick up the Quarter-level SalesAmountQuota and multiply by the ratio. Otherwise, at the Quarter and Year levels, we just apply the original SalesAmountQuota value. This produces the following results: Conclusion The use of the MDX Scope statement would have been nice to pinpoint exactly where I wanted the calculation to occur, but then again, the use of IF() keeps it quite simple, which is better for an Excel power user. Whilst MDX is very very powerful, can you imagine a business user getting to grips with MDX statements such as Scope or Freeze? The bottom line is that the DAX calculation produces exactly what I want - the new AllocatedSalesQuota measure is correct at all levels in the Calendar hierarchy. Even with an IF() statement, like everything else in PowerPivot, the calculation performs very quickly.

DAX Closing Balances

One of the often required calculations for users in reports is to be able to calculate a point-in-time closing balance, which is especially relevant when dealing with stock levels or financial data. In the Analysis Services world, at least in the Enterprise Edition, we're lucky that we have a variety of Aggregation Functions that can deal with semi-additive measures such as a closing balance. Or we can always turn to MDX if we need to which will achieve the same thing. In PowerPivot, we don't have Aggregation Functions, but there is DAX, which has a variety of time intelligence functions up its sleeve. The idea behind this post is to see how PowerPivot would deal with the requirement of reporting a closing balance measure. DAX Time Intelligence Functions One of the DAX functions that we can use for our closing balance is called ClosingBalanceMonth(), which will simply evaluate an expression at the end of a given month. There's also the similar ClosingBalanceQuarter() and ClosingBalanceYear(). Having noticed these functions within PowerPivot, my first port of call was to set up a simple PowerPivot model by taking data from the FactInternetSales table in AdventureWorksDW, plus all it's related dimensions of course. What I ended up with was the following simple pivot table in Excel 2010: Sales Amount obviously isn't a semi-additive measure, but I'm treating it as one in this example. My first port of call was to use ClosingBalanceMonth() to try and get the value for the month to be equal to the value for last day in the month. I managed to come up with the following formula: =CLOSINGBALANCEMONTH(Sum([SalesAmount]), 'DimDate'[FullDateAlternateKey], ALL('DimDate'[FullDateAlternateKey])) This produces the results that are shown below, in the measure called ClosingBalanceFormula: It does a good job of calculating a closing balance for each month, but it presents that closing monthly balance on each day. Therefore, I started looking for an alternative approach and found the LastDate() function. It returns the last date that's in context for the passed date column. The last date that's in context at the month level will be the last date in the month, but thankfully the last date that's in context at the day level will be the current day. Therefore I came up with the following formula: ='FactInternetSales'[Sum of SalesAmount](LASTDATE('DimDate'[FullDateAlternateKey])) This produced the results that I wanted - the daily closing balance at the day level and the monthly closing balance at the month level:   A bit more digging lead me to find out that a combination of the Calculate() and LastDate() functions gets to the same result, as an alternative. That formula is: =CALCULATE(Sum('FactInternetSales'[SalesAmount]), LASTDATE('DimDate'[FullDateAlternateKey])) Conclusion It's a shame that the ClosingBalanceMonth() formula didn't quite work, but perhaps getting that approach to work will come out in the wash with more DAX experience. However, as with MDX or any kind of expression language, there's always going to be a load of different ways to get to the same result. It's much more efficient to work knowing that you've got several ways around a particular problem, rather than being forced down a particular path, and in fact, I think the range of calculation functions available shows the richness of the DAX language.

XLCubed Version 5 Released

Interesting to see yesterday that XLCubed have released version 5 of their popular OLAP reporting tool. As I've mentioned before, I've always found XLCubed to be very powerful and easy to use, and therefore, in my opinion, it's definitely one of the best Excel OLAP reporting tools out there. Version 5 sees the introduction of some new features in the Excel/web editions, namely: Enhanced Excel integration with 'MicroCharts'; An improved user interface in the Web Edition; Print to PDF and enhanced save to Excel options in the Web Edition; Fine grain control over publishing Excel reports to the web. Personally, I'm used to using the Excel Edition to slice and dice cubes to get what I want quite easily. Although it's not entirely new to V5, you can carry out the same kind of ad-hoc analysis in the Web Edition, as shown in the following video. The end result is that you are able to build the kind of dashboards as shown below, whether in the web or the Excel edition: As usual, the evaluation editions for the Excel and Web versions can be downloaded here

Analysis Services 2008 Performance Guide

As Chris Webb has already blogged, the Analysis Services 2008 Performance Guide has now been released. Like the 2005 version, the 2008 guide contains a wealth of information and is a must read if you are trying to tune either query or processing performance in Analysis Services. The document can be downloaded here.

Day 1 at the BI conference

So here we are in Seattle at the BI conference.  Day 1 and it's been great to catch up with some old faces and meet some new ones.  We were promised some big news today and as is being reported by Chris, Mosha, Marco and all, we weren't disappointed.  Both "Kilimanjaro" and "Gemini" look super-exciting for all of us in the MS BI world.  The former is the evolution of the DatAllegro acquisition  - we saw an SSRS report that ran in 10-15 seconds against (hold little finger to corner of mouth) one trillion relational records in a 150 terabyte database - impressive.  Equally impressive is 20 million rows sorting and filtering in the blink of an eye!  The Gemini project (self-service BI) had two key features for me;  Firstly the in-memory storage (think TM1) that allows that sort of performance and secondly, and something that hasn't been widely commented on so far, the ability to publish your Excel reports to SharePoint/web (as XLCubed Web does now) at the click of a button.  The interface looked really good already and of course it's generating AS cubes behind the scenes.  It did raise the question of how it's all going to fit in with PPS V2 - hopefully the Office and SQL teams have been talking! I think that a lot of BI professionals were probably initially thinking (like me) that this could be bad news for their careers but having taken it all in, I don't think that's the case.   Although it's going to have data cleansing abilities, this isn't going to be a replacement for data warehouses/marts.  In fact for the whole self-service BI thing to really take off, it's going to need really good data underneath it.   Microsoft's intention with the Gemini release is to "democratise" BI; The more people who get to use this type of concept, the better the data quality will have to be - you can't expect every department to have to clean their own data. Allowing users to create and publish their own cubes and reports has a few warning signs as well - isn't this just excel hell without excel? Every department can create their own calculations, reports, cubes etc; We've been telling our clients for years that they really need one version of the truth - but now we're going to let each user make his own truth? It will certainly need some thought. As Chris W mentions, it's likely to also still need technical resource to help users create complex calculations so we won't be out of a job just yet ;) As for the rest of the day, some interesting sessions around PPS and SSRS were the order of the day for me.  Though disappointment in one respect as I found out that the PPS monitoring SDK would definitely not allow you to build a custom cascading filter - something that I was going to investigate. More later this week....

Analysis Services Properties for PerformancePoint MDX Rules

One of the great features of PEL is that you can choose either a SQL or MDX implementation for your rules, depending on what you want to achieve. Whilst the MDX rules are much less restrictive than the SQL rules, they can sometimes run slower, depending of course on how your environment is set up. When the MDX rules do take a long time to run, it's possible that you might see this message: What has essentially happened within Analysis Services is that an object (e.g. a cube or dimension) has been processed and is waiting to commit. Unfortunately this is not possible whilst an existing query is running, so therefore AS waits for the query to complete. It will, however, only wait so long, which is defined by one of the Analysis Services Properties called ForceCommitTimeout. Once this threshold has been reached, then the offending query is canceled, resulting in the error message above. Finding the right balance for the Analysis Services ForceCommitTimeout and the PerformancePoint PAC 'OLAP Cube Refresh Interval' setting is key. If you have set PPS to re-process its cubes too often then you may well see the above message. On the other hand, if you set the ForceCommitTimeout too high, then queries executed whilst the cube is waiting to commit will be made to wait, meaning your big query will get though ok, but other users may see bad performance. Darren Gosbell has written an excellent post here that provides a thorough explanation of ForceCommitTimeout and other related properties.

SQL Server 2005 Cumulative Update 8 for Service Pack 2

Microsoft have just released Cumulative Update 8 for SQL Server 2005 Service Pack 2. Amongst the changes are a few fixes for small problems in Analysis Services, such as: Infinite recursion occurs in the CalculationPassValue function in SQL Server 2005 Analysis Services; A data source view that connects to an Oracle database generates incorrect relationships when you develop a Microsoft SQL Server 2005 Analysis Services project by using BIDS; The MeasureGroupMeasures function does not work correctly in SQL Server 2005 Analysis Services; All the MDX queries that are running on an instance of SQL Server 2005 Analysis Services are canceled when you start or stop a SQL Server Profiler trace for the instance. Details the above fixes and many more can be found here, as well as details of how to download the update. As usual, you can only apply the hotfix if you have Service Pack 2 installed. Details of all SQL Server 2005 builds released after Service Pack 2 can be found here.