
Adatis BI Blogs

PowerBI Optimisation 1 – Connecting Via Management Studio

I recently gave a talk to the London PowerBI UserGroup and kicked things off with a confession: "I don't do much report building in PowerBI". Perhaps an odd way to qualify myself to speak to that particular audience. I am, however, a cloud solution architect - I spend my time designing large, scalable cloud systems that process vast amounts of data, and PowerBI is a common tool used on top of these systems. Why, then, do we accept the lack of controls available within PowerBI? Given any other end-user system I'd want to know about performance bottlenecks and data model efficiency and, more than anything, I'd want it in source control. First and foremost, the talk is available here.

The key to it all is realising that PowerBI Desktop, when running, starts a SQL Server Analysis Services process in the background. It doesn't just use the same engine as Tabular - it literally runs Tabular in the background without telling you. Open up a PowerBI Desktop file and, after you've seen the "initialising model…" window, you'll find this process running in the background - one for each PBID session. So, if the model is using Tabular in the background, we must be able to connect to the model directly!

First - Find your Temporary SSAS Port

There are two straightforward ways to achieve this:

1. By far the easiest is to open DAX Studio, if you have it installed. When you open DAX Studio, it presents a Connect window which lists all of the PowerBI processes you have running in the background, as well as any Tabular services. When you connect to a PBI file here, the port is listed alongside it. In my case the port is 5524 - be aware that this will change every time you open PowerBI Desktop, so you can't hardcode anything to look for your "PowerBI port".

2. Alternatively, you can find the "msmdsrv.port.txt" file related to your specific instance. Take a look in your user appdata folder - you should find a Microsoft/Power BI Desktop/ folder with some Analysis Services details: C:\Users\<YourUser>\AppData\Local\Microsoft\Power BI Desktop\AnalysisServicesWorkspaces\. You'll see a workspace folder for each of your PBI Desktop instances (I've only got one at the moment). Inside it, in a folder called "Data", you'll find the file we're looking for, and opening it reveals the port number. Pretty straightforward, and no DAX required. Obviously, if you have multiple instances, you'll need to figure out which folder relates to the instance you're after.

Connect via SSMS

Now that we know our port, we can simply open up Management Studio, connect to Analysis Services and enter "localhost:" followed by the port number from earlier. Once connected, you'll see a model connection - each PBIX file will be represented by a GUID for this instance, but you can drill down and see the objects underneath, exactly as you would with a Tabular model. You can now write queries, browse the model and basically treat it as a Tabular instance. The database itself will use a generated GUID, and several internal tables will do the same - for example, a hidden date table is created for every date key included in the model. We'll discuss the applications of this in my next post - namely how this unlocks performance tuning, monitoring and source control.
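To prove the connection works, you can open a DAX query window against the instance and query the model directly. A minimal sketch - the table name 'Sales' here is purely illustrative, so substitute a table from your own PBIX model:

-- Return 10 rows of a model table via the temporary SSAS instance
-- ('Sales' is a hypothetical table name, not one from this post)
EVALUATE
TOPN ( 10, 'Sales' )

The same query runs unchanged in DAX Studio, which connects over exactly the same localhost port.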

Power BI Maps – Handling Duplicate City Names

The Bing map engine behind the map visualisation in Power BI is very intuitive, allowing users to provide textual data such as city, country or postcode to map metrics, instead of just latitude and longitude as most other applications require. However, one thing which is not immediately obvious is how to get around the issue of duplicate city/town names. In this blog I will explain how to map your metrics when your data source contains duplicate cities or towns.

To start with, we have a simple data set with quarterly sales for 6 different cities based in 5 different states, which is being loaded from a CSV into Power BI. Straight away you can see that we only have 2 distinct city names. As soon as we try to map the sales data by city, we get an obvious problem: all of the Bristol sales are assigned to Bristol, England, while the Georgetown sales appear in Guyana. Adding state to the Location field does nothing to help, as Power BI only reads a single input in the Location field.

The solution is to create a new column containing both city and state data (a DAX alternative is sketched at the end of this post). To do this you need to complete the following steps:

1. Click “Edit Queries”.
2. Select the data source in question.
3. Select the two or more columns which contain the data we want to merge, e.g. City and State. If additional geographical data is available, such as Country, then this can be included in the merged column.
4. Navigate to the "Add Column" menu and select "Merge Columns".
5. Choose the separator value and name the new column.

For simplicity I have just called the column “Merged” and separated the values using only a space. Once the new column has been created, it can be dropped into the Location field of the map visualisation. I now have 6 data points, showing all three variations of Bristol and all three variations of Georgetown.

One final tip: ensure you have set the Data Category value for the column in question. In this case I have set the Data Category to City, to help Bing identify the type of data I believe I am providing it. The only problem with this is that if you set the Data Category incorrectly - for example, to “Continent” - no data will be displayed at all.
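As mentioned above, if you would rather not change the query, a DAX calculated column achieves the same result. A minimal sketch, assuming the imported table is named 'Sales' with 'City' and 'State' columns (these names are illustrative, not from the post):

-- Calculated column combining city and state, so Bing receives
-- e.g. "Bristol Tennessee" rather than the ambiguous "Bristol"
Merged = 'Sales'[City] & " " & 'Sales'[State]

Whichever route you take, the new column still needs its Data Category set as described above.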

Setting Up The Power BI Analysis Services Connector

The Power BI Analysis Services Connector is used to expose a Tabular model to Power BI, allowing end users to consume data from the model directly for building reports and ad-hoc analysis. The setup of the connector is very straightforward; however, you should bear the following in mind:

- Only Tabular models are supported by the connector – you will not be able to use this to enable reporting from a multidimensional database.
- The Analysis Services Connector performs best if it is hosted on the same server that hosts the Tabular model.
- The speed of the internet connection between the server running the Analysis Services Connector and the Power BI service is crucial to performance.
- You can’t run the Analysis Services Connector on a server also running either the Data Management Gateway or the Power BI Personal Gateway.

Installation Steps

1. Download the connector from http://www.microsoft.com/en-us/download/details.aspx?id=45333
2. Run the Analysis Services Connector setup program on the machine hosting the Tabular model.
3. Once the installation has completed, you will be given the option to launch the connector.
4. Enter the login details required to connect to Power BI.
5. If the details are correct, the wizard will move on to the connection configuration screen.
6. Enter the account details required to connect to the Tabular instance – clearly this needs to be an account with access to read from the Tabular model.
7. Give the connection a suitable name, and a friendly error message to be displayed to users in the case that the connection fails.

This completes the wizard. The next step is to log onto the Power BI site – if all has gone well, you should see the model listed as a data source.

Power BI MDS Dashboard

Master Data Services tables, such as the staging tables, have been well blogged and documented, but there are a host of other system tables and views that can help you with an MDM project. By querying these tables and views, we can gather MDS metrics such as the number of members, the number of validation errors and the number of staging errors, all of which can then be exposed to a data steward via a dashboard. Given all of the recent updates around Power BI, I decided to use Power BI Desktop to build an MDM dashboard for this blog post. The goal in this case is to produce a dashboard that allows a data steward to get an overview of metrics on an MDS model by model basis.

Power BI Desktop

The first step when working with Power BI is to import some data. Once we start Power BI Desktop, the welcome screen gives you a shortcut to get data, or we can do this via the ribbon. In this case we need to take data from SQL Server, so clicking the SQL Server option requires us to enter a server name and optionally a database name. Once we do this and pick the MDS database, we get a list of tables to choose from. Clicking Load will load the tables into Power BI.

Master Data Services Objects

The question is: what tables do we need? If you navigate the MDS database you will see that there is a mix of master data tables (e.g. post-fixed with _EN for entity), application tables and metadata views (prefixed with viw_SYSTEM). It’s the system views that we need to start with, as we want to allow a breakdown by model, entity and version. To do this we need to load in the following views:

--Models
SELECT ID AS ModelId, Name AS ModelName
FROM MDM.viw_SYSTEM_SCHEMA_MODEL

--Entities
SELECT E.Id AS EntityId, E.Name AS EntityName, E.Model_ID, E.Model_Name, E.Model_MUID
FROM MDM.viw_SYSTEM_SCHEMA_ENTITY E

--Get a count of versions
--This is used as a total and by version status
SELECT V.ID AS VersionId, V.Name AS VersionName, V.Model_MUID, V.Model_Name, V.Status AS VersionStatus
FROM [mdm].[viw_SYSTEM_SCHEMA_VERSION] V

Next, it would be useful to report on the member count in each entity. This is a little harder, as it’s not stored in the metadata tables, but we can combine the MDS metadata views with the SQL Server sys.dm_db_partition_stats Dynamic Management View in order to return the row count per entity:

--This gives the row count per entity for all models
SELECT E.Name AS EntityName, M.Name AS ModelName, M.MUID, M.ID AS ModelId,
       'mdm.' + E.EntityTable AS TableName, P.row_count
FROM MDM.viw_SYSTEM_SCHEMA_ENTITY E
INNER JOIN MDM.viw_SYSTEM_SCHEMA_MODEL M ON M.MUID = E.Model_MUID
INNER JOIN sys.dm_db_partition_stats P ON P.object_id = OBJECT_ID('mdm.' + E.EntityTable)
                                      AND P.index_id = 1

Finally, a few more metrics that would be useful are the number of staging errors and a breakdown of members by validation status (Validation Succeeded, Validation Failed, Awaiting Validation). To do this we can take in the following tables and stored procedures:

--Get the current count of errors that have occurred during staging
SELECT Entity_ID AS EntityId, ErrorMemberCount, Version_ID
FROM mdm.tblStgBatch

--Pass in the appropriate version id
EXEC [mdm].[udpValidationStatusSummaryGet]

--This gets the validation issues by model/entity
SELECT ValidationIssue_ID, Version_ID, Model_ID, ModelName, Entity_ID AS EntityId,
       EntityName, Member_ID, BusinessRuleName
FROM [mdm].[viw_SYSTEM_USER_VALIDATION]

Power BI Relationships and Visualisations

Once we import the data, Power BI Desktop will detect relationships automatically for us, but we can alter them by clicking Manage Relationships on the ribbon if we wish, defining the links between the various MDS objects mentioned above. Once we’re happy with the model and the relationships, we need to start dragging and dropping to build the Power BI report. As an example, we can create a simple card visualisation to show the count of various metrics: the count of entities, versions, validation issues and staging errors. Equally, we can create a simple column chart using the [mdm].[viw_SYSTEM_USER_VALIDATION] view, with the count of issues in the Value section of the chart and the business rule name on the Axis. Putting it all together, we get a report for the sample Product model.

Conclusion

This is just a start - there are other metrics that you could put onto an MDM dashboard. As an example, you could include MDM ETL metrics if you’ve logged them. But overall, the combination of the rich MDS metadata and the intuitive Power BI interface means it’s relatively easy to produce simple dashboards.
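As an aside, the card values can come either straight from row counts of the imported views or from simple explicit measures. A minimal DAX sketch, assuming the validation issues query above was imported as a table named 'ValidationIssues' (the table name is whatever you chose at import, so treat it as illustrative):

-- Count of validation issues, for a card visual
Validation Issue Count = COUNTROWS ( 'ValidationIssues' )

-- Count of distinct entities that currently have at least one issue
Entities With Issues = DISTINCTCOUNT ( 'ValidationIssues'[EntityId] )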

Power BI Visual Studio Online Content Packs – Analysing BI Team Performance

I spend a fair amount of time championing the use of data and analytics around my clients’ companies, convincing people from all walks of life that making their data more visible and available around the company will be hugely beneficial. I was recently challenged on this: if I’m such a firm believer in sharing data and improvement through analysis, why am I not sharing my own data? It’s a very good point, and not one that I had an answer for readily available.

I’m currently engaged on a project which uses Visual Studio Online for both ALM and source control. Fortunately, Microsoft have recently released the Power BI VSO content pack, making the data held within your account available for dashboarding. The above link describes the steps to connect, but I found the dashboards required a little setting up before they were completely useful. You are first presented with a mixed bag of charts and metrics, many of which will contain no data. This is because the data model has different entities depending on the project template (Agile, Scrum or CMMI) chosen within TFS, as well as the source control binding (TFS or Git). I removed many of the charts from the default dashboard, then went exploring in the exposed report model - I was very happy with what I found. The VSO object model exposes pretty much every metric I could think of to report on the activity of a BI development team, including report templates for each template/source control combination you could be using. I gave myself 15 minutes to see if I could pull out a reasonable dashboard and was pleasantly surprised by what could be done in so little time. So - how do you analyse an analysis team?

How is the Project Going?

Firstly, we’re an agile team. We run iterations loosely based around Scrum principles, we manage our client projects through story backlogs and we report daily on blockers, impediments and so on. This information is key to us operationally, but it also tells a useful story. How many new features were added in the sprint? How many individual user stories, each representing a distinct piece of business value, were delivered? How much effort is remaining in the backlog (and therefore how many additional sprints would be required to deliver all known functionality)? How many bugs have been raised - and how effective are we at dealing with them?

What’s the current Sprint Status?

The day-to-day metrics also tell a valuable story - was the sprint smooth and predictable, or was it a rush to deliver towards the end? How much work is still remaining in the current sprint? Are there any blocked tasks or impediments that may be seen as a risk to delivery?

What actual work has been done?

Stories and tasks only tell one side of the story - a task may represent a change to a single piece of code, or a large update that touches much of the system. Simply counting tasks therefore limits our understanding of how productive we were during a sprint. Fortunately, we can also analyse the source control history, looking at the changesets committed and their contents. This provides some insight into the complexity of those completed tasks - it’s not a perfect measure, but it gets us a little closer. We can now ask questions such as: How many individual changesets were committed? Who commits most regularly? What kind of work was done - what is the most common file amended? Is there someone who hoards changes and causes potential problems by not regularly committing their work? Is our development behaviour changing over time as we review our practices and learn from them?

Finally, it’s also worth noting that the content pack has been fully set up with synonyms, allowing the Q&A natural language query bar to be activated. So if there’s a metric not included in the dashboards, users can simply type their question into the query bar. For example, I wanted to better understand the type of changes we’re making - are we creating new entities or modifying existing code? I tried asking exactly that, and the relevant chart appeared before I’d even finished typing.

There’s a whole lot more content in the packs under the individual report tabs, but this gave me a good point to start that conversation. I can now provide weekly dashboard updates to my project sponsors, showing just how much progress we’re making. This is a huge boost to my ability to champion data, and I’m expecting it to actually improve some of our working habits. Now, if anyone interrupts me mid-flow, I can simply grab my phone, load up the Power BI app and pull out some insights into the team’s current performance, wherever I am.

Adatis Hackathon Jan 2015 – Power BI Designer Preview

In the January Adatis hackathon we were split into 3 teams and got the chance to play with some bleeding-edge technology. I was captain of the Gleam team, which meant we got our hands on the new Power BI Designer preview. To check out the other teams, see here: STREAM (Stream Analytics) and DREAM (Azure ML).

We tackled two main dashboards in our hack build, first creating Excel mock-ups and then reproducing them inside Power BI Designer. This was pretty challenging, as some of the chart types simply weren’t there and there was some awkward mashing of data to get it into different charts. The end results weren’t quite as pretty as we were hoping, but it’s still very quick to put something together. The move towards a “dashboard”, as opposed to large, single charts, means we can create richer, more informative systems using the tool.

However, our main focus was the designer itself - this is the start of the Power BI suite leaving the clutches of Excel. Don’t get me wrong: when Power BI first came about it was right at home as a series of add-ons for Excel, but as it has grown, and SSRS has become more dated, the corporate market has started to look towards it for reporting. Power BI as a completely Excel-based system sits outside of the development ecosystem - it’s not source controlled or versioned. For this alone I feel it is justified for it to leave the nest and head out into the real world as a standalone system.

That said, the preview designer is currently lacking functionality - it actually has less functionality than it did in Excel. There were certainly fewer chart types available than in its Excel counterpart, and we did come across a few (very annoying) bugs. However, this is only a preview, and since the hack day there have been several updates that add various bits of functionality as well as bug fixes. So even if we are not 100% there yet, this new product does seem to be heading in the right direction. There’s still a question of where it’s heading and how it will develop, but the Power BI dashboards currently previewing in the US, and new features such as the Power BI API, are really moving the technology towards a mature, usable state. We will be keeping a close eye on this as it develops, to see what new features get added in future updates of the preview and, eventually, in the real thing!

SQL PASS Summit 2014 – Kick Off

Day 1 has kicked off in Seattle, a remarkable city. Having arrived a week early, I’ve had plenty of time to check out the sights and the food and have enjoyed it immensely - a brilliant venue for PASS!

There were a few announcements at this morning’s keynote, mostly regarding Azure. Azure SQL Databases are gaining larger index handling, parallel queries, extended events and in-memory columnstore for data marts. Joseph Sirosh gave a talk about machine learning and information management, showing a cool example of Azure Stream Analytics using Microsoft Kinect sensor information of customer interactions in a shop being uploaded straight into Power Map! I am looking forward to hearing more on machine learning. There are also a handful of great improvements for Power BI. I am most looking forward to the new Live Ops Dashboards and drill-through actions! Combo Reports also look promising…

Moving on to the first session, I chose to attend ‘What’s new in Microsoft Power Query for Excel’. As it turns out there’s not a massive amount of new stuff - some new data sources and a tick box when you do a merge to remove name prefixes. However, one of these new sources is the long-awaited Analysis Services data source. The new ODBC data source is a great addition too. There was a mention of the possibility of a decoupled M-query SSIS component, but we probably won’t hear anything until later in 2015, unfortunately. I would say this was not a level 300 session - more like 100/200.

The second session was ‘SQL Server 2014 Always On (High Availability and Disaster Recovery)’: a nice overview of what was brought in in 2012 and the newer 2014 features, including an increased maximum number of secondary replicas, increased availability of readable secondaries and the Add Azure Replica wizard. Despite not being a DBA, and it being a level 300 session, I found it easy to follow and absorb. I feel many of the DBAs in the room may not have taken away anything they didn’t already know, however.

Niko Neugebauer gave a fantastic, in-depth session on ‘ETL Patterns with Clustered Columnstore Indexes’. It was a fast-moving talk, despite the time spent waiting for some executions, and it demanded your full attention! Definitely worthy of its 400 level. It left me a little tired for Marco Russo’s ‘DAX Patterns’ session, which showed some examples of workarounds for common Tabular weaknesses, like distinct counts for Type 2 slowly changing dimensions and cumulative quantities.

Overall it was a strong day. I am looking forward to tomorrow. More to follow…

KPIs in Power View and DAX Queries for KPIs

I was recently approached by a client we have assisted in implementing a Microsoft Analysis Services Tabular solution and a supporting management pack of SSRS reports. They were asking about utilising some of the KPIs defined in their Tabular cube in the SSRS reports using DAX queries. I was at a bit of a loss, and most web help seemed to suggest it couldn’t be done: Blog comment from 2012.

However, with a bit of time to investigate, it appears Power View does now support Tabular KPIs, and as such a DAX query should be able to pull them out. My KPI’s base measure is named “Highest Ranking” and I am able to view its value, status or goal in Power View. A little work with every SQL developer’s best friend, Profiler, and I had what I was looking for. It appears that when you create a KPI, SSAS defines some measures in the background which do some of the work of the MDX KPI functions. Therefore the following DAX query against my cycling demo cube returns the expected results and could be used in SSRS reporting:

EVALUATE
ADDCOLUMNS (
    VALUES ( 'DimRider'[Name] ),
    "Highest_Ranking", 'FactRanking'[Highest Ranking],
    "Highest_Ranking_Goal", 'FactRanking'[_Highest Ranking Goal],
    "Highest_Ranking_Status", 'FactRanking'[_Highest Ranking Status]
)
ORDER BY 'DimRider'[Name]

I also tested for the existence of other KPI functions, such as Trend and Weight, but these do not appear to be present. It is also interesting that turning a measure into a KPI does not change the measure’s name but just groups it in the field list, so there is no need for a value function. For more info on KPIs in Power View there is additional documentation here. I am currently unsure of the versions required for this functionality, so if it is missing from your environment I would love to hear from you.
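As an aside, one way to list these background measures without tracing in Profiler is to query the MDSCHEMA_MEASURES schema rowset from an MDX/DAX query window in SSMS. A sketch - note this is my suggestion, not the approach used above:

SELECT MEASURE_NAME, MEASURE_UNIQUE_NAME, MEASURE_IS_VISIBLE
FROM $SYSTEM.MDSCHEMA_MEASURES

The underscore-prefixed Goal and Status measures referenced in the DAX query above should show up here as hidden measures.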

Power BI Preview – Mobile Integration

I received my invite to view the Power BI preview today (get yours here). The first thing I wanted to do was get some reports up and start displaying them through my Surface and its Power BI app. The app comes pre-installed with samples, but to link it up with your own site you need to hit Browse from the app menu. From there you can add a location. What caught me out was working out where to navigate to find my Power BI preview reports. It is hidden away in the documentation here: don’t try to navigate directly to your Power BI site, as I did, but instead point the app directly at the SharePoint site, and it will pick up the Power BI app from there. You can then navigate to the reports you have uploaded to your preview site and favourite the ones you want to appear on the home screen; I have a couple on there already. You can control which sheets get displayed by changing the browser view options of the workbook, as described here. More to come on Power BI shortly…