Adatis BI Blogs

Power BI Composite Models and Aggregations

So something which seems to have gone under the radar a little is the addition of composite models and aggregations to Power BI. These features have been in preview for a couple of months now, but I've not seen much buzz around them. After recently attending PASS Summit and seeing Christian Wade's one trillion row demo, I thought they were worth blogging about – especially as I expect parts of this functionality to be extended to Azure AS in the near future. If you think about it, the majority of BI queries are issued at some level of aggregation, yet users still want the detail, so these features essentially unlock the ability to report against giant datasets at both an aggregated and a granular level at the same time – something that was not physically possible beforehand. Power BI is now able to work with multiple petabytes of data, over trillions of rows, with pretty much instant response times – so this is a bit of a game changer.

Composite Models

Previously, a single Power BI data source was restricted to either DirectQuery or Import. With composite models, that has all changed. A report can now include data from more than one connection in any combination, so you are able to use both DirectQuery AND Import in the same model and combine the data. This opens up a world of possibilities that were not available before. With this functionality also comes the ability to create many-to-many relationships, but I won't be going into detail on that in this blog. Alongside it arrives something called storage mode, which unlocks table-level storage features. This is the next part of the jigsaw.

Storage Mode

This allows you to specify whether tables are imported or queried on the fly. There is now a third option, Dual, which acts as either a cached or a non-cached table depending on the context of the query that's submitted at runtime. Setting the correct storage mode has many advantages, such as:

- Better query performance – no need to push real-time queries to the source for relatively static tables, such as those used for filters.
- The ability to use larger datasets – interactive analysis works better against tables with significant data volumes that you don't want to cache into memory.
- Optimised data refresh – only cache the data that's necessary, meaning quicker refresh times.
- Reduced latency for real-time data – no need to re-read those static tables each time.

To access the storage mode, either select the table in the report pane and click Properties, or navigate to the table in the new Modelling view (which, as it is in preview, must first be enabled in Options). Changing a table's storage mode will then prompt you to change the related tables to Dual. This propagation logic is designed to help with models that contain many tables!

Modelling View

Before I get to aggregations, I also need to introduce the new Modelling view. This is changing to fall in line with what you may be familiar with from SSAS / Azure AS, and it feels like one of the first steps in integrating SSAS further into the Power BI spectrum and facilitating enterprise datasets. The updated Modelling view gives you the ability to create additional diagrams, allowing you to break a particularly complex model out by subject area rather than trying to join it all up within one diagram. If you are working with 5 tables in your model, this isn't a particularly big deal – now multiply that by 10 and all of a sudden it becomes a pain to manage, which this new feature will help alleviate. The feature also allows you to multi-select objects and update their properties in one go through a new side pane. An example of this new feature can be seen below.

Aggregations

Finally, the most exciting feature of the lot – aggregations. Without the new Modelling view, the composite models and the new storage modes, this would not be possible. Aggregations allow us to define a new table within the model, based on an existing one, with the fields of the original table aggregated or grouped in some way. The available summarisations are count, group by, min, max, sum, and count rows. The aggregation table can be set to Import mode (with or without incremental refresh), or left on DirectQuery and optimised in the source using columnstore indexes. This unlocks fast query performance over huge datasets by answering queries from the cache at the aggregated level, using a fraction of the resources required at the detailed level. You then have the flexibility to set the aggregation table to Import while leaving the granular table on DirectQuery, which will speed up performance when navigating the report. The aggregation table can also be hidden, so users will not even be aware of the implementation. You can also have more than one aggregation table – potentially one for a particular set of reports and a lower-grain aggregation table for analysts – and this is managed through the precedence option. The engine will query the aggregation table with the highest precedence first to see if it can resolve the runtime query, before moving down the levels. Aggregations don't just work for simple measures such as SUM or COUNT; they also work for more complex measures, because the components of a measure are folded down to sum, min, max and count at the appropriate level, and each of those sub-queries works out whether it can hit the cache or not. It's also worth mentioning that you can check whether a query has hit the cache using tools such as DAX Studio. A sketch of how an aggregation table might be defined follows below.
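To make this more concrete, below is a minimal Power Query (M) sketch of how such an aggregation table could be defined by referencing the detail query and grouping it, using the summarisations listed above. The Sales query and its column names are hypothetical; in a real model you would then set this table to Import, leave the detail table on DirectQuery, and map the columns up in the aggregations configuration.

```
// Hypothetical sketch only: builds a "Sales Agg" table from a detail-level
// "Sales" query. Table and column names are assumptions for illustration.
let
    Source = Sales,  // reference the existing detail-level query
    SalesAgg = Table.Group(
        Source,
        {"OrderDateKey", "ProductKey"},  // group-by keys (the aggregation grain)
        {
            {"SalesAmount Sum", each List.Sum([SalesAmount]), type number},
            {"SalesAmount Min", each List.Min([SalesAmount]), type number},
            {"SalesAmount Max", each List.Max([SalesAmount]), type number},
            // count rows per group; this is what lets average-style measures
            // fold down to Sum / RowCount and still hit the cache
            {"Sales RowCount", each Table.RowCount(_), Int64.Type}
        }
    )
in
    SalesAgg
```

The design point is that the grain of the group-by keys determines which runtime queries the cached table can answer; anything at or above that grain can be resolved from the aggregation, and anything below it falls through to the DirectQuery detail table.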
Conclusion

For me, all the features mentioned above will be a bit of a game changer in the right scenario. I've run into scaling problems before, and re-working the logic at the warehouse level to maintain both granular and aggregated datasets certainly added some overhead. While the functionality obviously doesn't work with Live Connection into SSAS cubes, it opens up options for using Power BI to do some of the activities which historically would only have been done in SSAS. As Christian Wade also alluded to at PASS, Power BI will soon become a superset of SSAS, so we may well be doing less and less with SSAS and more with Power BI in this area as time goes by.

Power BI Dataflows – Bringing self-service ETL to the Business User

Announced at the recent PASS Summit, Power BI Dataflows has now gone into public preview. Previously, any ETL that users applied via Power Query within Power BI Desktop was only applied to their own dataset. With Dataflows, ETL processes are now treated as a first-class citizen: this update provides centralised, self-service data prep as part of the Power BI product, so that business users can expose cleaned and schematised entities across the business in a similar way to a Data Warehouse. No longer will the same lightweight ETL processes be repeated by multiple users for the same dataset, often in different ways. Traditionally, self-service BI has been limited to analytic models and reports/dashboards, but with Dataflows that is changing as self-service shifts down a peg. In this blog, I'll take a look at the new functionality and give my thoughts. Before we begin, it's worth noting that this is an entirely different piece of functionality to the similarly named Data Flows in Data Factory, which is in private preview. You can tell the difference by the capitalisation of the word "Flow" for Data Factory. I'm not sure what the marketing department would say about this, but it is slightly confusing to have both released as news at the same time.

What is a Dataflow?

A dataflow is a collection of entities (which are similar to tables) that are created and managed from within the Power BI Service (powerbi.com). Dataflows are stored under the Common Data Model (CDM) in an Azure Data Lake Storage (Gen2) resource as files and folders – no longer as the Tabular Model we are used to with datasets. Dataflows are used to ingest, cleanse, transform, enrich and schematise/model data, in a similar way to the ETL used to create a Data Warehouse. Once a dataflow is created, you can use Power BI Desktop to create datasets based on its entities in the usual manner. While dataflows can be created and edited by both Pro and Premium users, a larger set of functionality is only available with Premium. This includes creating computed entities (such as calculated aggregations), linking to entities in other dataflows, and incremental refresh.

Creating a Dataflow

To create a dataflow, navigate to an App Workspace on the Power BI Service (note that this cannot be My Workspace, for obvious reasons, and it cannot be done in Power BI Desktop). You'll notice that alongside Dashboards, Reports, Workbooks and Datasets there is a new artefact labelled Dataflows (Preview). If you have Premium capacity, this will need to be enabled via the Admin Portal Capacity Settings. Click + Create and select Dataflow. Each dataflow has only one owner, and only the owner can edit it. To create a dataflow, you can either define new entities – which involves connecting to a data source and then mapping to the Common Data Model entities or defining custom entities – OR link to entities in other existing dataflows (see below).

If you are creating new entities, you are provided with a set of familiar data source connectors (see below). This set is not as complete as the one in Power BI Desktop, but I expect more connectors to appear in time. I suspect there is also more work to do here, because some of the connectors documented below – Excel, for instance – ask for a file path or URL without giving any functionality to browse for it, something we have come to expect in Power BI Desktop.

Once you have loaded your data, you will then be presented with a fairly familiar query editor pane (see below).
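Since each entity is ultimately just a Power Query (M) query like the ones you write in Desktop, here is a minimal, hypothetical sketch of what a Customer entity might look like behind the editor. The source URL and column names are invented for illustration.

```
// Hypothetical sketch of a "Customer" entity in a dataflow.
// The URL and column names are assumptions for illustration only.
let
    Source = Csv.Document(
        Web.Contents("https://example.com/exports/customers.csv"),
        [Delimiter = ",", Encoding = 65001]  // 65001 = UTF-8
    ),
    PromotedHeaders = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    // Schematise: give each column an explicit type
    TypedColumns = Table.TransformColumnTypes(
        PromotedHeaders,
        {{"CustomerId", Int64.Type}, {"CustomerName", type text}, {"Country", type text}}
    ),
    // Light cleansing: remove rows with no business key
    Cleaned = Table.SelectRows(TypedColumns, each [CustomerId] <> null)
in
    Cleaned
```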
Back in the editor, the first thing you will notice is that the options to manipulate and transform your data are a subset of those offered by Power Query in Power BI Desktop. Again, I suspect this will mature as the functionality approaches GA, but for the time being it gives reasonable scope for the column/row transformations you need to create entities. You are also given a "map to standard" option, which allows you to map your data onto the Common Data Model (CDM) – essentially a set of generic templates for entities such as Account, Product, Customer, etc. While I can see how this could be useful for an average user, I think it is still better to create your own entities based around your business's own definitions of an Account, Product, Customer, etc. These are then stored as custom entities within the CDM. Once the entities are transformed and schematised the way you want them, you save them to the Power BI Service with a name and description. Under the hood, they are then written to a data lake as a series of files along with a JSON schema.

Dataflows are then available in Power BI Desktop via Get Data (see below). I believe you will need the November 2018 release or later to see this option. The resulting dataset is then treated in the same way as everything else you would have seen previously within Power BI Desktop. When selected, you should be able to see all of the workspaces within your tenant that you have access to which contain dataflows, and within each dataflow you can see each entity. That's really it – they've kept it simple on purpose and abstracted away what's going on under the hood; a rough sketch of the kind of query this generates is shown below.
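For what it's worth, the query that Get Data produces is itself just M over the dataflows connector. Below is a rough sketch: the PowerBI.Dataflows function is real, but the GUIDs and entity name are placeholders, and the exact navigation field names in the generated query may differ slightly.

```
// Rough sketch of consuming a dataflow entity from Power BI Desktop.
// GUIDs and the entity name are placeholders; the navigation fields are
// recalled from the generated query and may differ slightly in practice.
let
    Source = PowerBI.Dataflows(null),
    // Navigate: workspace -> dataflow -> entity
    Workspace = Source{[workspaceId = "<workspace-guid>"]}[Data],
    Dataflow = Workspace{[dataflowId = "<dataflow-guid>"]}[Data],
    Customer = Dataflow{[entity = "Customer"]}[Data]
in
    Customer
```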
Wait, but what does this mean for the classic Enterprise Data Warehouse?

From a business perspective, I certainly think this blurs the lines slightly between using traditional ETL processes to create a DW and going down this route, more so at smaller scales. Dataflows would be hard pressed to replace an enterprise-level solution, but there is definitely a use case somewhere around departmental level, instead of creating a data mart, and they allow users to leverage their data faster than waiting on IT to build out a solution. It really depends on the complexity and maturity of the platform being developed, and its purpose. The fact that it sits on ADLS Gen2 is a positive, as it allows other applications to access the data at a raw level rather than relying on a specific connector. It also allows users of products such as PowerApps to connect to a cleaned CDM – another plus. On the other side, it's worth bearing in mind that you will need the Premium version to facilitate incremental refresh, which the majority of data warehouse solutions require. The lowest cost of Power BI Premium is a single P1 node, which at the time of writing costs £3,766/month, or roughly £45k per annum – not small by any standards, and worth taking into consideration. Another consideration is that traditional DWHs deal with concepts such as SCD Type II and the creation of history, which Dataflows will not be able to facilitate. You may also have noticed there was no mention of DAX or measures: these still need to be defined at the usual level within Power BI Desktop, which is not ideal for sharing KPIs across users. Lastly, there is currently no data lineage, although this has been rumoured to be on the roadmap.

Conclusion

While this feature will be a game changer for a number of end users and projects, I think it will join the set of questions such as "Do I need SSAS, or can I use Power BI for my models?" – I expect to see it appear at next year's conferences! The answer will always be: it depends! As we have seen with some aspects of Power BI, this new functionality also opens businesses up to governance issues, with users potentially creating their own version of the truth rather than something that has been thought out and modelled by someone with the right skillset, or taken from an authoritative source. This could cause contention between departments. As noted above, I expect the Power BI team to continue maturing the functionality of the product – but for the time being, it certainly opens up another avenue to draw people into the toolset and provides another level of functionality for business and users alike.

RIP PerformancePoint Planning

It's nearly a week since the announcement that shook the (PPS) world! It's been a bit difficult to report on; generally the Adatis blogs try to offer solutions to problems we have encountered out in the real world. Now, I could say something crass here about the real world and the decision makers involved... but that would be childish, right? If I were to offer up my feelings, they wouldn't be that far from Alan Whitehouse's excellent post on the subject. If I had an ounce of class about me, they would be much more aligned with Adrian's poignant discussion opener, the one with the sharp-witted title, but alas...

We've spent the best part of the week speaking to customers, partners and Microsoft about what to do next. The timing was choice – would you believe, we actually had three new PerformancePoint Planning phases kicking off this week; according to my project plan, I should be setting up Kerberos as we speak. [There is always a positive, right?]

Some customers are carrying on regardless. They either:

- already have Planning deployments and are too far invested and dependent to back out at this stage, or
- have a short-term view (that's not a criticism) and need a "quick" fix with a low TCO to get them through some initial grief. (Typically these customers are going through rapid organisational change, or form part of a recent acquisition, and require short, sharp solutions to help them see the wood from the trees during the transition.)

Other customers, with longer-term views, feel that the product – or, more importantly, the suitably skilled resource pool – will drain away far quicker than the lifespan of the much-touted Microsoft product support. I have to agree. Fact: Adatis will not be employing or training any more PerformancePoint Planning consultants, and I doubt many other consulting firms will either. It's those customers with the longer-term view that are currently in limbo – they are experiencing pain, they need pain relief; what should they do – wait and see what Office 14/15 offers? (There is talk of some planning functionality appearing in future Office versions – what truth there is in that, who knows?)

The Dynamics customers could wait for the resurrection of Forecaster. I have it on good authority that Forecaster will be developed to be closer, in terms of flexibility, to PPS Planning. I had originally heard the opposite view, that Forecaster would be replaced with a cut-down version of PPS Planning. Either way, I'm sure some of the PPS Planning code base will be utilised, which could end rumours of PPS Planning being 'given' to the community under some form of community/open-source arrangement. An arrangement that is, in my opinion, a non-starter anyway: "Hey, Mr FD, we've got this great open-source budgeting and forecasting product we think you should implement!" – yeah, right!

Another rumour (and mixed message) is that Service Pack 3 will contain some of the requested features that were earmarked for version 2 (after all, the code has already been written, right?). This rumour was actually started by Guy Weismantel in his announcement video. However, the information I have since received clearly states that Service Pack 3 will contain stability and bug fixes only – so which is it to be? It's unlikely for a service pack to contain new features, but it's not unheard of; anyone remember the original release of Reporting Services? That arrived as part of a service pack for SQL Server 2000.
The burning question I cannot get answered is: have Microsoft actually stepped out of the BPM market for good? We are told that Excel, SharePoint and SQL Server provide BPM – I can't see, without Planning, how they can. Short of hard-coded values, renewed SharePoint/Excel hell, another vendor, or a bespoke planning solution, businesses can't set plans, and this has further-reaching implications: Planning's demise effectively shelves the scorecard/KPI functionality of the M&A toolset too! It will be interesting to see the new Monitoring & Analytics marketing – will they still demo strategy maps and scorecards, or will they now focus on decomposition trees and heat maps? Monitoring & Analytics may, in practice, just become Analytics. I would have thought the cost of continuing to develop the product (even if it were a lemon, which Planning certainly wasn't) is far less than the potential loss of revenue Microsoft will now face – not only from the loss of confidence of its customers (who are going to think twice about investing in any Microsoft product now, let alone a v1), but, perhaps more significantly, from the doors it opens to its competitors, who can offer a complete BI/BPM stack. Planning was the foot in the customer's door for BI – once you put Planning in, the customer had already bought the full BI stack, and in most cases our customers were wowed by what they could now achieve. I suspect Cognos and SAP are still partying now!

PerformancePoint SP2 - Planning Fixes and a mini-feature

Jeremy has already announced the release of PerformancePoint Server SP2, and it's great to see that the PPS dev team hit their target release date! I've spent a little commute time this morning checking out the documentation. Admittedly, I've initially focused on the Planning component, and there are no great surprises (Tim has already told you about the new bits), but I have spotted what could arguably be described as a mini-feature surrounding form validation that I'm sure will come in useful.

As you would expect, previously released hotfixes have been packaged up into this service pack:

- 954710 Description of the PerformancePoint Server 2007 hotfix package: July 1, 2008
- 955432 Description of the PerformancePoint Server 2007 hotfix package: July 14, 2008
- 955751 Description of the PerformancePoint Server 2007 hotfix package: July 28, 2008
- 956553 Description of the PerformancePoint Server 2007 hotfix package: August 21, 2008

Plus fixes to issues not previously addressed:

Excel Add-in related

- You locally save and close a form in PerformancePoint Add-in for Excel. When you reopen the form, you are prompted to update it, even though you expect no prompt because the form is already up to date.
- In PerformancePoint Add-in for Excel, you open an offline form assignment, add a link to an external Excel worksheet in a cell, and submit the changes to the PerformancePoint Planning Server database. However, when you reopen the assignment, the link that you added is not retained.
- After you install PerformancePoint Server 2007 Service Pack 1, you create a page filter in PerformancePoint Add-in for Excel. A user does not have permission to the default member of the page filter but does have permission to other leaf members. When the user opens a report that uses this page filter, they receive the following error message: "Cannot render the <MatrixName> matrix. The server returned the following error: The <CubeName> cube either does not exist or has not been processed." In the release version of PerformancePoint Server 2007, by contrast, the next member that the user has access to is automatically selected for use in the page filter.
- You define data validation in an Excel worksheet. However, you can still submit a form in PerformancePoint Add-in for Excel even if data in the form is not validated.
- You have a matrix based on a large and complex model in PerformancePoint Add-in for Excel. You open the Select Filters dialog box to change a page filter for this matrix. When you click the Value column of the filter, the dialog box that displays the dimension members takes a long time to appear.

Business rules related

- After you migrate an application in PerformancePoint Server 2007 from one server to another, the order of user-defined business rules and system business rules in models is not preserved.
- You cannot use the datamember function in the ALLOCATE statement and in the TRANSFER statement.
- Consider the following scenario: you create an automatic rule that uses the MdxQuery or Native MdxQuery implementation in Planning Business Modeler, then submit changes to the source data that the rule uses from an assignment form. The submission causes the model to be reprocessed. Because reprocessing the model causes rules in the automatic rule set to be executed, you expect the target data of the automatic rule to reflect the change from the form submission; however, after the model is reprocessed, it does not.
- Rule expressions of system business rules use dimension member names instead of dimension member labels in PerformancePoint Server 2007.

Planning Business Modeler related

- You have a model that contains many form templates and assignments. When you try to change objects in the model in Planning Business Modeler, Planning Business Modeler crashes.
- You create a member property of the Date data type in a dimension in PerformancePoint Server 2007 and specify the Set value to Null option. When you retrieve the value of this member property, you obtain 1899-12-31T00:00:00 rather than the blank value you expect.
- You cannot schedule recurring jobs with a frequency of less than an hour.
- When a user updates a business rule in Planning Business Modeler, the PerformancePoint Server 2007 audit log file logs the user ID of the user that created the rule, when you would expect it to log the user ID of the user that updated the rule.
- Consider the following scenario: you create a dimension that has no hierarchy in a localised version of PerformancePoint Server 2007, then either run the bsp_DI_CreateHierarchyLabelTableForDimension stored procedure to create a label-based hierarchy table for the dimension, or perform the Prepare the Staging DB operation in the PerformancePoint Planning Data Migration Tool. In this scenario, you receive the following error message: "A problem was encountered while attempting to connect to, or Execute BSP on, the specified Database. For more information regarding this error please review the Application Event Log on the SQL Server for any "MSSQLSERVER ERRORS" and\or Please check that all parameters in the UI are correct and try again."

New PerformancePoint Contoso Demo - Released

Amidst my write-up of the first day of the Microsoft BI Conference, I mentioned that a new planning demo suite was imminent and that I would post more information about the demos soon. Well, as it has now been officially released (27th October), I can spill the beans... Taken directly from the PPS Planning Forum announcement, the demo..

"..consists of Planning and Financial Consolidation demo. It shows how the fictitious Contoso Group uses Microsoft Office PerformancePoint Server for planning, statutory consolidation and data analysis."

Well, I'm proud to announce that Adatis, in the shape of my colleague Jeremy Kashel, designed and built the PerformancePoint Planning element of the suite. The PerformancePoint Financial Consolidation element was conceived and developed by our friends at Solitwork of Denmark.

The demo can be downloaded from here:
http://www.microsoft.com/downloads/details.aspx?FamilyId=00B97AC5-8B69-4F4D-AA0C-ACBFBFB9B48E&displaylang=en
...and is part of the next 'All Up BI VPC' (Version 7). Great work guys!

PerformancePoint Server 2007 Forums - Released

The PerformancePoint Server forums are now part of the official TechNet Forums and no longer limited to the Microsoft Connect beta forums. For those that didn't see the relatively active beta forums on Connect, Microsoft separated the product into two groups, and this is reflected in the official versions too:

Planning: http://forums.microsoft.com/TechNet/ShowForum.aspx?ForumID=1871&SiteID=17
Monitoring and Analytics: http://forums.microsoft.com/TechNet/ShowForum.aspx?ForumID=1872&SiteID=17

At the time of writing, the Planning forum was empty and the M&A forum had just the single topic. It will be interesting to see how the activity picks up through the launch period and into the New Year.

TechEd Announcement: Microsoft purchase Dundas..

..Well, not quite. Microsoft have not actually bought Dundas, but they have bought the code base for the charting, gauge, calendar, map and barcode controls. This is excellent news – I've used Dundas charting in various guises for a number of years and have been well impressed with the depth and breadth of the suite. The plan is to integrate the charting functionality into Reporting Services 2008 (Katmai). Adding this functionality to Reporting Services is going to be a great step forward, and I can't wait to start developing some presentation-ready reports. It should make the sales pitch easier too!