
Adatis BI Blogs

Understanding DAX through the Power of Evaluation Context

If you really want to understand the behaviour of your DAX and be able to write more complex DAX, you need to understand Evaluation Context. The two really do go hand in hand. I have had many people ask me to explain what Evaluation Context is, which is why this blog post was put together. To get the most out of this read and be sure the below is for you, I have constructed a list of questions that you should be able to answer after reading this blog post:

What is Evaluation Context?
What is the difference between Filter Context and Row Context?
What elements are considered in the Filter Context?
What is Initial Filter Context (IFC) and how does it differ from Filter Context?
How do we read the Initial Filter Context?
Does Evaluation Context apply before the DAX formula evaluates?
Are Grand Totals/Sub Totals treated differently with Initial Filter Context?
What is Context Transition?
How is Context Transition applied?

What is Evaluation Context?

Evaluation Context is the environment in which a DAX formula is evaluated. So, when we write a DAX formula it gets evaluated within an environment, and the result it returns is directly dependent on that Evaluation Context. This means the result returned can be different depending on the context, even if the DAX formula used is the same.

Same DAX formula, different result… huh?

If this is your first read on Evaluation Context, you may be wondering why it is possible to get different results when using the same DAX formula. Let’s make sure this is clear before moving on. Take a look at the below measure:

Total Sales = SUM(Sales[ExtendedAmount])

When reading this we interpret it as: “Show me the Total Sales Amount”. But when we add this measure to either of the below matrices, we get a breakdown by Country and by Category. How is this possible? We never specified anything in the DAX formula in relation to Country or Category, right? All we asked for was the “Total Sales Amount”. So, how do we get different values on each row?

It is important to understand why this is happening in order to really understand DAX. The reason this is occurring is the Evaluation Context: the environment in which a DAX formula is evaluated.

What types of Evaluation Context exist?

We have two types of Evaluation Context:

Filter Context
Row Context

Filter Context

Filter Context refers to the filtering that is applied to the tables within the data model. If the tables are joined, the filters propagate from the “one” side of the relationship to the “many” side of the relationship. The Filter Context is applied by:

Rows in a Visual
Columns in a Visual
Slicers in a Report
Filters in a Report
Other on-screen visuals acting as filters in a Report
The CALCULATE Function

The DAX formula evaluated within the Filter Context is only evaluated once all filters have been applied to the data model. The deeper you go into Power BI, the more you will hear the term Filter Context. An additional term to be aware of is Initial Filter Context (IFC). This is the same as Filter Context, with the only difference being that it does not include the CALCULATE function.

Filter Context in Action

Okay, so we “know” what Filter Context is from what we have read so far! But I believe we can do better, and the best way to do better is by looking at Filter Context in action.
Below you will find four cases of Filter Context being applied, which once understood will be sure to boost your confidence in understanding this type of Evaluation Context. One thing worth noting: for each case below we will identify the Initial Filter Context, meaning we are not considering the CALCULATE Function. The objective is to interpret all the elements on the report that make up the Initial Filter Context.

1. Filter Context in Action: Row

Looking at the below cell highlighted in green within the Matrix, let’s understand how this value is evaluated. We must ask ourselves: what is the Initial Filter Context for this particular cell?

At the start of my Power BI journey, I initially used the below matrix to assist me in identifying the Initial Filter Context for a particular cell. I quickly grew out of this but found it useful in triggering the right thinking. So, the Initial Filter Context for the highlighted cell is only coming from the Rows in the Matrix, from Territory[Country] = United Kingdom, as the below matrix displays:

We now know how to read the Initial Filter Context of a particular cell. Slowly you will start to mentally visualise how the Initial Filter Context is applied to the underlying model. To make more sense of this now, let’s walk through the steps taken to propagate the Initial Filter Context:

All rows are filtered in the Territory Table (Dimension) to display rows for ‘United Kingdom’.
Those filtered records are propagated down the relationship from the ‘one’ to the ‘many’ side.
The Sales Table (Fact) is left with only those filtered records.

It is very important to understand that the above three steps always take place for each individual cell, and only once they do does the DAX formula evaluate.

2. Filter Context in Action: Row & Slicer

We must ask ourselves once again: “What is the Initial Filter Context for this particular cell?”

To read the Initial Filter Context for the highlighted cell above in green, this time it goes beyond the Rows in the Matrix. Use the below Matrix to identify the Initial Filter Context:

The Initial Filter Context is propagated through the below steps. Once again, these steps always take place for each cell individually, and only once they do does the DAX formula evaluate:

All rows are filtered in the Territory Table (Dimension) to display rows for ‘United Kingdom’ and in the Calendar Table (Dimension) to display rows for ‘February’.
Those filtered records are propagated down the relationship from the ‘one’ to the ‘many’ side.
The Sales Table (Fact) is left with only those filtered records.

3. Filter Context in Action: Slicer

Let’s start off with the most important question: “What is the Initial Filter Context for this particular cell?”

The purpose of this example is to understand how Grand Totals and Sub Totals work with the Initial Filter Context. For the above example, the IFC is not being filtered by an individual row of Territory[Country]. In fact, the Grand Total is un-filtered by Territory[Country], but it does get filtered by a Slicer in the report. It’s important to understand that the IFC for the highlighted cell is only the Slicer containing Calendar[MonthName], and not the Rows in the Matrix with Territory[Country].

As always, the Initial Filter Context propagates as the below steps explain:

All rows are filtered in the Calendar Table (Dimension) to display rows for ‘February’.
Those filtered records are propagated down the relationship from the ‘one’ to the ‘many’ side.
The Sales Table (Fact) is left with only those filtered records.
4. Filter Context in Action: Rows, Column, Slicer & Filter

For the last time we need to ask the all-famous question: “What is the Initial Filter Context for this particular cell?”

The Initial Filter Context for the above highlighted cell is coming from multiple elements, which are defined in the matrix below:

Even though much more is happening in this example compared to the previous ones, the propagation of the Initial Filter Context still works through the three steps, resulting in the below:

Tip: Filter Icon

A recent feature in Power BI called ‘Filter Icon’ is a great way to identify the Initial Filter Context of a visual. Through a simple click, you can identify:

Slicers affecting the visual
Filters affecting the visual
Other on-screen visuals affecting the visual

But not the Rows and Columns of the Matrix itself. Regardless, it is a very useful feature which you can use to identify the IFC.

Summary of Filter Context

What is the difference between Evaluation Context and Filter Context?
Answer: Filter Context is a type of Evaluation Context.

What is applied as part of the Filter Context?
Answer: Rows, Columns, Slicers, Filters, On-Screen Visuals, the CALCULATE Function.

What is Initial Filter Context (IFC) and how does it differ from Filter Context?
Answer: Initial Filter Context (IFC) does not consider the CALCULATE Function.

How do we read the Initial Filter Context?
Answer: “The IFC for this cell is Table[Column] = Value”.

Does Evaluation Context apply before the DAX formula evaluates?
Answer: Yes.

Are Grand Totals/Sub Totals treated differently with Initial Filter Context?
Answer: No, they are not an exception. Remember, always ‘read’ the cell.

Does the Filter Icon display the IFC?
Answer: Partially, it displays everything apart from filters coming from Rows/Columns.

Row Context

Row Context is a special iterating mechanism which takes into consideration the row it is currently on at the time the DAX formula is evaluated. When thinking of Row Context, imagine each row in a table being a house. The Row Context will be the postman. The evaluation of the DAX formula is the delivery being made. The postman (Row Context) must visit each house (Row) in order to complete a delivery (Evaluation). This is how I initially started thinking of Row Context. Row Context is applied when we use: Calculated Columns, X-Functions and the FILTER() function.

Row Context: Calculated Column

To make more sense of Row Context, let’s start off with Calculated Columns, which apply Row Context. Take a look at the below Calculated Column:

Margin £ = Sales[SalesExcVAT] – Sales[SalesCost]

From our current understanding of Row Context and looking at the above DAX formula, the first question that might come to mind is “how does it know which row to start on?” Well, the answer is simple. Row Context within a Calculated Column always starts with the first row and finishes on the last row. It iterates through the entire table the Calculated Column has been placed in, starting with the first row and evaluating the expression, then moving on to the second row and evaluating the expression, until it repeats the iteration for all records in the table.

Looking at the below screenshot we can see that the Calculated Column ‘Margin £’ has produced a value for each row. As this Calculated Column has been placed in the Sales Table, this means the Row Context is applied against this table.
The Sales Table has a total of 55,327 rows, therefore the Row Context (postman) will visit each of the 55,327 records (houses) and evaluate the expression (make the delivery).

Row Context: X-Functions

We have spoken about Row Context and how this type of Evaluation Context is applied within Calculated Columns. Now let’s move on to another way Row Context is applied: X-Functions such as SUMX, RANKX, COUNTX, etc.

Before we jump into an example, let me highlight this now. The analogy used above for Calculated Columns, with the postman (Row Context) having to visit each and every house (record) in order to complete a delivery (evaluation), is the same with X-Functions. There are two differences I am aware of.

Firstly, the X-Function requires us to specify the table it will iterate through; therefore, we need to map the houses for the postman. Secondly, the Calculated Column gets stored in your actual model, whilst X-Functions are measures that are not physically stored and are only calculated at usage. If you are wondering whether to use a Calculated Column or a Measure and don’t know which to choose, always remember that memory is your biggest asset when working with Power BI. For a more in-depth understanding, please check out the sources supplied below.

Margin £ SUMX = SUMX(Sales, Sales[SalesExcVAT] – Sales[SalesCost])

Take a look at the DAX formula above. The first thing worth noting is that we declared the ‘Sales’ table. Therefore, we pinpoint each row (house) that needs to be visited. Looking at the table above, the DAX formula will visit the first record and evaluate the expression:

Record 1 = 29.99 – 11.2163 = 18.7737

Then it will move to the second record and repeat the process:

Record 2 = 28.99 – 10.8423 = 18.1477

Then it will move to the third record and repeat the process:

Record 3 = 28.99 – 10.8423 = 18.1477

Until it repeats the same process for each record.

Summary of Row Context

When is Row Context applied?
Answer: Calculated Columns, X-Functions and the FILTER Function.

From which row does Row Context begin?
Answer: In a Calculated Column, the first record of the table the column is placed in; for X-Functions, the first record of the table specified.

What is the core difference between Row Context and Filter Context?
Answer: Row Context works on a single row at a time and is aware of each individual row, whilst Filter Context refers to the filters applied when a DAX formula is evaluated.

Why can’t I use an aggregator such as SUM within Row Context?
Answer: Row Context does not apply Filter Context by default; it must be enabled.

What is Context Transition?
Answer: The process of transforming the Row Context into Filter Context, so that filters propagate down from the ‘one’ side to the ‘many’ side of the related tables.

Row Context does not consider Filter Context

A very important part of Row Context is understanding that it does not automatically apply Filter Context. When I first started to write DAX, I remember using a SUM within a Calculated Column. The values in the column were clearly incorrect. Since then I have seen many experience the same outcome at least once, until figuring out that a measure does the trick. But this is not a trick, and we should look to understand why this is happening.

To better understand Context Transition, let’s work through an example. Below is the model which we will be working with, which is a simplified version of the one from earlier examples:

If we add the below DAX formula as a Calculated Column within the Customers table, what do you think we would get back as a result?

Total Sales Exc. VAT Calculated Column = SUM(Sales[Sales Exc. VAT])
Here is the result:

All rows are repeating the value 5435. This value is the total value of ‘Sales Excluding VAT’ that exists in the entire Sales table. The above result proves that Row Context does not automatically apply Filter Context.

This occurs because the Row Context iterates over (visits) every record in the Customer table and asks, “show me the Total Sales Amount”, however no Filter Context exists. This means no filters are applied to the Sales table, at all! To ensure we understand this behaviour, let’s take it step by step:

Go to Record 1 in the Customer Table: What is the Total Sales Amount for Record 1? The Sales Table has no filters, therefore return all sales.
Go to Record 2 in the Customer Table: What is the Total Sales Amount for Record 2? The Sales Table has no filters, therefore return all sales.
Go to Record 3 in the Customer Table: What is the Total Sales Amount for Record 3? The Sales Table has no filters, therefore return all sales.

The all-important question here is: how do we apply Filter Context in the Row Context? This is done through Context Transition.

Context Transition is outside the scope of this read, however we will scratch the surface of this concept. I would highly recommend you check out the sources below for a deeper dive into Context Transition.

How do we apply Filter Context to Row Context?

Row Context does not consider Filter Context. This is so important to remember that it is worth repeating. To convert Row Context to Filter Context we must use the power of the CALCULATE Function! The CALCULATE Function with no filter parameters has the job of applying Context Transition.

Look at the same DAX formula from above, only this time it is wrapped within the CALCULATE Function.

Total Sales Exc. VAT (Calculated Column) = CALCULATE(SUM(Sales[Sales Exc. VAT]))

As you can see below, we get the correct result:

The CALCULATE Function enables Filter Context; therefore, it converts the Row Context into Filter Context. The power of the CALCULATE Function! In more detail, the filters on the Customer table propagate down the relationship to the Sales Table for each record.

Summary

If someone truly wants to learn Power BI, DAX is something that should not be avoided, and for this reason Evaluation Context should be high on the to-learn list. This article should have armed you with the knowledge needed to interpret DAX behaviour, as well as easily digest more in-depth articles on Evaluation Context. @Data_Laz

Sources:
Matt Allington (2018). Supercharge Power BI. Holy Macro! Books.
Marco Russo & Alberto Ferrari (2015). The Definitive Guide to DAX. Microsoft Press.

Injecting Databricks DataFrame into a Power BI Push Dataset

Using Python and the Power BI REST API

Now, why would you want to do that? There are some scenarios where this approach may be beneficial. To get the full picture and establish the landscape, let’s first answer two questions:

Why a Databricks DataFrame?

Recently Databricks became an integral part of the Modern Data Warehouse approach when aiming for the Azure cloud. Its wide usage in data transformation begs for a richer variety of data destinations. The usual and most widely used persistence is the file store (lake, blob, etc.). It’s fine with large volumes of data, but then we have to go a long way before reaching the data presentation (additional storage layers, SQL, Tabular data model, etc.).

What if we want to instantly update a Power BI report directly from Databricks? It could be a small dataset feeding a dashboard, for example. It could be business data or data-related metrics that we want to see in (near) real-time. Or we want to monitor the data transformation process continuously in a Databricks streaming scenario.

Why a Push Dataset?

We can do real-time streaming in Power BI by using Streaming or PubNub streaming datasets, which is fine, but let’s see some of the advantages of using a Push dataset:

You can build a report on top of a Push dataset
You can pin report visuals to a dashboard, and they will update in real-time on each data push
Data is stored permanently in Power BI
You don’t need a data gateway in place

For a more detailed comparison of all the real-time streaming methods, please check here. If you think that all this makes sense, then continue further.

Implementation

Since we’ll be utilising the Power BI REST API, there are some limitations that we need to be aware of upfront. You can see them here.

In order to be able to call the Power BI API you need to register an application and set proper permissions. Follow the steps here. App registration details below:

After creating the app registration, you should grant permissions in the Azure portal:

Without granting these permissions from the Azure portal you won’t be able to get an authorisation token and use the REST API.

Now that we’re all set up, let’s go straight to the point. My initial intention was to show you how to build the whole thing step-by-step in this post. But then it became complicated and I decided that an abstraction was needed here to keep things simple. That’s why I wrapped up all the “boring” stuff in a Python class called pbiDatasetAPI, which you can find on GitHub here. The comments in the code should be enough to understand the logic. The methods of this class will take care of:

Authentication
HTTP requests
HTTP request JSON body generation (data and metadata)
Data type conversion (Spark to EDM)
Wrapping all the Power BI REST API operations that we need to perform in Python functions

In order to start using it you should import a notebook (using the above-mentioned URL) in your Databricks Workspace:

Now that the class notebook is imported you can create a new Python notebook in the same folder to test how it’s working. Let’s call it “Inject DataFrame into Power BI Push Dataset”. First, we’ll execute our class notebook:

%run "./pbiDatasetAPI"

Next, we’ll need a DataFrame with data that will be pushed to the Power BI Push dataset.
We can use some of the sample datasets which come with Databricks (in this case Amazon reviews):

dfInject = spark.read.parquet('dbfs:/databricks-datasets/amazon/test4K')
dfInject = dfInject.select("brand", "img", "price", "rating", "review", "time").limit(200)

We take 200 rows, which is just enough. In the next command we’ll create some variables and instantiate the pbiDatasetAPI class:

# Initialise variables
username = "<USER_EMAIL>"
password = "<USER_PASSWORD>"
application_id = "********-****-****-****-************"  # Power BI application ID
groupId = None  # Set to None if not using Power BI groups
datasetName = "InjectedDataset"  # The name of the Power BI dataset
tableNames = ["AmazonReviews"]  # Table name or list of table names
dataFrames = [dfInject]  # DataFrame name or list of DataFrame names

# Create a pbiDatasetAPI class instance
pbi = pbiDatasetAPI(username, password, application_id)

You should set your username and password, and the application ID obtained in the previous steps. Optionally provide also a group ID or set it to None if you’re not using groups on powerbi.com. The tableNames and dataFrames variables are lists, because we may want to insert multiple DataFrames in multiple tables. In our case it’s one DataFrame to one table.

Let’s create a dataset with one table in Power BI:

# Create the dataset and the table in PBI
pbi.executePBIOperation("postdataset", groupId = groupId, datasetName = datasetName, tableNames = tableNames, dataFrames = dataFrames, reCreateIfExists = True)

The next step is to post all the DataFrame’s rows to the Push dataset.

# Get the datasetKey for the dataset (by name)
datasetKey = pbi.executePBIOperation("getdatasetbyname", groupId = groupId, datasetName = datasetName)[0]["id"]

# Insert the contents of the DataFrame in the PBI dataset table
pbi.executePBIOperation("postrows", groupId = groupId, datasetKey = datasetKey, tableNames = tableNames, dataFrames = dataFrames)

This is where the magic happens. One important note here is that the dataset key is always unique.
This does not apply to the dataset’s name, which means that we can have multiple datasets with the same name.

You can see the newly created dataset on powerbi.com:

You can create a report on top of this dataset and pin visuals to a dashboard (tiles will be updated automatically upon data change).

Now, let’s try to manipulate the metadata a bit – change the data type of the “rating” column in the DataFrame from double to string and update the Push dataset accordingly:

# Change the data type of the column 'rating' from double to string
dfInjectModified = dfInject.withColumn("rating", dfInject.rating.cast("string"))

# Get the datasetKey for the dataset (by name)
datasetKey = pbi.executePBIOperation("getdatasetbyname", groupId = groupId, datasetName = datasetName)[0]["id"]

# Update the metadata of the Power BI table
pbi.executePBIOperation("puttable", groupId = groupId, datasetKey = datasetKey, tableNames = tableNames, dataFrames = [dfInjectModified])

The only thing remaining now is to try and delete all the rows from the table in the dataset (it’s all or nothing – we can’t delete only some of the rows) and then delete the entire dataset:

# Get the datasetKey for the dataset (by name)
datasetKey = pbi.executePBIOperation("getdatasetbyname", groupId = groupId, datasetName = datasetName)[0]["id"]

# Delete all rows from the table(s)
pbi.executePBIOperation("deleterows", groupId = groupId, datasetKey = datasetKey, tableNames = tableNames)

# Get the datasetKey for the dataset (by name)
datasetKey = pbi.executePBIOperation("getdatasetbyname", groupId = groupId, datasetName = datasetName)[0]["id"]

# Delete the dataset
pbi.executePBIOperation("deletedatasetbyid", groupId = groupId, datasetKey = datasetKey)

You can import all the code above into a notebook using this URL.

If you closely examine the Power BI REST API documentation here, you’ll find that the pbiDatasetAPI class is not completely finished yet. There’s more that needs to be done, like:

Create measures with DAX
Create table relationships
Set cross-filtering behaviour
etc.

I intend to update the class in the future so that all the features will be available. Check GitHub now and then for updates.
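For anyone curious what happens under the hood, here is a minimal, illustrative sketch of the two core REST calls involved – acquiring a token and pushing rows. It assumes the Azure AD password grant (which the class’s username/password parameters suggest) and the public Power BI REST API “add rows” endpoint; the function names and structure below are hypothetical and do not mirror the internals of pbiDatasetAPI.

import requests

def get_token(username, password, application_id):
    # OAuth2 resource-owner password grant against Azure AD for the Power BI resource
    url = "https://login.microsoftonline.com/common/oauth2/token"
    payload = {
        "grant_type": "password",
        "resource": "https://analysis.windows.net/powerbi/api",
        "client_id": application_id,
        "username": username,
        "password": password,
    }
    response = requests.post(url, data=payload)
    response.raise_for_status()
    return response.json()["access_token"]

def push_rows(token, dataset_key, table_name, rows, group_id=None):
    # 'Add rows' call of the Push dataset API; workspace (group) datasets live under /groups/{id}
    base = "https://api.powerbi.com/v1.0/myorg"
    if group_id:
        base += f"/groups/{group_id}"
    url = f"{base}/datasets/{dataset_key}/tables/{table_name}/rows"
    headers = {"Authorization": f"Bearer {token}"}
    # The API expects a JSON body of the form {"rows": [{...}, {...}]}
    response = requests.post(url, headers=headers, json={"rows": rows})
    response.raise_for_status()

# Hypothetical usage: push a handful of DataFrame rows as plain dictionaries
# token = get_token(username, password, application_id)
# rows = [r.asDict() for r in dfInject.limit(5).collect()]
# push_rows(token, datasetKey, "AmazonReviews", rows)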

The Azure Modern Data Warehouse: Unparalleled Performance

Today, 80% of organisations adopt cloud-first strategies to scale, reduce costs, capitalise on new capabilities including advanced analytics and AI, and to remain competitive. Cloud adoption is accelerating, and data exploitation is a key driver. The central tenet of this enterprise-wide exploitation of data is the cloud-based Modern Data Warehouse. Legacy on-premises or appliance-based EDWs, which were once the strategic asset of only the largest enterprise organisations, not only limit performance and flexibility, but are also harder to set up and scale securely. The Modern Data Warehouse fundamentally changes the landscape for data analytics by making analytics available to everyone across organisations of all sizes, not only the largest enterprises.

A modern data analytics platform enables you to bring together huge volumes of relational and non-relational, or structured and unstructured, data into a single repository: the highly scalable and cost-effective Azure Data Lake. This provides access across the enterprise, from basic information consumption through to innovation-led data science. Big data processing capability for data preparation, such as transformation and cleansing, can be performed, as well as infusing Machine Learning and AI, with the results made readily available for analysis through visual tools like Power BI.

Azure provides unparalleled performance at incredible value. To further support this claim, Microsoft have just announced the GigaOm TPC-DS benchmark results, which further cement Azure SQL Data Warehouse as a leader in price/performance for both decision support benchmarks, having already attained best price/performance status for the TPC-H benchmark, announced back in Feb 2019.

Charts: $ per Query per Hour – TPC-DS @ 30TB and TPC-H @ 30TB

Azure SQL Data Warehouse (Azure SQL DW) has always delivered on performance when compared to alternatives, and now GigaOm has found analytics on Azure is 12x faster and 73% cheaper when compared using the TPC-DS benchmark. Azure SQL DW has established itself as the alternative to on-premises data warehouse platforms and a leader in Cloud Analytics.

Adatis have been at the cutting edge of Cloud Analytics solutions since the introduction of the Azure SQL Data Warehouse PaaS offering back in 2015. In the last 18 months we have noticed the profile of Azure SQL DW rise sharply, with Azure SQL DW outperforming and taking over workloads from its closest competitors. We specialise in all aspects of delivering the value of Cloud Analytics, AI and the Modern Data Warehouse, from strategic business-value-led engagements through technical design and implementation to ongoing continuous improvement via fully managed DataOps practices.

Adatis utilise Microsoft Azure technologies, in conjunction with first-party Spark-based services, that securely integrate to provide enterprise-grade, cloud-scale analytics and insight for all, and partner deeply with Microsoft to enable data-driven transformation for our customers. We work to develop a modern data analytics strategy and ensure it is implemented and supported in the best way possible, aligning to your specific company’s goals and overriding strategy.

If you want to find out how Adatis can help you make sense of your data and learn more about the Modern Data Warehouse, please join us in London on 6th June for an exclusive workshop.
We will guide you through the Microsoft landscape and showcase how we can help you get more value from your data, wherever you are on your data transformation journey. Register here for our "Put your Data to Work" in-person event, to be held in London on 6th June 2019.

Additional Resources

Microsoft Azure Big Data and Analytics
Information on Azure SQL Data Warehouse
Try Azure SQL Data Warehouse for Free

#SimplyUnmatched, #analytics, #Azure, #AzureSQLDW, #MSPowerBI

ArcGIS Maps for Power BI – Further capabilities and other information

ArcGIS Maps for Power BI is also useful when used in conjunction with other visualizations, through Interactions. When selecting any location feature on the map, the filters will be communicated to the other visualizations, which will in turn apply the same filtering.

“The perfect map blends art and science into an effective tool of visual communication.”

In the previous blog posts (which can be accessed from the following links: https://bit.ly/2PnIgR3 and https://bit.ly/2PnegWk), we have gone through the ArcGIS Maps for Power BI tool and had an overview of its use. To wrap up this series, we will build on the description of most of the features in the previous posts, available when editing the ArcGIS Maps for Power BI. The remaining features currently available in this tool will be briefly described to wrap up the features section:

Pins – These provide the option of adding locations that are important to you on the map. For instance, if you are analyzing supply chain and distribution, the warehouse(s) might be useful as pin(s), since these do not change no matter the changing dataset and need to be considered in most analyses.

Drive time – Once pins are set up, the drive time tool can be used to highlight distances within a specified radius or drive time. For instance, if you are analyzing orders that are being delivered late in a particular area, the pin and drive time tools could help in analyzing the possibility of opening a new store in a nearby area (in comparison to the coverage it will offer to the client sites that are having the orders delivered late).

Reference layers – These can be either demographic layers or ArcGIS reference layers. Out-of-the-box demographic layers are US based and 10 of them are provided freely, giving data such as average household incomes. ArcGIS reference layers contain data published by other authoritative partners, customers and users of ArcGIS Online, such as weather-service-related data. Such data can be loaded and used in Power BI, without the need to spend time collecting regional data, cleaning and modelling it. For instance, an insurance company using Power BI which needs to analyze the impact of a storm on the insurance policies within an area might utilize ArcGIS storm surge information alongside its own dataset for prediction and effect on policies.

Infographics – This tool provides information such as population and age that could also be used alongside the data. So, taking the same insurance use case above, this information could help identify how many people (whether customers or not) are likely to be affected. This might hence be a good indication for sales and marketing departments.

Different selection options

There are different selection methods available when using ArcGIS Maps for Power BI. These can be accessed through the map visual beneath the zoom in and out buttons (+ and -), as highlighted below:

Select individual location – This enables the selection of only a single individual location at a time. Once another location is clicked, the previously selected one will be automatically unselected and the last selection retained, with the reporting data refreshed and re-filtered accordingly.

Select multiple locations – This enables the selection of multiple locations by drawing a box over the locations that need to be selected simultaneously.
This provides a good way of comparing the data of various locations.

Select locations using reference layer – This enables selection by area on a reference layer, or within a defined radius/driving time from a specific location. This selection method will only be available if the map contains a reference layer or a drive-time area layer. If the map contains both a reference layer and a drive-time area layer, then both options will be available to choose from. The reference layer selection tool will select all map features within the reference polygon – for instance, if a state in the US is selected, all location-based features located in that state on the data layer will be selected on the map. If Drive time areas is chosen, data features will be selected within the defined area.

Find similar – This latest added feature helps the user to swiftly pinpoint any locations that have similar attributes to the selected locations on the map (through the data layer only). For instance, in a map showing GDP per country, Find similar would allow the easy identification of those countries/locations that have the same or closest GDP. The numerical field (like GDP in this case) that is to be used as a basis for comparable values needs to be dragged into the Find Similar data well. Then, after choosing the location to have the GDP compared to – in this case the US – the GDP for the other locations will be compared and a rank for the most similar GDPs will be shown. In this case, Canada has been ranked as the closest for that year when filtering on the North and South American regions, with the Dominican Republic being marked as the 10th most similar.

Basic vs Plus subscriber/ArcGIS account

By default, the ArcGIS Maps for Power BI tool is available in the basic version. This brings with it some limitations when compared to the same tool accessed through the Plus subscription. In order to log in or sign up for a Plus subscription, or even connect to ArcGIS directly, the yellow circle with the plus sign on the ArcGIS Map for Power BI can be used – as per below:

But what exactly is different between the Basic and Plus subscription? Following is a summarized comparison of the main features that differ between the two, Free (incorporated within Power BI at no cost) vs Plus subscription (monthly fee per user – currently $5 pp):

Demographics: US demographics only vs global and richer demographics including Age, Behaviours, Healthcare etc.
Maps: public maps vs validated, at-the-ready data, organized and collected from reliable and trustworthy sources.
Map and visualize locations: 1.5K features (geocoded locations) per map and 100K features per month vs 5K features per map and 1 million features per month.
Spatial analysis: carry out spatial analysis including heatmaps, drive-time etc. vs the same spatial analysis as the basic version.
Basemaps: 4 basic basemaps vs 4 basic basemaps + 8 others including satellite imagery, oceans and terrain.

One point that remains evident from the above is the limit that still exists on the features that can be included in a map. Hence, if one is trying to geocode locations or street addresses, in Basic only the first 1,500 will be considered in one map, whilst 5,000 will be taken into account for the Plus subscription.
This cap mainly exists to retain a good user experience for both visualization and performance, yet it might be something to keep in mind, especially if targeting large datasets. However, this is not a restriction for Power BI overall, since it offers other mapping tools that might be used with large datasets, like for instance Mapbox. The different mapping tools present different capabilities and features that might appeal and apply to different users depending on their use case, scope, complexity and the functionality sought.

Performance enhancement and other improvements

In one of the latest updates (September 2018), the performance of ArcGIS Maps for Power BI has been improved drastically, and is cited to be about 50% better. This is mostly visible when navigating through the map or through interactions, and its benefits make live presentations of such maps less nerve-wracking. The improvement does not only apply to new maps or new reports, but will also be experienced by existing reports that are already using an ArcGIS Map for Power BI.

In December 2018, other improvements were released for ArcGIS Maps for Power BI. In addition to the Find Similar feature that has already been referred to above, other features were enhanced covering both the Basic and Plus subscriptions. One other improvement worth highlighting is the boundary data in the Location Type (which was described in the previous blog in this series). In this latest release, more boundary data has been made available to help increase accuracy in data analysis. Likewise, Plus subscribers have access to a lot more curated infographic data for various demographics.

Concluding remarks

All of this shows that GIS and mapping tools are being given the required attention by Power BI and Microsoft in general. With their increased popularity in recent years, they have now become essential in decision making and business, and through proper data visualization the data can become much more meaningful. This ties up with what Dr. Keith Harries suggests, that “The perfect map blends art and science into an effective tool of visual communication.” And thanks to Power BI mapping tools this is being enabled!

Power BI Composite Models and Aggregations

So, something which feels like it’s gone under the radar a bit is the addition of composite models and aggregations in Power BI. These have been around for a couple of months now in Preview, but I’ve not seen much buzz around the features. After recently attending PASS Summit and seeing Christian Wade’s 1 trillion row demo, I thought it was worth blogging about – especially as I expect parts of this functionality to be expanded to Azure AS in the near future. If you think about it, the majority of BI queries are done at some form of aggregation level, but users will still want the detail, and so these features essentially unlock the ability to report against giant datasets at both an aggregated and granular level at the same time – something that was not physically possible beforehand. Power BI is now able to work with multiple petabytes of data with ease, over trillions of rows, with pretty much instant response times – so this is a bit of a game changer.

Composite Models

Previously with Power BI, you were restricted to either DirectQuery or Import for a single data source. With composite models that has all changed. A report can now include data connections from more than one connection in any combination, so you are able to connect to both DirectQuery AND Import in the same model and combine the data. This then opens up a world of possibilities that were not possible before. With this functionality, we also get the ability to create many-to-many relationships, but I won’t be going into detail on this in this blog. As part of this also comes some functionality called Storage Mode, which unlocks table-level storage features. This is the next part of the jigsaw.

Storage Mode

This allows you to specify whether tables are either imported or queried on the fly. There is now a third option, “Dual”, which acts as either a cached or not cached table depending on the context of the query that's submitted at runtime. Setting the correct storage mode has many advantages, such as:

Better query performance (no need to push real-time queries to the dataset for relatively static tables such as those used for filters).
Ability to use larger datasets (interactive analysis is better for datasets you don't want to cache into memory, such as tables with significant data volume).
Data refresh optimisation (only cache data that's necessary, meaning quicker refresh times).
Reducing latency in real-time data (no need to re-read those static tables each time).

To access the storage mode, you either select the table in the report pane and click Properties, or navigate to the table in the new Modelling View (this requires exposing it in Options, as it’s in preview). Changing a table’s storage mode will then prompt us to change the related tables to Dual. This propagation logic is designed to help with models that contain many tables!

Modelling View

Before I get to aggregations, I also need to introduce the new modelling view. This will be changing to fall in line with what you may be familiar with in SSAS / Azure AS. It feels like this is one of the first steps to integrate SSAS further into the Power BI spectrum and facilitate enterprise datasets. With the updated modelling view, you now have the ability to create additional diagrams, thus allowing you to break out particularly complex models by subject area rather than trying to join it all up within one model.
For instance, if you are working with 5 tables in your model, this isn't a particularly big deal – now multiply that by 10 and all of a sudden it becomes a pain to manage, and this new feature will help alleviate that. The feature also allows you to multi-select objects in one go and update properties through a new side pane. An example of this new feature can be seen below.

Aggregations

Finally, the most exciting feature of the lot – aggregations. Without the new modelling view, without the composite models, without the new storage modes – this would not be possible. Aggregations allow us to create a newly defined table within the model, with all fields of the original table it is created from aggregated/grouped in some way or another. These aggregations/groupings include count, group by, max, min, sum, and count rows. The aggregation table can either be set to Import mode (with or without incremental refresh) or to DirectQuery and optimised by using columnstore indexes. This then unlocks faster query performance over huge datasets via the cache at the aggregated level, using a fraction of the resource compared to the detailed level. You then have the flexibility to set the aggregate table to Import while leaving the granular table on DirectQuery, which will speed up performance when navigating the report. The aggregate table can also be hidden so that users will not even be aware of the implementation. You can also have more than one aggregation table – potentially one for a particular set of reports and then a lower-grain aggregate table for analysts – and this can be done through the precedence option. The tool will then query the aggregation table with the highest precedence level first to see if it can resolve the runtime query before moving down the levels. Aggregations don’t just work for measures such as SUM or COUNT. They also work for more complex measures – all of the components of a measure are always folded down to the sum, min, max, count level, and those sub-queries then work out whether they can hit the cache or not. It’s also worth mentioning that you can check whether it has hit the cache via the DAX editor.

Conclusion

For me, all the features mentioned above will be a bit of a game changer in the right scenario. I’ve run into scaling problems before, and re-working the logic at the warehouse level for both granular and aggregated datasets certainly added some overhead. While the functionality obviously doesn't work with Live Connection into SSAS cubes, this new functionality opens up options for using Power BI to do some of the activities which historically would have only been done in SSAS. As Christian Wade also alluded to at PASS, Power BI will soon become a superset of SSAS, and so we may well be doing less and less with SSAS and more with Power BI in this area as time goes by.

Power BI Dataflows – Bringing self-service ETL to the Business User

Announced at the recent PASS Summit, Power BI Dataflows has now gone into public preview. Previously, any ETL that users applied via Power Query within Power BI Desktop was only applied to their own dataset. With Dataflows, ETL processes are now classified as a first-class citizen, and this update provides centralised self-service data prep as part of the Power BI product, so that business users can expose cleaned/schematised entities across the business in a similar way to a Data Warehouse. No longer will the same lightweight ETL processes be occurring across multiple users for the same dataset, and often in different ways. Traditionally, self-service BI is limited to analytic models and reports/dashboards, but with Dataflows that is changing as it shifts down a peg. In this blog, I’ll take a look at the new functionality and give my thoughts.

Before we begin, it’s worth noting that this is an entirely different piece of functionality to the similarly named Data Flows in Data Factory, which is in private preview. You can tell the difference by the capitalisation of the word “Flow” for Data Factory. I’m not sure what the marketing department would say about this, but it is slightly confusing having both released as news at the same time.

What is a Dataflow?

A dataflow is a collection of entities (which are similar to tables) that are created and managed from within the Power BI Service (powerbi.com). This is then stored under the Common Data Model (CDM) in an Azure Data Lake Storage (Gen2) resource as files and folders – no longer as a Tabular Model that we are used to with datasets. Dataflows are used to ingest, cleanse, transform, enrich and schematize/model the data, in a similar way to the ETL used to create a Data Warehouse. Once created, you can use Power BI Desktop to create datasets based off of these entities in the usual manner. While dataflows can be created and edited by both Pro and Premium users, there is a larger set of functionality only available via Premium. This includes creating computed entities (calculation aggregates), linking to existing entities in other dataflows, and incremental refreshes.

Creating a Dataflow

To go about creating a dataflow, navigate to an App Workspace on the Power BI Service (note that this cannot be My Workspace for obvious reasons; also note this cannot be done in Power BI Desktop). You’ll notice alongside Dashboards, Reports, Workbooks, and Datasets that there is a new artefact labelled Dataflows (Preview). If you have Premium capacity, this will need to be enabled via the Admin Portal Capacity Settings. Click + Create and select Dataflow. Each dataflow has only one owner and only the owner can edit it. To create a dataflow, you can either define new entities, which involves connecting to a data source and then mapping to the Common Data Model entities / defining custom entities, OR link to other existing dataflows (see below).

If you are creating new entities, you are provided with a set of familiar data source connectors (see below). This set is not as complete as that available through Power BI Desktop, but I expect more to appear in time. I suspect there is also more work to do here because some of the connectors documented below, e.g. Excel, ask for a File Path or URL without giving any functionality to go and get this through a lookup – something we have come to expect as part of Power BI Desktop.

Once you have loaded your data, you will then be provided with a fairly familiar query editor pane (see below).
The first thing you will notice is that the options to manipulate and transform your data are limited compared to those of Power Query in Power BI Desktop. Again, I suspect this will mature when the functionality hits GA, but for the time being it gives some reasonable functionality to do what you need to do to create entities in the form of column/row transformations. You are also provided with an option to “map to standard”, which allows you to join into the Common Data Model (CDM) – essentially some generic templates for entities such as Account/Product/Customer/etc. While I can see how this could be useful for an average user, I think it would still be better to create your own based around your business’s own definition of an Account/Product/Customer/etc. These are then stored as custom entities within the CDM. Once the entities are transformed/schematised how you want them, you save them to the Power BI Service with a name/description. Under the hood, this is then written to a Data Lake as a series of files along with a JSON schema.

Dataflows are then available via Power BI Desktop using Get Data (see below). I believe you will need the November 2018 release or later to see this option. The dataset is then treated in a similar way to everything else you would have seen previously within Power BI Desktop. When selected, you should be able to see all of the workspaces within your BI tenant that you have access to which contain dataflows. Within each dataflow you can see each entity. That’s really it – they’ve kept it simple on purpose and abstracted away from what’s going on under the hood.

Wait, but what does this mean for the classic Enterprise Data Warehouse?

I certainly think from a business perspective it will blur the lines slightly between using traditional ETL processes to create a DW or going down this route, more so at smaller scales. It would be hard pressed to replace an Enterprise-level solution, but there is definitely a use case somewhere around departmental level instead of creating a data mart. It will allow users to leverage their data faster than waiting on IT to build out a solution. It really depends on the complexity and maturity of the platform they are intending to develop and its purpose. The fact it sits on ADLS Gen2 is a positive and allows other applications to access the data at a raw level rather than relying on a specific connector. It also allows users of products such as PowerApps to connect into a cleaned CDM – another plus. On the other side, it’s worth bearing in mind that you will need the Premium version to facilitate incremental refresh, which is required by the majority of data warehouse solutions. The lowest cost of Power BI Premium is that of a single P1 node, which at the time of writing costs £3,766/month or £45k/pa – not small by any standards. This is worth taking into consideration. The other fact is that traditional DWHs deal with concepts such as SCD Type II and creating history, which Dataflows will not be able to facilitate. You may also have noticed there was no mention of DAX or measures, which still need to be done at the usual level within Power BI Desktop – not ideal for sharing KPIs across users. Lastly, there is currently no data lineage, although this has been rumoured to be part of the roadmap.

Conclusion

While this feature will be a game changer for a number of end users and projects, I think it will join the set of questions such as “Do I need SSAS or can I use Power BI for my models?” I expect to see this appear at next year’s conferences!
The answer will always be – it depends! As we have seen with some aspects of Power BI, this new functionality also opens up businesses to governance issues, with users potentially creating their own version of the truth rather than something which has been thought out and modelled by someone with a certain skillset, or an authoritative source. This could cause contention between departments. As noted above, I expect the Power BI team to continue to mature the functionality of the product – but for the time being, it certainly opens up another avenue to draw people into the toolset and provide another level of functionality for businesses and users alike.

ArcGIS Maps for Power BI – Editing and available features

This blog post builds on the previous post that covered an introduction to ArcGIS Maps for Power BI, which can be re-accessed through the following link: https://bit.ly/2PnIgR3

We will start by looking at the editing features that are available within ArcGIS Maps for Power BI. Jack Dangermond said, “the application of GIS is limited only by the imagination of those who use it.” So, let’s try to get a glimpse of the range of features, with the hope of being inspired and applying them in an imaginative way.

Editing an ArcGIS map in Power BI

Clicking Edit on the map visualization allows you to change various map contents through different tabs and tools – as shown below. Following is some information on each of the tools:

Basemap – This lets you choose from four different basic basemaps (unless you are a Plus subscriber or have an ArcGIS account, which then gives you access to more basemap options). Through such changes, one can ensure that the focus remains on the importance of the data rather than on visual distractions within the map. If different basemaps use different projections, there might be the need to save, close and re-open the report (or change between report pages) for the map features to reflect the new projection. In such cases, the Attention icon will probably show, stating that the basemap will be updated the next time the visual loads.

Location Type – This allows the locations to show as either points or boundaries, as can be seen below. It can be specified whether the data pertains to one country or many countries. For boundaries, the location type can be changed from a dropdown list including countries, districts, postcode areas, regions, etc. (the available list changes depending on the country selected). The method used to match locations can vary between:

Closest match – This can be used when the exact spelling cannot be guaranteed (with errors like Frence), if there might be a mismatch with the spelling in Esri's data services, or if an exact match search will not add all your data to the map (even if this might not be the best solution for data accuracy). This might obviously lead to inaccuracy and wrong assumptions.

Exact match – This is the most accurate method and is to be used when boundaries are defined by codes or abbreviations, or when one is sure that the dataset spelling is correct and matches Esri's data services.

The Location Type might be a good place to start if a “Failed to locate [x] features” error is encountered, as follows:

This might be a result of incorrect settings in the location type, like adding a list of states in Brazil without setting the Locations are in one country setting, choosing Brazil and changing Location Type to States. The error might also be a result of an improper match between a value in the dataset and Esri’s data services (or a spelling mistake).

Map themes – This allows a change in the style of the map, and one can choose from location only, heatmaps or clustering (the last two are only available for point layers, that is, when you select Points in the Location Type). Through the clustering option, one can group individual location points into larger circular clusters that fall within a cluster radius – giving a high-level view and then the ability to drill down into each region.
If heatmaps are chosen, any values in the Size or Color field wells will be ignored and tooltips will not be available. If numerical data exists in the Size and/or Color field wells, there will be a further 3 map themes available – Size (showing bubbles of different sizes based on the measure’s value), Color, and Size & Color:

Symbol style – This provides the option to make changes in appearance that are immediately reflected in the map, including symbol shape and size, colours (and defining the colour ramp), transparency level and classification types. The available options in the Symbol Style screen depend on the map theme selected and the nature of the data being analyzed. For example, for heat maps, formatting options like Transparency and Area of Influence will be available. Classification types provide different options to classify the data and define the way ArcGIS is going to create the clusters from your data, namely:

Natural Breaks (sometimes referred to as Jenks) – Data values are clustered into naturally occurring data categories. Class breaks take place where a gap between clusters exists. This method is suitable for unevenly distributed data. For instance, streets are clustered based on their length (short vs long), or cities based on their size (small vs large).

Equal Intervals – Value ranges are equal in size in every category. The entire range of data values (max - min) is divided equally into the number of categories that have been chosen. One specifies the number of classes, and ArcGIS Maps for Power BI will automatically determine how to divide the data.

Quantile – Data is classified into a certain number of categories with an equal number of units in every category. This might be suitable for evenly distributed data.

Standard Deviation – Shows how much a feature’s attribute value varies from the mean, whether above or below it. This might be a good way of pointing out the extremes.

Manual Breaks – Enables one to define one's own classes, class breaks and ranges.

As one can see, such features and functions not only offer different options to the creator to give context to the geographic data visualization, but will also help the end user understand and adapt to the data better and enhance comparison. I hope the information in this post was useful in some way or another. In the next blog, we will continue building further on this. Stay mapped!

Visualising Network Data in Power BI with Python Integration and NetworkX

The long-awaited Python integration in Power BI, added earlier this month, welcomes the opportunity for further customised reporting by exploiting the vast range of Python visualisation libraries. Among my favourites of these Python visualisation/data science libraries is NetworkX, a powerful package designed to manipulate and study the structure and dynamics of complex networks. While NetworkX excels most at applying graph theory algorithms on network graphs in excess of 100 million edges, it also provides the capability to visualise these networks efficiently and, in my opinion, more easily than the equivalent packages in R.

In this article, I will explain how to visualise network data in Power BI utilising the new Python integration and the NetworkX Python library.

Getting Started

To begin experimenting with NetworkX and Python in Power BI, there are several pre-requisites:

Enable Python integration in the preview settings by going to File –> Options and Settings –> Options –> Preview features and enabling Python support.
Ensure Python is installed and fully up-to-date.
Install the following Python libraries: NetworkX, NumPy, pandas, Matplotlib.

Loading Data

The data I used was created to demonstrate this task in Power BI, but there are many real-world network datasets to experiment with, provided by the Stanford Network Analysis Project. This small dummy dataset represents a co-purchasing network of books.

The data I loaded into Power BI consisted of two separate CSVs. One, Books.csv, consisted of metadata pertaining to the top 40 bestselling books according to Wikipedia and their assigned IDs. The other, Relationship.csv, was an edgelist of the book IDs, which is a popular method for storing/delivering network data. The graph I wanted to create was an undirected, unweighted graph which I wanted to be able to cross-filter accurately. Because of this, I duplicated this edgelist and reversed the columns so the ToNodeId and FromNodeId were swapped. Adding this new edge list onto the end of the original edgelist created a dataset which can be filtered on both columns later down the line (see the sketch after the drawing code below). For directed graphs, this step is unnecessary and can be ignored.

Once loaded into Power BI, I duplicated the Books table to create the following relationship diagram, as it isn’t possible to replicate the relationship between FromNodeId to Book ID and ToNodeId to Book ID with only one Books table. From here I can build my network graph.

Building the Network Graph

Finally, we can begin the Python integration!

Select the Python visual from the visualizations pane and drag this onto the dashboard.
Drag the Book Title columns of both Books and Books2 into Values.

Power BI will create a data frame from these values. This can be seen in the top 4 lines in the Python script editor. The following Python code (also shown above) will create and draw a simple undirected and unweighted network graph with node labels from the data frame Power BI generated:

import networkx as nx
import matplotlib.pyplot as plt

G = nx.from_pandas_edgelist(dataset, source="Book Title", target="Book Title.1")
nx.draw(G, with_labels = True)
plt.show()
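If you want to experiment with the same approach outside Power BI (where the dataset data frame is not provided for you), here is a minimal, self-contained sketch. The tiny edge list is made up purely for illustration, and it includes the reversed-edge duplication described in the Loading Data section:

import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt

# Made-up edge list standing in for Relationship.csv (pairs of co-purchased book IDs)
edges = pd.DataFrame({"FromNodeId": [1, 1, 2, 3], "ToNodeId": [2, 3, 4, 4]})

# Duplicate the edge list with the columns swapped; the graph itself is undirected either way,
# but in Power BI this is what lets the report cross-filter on either column
reversed_edges = edges.rename(columns={"FromNodeId": "ToNodeId", "ToNodeId": "FromNodeId"})
edges_both_ways = pd.concat([edges, reversed_edges], ignore_index=True)

# Build and draw the undirected, unweighted graph with node labels
G = nx.from_pandas_edgelist(edges_both_ways, source="FromNodeId", target="ToNodeId")
nx.draw(G, with_labels=True)
plt.show()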
As an alternative, you can position the nodes in a circle or randomly by editing the line

nx.draw(G, with_labels = True)

to

nx.draw(G, with_labels = True, pos=nx.circular_layout(G))

or

nx.draw(G, with_labels = True, pos=nx.random_layout(G)) **

(A consolidated sketch of this approach follows at the end of this article.) This will produce the network graph below. You are also able to cross-filter the network graph by selecting rows in the table on the right-hand side.

Conclusion

Python visuals are simple to produce and, although the visual itself isn’t interactive, they will update with data refreshes and cross-filtering, much like the R integration added 3 years ago. The introduction of Python in Power BI has opened doors for visualisation with libraries such as NetworkX, to visualise all kinds of networks, from airline connection flights and co-purchasing networks through to social network analysis.
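To tie the snippets above together, below is a minimal sketch of the full Python-visual script with a node-count-based layout fallback. The "Book Title" column names match the article; the stand-in dataset DataFrame, the 1000-node threshold and the node colour are illustrative assumptions rather than part of the original post (in Power BI itself, the dataset variable is supplied by the visual).

# Minimal sketch of the Python-visual script described above. In Power BI, `dataset`
# is provided automatically; a tiny stand-in DataFrame is built here so the sketch
# runs on its own outside Power BI.
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt

dataset = pd.DataFrame({
    "Book Title": ["A Tale of Two Cities", "The Hobbit", "The Hobbit"],
    "Book Title.1": ["The Hobbit", "The Little Prince", "A Tale of Two Cities"],
})

# Build the undirected graph from the two book-title columns.
G = nx.from_pandas_edgelist(dataset, source="Book Title", target="Book Title.1")

# Assumption: fall back to the cheaper circular layout once the graph gets large,
# since the default Fruchterman-Reingold (spring) layout struggles beyond ~1000 nodes.
pos = nx.circular_layout(G) if G.number_of_nodes() > 1000 else nx.spring_layout(G)

nx.draw(G, pos=pos, with_labels=True, node_color="lightsteelblue")
plt.show()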

Iconography in Design

In this blog I will go about discussing the importance of design, in particular looking at how icons can help add the finishing touches to a piece of front end development. Whether you’re building a Power BI report, SSRS report, PowerApp, or doing any kind of front end visualisation and design, you’ll know that 50% of the battle is making whatever you’re building jump out of the page. You could have built the most useful report that a user could wish for, but unless it looks good, you’re not going to get any plaudits. Essentially, good design goes a long way to making a success of a piece of development. This is even more important given that the underlying architecture / ETL, for all our good practice and thoughtfulness in this area as developers, will rarely impress or have an impact on an end user. It’s important to get inspiration for design from somewhere, so before going about designing or even building something, take a look on Google, do some research and try to look for similar things to what you’re doing and see what you think looks good, or not so good. You should build up an idea of how you want to design your landing page, report or report headers, etc. within the tool. If you’re in Power BI, take a look at the partner showcase for example. There are some really good examples to give you ideas – like this one! Microsoft apps such as Power BI or PowerApps go a long way in helping us build the best when it comes to data visualisation, but unfortunately they sometimes rely on developers to go outside the box to finish things off.

Icon Finder

Recently I’ve started to use a site called iconfinder.com which has a large pool of useful icons you can use for building out certain corners of apps that the standard MS tooling will not support. The icons are mostly $2 each, but you can get a subscription for £20/month or $10 with a discount code (which you should cancel as soon as you register). Please don’t jump straight into a subscription just because you think you might need it as a resource. Alternatively, save the icon using Chrome and use it as a placeholder until you are happy to push to production. For non-subscribed users, the icon will always come on a background of faint grey lines. This isn’t too bad, as they don’t completely ruin the look and feel of the icon in the development and are good as a placeholder for demos, etc. To get started, just type in your keyword for the type of icon you’re looking for, and then it’s just a case of wading through the results to find the icon that fits the look and feel you’re inspired to build against. Sometimes, you’ll get another bit of inspiration off the back of this which you can use as another keyword to find even more icons. The site also comes with a very handy icon editor tool, essentially Paint Shop Pro on the web. There are lots of these sites out there, but it’s useful having it all integrated in one place at no extra cost. If subscribed, it will load the SVG icon into the editor automatically, which then allows you to edit colours or shapes etc. In my instance, I found a nice % complete icon set which would look good on a white background. Unfortunately, I wanted it on a blue header bar, so needed to change it up slightly to fit the look and feel. No problem, it took less than a minute to modify and download. It’s also worth mentioning that the site does a good job at helping you find a pool of icons which will fit together nicely using the same look and feel, showing you icons from the same icon set.
In one instance, I replaced an icon with one from the same pool even though it wasn’t exactly what I was looking for – because overall it just felt like it fitted together more nicely with the other icons on the screen.

Design in Practice

As mentioned above, doing some research before you build can really help you create a much better finish. For inspiration for a recent PowerApps design I did a quick search for landing pages on Google, and found a few I liked the look of (below). As long as the general elements you are working to are similar, it really doesn’t matter where the inspiration comes from. In these cases, they were in the form of mobile apps. From these images, I was able to identify the key components which made me bookmark them:
- I wanted some kind of non-offensive background, possibly semi-transparent, or with an overlay.
- I wanted a title that stands out on the page, so white on grey or similar.
- I wanted a small section for a blurb for the PowerApp.
- I wanted 2 buttons, and the buttons to stand out.
- I wanted logos in the top corners.
- I wanted a nice look and feel for the colour palette.
From this, I then produced the following landing page. I found the background on picjumbo.com, which turned out to be quite a nice resource for some generic business-style artwork, and then added a blurring filter across the top. This still interferes slightly with the buttons / title so I’m not completely happy, but satisfied enough that it achieves the look and feel I was looking for. For the title, the range of fonts supplied with PowerApps is rather limited so I could go externally for this too, but was happy enough for the time being. The layout also leaves room to shrink the title and add a small blurb if need be. The buttons are made up of a number of icons and fit with the theme for the app. As mentioned above, I also added % complete icons to each page so users were able to understand how far they were along the scoring pages within the app. PowerApps provides sufficient icons for the back/refresh buttons that fit in with the white-on-blue theme, so I didn’t have to go externally for these. These were placed on the page header next to the logo.

Power BI

While this most recent bit of design was focused on PowerApps, I also add small touches to Power BI during report design. For instance, rather than just have a generic button that can push you to a “details” page which has a table for the row-by-row breakdown of some aggregated data - I looked for an icon, edited the colour palette slightly and added this to the report. With recent Power BI functionality, I can make the image act as a button and redirect the user to another page. I’ve also used icons in dashboard design where a single visual didn’t really represent the content of the report to drill into. This can also be a good way to go about adding a bit of flavour to a dashboard to mix things up. In my case, it also meant the drill-down into the report level was less ambiguous, by phrasing the title as a question asking whether that’s what the user wants to do. Design is always subjective of course, but it’s great to use other resources at your disposal to go about building out apps. Depending upon the current estate in which you develop, it also helps them stand out a bit more and add a unique context to the reports/apps within the project. Hopefully this blog has given you a few ideas for your next project!

Adatis Lyon > Marseille Charity Ride

Hi Folks,
You might be aware that Adatis are somewhat into our cycling, but we don’t like to do it for any old reason. Therefore, after the blazing success of the 2016 London > Paris charity ride, we’re doing it again… but going even further! This time we’ll be making our way from Lyon, through Valence and Avignon, right down to the south coast at Marseille. That’s a whopping 340 kilometres! Now, we’re all riding for our own causes from a whole range of different backgrounds. I’ve chosen to help out some very smart, very generous people over at Stichting Suubi. They’re doing some great things in Uganda, providing medical care to local school children. But what’s more, they’re developing tools to harness Azure and Machine Learning to automatically track healing progress, recognise wound degeneration and all sorts! Check out TendoCare.AI for more info! Now, obviously this is where the ask comes in – I’d love it if you could help me make sure the trip is worthwhile by popping a donation in, big or small. In return, not only will I drag my lazy, unfit self along this ridiculous ride, I’ll do it on my huge cargo bike which is utterly and completely unsuitable for such a ride. For some more stats and general updates, we even have a Power BI report of our progress towards charity milestones and training fitness because yes, we are that nerdy. Once again – anything you can do to help out, we’d really appreciate it - https://fundrazr.com/suubifoundation?ref=ab_c7L2T7
Thanks,
Simon

Power BI with Azure Databricks for Dummies (in 15 minutes)

Microsoft Azure Databricks is Microsoft’s Apache Spark-based platform optimised for Azure and thus integration with Power BI. It was released on the 26th of February this year and is still in preview, but in our recent project we decided to give it a go and explore what options such a solution would hold for an enterprise data warehouse solution. To do so we created a cluster for our project with some notebooks containing a PySpark script that does all our ETL. On top of that we decided to use Power BI to feed directly from the Databricks cluster tables to build our dataset and reports. With this blog post I am going to take you through the quick process of setting up the Databricks connection in Power BI Desktop, using the web app and the scheduled refresh. The prerequisites for such a task are:
- Azure Databricks cluster with data tables (Facts, Dimensions, etc.)
- A Databricks access token (not the same as Azure tokens)
- Power BI subscription (I’m working with the Pro version)
- Power BI Desktop
And that’s all – no gateways, no drivers, no batteries needed.

1. Firstly, we need to make sure we have started our cluster, as this takes approximately 5-10 mins. If we have already created our token (non-expiring would be ideal) and saved it somewhere, we just need to open Power BI Desktop and select Get Data from the welcome screen. If you filter by Spark on all the connector options you will notice there are currently three options; we will use the “Spark (Beta)” one and click Connect.

2. The next step is a bit trickier as you need the URL address of your cluster (and from now on we need it to be started). To compose the above-mentioned address, open Databricks, go to Clusters and select the cluster you want to connect to. On the cluster edit page, scroll down and select the JDBC/ODBC tab. On the JDBC/ODBC tab, copy and save the JDBC URL. Construct the JDBC server address that you will use when you set up your Spark cluster connection in Power BI Desktop. Take the JDBC URL that you just copied and saved and do the following:
⦁    Replace jdbc:hive2 with https.
⦁    Remove everything in the path between the port number (443) and sql (sql/protocol…) whilst keeping the boxed text as per the image below.
In our example, the server address would be https://eastus2.azuredatabricks.net:443/sql/protocolv1/o/5441075421675247/0221-211616-tings395 or https://eastus2.azuredatabricks.net:443/sql/protocolv1/o/5441075421675247/boost (if you choose the aliased version). Once you have the address, you need to input it under the Server label in the dialog box of the next setup step. Please note that the Protocol must be HTTP. As for the Data Connectivity mode – both are supported, so it is up to your personal / business preference.

3. The next step will be to input your login details. Please keep in mind this is a bit misleading, as for User name you need to input “token” and for Password – the token string value you have previously saved (or if you forgot to save it, just go and create a new one as I did a couple of times).

4. After you have successfully connected (if the cluster is online you shouldn’t experience any issues) you will be able to select the data tables you need for your report and edit the queries to your liking. It is a ‘bit’ slow when compared to other data connectors, but it is still in beta and hopefully it will be much more responsive in the near future.
5. Upon configuring your dataset, creating the report masterpiece of the day and publishing it online, comes the final touch of the report development - setting up the automatic dataset refresh from the Power BI web service. And here comes the sweetest part – as Power BI and Databricks are integrated and everything is in the cloud, you don’t need gateways, ODBC drivers or anything else; just set up the credentials once again as per the following image. N.B. Whenever your cluster is offline you will see an error that your credentials are not valid. For this to work as expected, always start your Databricks cluster first and make sure it is online before your dataset refresh schedule hits! However, there is a not-so-sweet part as well. I’d even say salty, as the tears I cry whenever I need to fight this bug. Currently, once you deploy your dataset, after a week or two the scheduled refresh starts failing with strange errors like “Bad request” or my favourite “Invalid method name: ‘GetCrossReference’”, for which I have logged a bug with Microsoft. This is manually fixed by opening your Power BI Desktop, clicking Edit Queries and refreshing all the data tables one by one. If you get an error just try again – the Spark connector is a bit buggy and takes a while to fetch the data, but once you have refreshed the dataset and published it, your report will refresh smoothly (assuming your cluster is always online when you are refreshing the dataset) for at least a week. As a conclusion, Power BI with Databricks is a tempting new option for a cloud-only solution which, even in preview, is pretty stable and reliable, so just give it a go and let me know if you experience any bugs or if you discover something new and exciting about this Power BI connection.
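As a footnote to step 2 above, the URL surgery can also be scripted. Below is a hedged Python sketch that derives the Spark server address from a JDBC URL; the sample JDBC string is an assumption of the typical format (it can vary between Databricks versions), while the workspace and cluster IDs are the ones quoted in the post.

# Hedged sketch: turn the JDBC URL from the cluster's JDBC/ODBC tab into the
# https server address that Power BI's Spark connector expects.
jdbc_url = (
    "jdbc:hive2://eastus2.azuredatabricks.net:443/default;"
    "transportMode=http;ssl=1;"
    "httpPath=sql/protocolv1/o/5441075421675247/0221-211616-tings395"
)

# Step 1 from the post: replace jdbc:hive2 with https.
https_url = jdbc_url.replace("jdbc:hive2", "https", 1)

# Step 2: keep host:port, drop everything except the httpPath option.
prefix, _, remainder = https_url.partition("://")
host_and_port, _, option_string = remainder.partition("/")
options = dict(
    part.split("=", 1) for part in option_string.split(";") if "=" in part
)

server_address = f"{prefix}://{host_and_port}/{options['httpPath']}"
print(server_address)
# -> https://eastus2.azuredatabricks.net:443/sql/protocolv1/o/5441075421675247/0221-211616-tings395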

Displaying vector data on maps in Power BI

Something I have enjoyed doing in the past is making maps, especially ones showing how various things move around London. One thing I did was take some open data on usage of London Santander bikes, pass it through the Google Waypoints API and visualise it using Mapbox. Which got me thinking, is it possible to do something similar using Power BI? Microsoft introduced the geography data type with SQL Server 2008, and more mapping providers are making versions of their tools available in Power BI, including Esri and Mapbox. From a business perspective, IoT devices are enabling the collection of greater volumes of geographical data for companies to use in their analyses. Using the GeoJSON files I had previously produced for the project above, I set about importing them into a SQL DB instance in Azure as a geography data type. This involved reading the file into a variable and then iterating over the index position of each individual ride.

-- load the geojson file into a SQL variable
DECLARE @geojson VARCHAR(max) = (SELECT * FROM OPENROWSET (BULK 'C:\path\to\my\file', SINGLE_CLOB) as j)

-- use the index position to iterate over each ride
SELECT geography::STGeomFromText('LINESTRING(' + STRING_AGG(CAST(long + ' ' + lat as varchar(max)), ',') + ')', 4326)
FROM (
    SELECT *
    FROM OPENJSON(@geojson, '$.features[0].geometry.coordinates')
    WITH (
        long varchar(100) '$[0]'
        ,lat varchar(100) '$[1]'
    )
) a

This results in a row per ride and visualises pretty well in SSMS. If you are familiar with the geography of London you can make out the river Thames toward the centre of the image and Regents Park towards the top left. This could be overlaid on a shape file of London or a map from another provider such as Google Maps or Mapbox. However, when you try to load the dataset into Power BI, you find that Power BI does not natively support geography data types. There is an idea you can vote on here to get them supported: https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/12257955-support-sql-server-geometry-geography-data-types-i

After a bit of research it looked like using R within Power BI could be an option. The thinking behind this was that, rather than converting the longitudes and latitudes to a geography data type as above, I would write them to a table with an Id to associate each ride and let the ggmap library in R join the dots. Once you have your data loaded into Power BI, you only need a couple of lines of R code to render the map:

library(ggmap)
qmap("london", zoom=12, maptype="toner-lite") +
  geom_path(aes(x = long, y = lat, group=LineId), size = .7, data = dataset, colour="#FF69B4", lineend = "round")

An important thing to note is that this R library uses the Google Maps API, which could cause some privacy concerns.
Also, before showing this to users, you would want to assign a Google Maps API token in Power BI, as without it you are using a generally available connection, which means your maps will fail to load a lot of the time. The resulting map displays the paths – although the sorting of each row gets mixed up somewhere in Power BI for some paths, hence the horizontal lines. It also turns out that the Power BI service does not currently support the use of certain R libraries which make client-server queries over the internet, meaning you cannot publish this map. So, as a proof of concept, it is possible to display vector paths on a map in Power BI Desktop; however, you will need to wait until Microsoft supports geography data types in Power BI or allows R libraries which call an external service before taking it any further. These links provided the basis for this PoC and would be a good starting point if you wanted to take things further:
https://bertwagner.com/2018/01/16/importing-geojson-earthquake-data-into-sql-server/
https://docs.microsoft.com/en-us/sql/t-sql/spatial-geography/ogc-methods-on-geography-instances?view=sql-server-2017
http://angryanalyticsblog.azurewebsites.net/index.php/2017/02/27/power-bi-routing-visual-with-two-lines-of-r/
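As a side note, the OPENJSON step above can equally be prototyped outside SQL. Below is a hedged Python sketch that reads a GeoJSON file and emits the WKT LINESTRING per ride that geography::STGeomFromText expects; the file name is hypothetical and the features[i].geometry.coordinates structure is assumed from the GeoJSON layout described in the post.

# Hedged sketch: build WKT LINESTRING strings from a GeoJSON file of rides.
import json

with open("rides.geojson") as f:          # hypothetical path to the GeoJSON file
    geojson = json.load(f)

for i, feature in enumerate(geojson["features"]):
    coords = feature["geometry"]["coordinates"]   # assumed list of [long, lat] pairs
    wkt = "LINESTRING(" + ", ".join(f"{lng} {lat}" for lng, lat in coords) + ")"
    # Each string is ready to be passed to geography::STGeomFromText(wkt, 4326).
    print(i, wkt)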

Conditional Formatting based on a separate measure in Power BI

What are we trying to do?

I recently had a request from a client to have a traffic light indicator in a table visualisation in Power BI. Fine, I thought: conditional formatting was released with the November Power BI Desktop update, which will be able to handle that nicely. However, a further requirement emerged, which was that this conditional formatting must be based on the difference between the values of 2 separate measures. For example, if we have a measure for ‘Sales Amount’ and a second measure for ‘Sales Amount Last Month’, we would need a traffic light indicator showing green, yellow and red if the Sales Amount was more than, equal to or less than Sales Amount Last Month respectively. My initial thought for this problem was actually fairly positive; I had worked with conditional formatting in Power BI and KPIs in SSAS before, so I had a plan of attack. However, when I had a quick search online I couldn’t find any posts about this being done before. Luckily my initial feeling proved correct and this is very easy to achieve!

How we did it

I first imported a very simple data set with ‘Date’ and ‘Sales Amount’ columns. Next, I created a measure to calculate the Sales Amount for the previous month:

Sales Amount LM = CALCULATE(SUM(Sales[Sales Amount]), PREVIOUSMONTH(Sales[Date].[Date]))

Now I needed to create a third measure which would act as my traffic light indicator. All this measure needs to do is show a value which can be used to calculate the difference between ‘Sales Amount’ and ‘Sales Amount LM’, which can in turn be used to indicate the difference for a conditional formatting range. This of course can simply be ‘Sales Amount’ minus ‘Sales Amount LM’:

Indicator = SUM(Sales[Sales Amount]) - 'Sales'[Sales Amount LM]

From here it is simply a case of applying the conditional formatting to the ‘Indicator’ measure. In case you haven’t done this before, you first click on the drop-down arrow for the measure you wish to conditionally format and then select ‘Conditional formatting’. Here are the settings I chose for ‘Background color scales’: Here are the settings I chose for ‘Font color scales’: After applying those settings to our ‘Indicator’ measure we arrive at the following for our table visualisation. And that’s it! Of course, this can be used in a number of different ways if you needed a different type of value comparison or a different range for your conditional formatting, but this should help get you started. As always, any comments or thoughts are very welcome.
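For readers who want to sanity-check the comparison logic outside Power BI, here is a hedged pandas sketch of the same idea. It is not the author's DAX, just an illustration; the column names and sample data are assumptions mirroring the simple Date / Sales Amount table described above.

# Hedged pandas sketch: previous month's sales vs. this month's, plus an Indicator
# column whose sign would drive the traffic-light colour in the conditional format.
import pandas as pd

sales = pd.DataFrame({
    "Date": pd.to_datetime(["2017-10-05", "2017-10-20", "2017-11-03", "2017-11-25"]),
    "Sales Amount": [100.0, 150.0, 120.0, 160.0],
})

monthly = (
    sales.set_index("Date")
         .resample("MS")["Sales Amount"]   # aggregate to calendar months
         .sum()
         .to_frame("Sales Amount")
)
monthly["Sales Amount LM"] = monthly["Sales Amount"].shift(1)   # previous month's total
monthly["Indicator"] = monthly["Sales Amount"] - monthly["Sales Amount LM"]

# Positive -> green, zero -> yellow, negative -> red.
print(monthly)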

Data Source Permissions and On-Premises Data Gateway: SQL Server and Analysis Services

In Microsoft’s documentation surrounding the On-Premises Data Gateway, the advice on permissions for the account used to authenticate the Data Source in the Power BI Service can be concerning for most, especially DBAs. In the Analysis Services section of the documentation, the advice is: “The Windows account you enter must have Server Administrator permissions for the instance you are connecting to. If this account’s password is set to expire, users could get a connection error if the password isn’t updated for the data source.” Server Administrator permissions…? What happened to the principle of least privilege? In a practical sense, the On-Premises Data Gateway has to deal with two very different implementations of Analysis Services: Multidimensional and Tabular. Each is set up and configured differently from the other, and the nature of their security models is also different. As a one-size-fits-all approach, it works. As we will soon see, the permissions do not have to be set as Server Admin. The SQL section of the documentation, on the other hand, doesn’t actually specify what permissions are required for the Data Source to be established in the Power BI Service.

Permissions

Exactly what permissions are required for these common data sources, I hear you ask. As data sources are established at a database level, so too are the permissions set.

Data Source: Minimum Permission Level
SQL Server Database: db_datareader
SSAS Tabular Database: Process database and Read
SSAS Multidimensional Database: Full control (Administrator)

The principle of least permissions is now restored, though there are still the curious incidents of Analysis Services data sources requiring permissions in addition to read. I am unsure why; I have my suspicions and have tried to find out. If you know, please leave a comment below!

Embed PowerApps into Power BI Desktop

Microsoft’s January 2018 Power BI Desktop update (blog found here) contains quite a few small visualisation features, but the one that stood out most to me is the ability to import PowerApps as a Custom Visual. My last two blogs (Part 1 and Part 2) demonstrated how to embed a PowerApp into a Power BI Dashboard in the service, but it wasn’t possible to achieve this in Power BI Desktop or a Report. How things have changed within a month! This article illustrates how quick and easy it now is to plug an existing PowerApp into Power BI Desktop.

PowerApps Custom Visual

Like all popular Custom Visuals, you can either Import from file (stored on a local machine) or Import from store. The Microsoft-endorsed Custom Visuals are found in the store, which is why I would generally advise importing visuals from there. Search for “PowerApps” and it will appear as the top result. Click ‘Add’ and the PowerApps Custom Visual is available. NOTE: The PowerApps Custom Visual is currently in Preview, meaning its reliability cannot be 100% guaranteed. Always use Preview tools/features/visuals with caution.

PowerApp Connection

I previously created an ‘SvT Adjusted’ PowerApp, which will be used for this example. Firstly, the visual needs to have ‘data’. It can be an attribute or measure, but it doesn’t matter which when connecting to an existing PowerApp. If you haven’t logged into the PowerApps portal, you may be prompted to do so within Power BI Desktop. Once logged in, you will see the below screen. If you see the ‘not supported’ error message above, do not worry – this is a red herring. Click ‘OK’ and then navigate to the appropriate Environment within the nested PowerApps portal. ‘Adjust Target App’ resides in the ‘Callums Blog’ Environment. Click ‘Choose App’, select the PowerApp and click ‘Add’. That’s it, the PowerApp is now embedded and ready to use in Power BI Desktop. It is also possible to create a new PowerApp within Power BI Desktop, which is demonstrated in this video. The look, feel and general experience is the same as what you see in the PowerApps portal, meaning you do not even need to use a web browser to get started.

PowerApps Refresh

I wonder how long Power BI Desktop takes to refresh when a change is made to the embedded PowerApp? Let’s find out. Before: Principal A’s ‘Target’ value is changed from 850000 to 950000. Unfortunately, nothing happens. Power BI Desktop caches the data sourced from the Azure SQL Database, which is where the PowerApp data is stored. The only ways to view the change are to click the Power BI ‘Refresh’ button or change the context of the visual interaction within a report. What I mean by the latter is that you need to force the Tabular engine to write a new DAX query, which in turn forces a query refresh. Using the ‘Refresh’ button (below) will always be the simplest refresh method. After:

Published Report

The report can now be published to the Power BI Service. The Service has the same refresh issues as described in Power BI Desktop. Manually refreshing the web page (CTRL-F5) is currently the best approach. It is worth noting there is a 15-30 second delay between submitting a PowerApp change and viewing the new number in a visual. This is expected, as there are a few processes (behind the scenes) that must happen first.

Web Content vs. Custom Visual

There are now two ways of adding a PowerApp into the Power BI Service – but which is better?

Web Content
- Advantages
  o Easy to dynamically add a ‘Web Content’ tile to a Dashboard.
  o More flexibility around the link behaviour, e.g. open a custom link or another report in a new tab.
  o The PowerApp sits completely outside of Power BI and does not necessarily need to be seen in a report.
- Disadvantages
  o PowerApp embed code is required and may not be available to all users.
  o Extra layer of management and maintenance.

Custom Visual
- Advantages
  o No increase in the Power BI Desktop data model or file size. The PowerApp is sourced from the cloud and acts as a Live Connection.
  o One-stop shop where both the visuals and the PowerApp reside.
  o Fewer places for an issue to occur. Security can also be resolved, using Power BI Desktop as a test.
  o The PowerApp can be pinned as a visual - to an existing or new Dashboard. This step is far more user friendly and easier to achieve with the Custom Visual.
- Disadvantages
  o No ability to use custom links.

Whilst Web Content does offer a bit more flexibility and interactivity, I would advise using the Custom Visual wherever possible. Having one place where everything is managed makes everyone’s lives far easier. If you have a good business case for adopting Web Content, then of course, please do so.

References

Find other recommended resources below.
o PowerApps Custom Visual Video - http://bit.ly/2Fi3vLY
o Power BI January Update - http://bit.ly/2CSfLVl
o PowerApps - http://bit.ly/2Brjys4
o Flow - http://bit.ly/2CoL2vW
o Common Data Service - http://bit.ly/2CnXXhv

Contact Me

If you have any questions or want to share your experiences with the PowerApps Custom Visual, feel free to leave a comment below. Twitter: @DataVizWhizz

Embed PowerApps into Power BI Dashboards – Part 1

Having read Matt How’s blog (found here) about PowerApps, not only did it get me interested in the technology, I also wondered how well (if at all possible) it could integrate with Power BI. Our friend Google soon told me that it was already possible to embed PowerApps into Power BI, released in the April update. However, apart from this blog by Ankit Saraf, there aren’t many professionals sharing their experiences. In addition, leveraging Direct Query mode in Power BI means we can simulate real-time user input and reporting. To replicate my solution below, you will need an understanding of PowerApps, Azure SQL Database, Flow and the Common Data Service (CDS). The Further Reading section provides some good links to get you up to speed. I have broken the blog into 2 parts:
- Part 1: How Power BI visuals and PowerApps can be used together.
- Part 2: Benefits and Drawbacks of the tools/processes used.

Solution

I picked a typical use case that would link Power BI and PowerApps – Actual vs. Target. The PowerApp will be used for adjusting target values, whilst an Azure SQL Database will contain the original target and actual values. All data and PowerApp interaction will be embedded into a Power BI Dashboard.

Create Sample Tables and Data in Azure SQL Database

Create and populate two tables – dbo.SvT for static actual vs. target data and dbo.SvTAdjusted, which will eventually contain the adjusted target data from the PowerApps form. Note: Azure SQL tables require a Primary Key column to communicate with Flow and consume CDS data.

Create PowerApp

Create an Environment within the PowerApps service, adding two new Connections:
1. Connection to the CDS, using my company Microsoft account. This is where the adjusted budget values reside.
2. Connection to the Azure SQL database, which will become the destination table to store the CDS PowerApp data.
The next step is to import the SQL data from dbo.SvTAdjusted directly into a CDS PowerApp. This automatically creates a user form containing the data. Here is where you can customise the PowerApp, such as making fields read only and configuring look and feel. Publish the App and change the ‘Target’ values to test.

Create Flow trigger

Navigate to https://emea.flow.microsoft.com/en-us/ and log in. Create a new flow, searching for ‘Common Data Service’ as the connector. Select the below and create the Flow. Select the PowerApp CDS Entity (Adjusted Target) as source. Add a new step (Add an Action) and search for ‘SQL Server’. Select SQL Server – Update Row as the destination and map to the dbo.SvTAdjusted table. The column data types between CDS and Azure SQL Database must match when being mapped. Save the Flow.

Create Power BI Report

Create a Power BI Desktop report and connect to the Azure SQL Database. Set up the one-to-one relationship between the tables on ‘PrincipalID’. Create some KPIs and a table to compare dbo.SvT and dbo.SvTAdjusted metrics. In the below example, the ‘Adjusted Budget’ metric will change when we make changes in the CDS PowerApp.

Embed PowerApp into Dashboard

Publish the Power BI Desktop report and pin it as a live page. To embed the PowerApp into the Dashboard, add a Tile and select Web Content. The App ID can be found under Apps in the PowerApps web portal. Simply paste the App ID into [AppID].
<iframe width="98%" height="98%" src="https://web.powerapps.com/webplayer/iframeapp?hideNavBar=true&source=powerbi&screenColor=rgba(165,34,55,1)&appId=/providers/Microsoft.PowerApps/apps/[AppID]"></iframe>

The PowerApp is added as a Dashboard tile. It is now possible to change the ‘Budget’ values. Time to test everything works. Change the values for all three records and refresh the Power BI Dashboard. The values have changed almost instantly!

Further Reading

Check out Part 2 of the blog, where I will be discussing the benefits and drawbacks I have found with using Power BI and PowerApps together. Find other recommended resources below.
o Matt How’s Blog - http://bit.ly/2CpbTYI
o Embed PowerApps into Power BI - http://bit.ly/2ywgsNX
o PowerApps - http://bit.ly/2Brjys4
o Flow - http://bit.ly/2CoL2vW
o Common Data Service - http://bit.ly/2CnXXhv

Contact Me

If you have any questions or want to share your experiences with PowerApps and Power BI, feel free to leave a comment below. All scripts and workbooks are available upon request. Twitter: @DataVizWhizz

Embed PowerApps into Power BI Dashboards – Part 2

Part 2 of this blog focuses on my experiences with PowerApps, Flow and Power BI. Part 1 was more of a demo and ‘How to’ guide, but when I read an article online, I always find known limitations, challenges or workarounds the most interesting takeaways. Without further ado, here are my findings. A summary of both blogs below:
- Part 1: How Power BI visuals and PowerApps can be used together.
- Part 2: Benefits and Drawbacks of the tools/processes used.

Benefits
- Easy to get started. Rolling out PowerApps, Flow and Azure databases into production of course needs careful thought, but for Proof of Concepts, Flow (2,000 runs per month) and PowerApps (for Office 365 users) are free to use. Links to the price breakdowns are provided in the Further Reading section below.
- There are a range of Wizards, Templates and GUIs. All the tools used offer great templates for moving or inputting data, and the fact that barely any code is needed makes it simple for business users. Following a couple of YouTube tutorials on each technology will get people up to speed very quickly.
- Azure technologies provide seamless integration between Microsoft tools. Whilst there are some other well-known, reputable cloud service providers around, using one product is always going to produce a slicker solution. Having fewer configuration steps means less chance of human error.
- Customisable features of PowerApps give the ability to mask, validate and format the PowerApp screens. It also makes the user entry a more pleasant experience, as the forms are more intuitive.

Limitations
- You can only embed PowerApps into a Dashboard – as a Tile. I am not sure if moving PowerApps into a Power BI Report is on the roadmap, but I would be surprised if it was never supported.
- Power BI Dashboards are cached and not entirely real time. You can change the cache settings to 15 minutes, but the best way to ensure your visuals contain the latest PowerApp data is to manually refresh your page in the browser. Reports do update automatically, which makes it even more frustrating.
- Common Data Service (CDS) is a preview Data Connector in Power BI. As a result, you need to either have your environment set as ‘America’ and/or have been given the beta by Microsoft. If I had access to this connector, there would have been no need for the Azure SQL Database or the Flow trigger. Milinda Vitharana’s blog shows how to enable CDS Power BI integration.
- If you wanted to use an on-premise database instead of an Azure database, an additional step is needed. A Data Gateway (link here) must be installed to move the PowerApp data back into the SQL database. Therefore, I would always recommend (where possible) using PaaS or other cloud services, as they talk to each other natively.
- The error handling within PowerApps is still quite limited. If Flow fails when updating data between PowerApps and Azure SQL Database, nothing is captured within the form itself. An Admin would need to check the Flow job or set up email alerts for users’ peace of mind.

Conclusion

The initial signs look promising for Power BI and PowerApps integration. I managed to create an Actual vs Target Proof of Concept in just a matter of hours, without any real coding. There are still quite a few drawbacks and hoops to jump through to bring everything into a Power BI Dashboard, but I can only see things getting easier from this point.
There are other use cases for embedding a PowerApp into Power BI, such as monitoring live sales and re-ordering stock within a PowerApp, or updating product descriptions so that the Dashboard attributes update automatically. Giving someone the ability to directly interact with a Dashboard and make instant business decisions is priceless in today’s fast-paced world.

Further Reading

Find other recommended resources below.
o PowerApps Pricing - http://bit.ly/2j5sN69
o Flow Pricing - http://bit.ly/2kw0MFr
o Milinda Vitharana’s blog - http://bit.ly/2BfkywQ

Contact Me

If you have any questions or want to share your experiences with PowerApps and Power BI, feel free to leave a comment below. All scripts and workbooks are available upon request. Twitter: @DataVizWhizz

One-way Domain Trust and Power BI

I ran into a problem setting up on-premises data gateways at a client recently, whereby they had two domains but with a one-way trust. The result was that when authenticating within the Power BI Service to retrieve data from on-premises data sources in the untrusted domain, it would throw an error. At this point it is worth spending some time explaining the architecture.

The Architecture

The architecture might be familiar to many who use Power BI and the on-premises data gateway, with a little caveat. Domain 1 is the main domain. Domain 2 is the secondary domain and trusts Domain 1. Domain 1, on the other hand, doesn’t trust Domain 2. A user in Domain 1 can access data sources in both Domain 1 and Domain 2. They can create their Power BI reports with a live connection or direct query and publish them to the Power BI Service. In order to use the reports in the service, on-premises data gateways need to be established to provide a messaging service between on-premises and the cloud. In this example, each domain has a domain controller, a tabular server and an on-premises data gateway for each tabular server.

The Problem

When a user logged on to the Power BI Service tries to access data from Domain 2, their credentials are passed down to the on-premises data gateway, checked against the domain controller in Domain 2, and an error is returned to the Power BI Service. What I think happens is that the user (User.One@Domain1.com) will have their credentials passed down through the on-premises data gateway to the domain controller in Domain 2. Either the domain controller will not be able to find the user, as it is the untrusted domain, and so will not be able to pass the short name (DOMAIN1\USERONE) to the tabular server; or it tries to check with the domain controller in Domain 1, encounters the dreaded Kerberos and cannot perform a double hop to return the short name. Either way, the result is the same in that the short name cannot be passed to the tabular server.

The Solution

As you can imagine, there are a few solutions to the problem:
- If it is a Kerberos-related issue, then Kerberos will have to be configured separately
- Make Domain 2 a trusted domain
- User mapping in the Power BI Service
This latter approach is the one I opted for because it was guaranteed to work and would not change the current domain and network configuration. In the gateway settings in the Power BI Service, I went to the Users tab under my data source and clicked on Map user names. In there I mapped users in Domain 1 to users in Domain 2. If you have a large number of users, individual mapping might not be preferable or feasible, which is why you can replace the domain names in part of the user string, as in example 3. This, however, does rely upon users in Domain 1 having an equivalent account in Domain 2. This is not always the case, for which the wildcard to service account mapping would work, as shown in example 4.

Considerations for Creating a Power BI Enterprise Report Deck

Creating or re-creating an Enterprise report deck in Power BI should be reasonably straightforward given a specification, but there are a number of considerations which need to be made when building something of this nature. In the following blog post, I will detail some of these, and the route I would suggest taking. The contents of this blog revolve more around tabular reports than chart visuals, but the same themes can apply.

Fonts

I think it goes without saying to keep the font consistent across both a single report and a report deck. The default for Power BI is Segoe UI, which for the most part is pleasant; just be careful not to flick between this and Segoe UI Light as this can cause discrepancies. It is, however, the font size that will cause you more of an issue. The first consideration is to set a minimum size. As report developers we want the report to look as tidy as possible, and usually this means fitting everything on 1 page. The easiest way to do this is to set the font smaller if you are having space issues – but this does not always translate so well to the end user. Depending on the device, they may consider the minimum Power BI lets you set (size 8) as too small for consumption on something such as an iPad – so this is worth checking first. The second consideration is to set the font size for different elements of the report, i.e. row-level data at something like a 10, and header-level elements at a 12. Anything else that exists, such as filter elements, should be set the same as the header levels. I would usually set titles a number of points above this, at something like an 18. In general, having varying levels of font size on a single report between elements will look inconsistent, so the aim here is consistency! The third consideration, if possible, is to keep the font size the same across all the reports within the report deck for the same type of element. Again, this adds a consistent feel to the deck. If one report has more rows than another, in my opinion it’s still better to use the same font size across both, rather than filling the page on both using varying sizes. The last consideration is to be careful when mixing text from a textbox and a card together in the same area of the report. Unfortunately Power BI does not currently support expressions like SSRS does, thus a textbox is for static text only. Dynamic text can only be created through a measure and assigned to a card. However, having both elements side-by-side with one another does not give the expected result. The font size of the text in a text box is not the same as a card; a card size 10 is roughly equivalent to a text box size 13 (although you can only set 12-14), thus leaving you with some inconsistent fonts between elements. My suggestion is to create measures referring to elements on the report, and use them for both static/dynamic text, thus every textbox will be a card visual and have that consistent look and feel. If you only need static text, stick to text boxes.

Objects

The next consideration is around the number of objects on a report – keep it simple. Avoid building a giant monolithic report; the more objects you use, the slower the report will perform on the PBI service, on iPads, and even to develop. This is especially true for tables/matrices, which will each need to fire off separate DAX queries to return the data elements.
Too many objects also has knock-on effects for exporting to PowerPoint, as objects will overlap with one another more, which may not be as much of an issue within the PBI service but will affect other apps. You can use the selection pane (in the view tab) to move objects above/below one another, which will bring forward/push back the elements.

Alignment

Another scenario which I have come across is that sometimes it is necessary to include a column header in between the header for a measure and the actual data – for instance to specify the currency or unit. There are 2 options available; the first is to set the headers of the table as white text and insert text boxes over their position. While this achieves the goal, the final look and feel is not ideal, as a large proportion of time can be spent aligning the columns with the text in the text boxes, and even then it can still be pixels out of alignment. Adding/removing measures then means you have to go through the same process again as everything shifts left/right. Fortunately, in the November release of Power BI, they have added the ability to align data within the tables better. A better approach for this scenario is to rename the measures within the table visual to whichever unit you want to show for that column. The downside of this is that as a developer you will then need to hover over the measures to see where the original measure came from, a small annoyance which is compensated by the amount of time saved trying to do alignment within a table. Also, this means fewer objects in the report, and fewer objects will generally create a quicker, more manageable report. For anyone particularly new to Power BI, you can use the arrow keys to move around a single element pixel by pixel, to help with alignment. There are also options on the format tab. I’m still waiting for the ability to pixel-nudge multiple elements when selected together!

Colour

Hopefully you should be guided in terms of colour by a corporate colour scheme. This often comprises a set of 4 or so main RGB values to use, complemented by a further set of colours. Pick 1 (or even 2) of these from the main set of colours and use that for the report framework, either the border/frame, or report header/footer, and then use another colour for the table headers, or two if the report requires double table headers. Again, consistency is the key across reports within the report deck. If using double headers for the columns, consider using the colours as a background behind the header text rather than colouring the text in the corporate colour. Consider white text on the darker backgrounds.

Parameter Selection

Most reports will contain some kind of slicer visual, to allow the user to change the context of the data – usually by period. As part of the report build, you’ll need to assess where the best position for this is on the report and keep it consistent between reports within the deck. If your reports require the real estate going across the page (i.e. wide tables), then consider the top of the report; if they need the real estate going down the page (i.e. long tables), consider the right-hand side. I think by default I would build it at the top, alongside any logos / titles. If you require multiple slicers, maybe move these to the side and build a panel to group them together. Another little hidden option is that of switching the slicer visual between List/Dropdown (circled red below).
For some reason, list is the default, but most users will agree that the dropdown is neater and saves space. I’m not sure why this appears here rather than in the standard visual configuration tab; maybe they will move it soon? The dropdown slicer visual still has some issues which I hope will be fixed soon, such as not de-selecting the dropdown after a selection has been made. Another click is required outside of the dropdown to hide the options. This is not the best for the end users, and there seems to be no viable alternative.

Header Logic Swapping

Unfortunately, as I mentioned previously, Power BI does not support expressions within tables, and therefore switching context based on a parameter is not easy to achieve. This is possible, but it needs to be done entirely within DAX. To keep the DAX measures for this simple, consider specifying the position on the report as the name of the measure. Then, within the definition of the measure, reference other created measures and keep the statement simple, allowing anyone debugging the report to trace the switching logic easily. Also use a DAX formatter such as this to make the DAX as readable as possible. It would be nice for this to be included within Power BI; hopefully it will be soon! I’ve included an example DAX statement below to provide this functionality.

New Measure =
IF (
    HASONEVALUE ( 'Dim'[Value] ),
    SWITCH (
        VALUES ( 'Dim'[Value] ),
        "Comments describing the logic definition", "",
        "Value", [Measure],
        [AlternativeMeasure]
    ),
    BLANK ()
)

Template

What does all of this lead to? The answer is a template for the report deck. As well as having guidelines for the above items which make up a report, it’s also good to build a physical .pbix template for your suite of reports. This way, you are not starting from scratch for each report, and you will get a more consistent feel down to the pixel level of where the objects are. Don’t over-complicate the template, but leave enough elements on it to save you re-creating them each time you build a new report. I would generally avoid copying from an existing report each time to then build another report, as this will sometimes include elements like PBI-defined measures, which you do not want to carry between reports. Instead, define a template which you take from each time.

Conclusion

Once decided on a number of these points, it is worth gaining a consensus from the product owner over whether this is acceptable to use moving forward. Do not get to the end of the deck before demonstrating your decisions across the report set; this will leave you with far too much re-work. Instead, aim to deliver maybe one of the more complex reports with a number of the items above, and then apply those decisions to the rest of the report deck.

PASS Summit 2017 – Coming Soon to the Power BI Service

I recently attended a Power BI Governance session at PASS Summit 2017 and some new features were demoed in the Service. I thought I would share these with you. The below have been on the Microsoft roadmap and are not strictly hot off the press. However, we were told to keep an eye on the Power BI blog site (found here) over ‘the next few weeks’ (early December 2017) – which is pretty exciting! Without further ado, here they are:

Workspace Apps
· Selective Report Publishing.
  o Meaning you can pick and choose what reports are consumed by the end users.
  o This is particularly useful if you have a combination of workspace developers and consume-only users. Self-serve analysts may be working on their own reports and they will no longer be exposed to everyone within the App.
· Install Apps automatically for existing or new users. This can tie back to Security Groups or Distribution Lists.

Collaborate with External Users
· Directly linked to Workspace Apps.
· Facilitates App and Dashboard sharing with personal work email accounts e.g. Gmail.
· Uses Azure Active Directory business-to-business (AAD B2B) and incorporates Row Level Security (RLS).
  o For more on AAD B2B – click here.
· UPN mapping to bridge on-premise AD to AAD and natively support the external email collaboration.

Audit Logs
· Solution templates for faster, convenient Audit Logging. Examples include:
  o Author tracking – which users publish the most reports.
  o Gateway activity – deleted, changed, reconfigured.
  o Report Lifecycle – when reports are modified, deleted, and the history of these activities.
  o Dataset and data source tracking.

My personal favourite is the ability to control what reports are packaged into a Workspace App. This has caused me problems at various clients and made the experience of release management more convoluted and time consuming. It will certainly please some of my customers!

Further Reading

I must thank Adam Wilson (Twitter handle below) for delivering a great session at PASS 2017. All images are taken from his slide deck.
· Adam Wilson’s Twitter handle - @AdamDWilson
· Power BI Blog - http://bit.ly/20bcQb4
· AAD B2B - http://bit.ly/2mpKD7H
· PASS 2017 Website - http://bit.ly/2xSbQC0

Contact Me

I would be very interested to see what the community’s thoughts are on these announcements. Feel free to comment or message me on Twitter if you want to discuss anything Power BI.
· @DataVizWhizz