Adatis BI Blogs

Whitelisting Azure IP addresses for SQL Server

In a recent blog post, I wrote about whitelisting Azure Data Centre IP addresses for Key Vault. Key Vault’s firewall uses CIDR notation for IP ranges, which is exactly what is contained within the list of IP addresses supplied by Microsoft. However, some resources, like Azure SQL Server, only accept IP ranges. Therefore, we need a way of converting CIDR to an IP range. Handily, there’s a PowerShell script that provides that conversion – ipcalc.ps1. When you download it, make sure it’s in the same working folder as the script you’re going to use to create the new firewall rules. From there, we can make slight amendments to the script from the previous post and produce the following script. If you need to assign the IP ranges to other resources, you can substitute New-AzSqlServerFirewallRule with the appropriate cmdlet and parameters.
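If you want to sanity-check what ipcalc.ps1 produces, the CIDR-to-range conversion itself is straightforward; here is an illustrative sketch in Python using the standard ipaddress module (the CIDR block shown is just an example, not necessarily a current Azure range):

```python
import ipaddress

def cidr_to_range(cidr: str) -> tuple[str, str]:
    """Convert a CIDR block to its first and last IP address."""
    network = ipaddress.ip_network(cidr, strict=False)
    return str(network.network_address), str(network.broadcast_address)

# A /17 spans 2^15 = 32768 addresses
print(cidr_to_range("13.66.0.0/17"))  # ('13.66.0.0', '13.66.127.255')
```

The start/end pair is exactly what a firewall rule that only accepts IP ranges expects.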

Process an Azure Analysis Services Tabular Model from an Azure Function

A couple of weeks ago I wrote a blog post detailing how to process a tabular model from a stored procedure. The challenge there was to have a way of processing a tabular model on demand from a web application. The challenge still exists, but how do you achieve the same in Azure using Platform as a Service (PaaS) objects, which do not have the benefit of a full server and items such as msdb and SQL Agent to aid in the process?

In this post, I’m going to show you how to process a tabular model using only Azure PaaS offerings. Not only am I going to show you how to process a tabular model on-demand, but also how to process a tabular model on a schedule. This post has taken inspiration and part of its code base from a Microsoft blog: Automating Azure Analysis Services processing with Azure Functions.

Azure Functions

Before we begin properly, it’s worth spending some time introducing Azure Functions. According to Microsoft, Azure Functions are:

…a solution for easily running small pieces of code, or "functions," in the cloud. You can write just the code you need for the problem at hand, without worrying about a whole application or the infrastructure to run it. Functions can make development even more productive, and you can use your development language of choice, such as C#, F#, Node.js, Java, or PHP. Pay only for the time your code runs and trust Azure to scale as needed. Azure Functions lets you develop serverless applications on Microsoft Azure.

They are super useful for extending the capabilities of any solution, and are not just limited to what we’re going to cover here.

On-Demand Refresh

The use of Azure Functions creates a trigger for the on-demand refresh of a tabular model from the web application or webhook. This is to make sure that selected elements of data in a tabular model, for example hot partitions, are always up to date.
The following describes the process that Azure Functions will be involved in for this scenario. The steps needed to create an Azure Function for on-demand refresh are as follows:

1) Create an Azure Function App

Navigate to the Azure Portal and create a Function App (the name changes quite a bit, so don’t be concerned if it’s not displayed exactly as it is below).

2) Create a new Function

After you’ve created the Function App, we need to add a new Webhook + API function, which we’ll use as the basis for our on-demand refresh. Click on the + button next to Functions, select Webhook + API, choose C# as your language and click Create this function.

3) Configure the Function

Download the latest client libraries for Analysis Services. This needs to be done on your local machine so you can then copy the files to your Azure Function App. After you’ve downloaded the client libraries, the DLLs can be found in C:\Program Files (x86)\Microsoft SQL Server\140\SDK\Assemblies. The two files you need are:

C:\Program Files (x86)\Microsoft SQL Server\140\SDK\Assemblies\Microsoft.AnalysisServices.Core.DLL
C:\Program Files (x86)\Microsoft SQL Server\140\SDK\Assemblies\Microsoft.AnalysisServices.Tabular.DLL

The documentation references the 130 assemblies, which is not correct and will not work. You need the assemblies in 140.

In order to add these assemblies to your function, you’ll need to add a folder called “bin”. To do this, select your Function App, click Platform features, and under Development Tools, click Advanced tools (Kudu).

In Kudu, click Debug console and select Cmd. Navigate to the site\wwwroot\OnDemandHttpTrigger folder and add the “bin” folder here by clicking the + button.

Once you’ve added the “bin” folder, go back over to the Azure portal and select your function OnDemandHttpTrigger. On the right, under View files, navigate to the bin folder.
Then click the Upload button to add the two previously mentioned DLLs to the bin folder. You should now see the two DLLs in your bin folder.

4) Add the Azure Analysis Services Connection String to the Function App

This step adds the connection string for the Azure Analysis Services (AAS) service to the entire Function App, not just individual functions. Click the name of your Function App, then select Platform features. Select Application settings under General Settings.

Now we need to add our connection string under the Connection strings section. You’ll need your AAS server name and a user ID and password that has access to the AAS database. You can find your AAS server name by navigating to your AAS database in the Azure portal and copying the value constructed after clicking Show server connection strings.

Your connection string should look like this:

Provider=MSOLAP;Data Source=<your aas server>;Initial Catalog=<aas database name>;User ID=<your username>;Password=<your password>

Back in the screen for the Function App, fill in the Name textbox with a name for your connection string and paste your connection string in the Value text box. Click Save at the top to save these settings for your functions.

5) Time for Code

Our Function App has been configured; now we need to add code to the function. The function comes with working sample code for testing out the functionality, but we don’t need everything that is on offer. We’re going to programmatically process the tabular model. In doing so, we’ll leverage Analysis Services Management Objects (AMO).
If you’re new to AMO, start here. Paste in the following code (all code referenced can also be downloaded from my GitHub repo):

#r "Microsoft.AnalysisServices.Tabular.DLL"
#r "Microsoft.AnalysisServices.Core.DLL"
#r "System.Configuration"

using System;
using System.Configuration;
using Microsoft.AnalysisServices.Tabular;
using System.Net;

public static async Task<HttpResponseMessage> Run(HttpRequestMessage req, TraceWriter log)
{
    log.Info("C# HTTP trigger function processed a request.");

    // parse query parameter
    string status = req.GetQueryNameValuePairs()
        .FirstOrDefault(q => string.Compare(q.Key, "status", true) == 0)
        .Value;

    if (status == null)
    {
        // Get request body
        dynamic data = await req.Content.ReadAsAsync<object>();
        status = data?.status;
    }

    if (status == "execute")
    {
        log.Info($"C# trigger function started at: {DateTime.Now}");
        try
        {
            Microsoft.AnalysisServices.Tabular.Server asSrv = new Microsoft.AnalysisServices.Tabular.Server();
            var connStr = ConfigurationManager.ConnectionStrings["AASTabular"].ConnectionString;
            asSrv.Connect(connStr);
            Database db = asSrv.Databases["azureadventureworks"];
            Model m = db.Model;
            //db.Model.RequestRefresh(RefreshType.Full);          // Mark the model for refresh
            m.RequestRefresh(RefreshType.Full);                   // Mark the model for refresh
            //m.Tables["Date"].RequestRefresh(RefreshType.Full);  // Mark only one table for refresh
            db.Model.SaveChanges();                               // Commit, which will execute the refresh
            asSrv.Disconnect();
        }
        catch (Exception e)
        {
            log.Info($"C# trigger function exception: {e.ToString()}");
        }
        log.Info($"C# trigger function finished at: {DateTime.Now}");
    }

    return status == "execute"
        ? req.CreateResponse(HttpStatusCode.OK, "Successfully Processed Tabular Model")
        : req.CreateResponse(HttpStatusCode.BadRequest, "Please pass a status on the query string or in the request body");
}

Click the Save button at the top.

6) Test, Test, Test

Click the Run button at the top to test the function. The function can also be tested in a web browser, and
be called by a Web App using the POST HTTP method. Now we have a fully functioning method of refreshing an Azure Analysis Services tabular model on-demand.

Scheduled Refresh

The use of Azure Functions creates a trigger for the scheduled refresh of a tabular model. This is to make sure that the entire tabular model has the latest data and is always up to date. The following describes the process that Azure Functions will be involved in for this scenario. The steps needed to create an Azure Function for scheduled refresh are as follows:

1) Create a Function

We’ve created our Function App, and now we need to add a new Timer Trigger function, which we’ll use as the basis for our scheduled refresh. Click on the + button next to Functions, select Timer, choose C# as your language and click Create this function.

2) Configure the Timer

What use is a timer without a schedule? To give the timer a schedule, click Integrate, set the schedule and click Save. The Schedule text box expects a CRON expression to define the days and times that the function should execute. Click the little Documentation button on the screen above to read about CRON expressions. The schedule I’ve set is to run once every day at 09:30AM.

3) Configure the Function

See step 3 of the On-Demand function for detailed steps. You’ll need to create the bin folder and upload the DLLs to it.

4) Time for Code

We’ve configured our function, so now it’s time to add the code. The code base is much simpler than the On-Demand code base, mainly because it’s doing fewer tasks, but the AMO section is exactly the same.
Paste in the following code:

#r "Microsoft.AnalysisServices.Tabular.DLL"
#r "Microsoft.AnalysisServices.Core.DLL"
#r "System.Configuration"

using System;
using System.Configuration;
using Microsoft.AnalysisServices.Tabular;

public static void Run(TimerInfo myTimer, TraceWriter log)
{
    log.Info($"C# Timer trigger function started at: {DateTime.Now}");
    try
    {
        Microsoft.AnalysisServices.Tabular.Server asSrv = new Microsoft.AnalysisServices.Tabular.Server();
        var connStr = ConfigurationManager.ConnectionStrings["AASTabular"].ConnectionString;
        asSrv.Connect(connStr);
        Database db = asSrv.Databases["azureadventureworks"];
        Model m = db.Model;
        //db.Model.RequestRefresh(RefreshType.Full);          // Mark the model for refresh
        m.RequestRefresh(RefreshType.Full);                   // Mark the model for refresh
        //m.Tables["Date"].RequestRefresh(RefreshType.Full);  // Mark only one table for refresh
        db.Model.SaveChanges();                               // Commit, which will execute the refresh
        asSrv.Disconnect();
    }
    catch (Exception e)
    {
        log.Info($"C# Timer trigger function exception: {e.ToString()}");
    }
    log.Info($"C# Timer trigger function finished at: {DateTime.Now}");
}

Click the Save button at the top.

5) Test, Test, Test

Click the Run button at the top to test the function. Now we have a fully functioning method of refreshing an Azure Analysis Services tabular model on a schedule.

Conclusion

I have shown you how simple it is to invoke two methods of refreshing a tabular model using Azure Functions: an on-demand refresh and a refresh on a schedule. I hope you take inspiration and use these methods in your own use of both Azure Analysis Services and Azure Functions.

Process a Tabular model from a Stored Procedure

The Challenge

Recently, at a client, I was challenged to create a stored procedure that would process a tabular model. This stored procedure would then be executed from a web application. The process behind it: a user enters data into a web application, which gets written to a database. That data then needs to be immediately surfaced up into reports, with additional calculations and measures along the way. Therefore the tabular model, which does all the additional calculations and measures, needs to be processed by a user from the web application. That’s the challenge – process a tabular model quickly, on-demand, by users.

The Solution

Part one: Method of process

There are quite a few methods to process a tabular model: an SSIS package, PowerShell, a SQL Agent job and others. I opted for a SQL Agent job because it was the simplest method of execution without having to reconfigure the server or introduce technologies and languages that weren’t already in use.

First things first, create a SQL Agent job; I called mine ProcessTabular. Then create a Step. The Type should be SQL Server Analysis Services Command; input the server address and the refresh script. I’m using SQL Server 2016, so I’m using Tabular Model Scripting Language (TMSL) for my command. XMLA commands also work for older versions. A full list of commands for processing a tabular database can be found here.

Part two: Start the Agent Job

Now that we have a SQL Agent job, we’ll need to start that job using SQL. Luckily, there’s a system stored procedure that can start agent jobs: msdb.dbo.sp_start_job. The method for calling it is:

EXEC msdb.dbo.sp_start_job 'ProcessTabular'

producing the following successful message.

Part three: The Stored Procedure

sp_start_job works, but it doesn’t accommodate providing a completion message, or informing a user that a process is in progress.
Introducing the code for the stored procedure:

CREATE PROCEDURE [dbo].[ProcessTabularDatabase]
AS

DECLARE @JobName nvarchar(50) = 'ProcessTabular',
        @ResultCheck INT

IF NOT EXISTS (
        SELECT 1
        FROM msdb.dbo.sysjobs_view job
        JOIN msdb.dbo.sysjobactivity activity ON job.job_id = activity.job_id
        WHERE activity.run_Requested_date IS NOT NULL
          AND activity.stop_execution_date IS NULL
          AND job.name = @JobName
        )
BEGIN
    PRINT 'Starting job ''' + @JobName + '''';
    EXEC msdb.dbo.sp_start_job @JobName;

    WHILE (1 = 1)
    BEGIN
        SELECT @ResultCheck = IIF(stop_execution_date IS NULL, 1, 0)
        FROM msdb.dbo.sysjobactivity AS sja
        JOIN msdb.dbo.sysjobs AS sj ON sja.job_id = sj.job_id
        WHERE sj.name = @JobName

        IF @ResultCheck = 0 BREAK;
    END

    PRINT 'Successfully Processed Tabular Database'
END
ELSE
BEGIN
    PRINT 'Job ''' + @JobName + ''' is already started';
END

Conclusion

This stored procedure can be executed by the web application, enabling users to process a tabular database on-demand and get feedback as to the success of the task.
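The control flow in the stored procedure (refuse if already running, otherwise start and poll until the job stops) is language-agnostic; here is a minimal Python sketch of the same pattern, with a stubbed JobRunner class standing in for msdb (the class and its methods are hypothetical, purely for illustration):

```python
class JobRunner:
    """Stand-in for SQL Agent: tracks whether a named job is running."""
    def __init__(self):
        self.running = set()

    def start(self, name: str) -> bool:
        if name in self.running:
            return False          # equivalent to the IF NOT EXISTS guard failing
        self.running.add(name)
        return True

    def finish(self, name: str):
        self.running.discard(name)

def process_tabular(runner: JobRunner, job_name: str = "ProcessTabular") -> str:
    if not runner.start(job_name):
        return f"Job '{job_name}' is already started"
    # Poll until the job reports a stop time (here: leaves `running`)
    while job_name in runner.running:
        runner.finish(job_name)   # stub: a real job finishes asynchronously
    return "Successfully Processed Tabular Database"
```

Real code would sleep between polls; the point is the start-then-poll shape, not the stub.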

How to run a successful testing stage

As we all know, testing is one of the most important stages of an IT project. However – either because the client doesn’t know how to test the solution, because we don’t have sample data we can use to compare against our results, or because there is no general approach we can apply to all projects – testing is sometimes set up to fail. In this post, I will share the approach adopted on the project I have been working on.

Step 1 – Build test scenarios

This step can only succeed with the help of the Business Analyst or another person from the business side. In this project, we are creating a set of KPIs to be used in Power BI and Excel. Considering the end goal, the BA created a set of scenarios (example below) that we used to test our values.

Step 2 – Create SQL scripts to query the source data

One of the biggest risks of this approach lies in this step. Here, we want to create a set of SQL scripts that follow the logic implemented in the cube. If the logic is wrong, the KPI will show incorrect values, even though we managed to match the results from the source data and the cube. This is where the input of the business user is crucial, since only they will be able to look at the numbers and confirm they are correct. Building the test script is very simple. All we need to do is set a couple of variables and make sure all the business rules are applied.
USE STAGE

DECLARE @StartDate DATETIME = '20170602'
DECLARE @EndDate DATETIME = '20170603'
DECLARE @OutletUniverse INT

IF OBJECT_ID(N'tempdb..#CallDataAggregated', N'U') IS NOT NULL
    DROP TABLE #CallDataAggregated;

SELECT COUNT(DISTINCT B.VISIT_ID) AS POPCount
INTO #CallDataAggregated
FROM Dms.SalRdCallPerformanceRep A
INNER JOIN Dms.SalVsDailyTimingSum B
    ON B.DIST_CD = A.DIST_CD
    AND B.SLSMAN_CD = A.SLSMAN_CD
    AND B.CUST_CD = A.CUST_CD
    AND B.VISIT_DT = A.VISIT_DT
INNER JOIN Dms.SalSlSalesman C
    ON C.SLSMAN_CD = A.SLSMAN_CD
    AND C.DIST_CD = A.DIST_CD
WHERE (A.VISIT_DT >= @StartDate AND A.VISIT_DT < @EndDate)
  AND USER_DEFINE1 IN ('DSM', 'EBSM', 'HTSR', 'KAM', 'OTSR', 'PTSR', 'RVR', 'TMR')
  AND B.VISIT_TYPE IN ('S', 'E', 'X')

SELECT @OutletUniverse = MAX(OutletUniverse)
FROM Warehouse.Fct.MarketConfiguration
WHERE MarketKey = 13
  AND (DateKey >= CONVERT(VARCHAR(8), @StartDate, 112) AND DateKey < CONVERT(VARCHAR(8), @EndDate, 112))

SELECT POPCount
     , @OutletUniverse
     , (CONVERT(FLOAT, POPCount) / @OutletUniverse) AS Coverage
FROM #CallDataAggregated

Step 3 – Share the results with the Business Analyst and Testers

Once our testing is complete and the results are approved by the BA, we release the KPIs to UAT. If we are very lucky, we will have a tester who will then carry on with their own checks; however, if that is not the case, we will have to do the work for them.

Step 4 – Product testing session with the business users

To sign off the KPIs, the business users need to agree with the results shown in the cube; however, they don’t always have the time, skills or tools to query the data. To resolve this problem, we created some examples in Excel where we compare the source data with the cube.
KPI UAT Cube – In this sheet, we run a query against the cube for a specific scenario.
KPI Source – We query the source data, ensuring that all the business rules are applied, which is a risky approach as discussed above.
KPI Pivot – We create a pivot table based on the data from the KPI Source sheet.

Once the Excel scenarios are completed, we arrange a session with the business users and demonstrate that the values from the cube match the source data. If they agree with the results, the KPIs are signed off and the testing stage is considered a success. If you have any questions or thoughts, please leave a comment below.

SQL PASS Summit–Day 3 and Reflections

Apologies for the delay in getting this blog out to you all. When PASS finished on the Friday we had to rush over to the airport to get our flight back to the UK. When I landed on Saturday I was suffering from jet lag, and only now am I in a fit state to blog again.

I got the impression from the schedule that Day 3 of PASS was going to be a wind-down day, as very few of the sessions seemed as intense as the previous days’ sessions. My first session of the day, despite it being the last day of PASS, was early – earlier than any of the keynotes, but worth getting up for: a Chalk Talk with the Data Warehouse Fast Track team. This also included the Azure Data Warehouse team, and the conversation was much more focused on the Azure side of data warehousing, with lots of discussion around PolyBase and patterns for getting data from on-prem to cloud using PolyBase. In terms of patterns, it was reassuring to learn that the approach Adatis has adopted is spot on. Simon Whiteley is the man to see about that. His blog is here:

On the Fast Track theme, my next session explored the SQL Server Fast Track Data Warehouse, which was interesting to learn about, especially the various testing that these pre-configured servers go through. At some point next year, Microsoft will be releasing the Fast Track Testing Programme to the community so that everyone will be able to test their hardware to the same exacting standards and know what their maximum throughput, IO demand, etc. is, in order to properly gauge hardware performance. After this session I got talking to a few people about data warehousing. The conversation was so engrossing that I missed the session I was due to attend. Luckily, most of the sessions at PASS are recorded, so I will have to chase up that session and others when they get released.

My final session of the day was a deep dive into SSIS 2016. It wasn’t so much a deep dive and more of a run-down of upcoming features.
The one I’m most excited about is the Azure Data Lake Store connector, which will be released once Azure Data Lake goes into General Availability – which I’ve been told is soon…

Now that I’ve had a week to digest and reflect on the SQL PASS Summit, my findings are thus: the SQL PASS Summit is invaluable. It provides an opportunity to learn so much from so many people, and not just from the presenters. There are so many people from all over the SQL community, with different experiences of SQL, different experiences of data and different experiences of life, that you can’t help but learn something. PASS provides an easy environment in which to share ideas among peers and learn new technologies, new ways of working and new tricks. I’ve already started sharing some of my learnings with colleagues and I can’t wait to share them with everyone else too!

SQL PASS Summit–Day 2

Day 2, Thursday, started off with a keynote from David DeWitt on cloud data warehousing, scalable storage and scalable compute. This set the theme for the majority of my day – which turned out to be big data tech.

My first session was with James Rowland-Jones and Kevin Ngo on sizing Azure SQL Data Warehouse for proposals – essentially answering “how much is this going to cost me?”. There are various factors to consider, which I will blog on separately. I’ve already briefly fed back to members of the team and they’re excited to hear what I learnt in more detail.

My second session was about best practices for Big BI which, unfortunately, ended up being a sales pitch, and I came away feeling that I didn’t learn anything. There’s a lot of promise for BI in the big data space, so watch this space as we explore Azure SQL Data Warehouse, Azure Data Lake (Store and Analytics) and other big data technology for BI.

The third session was with Michael Rys on Tuning and Optimising U-SQL Queries for Maximum Performance. It was a full-on session; I learnt loads and took loads of notes. I need time to digest this information, as Michael covered off a very complex topic very quickly. I will, however, be blogging on it in due course.

After an intense third session, I chose a less intense option for the last session of the day: a Q&A with the SQL Engineering team. This was a great opportunity to learn from other users how they’re using SQL. Most users who asked questions wanted to know about indexing, backups and high availability.

Tonight – packing, and networking before the last day of PASS tomorrow!

SQL PASS Summit–Day 1

Day 1, Wednesday, technically started on Tuesday with a newbies’ speed-networking event in which we had to rotate through a crowd of 10 other people, introducing ourselves and asking questions about our professional lives. This was awkward to begin with but, as the evening wore on, introducing ourselves to strangers became a lot easier and more normal. We then moved on to the Welcome Reception and a #SQLKaraoke event – great opportunities to meet new people from different parts of the world and the community.

Wednesday morning proper began with a keynote from Joseph Sirosh. This keynote essentially set the tone and theme for a large part of the conference sessions: Azure, Big Data and the Cortana Intelligence Suite. The first session I attended was on design patterns for Azure SQL Database (for which a separate blog will be forthcoming). The next session I attended was about incorporating Azure Data Lake Analytics into a BI environment (again, another blog is in the pipeline). My final session of the day was Going Under the Hood with Azure Data Lake. This was the most insightful session of the day, which has subsequently sparked my brain into Data Lake mode (expect many blogs on this), and went through how Azure Data Lake works, as well as how the U-SQL language works and how resources are allocated.

Tonight – more networking. So far, the community has been so welcoming, and I’m very much looking forward to tomorrow, where I’ll be learning about big data solutions and best practices. I’m also looking forward to sharing all my experiences and learnings with my colleagues and the wider SQL community.

SQL Server 2016 New String Split Function

With the release of SQL Server 2016 RC0 comes a function I have waited a long time for: the STRING_SPLIT function. Everyone has their own way of splitting strings, and I shall walk you through how I used to do this, and how I will now be splitting strings from SQL Server 2016 RC0 onwards. Take the following example where I have one record which contains 2 columns, my 2nd column containing multiple values which I want to split out. Previously I would have used some string manipulation and the XML data type to split the values in column 2 out as separate values, using code like that shown below.

;WITH CTE AS
(
    SELECT UPPER('Adatis') AS Brand_Code
         , CAST('<M>' + REPLACE('1,2,3,4,5,6', ',', '</M><M>') + '</M>' AS XML) AS Ord
)
SELECT Cte.Brand_Code
     , Ord.a.value('.', 'VARCHAR(100)') AS OrderID
FROM CTE
OUTER APPLY Ord.nodes('/M') AS Ord(a) -- Use the nodes method to join back to the original data

Now all I have to do is call the STRING_SPLIT function in SQL Server 2016, simply specifying the string to be split and the value to split by.

SELECT UPPER('Adatis') AS Brand_Code, Value
FROM STRING_SPLIT('1,2,3,4,5,6', ',')

This works fine if you just want to split a single string, but if you have multiple values you want to split then you will need to use CROSS APPLY. Given the table below with two columns which replicate the query above, I now only need to write the following query, which is a lot neater and easier for anyone reviewing my work to understand. Here I am simply asking it to split out the values found in the OrderID column using a comma as the value to split by.

SELECT BrandCode, Value
FROM StringSplitExample a
CROSS APPLY STRING_SPLIT(a.OrderID, ',')

The Results: One thing to note is that this function will not strip out any leading/trailing spaces, so this will have to be handled either by using RTRIM and LTRIM or a REPLACE function.
The following screenshots show the issue and my resolution.

Before:

SELECT BrandCode, LTRIM(RTRIM(Value))
FROM StringSplitExample a
CROSS APPLY STRING_SPLIT(a.OrderID, ',')
WHERE BrandCode = 'Other2'

After:
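The trimming behaviour is easy to reason about outside SQL too; as an illustrative aside, here is the same split-then-trim logic sketched in Python:

```python
def split_and_trim(csv: str, sep: str = ",") -> list[str]:
    """Split a delimited string and strip leading/trailing spaces,
    mirroring STRING_SPLIT wrapped in LTRIM(RTRIM(...))."""
    return [value.strip() for value in csv.split(sep)]

# A plain split keeps the spaces...
assert "1, 2, 3".split(",") == ["1", " 2", " 3"]
# ...whereas trimming each value normalises them
assert split_and_trim("1, 2, 3") == ["1", "2", "3"]
```

The assertions make the point of the Before/After screenshots: the split itself preserves whatever whitespace surrounds each value, so the trim is a separate, explicit step.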

SQL - Using The MERGE Statement To Apply Type 2 SCD Logic

Introduced in SQL Server 2008, the MERGE statement is a useful way of performing inserts, updates and deletes inside one SQL statement. In the example below I have 2 tables: one containing historical data using Type 2 SCD (slowly changing dimensions) called DimBrand, and another containing just the latest dimension data called LatestDimBrand. Using MERGE, I will insert new records from LatestDimBrand into DimBrand, archive (apply an end date to) any DimBrand records which do not appear in the latest data, and finally enter a new record and archive the old record for any brands which may have changed.

DimBrand (the target of our inserts, updates and deletes) and LatestDimBrand (the source for inserts, updates and deletes):

The merge code in its entirety:

INSERT #DimBrand ([BrandCode], [BrandName], [StartDate])
SELECT [BrandCode], [BrandName], GETDATE()
FROM
(
    MERGE #DimBrand AS Target
    USING
    (
        SELECT [BrandCode], [BrandName], [StartDate], [EndDate]
        FROM #LatestDimBrand
    ) AS Source
    ON (Target.[BrandCode] = Source.[BrandCode])
    -------------------------------
    WHEN MATCHED AND Target.[BrandName] <> Source.[BrandName] THEN
        UPDATE SET Target.[EndDate] = GETDATE()
    -------------------------------
    WHEN NOT MATCHED BY TARGET THEN
        INSERT ([BrandCode], [BrandName], [StartDate], [EndDate])
        VALUES (Source.[BrandCode], Source.[BrandName], Source.[StartDate], Source.[EndDate])
    -------------------------------
    WHEN NOT MATCHED BY SOURCE THEN
        UPDATE SET Target.[EndDate] = GETDATE()
    -------------------------------
    OUTPUT $Action, Source.*
) AS i([Action], [BrandCode], [BrandName], [StartDate], [EndDate])
-------------------------------
WHERE [Action] = 'UPDATE'
  AND BrandCode IS NOT NULL

The INSERT statement, although appearing at the top of the SQL statement, is the last thing to be executed and uses the result set of the MERGE specified in the OUTPUT clause further down. Next is the start of our MERGE statement, where we specify a Target and a Source table.
We also specify which columns the 2 tables should match on, in this case only the BrandCode. The next step is to specify and handle any matches: here we are looking for BrandCodes which appear in both tables, but with differing brand names. If a match is found, the row in the target table is given an end date. As with the following “NOT MATCHED” sections, inserts, updates or deletes can be applied to either table here. This handles the first part of our slowly changing dimension requirement. Where there is no match between the Target table and the Source table, the relevant records from the Source table are inserted into the Target table. The penultimate part of the MERGE statement takes any Target records which do not appear in our “Latest” Source table and sets an EndDate for them, as they are no longer valid records. The OUTPUT clause populates the outer INSERT statement; here the Source rows which are involved in the UPDATE statements above are pulled out to the outer FROM clause. We have also selected $Action so that we can filter out the INSERTED rows under the NOT MATCHED BY TARGET statement. The result of all the above is that the brand “Googles” is archived, along with “MikeRowSoft”, which as you can see from the BrandCode has been rebranded as MickRowSoft.
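To make the Type 2 mechanics concrete, here is a small, illustrative Python simulation of the same logic (lists of dictionaries stand in for the two temp tables; the data and field names mirror the example but are otherwise made up):

```python
from datetime import date

def scd2_merge(target: list[dict], source: list[dict], today: date) -> list[dict]:
    """Apply Type 2 SCD logic: end-date changed/removed rows, insert new versions."""
    source_by_code = {row["BrandCode"]: row for row in source}
    new_rows = []
    for row in target:
        if row["EndDate"] is not None:
            continue  # already archived, leave untouched
        latest = source_by_code.get(row["BrandCode"])
        if latest is None or latest["BrandName"] != row["BrandName"]:
            row["EndDate"] = today          # archive: matched-but-changed, or gone from source
            if latest is not None:          # changed: insert the new version
                new_rows.append({"BrandCode": row["BrandCode"],
                                 "BrandName": latest["BrandName"],
                                 "StartDate": today, "EndDate": None})
    target_codes = {row["BrandCode"] for row in target}
    for code, latest in source_by_code.items():
        if code not in target_codes:        # brand-new code: plain insert
            new_rows.append({"BrandCode": code, "BrandName": latest["BrandName"],
                             "StartDate": today, "EndDate": None})
    return target + new_rows
```

In the real MERGE, the “changed” insert happens indirectly, via the OUTPUT clause feeding the outer INSERT; in this sketch it is simply appended.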

SQL Order by with aggregation

In this post we will look at applying an ORDER BY clause to a query containing an aggregate function. The purpose of this is to allow us to order a result set by a column which does not appear in the SELECT or GROUP BY clause. Without using an aggregate function, or including the column to order by in your SELECT/GROUP BY, you will be unable to execute the query successfully. This is quite a straightforward problem, but one that can easily trip people up.

I started by creating a simple table containing colours which appear multiple times over multiple dates and against multiple IDs. Using the COUNT() function to count how many times a colour appears on a certain date doesn’t cause us any problems. The result set shows us 7 rows of data; note that the colour Aqua Marine Green appears first in the list and black appears second, as we will use this information later. However, if we wanted to see the result set in the order of the respective IDs, this is where we run into issues. The error message is quite clear, if a little confusing, on what needs to happen here: either we need to add the ID column to our GROUP BY clause, or wrap it in an aggregate function. Adding the ID to the GROUP BY clause will change your result set: if we take the query above and simply add the ID column to our SELECT and GROUP BY, we no longer get an error message, but the data has lost all meaning, as now we are back to our 10 original rows of data, each being counted once.

Using an aggregate function in the ORDER BY clause, e.g. MIN(ID), fixes the problem and provides us with the result set we are after. If we look above we can see that “Black” appears 3 times against date “2015-09-03” and the IDs for those 3 records are 1, 3 and 8. It is only after specifying which ID to order by that SQL is able to correctly execute the query.
Without adding an aggregate function, be that MIN(), SUM() or AVG(), SQL is unable to determine which ID it should use for each group to order the data by. Be careful when picking the aggregate function, as your choice will affect the result set. The screenshots below show the result set ordered by MIN(ID), AVG(ID) and SUM(ID), each returning the same set of data but not in the same order. (The aggregated ID column has been added to the screenshots for reference only.) The first example orders by the MIN() function and results in Black appearing first in the result set. The next example is ordered by the AVG() function and results in Black appearing fourth in the result set. Finally, ordering by the SUM() function results in Black appearing last in the result set.
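The same “which ID represents the group?” question exists outside SQL. This illustrative Python snippet, with made-up data, shows how ordering groups by the min versus the sum of their IDs produces different orders:

```python
from collections import defaultdict

rows = [("Black", 1), ("Black", 3), ("Black", 8), ("Red", 4), ("Red", 5)]

# Group the row IDs by colour
groups = defaultdict(list)
for colour, row_id in rows:
    groups[colour].append(row_id)

by_min = sorted(groups, key=lambda c: min(groups[c]))  # Black (min 1) before Red (min 4)
by_sum = sorted(groups, key=lambda c: sum(groups[c]))  # Red (sum 9) before Black (sum 12)

print(by_min)  # ['Black', 'Red']
print(by_sum)  # ['Red', 'Black']
```

Both orderings are over exactly the same groups; only the representative value chosen per group differs, which is precisely why SQL insists you name an aggregate.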

How to get an age from two dates in SQL

For a project that I've been working on recently, I was tasked with finding the age of a product at a certain point in time, in years and months. What should have been a relatively simple task proved to be a bit more difficult, as the requirement was that the age be returned in two separate columns: years and months. I approached it by using DATEDIFF as a starting point, as that's what I would want to emulate in order to produce my age:

DATEDIFF(DAY, @StartDate, @EndDate)

I used DAY as the interval as I thought it would be easier to calculate the age in a more granular format, if required. I then started breaking down the output, as working in a number of days is a bit unmanageable. I divided by the average number of days in a month (30.42) and the number of months in a year (12):

DATEDIFF(DAY, @StartDate, @EndDate) / 30.42 / 12

Then I hit a stumbling block: how was I to split out the month from the year and the year from the month, and have them displayed as whole numbers? To get the year I used the following expression:

ROUND((DATEDIFF(DAY, @StartDate, @EndDate) / 30.42 / 12), 1) - ROUND((DATEDIFF(DAY, @StartDate, @EndDate) / 30.42 / 12), 1) % 1

To get the month I used the following expression, not as long as the one producing the year but still similar:

ROUND((DATEDIFF(DAY, @StartDate, @EndDate) / 30.42 / 12), 1) % 1 * 12

The multiplication of the month by 12 is crucial, as it's the only way to turn a fraction of a year into a number of months. I then wrapped the overall query up within a CTE, producing the following script:

DECLARE @StartDate AS DATE
DECLARE @EndDate AS DATE
SET @StartDate = '1969-07-16'
SET @EndDate = GETDATE()
;
WITH Age AS
(
    SELECT
        CONVERT(INT, (ROUND(((DATEDIFF(DAY, @StartDate, @EndDate) / 30.42) / 12), 1)) % 1 * 12) AS AgeMonth,
        CONVERT(INT, (ROUND(((DATEDIFF(DAY, @StartDate, @EndDate) / 30.42) / 12), 1)) - (ROUND(((DATEDIFF(DAY, @StartDate, @EndDate) / 30.42) / 12), 1)) % 1) AS AgeYear
)
SELECT AgeYear, AgeMonth
FROM Age
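The arithmetic above can be sanity-checked outside SQL. The sketch below (in Python, purely for illustration) mirrors the T-SQL expressions step by step; bear in mind that dividing by an average month length of 30.42 days is an approximation, so the result can drift by a month for some date pairs:

```python
from datetime import date

def age_parts(start: date, end: date):
    """Mirror the T-SQL logic: day count / 30.42 / 12, rounded to one
    decimal place, then split into whole years and remaining months."""
    approx_years = round((end - start).days / 30.42 / 12, 1)
    years = int(approx_years - approx_years % 1)   # whole-year part
    months = int(approx_years % 1 * 12)            # fractional year -> months
    return years, months

print(age_parts(date(2020, 1, 1), date(2022, 7, 1)))  # -> (2, 6)
```

Here 912 days / 30.42 / 12 ≈ 2.498, which rounds to 2.5 and splits cleanly into 2 years and 6 months.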

Audit Trail in PerformancePoint Planning

I've noticed that the PPS TechNet documentation has been updated recently to include an official Microsoft method for carrying out auditing in PPS Planning. PPS will do some basic auditing out of the box, namely to the audit.log file on the server. This will automatically capture key events that occur on the server, e.g. creation of a model, updating of a dimension etc. The audit file does not, however, track changes to the model fact data. There has been a custom solution around for this for a while now - Sacha has written an excellent post here that details what you need to do in order to implement your own PPS audit trail. Like Sacha's method, the Microsoft approach involves creating auditing tables, which should then be populated by running a custom stored procedure. The stored procedure should then be scheduled on a periodic basis (e.g. hourly) to capture any new activity. This is a bit different to Sacha's method, where triggers are used to capture changes in real time as they occur. In both cases the idea is to use something like Reporting Services to view detailed auditing reports on your PPS data. One thing that did catch my eye in the TechNet documentation is a method to decode the binary 'change list' column that's held in the dbo.Submissions table. Whilst you can manually export the change list to a CSV file, there has historically been no way to take what's in the change list column and automatically decode it into a useful format.
The following C# code will read the change list, and then insert it into your newly created auditing table:

DataSet ds = new DataSet();
DataLayer dl = new DataLayer("PPSConnection");
ds = dl.ExecuteDataSetFromSQL(
    "SELECT [SubmissionID] FROM [_AppDB].[dbo].[Submissions] s1 " +
    "WHERE s1.SubmissionID NOT IN (SELECT SubmissionID FROM [_StagingDB].[dbo].[SubmissionsAudited]) " +
    "AND s1.[Status] = 0");
string sSQL = "";
foreach (DataRow r in ds.Tables[0].Rows)
{
    // RETRIEVE THE CHANGELIST FOR THIS SUBMISSION
    DataSetWrapper dsw = new DataSetWrapper((Byte[])r["ChangeList"]);
    foreach (DataRow cldr in dsw.DataSet.Tables[0].Rows)
    {
        sSQL = @"INSERT INTO SubmissionsAudited(… ) VALUES(";
        // SUBMISSION ROW DATA
        sSQL += r[0].ToString() + ", " + r[1].ToString() + ", " + r[2].ToString() + ", "
              + r[3].ToString() + ", '" + r[4].ToString() + "', ";
        // CHANGELIST ROW DATA
        foreach (object o in cldr.ItemArray)
        {
            sSQL += "," + o.ToString();
        }
        sSQL += ")";
        // STORE EACH CHANGE TO THE AUDIT TABLE
        dl.ExecuteNonQuery(sSQL);
    }
}

Click here to view the Technet documentation.

Entering Dates in PPS Planning Assignments

In the recent PPS Planning projects that I've been involved in, the challenges have often been around subjects such as business rules, hence the often recurring theme of this blog. Recently the tables were turned though, as I was told by a user that they wanted to enter dates into a PPS assignment. I was initially a bit concerned that the Excel add-in might not be able to deliver here - after all, it's great at capturing numbers, but, knowing the rigid structure of the fact tables, I couldn't see how it would manage to store a date. Then I remembered something from my VBA days many years ago - namely that Excel stores dates as a number of days from 30/12/1899, meaning in theory it should be possible to get dates working in PPS. Thankfully it is possible, as this post explains.

Excel Setup

The first step when designing your form template is to set the matrix to have a matrix style of 'none'. If you don't do this, then the built-in matrix styles will override your formatting changes to the required cells. Speaking of formatting, the next step is to format the data entry cells that will contain dates, just using the standard Excel formatting window. Once these few simple steps are done, the assignment will behave just like any other. As the date is stored as a number, the numeric representation of the date will end up in the fact table just like any other piece of data.

Dates in Business Rules

Once the numbers are in the fact table, we need to convert them to dates in order to use them in business rules in some way. We can't do much in PEL unfortunately, so the options are either NativeMDX or NativeSQL. As Analysis Services can pick up some of the VBA functions, it's possible to use the VBA DateAdd() function to convert the stored number back into a date.
So in the example below, I'm using the DateAdd() function to convert the number to a date, before comparing the resulting date against another date using the VBA DateDiff() function:

WITH
MEMBER [Measures].[DateExample] AS VBA!DateAdd("d", [Measures].[Value], "30/12/1899")
MEMBER [Measures].[DateDiff] AS VBA!DateDiff("d", [Measures].[DateExample], "01/07/1987")
SELECT Descendants([Time].[Monthly].[Year].&[2008],,leaves) ON 0
FROM [Strategic Planning]
WHERE ([Account].[Profit and Loss].&[5010], [Measures].[DateDiff], [Entity].[Divisions].&[5003])

Although the above is just a simple example, it should give you an idea of the kind of calculations that can be performed in Analysis Services. It's possible to use these functions via a NativeMDXScript or a NativeMDXQuery. It's a similar story with SQL, as it also has its own DateAdd() function, as shown in the simple select statement below:

SELECT DateAdd(d, [Value], '30/12/1899')
FROM dbo.[MG_Strategic Planning_MeasureGroup_default_partition]
WHERE Scenario_MemberId = 4 AND Account_MemberId = 5010

So it's a shame that PEL can't work with dates, but the fact that both the database engine and Analysis Services have a DateAdd function means that it's possible to use dates for logic in both definition and procedural business rules.
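The number-to-date mapping that both the MDX and SQL DateAdd() calls rely on can be sketched as follows (a Python illustration of the 30/12/1899 epoch mentioned above):

```python
from datetime import date, timedelta

EXCEL_EPOCH = date(1899, 12, 30)  # the base date used in the post

def serial_to_date(n: int) -> date:
    """What DateAdd('d', n, '30/12/1899') does: offset n days from the epoch."""
    return EXCEL_EPOCH + timedelta(days=n)

def date_to_serial(d: date) -> int:
    """The number Excel writes into the fact table for a date cell."""
    return (d - EXCEL_EPOCH).days

print(serial_to_date(39448))             # -> 2008-01-01
print(date_to_serial(date(2008, 1, 1)))  # -> 39448
```

The conversion is symmetrical, which is why the raw fact-table value can be round-tripped back to a date in either the relational or the OLAP layer.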

Tracking PerformancePoint Planning Submissions Using Reporting Services

The standard operational reports that come with PerformancePoint will allow you to report on a variety of PerformancePoint admin-related activities, such as cycles, assignments, forms, jobs and associations. I find that the assignments report is particularly useful - after all, finding out who has/hasn't submitted is an important part of any data-gathering exercise. Whilst it is useful, I do find that the assignments report is the one that admin users want changed, especially when a model site exists containing many cycles and assignments.

Extra Functionality

With a large PPS Planning implementation you can easily end up with many assignment instances, cycles and users. I've been involved in such an implementation recently, and, due to the large number of assignments, the admin user requested a bit more filtering capability than the out-of-the-box assignments report provides. Also, the existing assignments report will tell you that user A has submitted their assignment, but it won't go into any detail about what the submission actually contained. E.g. did the user submit all their entities? For some users it is quite key to know what other users have been submitting - for one thing, it makes navigation easier if, as an approver, you know exactly which department/entity to pick in the assignment filters.

Examples

By knowing which tables to use, you can write an SSRS report that provides the additional functionality mentioned above. The starting point is to get the base report query right.
In my case, as I'm in a separate auditing database, the query goes inside a custom stored procedure, and is as follows:

SELECT
    A.AssignmentId,
    C.CycleInstanceName,
    AD.AssignmentDefName,
    A.AssignmentName,
    U.UserId,
    U.UserName,
    ENT.Name AS EntityName,
    CUST.Name AS CustomerName,
    CASE WHEN A.Status = 'partial' OR A.Status = 'Approved' OR A.Status = 'Submitted' THEN 1 ELSE 0 END AS Draft_Submitted,
    CASE WHEN A.Status = 'Approved' OR A.Status = 'Submitted' THEN 1 ELSE 0 END AS Final_Submitted,
    CASE WHEN A.Status = 'Approved' THEN 1 ELSE 0 END AS Approved,
    Approve.UserName AS Approver
FROM dbo.Assignments A
LEFT OUTER JOIN dbo.[MG_Planning_MeasureGroup_default_partition] Fact ON A.AssignmentID = Fact.AssignmentID
LEFT OUTER JOIN dbo.AssignmentDefinitions AD ON AD.AssignmentDefID = A.AssignmentDefID
LEFT OUTER JOIN dbo.CycleInstances C ON C.CycleInstanceID = A.CycleInstanceID
LEFT OUTER JOIN dbo.D_Entity ENT ON ENT.MemberId = Fact.Entity_MemberId
LEFT OUTER JOIN dbo.D_Customer CUST ON CUST.MemberId = Fact.[Customer_MemberId]
LEFT OUTER JOIN dbo.BizUsers U ON U.UserID = A.ContributorUserId
LEFT OUTER JOIN dbo.ApproverList AL ON AL.AssignmentID = A.AssignmentId
LEFT OUTER JOIN dbo.BizUsers Approve ON Approve.UserID = AL.ApproverUserID

You can figure out most of the tables to use by looking at a view called AssignmentsView within the application database. One thing that I have taken into account is assignment definitions. If you have a large number of users completing an assignment, then the chances are that you will have set up an assignment definition that points at a business role or a submission hierarchy. You ideally want to be able to filter on the assignment definition to return all assignment instances that belong to that assignment definition.
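The three CASE expressions in the query roll each assignment status up into cumulative flags. A minimal sketch of that logic (Python, for illustration only):

```python
def status_flags(status: str):
    """Mirror the CASE expressions: each status implies all earlier stages."""
    s = status.lower()
    draft_submitted = 1 if s in ("partial", "submitted", "approved") else 0
    final_submitted = 1 if s in ("submitted", "approved") else 0
    approved = 1 if s == "approved" else 0
    return draft_submitted, final_submitted, approved

print(status_flags("partial"))    # -> (1, 0, 0)
print(status_flags("Submitted"))  # -> (1, 1, 0)
print(status_flags("Approved"))   # -> (1, 1, 1)
```

An approved assignment therefore counts as drafted and submitted too, which is what makes the flags easy to aggregate in the report.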
Therefore, in my case I have three filters for the report, but you could easily add more. The final view is a report that shows the status of the assignments returned by the filter, but also, when expanded, shows the entities and customers that the contributor has submitted. The above is just a taster of what can be achieved. A couple of ways that it can be extended include: integrating with Sacha's data auditing idea to provide detailed history of the values the contributor has changed; and including comments, annotations and deadlines.

Using NativeSQL Business Rules for Seeding or What Ifs

You need to give really careful consideration to procedural business rules in PerformancePoint Planning. This is where multi-dimensional knowledge and experience helps, as you need to know whether it's best to use a SQL or an MDX implementation for your rules.

User Requirements

One area where I've found this particularly relevant is user requests for procedural rules that accept parameters that are member(s) of a dimension, e.g. "run the rule for the selected member in the entity dimension", perhaps to carry out a what-if on a particular entity. This is fine for a single member, but on occasion I've found that the users want to pick either a) a leaf-level member (which is fine), or b) a non-leaf member, and for the rule to apply the logic to all descendants of that member.

The Technical Problem

What technical problem? You can use the Descendants function to achieve this, right? You can with an MDX rule, but you can't with a SQL rule. So surely just use an MDX rule then? This is where you have to think relational world vs OLAP world. Putting a statement such as Descendants($Entity$, 10, leaves) in the scope of your rule will mean that when the rule is run, it will always operate on the leaves of the dimension, regardless of whether you pick a leaf or non-leaf member. But the issue is that if your scope statement covers cells that have definition rules against them, then an MDX query will pick up the results of those cells and shove them into the fact table! So can you use the Except() function or something similar to filter out the cells that you don't need? Yes you can, but not if the members start to span different levels - the compiler won't let you do that. As I said, you can't use a SQL implementation, because using Descendants($Entity$, 10, leaves) in the scope statement will tell you: "A SQL implementation is forbidden by a parameter or variable reference in a MemberSet expression unless the reference is the entire expression."
The Solution

Rules that operate on fact data should really use SQL - it's quicker and avoids hitting calculated data. I'm sure that you've seen this coming a mile off (!), but one way to go about this is to use a NativeSQL rule. You can reference database objects in your NativeSQL rule, one of which will make it easier for you to carry out descendant/hierarchy-type functions in SQL itself. A function gets created in the Calc schema of your Planning application database, in the format fnCalc_0_DimensionName_MemberSetName_IsAncestorOrSelf - this occurs when you deploy your model site. Using this function in conjunction with NativeSQL will allow you to get the leaves of a member quite easily. The following NativeSQL rule creates a copy of one scenario's data into another scenario, which is referenced by a parameter:

--Although integers, the parameters are actually 'members of a dimension' parameters
create proc Calc.[sp$procedure] @scenario int, @entity int
as
begin
    begin transaction

    insert into dbo.[tbl$factTable]
        (alldim$except(dim$Entity, dim$Scenario), dim$Entity, dim$Scenario,
         column$ruleId, Value, LoadingControlID, bmo$CreateDateTime, bmo$ChangeDateTime, ContextID)
    select alldim$except(dim$Entity, dim$Scenario), dim$Entity, @scenario,
         $ruleId, Value, -9, getutcdate(), getutcdate(), -1
    --No need to hard code the fact table name
    from dbo.[tbl$factTable] t
    inner join
    (
        select e.memberId
        from dbo.d_Entity e
        --Get leaves of the member key that is in @entity
        where Calc.fnCalc_0_Entity_Divisions_IsAncestorOrSelf(@entity, e.memberid) = 1
        and e.memberid not in (select h.parentmemberid from dbo.H_Entity_Divisions h)
    ) lv on lv.memberid = dim$Entity
    where t.dim$Time >= 200701 and t.dim$Time <= 200712
    and (dim$Scenario = 1)

    if @@error <> 0
        rollback transaction
    else
        commit transaction
end

I suppose the above rule may look a little daunting, due to all the dollar signs everywhere.
These are just the preprocessor commands, which get replaced when the stored procedure is created. But other than that, it's just an insert statement. In some cases the syntax is actually quite handy, as you can use statements such as alldim$except(), which is a shorthand way of returning fact table dimension column names very easily. The idea behind the preprocessor commands is that you can protect the logic in your rule from minor structural database changes. The key is that when the rule gets run, it has normal parameters, meaning the user will see a standard parameter interface. In the case above, the member key of North America Operations will end up in the stored procedure, as I have a parameter in the stored procedure called @entity. Inside the stored procedure it is then used in combination with the IsAncestorOrSelf function to pick up the leaf-level descendants. The only caveat that I will make is that this hasn't come from a production environment, so I've skimped a bit on a few of the operational details, such as deleting the existing data, decent error handling and returning the number of records affected. These are all things that you should add if you use this kind of thing in a production environment!
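To illustrate what the generated IsAncestorOrSelf function achieves when combined with the "not in the parent column" filter, here is a small sketch over a hypothetical parent-child hierarchy (Python, standing in for the d_Entity and H_Entity_Divisions tables):

```python
# memberId -> parentMemberId, as in a H_Entity_Divisions-style table
# (hypothetical data: member 1 is the root, 4/5/6 are leaves).
parent_of = {
    2: 1, 3: 1,
    4: 2, 5: 2,
    6: 3,
}

def is_ancestor_or_self(ancestor: int, member: int) -> bool:
    """Walk up the hierarchy, as the generated Calc function does."""
    while member is not None:
        if member == ancestor:
            return True
        member = parent_of.get(member)
    return False

parents = set(parent_of.values())  # members that appear as a parent

def leaf_descendants(member: int):
    """Descendants-or-self of 'member' that never appear as a parent."""
    members = set(parent_of) | parents
    return sorted(m for m in members
                  if is_ancestor_or_self(member, m) and m not in parents)

print(leaf_descendants(2))  # -> [4, 5]
print(leaf_descendants(1))  # -> [4, 5, 6]
```

Whether the parameter is a leaf or a non-leaf member, the same two conditions always resolve to the leaf level, which is exactly the behaviour the Descendants() MDX function could not deliver in a SQL rule.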