Adatis

Adatis BI Blogs

SQL Server 2005 Service Pack 3

SQL Server 2005 SP3 is only a beta at this stage, but it is available for download here, and it includes all the patches released to date: http://www.microsoft.com/downloads/details.aspx?FamilyId=D22317E1-BC64-4936-A14B-7A632B50A4CA&displaylang=en

Not a huge number of fixes, but a few for Reporting Services caught my eye:

Optimised SharePoint integration, to make reports respond faster.
Connector to Teradata.
Enhanced PDF rendering.

On the subject of service packs, it has now been announced that PerformancePoint Server SP2 "should" support SQL Server 2008 - let's hope we get our hands on this soon.

PerformancePoint Planning - Data change tracking / audit trail

PerformancePoint Planning does contain auditing out-of-the-box.  However, it's not data-level auditing but metadata-level auditing.  If you want to track when a user has been added, or a workflow cycle started, then PPS-P keeps track of all these changes, plus a whole host of other system changes, in an XML file located in the directory specified in the 'Audit file name' property of the 'Auditing' tab in PAC.  The auditing will answer questions such as:

"When was the staging database last synchronised?"
"Who re-started my assignment?"
"When and who changed these business rules?"

This audit file covers a lot, and one day, unless I'm beaten to it by Microsoft (hint hint!), I'll either write an XSLT file to transform the XML into a 'nice looking' HTML page, or build some routines to suck the audit files into a usable database where I can stick some SSRS reports over the top.  Until then, Notepad will have to do (though see the sketch at the end of this post for one way to get started).

What's missing out-of-the-box is the ability to audit data-level changes.  These are the questions that are actually more pertinent to Business Performance Management and to requirements under Sarbanes-Oxley (SOX), the Basel Accords and other, related regulations:

"Who increased the salary assumption for the North West sales force?"
"When were the revenue targets last increased?"
"How many times do the sales forecasts get amended before they are approved?"

This post details a simple mechanism that could be used to implement a basic data change audit trail capable of answering questions like these.  At this juncture I must point out that the approach described below entails making database-level changes to the Planning Application database and would therefore result in an unsupported environment.  In reality you would leave the planning database well alone and make these changes to a log-shipped or equivalent synchronised copy of the database, well away from the main production planning application database.  There, you have been warned!

In this scenario I want to place an audit trail on a simple assumption model.  The assumption model contains five dimensions: Time, Account, Region (Entity), Role (a custom dimension) and Scenario.  I want to track all changes to data within this model.  To set up the auditing the following is required:

Audit Tables

Three new tables are required.  These are essentially copies of existing Planning Application tables and are used to keep a history of inserts and updates.  I create these tables in an 'Audit' schema; this keeps them grouped together and allows me to maintain the same name as the original.  The only difference between each copy and its original is the addition of a new primary key, 'AuditId'.

MG_Drivers_MeasureGroup_default_partition
This is the main fact table of the assumption model that will carry the audit trail.  A complete history of changes will be maintained in this table.

BizUsers
BizUsers contains all current users of the planning application.  As users can be removed from the application, it is necessary to audit this table so that changes made by non-current users can be forever traced.

AssignmentsHistory
Stores history information relating to an assignment, such as when it was submitted (either draft or final), the comments that were added by the submitter, etc.  The key field in this table is the UserId, which allows us to know where to point the finger of blame!  (Figuratively speaking, of course!)
Like the BizUsers table, when assignments are no more, the associated history records are purged too.  Auditing this table maintains the history.  You could argue that each of the dimension tables also needs auditing, for cases when members are removed from the dimension in the future.  I can't argue against this and, depending on your business process and application, you might be quite right!

Example Audit Table Create Statement:

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [Audit].[AssignmentsHistory](
    [AuditID] [int] IDENTITY(1,1) NOT NULL,
    [AssignmentID] [bigint] NOT NULL,
    [ChangeDatetime] [datetime] NOT NULL,
    [CreateDatetime] [datetime] NOT NULL,
    [UserID] [bigint] NOT NULL,
    [Action] [nvarchar](100) NOT NULL,
    [Comments] [nvarchar](2000) NULL,
    CONSTRAINT [PK_AuditAssignmentsHistory] PRIMARY KEY CLUSTERED
    (
        [AuditID] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

Triggers

Now, one of my colleagues in particular will shoot me down for this as, for years, I've been strongly and completely against triggers.  They cause no end of pain when performance tuning and debugging line-of-business application databases.  I always argue that all trigger logic can, and should, reside in (transaction-based) stored procedures that actually perform the data updates.  I do still stand by this but, in this situation, we don't really have general access to the data update stored procedures so, although I'm breaking one of my own ten commandments, my hand is forced.

For each of the original tables a trigger is required to update the appropriate audit table.  To make the reporting easier I create the trigger for both Update and Insert; this means that the most recent version of the record is in the audit table too.  (C'mon, storage is cheap!)

Example Original Table Trigger Create Statement:

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TRIGGER [dbo].[trig_AssignmentsHistory_Audit]
ON [dbo].[AssignmentsHistory]
FOR UPDATE, INSERT
NOT FOR REPLICATION
AS
SET NOCOUNT ON

INSERT INTO Audit.AssignmentsHistory(
    AssignmentID,
    ChangeDatetime,
    CreateDatetime,
    UserID,
    Action,
    Comments)
SELECT
    AssignmentID,
    ChangeDatetime,
    CreateDatetime,
    UserID,
    Action,
    Comments
FROM Inserted

View

In order to report the audit trail, a simple view is required to show the changes over time of the values held in the fact.  This view is primarily based on the audit fact table and resolves all the dimension member ids to give you the dimension member names.

Example Audit Trail View (this view is specific to my fact table; it will need modification for each fact table):
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER VIEW [Audit].[AuditReport]
AS
SELECT
    AuditFact.ChangeDatetime,
    dbo.D_Scenario.Name AS Scenario,
    dbo.D_Account.Name AS Account,
    dbo.D_Entity.Name AS Region,
    dbo.D_Role.Name AS Role,
    (SELECT TOP 1 UserName
     FROM Audit.AssignmentsHistory ah
     INNER JOIN Audit.BizUsers bu ON ah.UserID = bu.UserId
     WHERE AssignmentId = AuditFact.AssignmentId) AS ChangedBy,
    AuditFact.Value
FROM Audit.MG_Drivers_MeasureGroup_default_partition AS AuditFact
INNER JOIN dbo.D_Role ON dbo.D_Role.MemberId = AuditFact.Role_MemberId
INNER JOIN dbo.D_Account ON dbo.D_Account.MemberId = AuditFact.Account_MemberId
INNER JOIN dbo.D_Scenario ON AuditFact.Scenario_MemberId = dbo.D_Scenario.MemberId
INNER JOIN dbo.D_Entity ON AuditFact.Region_MemberId = dbo.D_Entity.MemberId

The view above will detail each of the changes made to each of the facts over time when queried like this:

SELECT
    Scenario, Account, Region, Role, ChangedBy, ChangeDateTime, Value
FROM [Audit].[AuditReport]
WHERE Scenario = 'Budget'
  AND Account = 'Annual Salary'
  AND Region = 'South West'
  AND Role = 'Assistant Manager'
ORDER BY ChangeDateTime DESC

Resulting in the following output (most recent first):

Notice that the HR Director set the initial salary value for the South West Assistant Manager, the rather indecisive Finance Director made a few amendments, and the value was later adjusted again by the HR Director.  PerformancePoint Planning data change tracking!

Possible Enhancements

The above method describes a very simple way of auditing data-level changes in PPS-P.  It can be taken much further, and I've included some possibilities below:

Develop a parameter-driven SSRS report to show the audit trail for a given dimension member combination.
Create some generic scripts that accept the fact table as a parameter and create the appropriate audit tables, triggers and views.
Link up with the fact table annotations table and/or the submission comments (held in the AssignmentsHistory table) to enrich the reporting.
The data volume will obviously grow over time - intelligent purging/archiving of audit records.
Update the view to handle certain conditions where adjacent audit trail records contain no apparent change in value (only the change date will be different - this can occur when a contributor submits, makes subsequent changes and submits again without first clearing current changes).
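Finally, back to the metadata audit XML file mentioned at the start of this post.  Until the XSLT or SSRS treatment materialises, a minimal C# sketch along the following lines will at least get the file out of Notepad and into something queryable.  The file path is hypothetical, and it assumes the audit file is well-formed XML - if it turns out to be a log of XML fragments you would need to wrap it in a root element first:

using System;
using System.Data;

class AuditFileLoader
{
    static void Main()
    {
        // Hypothetical path - use the location set in the 'Audit file name' property in PAC
        string auditFile = @"C:\PPSAudit\PlanningAudit.xml";

        // Let ADO.NET infer a relational shape from the XML
        DataSet audit = new DataSet("PlanningAudit");
        audit.ReadXml(auditFile);

        // Dump what was found; from here the tables could be bulk copied into a
        // reporting database and SSRS pointed at them
        foreach (DataTable table in audit.Tables)
        {
            Console.WriteLine("{0}: {1} rows", table.TableName, table.Rows.Count);
        }
    }
}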

LINQ and SQL Server

Some time ago I reported on LINQ whilst Visual Studio 2008 was still in beta.  On Monday, 19 November 2007, Microsoft announced that Visual Studio 2008 and the .NET Framework 3.5 had been released to manufacturing (RTM).  Since then I am sure many of you have had many hours playing with VS 2008, WPF, Expression Blend, and all that these new products contain.  Having finally got my hands on it, and found the time, LINQ was my first stop.

We have a database that we need to run complex queries on, where stored procedures just would not be flexible enough.  Inline SQL is an obvious no-no, so this was a perfect scenario for a LINQ to SQL implementation.  Having decided that a DBML (Database Markup Language) structure would work best, generating that file, and being able to regenerate it as the database changed, was crucial.  After some research we found a small, little-known command line application called SqlMetal, which allows you to drop and re-create DBML files based entirely on a given database.

SqlMetal will always produce a consistent data access layer:

SqlMetal is a command line tool that can generate the data access layer in seconds.
SqlMetal will produce either C# or VB.NET code.
SqlMetal generates a strongly typed data access layer.  This is great for reducing runtime errors: those errors now pop up in the development cycle as compile errors, reducing stress on the QA department, support department and upper management later in the life cycle.
SqlMetal can generate strongly typed interfaces for stored procedures and user-defined functions.  From the developer's point of view, it is now a complete breeze to call a stored procedure and/or a user-defined function.

To generate source code directly from a SQL database, execute the following:

C:\Program Files\Microsoft SDKs\Windows\v6.0A\Bin> SqlMetal /server:baker /database:AdventureWorks /namespace:AdventureWorks /dbml:AdventureWorks.dbml

This creates a DBML file that can be added to your project, giving you an entire database structure accessible via LINQ.  One thing to be aware of, though: ensure you have a solid database design, as SqlMetal will transform your database design as-is, identifying all keys and relationships.  Once you have the database as an object model, you can use LINQ as you wish.  LINQ to SQL uses ADO.NET under the covers but offers several additional benefits:

Reduced complexity.
Fewer lines of code.
Strong typing.
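To give a flavour of what "use LINQ as you wish" looks like, here is a minimal sketch of a query against the generated object model.  It assumes the SqlMetal-generated DataContext is called AdventureWorks and exposes Product and SalesOrderDetail tables - the actual class and property names will depend on your schema and the SqlMetal options you choose:

using System;
using System.Linq;

class LinqToSqlExample
{
    static void Main()
    {
        // Connection details are illustrative only
        using (AdventureWorks db = new AdventureWorks(
            "Data Source=baker;Initial Catalog=AdventureWorks;Integrated Security=True"))
        {
            // Top ten products by revenue - the whole expression is translated to SQL
            // and executed on the server when it is enumerated
            var topSellers =
                (from p in db.Product
                 join d in db.SalesOrderDetail on p.ProductID equals d.ProductID
                 group d by p.Name into g
                 orderby g.Sum(d => d.LineTotal) descending
                 select new { Product = g.Key, Revenue = g.Sum(d => d.LineTotal) }).Take(10);

            foreach (var row in topSellers)
                Console.WriteLine("{0}: {1:C}", row.Product, row.Revenue);
        }
    }
}

Everything here is compile-time checked, so a renamed column shows up as a build error rather than a runtime exception - exactly the strong typing benefit described above.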

PerformancePoint Server and SQL 2008

After many hours looking into this issue, it seems that the current release of PPS does not work on SQL Server 2008, and there seems to be no tricking it.  Should you manage to get round the install checks, the versions are checked once again while setting up the database connections to the SQL and Analysis servers, and the door is currently closed.  In the meantime, SQL 2005 will still work!

SQL Server 2008 is still due to launch in February, however it now looks as if the RTM (release to manufacturing) will not be until Q3 of 2008 - so we are unlikely to see even early adoption until around July.  Microsoft sees SQL Server 2008 as a critical step forward towards its "broader vision for business intelligence", so I think we can rest assured that PerformancePoint will be integrated before the release date.  In the meantime I will update you with any further developments on this, and will continue testing with further CTPs and patches.

Install SQL Server 2005, After 2008

So, if like me, you wanted to look at SQL 2008 and thought it best to totally remove 2005 and then install 2008 (Katmai), only to realise that the Katmai BI tools did not install properly, you may want to re-install 2005.  If you have got to this stage you may have found an annoying installation bug, even after removing 2008.  It concerns the SQL Native Client: it fails to install, and therefore most of SQL 2005 fails to install.  After numerous attempts there is a simple fix.  From the SQL 2005 disc, run Sqlncli.msi and choose Remove (Repair doesn't work!).  Once it is removed, install SQL Server 2005 as normal - easy, when you know how!

PPS Planning - Force Cancel Check Out Functionality

You've been badgering away at a PerformancePoint Server Planning model all day, updating objects, writing complex MDX scope statements to satisfy obscure business requirements, and you've only got to add a new view to the time dimension before you can deploy and start testing - when you find that your colleague, the one that had to leave early to pick up the kids, has it checked out!

What's the problem?  Surely you can undo the checkout from the admin console like you can in VSS Admin?  Okay, he could lose his isolated changes, but you've been at the whole model for the whole day.  Well, there is some good news and some rather irritating bad news!

There is indeed an admin-level undo checkout feature that allows you to unlock the objects so you can check them out to yourself, and this feature is easily found in the Planning Admin Console.  Its one drawback is that it forces you to undo the checkout for ALL checked-out objects in an application.  There is no ability to select an individual planning object!  In my scenario it was not that much of a big deal, as I could check all my objects in and then force the undo checkout on the only remaining checked-out object, but I'm not always going to be that lucky.

On the face of it, it would be little effort for the Redmond gang to add that additional functionality to the Admin Console.  You can obviously check in/check out individual objects from within the Business Modeler, so the stored procs already exist, right?  Hmm, actually, that gives me an idea...

Basket Analysis using Analysis Services 2005

My job is pretty varied these days :o)  Today I've been looking at satisfying some basket analysis type requirements where, for example, we need to determine what other products customers have bought in addition to a known product.  We could have used data mining for this but decided not to as, in this case, it was literally a few mouse clicks away from our existing AS2005 cube.  The implementation was surprisingly straightforward and the query results (admittedly on a subset of the full dataset) very impressive.  To outline the implementation steps I will add some basket analysis to the Adventure Works sample database, to demonstrate how easy it was!

Requirement

List all products (and associated profit margin) that have also been purchased by customers who have bought a road bike from the 550-W range.

Approach

Rather than using data mining to handle queries of this type, I want to extend my current cube.  In order for the performance to be acceptable, we will adopt a variation on the many-to-many distinct count solution discussed in numerous blogs, white papers and articles.  The result of this approach is a new measure group, which we'll call 'Customer Cross Purchase', and a new reference dimension based on product for our 'known product', which we'll call 'Product Filter'.

From a relational perspective, the Product Filter dimension provides a lookup for the customers (in Customer Cross Purchase) that have purchased that product.  This in turn provides the ability to locate all orders (Internet Sales fact) for that subset of customers.  Once we know all the orders for that subset of customers, we can simply list the distinct products (Product) that make up those orders.

Implementation

The measure group contains a distinct set of customers and products that will act as the junction table of our many-to-many.  This is a view over the main fact and customer dimension:

CREATE VIEW [dbo].[vCustomerCrossPurchase]
AS
SELECT DISTINCT f.CustomerKey, f.ProductKey
FROM dbo.FactInternetSales AS f
INNER JOIN dbo.DimCustomer AS c ON f.CustomerKey = c.CustomerKey

Next, the fact table [view] is added to the Data Source View, ensuring the relationships to the customer and product dimensions are set up.  With the DSV updated, the cube itself can be extended.  A new measure group is created, together with a Product Filter reference dimension.  The dimension usage looks like the diagram below; this ensures the appropriate relationships exist as outlined above.  The new measure group is mapped to the Product Filter and Customer dimensions, as per our DSV.  Note that this is not done automatically, as the real (non-referenced) product dimension is selected instead.  To complete the picture, the Customer Cross Purchase measure group is used to create a many-to-many relationship between the Product Filter and the main Internet Sales measure group.

Testing

Once deployed and processed, we can test our modifications to check for some reasonable results.  The following MDX query returns a list of all products that have been bought by customers buying a road bike from the 550-W range in Reading, England.
select
    non empty [Product].[Product].members on rows,
    [Measures].[Internet Gross Profit] on columns
from
    [Adventure Works]
where
(
    [Product Filter].[Product Model Categories].[Model].&[Road-550-W],
    [Customer].[City].&[Reading]&[ENG]
)

The query is simple: it lists products on the rows and profit on the columns, while the 'where' clause slices by Reading, England and employs the new Product Filter dimension.  The Product Filter dimension has the effect of slicing the main fact table down to customers that have bought a bike from the Road 550-W range.  So, we can see that apart from the road bikes, a few other accessories have been purchased too.  A quick couple of queries confirm the results: three customers (above) have bought a road bike from the 550-W range, and the other products these customers have bought (below) match our results!

MDX Stored Procedures (Part 1)

UPDATE:  You can find part 2 here, along with a link to download a C# quick start.

A long time ago I realised that inevitably, however well specified the requirements or however tight the deadlines, clients always throw a curve ball.  To that end, like many, I adopt an agile approach to project delivery, with a keen eye on building frameworks and generic components that improve productivity and flexibility for current and future projects.

My current project is no different to any other: we knew the requirements would change as, even from the outset, they were high level and vague.  With little budget to flesh them out, a controlled, iterative approach was essential.  It's a typical BI project - various data sources require consolidating and processing to allow analytics and reporting to assist in strategy and revenue generation.  The reporting element was the vaguest of them all, although what did exist immediately eliminated Reporting Services as the reporting/delivery mechanism.  The required reports were highly interactive and aesthetically challenging enough to stretch Reporting Services functionality to the limit, even without the vagueness-busting curve balls out of sight but already primed for deployment.

With all this in mind, we decided to use ASP.NET 2.0 web forms for the reporting, as this gave us a little more freedom with regard to layout, a high degree of flexibility surrounding interactivity, and the ability to react quickly to the ever-changing requirements.  Now, with an eye on productivity, we decided to build an MDX helper class (mentioned previously here and here) that would streamline the development process and enable us to react quickly to those inevitable changes.  Our own requirements for such a component were as follows:

Allow text files containing parameterised MDX to be executed.
Support the execution of multiple queries over a single connection.
Return a conventional ADOMD cell set object containing the query results.
Return a typed/untyped ADO.NET DataTable object for binding to native and third party .NET controls.
Allow the inspection of various server, database and cube properties.
Be lightweight, re-usable and simple to use.

The public interface of the resultant component looks like this:

In its simplest form, usage is straightforward.  Take the following parameterised query stored in the file UKSales.txt:

select
    [Product].[Product Categories].[Category].members on rows,
    [Measures].[Internet Sales Amount] on columns
from
    [Adventure Works]
where
    @Country

You return an ADOMD cell set object using the following C# code snippet:

// Create an instance of the helper class, setting cube and query information
MDXHelper helper = new MDXHelper("ASTestServer", "Adventure Works DW", "Adventure Works", MDXHelper.MDXResourceFormat.RelativeURL, "UKSales.txt");

// Add in the country parameter
helper.Parameters.Add("@Country", "[Geography].[Geography].[Country].&[United Kingdom]");

// Execute the query
CellSet result = helper.ExecuteCellSet();

// Tidy up
helper = null;

There are several overloads of the constructor and 'Execute' methods to allow for flexible usage, and it's been used in anger now for a good couple of weeks, supporting every type of MDX query we've managed to throw at it so far.  It still needs a little refinement and some full-on testing, but it has achieved its goal and has helped us easily digest changes to the original requirements.
It has also given us some great ideas for version 2 where the query definitions are stored in the Analysis Services cube definition itself. In a future post I'll delve into the approach and implementation of the helper class to see how it works under the hood.
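In the meantime, as a taste of the DataTable route from the requirements list above, here is a hypothetical usage sketch.  The ExecuteDataTable method name and the SalesGridView control are my assumptions based on that list, not necessarily the component's actual interface:

// Hypothetical usage sketch - ExecuteDataTable and SalesGridView are assumed names
MDXHelper helper = new MDXHelper("ASTestServer", "Adventure Works DW", "Adventure Works", MDXHelper.MDXResourceFormat.RelativeURL, "UKSales.txt");
helper.Parameters.Add("@Country", "[Geography].[Geography].[Country].&[United Kingdom]");

// Return the results as an ADO.NET DataTable and bind to a standard web control
DataTable salesTable = helper.ExecuteDataTable();
SalesGridView.DataSource = salesTable;
SalesGridView.DataBind();

// Tidy up
helper = null;

This is the sort of one-line binding that makes the ASP.NET 2.0 web forms approach described above quick to iterate on.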

XML Namespaces and the ADOMD XML Reader Result Set

I'm building a generic helper class to flatten an ADOMD data set to a data table for easier consumption by native and third party .NET controls.  I can't rely on the Load method of the data table, nor on the CellSet object, as in some cases not all the information is persisted; data types and top-level hierarchies to name two.  To persist all of the information I need to traverse the actual XML data set returned by the ExecuteXmlReader method of the AdomdCommand object.

I don't use XML enough these days to remember all the nuances with namespaces, and it took me a little while to work out what to set the namespaces to in order for the SelectNodes method to work.  Below is the code snippet I used to ensure the output from the ExecuteXmlReader method is fully traversable.

// Execute the command, returning the XML data set
XmlReader reader = command.ExecuteXmlReader();

// Load the output into an XML document
XmlDocument xml = new XmlDocument();
xml.Load(reader);

// Create an XML namespace manager
System.Xml.XmlNamespaceManager nsManager = new XmlNamespaceManager(xml.NameTable);

// Add the namespaces to the manager
nsManager.AddNamespace("defaultNS", "urn:schemas-microsoft-com:xml-analysis:mddataset");
nsManager.AddNamespace("xsi", "http://www.w3.org/2001/XMLSchema-instance");
nsManager.AddNamespace("xsd", "http://www.w3.org/2001/XMLSchema");

// Return a list of nodes that represent the row information
XmlNodeList rows = xml.SelectNodes("defaultNS:root/defaultNS:Axes/defaultNS:Axis[@name ='Axis1']", nsManager);

This KB article helped me out a bit: http://support.microsoft.com/default.aspx?scid=kb;en-gb;q318545

There are some more goodies included in my helper class that I will blog about in the future, once it's stable!
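From here, walking the axis is just more namespace-qualified XPath.  As a rough follow-on sketch, the member captions for each row tuple can be pulled out as below; the element names (Tuples, Tuple, Member, Caption) are from the mddataset schema as I recall it, so treat them as assumptions and check them against your own trace output:

// Rough sketch: pull the member captions for each tuple on the row axis
foreach (XmlNode axis in rows)
{
    XmlNodeList tuples = axis.SelectNodes("defaultNS:Tuples/defaultNS:Tuple", nsManager);
    foreach (XmlNode tuple in tuples)
    {
        foreach (XmlNode member in tuple.SelectNodes("defaultNS:Member", nsManager))
        {
            XmlNode caption = member.SelectSingleNode("defaultNS:Caption", nsManager);
            if (caption != null)
            {
                Console.WriteLine(caption.InnerText);
            }
        }
    }
}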

Analysis Services XMLA to clear query cache

To get a realistic measurement of how your query tuning improves query response time, you should start from a common point.  To help with this, clear the query cache before re-running your queries after making improvements (unless you are cache warming, of course!).  Here is some XMLA script to clear the cache for a specified cube:

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
   <ClearCache>
      <Object>
         <DatabaseID>database id</DatabaseID>
         <CubeID>cube id</CubeID>
      </Object>
   </ClearCache>
</Batch>

Replace database id and cube id with the details for your cube.  Notice these are IDs, not names - look in the Analysis Services properties windows to find them.
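If you would rather fire this from code than from Management Studio - handy in a test harness that times queries before and after tuning - a minimal sketch using AMO might look like the following.  The server name and the database/cube IDs shown are placeholders only:

using System;
using Microsoft.AnalysisServices;  // AMO

class ClearCacheExample
{
    static void Main()
    {
        // Placeholder XMLA - substitute your own database and cube IDs
        string xmla =
            "<Batch xmlns=\"http://schemas.microsoft.com/analysisservices/2003/engine\">" +
            " <ClearCache>" +
            "  <Object>" +
            "   <DatabaseID>Adventure Works DW</DatabaseID>" +
            "   <CubeID>Adventure Works</CubeID>" +
            "  </Object>" +
            " </ClearCache>" +
            "</Batch>";

        Server server = new Server();
        try
        {
            // Placeholder server name
            server.Connect("Data Source=localhost");

            // Execute the XMLA and echo any messages returned
            XmlaResultCollection results = server.Execute(xmla);
            foreach (XmlaResult result in results)
            {
                foreach (XmlaMessage message in result.Messages)
                {
                    Console.WriteLine(message.Description);
                }
            }
        }
        finally
        {
            server.Disconnect();
        }
    }
}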

MDX Performance Hints

Amongst some common, known hints there are a few new bits and pieces in this white paper: http://www.sqlserveranalysisservices.com/OLAPPapers/MDX%20Performance%20Hintsv1.htm

AS2005 Cube Processing Via SSIS 2005 and SQL Server Agent

Microsoft has confirmed that processing AS2005 cubes from SSIS 2005, when invoked from SQL Server Agent, defaults to serial/sequential processing, irrespective of whether the parallel processing option has been selected.  This results in increased cube processing times (possibly up to four times longer) when compared with using XMLA scripted tasks from SQL Server Agent.  Microsoft has reproduced the problem and has escalated it to the SSIS product team.