Adatis BI Blogs

SQL Server 2012 Running Totals

SQL Server has sometimes come under fire in the Oracle vs SQL Server debate because of the lack of some of the more advanced (and less used) functionality outlined in the ANSI SQL standards. An example of this is window functions, which became an ANSI standard in the SQL:2003 revision. A window function is an aggregate function that can be applied to a subset of a full set of data. A running total can be achieved in current versions of SQL Server (2008 R2 and its predecessors), but not using window functions, and therefore with a performance implication. There are three or four approaches to achieving a running total in SQL Server prior to the 2012 version, none of them particularly elegant. In the following example I will outline how to perform a running total using AdventureWorks2008 sample data. I will remind us of one of the methods (probably the most common) we used for running totals prior to SQL Server 2012, and then show how to do the same running total using a SQL Server 2012 window function. To set the scene, we will be looking for a running total of line items for a given order. This is the kind of query you might write to generate an invoice with a running total on it. To begin with, let’s look at the more traditional query:

SELECT
    A.SalesOrderID,
    A.SalesOrderDetailID,
    A.LineTotal,
    SUM(B.LineTotal) AS OrderRunningTotal
FROM Sales.SalesOrderDetail AS A
    CROSS JOIN Sales.SalesOrderDetail AS B
WHERE
    B.SalesOrderDetailID <= A.SalesOrderDetailID
    AND A.SalesOrderID = B.SalesOrderID
GROUP BY
    A.SalesOrderID,
    A.SalesOrderDetailID,
    A.LineTotal
ORDER BY
    A.SalesOrderID,
    A.SalesOrderDetailID,
    A.LineTotal

This simply works by self-joining each row to all rows up to that point, so it works fine for any data that can be ordered easily; as running-total data usually can be, this method usually suffices.
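For completeness, another of the three or four pre-2012 approaches mentioned above is a correlated subquery. This is a sketch against the same AdventureWorks table; it reads a little more clearly than the triangular join, but does the same row-by-row work and carries the same performance implication:

```sql
-- Pre-2012 alternative: a correlated subquery computing the running total.
-- Like the CROSS JOIN version, this re-scans the earlier rows of each order
-- for every row returned, so the work grows quadratically per order.
SELECT
    A.SalesOrderID,
    A.SalesOrderDetailID,
    A.LineTotal,
    (SELECT SUM(B.LineTotal)
     FROM Sales.SalesOrderDetail AS B
     WHERE B.SalesOrderID = A.SalesOrderID
       AND B.SalesOrderDetailID <= A.SalesOrderDetailID) AS OrderRunningTotal
FROM Sales.SalesOrderDetail AS A
ORDER BY
    A.SalesOrderID,
    A.SalesOrderDetailID;
```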
The following query uses the new window function support in SQL Server 2012:

SELECT
    SalesOrderID,
    SalesOrderDetailID,
    LineTotal,
    SUM(LineTotal) OVER
        (PARTITION BY SalesOrderID
         ORDER BY SalesOrderDetailID) AS OrderRunningTotal
FROM Sales.SalesOrderDetail
ORDER BY
    SalesOrderID,
    SalesOrderDetailID,
    LineTotal

You will instantly notice that the second query is far more elegant, and simpler to understand. In short, this query tells SQL Server to sum the LineTotal over an ordered partition of the data. Both queries return the same result, however it becomes interesting when we look at the execution plans relative to each other. This is shown below (apologies for the size of these): What is important to note is that the second (bottom) plan is smaller and simpler. The other very significant point is that the query cost (relative to the batch) is a whopping 97% for the old method of running totals, meaning that the new window functions are far more efficient. In summary, I believe this is an example of SQL Server becoming a firm competitor to some of its perceived rivals: window functions were one of the few valid points that could be raised in the SQL Server vs Oracle argument, but not any more!
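One further tuning note, as a sketch. When an OVER clause contains an ORDER BY but no explicit frame, SQL Server defaults to a RANGE UNBOUNDED PRECEDING frame; for running totals you can state a ROWS frame explicitly, which returns the same result here (SalesOrderDetailID is unique per order) and generally performs better, as the ROWS frame can be computed with an in-memory spool:

```sql
-- Same running total, with the window frame pinned explicitly to ROWS.
-- Without this, the default frame is RANGE UNBOUNDED PRECEDING, which is
-- typically slower for running totals.
SELECT
    SalesOrderID,
    SalesOrderDetailID,
    LineTotal,
    SUM(LineTotal) OVER
        (PARTITION BY SalesOrderID
         ORDER BY SalesOrderDetailID
         ROWS UNBOUNDED PRECEDING) AS OrderRunningTotal
FROM Sales.SalesOrderDetail
ORDER BY
    SalesOrderID,
    SalesOrderDetailID;
```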

Master Data Services Excel Add-in

Master Data Services in SQL Server Denali now includes an Excel add-in that allows MDS users to add and update data that exists within Master Data Services. For those of you that haven’t had a chance to download the latest CTP, this post gives an overview of what’s possible in the Excel add-in.

Excel Add-in Overview

Once you install the Excel add-in (available here for download) you will see an additional toolbar in the ribbon, as shown below: At a high level, the following functionality is available within the MDS add-in:

- Retrieve members from a master data entity
- Add or update members (including annotations) and publish the changes back to MDS
- Create entities in MDS if you have sufficient permission
- Run the business rules

I don’t want this post to go on for ever, so today I’m going to focus on retrieving and updating members.

Retrieving MDS Entity Members

The Excel add-in comes with a sidebar called the Explorer that can be used to connect to an MDS model and retrieve data from an entity within the model. What I quite like is that there is a filter button that allows a user to specify how to filter the data before loading it. In the example below, I’ve connected to the sample Customer model, and have filtered the Customer entity to only show customers of type 1 and customers from Seattle: On the same filtering window it’s possible to choose the attributes to display. Therefore in the above example, by clicking the ‘Load Data’ button, a filtered list of customer members will be shown:

Editing Members

Once the members have been retrieved, editing is just a simple case of typing into the field that you want to change. In the example below I’ve chosen to change the names of two of the customers, which has caused the cells to be highlighted, informing me of the changes that I’ve made: Domain-based attributes are also picked up nicely by Excel.
Not that I imagine it would make too much sense, but it’s possible to change the Sales District North Western US (WUSSL) to Central US (CEUS), for example. Excel handles this by rendering a drop-down for this domain attribute: As it is in Master Data Manager, within Excel it’s also possible to display a domain attribute’s name. Regardless of the type of attribute that gets changed, the changes will just remain in the Excel sheet until I click the publish button: Clicking the publish button will result in being prompted to make an optional annotation for each of the changes: The annotations, as well as the member updates, will be committed to the Master Data Services database. The end result is that the transactions can be viewed, as per 2008 R2, in Master Data Manager, or by right-clicking in Excel and choosing ‘View Transactions’.

Summary

The Excel add-in doesn’t contain all the functionality available within Master Data Manager, but provides an interface that Excel-based data stewards will clearly be very comfortable with. It’s also much easier to do bulk updates in the Excel front-end when compared to the web front-end. That’s about it for now. There’s more to the Excel add-in, which I’m aiming to cover at SQLBits 9, plus Master Data Manager has been given a complete overhaul in Denali – hopefully I’ll find the time to cover that soon…

Data Quality Services

As I’m sure you’ve heard, CTP3 of SQL Server Denali was released yesterday, and can be downloaded here. Denali includes Data Quality Services (DQS), Microsoft’s new data cleansing and matching component that’s based on the Zoomix acquisition that occurred a couple of years back. Data Quality Services didn’t make it into the first CTP, but is now available, so I thought it would be worth a quick blog post.

Installing Data Quality Services

Data Quality Services is an option in the main install, but when you go to run the Data Quality Client, you’ll get a message stating that DQS is not installed. As far as I can tell, DQS needs to be installed manually, by running DQSInstaller.exe, which you can find in the SQL Server Binn directory. This will create two SQL Server databases:

Data Quality Client

Once DQS is configured, you’ll be in a position to use the Data Quality Client, which is a Windows application, available in 32 or 64 bit. Once you connect, you’ll get the following screen: The knowledge base is the key to how DQS works, being utilised to determine the data quality of your source data. You can create your own knowledge base, using your own data, or even cloud-based data. For this example, I’m going to use the built-in knowledge base called DQS Data. As an example, I’ve created some data that I want to cleanse. It’s AdventureWorks country data that I’ve put into a separate table that I’ve called dbo.Country. My task now is to clean this data, which I can do by creating a data quality project: I’ve called my project Countries, and I’ve picked the built-in DQS Data knowledge base, which I noticed contains reference data for countries. The activity that I’ve selected is Cleansing, and then I’ve clicked Create. DQS will then prompt for mapping the source data to one of the domains in the knowledge base.
Here I’ve mapped my country name to the Country domain from the knowledge base: After clicking the Start button on the next screen, the cleansing process starts, which gives the following results: I put a couple of typos into my source data, such as changing Algeria to ‘Algerian’ and Albania to ‘Albana’. These were picked up by DQS, along with a few others, but a user has the opportunity to approve or reject each correction, via the radio buttons shown above. I chose to approve the first two, and then clicked Next. In the final screen, DQS allows you to output the cleansed data to a separate table, as shown below: If you go into the table that DQS has created, you will see that there is a Name_Status column, which holds the cleansing status on a per-record basis: This can be used to update the source data, if required, and therefore address the data quality issues.

Summary

This has been a quick introduction to cleansing data with Data Quality Services. There’s plenty more to look at, and I hope to find time to do so at some point. In particular I’m keen to take a look at how DQS works with Master Data Services. I’ve noticed that there’s also a new MS DQS blog (http://blogs.msdn.com/b/dqs/) – I’m sure that will be a great resource for DQS learning.
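For anyone wanting to follow along, the dbo.Country source table described above can be knocked up from the AdventureWorks sample data along these lines (the table name is my own, and the deliberate typos are the ones mentioned in the walkthrough):

```sql
-- A sketch of building the dbo.Country source table from the
-- Person.CountryRegion table in the AdventureWorks 2008 sample database.
SELECT Name
INTO dbo.Country
FROM Person.CountryRegion;

-- Introduce the deliberate typos described above, for DQS to pick up.
UPDATE dbo.Country SET Name = 'Algerian' WHERE Name = 'Algeria';
UPDATE dbo.Country SET Name = 'Albana'   WHERE Name = 'Albania';
```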

Master Data Services in SQL Server Denali

If you didn’t make it to Tech Ed in North America last week, then you can download session recordings and take part in the labs at the Tech Ed site here. The session recordings are also available for streaming or download on Channel 9. Today I wanted to highlight in particular John McAllister’s session titled Managing Master Data with MDS and Microsoft Excel. We’ve heard for a while about the use of Excel with the new version of MDS, so it’s great to finally see it. The key points from the session for me are:

- The new Excel add-in. Allows the editing of master data (it even lets you use Excel formulas) and allows you to publish back to MDS. It seems to provide the same functionality as the UI that we’re used to – it’s aware of the domain attributes, giving you drop-downs in the relevant cells, and is aware of the business rules for an entity. It also lets you actually build an MDS model from Excel.
- Improvements to the current web-based UI (Master Data Manager). The design has changed in several places; for example, it seems that the grid now has a side bar to display the member that you’ve clicked on. This side bar looks to have better validation capability than the current R2 member editing experience.
- Integration with Data Quality Services. It’s a little unclear exactly what this will be, as it’s not shown in the video, so I think we’ll have to wait a while to find out.
- The database now contains a staging table per entity, as opposed to a single shared staging table.
- Improved performance. This is in several areas of the product apparently, but it’s certainly visible in the video that the Master Data Manager UI is quicker.

I’ll be doing a series of blog posts on the new features once the new CTP/beta of Denali is released.

SQL Server Denali CTP 1 Released

I’ve just found out that the first CTP for SQL Server codename Denali is now available for download. The link to get both the 32 and 64 bit versions is: http://www.microsoft.com/downloads/en/details.aspx?FamilyID=6a04f16f-f6be-4f92-9c92-f7e5677d91f9 On a related note, Simon Sabin has recently posted here about the Tech-Ed Europe keynote that shows a glimpse of a new Silverlight reporting tool for SQL that’s part of Denali. Well worth watching the keynote video…