Jeremy Kashel

Jeremy Kashel's Blog

Master Data Services Training in the UK

This is just a quick post to announce a range of SQL Server training courses, organised (and in some cases delivered) by Chris Webb. To start off there’s a SQL Server course delivered by Christian Bolton in December, followed by an Analysis Services course delivered by Chris in February. I’ll be delivering a Master Data Services course in February, before Chris delivers an MDX course in March.

The details for all the courses are:

  • SQL Server Internals and Troubleshooting Workshop – Christian Bolton – 6th–7th December 2011

The Advanced Troubleshooting Workshop for SQL Server 2005, 2008 and 2008 R2 provides attendees with SQL Server internals knowledge, practical troubleshooting skills and a proven, methodical approach to problem solving. The workshop will enable attendees to tackle complex SQL Server problems with confidence.

Full details and registration here.

  • Real World Cube Design and Performance Tuning with Analysis Services – Chris Webb – February 2012

A two-day course that draws on real-world experience to show you how to build a best-practice Analysis Services cube, covering design issues such as data warehouse design and complex cube modelling. Day two then covers performance optimisation for Analysis Services, including MDX optimisation and cube processing.

Full details and registration here.

  • Introduction to Master Data Services with Jeremy Kashel – February 2012

An end-to-end look inside Master Data Services, this full-day course will begin with a synopsis of Master Data Management before moving on to an overview of Microsoft SQL Server 2008 R2 Master Data Services (MDS). The remainder of the course will cover the major MDS topics, such as modelling and business rules, and will include a number of practical exercises.

Full details and registration here.

  • Introduction to MDX with Chris Webb – March 2012

The Introduction to MDX course aims to take you from being a complete beginner with no previous knowledge of MDX to the point where you can write 90% of the MDX calculations and queries you’ll ever need. The three-day course covers everything from the basics, such as sets, tuples and members, to more advanced concepts such as scoped assignments and performance tuning.

Full details and registration here.

Data Quality Services

As I’m sure you’ve heard, CTP3 of SQL Server Denali was released yesterday, and can be downloaded here. Denali includes Data Quality Services (DQS), Microsoft’s new data cleansing and matching component, based on the Zoomix acquisition that occurred a couple of years back. Data Quality Services didn’t make it into the first CTP, but is now available, so I thought it would be worth a quick blog post.

Installing Data Quality Services

Data Quality Services is an option in the main install, but when you go to run the Data Quality Client, you’ll get a message stating that DQS is not installed. As far as I can tell, DQS needs to be installed manually, by running DQSInstaller.exe, which you can find in the SQL Server Binn directory. This will create two SQL Server databases:

[Image: the DQS databases created by DQSInstaller.exe]
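If you want to check that the installer has done its job, a quick query against sys.databases will confirm the new databases are there. This is just a sketch; I'm assuming the DQS database names all share the 'DQS' prefix seen in this CTP:

    -- List the databases created by DQSInstaller.exe.
    -- Assumption: the DQS databases all have a 'DQS' name prefix in CTP3.
    SELECT name, create_date
    FROM sys.databases
    WHERE name LIKE 'DQS%'
    ORDER BY name;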

Data Quality Client

Once DQS is configured, you’ll be in a position to use the Data Quality Client, which is a Windows application, available in 32-bit and 64-bit versions. Once you connect, you’ll get the following screen:

[Image: the Data Quality Client home screen]

The knowledge base is the key to how DQS works; it’s used to determine the quality of your source data. You can create your own knowledge base, using your own data or even cloud-based data. For this example, I’m going to use the built-in knowledge base called DQS Data.

As an example, I’ve created some data that I want to cleanse. It’s Adventure Works country data that I’ve put into a separate table called dbo.Country.
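For reference, the test table can be set up with something like the script below. The table and column names are just the ones I've used here, and the deliberate typos ('Algerian' and 'Albana') are the ones that get picked up later in this post:

    -- Hypothetical setup for the dbo.Country test table, loaded with
    -- Adventure Works country names plus a couple of deliberate typos.
    CREATE TABLE dbo.Country
    (
        CountryId INT IDENTITY(1,1) PRIMARY KEY,
        Name NVARCHAR(100) NOT NULL
    );

    INSERT INTO dbo.Country (Name)
    VALUES (N'Algerian'), -- should be Algeria
           (N'Albana'),   -- should be Albania
           (N'United Kingdom'),
           (N'France'),
           (N'Germany');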

My task now is to cleanse this data, which I can do by creating a data quality project:

[Image: creating a new data quality project]

I’ve called my project Countries, and I’ve picked the built-in DQS Data knowledge base, which I noticed contains reference data for countries. The activity that I’ve selected is Cleansing, and then I’ve clicked Create.

DQS will then prompt you to map the source data to one of the domains in the knowledge base. Here I’ve mapped my country name column to the Country domain:

[Image: mapping the source column to the Country domain]

After clicking the Start button on the next screen, the cleansing process runs, giving the following results:

[Image: the cleansing results]

I put a couple of typos into my source data, such as changing Algeria to ‘Algerian’ and Albania to ‘Albana’. These were picked up by DQS, along with a few others, but the user has the opportunity to approve or reject each suggestion, via the radio buttons shown above. I chose to approve the first two, and then clicked Next. In the final screen, DQS allows you to output the cleansed data to a separate table, as shown below:

[Image: outputting the cleansed data to a separate table]

If you go into the table that DQS has created, you will see that there is a Name_Status column, which holds the cleansing status on a per-record basis:

[Image: the Name_Status column in the output table]

This can be used to update the source data, if required, and therefore address the data quality issues.
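As a sketch of that last step, a query along the lines of the one below would push the corrections back into the source table. I'm assuming the output table is called dbo.Country_Cleansed, and that alongside Name_Status DQS also writes the original and corrected values to Name_Source and Name_Output columns; check the actual names and status values your build generates:

    -- Hypothetical write-back of the DQS corrections to the source table.
    -- Table and column names other than Name_Status are assumptions.
    UPDATE c
    SET c.Name = dq.Name_Output
    FROM dbo.Country AS c
    INNER JOIN dbo.Country_Cleansed AS dq
        ON dq.Name_Source = c.Name
    WHERE dq.Name_Status = N'Corrected';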

Summary

This has been a quick introduction to cleansing data with Data Quality Services. There’s plenty more to look at, and I hope to find time to do so at some point. In particular, I’m keen to take a look at how DQS works with Master Data Services. I’ve noticed that there’s also a new Microsoft DQS blog (http://blogs.msdn.com/b/dqs/), which I’m sure will be a great resource for DQS learning.

SQL Server Denali CTP 1 Released

I’ve just found out that the first CTP for SQL Server codename Denali is now available for download. The link to get both the 32-bit and 64-bit versions is:

http://www.microsoft.com/downloads/en/details.aspx?FamilyID=6a04f16f-f6be-4f92-9c92-f7e5677d91f9

On a related note, Simon Sabin has recently posted here about the Tech-Ed Europe keynote that shows a glimpse of a new Silverlight reporting tool for SQL that’s part of Denali. Well worth watching the keynote video…

Master Data Services - Reversing Transactions

MDM tools give control of enterprise master data to the data stewards and power users, rather than relying on automated data integration alone.

Master Data Services is no exception. One of the ways this is true for MDS is that it allows users to inspect the transactions that have occurred (whether internal to MDS or from a source system) and choose whether to reverse them.

In order to achieve this, MDS has a useful log of all transactions that's viewable by users. Here's an example of some transactions that have occurred in my test system: some are from data that I've loaded up via the staging tables, some are from manual member additions that I've carried out in the front end, and some are from business rules that have run automatically:

[Image: the MDS transaction log]
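Incidentally, if you'd rather look at the raw log than the front-end grid, it lives in the MDS database. The sketch below assumes the mdm.tblTransaction table keeps its CTP shape; the column names may well change before release:

    -- Peek at the raw MDS transaction log (CTP schema; subject to change).
    -- mdm.tblTransaction sits behind the transaction grid in the front end.
    SELECT TOP 20 *
    FROM mdm.tblTransaction
    ORDER BY ID DESC;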

In the model that this example is taken from, I've got some business rules that look to address data quality issues. Taking the Kimball view on data quality issues in a data warehousing context, many can, and should, be addressed in the source system and then re-loaded. That isn't always possible, which is one of the reasons why we have business rules in MDS. However good any automated rule is, though, there are always exceptions.

In the transactions shown above, an automatic business rule has run that checks a Customer's overdraft limit, then sets it to 10,000 if it's over 10,000. Therefore, when a value of 50,000 was encountered for Member Code 10311, the MDS business rules kicked in and quite correctly did their job. This was not what I wanted in this particular case.

Thankfully we can click on the undo button that's shown above the grid and reverse a chosen transaction, whether it's come from a source system, a business rule or a manual edit. It doesn't seem possible to reverse multiple transactions at once, but that may just be down to the CTP. In my example, by selecting the first transaction in the list and then clicking the undo button, I've reversed my automatic business rule. Therefore, the user Kaylee Adams (10311) shown below now has her original overdraft limit:

[Image: the member with her original overdraft limit restored]

In conclusion, when some sort of manual intervention is needed to successfully manage master data, MDM tools allow that intervention to come from the power users, rather than having to wait for someone technical to address the issue.

Master Data Services - Business Rules

I've been keeping an eye on the SQL Server 2008 R2 CTPs over the past few months, but have been compelled to start blogging again following the release of Master Data Services (MDS) in the November CTP.

The idea of a Microsoft MDM tool first caught my attention with the acquisition of Stratature, and since then I've seen a few talks on the subject, such as Kirk Haselden's at the BI Conference last year.

Now that I've got my hands on it, I've decided to cover the setup of business rules in MDS. Business rules are key to an MDM solution. If we want to use MDM to load data from disparate source systems, we will definitely have to carry out a lot of cleansing and conforming in order to ensure that the end users only consume clean and accurate data.

To set the scene a bit, I've created several entities in my MDM model, namely Customer, Country and City. These could form a Customer Geography hierarchy for example, but for the moment I'm going to focus on Customer. The following shows the Customers that I've entered manually:

[Image: the manually entered Customer members]

When I add a Customer (via the button that is circled) or when I edit a customer, the third column of the grid for the relevant member will change from a tick to a question mark, indicating that data validation has not taken place.

For this example, I want the Overdraft Limit attribute to be validated as being within the normal boundaries set by the business, e.g. a bank. To do this, I'm going to set up a simple business rule.

Selecting Manage->Business Rules will take you to the Business Rule Maintenance screen, where the 'plus' icon will create a new business rule. Editing the new blank rule brings up a screen with a simple IF...THEN GUI for building the rule. On the IF part you pick conditions such as greater than or less than, alongside an all-important attribute, by dragging and dropping them in the screen below:

[Image: the business rule editing screen]

In my case I've specified that the overdraft limit can't be greater than £10,000, and that if it is, it should be set back to £10,000. This will do for now, but I could instead have prevented validation from succeeding, or started an MDM workflow. Clicking the MDS back button takes us back to the Business Rule Maintenance screen, where the rule is not active until we publish it:

[Image: the new rule awaiting publication]

Once we do publish the rule, it will kick in whenever validation runs, or when you manually run the business rules. In my grid of Customers above, I have an overdraft limit that was entered by mistake. When I validate the Customer entity, the 5555555 value for the second customer automatically reverts to £10,000, as shown below:

[Image: the overdraft limit reverted to £10,000 after validation]
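To make the rule's effect concrete, its condition and action boil down to something like the T-SQL below. This is purely illustrative of what the rule does; MDS applies it through its own rules engine during validation, not via a query like this:

    -- Illustrative only: the logical effect of the business rule.
    -- MDS evaluates this via its rules engine, not through an UPDATE.
    UPDATE Customer
    SET OverdraftLimit = 10000
    WHERE OverdraftLimit > 10000;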

This is quite a simple example of course, triggered via manual editing. The real power of these business rules will come when loading large volumes of data from source systems, with workflow on hand to prompt business users to deal with any validation issues that occur. I'll aim to cover integrating from other systems in my next post in due course....
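Until then, here's a rough sketch of what staging an attribute update might look like in this CTP. I'm assuming the mdm.tblStgMemberAttribute staging table keeps its current shape, and the model, entity, member code and attribute names below are just placeholders for my example:

    -- Hypothetical staged attribute update for the Customer entity.
    -- Column names are as per the 2008 R2 CTP staging table; after
    -- inserting, the batch is processed from Integration Management
    -- in the front end. MemberType_ID = 1 denotes a leaf member.
    INSERT INTO mdm.tblStgMemberAttribute
        (ModelName, EntityName, MemberType_ID, MemberCode, AttributeName, AttributeValue)
    VALUES
        (N'Customer Model', N'Customer', 1, N'C0001', N'Overdraft Limit', N'9500');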

BI for IT Professionals using PerformancePoint

The PerformancePoint downloads page has recently been updated to include a framework showing how BI can be provided to the IT department.

Normally the IT department would be helping to support BI solutions, not actually being the end users themselves, but it makes sense when you consider the kind of information that Operations Manager captures.

As this video explains, the end goal is to create solutions that allow effective monitoring of the IT infrastructure. An example of the kind of dashboard that can be produced is shown below:

[Image: an example IT monitoring dashboard]

There is a white paper and also a sample solution available for download to learn more.

SQL Server 2005 Cumulative Update 8 for Service Pack 2

Microsoft have just released Cumulative Update 8 for SQL Server 2005 Service Pack 2.

Amongst the changes are a few fixes for small problems in Analysis Services, such as:

  • Infinite recursion occurs in the CalculationPassValue function in SQL Server 2005 Analysis Services;
  • A data source view that connects to an Oracle database generates incorrect relationships when you develop a Microsoft SQL Server 2005 Analysis Services project by using BIDS;
  • The MeasureGroupMeasures function does not work correctly in SQL Server 2005 Analysis Services;
  • All the MDX queries that are running on an instance of SQL Server 2005 Analysis Services are canceled when you start or stop a SQL Server Profiler trace for the instance.

Details of the above fixes and many more can be found here, as well as details of how to download the update. As usual, you can only apply the hotfix if you have Service Pack 2 installed.

Details of all SQL Server 2005 builds released after Service Pack 2 can be found here.