Sacha Tomey

Sacha Tomey's Blog

Power BI Streaming Datasets–An Alternative PowerShell Push Script

I attended the London Power BI Meetup last night. Guest speaker was Peter Myers On the topic of "Delivering Real-Time Power BI Dashboards With Power BI." It was a great session.

Peter showed off three mechanisms for streaming data to a real time dashboard:

  • The Power BI Rest API
  • Azure Stream Analytics
  • Streaming Datasets

We've done a fair bit at Adatis with the first two and whilst I was aware of the August 2016 feature, Streaming Datasets I'd never got round to looking at them in depth. Now, having seen them in action I wish I had - they are much quicker to set up than the other two options and require little to no development effort to get going - pretty good for demo scenarios or when you want to get something streaming pretty quickly at low cost.

You can find out more about Streaming Datasets and how to set them up here:

If you create a new Streaming Dataset using 'API' as the source, Power BI will provide you with an example PowerShell script to send a single row of data into the dataset.  To extend this, I've hacked together a PowerShell script and that loops and sends 'random' data to the dataset. If you create a Streaming Dataset that matches the schema below, the PowerShell script further below will work immediately (subject to you replacing the endpoint information). If you create a different target streaming dataset you can easily modify the PowerShell script to continually push data into that dataset too.

I’ve shared this here, mainly as a repository for me, when I need it, but hopefully to benefit others too.

Streaming Dataset Schema


Alternative PowerShell Script

Just remember to copy the Power BI end point to the relevant location in the script.

You can find the end point (or Push URL) for the Dataset by navigating to the API Info area within the Streaming Dataset management page within the Power BI Service:


# Initialise Stream
$sleepDuration = 1  #PowerBI seldom updates realtime dashboards faster than once per second.
$eventsToSend = 500 #Change this to determine how many events are part of the stream
# Initialise the Payload
$payload = @{EventDate = '' ; EventValue = 0; EventSource = ''}

# Initialise Event Sources
$eventSource = @('Source1', 'Source2', 'Source3')

# Iterate until $eventsToSend events have been sent
$index = 1 
    # Update payload
    $payload.EventDate = Get-Date -format s
    $source = Get-Random -Minimum 0 -Maximum 3 
    $payload.EventSource = $eventSource[$source]
    $value = Get-Random -Minimum 0.00 -Maximum 101.00 
    $payload.EventValue = $value

    # Send the event
    Invoke-RestMethod -Method Post -Uri "$endpoint" -Body (ConvertTo-Json @($payload))

    # Report what has been sent
    "`nEvent {0}" -f $index

    # Sleep for a second
    Start-Sleep $sleepDuration

    # Ready for the next iteration

} While ($index -le $eventsToSend)

# Finished
"`n{0} Events Sent" -f $eventsToSend

Power BI Embedded - What is it?

At the Microsoft Data Insight Summit on 22nd March 2016, Microsoft announced that Power BI had 5 million registered users.  This is a pretty impressive milestone considering its relative recency to the market and maturity.

However, Microsoft are far more ambitious than 5 million users - Microsoft won't be happy until they "bring big data analytics to a billion people."  So, how can they significantly increase their user base and market share against their competition?

One strategy is to continue broadening the offering with more features, innovations, connections in a hope to tempt more users to the platform through sheer breadth. (Arguably their strategy to date).  Another is to focus on depth and start to shore up, dare I say 'finish' some of the features, innovations and connectivity they have already released.  A third strategy, and by no means a cop-out, or poor relation, is to fall back on their tried and tested method of relying on their extensive and established community of developers, ISVs and Partners to drive adoption through extensibility, application and service development.  Power BI Embedded is at the heart of this third strategy.

Despite some frustrations, I am a fan of Power BI.  I love the journey it has taken and the direction it is going but when you start to use it in anger you do find (arguably several) areas where it falls short - these areas, I have no doubt will be addressed in time but as a Partner at the forefront that focuses on Microsoft Data Analytics at its core, we are a little bruised and battered from navigating our way around shortcomings, through pitfalls and the need to invent workarounds, constrained by the SaaS model.

One of these shortcomings and one of the most common requested features by our customers was to expose Power BI to external users such as customers and partners.  Whilst this was technically possible through various guises, licencing, costs and agreements were not clear therefore unforeseen costs and/or non-compliance always loomed.

For anyone, even remotely close to Power BI, the arrival of external access was an easy prediction. However, I will admit, I thought Microsoft would implement it differently.  As the Power BI Pro service is linked to Active Directory and SharePoint Online, I speculated that external access would be via SharePoint Online External Users.  Technically, this is a seemingly straightforward mechanism for exposing Power BI to external users but possibly complicates licencing as we are no longer limited to SharePoint content what with the mash-up capabilities Power BI affords.

Use Cases

Instead, Microsoft have delivered Power BI Embedded.  Power BI Embedded is specifically targeted at these external access use cases.  In fact, it's exclusively for external access - you are prohibited from developing applications and solutions that utilise Power BI Embedded for internal consumption.  Typical scenarios include the following:

  • Embed interactive reports (authored in Power BI Desktop) within your own customer facing applications
  • Use the Power BI Visualisation Framework to create custom visualisations thus extending Power BI
  • Provide up-to-date, real-time interactive dashboard and reporting to customers through the Power BI Embedded Service framework

Power BI Embedded is an Azure Service.  This is an important point.  It is not an additional, broadening feature of the Power BI Service as we know it - it's a distinct offering that falls under the Azure umbrella, decoupled from the Power BI Service.

Licencing and costs

This decoupling allows Microsoft to control and modify two key elements that made everything tricky for external access under the existing Power BI service.  The Authentication method is different (users do not need to be part of AD.  Your custom application utilises Application Authentication Tokens) and therefore the licencing model can be (and is) completely different.

Licencing is not on a per user per month basis. It's on a per render basis. What's a render? A render is a visual element that results in a query to the service. The point to stress here is "..that results in a query to the service".  Power BI Embedded utilises caching so if an embedded a report contains half a dozen data visualisations you will not automatically incur half a dozen renders each time the report is viewed. If the service can obtain the visualisation details from cache then you avoid incurring a render.  Following the initial render, a filter, for example, is likely to incur an additional render as that will, certainly initially, result an additional query to the service.

As a Power BI Embedded Service owner you are able to control the extent to which users can drive new queries to minimise and limit costs but it remains to be seen how much control this provides.  Accurate prediction of render cost is likely to be extremely difficult until your solution is active and monitored.

At the time of writing (April 2016), Power BI Embedded is in preview and free until 1st May 2016.  Beyond this, there are two pricing tiers.  A Freemium model, allowing up to 1,000 renders per month and a Standard model that is charged at £1.5273 per 1,000 renders.


During the preview, connectivity is limited.  There's no indication of what will become available as it moves to General Availability and beyond but as it stands there are two modes, Direct Query and Cached:

Direct Query mode supports connections to cloud services only including:

  • SQL Azure DW
  • SQL Azure
  • Spark on HD Insight

Cached mode

Datasets can be loaded into Power BI Embedded service but they become static datasets.  You cannot refresh cached data once it has been loaded into the service.  Anything that relies on either the Personal or Enterprise Gateway cannot be kept up-to-date for use with Power BI Embedded (yet?).  This rules out on premise and hybrid (for example utilising IaaS) scenarios.

There is no indication as to the maximum size of the cached model the service will support.  The assumption is this will match that of the Power BI Service, currently 250MB.


  • Report and Dashboard rendering is through an IFRAME.
  • Power BI Embedded APIs are based on REST
  • Rendering is [currently] limited to interactive reports therefore you cannot expose ad-hoc drag drop style reporting over your dataset using Power BI Embedded
  • The Power BI Embedded Service is currently only available in the South Central US Region

Power BI Embedded is a great strategy for customers, partners and of course Microsoft; it will likely drive additional adoption of the Power BI Service pushing Power BI on past the current 5 million user base. The Power BI Service will drive external access (and therefore Power BI Embedded) and Power BI Embedded will drive internal usage (The Power BI Service).  One billion users anyone ?!

Additional Resources

Power BI Embedded Overview

Get Started with Power BI Embedded

Power BI Embedded FAQ

Power BI Embedded API

Addressing the Travelling Salesman Problem with Azure ML and Power Map

I’ve recorded a short video (< 7 minutes) showing how Azure ML, Power BI and Power Map can address the Travelling Salesman Problem.

The Travelling Salesman Problem is firmly routed in academia and is computationally complex to solve with many of the solutions tackling it in complete isolation to the real world, therefore it’s computed based on straight line distance only. Here we’ve utilised actual driving distance, journey time and real world information such as traffic information to provide a slightly more pragmatic result..

The video can be found here, on YouTube.


Video : Build an Automated Multilingual Product Dimension with Power BI Desktop

I’ve recorded a short video (< 10 minutes) showing off some of the data manipulation/transformation/integration capabilities of Power BI Desktop.  In this video I take some raw product data and a set of international languages and build a Product dimension that contains translations of each of the raw products for each of the supplied languages.

The resulting product dimension build routine (using just the Power BI Desktop GUI) will automatically (following a refresh) cope with either additional products or additional languages to the raw source files.

For the translation itself I utilise the Microsoft Translate service from the Azure Marketplace. No coding required.

The video can be found here, on YouTube.


Friday Fun: GeoFlow does the Great South Run

GeoFlow was released to public preview yesterday; a new 3D visualization tool for Excel which allow users to create, navigate and interact with time-sensitive data applied to a digital map.

Back in October last year, along with 25,000 other people, my good friend and colleague Tim Kent (@TimK_Adatis) and I ran the Great South Run; a 10 mile run around the City of Portsmouth on the south coast of England.  As it happened, we both wore GPS watches and using the data collected I've created a simple GeoFlow tour of the race.

Tim is Green - I am Red - who wins...  there's only one way to find out ......

Run Race

Boot to VHD – Demo Environments and more

A few people have asked me about this recently, so I thought I’d share my approach.

Creating demo environments, particularly for the MS BI Stack, can be time consuming and a challenge, particularly when you need to take the demo with you and can’t rely on powerful, internal servers and good client internet connectivity.

A bit of history

Initially we used to have a dedicated, decently specced, Demo laptop that would be installed with all the goodies that we would need to demo. This worked until the demo needed to be in two places at once, or you needed to carry around your day-to-day laptop too.

The solution was to use the demo environment as our day to day workstation but it was massive overkill to have full blown Windows Server running SharePoint with multiple instances of SQL Server etc. and, unless you had a high spec machine, everything was a little laggy.

The next approach was to carry around a couple of laptop hard disks that you’d swap in and out depending on whether you were demoing or working. This worked well for a good while but did prevent timely demos (no screwdriver, no demo).

Then we entertained VirtualBox and Hyper-V and other virtualisation tech to run Virtual environments – this was all well and good but the primary downfall of this approach is the fact you need a really high spec machine to run both the host and the virtual environment or performance is going to be a major issue and for demos, you want performance to be good, as good as possible.

Then we discovered Boot to VHD. I’m not sure when this was first possible and I definitely believe we were late to the game but we’ve been using it for around 12 months, long enough to prove it to be a solid approach to creating and running [not only] demo environments.

Boot to VHD

The concept is easy, and “does what it says on the tin”. You create, or acquire a VHD and configure your laptop to Boot directly to the VHD.


1) The VHD can use all the host resources. Under traditional virtualisation approaches you need to split memory and/or processors which impacts performance. So on an 8GB, 2 proc laptop traditionally you would have 4GB, 1 proc for the host and 4GB, 1 proc for the virtual environment. With Boot to VHD the virtual environment can utilise the full 8GB and both processors.

2) It’s flexible. I have a chunky external HHD containing several different virtual environments for different purposes. I can backup, swap out, replace and roll-back environments in the time it takes to copy a VHD from local to external or vice-versa. You can even share demo environments with you colleagues.

3) You always have a demo environment to hand. All it takes is a reboot to load up the appropriate environment for those spontaneous demo opportunities.


1) You do need to be careful regarding disk space usage and be very disciplined to ensure you always have enough disk space available. If you are running a number of large environments there will be an element of maintenance to ensure everything always fits.

2) Without resorting to a hybrid approach, you can’t demo a distributed system working together.


So to make use of Boot to VHD, we’ll assume we already have a VHD available and ready for booting to. These can either be created manually, acquired from your internal infrastructure team, or from other third-parties.

When creating them manually I ALWAYS create “Dynamically Expanding” virtual hard disks. This way, you can actually store more VHD environments on your laptop than you would otherwise.

Although dynamically expanding disks allow you to store more environments, you will still need to ensure you have enough disk space for the disk to expand into as this will be required at the time of boot up. So, if your VHD is set to a 100GB dynamically expanding disk, (it might only be a 20GB file), but when it’s booted up, it will expand to 100GB, so you will need that space on your hard disk or the boot up will fail.

1) Copy the VHD to your laptop to a suitable location e.g. C:\VHD

2) Create a new Boot entry
Run the following at a command prompt as an Administrator:

bcdedit /copy {current} /d "My New VHD Option"

Be sure to update the label to something to help you identify the VHD – this label will appear on the boot menu when you reboot.


Note the new GUID that has been created.

3) Using the GUID created for you in the previous step and the location of the VHD, run the following three commands, one after the other

bcdedit /set {23dd42c1-f397-11e1-9602-923139648459} device vhd=[C:]\VHD\AdatisBI.vhd
bcdedit /set {23dd42c1-f397-11e1-9602-923139648459} osdevice vhd=[C:]\VHD\AdatisBI.vhd
bcdedit /set {23dd42c1-f397-11e1-9602-923139648459} detecthal on

Note the square brackets around the drive letters, these are required. If you have spaces in your path, or filename, you’ll need to wrap the path, excluding the drive letter, in quotes.e.g.

..vhd=[C:]"\VHD Path\AdatisBI.vhd"


That’s all there is to it. Reboot and you should be presented with a new Boot option and away you go.


When it doesn’t work you generally get a BSOD on boot up. To date I’ve identified two reasons for this:

1) You don’t have enough disk space for the VHD to expand (The BSOD actually does inform you of this)

2) You may need to change the SATA Mode configuration in the BIOS. Depending on how and where the VHD was created you may need to change the setting to either ATA or AHCI. If that works, you’ll have to reverse the change to reboot into your physical installation.

I’ve yet to create a portable (i.e. sharable amongst colleagues) VHD for Windows 8. I have successfully created a Windows 8 VHD but it only currently works on the laptop it was created on, this is unlike any other VHD I have created in the past. If I work out a fix, I will update this post.

Additional Information

There are a couple of extra benefits that are worth pointing out.

1) Once you’ve booted to VHD, your original, physical OS installation drive is reallocated, normally to drive D (Your VHD will assume C drive). This allows you to share files between environments, or as I do, place my SkyDrive folder on an accessible location on the original, physical drive. This allows me to have SkyDrive installed on VHDs but only have a single copy of the contents on my HDD.

2) The reverse is true too. You can attach a VHD (from the physical install, or from within another VHD) using the Disk Management tool to access, move or copy files between environments. The disk is expanded at this point so you will need enough disk space to accommodate it.

3) If disk space is a premium, you can shrink the VHD using a tool such as VHD Resizer. It doesn’t resize the physical VHD file, but allows you to reduce the size of the virtual hard disk. It also allows you to convert from Fixed to Dynamic disks and vice-versa.

4) You can remove boot entries with the following (or you can use the System Configuration tool):

bcdedit /delete {GUID}

5) I have found this approach so reliable my day-to-day Windows 7 installation is a VHD. I have not noticed any impact to performance. The only thing that I have noticed is that you cannot determine a “Windows Experience Index” when running a VHD – but I can live with that Smile

SQL Server 2012 : Columnstore Index in action

One of the new SQL Server 2012 data warehouse features is the Columnstore index. It stores data by columns instead of by rows, similar to a column-oriented DBMS like the Vertica Analytic Database and claims to increase query performance by hundreds to thousands of times.

The issue with indexes in a data warehouse environment is the number and broad range of questions that the warehouse may have to answer meaning you either have to introduce a large number of large indexes (that in many cases results in a larger set of indexes than actual data), plump for a costly spindle-rich hardware infrastructure, or you opt for a balanced hardware and software solution such as a Microsoft SQL Server 2008 R2 Fast Track Data Warehouse or a HP Business Data Warehouse Appliance where the approach is ‘index-light’ and you rely on the combination of high throughput and performance power to reduce the dependency on the traditional index.

The Columnstore index is different in that, when applied correctly, a broad range of questions can benefit from a single Columnstore index, the index is compressed (using the same Vertipaq technology that PowerPivot and Tabular based Analysis Services share) reducing the effort required on the expensive and slow disk subsystem and increasing the effort of the fast and lower cost memory/processor combination.

In order to test the claims of the Columnstore index I’ve performed some testing on a Hyper-V instance of SQL Server 2012 “Denali” CTP3 using a blown up version of the AdventureWorksDWDenali sample database. I’ve increased the FactResellerSales table from approximately 61,000 records to approximately 15.5 million records and removed all existing indexes to give me a simple, but reasonably large ‘heap’.


With a clear cache, run the following simple aggregation:

    ,SUM(SalesAmount) AS SalesAmount


Table 'FactResellerSales'. Scan count 5, logical reads 457665, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 7641 ms, elapsed time = 43718 ms


Non-Clustered Index

Before jumping straight in with a columnstore index, let’s review performance using a traditional index. I tried a variety of combinations, the fastest I could get this query to go was to simply add the following:

CREATE NONCLUSTERED INDEX [IX_SalesTerritoryKey] ON [dbo].[FactResellerSales]
   [SalesTerritoryKey] ASC
INCLUDE ([SalesAmount]) WITH


Notice I have compressed the index using page compression, this reduced the number of pages my data consumed significantly. The IO stats when I re-ran the same query (on a clear cache) looked like this:

Table 'FactResellerSales'. Scan count 5, logical reads 26928, physical reads 0, read-ahead reads 26816, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 6170 ms, elapsed time = 5201 ms.


Much better! Approximately 6% of the original logical reads were required, resulting in a query response time of just over 5 seconds. Remember though, this new index will really only answer this specific question. If we change the query, performance is likely to fall off the cliff and revert back to the table scan.

Incidentally, adopting an index-light ([no index]) approach and simply compressing (and reloading to remove fragmentation) the underlying table itself, performance was only nominally slower than the indexed table with the added advantage of being able to perform for a large number of different queries. (Effectively speeding up the table scan. Partitioning the table can help with this approach too.)

Columnstore Index

Okay, time to bring out the columnstore. The recommendation is to add all columns into the columnstore index (Columnstore indexes do not support ‘include’ columns), practically there may be a few cases where you do exclude some columns. Meta data, or system columns that are unlikely to be used in true analysis are good candidates to leave out of the columnstore. However, in this instance, I am including all columns:


Now when I run the query on a clear cache:

Table 'FactResellerSales_V2'. Scan count 4, logical reads 2207, physical reads 18, read-ahead reads 3988, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 235 ms, elapsed time = 327 ms.


I think the figures speak for themselves ! Sub-second response and because all columns are part of the index, a broad range of questions can be satisfied by this single index.


The traditional (compressed) non-clustered index takes up around 208 MB whereas the Columnstore Index comes in a little less at 194 MB so speed and storage efficiency, further compounded when you take into account the potential additional indexes the warehouse may require.

So, the downsides? Columnstore indexes render the table read-only. In order to to update the table you either need to drop and re-create the index or employ a partition switching approach. The other notable disadvantage, consistently witnessed during my tests, is the columnstore index takes longer to build. The traditional non-clustered index took approximately 21 seconds to build whereas the columnstore took approximately 1 minute 49 seconds. Remember though, you only need one columnstore index to satisfy many queries so that’s potentially not a fair comparison.


If you don’t notice a huge difference between a table scan and a Columnstore Index Scan, check the Actual Execution Mode of the Columnstore Index Scan. This should be set to Batch, not Row.



If the Actual Execution Mode is reporting Row then your query cannot run in parallel:

- Ensure, if running via Hyper-V, you have assigned more than one processor to the image.
- Ensure the Server Property ‘Max Degee of Parallelism’ is not set to 1.


In summary, for warehousing workloads, a columnstore index is a great addition to the database engine with significant performance improvements even on reasonably small data sets. It will re-define the ‘index-light’ approach that the SQL Server Fast Track Data Warehouse methodology champions and help simplify warehouse based performance tuning activities. Will it work in every scenario? I very much doubt it, but it’s a good place to start until we get to experience it live in the field.

SQL Server 2012 Licensing

Today saw the announcement of how SQL Server 2012 will be carved up and licensed, and it's changed quite a bit. There are three key changes:

1) There's a new Business Intelligence Edition that sits between Standard and Enterprise
2) No more processor licensing. There's a move to Core based licensing instead (with a minimum cost of 4 cores per server)
3) Enterprise is only available on the Core licensing model (Unless upgrading through Software Assurance *)

Enterprise, as you would expect, has all the functionality SQL Server 2012 has to offer.

The Business Intelligence edition strips away
- Advanced Security (Advanced auditing, transparent data encryption)
Data Warehousing (ColumnStore, compression, partitioning)
and provides a cut-down, basic (as opposed to advanced) level of High Availability (AlwaysOn).

In addition, the Standard Edition removes
- Enterprise data management (Data Quality Services, Master Data Services),
- Self-Service Business Intelligence (Power View, PowerPivot for SPS)
- Corporate Business Intelligence (Semantic model, advanced analytics)

If you are utilising 4 core processors, licence costs for Standard ($1,793 per core, or $898 per Server + $209 per CAL) and Enterprise ($6,874 per core) remain similar (ish).  However, you will be stung if you have more cores. The Business Intelligence edition is only available via a Server + CAL licence model and it's apparent that Microsoft are placing a big bet on MDS/DQS, Power View, PowerPivot for SharePoint and BISM as the licence for the Business Intelligence edition is $8,592 per server, plus $209 per CAL, that's nearly 10x more per server than Standard Edition !

For the complete low-down check out these links:

Editions Overview:

Licensing Overview:

Licence Detail (including costs):

* If you are currently running Enterprise as a Server + CAL and you upgrade to SQL 2012 through Software Assurance, you can keep Server + CAL model, providing you don’t exceed 20 cores.

Business Data Warehouse Appliance

Microsoft and HP announced the release of their latest Data Warehousing focused appliance last week, the Business Data Warehouse Appliance (BDW):

Not to be confused with the BDA (the Business Decision Appliance aka the "PowerPivot Appliance") this latest appliance is targeted at data warehouse workloads and follows the Fast Track Data Warehouse principals.

This is great move, and I think they have the sizing just right;  Approx. 5TB-8TB of compressed user data (depending on your achieved compression ratio) will cater for a decent proportion of the warehouses and data marts in operation today.

The announcement appears to focus on the fast deployment (they claim it can be installed and configured in around 10 minutes), and that's pretty impressive but I'd like to know what other appliance specific value they have added to the overall package.  After all, installation and configuration of the appliance is just the tip of the iceberg.

I've been lucky enough to be involved in a Fast Track Data Warehouse implementation so I have a couple of ideas (some of which we've implemented) that I'd like to see baked in to a Data Warehouse Appliance offering:
  • Operational Reporting (Some of which are already achievable through the SQL Server Management Data Warehouse)
    • Disk and File group usage reports - Help with on-going capacity planning by detailing growth and trends over time for the disk as a whole and file groups associated with each database.
    • Fragmentation Reports - [The methodology surrounding Fast Track has a high emphasis on avoiding and managing fragmentation]  Reports that detail levels of fragmentation at both the physical and logical level would potentially pre-empt any fragmentation related issues.  A kind of combination of WinDirStat and Internals Viewer would be a great graphical representation of the fragmentation at both those levels.
  • Database Administration/Developer Accelerators
    • Database/object Creation functions - I want to be able to create a database and or a file group of a specific size and let the appliance do the work of creating the physical files on each of the mount points for me.  PDW (SQL Server Parallel Data Warehouse Edition) already does something like this that's baked into the product.
    • Simplified partition management - For example, I'm likely to want to age my data over time and merge smaller partitions into larger partitions (certainly until the maximum number of partitions limit is raised in 'Denali'), or I might want to remove all the data from a specific partition in preparation for a reload.  Make it easy and handle all the swapping out, multi merging etc for me.
    • Fragmentation Management - For example, I might want to select 1 or more tables and have them completely rebuilt to remove any extent fragmentation.
    • Resize management - My file group is approaching full, I need a bigger one, I want a function to perform that resize in a 'Fast Track approved' manner.

    There's a whole host of other 'value-adds' e.g. a 'Best Practice Analyser' that could be included as part of a Data Warehouse appliance, and it will certainly be interesting to see how the appliance develops following adoption over the next few appliance updates/revisions.

Microsoft Tech-Ed 2010 BI Conference, New Orleans Day 2 (Tuesday 8th June 2010)

Day 2 and the BI Keynote.

Announcements? Only two, although actually, old news:

- They announced the availability of the MS BI Indexing Connector. Originally announced back in May

- They got their story straight(er) with regard to the release of what will be called Pivot Viewer Extensions for Reporting Services. It will be available in 30 days.

The session took more of a “look where we’ve come since the Seattle BI Conference” and, as Ted Kummert described, it’s Microsoft’s BI [School] Report Card.

Interesting change in semantics for their BI strap line; no longer do they spout “BI for the Masses”, now it’s “BI for Everyone”. Although they admitted they, along with the rest of the industry are falling well short at only a current average of 20% ‘reach’.

With the recent delivery of SQL Server 2008 R2, Sharepoint 2010 and Office 2010 the BI Integration story is significantly more complete.

A large focus on PowerPivot and how it has helped customer quickly deliver fast, available reporting ‘applications’. Although I know a few people that would object to simply describing DAX purely as a familiar extension to the Excel formula engine.

Following the look back, a brief look forward:

- Cloud Computing will pay a part, Reporting and Analytics will be coming, when combined with Windows AppFabric, described yesterday this is a closer reality.

- Consumerisation enhancements, with better search and improved social media integration BI will move towards becoming a utility.

- Compliance: Several plans; Improved Data Quality, Data Cleaning and Machine Learning and strong meta data strategy support to deliver lineage and provide change impact analysis.

- Data Volumes. SQL Server Parallel Data Warehouse Edition has completed CTP2, this will open up high performance datawarehousing to data volumes that exceed 100TB. Dallas, the data marketplace will be better integrated to development and reporting tools.

Then tempted us with some previews of what *could* make a future version of SQL Server. Essentially, the theme for the future is to join the dots between Self Service BI and the Enterprise BI Platform and focussed on plans around PowerPivot:

- KPI creation

Essentially they are exposing (yet another) way to create (SSAS based) KPI’s through a neat, slider based GUI directly from within the PowerPivot Client.

- Wide Table Support

To help with cumbersome wide PowerPivot tables, they have introduced a ‘Record View’ to help see all the fields on one screen, all appropriately grouped with edit/add/delete support for new fields, calculations etc.

- Multi Developer Support

They plan to integrate the PowerPivot client into BIDS. This will facilitate integration with Visual SourceSafe for controlled multi developer support, they also plan to provide a lineage visualisation to help with audit and impact change analysis.

- Data Volumes

Following on from the BIDS integration, plans surrounding deployment to server based versions of SSAS to allow increased performance for higher data volumes. They replayed the demo of the 2m row data set from Seattle where we first saw almost instant sort and filtering, but this time applied it (with equally impressive performance) to a data set of more than 2bn records!  It was described by Amir Netz as “The engine of the devil!” ;)