Simon

Simon Whiteley's Blog

A Guide to Azure SQL DataWarehouse

So you've heard the hype - the Azure SQL DW is going to solve all of your problems in one fell swoop… Right? Well… maybe. The system itself is a mix of technologies designed for low concurrency analytics across huge amounts of relational data. In short, it's a cloud-scalable T-SQL-based MPP platform, with all the benefits and restrictions that performing everything in parallel brings. If your problem can be solved by performing lots of calculations over small of your data before aggregating the results into a whole, this is the technology for you.

However, before you jump right in, be aware that SQLDW is a very different beast to other SQL tools. There are specific concepts you need to be familiar with before building your system, otherwise you're not going to see the promised performance gains and will likely lose faith very quickly!

I'm in the process of documenting these concepts, plus there is a wealth of information available on the Azure SQLDW site. For the next few months, I'll be running through the blog topics below and updating the links accordingly. If there are topics you'd like me to add to the list, please get in touch!

Azure SQLDW Core Concepts:

- What is it?

- How Does Scaling Work?

- Distributions

- Polybase

- Polybase Limitations

- Polybase Design Patterns

- CTAS

- Resource Classes

- Partitioning

Designing ETL (or ELT) In Azure SQLDW

- Row counts

- Statistics

- Surrogate Keys

Performance Tuning Azure SQLDW

- Plan Exploration in SQLDW

- Data Movement Types

- Minimising Data Movement

Managing Azure SQLDW

- Backup & Restore

- Monitoring Distributions

- System Monitoring

- Job Orchestration

- Scaling and Management

- Performance Tuning Queries

Azure SQLDW Architecture

- Presentation Layers

- Data Lake Integrations

Comments (5) -

  • Nick

    6/2/2017 7:53:18 AM | Reply

    I'd be interested in seeing how you would solve the Source Control issue on AzureDW, given that it appears to be 100% manual at the moment (no SSDT Database Project support, and no 3rd Party Source control tooling).

    Manual tracking of DB changes (tables and sprocs) is just asking for trouble.

    • Simon Whiteley

      6/14/2017 11:09:30 AM | Reply

      Hi Nick - afraid there's no nice answer at the moment! I still use database projects to organise my SQL code, structuring by schema/object type etc. I then pull together deployment scripts using Powershell but, you're right, this doesn't allow for change tracking/schema drift etc - it's more a wipe & replace approach which works for greenfield new builds, rubbish for ongoing incremental deployment.

      There are a few Microsoft chaps who go a step further and deploy their code (with PDW/SQLDW specific bits commented out) to an Azure SQLDB. They perform change tracking etc against the SQLDB and so can generate change scripts, drift reports etc from that. Then, when they're ready to deploy their code, then run a bit of powershell to uncomment those SQLDW bits.

      Both approaches are clunky and flawed, but until the SSDT team integrates d-sql files into database projects there's not much else we can do. That said, haven't looked to see if Redgate have caught up yet, might be worth a bit of a dig at their SQLCompare tool to see if they have SQLDW on the roadmap?

      Thanks for reading!
      Simon

    • SimonWhiteley

      6/27/2017 9:56:04 PM | Reply

      I'm happy to say we have an update on this one Nick - it's hot off the press but there is now support within Azure SQLDW for an open source scripting tool. Essentially you can write out all objects within a given SQLDW instance to a flat file - this is progress at least!

      Once you've got that file, you can then compare it to a file scripted from another environments or use Powershell to produce a comparison file directly from SSDT. I'm working on a blog to discuss the approach, it's nowhere near full SSDT functionality, but it's at least a step in the right direction.

      Simon

      • SimonWhiteley

        6/27/2017 9:58:49 PM | Reply

        Whoops - might help if I include the link: Check out the MSSQL-Scripter release here - https://t.co/YL3dBWkeQM

      • Nick

        6/28/2017 7:00:13 AM | Reply

        Thanks Simon, that's interesting, I'll have a look at that.

        I've also had a chat with the guys at Redgate (I saw it was mentioned on the MS Feedback item for SSDT support for DW (http://bit.ly/2sPW36B) ). It's on their backlog, but given the number of users of DW, I don't imagine it's too high on the priority..

        Thanks for the link though.

Pingbacks and trackbacks (1)+

Loading