Zach Stagers

A quick look at U-SQL Database Projects

U-SQL Database Projects were released today (see here for the announcement from Microsoft), and they promise easier development and deployment. I’ve blogged in the past about using PowerShell to deploy U-SQL scripts, and I’ve worked with the older “U-SQL Project” project type quite a bit, so I wanted to try the new project type as quickly as I could to see what it has to offer.

Once you’ve updated to the latest version of ADL Tools, you’ll be able to select the new project type:



The project contains a number of database object templates, allowing you to get off the mark quickly with development. It also supports all of the usual Visual Studio organization with folders, and IntelliSense in the code editor.

Object templates available:


Here’s the template of a Procedure Script item:


Source Control

The project integrates much more nicely with TFS than the older “U-SQL Project” does.

It gives you the icons (padlock, check mark, etc.) in the Solution Explorer, so it actually looks like it’s under source control!

Something that I’d really hoped had been fixed, but hasn’t: when you copy and rename an existing item, the rename isn’t recognized. You have to undo the checkout of the non-existent object (the copy, before being renamed):


Something that has been fixed is that with the older project style, you wouldn’t be able to edit the “Procedure2.usql” item from my previous example until you’d checked it in – now you can!


Where you’re deploying to is controlled by the Cloud Explorer in Visual Studio and which Data Lakes you have access to. By right-clicking the project and selecting Deploy, you are presented with a simple dialogue box where you select the Data Lake Analytics account to deploy to, making it nice and easy to deploy to development vs UAT, etc.


The Database Name specified in this textbox is the name of the database your project will be deployed to. Unfortunately, it is just a free form textbox, so be careful of any typos!

Deploying Multiple U-SQL Procedures with PowerShell

If you’d like to deploy multiple U-SQL procedures without having to open each one in Visual Studio and submit the job to Data Lake Analytics manually, here’s a PowerShell script which you can point at a folder containing your .usql files; it loops through them and submits each one for you.

This method relies on the Login-AzureRmAccount command with a service principal, which you can learn more about here.

The Script

# Service principal details and deployment target
$azureAccountName = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
$azurePassword = ConvertTo-SecureString "xxxxxxxxxxxxxxxxxxxx/xxxxxxxxxxxxxxx/xxxxxx=" -AsPlainText -Force
$psCred = New-Object System.Management.Automation.PSCredential($azureAccountName, $azurePassword)
$psTenantID = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
$adla = "zachstagersadla"
$fileLocation = "C:\SourceControl\USQL.Project\*.usql"

# Log in as the service principal
Login-AzureRmAccount -ServicePrincipal -TenantID $psTenantID -Credential $psCred | Out-Null

# Submit each .usql file found as a separate Data Lake Analytics job
ForEach ($file in Get-ChildItem -Path $fileLocation)
{
    $scriptContents = [IO.File]::ReadAllText($file.FullName)
    Submit-AzureRmDataLakeAnalyticsJob `
        -AccountName $adla `
        -Name $file.Name `
        -Script $scriptContents | Out-Null
    Write-Host "`n" $file.Name "submitted."
}

Parameter Configuration

Where these various GUIDs can be found within the Azure portal is liable to change, but I have provided a path to each of them that was correct at the time of writing.

$azureAccountName – This is the Application Id of your Enterprise Application, which can be found by navigating to Azure Active Directory > Enterprise Applications > All Applications > Selecting your application > Properties > Application Id.

$azurePassword – This is the secret key of your Enterprise Application, which would have been generated during application registration. If you’ve created your application but have not generated a secret key, you can do so by navigating to: Enterprise Applications > New Application > Application You’re Developing > OK, take me to App Registration > Change the drop down from ‘My Apps’ to ‘All Apps’ (may not be required) > Select your application > Settings > Keys > Fill in the details and click save. Note the important message about the key only being available to copy before you navigate away from the page!

$psTenantID – From within the Azure Portal, go to Azure Active Directory, open the Properties blade, and copy the ‘Directory ID’.

$adla – The name of the data lake analytics resource you’re deploying to.

$fileLocation – The file location on your local machine which contains the U-SQL scripts you wish to deploy. Note the “\*.usql” on the end in the example; this is a wildcard search for all files ending in .usql.
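
Before submitting anything, it can be worth checking which files the wildcard actually picks up. The following is a minimal dry-run sketch; the function name Get-UsqlScriptNames is my own invention for illustration and is not part of the script above.

```powershell
# Hypothetical dry-run helper: lists the names of the files that a
# wildcard such as "C:\SourceControl\USQL.Project\*.usql" resolves to.
# Nothing is submitted to Data Lake Analytics.
function Get-UsqlScriptNames {
    param(
        [Parameter(Mandatory)] [string] $FileLocation
    )

    Get-ChildItem -Path $FileLocation -File |
        Sort-Object Name |
        ForEach-Object { $_.Name }
}
```

Calling Get-UsqlScriptNames -FileLocation $fileLocation with the same value as the main script should list exactly the scripts that the loop would go on to submit.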


The service principal needs to be given the following permissions to successfully execute against the lake:

· Owner of the Data Lake Analytics resource you’re deploying to. This is configured via the Access Control blade.

· Owner of the Data Lake Store resource associated to the Analytics resource you’re deploying to. This is configured via the Access Control blade.

· Read, Write, and Execute permissions against the sub-folders within the Data Lake Store. Ensure you select ‘This folder and all children’ and ‘An access permission entry and a default permission entry’. This is configured by entering the Data Lake Store, selecting Data Explorer, then Access.

Bulk updating SSIS package protection level with DTUTIL

I was recently working with an SSIS project with well over 100 packages. It had been developed using the ‘Encrypt Sensitive with User Key’ package and project protection level, project deployment model, and utilizing components from the Azure Feature Pack.

The project was to be deployed to a remote server within another domain using a different user to the one which developed the packages, which caused an error with the ‘Encrypt Sensitive with User Key’ setting upon deployment.

As a side note, regarding package protection level and the Azure Feature Pack, ‘Don’t Save Sensitive’ cannot be used as you’ll receive an error stating: “Error when copying file from Azure Data Lake Store to local drive using SSIS. [Azure Data Lake Store File System Task] Error: There is an internal error to refresh token. Value cannot be null. Parameter name: input”. There seems to be an issue with the components not accepting values from an environment variable if ‘Don’t Save Sensitive’ is used.

With the project deployment model, the project and all packages within it need to have the same protection level, but there’s no easy way from within Visual Studio to apply the update to all packages, and I didn’t want to have to open 100 packages individually to change the setting manually! SQL and DTUTIL to the rescue…

Step One

Check the entire project and all packages out of source control, making sure you have the latest version. Close Visual Studio.

Step Two

Change the protection level at the project level by right-clicking the project and selecting Properties. A dialogue box should open; on the Project tab you should see a Security section, under which you can change the protection level.

Step Three

Using SQL against the SSISDB (assuming the project is deployed locally), you can easily produce a list of all the packages you need to update within a DTUTIL string to change the encryption setting:


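-- Folder on disk that holds the .dtsx files
DECLARE @FolderPath VARCHAR(1000) = 'C:\SourceControl\FolderX'
-- Command template; XXX is swapped for each package name below
DECLARE @DtutilString VARCHAR(1000) = 
	'dtutil.exe /file "'+ @FolderPath +'\XXX" /encrypt file;"'+ @FolderPath +'\XXX";2;Password1! /q'

SELECT REPLACE(@DtutilString, 'XXX', [name])
FROM internal.packages
WHERE project_id = 2 -- change to the ID of your own project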

This query will produce a string like the one below for each package in your project:

dtutil.exe /file "C:\SourceControl\FolderX\Package1.dtsx" /encrypt file;"C:\SourceControl\FolderX\Package1.dtsx";2;Password1! /q

This executes DTUTIL for the file specified, encrypting the package using ‘Encrypt sensitive with password’ (2) and the password Password1! in this case. The /q tells the command to execute “quietly”, meaning you won’t be prompted with an “Are you sure?” message each time. More information about DTUTIL can be found here.

If the project isn’t deployed, the same can be achieved through PowerShell against the folder the packages live in.
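
As a rough sketch of that PowerShell route, the function below builds the same DTUTIL string for every .dtsx file in a folder. The function name Get-DtutilCommands and its parameters are my own choices for illustration, and the default protection level of 2 simply mirrors the SQL example above.

```powershell
# Hypothetical helper: emits one dtutil command per .dtsx file in a folder,
# in the same format as the strings generated by the SQL query.
function Get-DtutilCommands {
    param(
        [Parameter(Mandatory)] [string] $FolderPath,
        [Parameter(Mandatory)] [string] $Password,
        [int] $ProtectionLevel = 2   # 2 matches the SQL example
    )

    Get-ChildItem -Path $FolderPath -Filter '*.dtsx' -File |
        Sort-Object Name |
        ForEach-Object {
            # {0} = full package path, {1} = protection level, {2} = password
            'dtutil.exe /file "{0}" /encrypt file;"{0}";{1};{2} /q' -f $_.FullName, $ProtectionLevel, $Password
        }
}
```

Piping the output to a file, for example Get-DtutilCommands -FolderPath 'C:\SourceControl\FolderX' -Password 'Password1!' | Set-Content -Path 'UpdateProtectionLevel.bat' (the batch file name is just a placeholder), gives you the batch file needed for the next step.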

Step Four

Copy the list of commands generated by Step Three, open Notepad, and paste. Save the file as a batch file (.bat).

Run the batch file as an administrator (right click the file, select run as admin).

A command prompt window should open, and you should see it streaming through your packages with a success message for each.

Step Five

The encryption setting for each package is also stored in the project file, and needs to be changed here too.

Find the project file for the project you’re working with (.dtproj), right-click it and select Open with, then Notepad or your preferred text editor.

Within the SSIS:PackageMetaData node for each package, there’s a nested <SSIS:Property SSIS:Name="ProtectionLevel"> element containing the integer value for its corresponding protection level.

Run a find and replace for this full string, swapping only the integer value to the desired protection level. In this example we’re going from ‘Don’t Save Sensitive’ (0) to ‘Encrypt sensitive with password’ (2):


Save and close the project file.
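
The find and replace in this step can also be scripted. The following is a minimal sketch rather than a definitive implementation: the function name Set-DtprojProtectionLevel is hypothetical, and the exact element text is assumed from the description above, so check it against your own .dtproj (and keep a copy of the file) before running it.

```powershell
# Hypothetical helper: swaps the ProtectionLevel integer for every package
# entry in a .dtproj file using a literal, case-sensitive text replace.
# The element text is an assumption; confirm it matches your project file.
function Set-DtprojProtectionLevel {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [int] $OldLevel,
        [Parameter(Mandatory)] [int] $NewLevel
    )

    $template = '<SSIS:Property SSIS:Name="ProtectionLevel">{0}</SSIS:Property>'
    $old = $template -f $OldLevel
    $new = $template -f $NewLevel

    # Resolve to a full path first, because the .NET file methods do not
    # follow the current PowerShell location for relative paths.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath

    $text = [IO.File]::ReadAllText($fullPath)
    [IO.File]::WriteAllText($fullPath, $text.Replace($old, $new))
}
```

For the example in this step, the call would look like Set-DtprojProtectionLevel -Path 'C:\SourceControl\FolderX\MyProject.dtproj' -OldLevel 0 -NewLevel 2, where MyProject.dtproj is a placeholder for your own project file. Only the full element string is matched, so other properties that happen to contain the same integer are left alone.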

Step Six

Re-open Visual Studio and your project, then spot check a few packages to ensure the correct protection level is now selected. Build your project, and check back in to source control once satisfied.

SSIS Azure Data Lake Store Destination Mapping Problem

The Problem

My current project involves Azure’s Data Lake Store and Analytics. We’re using the SSIS Azure Feature Pack’s Azure Data Lake Store Destination to move data from our client’s on-premises system into the Lake, then using U-SQL to generate a delta file which goes on to be loaded into the warehouse. U-SQL is a “schema-on-read” language, which means you need a consistent and predictable format to be able to define the schema as you pull data out.

We ran into an issue with this schema-on-read approach, but once you understand the issue, it’s simple to rectify. The Data Lake Store Destination task does not use the same column ordering which is shown in the destination mapping. Instead, it appears to rely on an underlying column identifier. This means that if you apply any conversions to a column in the data flow, this column will automatically be placed at the end of the file, taking away the predictability of the file format and potentially making your schema inconsistent if you have historic data in the Lake.

An Example

Create a simple package which pulls data from a flat file and moves it into the Lake.


Mappings of the Destination are as follows:


Running the package, and viewing the file in the Lake gives us the following (as we’d expect, based on the mappings):


Now add a conversion task (the one in my package just converts Col2 to DT_I4), update the mappings in the destination, and run the package.


Open the file up in the Lake again, and you’ll find that Col2 is now at the end and contains the name of the input column, not the destination column:


The Fix

As mentioned in “The Problem” section, the fix is extremely simple: just handle it in your U-SQL by re-ordering the columns appropriately during extraction! This article is more about giving a heads up and highlighting the problem than offering a mind-blowing solution.