This is the third post in a series on Azure Data Factory Custom Activity Development.
Are We There Yet?
Azure Data Factory is, as we all know, essentially an execution platform in the cloud for orchestrating data processing. This means, however, that when it comes to working locally in your development environment, there isn’t much you can do in the way of actually running your code. That means no debugging. Developing Custom Activities without an environment in which to debug them is pretty painful. You package up your Activity, copy it to Azure Blob Storage, reschedule your Pipeline Activity data slice and wait with fingers crossed to see whether your Activity is doing what you were convinced it should have done the last time you went through this process. And then nope, still something amiss. Pick through the log messages to try and decipher where it went belly up and why. Add some more output statements somewhere near where you think the issue is and fire it off into the ADF stratosphere for yet another test flight. As I’m sure you know, this can very quickly become not only frustrating, if your bug is one that is not so easy to find, but also very time-consuming.
What if, instead of this self-flagellating loop of despair, you could simulate all that Azure Data Factory runtime stuff right here on your dev machine, where it really matters? Well, it turns out I wasn’t just pointing out the pain points because I want us to revisit all those flashbacks of treacle-wading development experiences. There is hope, and there are two readily available solutions to this debugging slogathon.
The first approach to allowing that most welcome ability to step through your misbehaving Custom Activity code is to use a base class for your activity, deriving your Custom Activity from this. The base class also includes a number of helper methods such as GetExtendedProperty() and GetDataSet() that make inspecting the Activity easier. You can find this project, the ADFCustomActivityRunner, on GitHub here.
I have to say I’m not a fan of including additional code in your deployment for the purpose of debugging, and I didn’t have much joy getting this to work, at least in the short time I spent with it, so I looked for another solution. I realise I’m not doing justice to the hard work put into the project here, so please do let me know how you get on with it if you choose to use this.
I did find a preferable (at least in my view) alternative for our Data Factory debugging requirement. This is the ADFLocalEnvironment project on GitHub, available here. It also creates an execution environment for our Activity, but does not require any base class from which we need to derive our Custom Activities. For me this is a cleaner solution, with no inherent dependency within our deployed code on another code base. The project also has code to export to ARM Templates, which looks very useful, although surplus to my pressing desire to be able to F11 through the Custom Activity code.
Having downloaded the source code and built the assembly, there are essentially two things we need to do to set up our debugging environment:
1: Create ADF Pipeline Project
In order to be able to run our Custom Activity within the provided local environment, we need to wrap it in an Azure Data Factory Pipeline. Unless you have some complex Input Dataset requirements as a precursor to your debugging activities, this Pipeline can be as simple as creating some dummy datasets to serve as Input and Output, and having the Custom Activity in question sitting in the pipeline all on its lonesome. No need for anything else, as it won’t be used. This is purely a harness for including our Custom Activity.
2: Create an Executable Project
To be able to run our code and step into it from within Visual Studio, we need to create a small harness project from which we can call this local environment to run our Custom Activity Pipeline. This can be a console app, VS/NUnit Test project or whatever you want to use. All you need is to be able to call the ADFLocalEnvironment code, which will handle running your Pipeline Activity. So we add our harness project, add the reference to the ADFLocalEnvironment, and then go about writing the code to execute the Pipeline.
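// Relative path to the ADF project that contains the skeleton pipeline (held as a field on the harness class).
private string adfProjectPath = @"..\..\..\CustomActivities.DataFactory\CustomActivities.DataFactory.dfproj";
// Create the local environment for that project; "MyConfig" names the optional ADF config file.
ADFLocalEnvironment env = new ADFLocalEnvironment(adfProjectPath, "MyConfig");
// Run the "HiveDataValidation" activity in the "PL_HiveDataValidationTest" pipeline for the given start and end times.
env.ExecuteActivity("PL_HiveDataValidationTest", "HiveDataValidation", DateTime.Now.AddSeconds(2), DateTime.Now.AddDays(1));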
So we set the relative path to the project containing our skeleton pipeline, create a new Local Environment pointing to this (optionally naming an ADF config file) and then simulate the execution of the required Custom Activity within the skeleton pipeline we created earlier. We can then set breakpoints within our Custom Activity, allowing runtime inspection, and all of a sudden there are small fluffy bunnies playing on hillocks and a beautiful sunrise, who knows, maybe even a tax rebate waiting on the doormat, and the world is indeed a wonderful place once more.
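If the local environment complains that it cannot find the project, it may help to check where that relative path actually lands before going any further. The snippet below is only a minimal sketch using standard .NET calls. The class and method names are made up for illustration and are not part of ADFLocalEnvironment, and it assumes the relative path is resolved against the current working directory, which may not match how the project resolves it internally.

```csharp
using System;
using System.IO;

public static class ProjectPathCheck
{
    // Hypothetical helper (not part of ADFLocalEnvironment): combines a base
    // folder with a relative path and returns the normalised absolute path.
    public static string Resolve(string baseDirectory, string relativePath)
    {
        return Path.GetFullPath(Path.Combine(baseDirectory, relativePath));
    }

    public static void Main()
    {
        // Same relative path as in the harness code above.
        string relative = @"..\..\..\CustomActivities.DataFactory\CustomActivities.DataFactory.dfproj";

        // Assumption: the path is resolved against the current working directory.
        string full = Resolve(Environment.CurrentDirectory, relative);

        // Report whether a file exists at the resolved location.
        Console.WriteLine(full + (File.Exists(full) ? " (found)" : " (NOT found)"));
    }
}
```

Running this from the harness's output folder prints the absolute path the relative one resolves to, along with whether a file exists there.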
If you are using the above ADFLocalEnvironment within a VS Test project, there is a small change that you will need to make to the source code. I have raised this as an issue on GitHub here, so hopefully this will make it into the code base soon. I’ll detail it below briefly so that you can make the change yourself. In the ADFLocalEnvironment.cs file, at line 329, you will need to change how the debugger build path is determined, due to the slightly different value returned from AppDomain.CurrentDomain.BaseDirectory when running with VS Test. Here’s the change:
debuggerBuildPath = string.Join("\\", AppDomain.CurrentDomain.BaseDirectory.GetTokens('\\', 0, AppDomain.CurrentDomain.BaseDirectory.EndsWith("\\") ? 3 : 2, true));
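The 3-versus-2 switch in that line looks like it compensates for a trailing backslash, which adds an extra empty token when the path is split on '\'. The sketch below shows that effect using only standard .NET. It does not reproduce the project's GetTokens extension method; the helper name and the example paths are invented for illustration.

```csharp
using System;
using System.Linq;

public static class BuildPathSketch
{
    // Hypothetical helper (not part of ADFLocalEnvironment): drops the last
    // `levels` folders from a Windows-style path, ignoring a trailing '\'.
    public static string DropTrailingFolders(string path, int levels)
    {
        // Splitting "C:\a\b\" on '\' yields a final empty token. Removing
        // empty entries makes both forms of the path split identically.
        string[] parts = path.Split(new[] { '\\' }, StringSplitOptions.RemoveEmptyEntries);
        int keep = Math.Max(parts.Length - levels, 0);
        return string.Join("\\", parts.Take(keep));
    }

    public static void Main()
    {
        // Made-up example paths, with and without the trailing backslash.
        Console.WriteLine(DropTrailingFolders(@"C:\src\Harness\bin\Debug", 2));
        Console.WriteLine(DropTrailingFolders(@"C:\src\Harness\bin\Debug\", 2));
    }
}
```

Both calls give the same folder because the empty token is discarded up front, rather than being counted as one more token to drop. Note that this simple version would also strip the leading backslashes from a UNC path, so treat it as an illustration only.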
That should be it and you’re all good to go bug hunting.
I would like to thank the above two GitHub project contributors for all their hard work in making ADF Custom Activities a place where we no longer need to fear to tread. If I was American I would probably say something like “You guys are awesome”, but I’m not, and as you don’t actually fill me with awe it would be wrong for me to do so, but thanks all the same for some amazing development that has at least made my work a whole lot easier. Whoop whoop.
The next post in the series will look at that most important part of the development cycle. Part 4: Testing ADF Pipelines