With the rise of self-service business intelligence tools, like Power BI, and an increased engagement with data in the workplace, people’s expectations of where they can find expert information about data has changed. Where previously there would an expert that people would have to book time with in order to understand data, now people expect to get quick and detailed information about the data assets that an enterprise holds and maintains without going through a single contact. With Azure Data Catalog, data consumers can quickly discover data assets and gain knowledge about the data from documentation, tags and glossary terms from the subject matter experts. This post aims to give a brief introduction to Azure Data Catalog and what it can broadly be used for.
What is Azure Data Catalog?
Azure Data Catalog is a fully managed Azure service which is an enterprise-wide metadata catalogue that enables data discovery. With Azure Data Catalog, you register; discover; annotate; and, for some sources, connect to data assets. Azure Data Catalog is designed to manage disparate information about data; to make it easy to find data assets, understand them, and connect to them. Any user (analyst, data scientist, or developer) can discover, understand, and consume data sources. Azure Data Catalog is a one-stop central shop for all users to contribute their knowledge and build a community and culture of data.
What can Azure Data Catalog be used for?
As mentioned in the earlier headings, Azure Data Catalog can be used for data asset management; data governance; and data discovery. For data asset management, this means knowing what data is available and where; for data governance teams, this means answering questions like: where is my customer data? or what does this data model look like?; for data discovery, this means knowing which data is suitable for particular reports and who you can go to if you have any questions. There are some common scenarios for using Azure Data Catalog that Microsoft has put together, and it’s well worth reading to get a fuller understanding of what Azure Data Catalog can be used for.
Operations Management Suite (OMS) is an Azure based tool that helps manage your entire IT infrastructure, whether on premise or in the cloud. OMS allows you to monitor the machines you have in your infrastructure and provides a bridge for a hybrid cloud solution, by moving multi-tier workloads into Azure, or run tests on a copy of production workloads in Azure, as well as storing critical data in Azure.
Configuring and deploying OMS is very quick and relatively straightforward. This post will deal with creating a hybrid cloud solution using OMS and configuring the solution so you can make best use of resources.
Within the Azure management portal you will have to create a new Operational Insights workspace as detailed below:
Once the Operational Insights has been created, you can navigate to it in the Azure Portal and click “Manage” which will bring up the OMS itself. The start page should look like this:
And you want to click on the “Get Started” button to begin creating your hybrid cloud solution. You then want to add various solutions to the suite. For this demonstration we’re going to accept the default and have all the solutions.
The next step is to connect a data source, which can be an on premise machine, a virtual machine or an Azure data storage. For this demonstration we are only interested in connecting to an on premise machine and a VM.