Adatis

Adatis BI Blogs

Databricks UDF Performance Comparisons

I've recently been spending quite a bit of time on the Azure Databricks platform, and while learning it I decided to experiment with some common data warehousing tasks in the form of data cleansing. Because Databricks gives us a platform on which to run Spark, it lets us use cross-language APIs to write code in Scala, Python, R, and SQL within the same notebook. As with most things in life, not everything is equal, and there are potential differences in performance between them. In this blog, I will explain the tests I produced with the aim of outlining best practice for Databricks implementations of UDFs of this nature.

Scala is the native language for Spark and, without going into too much detail here, it compiles directly to JVM bytecode for processing. Python, on the other hand, provides a wrapper around the code: under the hood, the Python program tells the JVM-based Spark engine on the cluster what to do, and the data is transformed by Scala code. Converting these objects into a form Python can read is called serialisation / deserialisation, and it's expensive, especially over time and across a distributed dataset. The most expensive scenario occurs with UDFs (user-defined functions), the runtime process for which can be seen below. The overhead here is in (4) and (5), to read the data and write it into JVM memory.

Using Scala to create the UDFs, the execution process can skip these steps and keep everything native. Scala UDFs operate within the JVM of the executor, so we can skip serialisation and deserialisation.

Experiments

As data for this task I took a list of company names from a data set and then ran them through a process to codify them, essentially stripping out the characters that make them unique and converting them to upper case, thus grouping a set of companies together under the same name. For instance Adatis, Adatis Ltd, and Adatis (Ltd) would become ADATIS. This is an example of a typical cleansing activity when working with data sets.

The dataset in question was around 2.5GB and contained 10.5m rows. The cluster I used was Databricks runtime 4.2 (Spark 2.3.1 / Scala 2.11) with Standard_DS2_v2 VMs for the driver/worker nodes (14GB memory) with autoscaling disabled and limited to 2 workers. I disabled autoscaling because I was seeing wildly inconsistent timings on each run, which impacted the tests. The good news is that with it enabled and using up to 8 workers, the timings were about 20% faster, albeit more erratic from a standard deviation point of view.

The following approaches were tested:

- Scala program calls Scala UDF via Function
- Scala program calls Scala UDF via SQL
- Python program calls Scala UDF via SQL
- Python program calls Python UDF via Function
- Python program calls Python Vectorised UDF via Function
- Python program uses SQL

While it was true in previous versions of Spark that there was a performance difference between Scala and Python for these approaches, in the latest version of Spark (2.3) it is believed to be more of a level playing field thanks to Apache Arrow, in the form of Vectorised Pandas UDFs within Python.

As part of the tests I also wanted to use Python to call a Scala UDF via a function, but unfortunately we cannot do this without creating a Jar file of the Scala code and importing it separately. This would be done via SBT (a build tool) using the guide here. I considered this too much of an overhead for the purposes of the experiment.

The following code was then used as part of a Databricks notebook to define the tests. A custom function to time the write was required for Scala, whereas Python allows us to use %timeit for a similar purpose.
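The custom Scala timing function (called `timeIt` in the code that follows) is not shown in the post. As a rough illustration of what such a helper does, here is a minimal Python analogue. The name `time_it`, its signature and the stand-in workload are assumptions made for this sketch, not the author's implementation.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def time_it(action: Callable[[], T]) -> Tuple[T, float]:
    """Run `action` once and return (result, elapsed_seconds).

    Hypothetical helper: it mirrors the idea of the post's Scala timeIt,
    whose real definition is not shown.
    """
    start = time.perf_counter()
    result = action()
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in workload; in a notebook the lambda would wrap the DataFrame write.
_, seconds = time_it(lambda: sum(range(1_000_000)))
print(f"Elapsed: {seconds:.3f}s")
```

Wrapping the whole write in a single call matters because Spark transformations are lazy: nothing is executed until an action such as the write is triggered, so that is the step worth timing.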
Scala program calls Scala UDF via Function

    %scala
    // Scala program calls Scala UDF via Function
    import org.apache.spark.sql.functions.udf

    // Upper-case the name, then strip spaces, punctuation and company suffixes
    def codifyScalaUdf = udf((string: String) => string.toUpperCase.replace(" ", "").replace("#","").replace(";","").replace("&","").replace(" AND ","").replace(" THE ","").replace("LTD","").replace("LIMITED","").replace("PLC","").replace(".","").replace(",","").replace("[","").replace("]","").replace("LLP","").replace("INC","").replace("CORP",""))

    // Register the UDF so it can also be called from SQL
    spark.udf.register("ScalaUdf", codifyScalaUdf)

    val transformedScalaDf = table("DataTable").select(codifyScalaUdf($"CompanyName").alias("CompanyName"))
    val ssfTime = timeIt(transformedScalaDf.write.mode("overwrite").format("parquet").saveAsTable("SSF"))

Scala program calls Scala UDF via SQL

    %scala
    // Scala program calls Scala UDF via SQL
    val sss = spark.sql("SELECT ScalaUdf(CompanyName) as a from DataTable where CompanyName is not null")
    val sssTime = timeIt(sss.write.mode("overwrite").format("parquet").saveAsTable("SSS"))

Python program calls Scala UDF via SQL

    # Python program calls Scala UDF via SQL
    pss = spark.sql("SELECT ScalaUdf(CompanyName) as a from DataTable where CompanyName is not null")
    %timeit -r 1 pss.write.format("parquet").saveAsTable("PSS", mode='overwrite')

Python program calls Python UDF via Function

    # Python program calls Python UDF via Function
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import StringType

    # Row-at-a-time UDF: each value is serialised to a Python worker and back
    @udf(StringType())
    def pythonCodifyUDF(string):
        return (string.upper().replace(" ", "").replace("#","").replace(";","").replace("&","").replace(" AND ","").replace(" THE ","").replace("LTD","").replace("LIMITED","").replace("PLC","").replace(".","").replace(",","").replace("[","").replace("]","").replace("LLP","").replace("INC","").replace("CORP",""))

    pyDF = df.select(pythonCodifyUDF(col("CompanyName")).alias("CompanyName")).filter(col("CompanyName").isNotNull())
    %timeit -r 1 pyDF.write.format("parquet").saveAsTable("PPF", mode='overwrite')

Python program calls Python Vectorised UDF via Function

    # Python program calls Python Vectorised UDF via Function
    from pyspark.sql.types import StringType
    from pyspark.sql.functions import pandas_udf, col

    # Vectorised UDF: receives a whole pandas Series per batch via Apache Arrow.
    # Note: Series.replace matches whole values rather than substrings;
    # substring removal would need string.str.replace(...) instead.
    @pandas_udf(returnType=StringType())
    def pythonCodifyVecUDF(string):
        return (string.replace(" ", "").replace("#","").replace(";","").replace("&","").replace(" AND ","").replace(" THE ","").replace("LTD","").replace("LIMITED","").replace("PLC","").replace(".","").replace(",","").replace("[","").replace("]","").replace("LLP","").replace("INC","").replace("CORP","")).str.upper()

    pyVecDF = df.select(pythonCodifyVecUDF(col("CompanyName")).alias("CompanyName")).filter(col("CompanyName").isNotNull())
    %timeit -r 1 pyVecDF.write.format("parquet").saveAsTable("PVF", mode='overwrite')

Python program uses SQL

    # Python program uses SQL
    sql = spark.sql("SELECT upper(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(CompanyName,' ',''),'&',''),';',''),'#',''),' AND ',''),' THE ',''),'LTD',''),'LIMITED',''),'PLC',''),'.',''),',',''),'[',''),']',''),'LLP',''),'INC',''),'CORP','')) as a from DataTable where CompanyName is not null")
    %timeit -r 1 sql.write.format("parquet").saveAsTable("SQL", mode='overwrite')

Results and Observations

It was interesting to note the following:

- The hypothesis above does indeed hold true: the two methods that were expected to be slowest were the slowest in the experiment, and by a considerable margin.
- The Scala UDF performs consistently regardless of the method used to call it.
- The Python vectorised UDF now performs on par with the Scala UDFs, and there is a clear difference between the vectorised and non-vectorised Python UDFs.
- The standard deviation for the vectorised UDF was surprisingly low, and the method performed consistently on each run. The non-vectorised Python UDF was the opposite.

To summarise, moving forward: as long as you write your UDFs in Scala or use the vectorised version of the Python UDF, the performance will be similar for this type of activity. It's worth noting that you should definitely avoid writing UDFs as standard Python functions, given the theory and results above. Over time, across a complete solution and with more data, this time would add up.
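To sanity-check the cleansing logic itself without a cluster, the replacement chain from the Scala and non-vectorised Python UDFs above can be reproduced in plain Python. This is only a sketch: the names `TOKENS` and `codify` are made up for illustration, and the token order is copied from the UDFs as shown.

```python
# Replacement tokens, in the same order as the UDFs in this post.
TOKENS = [" ", "#", ";", "&", " AND ", " THE ", "LTD", "LIMITED", "PLC",
          ".", ",", "[", "]", "LLP", "INC", "CORP"]


def codify(name: str) -> str:
    """Upper-case the name, then remove each token in turn (hypothetical helper)."""
    result = name.upper()
    for token in TOKENS:
        result = result.replace(token, "")
    return result


for raw in ["Adatis", "Adatis Ltd", "Adatis (Ltd)", "Smith and Jones Limited"]:
    print(repr(raw), "->", repr(codify(raw)))
```

Running this shows two things about the chain as written. Because spaces are removed first, the later " AND " and " THE " replacements can never match. Parentheses are also not in the token list, so "Adatis (Ltd)" comes out as ADATIS() rather than ADATIS. Neither affects the timing comparison, but both are worth adjusting before reusing the function for real cleansing.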

Embedding Databricks Notebooks into a BlogEngine.NET post

Databricks is a buzzword now. This means that each day more and more related content appears on the net. With Databricks Notebooks it's so easy to share code. If by any chance you need to share a notebook directly in your blog post, here are some short guidelines on how to do so. If your favourite blog engine happens to be BlogEngine.NET, it's not such a straightforward task. Fear not, the following are the steps you need to take:

1. Export your notebook to HTML from the Databricks portal.
2. Make the following text replacements in the exported HTML: replace & with &amp; and " with &quot;
3. Paste the following code in the blog's source, replacing both [[HEIGHT]] placeholders with the notebook's height in pixels (you may need a little trial and error to get the exact value so that the vertical scrollbars disappear) and the [[NOTEBOOK_HTML]] placeholder with the resulting HTML from the previous step:

    <iframe height="[[HEIGHT]]px" frameborder="0" scrolling="no" width="100%" style="width: 100%; height: [[HEIGHT]]px;" sandbox="allow-forms allow-pointer-lock allow-popups allow-presentation allow-same-origin allow-scripts allow-top-navigation" srcdoc="[[NOTEBOOK_HTML]]"></iframe>

4. To make everything compatible with Internet Explorer and Edge, place the wonderful srcdoc-polyfill script between <script></script> tags at the very end of your blog script.

Here is an example notebook with the above steps applied: Blog2 - Databricks
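The replacement and templating steps above can also be scripted rather than done by hand. The following is a minimal sketch, assuming the exported notebook HTML is already available as a string. The function name `build_embed` and the sample input are made up for illustration, while the iframe markup is the one given in the steps.

```python
IFRAME_TEMPLATE = (
    '<iframe height="[[HEIGHT]]px" frameborder="0" scrolling="no" width="100%" '
    'style="width: 100%; height: [[HEIGHT]]px;" '
    'sandbox="allow-forms allow-pointer-lock allow-popups allow-presentation '
    'allow-same-origin allow-scripts allow-top-navigation" '
    'srcdoc="[[NOTEBOOK_HTML]]"></iframe>'
)


def build_embed(notebook_html: str, height_px: int) -> str:
    """Escape the exported HTML and drop it into the iframe template."""
    # Escape & first, so the ampersands introduced by &quot; are not re-escaped.
    escaped = notebook_html.replace("&", "&amp;").replace('"', "&quot;")
    # Fill the height first, so any "[[HEIGHT]]" text inside the notebook is untouched.
    return (IFRAME_TEMPLATE
            .replace("[[HEIGHT]]", str(height_px))
            .replace("[[NOTEBOOK_HTML]]", escaped))


# Tiny stand-in for an exported notebook; a real export is a full HTML page.
print(build_embed('<p class="note">Scala & Python</p>', 600))
```

The order of the two replacements matters: escaping the quotes first and the ampersands second would turn each &quot; into &amp;quot; and break the embedded page.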
vCPUs","local_disk_type":"AHCI","max_attachable_disks":64,"supported_disk_types":[{"azure_disk_volume_type":"STANDARD_LRS"},{"azure_disk_volume_type":"PREMIUM_LRS"}],"reserved_memory_mb":4800},"memory_mb":131072,"is_hidden":false,"category":"Storage Optimized","num_cores":16.0,"is_io_cache_enabled":false,"support_port_forwarding":true,"support_ebs_volumes":true,"is_deprecated":false},{"display_order":0,"support_ssh":true,"num_gpus":0,"spark_heap_memory":199583,"instance_type_id":"Standard_L32s","node_type_id":"Standard_L32s","description":"Standard_L32s","support_cluster_tags":true,"container_memory_mb":249479,"node_info":{"status":["NotEnabledOnSubscription"],"available_core_quota":350},"node_instance_type":{"instance_type_id":"Standard_L32s","provider":"Azure","local_disk_size_gb":5630,"supports_accelerated_networking":false,"compute_units":32.0,"number_of_ips":8,"local_disks":1,"reserved_compute_units":1.0,"gpus":0,"memory_mb":262144,"num_cores":32,"cpu_quota_type":"Standard LS Family vCPUs","local_disk_type":"AHCI","max_attachable_disks":64,"supported_disk_types":[{"azure_disk_volume_type":"STANDARD_LRS"},{"azure_disk_volume_type":"PREMIUM_LRS"}],"reserved_memory_mb":4800},"memory_mb":262144,"is_hidden":false,"category":"Storage Optimized","num_cores":32.0,"is_io_cache_enabled":false,"support_port_forwarding":true,"support_ebs_volumes":true,"is_deprecated":false},{"display_order":0,"support_ssh":true,"num_gpus":0,"spark_heap_memory":2516,"instance_type_id":"Standard_F4s","node_type_id":"Standard_F4s","description":"Standard_F4s","support_cluster_tags":true,"container_memory_mb":3146,"node_info":{"status":["NotEnabledOnSubscription"],"available_core_quota":350},"node_instance_type":{"instance_type_id":"Standard_F4s","provider":"Azure","local_disk_size_gb":16,"supports_accelerated_networking":true,"compute_units":4.0,"number_of_ips":4,"local_disks":1,"reserved_compute_units":1.0,"gpus":0,"memory_mb":8192,"num_cores":4,"cpu_quota_type":"Standard FS Family 
vCPUs","local_disk_type":"AHCI","max_attachable_disks":16,"supported_disk_types":[{"azure_disk_volume_type":"STANDARD_LRS"},{"azure_disk_volume_type":"PREMIUM_LRS"}],"reserved_memory_mb":4800},"memory_mb":8192,"is_hidden":false,"category":"Compute Optimized","num_cores":4.0,"is_io_cache_enabled":false,"support_port_forwarding":true,"support_ebs_volumes":true,"is_deprecated":false},{"display_order":0,"support_ssh":true,"num_gpus":0,"spark_heap_memory":8873,"instance_type_id":"Standard_F8s","node_type_id":"Standard_F8s","description":"Standard_F8s","support_cluster_tags":true,"container_memory_mb":11092,"node_info":{"status":["NotEnabledOnSubscription"],"available_core_quota":350},"node_instance_type":{"instance_type_id":"Standard_F8s","provider":"Azure","local_disk_size_gb":32,"supports_accelerated_networking":true,"compute_units":8.0,"number_of_ips":8,"local_disks":1,"reserved_compute_units":1.0,"gpus":0,"memory_mb":16384,"num_cores":8,"cpu_quota_type":"Standard FS Family vCPUs","local_disk_type":"AHCI","max_attachable_disks":32,"supported_disk_types":[{"azure_disk_volume_type":"STANDARD_LRS"},{"azure_disk_volume_type":"PREMIUM_LRS"}],"reserved_memory_mb":4800},"memory_mb":16384,"is_hidden":false,"category":"Compute Optimized","num_cores":8.0,"is_io_cache_enabled":false,"support_port_forwarding":true,"support_ebs_volumes":true,"is_deprecated":false},{"display_order":0,"support_ssh":true,"num_gpus":0,"spark_heap_memory":21587,"instance_type_id":"Standard_F16s","node_type_id":"Standard_F16s","description":"Standard_F16s","support_cluster_tags":true,"container_memory_mb":26984,"node_info":{"status":["NotEnabledOnSubscription"],"available_core_quota":350},"node_instance_type":{"instance_type_id":"Standard_F16s","provider":"Azure","local_disk_size_gb":64,"supports_accelerated_networking":true,"compute_units":16.0,"number_of_ips":16,"local_disks":1,"reserved_compute_units":1.0,"gpus":0,"memory_mb":32768,"num_cores":16,"cpu_quota_type":"Standard FS Family 
vCPUs","local_disk_type":"AHCI","max_attachable_disks":64,"supported_disk_types":[{"azure_disk_volume_type":"STANDARD_LRS"},{"azure_disk_volume_type":"PREMIUM_LRS"}],"reserved_memory_mb":4800},"memory_mb":32768,"is_hidden":false,"category":"Compute Optimized","num_cores":16.0,"is_io_cache_enabled":false,"support_port_forwarding":true,"support_ebs_volumes":true,"is_deprecated":false},{"display_order":0,"support_ssh":true,"num_gpus":0,"spark_heap_memory":85157,"instance_type_id":"Standard_H16","node_type_id":"Standard_H16","description":"Standard_H16","support_cluster_tags":true,"container_memory_mb":106447,"node_info":{"status":["NotEnabledOnSubscription"],"available_core_quota":8},"node_instance_type":{"instance_type_id":"Standard_H16","provider":"Azure","local_disk_size_gb":2000,"supports_accelerated_networking":false,"compute_units":16.0,"number_of_ips":4,"local_disks":1,"reserved_compute_units":1.0,"gpus":0,"memory_mb":114688,"num_cores":16,"cpu_quota_type":"Standard H Family vCPUs","local_disk_type":"AHCI","max_attachable_disks":64,"supported_disk_types":[{"azure_disk_volume_type":"STANDARD_LRS"}],"reserved_memory_mb":4800},"memory_mb":114688,"is_hidden":false,"category":"Compute Optimized","num_cores":16.0,"is_io_cache_enabled":false,"support_port_forwarding":true,"support_ebs_volumes":true,"is_deprecated":false},{"display_order":0,"support_ssh":true,"num_gpus":2,"spark_heap_memory":85157,"instance_type_id":"Standard_NC12","node_type_id":"Standard_NC12","description":"Standard_NC12 (beta)","support_cluster_tags":true,"container_memory_mb":106447,"node_info":{"status":["NotEnabledOnSubscription"],"available_core_quota":48},"node_instance_type":{"instance_type_id":"Standard_NC12","provider":"Azure","local_disk_size_gb":680,"supports_accelerated_networking":false,"compute_units":12.0,"number_of_ips":2,"local_disks":1,"reserved_compute_units":1.0,"gpus":2,"memory_mb":114688,"num_cores":12,"cpu_quota_type":"Standard NC Family 
vCPUs","local_disk_type":"AHCI","max_attachable_disks":48,"supported_disk_types":[{"azure_disk_volume_type":"STANDARD_LRS"}],"reserved_memory_mb":4800},"memory_mb":114688,"is_hidden":false,"category":"GPU Accelerated","num_cores":12.0,"is_io_cache_enabled":false,"support_port_forwarding":true,"support_ebs_volumes":true,"is_deprecated":false},{"display_order":0,"support_ssh":true,"num_gpus":4,"spark_heap_memory":174155,"instance_type_id":"Standard_NC24","node_type_id":"Standard_NC24","description":"Standard_NC24 (beta)","support_cluster_tags":true,"container_memory_mb":217694,"node_info":{"status":["NotEnabledOnSubscription"],"available_core_quota":48},"node_instance_type":{"instance_type_id":"Standard_NC24","provider":"Azure","local_disk_size_gb":1440,"supports_accelerated_networking":false,"compute_units":24.0,"number_of_ips":4,"local_disks":1,"reserved_compute_units":1.0,"gpus":4,"memory_mb":229376,"num_cores":24,"cpu_quota_type":"Standard NC Family vCPUs","local_disk_type":"AHCI","max_attachable_disks":64,"supported_disk_types":[{"azure_disk_volume_type":"STANDARD_LRS"}],"reserved_memory_mb":4800},"memory_mb":229376,"is_hidden":false,"category":"GPU Accelerated","num_cores":24.0,"is_io_cache_enabled":false,"support_port_forwarding":true,"support_ebs_volumes":true,"is_deprecated":false},{"display_order":0,"support_ssh":true,"num_gpus":1,"spark_heap_memory":85157,"instance_type_id":"Standard_NC6s_v3","node_type_id":"Standard_NC6s_v3","description":"Standard_NC6s_v3 (beta)","support_cluster_tags":true,"container_memory_mb":106447,"node_info":{"status":["NotEnabledOnSubscription"],"available_core_quota":0},"node_instance_type":{"instance_type_id":"Standard_NC6s_v3","provider":"Azure","local_disk_size_gb":736,"supports_accelerated_networking":false,"compute_units":6.0,"number_of_ips":4,"local_disks":1,"reserved_compute_units":1.0,"gpus":1,"memory_mb":114688,"num_cores":6,"cpu_quota_type":"Standard NCSv3 Family 
vCPUs","local_disk_type":"AHCI","max_attachable_disks":12,"supported_disk_types":[{"azure_disk_volume_type":"PREMIUM_LRS"}],"reserved_memory_mb":4800},"memory_mb":114688,"is_hidden":false,"category":"GPU Accelerated","num_cores":6.0,"is_io_cache_enabled":false,"support_port_forwarding":true,"support_ebs_volumes":true,"is_deprecated":false},{"display_order":0,"support_ssh":true,"num_gpus":2,"spark_heap_memory":174155,"instance_type_id":"Standard_NC12s_v3","node_type_id":"Standard_NC12s_v3","description":"Standard_NC12s_v3 (beta)","support_cluster_tags":true,"container_memory_mb":217694,"node_info":{"status":["NotEnabledOnSubscription"],"available_core_quota":0},"node_instance_type":{"instance_type_id":"Standard_NC12s_v3","provider":"Azure","local_disk_size_gb":1474,"supports_accelerated_networking":false,"compute_units":12.0,"number_of_ips":8,"local_disks":1,"reserved_compute_units":1.0,"gpus":2,"memory_mb":229376,"num_cores":12,"cpu_quota_type":"Standard NCSv3 Family vCPUs","local_disk_type":"AHCI","max_attachable_disks":24,"supported_disk_types":[{"azure_disk_volume_type":"PREMIUM_LRS"}],"reserved_memory_mb":4800},"memory_mb":229376,"is_hidden":false,"category":"GPU Accelerated","num_cores":12.0,"is_io_cache_enabled":false,"support_port_forwarding":true,"support_ebs_volumes":true,"is_deprecated":false},{"display_order":0,"support_ssh":true,"num_gpus":4,"spark_heap_memory":352151,"instance_type_id":"Standard_NC24s_v3","node_type_id":"Standard_NC24s_v3","description":"Standard_NC24s_v3 (beta)","support_cluster_tags":true,"container_memory_mb":440189,"node_info":{"status":["NotEnabledOnSubscription"],"available_core_quota":0},"node_instance_type":{"instance_type_id":"Standard_NC24s_v3","provider":"Azure","local_disk_size_gb":2948,"supports_accelerated_networking":false,"compute_units":24.0,"number_of_ips":8,"local_disks":1,"reserved_compute_units":1.0,"gpus":4,"memory_mb":458752,"num_cores":24,"cpu_quota_type":"Standard NCSv3 Family 
vCPUs","local_disk_type":"AHCI","max_attachable_disks":32,"supported_disk_types":[{"azure_disk_volume_type":"PREMIUM_LRS"}],"reserved_memory_mb":4800},"memory_mb":458752,"is_hidden":false,"category":"GPU Accelerated","num_cores":24.0,"is_io_cache_enabled":false,"support_port_forwarding":true,"support_ebs_volumes":true,"is_deprecated":false}],"default_node_type_id":"Standard_DS3_v2"},"enableDatabaseSupportClusterChoice":true,"enableClusterAcls":true,"notebookRevisionVisibilityHorizon":0,"serverlessClusterProductName":"Serverless Pool","showS3TableImportOption":false,"redirectBrowserOnWorkspaceSelection":true,"maxEbsVolumesPerInstance":10,"enableRStudioUI":false,"isAdmin":true,"deltaProcessingBatchSize":1000,"timerUpdateQueueLength":100,"sqlAclsEnabledMap":{"spark.databricks.acl.enabled":"true","spark.databricks.acl.sqlOnly":"true"},"enableLargeResultDownload":true,"maxElasticDiskCapacityGB":5000,"serverlessDefaultMinWorkers":2,"zoneInfos":[],"enableCustomSpotPricingUIByTier":true,"serverlessClustersEnabled":true,"enableWorkspaceBrowserSorting":true,"enableSentryLogging":false,"enableFindAndReplace":true,"disallowUrlImportExceptFromDocs":false,"defaultStandardClusterModel":{"cluster_name":"","node_type_id":"Standard_DS3_v2","spark_version":"4.0.x-scala2.11","num_workers":null,"autoscale":{"min_workers":2,"max_workers":8},"autotermination_minutes":120,"enable_elastic_disk":false,"default_tags":{"Vendor":"Databricks","Creator":"ivv@adatis.co.uk","ClusterName":null,"ClusterId":""}},"enableEBSVolumesUIForJobs":true,"enablePublishNotebooks":false,"enableBitbucketCloud":true,"shouldShowCommandStatus":true,"createTableInNotebookS3Link":{"url":"https://docs.azuredatabricks.net/_static/notebooks/data-import/s3.html","displayName":"S3","workspaceFileName":"S3 
Example"},"sanitizeHtmlResult":true,"enableClusterPinningUI":true,"enableJobAclsConfig":false,"enableFullTextSearch":false,"enableElasticSparkUI":true,"enableNewClustersCreate":true,"clusters":true,"allowRunOnPendingClusters":true,"useAutoscalingByDefault":true,"enableAzureToolbar":true,"enableRequireClusterSettingsUI":true,"fileStoreBase":"FileStore","enableEmailInAzure":true,"enableRLibraries":true,"enableTableAclsConfig":false,"enableSshKeyUIInJobs":true,"enableDetachAndAttachSubMenu":true,"configurableSparkOptionsSpec":[{"keyPattern":"spark\\.kryo(\\.[^\\.]+)+","valuePattern":".*","keyPatternDisplay":"spark.kryo.*","valuePatternDisplay":"*","description":"Configuration options for Kryo serialization"},{"keyPattern":"spark\\.io\\.compression\\.codec","valuePattern":"(lzf|snappy|org\\.apache\\.spark\\.io\\.LZFCompressionCodec|org\\.apache\\.spark\\.io\\.SnappyCompressionCodec)","keyPatternDisplay":"spark.io.compression.codec","valuePatternDisplay":"snappy|lzf","description":"The codec used to compress internal data such as RDD partitions, broadcast variables and shuffle outputs."},{"keyPattern":"spark\\.serializer","valuePattern":"(org\\.apache\\.spark\\.serializer\\.JavaSerializer|org\\.apache\\.spark\\.serializer\\.KryoSerializer)","keyPatternDisplay":"spark.serializer","valuePatternDisplay":"org.apache.spark.serializer.JavaSerializer|org.apache.spark.serializer.KryoSerializer","description":"Class to use for serializing objects that will be sent over the network or need to be cached in serialized form."},{"keyPattern":"spark\\.rdd\\.compress","valuePattern":"(true|false)","keyPatternDisplay":"spark.rdd.compress","valuePatternDisplay":"true|false","description":"Whether to compress serialized RDD partitions (e.g. for StorageLevel.MEMORY_ONLY_SER). 
Can save substantial space at the cost of some extra CPU time."},{"keyPattern":"spark\\.speculation","valuePattern":"(true|false)","keyPatternDisplay":"spark.speculation","valuePatternDisplay":"true|false","description":"Whether to use speculation (recommended off for streaming)"},{"keyPattern":"spark\\.es(\\.[^\\.]+)+","valuePattern":".*","keyPatternDisplay":"spark.es.*","valuePatternDisplay":"*","description":"Configuration options for ElasticSearch"},{"keyPattern":"es(\\.([^\\.]+))+","valuePattern":".*","keyPatternDisplay":"es.*","valuePatternDisplay":"*","description":"Configuration options for ElasticSearch"},{"keyPattern":"spark\\.(storage|shuffle)\\.memoryFraction","valuePattern":"0?\\.0*([1-9])([0-9])*","keyPatternDisplay":"spark.(storage|shuffle).memoryFraction","valuePatternDisplay":"(0.0,1.0)","description":"Fraction of Java heap to use for Spark's shuffle or storage"},{"keyPattern":"spark\\.streaming\\.backpressure\\.enabled","valuePattern":"(true|false)","keyPatternDisplay":"spark.streaming.backpressure.enabled","valuePatternDisplay":"true|false","description":"Enables or disables Spark Streaming's internal backpressure mechanism (since 1.5). This enables the Spark Streaming to control the receiving rate based on the current batch scheduling delays and processing times so that the system receives only as fast as the system can process. Internally, this dynamically sets the maximum receiving rate of receivers. This rate is upper bounded by the values `spark.streaming.receiver.maxRate` and `spark.streaming.kafka.maxRatePerPartition` if they are set."},{"keyPattern":"spark\\.streaming\\.receiver\\.maxRate","valuePattern":"^([0-9]{1,})$","keyPatternDisplay":"spark.streaming.receiver.maxRate","valuePatternDisplay":"numeric","description":"Maximum rate (number of records per second) at which each receiver will receive data. Effectively, each stream will consume at most this number of records per second. 
Setting this configuration to 0 or a negative number will put no limit on the rate. See the deployment guide in the Spark Streaming programing guide for mode details."},{"keyPattern":"spark\\.streaming\\.kafka\\.maxRatePerPartition","valuePattern":"^([0-9]{1,})$","keyPatternDisplay":"spark.streaming.kafka.maxRatePerPartition","valuePatternDisplay":"numeric","description":"Maximum rate (number of records per second) at which data will be read from each Kafka partition when using the Kafka direct stream API introduced in Spark 1.3. See the Kafka Integration guide for more details."},{"keyPattern":"spark\\.streaming\\.kafka\\.maxRetries","valuePattern":"^([0-9]{1,})$","keyPatternDisplay":"spark.streaming.kafka.maxRetries","valuePatternDisplay":"numeric","description":"Maximum number of consecutive retries the driver will make in order to find the latest offsets on the leader of each partition (a default value of 1 means that the driver will make a maximum of 2 attempts). Only applies to the Kafka direct stream API introduced in Spark 1.3."},{"keyPattern":"spark\\.streaming\\.ui\\.retainedBatches","valuePattern":"^([0-9]{1,})$","keyPatternDisplay":"spark.streaming.ui.retainedBatches","valuePatternDisplay":"numeric","description":"How many batches the Spark Streaming UI and status APIs remember before garbage collecting."}],"enableReactNotebookComments":true,"enableAdminPasswordReset":false,"checkBeforeAddingAadUser":true,"enableResetPassword":true,"maxClusterTagValueLength":256,"enableJobsSparkUpgrade":true,"createTableInNotebookDBFSLink":{"url":"https://docs.azuredatabricks.net/_static/notebooks/data-import/dbfs.html","displayName":"DBFS","workspaceFileName":"DBFS Example"},"perClusterAutoterminationEnabled":true,"enableNotebookCommandNumbers":true,"measureRoundTripTimes":true,"allowStyleInSanitizedHtml":false,"sparkVersions":[{"key":"3.3.x-scala2.10","displayName":"3.3 (includes Apache Spark 2.2.0, Scala 
2.10)","packageLabel":"spark-image-86a9b375074f5afad339e70230ec0ec265c4cefbd280844785fab3bcde5869f9","upgradable":true,"deprecated":true,"customerVisible":false,"capabilities":[]},{"key":"4.1.x-scala2.11","displayName":"4.1 (includes Apache Spark 2.3.0, Scala 2.11)","packageLabel":"spark-image-f439d0c801506f0362ffed2e4541799ffeb433963a5b449822b3878a3751bcf8","upgradable":true,"deprecated":false,"customerVisible":true,"capabilities":["SUPPORTS_END_TO_END_ENCRYPTION","SUPPORTS_TABLE_ACLS","SUPPORTS_RSTUDIO"]},{"key":"4.1.x-ml-gpu-scala2.11","displayName":"4.1 ML Beta (includes Apache Spark 2.3.0, GPU, Scala 2.11)","packageLabel":"spark-image-5907529b625e97ac8feb0a069002b4fdb861a16740752b5df568fe4efb1c004e","upgradable":true,"deprecated":false,"customerVisible":true,"capabilities":[]},{"key":"4.0.x-scala2.11","displayName":"4.0 (includes Apache Spark 2.3.0, Scala 2.11)","packageLabel":"spark-image-f55a02a6e4d4df4eae4ac60d16781794f2061a3e86e0fc2c9697f69fd8211a22","upgradable":true,"deprecated":false,"customerVisible":true,"capabilities":["SUPPORTS_END_TO_END_ENCRYPTION","SUPPORTS_TABLE_ACLS"]},{"key":"3.4.x-scala2.11","displayName":"3.4 (includes Apache Spark 2.2.0, Scala 2.11)","packageLabel":"spark-image-ae9f3e3a4d6c99f1d9704425a13db5561c44c9ec0b9a45121cc9c7e8036e4051","upgradable":true,"deprecated":false,"customerVisible":true,"capabilities":["SUPPORTS_END_TO_END_ENCRYPTION"]},{"key":"3.2.x-scala2.10","displayName":"3.2 (includes Apache Spark 2.2.0, Scala 2.10)","packageLabel":"spark-image-557788bea0eea16bbf7a8ba13ace07e64dd7fc86270bd5cea086097fe886431f","upgradable":true,"deprecated":true,"customerVisible":false,"capabilities":[]},{"key":"4.1.x-ml-scala2.11","displayName":"4.1 ML Beta (includes Apache Spark 2.3.0, Scala 2.11)","packageLabel":"spark-image-ad599fbbca53898d7531a4b94c73f3e68b3c2e49e3502c09f6bf01468d801882","upgradable":true,"deprecated":false,"customerVisible":true,"capabilities":[]},{"key":"latest-experimental-scala2.10","displayName":"[DO NOT USE] 
Latest experimental (3.5 snapshot, Scala 2.10)","packageLabel":"spark-image-5e4f1f2feb631875a6036dffb069ec14b436939b5efe0ecb3ff8220c835298d6","upgradable":true,"deprecated":true,"customerVisible":false,"capabilities":["SUPPORTS_END_TO_END_ENCRYPTION","SUPPORTS_TABLE_ACLS"]},{"key":"latest-rc-scala2.11","displayName":"Latest RC (4.2 snapshot, Scala 2.11)","packageLabel":"spark-image-0592d44fb91741b05c314304a91661216c672f0424e342fe8a2a0b2c794d3cd0","upgradable":true,"deprecated":false,"customerVisible":false,"capabilities":["SUPPORTS_END_TO_END_ENCRYPTION","SUPPORTS_TABLE_ACLS","SUPPORTS_RSTUDIO"]},{"key":"latest-stable-scala2.11","displayName":"Latest stable (Scala 2.11)","packageLabel":"spark-image-f439d0c801506f0362ffed2e4541799ffeb433963a5b449822b3878a3751bcf8","upgradable":true,"deprecated":false,"customerVisible":false,"capabilities":["SUPPORTS_END_TO_END_ENCRYPTION","SUPPORTS_TABLE_ACLS","SUPPORTS_RSTUDIO"]},{"key":"4.1.x-gpu-scala2.11","displayName":"4.1 (includes Apache Spark 2.3.0, GPU, Scala 2.11)","packageLabel":"spark-image-aea6646f13f50000babc2af707cc31c43b11e68e7f452eb25c7dfd73ad24d705","upgradable":true,"deprecated":false,"customerVisible":true,"capabilities":["SUPPORTS_RSTUDIO"]},{"key":"3.5.x-scala2.10","displayName":"3.5 LTS (includes Apache Spark 2.2.1, Scala 2.10)","packageLabel":"spark-image-9fa5a2faf500af1478f5d9e67ccbde7f9342a5c91a3b6260c974bc15e922928d","upgradable":true,"deprecated":false,"customerVisible":true,"capabilities":["SUPPORTS_END_TO_END_ENCRYPTION","SUPPORTS_TABLE_ACLS"]},{"key":"latest-rc-scala2.10","displayName":"[DO NOT USE] Latest RC (3.5 snapshot, Scala 2.10)","packageLabel":"spark-image-5e4f1f2feb631875a6036dffb069ec14b436939b5efe0ecb3ff8220c835298d6","upgradable":true,"deprecated":true,"customerVisible":false,"capabilities":["SUPPORTS_END_TO_END_ENCRYPTION","SUPPORTS_TABLE_ACLS"]},{"key":"latest-stable-scala2.10","displayName":"[DEPRECATED] Latest stable (Scala 
2.10)","packageLabel":"spark-image-5e4f1f2feb631875a6036dffb069ec14b436939b5efe0ecb3ff8220c835298d6","upgradable":true,"deprecated":true,"customerVisible":false,"capabilities":["SUPPORTS_END_TO_END_ENCRYPTION","SUPPORTS_TABLE_ACLS"]},{"key":"latest-experimental-gpu-scala2.11","displayName":"Latest experimental (4.2 snapshot, GPU, Scala 2.11)","packageLabel":"spark-image-dfabb78470bd1abaaf209f0f6397e0da699a115933f76bea620aa664f8667e7b","upgradable":true,"deprecated":false,"customerVisible":false,"capabilities":["SUPPORTS_RSTUDIO"]},{"key":"3.1.x-scala2.11","displayName":"3.1 (includes Apache Spark 2.2.0, Scala 2.11)","packageLabel":"spark-image-241fa8b78ee6343242b1756b18076270894385ff40a81172a6fb5eadf66155d3","upgradable":true,"deprecated":true,"customerVisible":false,"capabilities":[]},{"key":"3.1.x-scala2.10","displayName":"3.1 (includes Apache Spark 2.2.0, Scala 2.10)","packageLabel":"spark-image-7efac6b9a8f2da59cb4f6d0caac46cfcb3f1ebf64c8073498c42d0360f846714","upgradable":true,"deprecated":true,"customerVisible":false,"capabilities":[]},{"key":"3.3.x-scala2.11","displayName":"3.3 (includes Apache Spark 2.2.0, Scala 2.11)","packageLabel":"spark-image-46cc39a9afa43fbd7bfa9f4f5ed8d23f658cd0b0d74208627243222ae0d22f8d","upgradable":true,"deprecated":true,"customerVisible":false,"capabilities":[]},{"key":"3.5.x-scala2.11","displayName":"3.5 LTS (includes Apache Spark 2.2.1, Scala 2.11)","packageLabel":"spark-image-6f5628e4962f9e1963e43a5acfbc89a3e32ef15d3da0462bca09a985bae2fe14","upgradable":true,"deprecated":false,"customerVisible":true,"capabilities":["SUPPORTS_END_TO_END_ENCRYPTION","SUPPORTS_TABLE_ACLS"]},{"key":"latest-experimental-scala2.11","displayName":"Latest experimental (4.2 snapshot, Scala 
2.11)","packageLabel":"spark-image-0592d44fb91741b05c314304a91661216c672f0424e342fe8a2a0b2c794d3cd0","upgradable":true,"deprecated":false,"customerVisible":false,"capabilities":["SUPPORTS_END_TO_END_ENCRYPTION","SUPPORTS_TABLE_ACLS","SUPPORTS_RSTUDIO"]},{"key":"3.2.x-scala2.11","displayName":"3.2 (includes Apache Spark 2.2.0, Scala 2.11)","packageLabel":"spark-image-5537926238bc55cb6cd76ee0f0789511349abead3781c4780721a845f34b5d4e","upgradable":true,"deprecated":true,"customerVisible":false,"capabilities":[]},{"key":"latest-rc-gpu-scala2.11","displayName":"Latest RC (4.2 snapshot, GPU, Scala 2.11)","packageLabel":"spark-image-ed2636c4e44f9d4ce48b6ea1b4de7d172e33d9971f82700788751efee16d6bd1","upgradable":true,"deprecated":false,"customerVisible":false,"capabilities":["SUPPORTS_RSTUDIO"]},{"key":"3.4.x-scala2.10","displayName":"3.4 (includes Apache Spark 2.2.0, Scala 2.10)","packageLabel":"spark-image-f28e110ac5f9efcb3f48d812bf34961f4fe17336d732e33de5ffa64f269ed59a","upgradable":true,"deprecated":false,"customerVisible":true,"capabilities":["SUPPORTS_END_TO_END_ENCRYPTION"]}],"enablePresentationMode":false,"enableClearStateAndRunAll":true,"enableTableAclsByTier":true,"enableRestrictedClusterCreation":false,"enableFeedback":false,"enableClusterAutoScaling":true,"enableUserVisibleDefaultTags":true,"defaultNumWorkers":8,"serverContinuationTimeoutMillis":10000,"jobsUnreachableThresholdMillis":60000,"driverStderrFilePrefix":"stderr","roundTripReportTimeoutMs":5000,"enableNotebookRefresh":true,"createTableInNotebookImportedFileLink":{"url":"https://docs.azuredatabricks.net/_static/notebooks/data-import/imported-file.html","displayName":"Imported File","workspaceFileName":"Imported File 
Not all notebook features are available within an iframe, but the main ones, such as syntax highlighting, are intact, and so is the ability to display table results (with column sort capability!). The proposed way of embedding notebook code in a blog post is just the first idea that came to my mind. There may be cleverer ways to achieve this; please let me know in the comments.

Connecting Azure Databricks to Data Lake Store

Just a quick post here to help anyone who needs to integrate their Azure Databricks cluster with Data Lake Store. This is not hard to do, but there are a few steps, so it’s worth recording them here in a quick and easy-to-follow form.

This assumes you have created your Databricks cluster and have created a Data Lake Store you want to integrate with. If you haven’t created your cluster, that’s described in a previous blog post, which you may find useful.

The objective here is to create a mount point, a folder in the lake accessible from Databricks, so we can read from and write to ADLS. Here this is done in Databricks notebooks using Python, but if Scala is your thing then it’s just as easy. To create the mount point you need to run the following command:

    configs = {"dfs.adls.oauth2.access.token.provider.type": "ClientCredential",
               "dfs.adls.oauth2.client.id": "{YOUR SERVICE CLIENT ID}",
               "dfs.adls.oauth2.credential": "{YOUR SERVICE CREDENTIALS}",
               "dfs.adls.oauth2.refresh.url": "https://login.microsoftonline.com/{YOUR DIRECTORY ID}/oauth2/token"}

    dbutils.fs.mount(
      source = "adl://{YOUR DATA LAKE STORE ACCOUNT NAME}.azuredatalakestore.net{YOUR DIRECTORY NAME}",
      mount_point = "{mountPointPath}",
      extra_configs = configs)

So to do this we need to collect together the values to use for:

    {YOUR SERVICE CLIENT ID}
    {YOUR SERVICE CREDENTIALS}
    {YOUR DIRECTORY ID}
    {YOUR DATA LAKE STORE ACCOUNT NAME}
    {YOUR DIRECTORY NAME}
    {mountPointPath}

First the easy ones: my data lake store is called “pythonregression” and I want the folder I am going to use to be ‘/mnt/python’. These are just my choices.

I need the service client ID and credentials. For this I will create a new Application Registration by going to the Active Directory blade in the Azure portal and clicking on “New Application Registration”. Fill in your chosen App name; here I have used the name ‘MyNewApp’, which I know is not really original.
Then press ‘Create’ to create the App registration. This will only take a few seconds, and you should then see your App registration in the list of available apps. Click on the App you have created to see its details.

Make a note of the ApplicationId GUID; this is the SERVICE CLIENT ID you will need. Then from this screen click the “Settings” button and then the “Keys” link. We are going to create a key specifically for this purpose. Enter a Key Description, choose a Duration from the drop-down, and when you hit “Save” a key will be produced. Save this key: it’s the value you need for YOUR SERVICE CREDENTIALS, and as soon as you leave the blade it will disappear.

We now have everything we need except the DIRECTORY ID. To get the DIRECTORY ID, go back to the Active Directory blade and click on “Properties”. From here you can get the DIRECTORY ID.

OK, one last thing to do. You need to grant the “MyNewApp” App access to the Data Lake Store, otherwise you will get access forbidden messages when you try to access ADLS from Databricks. This can be done from Data Explorer in ADLS.

Now we have everything we need to mount the drive. In Databricks, launch a workspace and then create a new notebook (as described in my previous post). Run the command we put together above in the Python notebook. You can then create directories and files in the lake from within your Databricks notebook.

If you want to, you can unmount the drive by calling dbutils.fs.unmount with the mount point path.

Something to note.
If you terminate the cluster (terminate meaning shut down), you can restart the cluster and the mounted folder will still be available to you; it doesn’t need to be remounted.

You can access the file system from Python as follows:

    with open("/dbfs/mnt/python/newdir/iris_labels.txt", "w") as outfile:

and then write to the file in ADLS as if it were a local file system.

OK, that was more detailed than I intended when I started, but I hope it was interesting and helpful. Let me know if you have any questions on any of the above, and enjoy the power of Databricks and ADLS combined.
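
For anyone who wants to tidy up the steps above, here is a rough sketch of how the mount arguments and the file write could be wrapped in small helpers. The function names build_adls_mount_args and write_lines are my own hypothetical helpers, not part of Databricks; the placeholder credentials, the "/" directory and the two example labels are assumptions for illustration only. The dbutils calls are left as comments because dbutils only exists inside a Databricks notebook, so a temporary folder stands in for /dbfs/mnt/python when run elsewhere.

```python
import os
import tempfile


def build_adls_mount_args(client_id, credential, directory_id,
                          store_name, directory, mount_point):
    """Hypothetical helper: assemble the keyword arguments for dbutils.fs.mount."""
    configs = {
        "dfs.adls.oauth2.access.token.provider.type": "ClientCredential",
        "dfs.adls.oauth2.client.id": client_id,
        "dfs.adls.oauth2.credential": credential,
        "dfs.adls.oauth2.refresh.url":
            "https://login.microsoftonline.com/{}/oauth2/token".format(directory_id),
    }
    return {
        "source": "adl://{}.azuredatalakestore.net{}".format(store_name, directory),
        "mount_point": mount_point,
        "extra_configs": configs,
    }


def write_lines(base_dir, relative_path, lines):
    """Hypothetical helper: write lines to a file under base_dir, creating folders first."""
    path = os.path.join(base_dir, relative_path)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as outfile:
        for line in lines:
            outfile.write(line + "\n")
    return path


mount_args = build_adls_mount_args(
    client_id="<application-id-guid>",    # placeholder, not a real value
    credential="<key-from-keys-blade>",   # placeholder, not a real value
    directory_id="<directory-id>",        # placeholder, not a real value
    store_name="pythonregression",
    directory="/",                        # assumption: mount the root folder
    mount_point="/mnt/python",
)

# Inside a Databricks notebook, where dbutils is predefined, you would run:
#   dbutils.fs.mount(**mount_args)
#   write_lines("/dbfs" + mount_args["mount_point"], "newdir/iris_labels.txt", labels)
#   dbutils.fs.unmount(mount_args["mount_point"])

# Outside Databricks, a temporary folder stands in for /dbfs/mnt/python.
demo_root = tempfile.mkdtemp()
demo_path = write_lines(demo_root, "newdir/iris_labels.txt",
                        ["setosa", "versicolor"])  # illustrative labels only
print(mount_args["source"])
```

Keeping the argument building separate from the dbutils call means the values can be checked before anything is mounted, which is handy when a mistyped directory ID would otherwise only show up as an authentication error.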