
Adatis BI Blogs

Setup Of Oracle Attunity Connectors

Fairly often I come across a project where there is a requirement to extract data from an Oracle source. Whilst this can be done in SSIS using a standard ODBC connection, there are some drawbacks – the main one for me being that with larger data volumes the speed of data transfer can be quite slow. As such I have found the best solution is generally to set up the Attunity connectors in order to facilitate quick extraction from the Oracle system. For a great comparison of the performance of straight ODBC versus Attunity, please see this post from Len Wyatt. Unfortunately the setup can be a little fiddly, so I thought I would document the process here (as much for me as anyone else).

1. The first step is to determine the version of the Attunity connectors required – this depends upon your SQL Server version:

SQL Server Version           Attunity Connector Version   Supported Oracle Versions
SQL Server 2016              v4.0                         Oracle 10.x, 11.x, 12c
SQL Server 2014              v3.0                         Oracle 10.x, 11.x, 12c
SQL Server 2012              v2.0                         Oracle 9.2.0.4 and higher, 10.x, 11.x
SQL Server 2008 / 2008 R2    v1.2                         Oracle 9.2.0.4 and higher, 10.x, 11.x

2. Following this you need to install the Oracle client. Generally you will be working in a 64-bit environment, which will require both the 64-bit and 32-bit versions of the Oracle client to be installed (Data Tools uses a 32-bit connection, but deployed packages should run in 64-bit mode for optimum performance). The Oracle client can be downloaded from http://www.oracle.com/technetwork/database/features/instant-client/index-097480.html – you will need to sign up on the site and create an account if you don't have one already.

NOTE: It is not enough to only download the connectors for SSIS and run the installation packages – in order to retrieve data you must install the Oracle client.

Once you have determined the correct version of the Attunity connectors, you will need to select the relevant version of the Oracle client you wish to use. Because Visual Studio runs SSIS packages in 32-bit mode, you will need the 32-bit version of the client installed on development machines. However, when you deploy the packages and run them from the SSIS Catalog they run in 64-bit mode, so you need to install the 64-bit version to get the best performance in production environments. There are several versions of the Oracle client available for download on the site; most recently I used the following version of the Oracle components. I find that these install relatively simply and provide the required features without requiring too much disk space or consuming too much in the way of resources.

Depending on the version of the client you have selected, the installation instructions for Oracle will vary. However I would suggest you follow these points:

1. Run the installation packages as an administrator by right-clicking and selecting 'Run as Administrator' – previously when I have not done this it has led to unusual results.
2. You will need to install the clients on all machines which will be running the SSIS packages – this may be several servers, including developer machines used for creating SSIS packages and servers used for scheduled ETL runs.
3. The installation should be done in the following order:
- Install Oracle 32-bit client (if required)
- Install Oracle 64-bit client (if required)
- Restart the machine

Once the client is installed, a simple way to test that it is working is to use SQL Developer to open a connection to the Oracle database. Once this is working correctly you then need to install the Attunity connectors. Again, depending on the system you are working on you will need to install either the 64-bit or 32-bit Attunity components for SSIS. However, here the rules are slightly different:
- If you need to run in 64-bit mode, or in both 64-bit and 32-bit mode, you should install only the 64-bit version of the components.
- If you need to run in 32-bit mode only, then you should install only the 32-bit version of the components.

The latest version of the connectors (for SQL 2016) is available from: https://www.microsoft.com/en-us/download/details.aspx?id=52950

The installation of the components is pretty straightforward – once you have downloaded them you run the executable and then restart the machine. Once completed, the Attunity components will be available for selection within the SSIS toolbox in the Data Flow, as per the below. The connectors are a very versatile component and support many of the features you would expect from an SSIS component. Be aware that if you are looking for expressions, these are to be found on the Data Flow properties rather than the properties of the actual component.

Slowly-Changing Parent-Child Dimensions – Part 2: The Implementation

First of all, an apology. It's always hard to recognise whether your posts are useful for people out there and this one slipped through the cracks. After several requests, I thought I'd finally get around to posting the related solution to a post I wrote two years ago! Apologies to anyone this could have helped quite some time ago.

To set the scene, I had to build a slowly-changing parent-child dimension and manage the relationships accordingly. I described the theory behind the solution in this post – probably best give that a skim if you're not familiar with the approach. To summarise: if a parent node changes, you create new SCD records for all of its descendants, essentially creating historic "branches" as each change happens.

Initial Data Setup

For the example, we'll use the AdventureWorksDW2012 sample DB as it has an org chart we can very easily play with. You can find the download here. We're going to create a new parent-child SCD dimension from dbo.DimOrganization by running the script below:

CREATE TABLE [dbo].[DimOrganizationSCD](
    [OrgKey] [int] IDENTITY(1,1),
    [ParentOrgKey] [int] NULL,
    [OrganizationID] [int] NOT NULL,
    [ParentOrganizationID] [int] NULL,
    [PercentageOfOwnership] [nvarchar](16) NULL,
    [OrganizationName] [nvarchar](50) NULL,
    [CurrencyKey] [int] NULL,
    [StartDate] datetime NOT NULL,
    [EndDate] datetime NULL,
    [CurrentFlag] [int] NOT NULL
) ON [PRIMARY]
GO

INSERT INTO [dbo].[DimOrganizationSCD]
    (ParentOrgKey, OrganizationID, ParentOrganizationID, PercentageOfOwnership, OrganizationName, CurrencyKey, StartDate, EndDate, CurrentFlag)
SELECT  -1 [ParentOrgKey]
       ,[OrganizationKey]
       ,[ParentOrganizationKey]
       ,[PercentageOfOwnership]
       ,[OrganizationName]
       ,[CurrencyKey]
       ,'2014-01-01 00:00:00.000' [StartDate]
       ,NULL [EndDate]
       ,1 [CurrentFlag]
FROM [dbo].[DimOrganization]

UPDATE C
SET ParentOrgKey = P.OrgKey
FROM [dbo].[DimOrganizationSCD] C
INNER JOIN [dbo].[DimOrganizationSCD] P ON C.ParentOrganizationID = P.OrganizationID

Take a look in the table and you'll see something like this: we now have a separation between the unique business identifier for each organisational entity (OrganizationID) and its historical version (OrgKey).

The Org Change Occurs

Next let's make a change to track – maybe the North American operation has been split off as a subsidiary and we need to insert a hierarchy level for the new brand "SuperCycles US". We'll create an 'updates' table to represent the delta coming from our ETL and put in two records – the new organisation record for SuperCycles, and the update to North America Operations to represent its new parent ID.

CREATE TABLE [dbo].[DimOrganization_Updates](
    [OrganizationID] [int] NOT NULL,
    [ParentOrganizationID] [int] NULL,
    [PercentageOfOwnership] [nvarchar](16) NULL,
    [OrganizationName] [nvarchar](50) NULL,
    [CurrencyKey] [int] NULL
) ON [PRIMARY]
GO

INSERT INTO [dbo].[DimOrganization_Updates]
VALUES (15, 1, 1, 'SuperCycles US', 100),
       (2, 15, 1, 'North America Operations', 100)

The ETL Logic

For this example, I've written it as one big stored procedure, with all the logic occurring in one script.
I'd normally break this into separate procedures, with SSIS handling data flows, lookups etc. and logging the results of individual scripts, but a single SQL flow is easier to demonstrate. Firstly, we create a couple of variables used within the proc:

--Create a table used to hold all records impacted by the hierarchy changes
DECLARE @Updates TABLE
(
    [OrganizationID] [int] NOT NULL,
    [ParentOrganizationID] [int] NULL,
    [PercentageOfOwnership] [nvarchar](16) NULL,
    [OrganizationName] [nvarchar](50) NULL,
    [CurrencyKey] [int] NULL
);

--Use a common timestamp across all SCD updates
DECLARE @SCDDate datetime = getdate();

Now, as far as any children of North America Operations go, they technically haven't changed and so we won't see an update for them. Our ETL logic needs to take that into account. We take our update rows and build a recursive CTE that finds all active child rows in our dimension:

--Use a recursive CTE to find all updates AND all descendants of updated records
WITH OrgUpdates AS
(
    --Our changed records
    SELECT U.OrganizationID, U.ParentOrganizationID, U.PercentageOfOwnership, U.OrganizationName, U.CurrencyKey
    FROM [dbo].[DimOrganization_Updates] U
    LEFT JOIN [dbo].[DimOrganizationSCD] D on U.OrganizationID = D.OrganizationID and D.CurrentFlag = 1

    UNION ALL

    --Our descendants (this will recurse until no more descendants are found)
    SELECT D.OrganizationID, D.ParentOrganizationID, D.PercentageOfOwnership, D.OrganizationName, D.CurrencyKey
    FROM [dbo].[DimOrganizationSCD] D
    INNER JOIN OrgUpdates U on D.ParentOrganizationID = U.OrganizationID and D.CurrentFlag = 1
)
--Store the results of the CTE in a table variable
INSERT INTO @Updates
SELECT OrganizationID, ParentOrganizationID, PercentageOfOwnership, OrganizationName, CurrencyKey
FROM OrgUpdates

This will recurse a maximum of 100 times by default, so you'll need to override MAXRECURSION if your parent-child structure has more than 100 levels. Hopefully this isn't an issue for anyone! We now have a table variable populated with each of the business IDs that relates to a changed record, or a descendant of a changed record.

From here on, we can follow a standard SCD Type 2 method – we stamp the existing records with an end date and amend the current flag, then insert the new records. The end-dating of the old records needs to be done first, as it is easier to identify the new records by their current flag once they have been inserted. So the historical update can be something like:

--Mark the existing records as old by finding all related records that are currently active.
--It is very important that we do this before inserting the new records as otherwise we would not be able to distinguish between old and new!
UPDATE D
SET EndDate = @SCDDate, CurrentFlag = 0
FROM @Updates U
INNER JOIN [dbo].[DimOrganizationSCD] D on U.OrganizationID = D.OrganizationID and D.CurrentFlag = 1

And then we insert the new records:

--Insert the new rows into the dimension, these will each have new surrogate IDs associated
--These will all have NULL ParentOrgKey values as we need to wait for all surrogates to be generated
INSERT INTO [dbo].[DimOrganizationSCD]
    (OrganizationID, ParentOrganizationID, PercentageOfOwnership, OrganizationName, CurrencyKey, StartDate, EndDate, CurrentFlag)
SELECT OrganizationID, ParentOrganizationID, PercentageOfOwnership, OrganizationName, CurrencyKey, @SCDDate StartDate, NULL EndDate, 1 CurrentFlag
FROM @Updates

Note that at this point, we're taking the full cut of records from our "Updates" table, so this will include any new records.
Our "SuperCycles US" record will have been inserted here, along with the updated records. The newly inserted records have deliberately not had their ParentOrgKey populated, as for many of the records the parent's surrogate key has not yet been generated. We need to run an update on the dimension table after the insert to go through and fill this in.

--Finally, update the ParentOrgKey for all current records
UPDATE C
SET ParentOrgKey = P.OrgKey
FROM [dbo].[DimOrganizationSCD] C
INNER JOIN [dbo].[DimOrganizationSCD] P ON C.ParentOrganizationID = P.OrganizationID
WHERE C.CurrentFlag = 1 AND P.CurrentFlag = 1

And that's it – we now have a new branch of the organisation chart to represent today's changes. Taking another look at our dimension table, we now have a new set of active records representing the current org hierarchy.

Improvements

This particular logic has been written to demonstrate an approach; there are many improvements you may want to make if including it in a large, resilient ETL process. Firstly, as mentioned above, I would separate the logic into individual stored procedures that each return rowcounts for better logging. You can also amend the insertion procedure to take original values if you need to mix and match SCD types and include a Type 3 "original value" column. Finally, you may want to include a HierarchyId-type column and populate it using the full grandparent > parent > child surrogate key path; this allows you to query for such things as "all descendants" without having to build in recursion, and can vastly improve performance if you're going to be running aggregations over huge volumes directly on the SQL layer (see the sketch after the code extract below).

Further Reading

Recursive Query using Common Table Expressions
Hierarchical Data

Code Extract

Putting all the code together, our ETL query is as follows:

--Create a table used to hold all records impacted by the hierarchy changes
DECLARE @Updates TABLE
(
    [OrganizationID] [int] NOT NULL,
    [ParentOrganizationID] [int] NULL,
    [PercentageOfOwnership] [nvarchar](16) NULL,
    [OrganizationName] [nvarchar](50) NULL,
    [CurrencyKey] [int] NULL
);

--Use a common timestamp across all SCD updates
DECLARE @SCDDate datetime = getdate();

--Use a recursive CTE to find all updates AND all descendants of updated records
WITH OrgUpdates AS
(
    --Our changed records
    SELECT U.OrganizationID, U.ParentOrganizationID, U.PercentageOfOwnership, U.OrganizationName, U.CurrencyKey
    FROM [dbo].[DimOrganization_Updates] U
    LEFT JOIN [dbo].[DimOrganizationSCD] D on U.OrganizationID = D.OrganizationID and D.CurrentFlag = 1

    UNION ALL

    --Our descendants (this will recurse until no more descendants are found)
    SELECT D.OrganizationID, D.ParentOrganizationID, D.PercentageOfOwnership, D.OrganizationName, D.CurrencyKey
    FROM [dbo].[DimOrganizationSCD] D
    INNER JOIN OrgUpdates U on D.ParentOrganizationID = U.OrganizationID and D.CurrentFlag = 1
)
--Store the results of the CTE in a table variable
INSERT INTO @Updates
SELECT OrganizationID, ParentOrganizationID, PercentageOfOwnership, OrganizationName, CurrencyKey
FROM OrgUpdates

--Mark the existing records as old by finding all related records that are currently active.
--It is very important that we do this before inserting the new records as otherwise we would not be able to distinguish between old and new!
UPDATE D
SET EndDate = @SCDDate, CurrentFlag = 0
FROM @Updates U
INNER JOIN [dbo].[DimOrganizationSCD] D on U.OrganizationID = D.OrganizationID and D.CurrentFlag = 1

--Insert the new rows into the dimension, these will each have new surrogate IDs associated
--These will all have NULL ParentOrgKey values as we need to wait for all surrogates to be generated
INSERT INTO [dbo].[DimOrganizationSCD]
    (OrganizationID, ParentOrganizationID, PercentageOfOwnership, OrganizationName, CurrencyKey, StartDate, EndDate, CurrentFlag)
SELECT OrganizationID, ParentOrganizationID, PercentageOfOwnership, OrganizationName, CurrencyKey, @SCDDate StartDate, NULL EndDate, 1 CurrentFlag
FROM @Updates

--Finally, update the ParentOrgKey for all current records
UPDATE C
SET ParentOrgKey = P.OrgKey
FROM [dbo].[DimOrganizationSCD] C
INNER JOIN [dbo].[DimOrganizationSCD] P ON C.ParentOrganizationID = P.OrganizationID
WHERE C.CurrentFlag = 1 AND P.CurrentFlag = 1
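As a rough illustration of the HierarchyId improvement mentioned above – this is my own sketch rather than part of the original solution, reusing the table and column names from the example – the surrogate key path for the current rows could be materialised and cast to hierarchyid like this:

--Hedged sketch of the "materialised path" improvement: build the surrogate key path
--for the current rows and cast it to hierarchyid, so "all descendants" queries no
--longer need recursion at query time.
WITH OrgPath AS
(
    --Root rows: the initial load left the root's ParentOrgKey as -1
    SELECT OrgKey, CAST('/' + CAST(OrgKey AS varchar(10)) + '/' AS varchar(4000)) AS KeyPath
    FROM dbo.DimOrganizationSCD
    WHERE CurrentFlag = 1 AND (ParentOrgKey = -1 OR ParentOrgKey IS NULL)

    UNION ALL

    --Child rows: append each child's surrogate key to its parent's path
    SELECT C.OrgKey, CAST(P.KeyPath + CAST(C.OrgKey AS varchar(10)) + '/' AS varchar(4000))
    FROM dbo.DimOrganizationSCD C
    INNER JOIN OrgPath P ON C.ParentOrgKey = P.OrgKey
    WHERE C.CurrentFlag = 1
)
SELECT OrgKey, CAST(KeyPath AS hierarchyid) AS OrgHierarchy
FROM OrgPath;

Persisting this value on the dimension after each load would then let descendant queries use hierarchyid methods such as IsDescendantOf() rather than a recursive CTE.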

Nonclustered Columnstore Indexes And Stored Procedures

I was recently working on a project with a developer who needed to execute a stored procedure which followed this pattern (amended for simplicity):

CREATE PROC ExampleProc
AS
IF (SELECT COUNT(*) FROM sys.indexes WHERE name = 'NCCIDX_DemoIndex') > 0
     DROP INDEX [NCCIDX_DemoIndex] ON [dbo].[Demo]

INSERT INTO [dbo].[Demo] ([Field1], [Field2], [Field3])
VALUES ('Some','Test','Data')
      ,('More','Test','Data')

CREATE NONCLUSTERED COLUMNSTORE INDEX [NCCIDX_DemoIndex] ON [dbo].[Demo]
(
    [Field1],
    [Field2],
    [Field3]
) WITH (DROP_EXISTING = OFF)

Unfortunately the procedure would fail intermittently on execution, returning the following error message:

Msg 35330, Level 15, State 1, Procedure ExampleProc, Line 7
INSERT statement failed because data cannot be updated in a table with a columnstore index. Consider disabling the columnstore index before issuing the INSERT statement, then rebuilding the columnstore index after INSERT is complete.

The reason for the error message is that, at the time of compilation, if the columnstore index exists the optimiser decides the INSERT statement will fail (although we drop the index within the procedure, this is not taken into account by the optimiser). If the procedure is executed and generates a plan at a time when the index does not exist, it will run as expected. However, we cannot resolve the issue reliably by first removing the index and then creating the procedure, because when the plan is subsequently flushed from the plan cache (something which can be forced using DBCC FREEPROCCACHE) the optimiser will create a new plan, and if the index exists when this happens the query will fail.

The solution I found was to add OPTION (RECOMPILE) to the INSERT statement; this causes it to be assessed at runtime, when the index has been dropped, stopping the error. Be aware that this forces a recompilation each time the procedure is run, adding a small overhead – whilst this was quite acceptable within the DW environment, you will need to make your own assessment. The amended procedure therefore takes the following form:

CREATE PROC ExampleProc
AS
IF (SELECT COUNT(*) FROM sys.indexes WHERE name = 'NCCIDX_DemoIndex') > 0
     DROP INDEX [NCCIDX_DemoIndex] ON [dbo].[Demo]

INSERT INTO [dbo].[Demo] ([Field1], [Field2], [Field3])
VALUES ('Some','Test','Data')
      ,('More','Test','Data')
OPTION (RECOMPILE)

CREATE NONCLUSTERED COLUMNSTORE INDEX [NCCIDX_DemoIndex] ON [dbo].[Demo]
(
    [Field1],
    [Field2],
    [Field3]
) WITH (DROP_EXISTING = OFF)
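The error text itself suggests another route – disabling the index rather than dropping and recreating it. A hedged sketch of that variant (the procedure name is a placeholder, and I haven't compared its performance against the drop/recreate approach) would be:

--Sketch of the disable/rebuild variant suggested by the error message.
--OPTION (RECOMPILE) is kept on the assumption that the same compile-time check
--applies if the plan happens to be built while the index is enabled.
CREATE PROC ExampleProcDisableRebuild
AS
ALTER INDEX [NCCIDX_DemoIndex] ON [dbo].[Demo] DISABLE

INSERT INTO [dbo].[Demo] ([Field1], [Field2], [Field3])
VALUES ('Some','Test','Data')
      ,('More','Test','Data')
OPTION (RECOMPILE)

--Rebuilding re-enables the columnstore index
ALTER INDEX [NCCIDX_DemoIndex] ON [dbo].[Demo] REBUILD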

Optimising SSIS Lookups with HASHBYTES()

I was recently working on a load for a dimension table which worked like the below. The package follows a fairly standard routine: data for the dimension is transformed into the required format using SSIS and then inserted into the target table. On the way, an initial check is made to determine if the record is new ('LKU – Check Exists') and a secondary check determines if the record has changed, which is subsequently used in an update routine.

Although the package was incrementally loading, it sometimes had to check several million rows to determine if they had changed. As the table was quite wide, I decided to implement the HASHBYTES() function in order to take a hash of the columns for comparison, rather than comparing each of the columns individually. The initial results were quite promising, with the package running in around half the time. However, on a particularly big load I noticed the package failed as it had run out of memory for the second lookup (full cache was used). I found this odd because the HASHBYTES function only returns a single value, which I anticipated would be smaller than the sum of the columns. On a little deeper investigation I found that the column returned by the HASHBYTES function defaults to a size of 8000 bytes, as its return type is varbinary(8000). A quick check of the SQL documentation for HASHBYTES at http://msdn.microsoft.com/en-GB/library/ms174415.aspx states that the data returned when it is used with the SHA2-256 algorithm is 32 bytes, meaning that most of this space was being wasted. Therefore changing the formula from 1 to 2 below significantly increases the efficiency of the lookup and also makes the required cache size smaller by a factor of around 250!

1. HASHBYTES('SHA2_256', [ProductName] + '|' + [CategoryName]) AS HashValue
2. CONVERT(VARBINARY(32), HASHBYTES('SHA2_256', [ProductName] + '|' + [CategoryName])) AS NewHash

Upon doing this the size of each row was significantly reduced, the package is running faster than before and, most importantly, there are no memory issues :)
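As a side note, if any of the concatenated columns can be NULL the whole hash input becomes NULL by default, so it is worth wrapping each column. A hedged sketch of the lookup source query pattern (the table and column names here are illustrative, not from the original package):

--Sketch of a lookup source query returning a fixed-width 32-byte hash,
--guarding against NULLs so a single NULL column does not wipe out the hash input.
SELECT  ProductKey,
        CONVERT(VARBINARY(32),
                HASHBYTES('SHA2_256',
                          ISNULL([ProductName], '') + '|' + ISNULL([CategoryName], ''))
               ) AS HashValue
FROM    dbo.DimProduct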

Prototyping Analysis Services Cubes

After running a modelling workshop with end users, my next activity on a project is generally to produce a prototype model with some test data, in order to check with users that the planned star schema design will work. My tool of choice for this job tends to be Power Pivot, but I was presented with a slightly different solution when working with a client recently. This got me thinking it would be worth contrasting the two approaches.

Power Pivot

First of all, let's start off with Power Pivot. The goal here is to quickly produce a prototype to verify the star schema design. I tend to just request that a business user brings along an Excel workbook with samples of each dimension and fact pre-extracted in Excel, which can then be loaded into Power Pivot. I find that this saves time, although as an alternative Power Pivot could of course extract directly from SQL Server, Oracle etc. if needed. As an example, we can quickly walk through creating a very simple model to create a star schema.

I've copied the data that represents my single fact table and related dimensions into Excel. In this case it's to model a simple Sales Order Transactions fact, therefore my Excel workbook has dimensions such as Product, Customer, Sales Territory etc., as well as the data that represents the fact table. After completing and closing the workbook, we create a new workbook and open the Power Pivot window. Within the Power Pivot window, the next step is to import the data from the Excel workbook created in step 1. This can be done by choosing From Other Sources -> Excel workbook and then browsing to the file. Power Pivot will now present us with all of the sheets that it finds in the workbook, so we select the sheets that represent the dimensions and facts. The next step is to set up relationships between the various tables. Once this is done, we have a simple prototype data model, as shown below. After a bit of tidying up (e.g. creating some hierarchies and removing unwanted columns), we can now connect to a Pivot Table in Excel that will help verify the star schema.

The key point here is that we have rapidly built a prototype without the need for any ETL. This means we can quickly cover the design issues with the users, but this time with their actual data, rather than just on a whiteboard.

Analysis Services

The alternative approach that was presented to me recently is to develop the Analysis Services cube/Tabular model before carrying out the ETL. Essentially this means that the users connect to the cube as a production system, unaware that under the hood the ETL is not yet complete. How is this achieved? Essentially by putting logic in the SQL view layer. It's a well-known Analysis Services best practice to bind your SSAS objects (e.g. dimensions, measure groups) to SQL Server views, rather than using objects within the DSV. Rather than each view pulling its data from a complete dimension or fact table, each view would instead pull its data from a staging area or a copy of the source database. The idea being that, over time, the contents of each view would be updated to point to the actual dimensions and facts, once they are built.
Therefore a normal view (in this case for a product dimension) that feeds Analysis Services might look like:

SELECT ProductKey,
       ProductName,
       ProductCategory,
       ProductSubCategory
FROM Dim.Product

With the 'No ETL' approach, the view looks very different, as any transformation and cleaning will have to be carried out within the view:

SELECT ROW_NUMBER() OVER(ORDER BY PROD.ProductID) AS ProductKey,
       PROD.Name AS ProductName,
       CAST(ISNULL(SUB.Name, 'Unknown') AS VARCHAR(100)) AS ProductSubCategory,
       CAST(ISNULL(CAT.Name, 'Unknown') AS VARCHAR(100)) AS ProductCategory
FROM [$(AdventureWorks)].Production.Product PROD
LEFT JOIN [$(AdventureWorks)].Production.ProductSubcategory SUB ON SUB.ProductSubcategoryID = PROD.ProductSubcategoryID
LEFT JOIN [$(AdventureWorks)].Production.ProductCategory CAT ON CAT.ProductCategoryID = SUB.ProductCategoryID

This therefore incurs some technical debt, as the cube gets built before the ETL or even the physical dimension table. But the idea is that you can get the users using the cube, even on production, then gradually replace the views with proper ETL and dimension tables once you have some more feedback.

Conclusion

The Power Pivot method is tried and tested for me many times – I find that it's very quick to get up and running. The Analysis Services approach is not quite as quick to get up and running, but offers the benefit that the cube can continually evolve in a fairly agile manner. It's early days for me using this approach, but a disadvantage is that the views can get quite complex, impacting performance depending on the data volumes. Both methods probably have their place. If I just wanted to validate my design, I think I would continue to produce a quick Power Pivot model. On the other hand, if quick delivery to production is a must, and the complexity/data volumes are not an issue, then I would use the view approach.
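To illustrate the "gradually replace the views" step described above, here is a hedged sketch (the view name [Cube].[Product] is my own placeholder) of re-pointing the product view once the real dimension table and its ETL exist – the SSAS objects bound to the view do not need to change:

--Once Dim.Product is physically built and loaded, the same view is simply
--re-pointed at it, replacing the transformation logic shown earlier.
ALTER VIEW [Cube].[Product]
AS
SELECT ProductKey,
       ProductName,
       ProductCategory,
       ProductSubCategory
FROM Dim.Product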

SQL 2014: Boosting ETL Performance using Memory-Optimised Tables

With SQL Server 2014 due for release soon, there's a lot of talk about how little is changing for those of us more interested in Business Intelligence. We'll be getting clustered, updatable columnstore, which is very convenient, but I've been looking into the potential benefits of other new features. The real gem for me is the potential of memory-optimised tables, or in-memory OLTP (previously known as Project Hekaton).

I've had a few questions around how columnstore and memory-optimised tables differ – they're both in-memory solutions for data storage, right? The big difference comes in disk usage – in order for a columnstore index to exist, there must also be a physical table written to disk. Any updates to the table, even in SQL Server 2014, will require disk writes, and log writes too if you're not bulk loading. Memory-optimised tables don't have this overhead – we can push the staging data straight into memory and skip the disk entirely.

There are plenty of uses for this new type of table, but it's the traditional ETL routine I'm interested in. We would normally pull data from our source systems, place it in a staging database while we clean, validate etc., then write the cleaned data to the warehouse. The data written to the staging database is transient – it's only of use during the load process. It is therefore redundant to write this data to disk each time; I believe this is where memory-optimised tables will really have an impact.

To prove this theory, I set up a quick demonstration. I started by creating two identical tables, one physical, one memory-optimised, both schema copies of [AdventureWorks2012].[Sales].[SalesOrderDetail]. The memory-optimised table was created using a script along the lines of the sketch at the end of this post. I then built the simplest of SSIS packages: a data flow to move data from AdventureWorks2012.Sales.SalesOrderDetail to a 'staging' table, then read the data from the staging table and write it back to a disk-based table, representing our final warehouse. I configured two versions of the package, one where the staging table was disk-based, one where it was the memory-optimised table. I deployed and scheduled these packages as alternating jobs and kicked them off for a few hours to see how performance varied over time. The results are as follows:

The initial results are very promising – we see a huge reduction in execution time when using in-memory tables for our transient ETL data. We're still performing minimal logging with this setup – you can reduce this even further by switching to SCHEMA_ONLY durability, but by bypassing writing the data to disk we've cut our ETL window in half. There are obvious hardware limits to how this can be used, and in-memory tables have their own limitations, but implemented carefully it could change the face of warehouse loading drastically.
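Since the original script isn't included above, the following is a hedged sketch of what a memory-optimised staging copy of SalesOrderDetail could look like in SQL Server 2014 – the column list is cut down rather than a full schema copy, the table and constraint names are placeholders, and the database needs a MEMORY_OPTIMIZED_DATA filegroup before this will run:

--Hedged sketch only: a memory-optimised staging table shaped like Sales.SalesOrderDetail.
--SQL Server 2014 requires a primary key on memory-optimised tables; a nonclustered hash
--index is used here. DURABILITY = SCHEMA_AND_DATA keeps the data logged; switching to
--SCHEMA_ONLY removes that logging for purely transient staging data.
CREATE TABLE dbo.SalesOrderDetail_Staging
(
    SalesOrderID       int      NOT NULL,
    SalesOrderDetailID int      NOT NULL,
    ProductID          int      NOT NULL,
    OrderQty           smallint NOT NULL,
    UnitPrice          money    NOT NULL,
    CONSTRAINT PK_SalesOrderDetail_Staging
        PRIMARY KEY NONCLUSTERED HASH (SalesOrderID, SalesOrderDetailID)
        WITH (BUCKET_COUNT = 1048576)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);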