Jonathon Eve-O'Connor's Blog

Setup Of Oracle Attunity Connectors

Fairly often I come across a project where there is a requirement to extract data from an Oracle source. Whilst this can be done in SSIS using a standard ODBC connection there are some drawbacks to this – the main one for me is that when working with larger data volumes the speed of data transfer can be quite slow.

As such I have found the best solution is generally to setup the Attunity Connectors in order to facilitate quick extraction from the Oracle system. For a great comparison on the performance between straight ODBC and Attunity – please see this post from Len Wyatt.

Unfortunately the setup can be a little bit fiddly so I thought I would document the process here (as much for me as anyone else).

1. The first step is to determine the version of the Attunity connectors required – this depends upon your SQL Server version:

SQL Server Version        | Attunity Connector Version | Supported Oracle Versions
SQL Server 2016           | v4.0 | Oracle 10.x; Oracle 11.x; Oracle 12c
SQL Server 2014           | v3.0 | Oracle 10.x; Oracle 11.x; Oracle 12c
SQL Server 2012           | v2.0 | Oracle 9.2.0.4 and higher; Oracle 10.x; Oracle 11.x
SQL Server 2008 / 2008 R2 | v1.2 | Oracle 9.2.0.4 and higher; Oracle 10.x; Oracle 11.x

2. Following this you need to install the Oracle client. Generally you will be working in a 64-bit environment, which requires both the 64-bit and 32-bit versions of the Oracle client to be installed (Data Tools uses a 32-bit connection, but deployed packages should run in 64-bit mode for optimum performance). The Oracle client can be downloaded from the Oracle website; you will need to sign up and create an account if you don’t have one already.

NOTE: It is not enough to only download the connectors for SSIS and run the installation packages – in order to retrieve data you must also install the Oracle client.

Once you have determined the correct version of the Attunity connectors, you will need to select the relevant version of the Oracle client. Because Visual Studio runs SSIS packages in 32-bit mode, you will need the 32-bit version of the client installed on development machines. However, when you deploy the packages and run them from the SSIS Catalog they run in 64-bit mode, so you need to install the 64-bit version to get the best performance in production environments.

There are several versions of the Oracle client available for download on the site; most recently I used the following version of the Oracle components. I find that these install relatively simply and provide the required features without requiring too much disk space or consuming too many resources.


Depending on the version of the client you have selected, the Oracle installation instructions will vary. However, I would suggest you follow these points:

1. Run the installation packages as an administrator by right clicking and selecting ‘Run as Administrator’ – previously when I have not done this it has led to unusual results.

2. You will need to install the clients on all machines which will run the SSIS packages – this may be several servers, including developer machines used for creating SSIS packages and servers used for scheduled ETL runs.

3. The installation should be done in the following order:

- Install Oracle 32-bit Client (if required)
- Install Oracle 64-bit Client (if required)
- Restart machine

Once the client is installed a simple way to test that the client is working is to use SQL Developer to open the connection to the Oracle database. Once this is working correctly you then need to install the Attunity connectors.
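Any trivial query is enough for this test – if the following returns the current date and time, the client, TNS configuration and credentials are all working (DUAL is Oracle's standard one-row dummy table):

```sql
-- Run from SQL Developer against the Oracle source to confirm connectivity.
SELECT sysdate FROM dual;
```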

Again, depending on the system you are working on, you will need to install either the 64-bit or 32-bit Attunity components for SSIS. However, here the rules are slightly different.

- If you need to run in 64-bit mode, or in both 64-bit and 32-bit modes, you should install only the 64-bit version of the components.
- If you need to run in 32-bit mode only, then you should install only the 32-bit version of the components.

The latest version of the connectors (for SQL Server 2016) is available from the Microsoft Download Center.

The installation of the components is pretty straightforward – once you have downloaded them, run the executable and then restart the machine.

Once completed, the Attunity components will be available for selection within the SSIS toolbox in the Data Flow, as per the below. The connectors are very versatile components and support many of the features you would expect from an SSIS component. Be aware that if you are looking for expressions, these are to be found on the Data Flow task properties rather than the properties of the component itself.


Optimising SSIS Lookups with HASHBYTES()

I was recently working on a load for a Dimension table which worked like the below.


The package follows a fairly standard routine. Data for the Dimension is transformed into the required format using SSIS; following this, it is inserted into the target table. On the way, an initial check is made to determine if the record is new (‘LKU – Check Exists’), and a secondary check determines if the record has changed, which is subsequently used in an update routine.

Although the package was incrementally loading, it sometimes had to check several million rows to determine if these had changed. As the table was quite wide, I decided to implement the HASHBYTES() function in order to take a hash of the columns for comparison, rather than comparing each of the columns individually.

The initial results were quite promising, with the package running in around half the time. However, on a particularly big load I noticed the package failed as it had run out of memory for the second lookup (full cache was used). I found this odd because the HASHBYTES function only returns a single value, and I anticipated this would be smaller than the sum of the columns. On a little deeper investigation I found that the value returned by HASHBYTES is typed as VARBINARY(8000) by default, so each row in the lookup cache reserves 8000 bytes for it.


A quick check of the SQL documentation for the HASHBYTES function states that the size of the data returned when it is used with the SHA2_256 algorithm is 32 bytes, meaning that most of this space was being wasted.

Therefore changing the formula from 1 to 2 below significantly increases the efficiency of the lookup and also makes the required cache size smaller by a factor of around 250 (8000 / 32)!

1. HASHBYTES('SHA2_256',[ProductName] + '|' + [CategoryName]) AS HashValue

2. CONVERT(VARBINARY(32),HASHBYTES('SHA2_256',[ProductName] + '|' + [CategoryName])) AS NewHash

Upon doing this the size of each row was significantly reduced, the package is running faster than before and, most importantly, there are no memory issues :)
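If you want to check what the engine will report for each expression without running the package, sys.dm_exec_describe_first_result_set shows the declared type of each column (a quick sketch – the test literal and aliases are my own, not from the package):

```sql
-- Show the declared type of the raw hash vs the CONVERTed hash.
SELECT name, system_type_name
FROM sys.dm_exec_describe_first_result_set(N'
    SELECT HASHBYTES(''SHA2_256'', N''test'') AS HashValue
         , CONVERT(VARBINARY(32), HASHBYTES(''SHA2_256'', N''test'')) AS NewHash',
    NULL, 0);
-- HashValue is reported as varbinary(8000) and NewHash as varbinary(32),
-- even though DATALENGTH() of both values is 32 bytes.
```

One other thing to watch with the concatenation approach: if any column in the list is NULL, the whole expression – and therefore the hash – becomes NULL, so wrapping each column in ISNULL() is a sensible precaution.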


Different Methods Of Type 2 Dimension Population

This post shares some of my findings about the different ways of dealing with Type 2 changes in an SSIS environment. I wanted to provide a comparison of the methods in terms of performance, reusability and ease of maintenance.

For the purposes of the testing, let’s take the scenario where a client would like to implement historical tracking in their DW environment for one of their dimensions. Let’s say that they have given the following requirements:-

  • The ordered online flag should be replaced with a more user-friendly text field containing ‘Yes’ or ‘No’ (done in a derived column transformation/SQL CASE statement).
  • The client would like to track any changes to the Purchase Order Number, Customer Account Number or Carrier Tracking Number so that they can preserve the history of these changes. These will be set up as Type 2 changing attributes, and we will need to record the time at which each change occurred.
  • Any changes to the OrderedOnline field will be the result of corrections made to the source data; historic tracking of changes will not be required for this in the DW, and all rows should be updated with the new value (a Type 1 change).

I created the following table to hold the data:

CREATE TABLE dbo.DimOrder
    (Order_Key INT IDENTITY(1,1) PRIMARY KEY NOT NULL --Surrogate Key
    ,SalesOrderID INT NOT NULL --Source Key
    ,LineNumber INT NOT NULL --Source Key
    ,OrderNumber NVARCHAR(25) NOT NULL
    ,PurchaseOrderNumber NVARCHAR(25) NOT NULL
    ,AccountNumber NVARCHAR(15) NOT NULL
    ,CarrierTrackingNumber NVARCHAR(25)
    ,OrderedOnline NVARCHAR(3) NOT NULL
    ,RecordValidFromDate DATETIME NOT NULL
    ,RecordExpiredDate DATETIME
    )

The three methods I have selected are:-

  1. The SSIS inbuilt Slowly Changing Dimension (SCD) Task.
  2. A custom built SSIS package to perform the Type 2 changes.
  3.  Using a T-SQL Procedure to perform the dimension population.

In each of the comparisons I have performed the following tests:-

  1. An initial load of data (all of the records are new and there are no changes).
  2. A load of the same data from Step 1 (no changes, all records exist).
  3. A load of data which contains both Type 1 and Type 2 changes.

I used Adventure Works as a source and Step 1 & 2 both use the same source statement which looks like this:


FROM Sales.SalesOrderHeader OrdHed
INNER JOIN Sales.SalesOrderDetail OrdDet
    ON OrdHed.SalesOrderID = OrdDet.SalesOrderID

For the third test I created some copies of the source tables and then ran some basic statements on them to modify the data as below:

SELECT * INTO [Sales].SalesOrderDetailTest FROM [Sales].SalesOrderDetail
SELECT * INTO [Sales].SalesOrderHeaderTest FROM [Sales].SalesOrderHeader

--this will only exhibit Type 1 changes

UPDATE Sales.SalesOrderHeaderTest
SET OnlineOrderFlag = CASE WHEN OnlineOrderFlag = 1 THEN 0 ELSE 1 END
WHERE SalesOrderID BETWEEN 59392 AND 67257

--this will exhibit only a Type 2 change

UPDATE Sales.SalesOrderHeaderTest
SET PurchaseOrderNumber = 'X' + PurchaseOrderNumber
WHERE salesOrderID < 51525

--this does a mix of both.

UPDATE Sales.SalesOrderHeaderTest
SET PurchaseOrderNumber = 'X' + PurchaseOrderNumber
    ,OnlineOrderFlag = CASE WHEN OnlineOrderFlag = 1 THEN 0 ELSE 1 END
WHERE SalesOrderID BETWEEN 51524 AND 59391

The data is then sourced from the new tables which provide the changes for testing.

1. SSIS Slowly Changing Dimension Task

Configuration of the package is very straightforward; you add a data source, drag a Slowly Changing Dimension task onto the data flow and double-click it to configure how you want it to run. The whole process takes less than 10 minutes to complete and there are many options that should suit a variety of needs.



Once the configuration is completed we end up with a dataflow which looks like this:

2. Custom Built SSIS Package

Next I designed a package in SSIS but without using the SCD wizard. I used the following principles during design to try and improve performance:

1. Changes can be detected in the data flow by using simple lookup transformations (we could consider hash checking to improve performance if there were very many columns) to see if there is any change to the data; by using several lookups we can determine if the change is Type 1, Type 2 or both.

2. Updates in the data flow are performed in SSIS using the ‘OLE DB Command’ task. As this works on a row-by-row basis, it has a large impact on performance. To avoid this, I created tables to hold the Type 1 and Type 2 changes and then used an ‘Execute SQL Task’ in the control flow to make the updates in one go.

The completed dataflow looks like this:



New records go straight into the target table, while the ones which have changes are split using a multicast. Where a change has occurred, the records are inserted into holding tables and then processed in the control flow.

The control flow looks like this:


The control flow executes the statements below:

--Make Type 1 changes

UPDATE TGT
SET TGT.OrderedOnline = SRC.OrderedOnline
FROM DimOrder TGT
INNER JOIN DimOrderType1 SRC --Type 1 holding table
    ON TGT.SalesOrderID = SRC.SalesOrderID
    AND TGT.LineNumber = SRC.LineNumber;

--Type 2 expire old rows

UPDATE TGT
SET TGT.RecordExpiredDate = SRC.TransactionDate
FROM DimOrder TGT
INNER JOIN DimOrderType2 SRC
    ON TGT.SalesOrderID = SRC.SalesOrderID
    AND TGT.LineNumber = SRC.LineNumber
WHERE TGT.RecordExpiredDate IS NULL;

--Type 2 insert new records

INSERT INTO DimOrder(SalesOrderID,LineNumber,OrderNumber,PurchaseOrderNumber,AccountNumber
    ,CarrierTrackingNumber,OrderedOnline,RecordValidFromDate)
SELECT SalesOrderID,LineNumber,OrderNumber,PurchaseOrderNumber,AccountNumber
    ,CarrierTrackingNumber,OrderedOnline
    ,TransactionDate AS RecordValidFromDate
FROM DimOrderType2;

3. T-SQL Script

I also thought I would try a basic SQL script to see how this performed.



CREATE TABLE #OrderData
    (SalesOrderID INT NOT NULL
    ,LineNumber INT NOT NULL
    ,OrderNumber NVARCHAR(25) NOT NULL
    ,PurchaseOrderNumber NVARCHAR(25) NOT NULL
    ,AccountNumber NVARCHAR(15) NOT NULL
    ,CarrierTrackingNumber NVARCHAR(25) NOT NULL
    ,OrderedOnline NVARCHAR(3) NOT NULL
    ,RecordExists SMALLINT NOT NULL DEFAULT(0)
    )

DECLARE @TransactionDate DATETIME

SELECT @TransactionDate = GETDATE()

INSERT INTO #OrderData(SalesOrderID, LineNumber, OrderNumber, PurchaseOrderNumber, AccountNumber
    ,CarrierTrackingNumber, OrderedOnline, RecordExists)
SELECT SalesOrderID
    ,SalesOrderDetailID AS LineNumber
    ,SalesOrderNumber AS OrderNumber
    ,ISNULL(PurchaseOrderNumber,'') AS PurchaseOrderNumber
    ,ISNULL(AccountNumber,'') AS AccountNumber
    ,ISNULL(CarrierTrackingNumber,'') AS CarrierTrackingNumber
    ,CASE OnlineOrderFlag
        WHEN 0 THEN 'No'
        WHEN 1 THEN 'Yes'
     END AS OrderedOnline
    ,0 AS RecordExists
FROM [AdventureWorks2012].Sales.SalesOrderHeaderTest OrdHed
INNER JOIN [AdventureWorks2012].Sales.SalesOrderDetailTest OrdDet
    ON OrdHed.SalesOrderID = OrdDet.SalesOrderID

            --Check if records exist

            UPDATE SRC
            SET RecordExists = 1
            FROM #OrderData SRC
            INNER JOIN POC.dbo.DimOrder TGT
                        ON TGT.SalesOrderID = SRC.SalesOrderID
                        AND TGT.LineNumber = SRC.LineNumber

            --Insert new records

INSERT INTO DimOrder(SalesOrderID,LineNumber,OrderNumber,PurchaseOrderNumber,AccountNumber
    ,CarrierTrackingNumber,OrderedOnline,RecordValidFromDate)
SELECT SalesOrderID,LineNumber,OrderNumber,PurchaseOrderNumber,AccountNumber
    ,CarrierTrackingNumber,OrderedOnline
    ,@TransactionDate AS RecordValidFromDate
FROM #OrderData
WHERE RecordExists = 0
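The script above stops at inserting the brand-new records. For completeness, the remaining Type 1 and Type 2 steps of this method might look something like the following (this is my own sketch, not the author's code – the column comparisons are assumptions based on the stated requirements):

```sql
--Sketch only: Type 1 overwrite of OrderedOnline on existing rows
UPDATE TGT
SET OrderedOnline = SRC.OrderedOnline
FROM DimOrder TGT
INNER JOIN #OrderData SRC
    ON TGT.SalesOrderID = SRC.SalesOrderID
    AND TGT.LineNumber = SRC.LineNumber
WHERE TGT.OrderedOnline <> SRC.OrderedOnline

--Sketch only: expire the current row where a tracked attribute has changed
UPDATE TGT
SET RecordExpiredDate = @TransactionDate
FROM DimOrder TGT
INNER JOIN #OrderData SRC
    ON TGT.SalesOrderID = SRC.SalesOrderID
    AND TGT.LineNumber = SRC.LineNumber
WHERE TGT.RecordExpiredDate IS NULL
    AND (TGT.PurchaseOrderNumber <> SRC.PurchaseOrderNumber
        OR TGT.AccountNumber <> SRC.AccountNumber
        OR TGT.CarrierTrackingNumber <> SRC.CarrierTrackingNumber)

--Sketch only: insert replacement rows for the ones just expired
INSERT INTO DimOrder(SalesOrderID,LineNumber,OrderNumber,PurchaseOrderNumber,AccountNumber
    ,CarrierTrackingNumber,OrderedOnline,RecordValidFromDate)
SELECT SRC.SalesOrderID,SRC.LineNumber,SRC.OrderNumber,SRC.PurchaseOrderNumber,SRC.AccountNumber
    ,SRC.CarrierTrackingNumber,SRC.OrderedOnline,@TransactionDate
FROM #OrderData SRC
INNER JOIN DimOrder TGT
    ON TGT.SalesOrderID = SRC.SalesOrderID
    AND TGT.LineNumber = SRC.LineNumber
    AND TGT.RecordExpiredDate = @TransactionDate
```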
Using Strip Lines In SSRS

Once in a while you happen upon a really useful feature in SSRS that you were unaware of. For me, strip lines are definitely one of those features that you don’t see much mention of, but which can add valuable insight to a report.

Below is a simple example of a scatter chart with a horizontal and vertical strip line each marking the average values of their axis. 


In order to add strip lines to the report you need to do the following:

1. Select the chart axis to which you wish to add the strip line and go to the properties window. In this there is an option marked ‘StripLines’. When you click in the cell you can open up the Strip Line Collection editor.



2. Click the ‘Add’ button in the strip line editor (note: you can add multiple strip lines to each axis if required).

3. In the option marked ‘IntervalOffset’, specify the required value for the strip line; you can either specify a fixed value or use an expression from your dataset. If you require the line to repeat over fixed values, you can fill in the properties for ‘Interval’.
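For example, to place a strip line at the average of the plotted values, the ‘IntervalOffset’ expression might look something like this (the field and dataset names here are illustrative, not from the post):

```
=Avg(Fields!SalesAmount.Value, "ChartData")
```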


4. You then need to set the display properties for the strip lines under ‘Appearance’ in order for them to be visible. To produce the example strip lines in this post I have set the ‘BorderColor’ to black and the ‘BorderStyle’ to dashed.




5. If you wish to add text after the strip line on the graph – for example to illustrate that values after the strip line are above average, as shown below – then the settings are under the ‘Title’ section.


That’s it for now – hope that you’ve found this helpful :)