Wednesday, November 23, 2016

SSIS Pivot Transformation

I have written a post on this in 2011 based on old configuration which is PIVOT and UNPIVOT: Integration Services. Setting Pivot Transformation is not that easy with versions prior to 2012 but GUI given after can be easily used and implemented. If you need to understand the old way of pivoting, read the previous blog post, this post is based on the GUI given with SQL Server 2012 onward.

Pivot allows us to convert a normalized dataset into a less normalized format, transferring (pivoting) row values into columns. In simple term, it identifies distinct values in a designated column and forms columns based on identified values. Note that it is not fully dynamic and automated, it does not recognize the column names to be generated as Pivot Columns, you need to manually specify new column names but the new GUI can be used for getting the column names.

Let's see an example. The following query generates an output with Year+Customer, Month and OrderAmount;

USE AdventureWorks2014;
GO

SELECT 
    Convert(varchar(4), YEAR(OrderDate)) + '_' +  p.LastName OrderYear_Customer
    , DATENAME(mm, OrderDate) OrderMonth
    , SUM(SubTotal) OrderAmount
  FROM Sales.SalesOrderHeader h
    INNER JOIN Sales.Customer c
      ON c.CustomerID = h.CustomerID
    INNER JOIN Person.Person p    
      ON p.BusinessEntityID = c.PersonID
  GROUP BY 
 Convert(varchar(4), YEAR(OrderDate)) + '_' + p.LastName
    ,DATENAME(mm, OrderDate)
  ORDER BY 1, 2


Let's pivot this result using Integration Services as below output;


As you see, what we have to do is, transfer values in OrderMonth column into multiple columns and use OrderAmount column for calculating the relevant total for each month. Use this script for creating a table for holding above resultset. You can create it in a new database, in my case, it has been created in a database called DataWarehouse.

CREATE TABLE dbo.SalesForMonths
(
 Id int identity(1,1) primary key
 , OrderYear_Customer nvarchar(200) NOT NULL
 , January money NULL default(0)
 , February money NULL default(0)
 , March money NULL default(0)
 , April money NULL default(0)
 , May money NULL default(0)
 , June money NULL default(0)
 , July money NULL default(0)
 , August money NULL default(0)
 , September money NULL default(0)
 , October money NULL default(0)
 , November money NULL default(0)
 , December money NULL default(0)
)

Next step is creating a SSIS project and adding a package. Here are the steps for loading and pivoting data;

1. Add an OLE DB Data Source and configure it with above SELECT statement.
2. Add a Pivot Transformation and connect with OLE DB Source. Open the Editor of it (You can either double-click on it or select Edit menu item from the context menu).
3. We need to set following items;
  • Pivot Key: This needs a column that contains values for setting new columns (column names). With this example, it needs to be set with OrderMonth.
  • Set Key: This needs a sorted column that contains values for grouping. Select OrderYear_Customer for this property.
  • Pivot Value: This needs a column that contains values for new pivot output columns. Set OrderAmount to this.
4. Next is setting up output column names. Either we have to hard code this as January,February, March,..... or get the list generated from the Pivot Key Column set. Let's get the list generated. Select ignore un_matched Pivot Key values and report them after DataFlow execution checkbox and click OK.


Although we have not completed the package, let's execute;


Once the execution is done, go to Progress tab and get the values generated copied.


As mentioned above, either you can type these values for generating pivot column names or copy this and paste without typing. 

5. Let's open the Editor of Pivot Transformation again and paste the copied value. Once pasted, click on Generate Column Now button.


6. Click OK to complete the Pivot Transformation.
7. Add an OLE DB Destination and configure for the table created above. 
8. Match the columns flowing from the Pivot with it and save the OLE DB Destination.


9. Execute the package now. 


10. If you query for dbo.SalesForMonths now, you should see the loaded pivot result.



No comments: