Thursday, February 27, 2014

SQL Server: Calculating running totals using T-SQL

This is not a new topic. If you search, you will surely find many posts on this, mostly with traditional techniques but not using newest capabilities such as SQL Windowing. Since I wrote a post on Analysis Services for the same, thought to write the same on T-SQL too. Here is the way of calculating running totals using Window components and functions which provides an efficient way of calculating and simple code structure.

The following query shows the way of calculating. The first code creates a window based on SalesOrderId (which is unique) and get the running totals over SubTotal for a given year. The second code creates a window on OrderDate (which is not unique). This will show the totals for the date instead of Running-Totals for the date unless the range is specified using boundaries. That is the reason for adding upper and lower boundaries using ROW, UNBOUNED PRECEDING and CURRENT ROW inside the window for restricting rows to be participated for the calculation.

  1. -- Window based on OrderId which is unique
  2. SELECT OrderDate, SalesOrderID, SalesOrderNumber, CustomerID
  3.     , SubTotal Total
  4.     , SUM(SubTotal) OVER(ORDER BY SalesOrderID) RunningTotal
  5. FROM Sales.SalesOrderHeader
  6. ORDER BY OrderDate, SalesOrderID
  7.  
  8. -- Window based on OrderDate which is not unique
  9. SELECT OrderDate, SalesOrderID, SalesOrderNumber, CustomerID
  10.     , SubTotal Total
  11.     , SUM(SubTotal) OVER(ORDER BY OrderDate
  12.         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) RunningTotal
  13. FROM Sales.SalesOrderHeader
  14. ORDER BY OrderDate, SalesOrderID

Both queries produce same result;

image

Here is a comparison on the same using traditional techniques. Though it shows that the query uses window is faster than other queries, always check and pick the best.

  1. -- Using Window components
  2. SELECT OrderDate, SalesOrderID, SalesOrderNumber, CustomerID
  3.     , SubTotal Total
  4.     , SUM(SubTotal) OVER(ORDER BY SalesOrderID) RunningTotal
  5. FROM Sales.SalesOrderHeader
  6. WHERE YEAR(OrderDate) = 2006
  7. ORDER BY OrderDate, SalesOrderID
  8.  
  9. -- Using self-join
  10. SELECT h1.OrderDate, h1.SalesOrderID, h1.SalesOrderNumber, h1.CustomerID
  11.     , h1.SubTotal Total
  12.     , SUM(h2.SubTotal) RunningTotal
  13. FROM Sales.SalesOrderHeader h1
  14.     INNER JOIN Sales.SalesOrderHeader h2
  15.         ON h1.SalesOrderID >= h2.SalesOrderID
  16.             AND YEAR(h2.OrderDate) = 2006
  17. WHERE YEAR(h1.OrderDate) = 2006
  18. GROUP BY h1.OrderDate, h1.SalesOrderID, h1.SalesOrderNumber
  19.     , h1.CustomerID, h1.SubTotal
  20. ORDER BY h1.OrderDate, h1.SalesOrderID
  21.  
  22. -- Using sub query
  23. SELECT h1.OrderDate, h1.SalesOrderID, h1.SalesOrderNumber, h1.CustomerID
  24.     , h1.SubTotal Total
  25.     , (SELECT SUM(h2.SubTotal) RunningTotal
  26.         FROM Sales.SalesOrderHeader h2
  27.         WHERE h1.SalesOrderID >= h2.SalesOrderID
  28.             AND YEAR(h2.OrderDate) = 2006)
  29. FROM Sales.SalesOrderHeader h1
  30. WHERE YEAR(h1.OrderDate) = 2006
  31. ORDER BY h1.OrderDate, h1.SalesOrderID

image

5 comments:

Sandun Perera said...

Just a thought when I'm skimming through...

Article is understandable as it before MSSQL 2012. but why don't you use LAG() and LEAD() functions for running total calculations?

Dinesh Priyankara said...

Hi Sadun,

Thanks for the comment.

LAG and LEAD functions not for calculating running totals, they are used for getting previous and next values of the window.

I wrote a small post on it before, here is the link: http://dinesql.blogspot.com/2014/02/sql-server-2012-getting-previous-and.html

Sandun Perera said...

got it, thanks!

Asela Abeysinghe said...

Good post. I did it different in a complex way . Now i can simplify my quarry :).

Dinesh Priyankara said...

Thanks Asela