Thursday, June 14, 2012

SQL Aggregation "OVER()"

Over the years I have written many stored procedures in which we had to do some sort of aggregation.

The traditional way would be to sum the value and put in a group by clause at the end of the sql statement.

I found that I landed up building monster queries in order to do something simple.

Then a couple years back I came accross the OVER clause.



You can use the OVER clause with functions to compute aggregated values such as moving averages, cumulative aggregates, running totals, or a top N per group results.

What it allows you to do is retrieve detail rows, and get aggregate data along side it without using a group by clause, which I thought was pretty cool.

Below is practical example of how it would be used.


Firstly create the Turnover Table


 CREATE TABLE [dbo].[Turnover](
 [Id] [int] IDENTITY(1,1) NOT NULL,
 [Department] [varchar](50) NOT NULL,
 [Sales] [decimal](25, 10) NOT NULL,
 [FinMonth] [int] NOT NULL,
 [FinYear] [int] NOT NULL
) ON [PRIMARY]

Insert Some Records


SET IDENTITY_INSERT [dbo].[Turnover] ON
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (1, N'Service Department', 100.0000000000, 1, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (2, N'Sales Department', 120.0000000000, 1, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (5, N'Service Department', 130.0000000000, 2, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (6, N'Sales Department', 125.0000000000, 2, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (7, N'Service Department', 80.0000000000, 3, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (8, N'Sales Department', 180.0000000000, 3, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (9, N'Service Department', 200.0000000000, 4, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (10, N'Sales Department', 34.0000000000, 4, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (11, N'Service Department', 55.0000000000, 5, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (12, N'Sales Department', 66.0000000000, 5, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (13, N'Service Department', 22.0000000000, 6, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (14, N'Sales Department', 666.0000000000, 6, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (15, N'Service Department', 345.0000000000, 7, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (16, N'Sales Department', 123.0000000000, 7, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (17, N'Service Department', 789.0000000000, 8, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (18, N'Sales Department', 222.0000000000, 8, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (19, N'Service Department', 222.0000000000, 9, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (20, N'Sales Department', 111.0000000000, 9, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (21, N'Service Department', 764.0000000000, 10, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (22, N'Sales Department', 345.0000000000, 10, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (23, N'Service Department', 234.0000000000, 11, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (24, N'Sales Department', 876.0000000000, 11, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (25, N'Service Department', 667.0000000000, 12, 2012)
INSERT INTO [dbo].[Turnover] ([Id], [Department], [Sales], [FinMonth], [FinYear]) VALUES (26, N'Sales Department', 140.0000000000, 12, 2012)
SET IDENTITY_INSERT [dbo].[Turnover] OFF

Below is my SQL statement that will do the aggregation


 Select   T.*,
              RT.RunningTotal,
              Rank() OVER(Order By Sales desc) as Rating,
              Sum(Sales) OVER(Partition By FinYear,FinMonth) as TotalSalesMonth,
              Sum(Sales) OVER(Partition By FinYear) as TotalSalesYear,
              Avg(Sales) OVER(Partition By FinYear,FinMonth) as AverageSalesMonth,
              Avg(Sales) OVER(Partition By FinYear) as AverageSalesYear,
              ROW_NUMBER() OVER (Order By Id) as RowNumber
   
From    dbo.Turnover T
Cross Apply (Select Sum(Sales) as RunningTotal
                        From Turnover
                        Where Id <= T.Id
             ) as RT

I did this example in SQL 2005, so the aggregation for the Running Total didn't work thus why I used a cross apply that does the same thing, however in SQL 2012 you can simply do the following:

SUM(Sales) OVER (ORDER BY Id ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)

Results



In basic terms it means the following:


  • Over: for the set of rows
  • (Partition By FinYear): that have the same FinYear, "Basically the Group By in the old way of doing things"
I hope you got some value out of this post.

Till Later

Nick












No comments:

Post a Comment