Using SQL Window and Offset Window Functions
Today I was faced with a task of diseminating sequence of events in a report, so I thought I would share how I approached the task.
The Rexy can be found here : https://rextester.com/AVTOWD47990
Using window functions such as ROW_NUMBER(), RANK(), DENSE_RANK() and NTILE() we can gain insight into how data are ranked amoung a result set. Each of these functions have specific use cases, rank and dense rank treat ties amoungst data slightly differently while ROW_NUMBER() is useful for assigning a ID field to a result set if for some reason one is not already present.
Have a look at the data below which has 4 fields, OrderId,CustId,Val and OrderDate. Typically you will be comparing values based on a sequence of a date field. In this example we will apply first the window functions and secondly offset window functions.
Lets first apply the window function query to our data, this will show how each treats data slightly differently.
This will produce the result set
Lets Examine row numbers 6,7 and 8 to gain a better understanding of the window functions.
Firstly lets cover NTILE(), NTILE accepts an integer parameter which is the number of groups to partition by. For example I choose 5, so we can mod the number of rows by that parameter 9 % 5 = 1, we will have 4 equal groups (with 2 in this case) and 1 odd group with only one.
Next Rank, Rank will honor the ties in our data for example for rows 6 and 7 we have val of 36. Notice that rank honors the tie by assigning 6 to both rows but then skips the next rank and jumps straight to 8. This is essentially the different between RANK() and DENSE_RANK()
Lastly Dense_Rank will honor the ties in data and keep the sequence of rank instead of jumping to the next value. So instead of jumping to rank 8 for row number 8, it is assigned a dense_rank of 7, dense_rank will always have contigious values.
Last but not least is the LAG() and LEAD() functions, these are referred to as OFFSET WINDOW functions, so just imagine putting little windows inside of your result sets and that allow you to view the next logical iteration of a row. This iterations are decided by the PARTITION BY statement in the function, each call to this function will need a PARTITION BY and ORDER BY inside of the over clause. The Complete syntax looks as follows:
Here we are saying, lets select the next VAL field, per CustId and order by OrderDate and OrderId (Ascending is assumed if you do not specify ASC or DESC). Lag Works the same way, a common hang up is the Order By Clause inside of the Partition, usually you can play around and figure out the correct combination of ASC DESC and PARTITION BY to achieve your reporting needs.
Here is the result set from the query above, notice that when the custID is the first in window the PREVVAL field is NULL and when it is the final VAL in window the NEXTVAL is NULL. This confirms that our window is setup correctly for CUSTID.