Transaction data aggregate - sql-server

As a disclaimer, I am not entirely sure the title of the question is best, if not I apologize.
I am trying to calculate cycle times for individuals, but files are occasionally transferred out of their work queues and eventually back. There are no unique transaction IDs recorded just a date and time stamp.
I tried looking for an aggregate group by functions and was told that is not a feature sql-server has.
I started by trying to identify the first and last transaction and was going to build out the query from there but it wasn't too helpful. Any insight would be very helpful.
Changedate is when the transfer from one person to another is recorded (year, moth, day time)
select a.claimId,
a.claimincidentID,
cast(a.changeDate as date) changedate,
a.claimNum,
a.Coverage,
a.AssignedAdjID,
a.AssignedAdj,
a.AssignedUnit,
a.TransferedAdjID,
a.TransferedAdj,
a.TransferedUnit,
a.usertypeid,
a.ChangedBy,
b.Feature_Create_Date,
DATEDIFF(day, b.Feature_Create_Date, a.changedate) transfer1,
cast(FIRST_VALUE(changeDate) OVER (ORDER BY changedate ASC)as date) AS firstchangedate,
cast(LAST_VALUE(changeDate) OVER (ORDER BY a.changedate ASC)as date) AS lastchangedate
from DB1.dbo.Assign_Transfer a
left join DB2.claimslist b on a.claimid=b.claimId
group by a.claimId, a.claimincidentID, a.changeDate, a.claimNum, a.Coverage, a.AssignedAdjID, a.AssignedAdj, a.AssignedUnit, a.TransferedAdjID, a.TransferedAdj, a.TransferedUnit, a.usertypeid, a.ChangedBy, b.Feature_Create_Date

Think of each of these rows as a Start (because the most recent one hasn't ended)
We would need to generate the complement End for this person in the chain.
Then with pairs of Start/End one could create GrossDuration.
Even after we get an assignment's start and end date/time,
we will have workday (8-4, or 9-5, or noon-8, ...) considerations,
also Sat/Sun/Hol and Vacation/out-of-office.
All of which affect Duration--- For Each Person differently.
Which would need to be factored by workday/etc into AdjDuration.
Lets say we can sequence these
Row_Number() Over (Partition by claimID Order by changeDate) as tfrNum
Assigned is the prior, and Transfered is the next
1, 2, 3, ... thru N
V
a.changeDate -- NOW()
V V
a.AssignedAdjID, | a.TransferedAdjID,
a.AssignedAdj, | a.TransferedAdj,
a.AssignedUnit, | a.TransferedUnit,
|
a.usertypeid,
a.ChangedBy,
So, is tfrNum=1 or tfrNum=N the oddball??
Lets look at pairs: each pair goes StartFrom->EndTo
1-2, 2-3, 3-4, 4-5, 5-6, 6-Now
----
From row1 we get TransferredID Start(changeDate) and
from row2 we get AssignedAdjID End (changeDate)
-- 2-3, 3-4, 4-5, etc repeating
--except for
From row6 we get TransferredID Start(changeDate) and
from default (still them) End (Now)
-- -- except again when TransferredUnit is "Closed"
After getting these pairs and their Start and End, we can do the Duration calc.
I need to visualize this problem before I try to run some sql. Real data would help.
Lets start with this, and later I would expand on it after you get it working and look at some data--
With cte_tfrNum (claimID, changeDate, tfrNum, tfrMax) AS
(
SELECT
a.claimId
,a.changeDate
,ROW_NUMBER() Over ( Partition By a.claimId Order By a.changeDate) as tfrNum
,b.tfrMax
FROM DB1.dbo.Assign_Transfer a
-- just for giggles, lets also get the max# of transfers for this claim
Left Join
(SELECT claimId, COUNT(*) as tfrMax
FROM DB1.dbo.Assign_Transfer
Group By claimId
) as b
On b.claimId = a.claimId
)
-- Statement using the CTE
Select
tfrTo.*
From cte_tfrNum as tfrTo

Thank you! I was able to take what you gave me and add a few things to be able to look at what I needed.
select
case when abc.tfrMax > abc.tfrnum then datediff(day,lag(abc.changedate) over(partition by abc.claimID order by abc.claimId),abc.changeDate)
when abc.tfrMax = abc.tfrnum then datediff(day,lag(abc.changedate) over(partition by abc.claimID order by abc.claimId),abc.changeDate)
end as test
, abc.*
from
(
SELECT
a.claimId
,a.changeDate
,a.AssignedAdj
,a.TransferedAdj
,a.Coverage
,ROW_NUMBER() Over ( Partition By a.claimId Order By a.changeDate) as tfrNum
,b.tfrMax
FROM db1.dbo.Assign_Transfer a
Left Join
(SELECT claimId, COUNT(*) as tfrMax
FROM db1.dbo.Assign_Transfer
Group By claimId
) as b
On b.claimId = a.claimId
) abc
group by
abc.claimId
,abc.changeDate
,abc.AssignedAdj
,abc.TransferedAdj
,abc.Coverage
,abc.tfrMax
,abc.tfrNum

Related

Calculating Days Between Dates in Separate Rows For Same UnitID

I am trying to calculate the time a commercial real estate space sits vacant. I have move-in & move-out dates for each tenant that has occupied that unit. It is easy to calculate the occupied time of each tenant as that data is within the same row. However, I want to calculate the vacant time: the time between move-out of the previous tenant and move-in of the next tenant. These dates appear in separate rows.
Here is a sample of what I have currently:
SELECT
uni_vch_UnitNo AS UnitNumber,
uty_vch_Code AS UnitCode,
uty_int_Id AS UnitID, tul_int_FacilityId AS FacilityID,
tul_dtm_MoveInDate AS Move_In_Date,
tul_dtm_MoveOutDate AS Move_Out_Date,
DATEDIFF(day, tul_dtm_MoveInDate, tul_dtm_MoveOutDate) AS Occupancy_Days
FROM TenantUnitLeases
JOIN units
ON tul_int_UnitId = uni_int_UnitId
JOIN UnitTypes
ON uni_int_UnitTypeId = uty_int_Id
WHERE
tul_int_UnitId = '26490'
ORDER BY tul_dtm_MoveInDate ASC
Is there a way to assign an id to each row in chronological, sequential order and find the difference between row 2 move-in date less row 1 move-out date and so on?
Thank you in advance for the help.
I can't really tell which tables provide which columns for your query. Please alias and dot-qualify them in the future.
If you're using SQL 2012 or later, you've got LEAD and LAG functions which do exactly what you want: bring a "leading" or "lagging" row into a current row. See if this works (hopefully it should at least get you started):
SELECT
uni_vch_UnitNo AS UnitNumber,
uty_vch_Code AS UnitCode,
uty_int_Id AS UnitID, tul_int_FacilityId AS FacilityID,
tul_dtm_MoveInDate AS Move_In_Date,
tul_dtm_MoveOutDate AS Move_Out_Date,
DATEDIFF(day, tul_dtm_MoveInDate, tul_dtm_MoveOutDate) AS Occupancy_Days
, LAG(tul_dtm_MoveOutDate) over (partition by uni_vch_UnitNo order by tul_dtm_MoveOutDate) as Previous_Move_Out_Date
, DATEDIFF(day,LAG(tul_dtm_MoveOutDate) over (partition by uni_vch_UnitNo order by tul_dtm_MoveOutDate),tul_dtm_MoveInDate) as Days_Vacant
FROM TenantUnitLeases
JOIN units
ON tul_int_UnitId = uni_int_UnitId
JOIN UnitTypes
ON uni_int_UnitTypeId = uty_int_Id
WHERE
tul_int_UnitId = '26490'
ORDER BY tul_dtm_MoveInDate ASC
Just comparing a value from the current row with a value in the previous row is functionality provided by the lag() function.
Try this in your query:
select...
tul_dtm_MoveInDate AS Move_In_Date,
tul_dtm_MoveOutDate AS Move_Out_Date,
DateDiff(day, Lag(tul_dtm_MoveOutDate,1) over(partition by uty_vch_Code, tul_int_FacilityId order by tul_dtm_MoveInDate), tul_dtm_MoveInDate) DaysVacant,
...
This needs a window function or correlated sub query. The goal is to provide the previous move out date for each row, which is in turn a function of that row. The term 'window' in this context means to apply an aggregate function over a smaller range than the whole set.
If you had a function called GetPreviousMoveOutDate, the parameters would be the key to filter on, and the ranges to search within the filter. So we would pass the UnitID as the key and the MoveInDate for this row, and the function should return the most recent MoveOutDate for the same unit that is before the passed in date. By getting the max date before this one, we will ensure we get only the previous occupancy if it exists.
To use a sub-query in ANSI-SQL you just add the select as a column. This should work on MS-SQL as well as other DB platforms; however, it requires using aliases for the table names so they can be referenced in the query more than once. I've updated your sample SQL with aliases using the AS syntax, although it looks redundant to your table naming convention. I added a uni_dtm_UnitFirstAvailableDate to your units table to handle the first vacancy, but this can be a default:
SELECT
uni.uni_vch_UnitNo AS UnitNumber,
uty.uty_vch_Code AS UnitCode,
uty.uty_int_Id AS UnitID, tul_int_FacilityId AS FacilityID,
tul.tul_dtm_MoveInDate AS Move_In_Date,
tul.tul_dtm_MoveOutDate AS Move_Out_Date,
DATEDIFF(day, tul.tul_dtm_MoveInDate, tul.tul_dtm_MoveOutDate) AS Occupancy_Days,
-- select the date:
(SELECT MAX (prev_tul.tul_dtm_MoveOutDate )
FROM TenantUnitLeases AS prev_tul
WHERE prev_tul.tul_int_UnitId = tul.tul_int_UnitId
AND prev_tul.tul_dtm_MoveOutDate > tul.tul_dtm_MoveInDate
AND prev_tul.tul_dtm_MoveOutDate is not null
) AS previous_moveout,
-- use the date in a function:
DATEDIFF(day, tul.tul_dtm_MoveInDate,
ISNULL(
(SELECT MAX (prev_tul.tul_dtm_MoveOutDate )
FROM TenantUnitLeases AS prev_tul
WHERE prev_tul.tul_int_UnitId = tul.tul_int_UnitId
AND prev_tul.tul_dtm_MoveOutDate > tul.tul_dtm_MoveInDate
AND prev_tul.tul_dtm_MoveOutDate is not null
) , uni.uni_dtm_UnitFirstAvailableDate) -- handle first occupancy
) AS Vacancy_Days
FROM TenantUnitLeases AS tul
JOIN units AS uni
ON tul.tul_int_UnitId = uni.uni_int_UnitId
JOIN UnitTypes AS uty
ON uni.uni_int_UnitTypeId = uty.uty_int_Id
WHERE
tul.tul_int_UnitId = '26490'
ORDER BY tul.tul_dtm_MoveInDate ASC

While loop for Teradata

world.
I'm try to find a way to compile multiple events together. Data looks like this:
Basically, it is series of row data with event logs
I want to generate an aggregation of these row events such that if a new event occurred within 30 seconds of the other ending, it combines the time together. However, if the event log does not have an abutting event, then it is not captured. And these events are 'person' specific.
I envision the output to look something like this:
My intuition suggests using some kind of while loop, but I'm not sure where to start
There's no need for recursion (and very hard to write) or a loop over a cursor.
SELECT
Person,
Min(starttime),
Max(starttime),
-- get a concatenated string
Trim(Trailing ',' FROM (XmlAgg(Reason || ',' ORDER BY Reason ) (VARCHAR(1000))))
FROM
(
SELECT Person, Start_timestamp, Stop_timestamp, Reason,
-- assign the same number to all rows within 30 seconds
Sum(flag) Over
Over (PARTITION BY Person
ORDER BY Start_timestamp
ROWS Unbounded Preceding) AS grp
FROM
(
SELECT Person, Start_timestamp, Stop_timestamp, Reason,
-- check if previous end is within 30 seconds of the current start
CASE WHEN Lag(Stop_timestamp)
Over (PARTITION BY Person
ORDER BY Start_timestamp) + INTERVAL '30' SECOND < Start_timestamp
THEN 0
ELSE 1
END AS flag
FROM tab
) AS dt
) AS dt
-- aggregate per person and group
GROUP BY Person, grp
If your Teradata version supports SESSIONIZE you can simplify the group calculation, but I couldn't write this syntaxc ad hoc :-)
You can achieve this using RECURSIVE CTE
WITH RECURSIVE MYREC(Person,Start_timestamp,Stop_timestamp ,Reason,LVL)
AS(
SELECT Person,MIN(Start_timestamp),MAX(Stop_timestamp),MIN(Reason)(varchar(100)) AS Reason,1
FROM MYTABLE
GROUP BY 1
UNION ALL
SELECT b.Person,b.Start_timestamp,b.Stop_timestamp ,trim(a.Reason) || ',' || trim( b.Reason), LVL+1
FROM MYTABLE a INNER JOIN MYREC b
ON a.Person = b.Person
AND a.Reason > b.Reason
)
SELECT Person,Start_timestamp,Stop_timestamp,Reason
FROM MYREC
QUALIFY RANK() OVER(PARTITION BY Person ORDER BY Reason DESC) = 1
Change MYTABLE to your tablename

T-SQL - Get last as-at date SUM(Quantity) was not negative

I am trying to find a way to get the last date by location and product a sum was positive. The only way i can think to do it is with a cursor, and if that's the case I may as well just do it in code. Before i go down that route, i was hoping someone may have a better idea?
Table:
Product, Date, Location, Quantity
The scenario is; I find the quantity by location and product at a particular date, if it is negative i need to get the sum and date when the group was last positive.
select
Product,
Location,
SUM(Quantity) Qty,
SUM(Value) Value
from
ProductTransactions PT
where
Date <= #AsAtDate
group by
Product,
Location
i am looking for the last date where the sum of the transactions previous to and including it are positive
Based on your revised question and your comment, here another solution I hope answers your question.
select Product, Location, max(Date) as Date
from (
select a.Product, a.Location, a.Date from ProductTransactions as a
join ProductTransactions as b
on a.Product = b.Product and a.Location = b.Location
where b.Date <= a.Date
group by a.Product, a.Location, a.Date
having sum(b.Value) >= 0
) as T
group by Product, Location
The subquery (table T) produces a list of {product, location, date} rows for which the sum of the values prior (and inclusive) is positive. From that set, we select the last date for each {product, location} pair.
This can be done in a set based way using windowed aggregates in order to construct the running total. Depending on the number of rows in the table this could be a bit slow but you can't really limit the time range going backwards as the last positive date is an unknown quantity.
I've used a CTE for convenience to construct the aggregated data set but converting that to a temp table should be faster. (CTEs get executed each time they are called whereas a temp table will only execute once.)
The basic theory is to construct the running totals for all of the previous days using the OVER clause to partition and order the SUM aggregates. This data set is then used and filtered to the expected date. When a row in that table has a quantity less than zero it is joined back to the aggregate data set for all previous days for that product and location where the quantity was greater than zero.
Since this may return multiple positive date rows the ROW_NUMBER() function is used to order the rows based on the date of the positive quantity day. This is done in descending order so that row number 1 is the most recent positive day. It isn't possible to use a simple MIN() here because the MIN([Date]) may not correspond to the MIN(Quantity).
WITH x AS (
SELECT [Date],
Product,
[Location],
SUM(Quantity) OVER (PARTITION BY Product, [Location] ORDER BY [Date] ASC) AS Quantity,
SUM([Value]) OVER(PARTITION BY Product, [Location] ORDER BY [Date] ASC) AS [Value]
FROM ProductTransactions
WHERE [Date] <= #AsAtDate
)
SELECT [Date], Product, [Location], Quantity, [Value], Positive_date, Positive_date_quantity
FROM (
SELECT x1.[Date], x1.Product, x1.[Location], x1.Quantity, x1.[Value],
x2.[Date] AS Positive_date, x2.[Quantity] AS Positive_date_quantity,
ROW_NUMBER() OVER (PARTITION BY x1.Product, x1.[Location] ORDER BY x2.[Date] DESC) AS Positive_date_row
FROM x AS x1
LEFT JOIN x AS x2 ON x1.Product=x2.Product AND x1.[Location]=x2.[Location]
AND x2.[Date]<x1.[Date] AND x1.Quantity<0 AND x2.Quantity>0
WHERE x1.[Date] = #AsAtDate
) AS y
WHERE Positive_date_row=1
Do you mean that you want to get the last date of positive quantity come to positive in group?
For example, If you are using SQL Server 2012+:
In following scenario, when the date going to 01/03/2017 the summary of quantity come to 1(-10+5+6).
Is it possible the quantity of following date come to negative again?
;WITH tb(Product, Location,[Date],Quantity) AS(
SELECT 'A','B',CONVERT(DATETIME,'01/01/2017'),-10 UNION ALL
SELECT 'A','B','01/02/2017',5 UNION ALL
SELECT 'A','B','01/03/2017',6 UNION ALL
SELECT 'A','B','01/04/2017',2
)
SELECT t.Product,t.Location,SUM(t.Quantity) AS Qty,MIN(CASE WHEN t.CurrentSum>0 THEN t.Date ELSE NULL END ) AS LastPositiveDate
FROM (
SELECT *,SUM(tb.Quantity)OVER(ORDER BY [Date]) AS CurrentSum FROM tb
) AS t GROUP BY t.Product,t.Location
Product Location Qty LastPositiveDate
------- -------- ----------- -----------------------
A B 3 2017-01-03 00:00:00.000

Flatten/merge overlapping time intervals

I have a 'Service' table with millions of rows. Each row corresponds to a service provided by a staff in a given date and time interval (Each row has a unique ID). There are cases where a staff might provide services in overlapping time frames. I need to write a query that merges overlapping time intervals and returns the data in the format shown below.
I tried grouping by StaffID and Date fields and getting the Min of BeginTime and Max of EndTime but that does not account for the non-overlapping time frames. How can I accomplish this? Again, the table contains several million records so a recursive CTE approach might have performance issues. Thanks in advance.
Service Table
ID StaffID Date BeginTime EndTime
1 101 2014-01-01 08:00 09:00
2 101 2014-01-01 08:30 09:30
3 101 2014-01-01 18:00 20:30
4 101 2014-01-01 19:00 21:00
Output
StaffID Date BeginTime EndTime
101 2014-01-01 08:00 09:30
101 2014-01-01 18:00 21:00
Here is another sample data set with a query proposed by a contributor.
http://sqlfiddle.com/#!6/bfbdc/3
The first two rows in the results set should be merged into one row (06:00-08:45) but it generates two rows (06:00-08:30 & 06:00-08:45)
I only came up with a CTE query as the problem is there may be a chain of overlapping times, e.g. record 1 overlaps with record 2, record 2 with record 3 and so on. This is hard to resolve without CTE or some other kind of loops, etc. Please give it a go anyway.
The first part of the CTE query gets the services that start a new group and are do not have the same starting time as some other service (I need to have just one record that starts a group). The second part gets those that start a group but there's more then one with the same start time - again, I need just one of them. The last part recursively builds up on the starting group, taking all overlapping services.
Here is SQLFiddle with more records added to demonstrate different kinds of overlapping and duplicate times.
I couldn't use ServiceID as it would have to be ordered in the same way as BeginTime.
;with flat as
(
select StaffID, ServiceDate, BeginTime, EndTime, BeginTime as groupid
from services S1
where not exists (select * from services S2
where S1.StaffID = S2.StaffID
and S1.ServiceDate = S2.ServiceDate
and S2.BeginTime <= S1.BeginTime and S2.EndTime <> S1.EndTime
and S2.EndTime > S1.BeginTime)
union all
select StaffID, ServiceDate, BeginTime, EndTime, BeginTime as groupid
from services S1
where exists (select * from services S2
where S1.StaffID = S2.StaffID
and S1.ServiceDate = S2.ServiceDate
and S2.BeginTime = S1.BeginTime and S2.EndTime > S1.EndTime)
and not exists (select * from services S2
where S1.StaffID = S2.StaffID
and S1.ServiceDate = S2.ServiceDate
and S2.BeginTime < S1.BeginTime
and S2.EndTime > S1.BeginTime)
union all
select S.StaffID, S.ServiceDate, S.BeginTime, S.EndTime, flat.groupid
from flat
inner join services S
on flat.StaffID = S.StaffID
and flat.ServiceDate = S.ServiceDate
and flat.EndTime > S.BeginTime
and flat.BeginTime < S.BeginTime and flat.EndTime < S.EndTime
)
select StaffID, ServiceDate, MIN(BeginTime) as begintime, MAX(EndTime) as endtime
from flat
group by StaffID, ServiceDate, groupid
order by StaffID, ServiceDate, begintime, endtime
Elsewhere I've answered a similar Date Packing question with
a geometric strategy. Namely, I interperet the date ranges
as a line, and utilize geometry::UnionAggregate to merge
the ranges.
Your question has two peculiarities though. First, it calls
for sql-server-2008. geometry::UnionAggregate is not then
avialable. However, download the microsoft library at
https://github.com/microsoft/SQLServerSpatialTools and load
it in as a clr assembly to your instance and you have it
available as dbo.GeometryUnionAggregate.
But the real peculiarity that has my interest is the concern
that you have several million rows to work with. So I thought
I'd repeat the strategy here but with an added technique to
improve it's performance. This technique will work well if
you have a lot of your StaffID/date subsets that are the same.
First, let's build a numbers table. Swap this out with your favorite
way to do it.
select i = row_number() over (order by (select null))
into #numbers
from #services; -- where i put your data
Then convert the dates to floats and use those floats to create
geometrical points.
These points can then be turned into lines via STUnion and STEnvelope.
With your ranges now represented as geometric lines, merge them via
UnionAggregate. The resulting geometry object 'lines' might contain
multiple lines. But any overlapping lines turn into one line.
select s.StaffID,
s.Date,
linesWKT = geometry::UnionAggregate(line).ToString()
-- If you have SQLSpatialTools installed then:
-- linesWKT = dbo.GeometryUnionAggregate(line).ToString()
into #aggregateRangesToGeo
from #services s
cross apply (select
beginTimeF = convert(float, convert(datetime,beginTime)),
endTimeF = convert(float, convert(datetime,endTime))
) prepare
cross apply (select
beginPt = geometry::Point(beginTimeF, 0, 0),
endPt = geometry::Point(endTimeF, 0, 0)
) pointify
cross apply (select
line = beginPt.STUnion(endPt).STEnvelope()
) lineify
group by s.StaffID,
s.Date;
You have one 'lines' object for each staffId/date combo. But depending
on your dataset, there may be many 'lines' objects that are the same
between these combos. This may very well be true if staff are expected
to follow a routine and data is recorded to the nearest whatever.
So get a distinct lising of 'lines' objects. This should improve
performance.
From this, extract the individual lines inside 'lines'. Envelope the lines,
which ensures that the lines are stored only as their endpoints. Read the
endpoint x values and convert them back to their time representations.
Keep the WKT representation to join it back to the combos later on.
select lns.linesWKT,
beginTime = convert(time, convert(datetime, ap.beginTime)),
endTime = convert(time, convert(datetime, ap.endTime))
into #parsedLines
from (select distinct linesWKT from #aggregateRangesToGeo) lns
cross apply (select
lines = geometry::STGeomFromText(linesWKT, 0)
) geo
join #numbers n on n.i between 1 and geo.lines.STNumGeometries()
cross apply (select
line = geo.lines.STGeometryN(n.i).STEnvelope()
) ln
cross apply (select
beginTime = ln.line.STPointN(1).STX,
endTime = ln.line.STPointN(3).STX
) ap;
Now just join your parsed data back to the StaffId/Date combos.
select ar.StaffID,
ar.Date,
pl.beginTime,
pl.endTime
from #aggregateRangesToGeo ar
join #parsedLines pl on ar.linesWKT = pl.linesWKT
order by ar.StaffID,
ar.Date,
pl.beginTime;

Do I need to use the dreaded sql server loop/ cursor for the result set I need?

I need a sql server result set that "breaks" on a column value, but if I order by this column in a ranking function, the order I really need is lost. This is best explained by example. The query I'm currently experimenting with is:
select RANK() over(partition by Symbol, Period order by TradeDate desc)
SymbSmaOverUnderGroup, Symbol, TradeDate, Period, Value, Low, LowMinusVal,
LMVSign
from #smasAndLow3
and it returns:
Rnk Symbol TradeDate Period Value Low LowMinusVal LMVSign
1 A 9/6/12 5 37.09 36.71 -.38 U
2 A 9/5/12 5 37.03 36.62 -.41 U
3 A 9/4/12 5 37.07 36.71 -.36 U
4 A 8/31/12 5 37.15 37.30 .15 O
5 A 8/30/12 5 37.22 37.40 .18 O
6 A 8/29/12 5 37.00 36.00 -1.00 U
7 A 8/28/12 5 37.10 37.00 -.10 U
The rank I need here is: 1,1,1,2,2,3,3. So I need to partition by Symbol, Period, and I need to start a new partition on LMVSign (which only contains the values U, O, and E), but it's essential that I order by TradeDate desc. Unless I'm mistaken, partitioning or ordering by LMVSign will make it impossible to sort on the date column. I hope this makes sense. I'm working like mad to do this without a cursor, but I can't get it to work.. thanks in advance.
UPDATE after clarification: I think that you are entering the world of islands and gaps. If your requirement is to group rows by Symbol, Period and LMVSign ordered descendingly by TradeDate, ranking them when any one of these columns change, you might use this (by Itzik Ben-Gan's solution to islands and gaps).
; with islandsAndGaps as
(
select *,
-- Create groups. Important part is order by
-- The difference remains the same as two sequences
-- run along, but the number itself is not ordered
row_number() over (partition by Symbol, Period
order by TradeDate)
- row_number() over (partition by Symbol, Period
order by LMVSign, TradeDate) grp
from Table1
),
grouped as
(
select *,
-- So to order it we use last date in group
-- (mind partition by uses changed order by from second row_number
-- and unordered group number
max(TradeDate) over(partition by LMVSign, grp) DateGroup
from islandsAndGaps
)
-- now we can get rank
select dense_rank() over (order by DateGroup desc) Rnk,
*
from grouped
order by TradeDate desc
Take a look at Sql Fiddle.
OLD answer:
Partition by restarts ranking. I think that you need order by:
dense_rank() over (order by Symbol, Period, LMVSign desc) Rnk
and then you should use TradeDate in order by:
order by Rnk, TradeDate desc
If you need it as a number, add another column:
row_number() over (order by Symbol, Period, LMVSign desc, TradeDate desc) rn

Resources