Related
I am writing a script that will run on SQL Server 2014.
I have a table of transactions recording transfers from one work center to another. The simplified table is below:
DECLARE #transactionTable TABLE (wono varchar(10),transferDate date
,fromWC varchar(10),toWC varchar(10),qty float)
INSERT INTO #transactionTable
SELECT '0000000123','5/10/2018','STAG','PP-B',10
UNION
SELECT '0000000123','5/11/2018','PP-B','PP-T',5
UNION
SELECT '0000000123','5/11/2018','PP-T','TEST',3
UNION
SELECT '0000000123','5/12/2018','PP-B','PP-T',5
UNION
SELECT '0000000123','5/12/2018','PP-T','TEST',5
UNION
SELECT '0000000123','5/13/2018','PP-T','TEST',2
UNION
SELECT '0000000123','5/13/2018','TEST','FGI',8
UNION
SELECT '0000000123','5/14/2018','TEST','FGI',2
SELECT *,
fromTotal = -SUM(qty) OVER(PARTITION BY fromWC ORDER BY wono, transferdate, fromWC),
toTotal = SUM(qty) OVER(PARTITION BY toWC ORDER BY wono, transferdate, toWC)
FROM #transactionTable
ORDER BY wono, transferDate, fromWC
I want to get a running balance of the fromWC and toWC after each transaction.
Given the records above, the end result should be this:
I believe it is possible to use SUM(qty) OVER(PARTITION BY..., but I am not sure how to write the statement. When I try to get the increase and decrease, each line always results in 0.
How do I write the SUM statement to achieve the desired results?
UPDATE
This image shows each transaction, the resulting WC qty, and highlights the corresponding from and to work centers for each transaction.
For example, looking at the second record on 5/11, 3 were transferred from PP-T to TEST. After the transaction, there were 5 in PP-B, 2 in PP-T, and 3 in TEST.
I can get close, except starting balances:
SELECT wono, transferDate, fromWC, toWC, qty,
SUM( CASE WHEN WC = fromWC THEN RunningTotal ELSE 0 END ) AS FromQTY,
SUM( CASE WHEN WC = toWC THEN RunningTotal ELSE 0 END ) AS ToQTY
FROM( -- b
SELECT *, SUM(Newqty) OVER(PARTITION BY WC ORDER BY wono,transferdate, fromWC, toWC) AS RunningTotal
FROM(-- a
SELECT wono, transferDate, fromWC, toWC, fromWC AS WC, qty, -qty AS Newqty, 'From' AS RecType
FROM #transactionTable
UNION ALL
SELECT wono, transferDate, fromWC, toWC, toWC AS WC, qty, qty AS Newqty, 'To' AS RecType
FROM #transactionTable
) AS a
) AS b
GROUP BY wono, transferDate, fromWC, toWC, qty
My logic assumes that all balances start at 0, therefore "STAG" balance will be -10.
How the query works:
"Unpivot" the input record set into "From" and "To" records with quantities negated for "From" records.
Calculate running totals for each "WC".
Combine "Unpivoted" records back into original shape
Solution 2
WITH CTE
AS(
SELECT *,
ROW_NUMBER() OVER( ORDER BY wono, transferDate, fromWC, toWC ) AS Sequence
FROM #transactionTable
),
CTE2
AS(
SELECT *,
fromTotal = -SUM(qty) OVER(PARTITION BY fromWC ORDER BY Sequence),
toTotal = SUM(qty) OVER(PARTITION BY toWC ORDER BY Sequence)
FROM CTE
)
SELECT a.Sequence, b.Sequence, c.Sequence, a.wono, a.transferDate, a.fromWC, a.toWC, a.qty, a.fromTotal + ISNULL( b.toTotal, 0 ) AS FromTotal, a.toTotal + ISNULL( c.fromTotal, 0 ) AS ToTotal
FROM CTE2 AS a
OUTER APPLY( SELECT TOP 1 * FROM CTE2 WHERE wono = a.wono AND Sequence < a.Sequence AND toWC = a.fromWC ORDER BY Sequence DESC ) AS b
OUTER APPLY( SELECT TOP 1 * FROM CTE2 WHERE wono = a.wono AND Sequence < a.Sequence AND fromWC = a.toWC ORDER BY Sequence DESC ) AS c
ORDER BY a.Sequence
Note: This solution would benefit greatly from an "ID" column, that mirrors transaction order OR at least you will need an index on wono, transferDate, fromWC, toWC
*Edit (Hopefully to be more clear)
Table below, I would like to count ids and count duplicate ids where the createddate has a gap of 3 months or more for that ID.
Query I have so far...
if object_id('tempdb..#temp') is not null
begin drop table #temp end
select
top 100
a.id, a.CreatedDate
into #temp
from tbl a
where 1=1
--and year(CreatedDate) = '2015'
if object_id('tempdb..#temp2') is not null
begin drop table #temp2 end
select t.id, count(t.id) as Total_Cnt
into #temp2
from #temp t
group by id
select distinct #temp2.Total_Cnt, #temp2.id, #temp.CreatedDate, DENSE_RANK() over (partition by #temp.id order by createddate) RK
from #temp2
inner join #temp on #temp2.id = #temp.id
where 1=1
order by Total_Cnt desc
Results:
Total_cnt id createddate rk
3 1 01-01-2015 1
3 1 03-02-2015 2
3 1 01-02-2015 3
2 2 05-01-2015 1
2 2 05-02-2015 2
1 3 06-01-2015 1
1 4 07-01-2015 1
Count ids and only count duplicate ids when the createddate from the id is greater than 3 months.
Something like this...
Total_cnt id Countwith3monthgap
3 1 2
2 2 1
1 3 1
1 4 1
You can use a cte and ROW_NUMBER to get your order and self join the cte based on the order..
WITH cte AS
( SELECT
*,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY CreatedDate) Rn
FROM
Test
)
SELECT
c1.ID,
COUNT(CASE WHEN c2.CreatedDate IS NULL THEN 1
WHEN c1.CreatedDate >= DATEADD(month,3,c2.CreatedDate) THEN 1
END)
FROM
cte c1
LEFT JOIN cte c2 ON c1.ID = c2.ID
AND c1.RN = c2.RN + 1
GROUP BY
c1.ID
You also need to use a conditional count where the Previous CreatedDate is null or if the Current CreatedDate is >= the Previous CreatedDate + 3 months
If you happen to be using SQL 2012+ you can also use LAG here to get the same result
SELECT
ID,
COUNT(*)
FROM
(SELECT
ID,
CreatedDate CurrentDate,
LAG(CreatedDate) OVER (PARTITION BY ID ORDER BY CreatedDate) PreviousDate
FROM
Test
) T
WHERE
PreviousDate IS NULL
OR CurrentDate >= DATEADD(month, 3, PreviousDate)
GROUP BY
ID
You can use a lag to get the previous date, Null for the first in the list
SELECT
id,
lag(CreatedDate,1) OVER (PARTITION BY Id ORDER BY CreatedDate) AS PreviousCreateDate,
CreatedDate
FROM #t
You can use that as a subquery and get the difference in months using DATEDIFF
SELECT sub.id,DATEDiff(month, sub.PreviousCreateDate ,sub.CreatedDate)
FROM (SELECT
id,
lag(CreatedDate,1) OVER (PARTITION BY Id ORDER BY CreatedDate) AS PreviousCreateDate,
CreatedDate
FROM #t) sub
WHERE DATEDiff(month, sub.PreviousCreateDate ,sub.CreatedDate) >=3
OR sub.PreviousCreateDate IS NULL
You can then take your totals
SELECT sub.id,COUNT(sub.id) as cnt
FROM (SELECT
id,
lag(CreatedDate,1) OVER (PARTITION BY Id ORDER BY CreatedDate) AS PreviousCreateDate,
CreatedDate
FROM #t) sub
WHERE DATEDIFF(month, sub.PreviousCreateDate ,sub.CreatedDate) >=3
OR sub.PreviousCreateDate IS NULL
GROUP BY sub.id
Note that using datediff the last day of january is three months before the first day of march. That appears to be the logic you were after.
You might want to define your three month gap criteria as
WHERE sub.PreviousCreateDate <= DATEADD(month, -3, sub.CreatedDate)
OR sub.PreviousCreateDate IS NULL
or
WHERE sub.CreatedDate >= DATEADD(month, +3, sub.PreviousCreateDate )
OR sub.PreviousCreateDate IS NULL
I'm guessing that your desired definition of three-month gap doesn't coincide with datediff()'s. Most of the logic here is to look back at the previous date and decide if the gap is big enough to qualify.
When datediff() counts three months difference we still need to make sure the day of month is later than the first one (per example and ID 5). If difference is more than three months then we're good automatically.
But I'm also assuming that you would want to treat the distance from November 30th to February 28th (or 29th in a leap year) as a full three months because the end date falls on the final day of the month. By adjusting the end date by an extra day this is an easy scenario to snag as it will bump the date into the following month and increase the month difference by one as well. If that's not what you want then just remove the dateadd(day, 1, ...) portion and use only the raw CreatedDate value.
You sample data is limited so I'm also making the assumption that the gaps are measure between consecutive dates. If you're wanting to find blocks of runs that don't span more than three months across the set, then that's a different problem and you should clarify with more information.
Since you've indicated that you're probably on SQL Server 2008 you'll have to do without the lag() function. Although the first query could be adjusted for that it's likely easier to go with the second approach at the end.
with diffs as (
select
ID,
row_number() over (partition by ID order by CreatedDate) as RN,
case when
datediff(
month,
lag(CreatedDate, 1) over (partition by ID order by CreatedDate),
CreatedDate
) = 3
and
datepart(
day,
lag(CreatedDate, 1) over (partition by ID order by CreatedDate)
) <= datepart(day, CreatedDate)
or
datediff(
month,
lag(CreatedDate, 1) over (partition by ID order by CreatedDate),
/* adding one day to handle gaps like Nov30 - Feb28/29 and Jan31 - Apr30 */
dateadd(day, 1, CreatedDate)
) >= 4
then 1
else 0
end as GapFlag
from <T> /* <--- your table name here */
), gaps as (
select
ID, RN,
sum(1 + GapFlag) over (partition by ID order by RN) as Counter
from diffs
)
select ID, count(distinct Counter - RN) as "Count"
from gaps
group by ID
The rest of the logic is a typical gaps and islands scenario looking for holes in the sum(1 + GapCount) sequence with the offset of 1 acting pretty much like row_number().
http://sqlfiddle.com/#!6/61b12/3
JamieD77's approach is also valid. I was originally thinking your problem involved more than looking at the rows in sequence. Here's how I would tweak it for the gap definition I've been running with:
with data as (
select ID, CreatedDate, row_number() over (partition by ID order by CreatedDate) as RN
from T
)
select ID, count(*) as "Count"
from data d1 left outer join data d0
on d0.ID = d1.ID and d0.RN = d1.RN - 1 /* connect to the one before */
where
datediff(month, d0.CreatedDate, d1.CreatedDate) = 3
and datepart(day, d0.CreatedDate) <= datepart(day, d0.CreatedDate)
or datediff(month, d0.CreatedDate, dateadd(day, 1, d0.CreatedDate)) >= 4
or d0.ID is null
group by ID
Edit: You have changed the question since yesterday.
Change this line in the first query to include the total count:
...
select count(*) as TotalCnt, ID, count(distinct Counter - RN) as GapCount
...
Second would look like:
with data as (
select ID, CreatedDate, row_number() over (partition by ID order by CreatedDate) as RN
from T
)
select
count(*) as TotalCnt, ID,
count(case when
datediff(month, d0.CreatedDate, d1.CreatedDate) = 3
and datepart(day, d0.CreatedDate) <= datepart(day, d0.CreatedDate)
or datediff(month, d0.CreatedDate, dateadd(day, 1, d0.CreatedDate)) >= 4
or d0.ID is null then 1 end
) as GapCount
from data d1 left outer join data d0
on d0.ID = d1.ID and d0.RN = d1.RN - 1 /* connect to the one before */
where
group by ID
I have a query which retrieves latest data from table with pagination which is working fine.
But once the data is older from current time it should also appear but after the latest one.
SELECT A.UserId,A.FirstName,A.LastName,A.PostDate
(SELECT ROW_NUMBER() OVER(ORDER BY CAST(M.PostDate AS DATETIMEOFFSET) DESC) AS 'RowNumber'
M.UserId,
M.FirstName,
M.LastName,
M.PostDate
FROM Messages AS M
Where M.PostDate >= GetDate()
) A
WHERE A.RowNumber BETWEEN #RowStart AND #RowEnd
ORDER BY CAST(A.PostDate AS DATETIMEOFFSET) DESC
I'm not sure if I'm exactly getting what you're looking for, but this should get you something close. I created a CTE with two computed columns:
BeforeAfter, which determines if the date happens before or after the passed in date
AbsDiff, which gives the absolute value of the date diff.
Then, I just sort using those two fields:
DECLARE #current datetime = GETDATE()
;WITH cteMessages AS
(
SELECT UserId, FirstName, LastName, PostDate,
CASE WHEN PostDate < #current THEN 1 ELSE 0 END AS BeforeAfter,
ABS(DATEDIFF(SECOND, #current, PostDate)) AS AbsDiff
FROM Messages
)
SELECT * FROM cteMessages
ORDER BY BeforeAfter, AbsDiff
From those results, you can see how they are sorted first by messages newer than the passed in date, then in reverse order from older messages. You can substitute that order by into your Row_Number function.
You cant do Order By ASC and DESC on the Same column of a Table if you do you will get the following Error
Msg 169, Level 15, State 1, Line 1 A column has been specified more
than once in the order by list. Columns in the order by list must be
unique.
If you want it you may do using Join of the same Table and order by the same column like
SELECT A.UserId,A.FirstName,A.LastName,A.PostDate
(SELECT ROW_NUMBER() OVER(ORDER BY CAST(M.PostDate AS DATETIMEOFFSET) DESC) AS 'RowNumber'
M.UserId,
M.FirstName,
M.LastName,
M.PostDate
FROM Messages AS M
Where M.PostDate >= GetDate()
) A
Inner Join Messages AS Msg on Msg.UserId=A.UserId
WHERE A.RowNumber BETWEEN #RowStart AND #RowEnd
ORDER BY CAST(A.PostDate AS DATETIMEOFFSET) DESC,
CAST(Msg.PostDate AS DATETIMEOFFSET) ASC
I think you should remove the following line
Where M.PostDate >= GetDate()
I'm looking for a way to fetch at least 20 rows, regardless of the date or all rows with a date greater than a set date (whichever of the 2 fetches the most rows).
For example:
SELECT *
FROM table1
WHERE the_date >= '2014-11-01'
ORDER BY the_date DESC
will give me what I want, but if this returns less than 20 rows then I want to keep going before that date until I get 20 (or until there are no more rows - whichever comes first).
I could just select all and take the ones I need programatically, but I'm trying to avoid that as that would be a major change to the actual code in many places.
Can this be done?
Note: I am using SQL Server 2008.
Here's an approach:
select *
from
(
select row_number() over (order by the_date desc) num, t.*
from table1 t
) i
where i.the_date >= '2014-11-01'
or i.num < 21
Very strange requirement but something like this should work.
DECLARE #Date date = '2014-11-01';
with MyResults as
(
SELECT *, 1 as RowNum
FROM table1
WHERE (the_date >= #Date )
UNION ALL
SELECT top 10 *, ROW_NUMBER() over(order by the_date desc)
FROM table1
order by the_date desc
)
Select *
from MyResults
where RowNum <= 20
The right answer was almost there, but the user has removed it, so here it is again:
SELECT *
FROM table1
WHERE the_date >= '2014-11-01'
union
SELECT top 20 *
FROM table1
ORDER BY the_date desc
The removed answer had union all
This answer has a problem with duplicate dates , but with the data I have that is not an issue
I have a table of contactIDs and datetimes, the time being when a letter was generated for the contact. Each contact can only have one letter generated a day. I want to write a query to select any contact that has had letters generated on more than one consecutive day.
I guess I'd need to increment the datetime as records are found but how would I do this separately for each contact?
select contactid from ContactTable a inner join Contacttable B on a.contctid=b.contactid and datediff(day,a.date,b.date)=1
I decided to utilise a calendar table. Use your favourite search engine to find a script to create a calendar table.
Alternatively, here's one I made earlier
So here's the query I have rolled with in full, I will explain the detail of it afterwards
DECLARE #your_table table (
contact_id int
, created_on datetime
);
INSERT INTO #your_table (contact_id, created_on)
SELECT 9, '2014-01-02 06:00'
UNION ALL SELECT 9, '2014-01-02 18:00'
UNION ALL SELECT 9, '2014-01-05 08:00'
UNION ALL SELECT 9, '2014-01-07 01:00'
UNION ALL SELECT 3, '2014-01-02 00:01'
UNION ALL SELECT 3, '2014-01-03 23:59' -- Over 24 hours but a "day" different
UNION ALL SELECT 7, '2014-01-04 01:00'
UNION ALL SELECT 7, '2014-01-06 01:00'
UNION ALL SELECT 7, '2014-01-08 01:00'
UNION ALL SELECT 7, '2014-01-09 01:00'
UNION ALL SELECT 7, '2014-01-10 01:00'
UNION ALL SELECT 7, '2014-01-11 01:00'
;
; WITH x AS (
SELECT your_table.contact_id
, your_table.created_on
, calendar.the_date
, Row_Number() OVER (PARTITION BY your_table.contact_id ORDER BY calendar.the_date) As sequence
FROM #your_table As your_table
INNER
JOIN dbo.calendar
ON your_table.created_on >= calendar.the_date
AND your_table.created_on < DateAdd(dd, 1, calendar.the_date)
)
, y AS (
SELECT curr.contact_id
, curr.created_on
, curr.the_date As the_date
, prev.the_date As previous_date
, DateDiff(dd, prev.the_date, curr.the_date) As difference_in_days
FROM x As curr
LEFT
JOIN x As prev
ON curr.contact_id = prev.contact_id
AND curr.sequence = prev.sequence + 1
)
SELECT contact_id
, created_on
, the_date
, previous_date
, difference_in_days
FROM y
WHERE difference_in_days = 1
Because you didn't provide any sample data that's where I had to start, so the query is self-contained using a table variable (#your_table) as its source.
Once populated we start out with a couple of Common-Table Expressions (CTE for short). Read up here if you're not familiar with the concept: http://msdn.microsoft.com/en-us/library/ms175972.aspx . There's not a lot of difference between these and subqueries.
Our first CTE (x) joins #your_table to the calendar table. It does this by returning the single row from the calendar on which the created_on date lies, by checking that it is greater than (or equal to) the calendar date and less than the next calendar date (DateAdd()).
Once complete we use the windowed function - Row_Number() to provide some sequencing.
We partition (i.e. reset the sequence) for each contact_id and sort the sequence by the created_on date.
Moving on to the second CTE (y): we perform a self-join on CTE x joining each contact record with its "previous" based on the sequencing.
This allows us to work out the difference in days (DateDiff()) between the current and the previous records.
Finally we reduce our resultset to only those records where the difference in days is 1 i.e. contacts on consecutive days