Improve the query by removing the sub query - sql-server

I have a Customer table
+--------+---------+
| Id | Name |
+--------+---------+
| 1 | A |
| 2 | b |
| 3 | c |
| 4 | d |
| 5 | 3 |
| 6 | f |
| 7 | g |
+--------+---------+
and an order table
+-----+------+--------------------------+
| ID | C_Id | OrderDate |
+-----+------+--------------------------+
| 1 | 1 | 2017-05-12 00:00:00.000 |
| 2 | 2 | 2017-12-12 00:00:00.000 |
| 3 | 3 | 2017-11-12 00:00:00.000 |
| 4 | 4 | 2017-12-12 00:00:00.000 |
| 5 | 1 | 2017-12-12 00:00:00.000 |
| 6 | 2 | 2017-12-12 00:00:00.000 |
| 7 | 3 | 2017-12-12 00:00:00.000 |
| 8 | 4 | 2017-11-12 00:00:00.000 |
| 9 | 2 | 2017-06-12 00:00:00.000 |
| 10 | 3 | 2017-07-12 00:00:00.000 |
+-----+------+--------------------------+
I need the result of the customers who did not buy in last month.
That is from the order table Customer 3 and 4 have purchased in last month(November). The result should not include customer 3 and 4 even they had purchase in earlier months.
I have this query which returns the result perfectly.
SELECT C_ID , MONTH(OrderDate) from [Order]
WHERE MONTH(OrderDate) <> MONTH(GETDATE()) - 1
AND C_ID NOT IN (
SELECT C_ID FROM [Order]
WHERE MONTH(OrderDate) = MONTH(GETDATE()) - 1)
Can anyone help me to write this query without using subquery
EDIT: For more clarity, I need to exclude the customers from the result(get all orders for current year) if they had any purchase in November, also I need results for this year alone.

I think you need to use another way for checking earlier purchases because with MONTH(OrderDate) <> MONTH(GETDATE()) - 1 you have a problem if purchases are in different years.
You need to expand your condition. For example
(
(MONTH(OrderDate)<MONTH(DATEADD(MONTH,-1,GETDATE())) AND YEAR(OrderDate)=YEAR(DATEADD(MONTH,-1,GETDATE())))
OR YEAR(OrderDate)<YEAR(DATEADD(MONTH,-1,GETDATE()))
)
Or you can use the EOMONTH function (from SQL Server 2012) for it. I think this variant will be more useful.
SELECT
C_ID,
MONTH(OrderDate)
FROM [Order]
WHERE EOMONTH(OrderDate)<EOMONTH(DATEADD(MONTH,-1,GETDATE())) -- check for month and year
AND C_ID NOT IN (
SELECT C_ID FROM [Order]
WHERE EOMONTH(OrderDate)=EOMONTH(DATEADD(MONTH,-1,GETDATE())) -- check for month and year
)
I think here is more useful using a variable
DECLARE #lastMonth date=EOMONTH(DATEADD(MONTH,-1,GETDATE()))
SELECT
C_ID,
MONTH(OrderDate)
FROM [Order]
WHERE EOMONTH(OrderDate)<#lastMonth -- check for month and year
AND C_ID NOT IN (
SELECT C_ID FROM [Order]
WHERE EOMONTH(OrderDate)=#lastMonth -- check for month and year
)
A variant without a subquery
SELECT
C_ID,
MIN(OrderDate) FirstOrderDate,
MAX(OrderDate) LastOrderDate
FROM [Order]
WHERE OrderDate<=EOMONTH(DATEADD(MONTH,-1,GETDATE()))
GROUP BY C_ID
HAVING EOMONTH(MAX(OrderDate))<EOMONTH(DATEADD(MONTH,-1,GETDATE()))
Or
DECLARE #lastMonth date=EOMONTH(DATEADD(MONTH,-1,GETDATE()))
SELECT
C_ID,
MIN(OrderDate) FirstOrderDate,
MAX(OrderDate) LastOrderDate
FROM [Order]
WHERE OrderDate<=#lastMonth
GROUP BY C_ID
HAVING EOMONTH(MAX(OrderDate))<#lastMonth
But here I return only MIN(OrderDate) and MAX(OrderDate) but maybe it'll suit you.
I don't think that a variant with a subquery is worse. I think it's more clearly.
DECLARE #lastMonth date=EOMONTH(DATEADD(MONTH,-1,GETDATE()))
SELECT
C_ID,
YEAR(OrderDate) [Year],
MONTH(OrderDate) [Month]
COUNT(ID) OrderCount
FROM [Order]
WHERE EOMONTH(OrderDate)<#lastMonth
--AND YEAR(OrderDate)=YEAR(#lastMonth) -- if you need only orders from this year
AND C_ID IN(
SELECT DISTINCT C_ID
FROM [Order]
WHERE EOMONTH(OrderDate)=#lastMonth
)
GROUP BY C_ID,YEAR(OrderDate),MONTH(OrderDate)

When working with dated information you cannot just use a month number because month 1 comes after month 12 (of last year). So, work with dates, not month numbers.
For this query we need "this month" and "last month" by their dates (not the month numbers) and we can start with this by using getdate()
A useful "trick" here is to calculate the first day of this month which we can do by calculating the number of months from zero datediff(month,0, getdate() ) then adding that number to zero dateadd(month, ..., 0). So once we have the first of this month it is easy to calculate first of last month and first of next month simply by subtracting or adding 1 month.
So, for a solution that will work in any version of SQL Server:
SQL Fiddle
MS SQL Server 2014 Schema Setup:
CREATE TABLE Orders
([ID] int, [C_Id] int, [OrderDate] datetime)
;
INSERT INTO Orders
([ID], [C_Id], [OrderDate])
VALUES
(1, 1, '2017-05-12 00:00:00'),
(2, 2, '2017-12-12 00:00:00'),
(3, 3, '2017-11-12 00:00:00'),
(4, 4, '2017-12-12 00:00:00'),
(5, 1, '2017-12-12 00:00:00'),
(6, 2, '2017-12-12 00:00:00'),
(7, 3, '2017-12-12 00:00:00'),
(8, 4, '2017-11-12 00:00:00'),
(9, 2, '2017-06-12 00:00:00'),
(10, 3, '2017-07-12 00:00:00')
;
CREATE TABLE Customers
([Id] int, [Name] varchar(1))
;
INSERT INTO Customers
([Id], [Name])
VALUES
(1, 'A'),
(2, 'b'),
(3, 'c'),
(4, 'd'),
(5, '3'),
(6, 'f'),
(7, 'g')
;
Query 1:
declare #this_month datetime = dateadd(month, datediff(month,0, getdate() ), 0)
declare #last_month datetime = dateadd(month,-1,#this_month)
select
c.Id
, c.name
, count(case when o.OrderDate >= #last_month and o.OrderDate < #this_month then 1 end) last_month
, count(case when o.OrderDate >= #this_month then 1 end) this_month
from customers c
LEFT join orders o on c.id = o.c_id
and OrderDate >= #last_month
and OrderDate < dateadd(month,1,#this_month)
group by c.Id, c.name
having count(case when o.OrderDate >= #last_month and o.OrderDate < #this_month then 1 end) = 0
and count(case when o.OrderDate >= #this_month then 1 end) > 0
Results:
| Id | name | last_month | this_month |
|----|------|------------|------------|
| 1 | A | 0 | 1 |
| 2 | b | 0 | 2 |
----
declare #this_year datetime = dateadd(year, datediff(year,0, getdate() ), 0)
declare #this_month datetime = dateadd(month, datediff(month,0, getdate() ), 0)
declare #last_month datetime = dateadd(month,-1,#this_month)
select
c.Id
, c.name
, count(case when o.OrderDate >= #last_month and o.OrderDate < #this_month then 1 end) last_month
, count(o.OrderDate) this_year
from customers c
LEFT join orders o on c.id = o.c_id
and OrderDate >= #this_year
and OrderDate < dateadd(year,1,#this_year)
group by c.Id, c.name
having count(case when o.OrderDate >= #last_month and o.OrderDate < #this_month then 1 end) = 0
and count(o.OrderDate) > 0
;

Related

SQL Server 2014 Merging Overlapping Date Ranges

I have a table with 200.000 rows in a SQL Server 2014 database looking like this:
CREATE TABLE DateRanges
(
Contract VARCHAR(8),
Sector VARCHAR(8),
StartDate DATE,
EndDate DATE
);
INSERT INTO DateRanges (Contract, Sector, StartDate, Enddate)
SELECT '111', '999', '01-01-2014', '03-31-2014'
union
SELECT '111', '999', '04-01-2014', '06-30-2014'
union
SELECT '111', '999', '07-01-2014', '09-30-2014'
union
SELECT '111', '999', '10-01-2014', '12-31-2014'
union
SELECT '111', '888', '08-01-2014', '08-31-2014'
union
SELECT '111', '777', '08-15-2014', '08-31-2014'
union
SELECT '222', '999', '01-01-2014', '03-31-2014'
union
SELECT '222', '999', '04-01-2014', '06-30-2014'
union
SELECT '222', '999', '07-01-2014', '09-30-2014'
union
SELECT '222', '999', '10-01-2014', '12-31-2014'
union
SELECT '222', '666', '11-01-2014', '11-30-2014'
UNION
SELECT '222', '555', '11-15-2014', '11-30-2014';
As you can see there can be multiple overlaps for each contract and what I would like to have is the result like this
Contract Sector StartDate EndDate
---------------------------------------------
111 999 01-01-2014 07-31-2014
111 888 08-01-2014 08-14-2014
111 777 08-15-2014 08-31-2014
111 999 09-01-2014 12-31-2014
222 999 01-01-2014 10-31-2014
222 666 11-01-2014 11-14-2014
222 555 11-15-2014 11-30-2014
222 999 12-01-2014 12-31-2014
I can not figure out how this can be done and the examples i have seen on this site quite do not fit my problem.
This answer makes use of a few different techniques. The first is a recursive-cte that creates a table with every relevant cal_date which then gets cross apply'd with unique Contract values to get every combination of both values. The second is window-functions such as lag and row_number to determine a variety of things detailed in the comments below. Lastly, and probably most importantly, gaps-and-islands to determine when one Contract/Sector combination ends and the next begins.
Answer:
--determine range of dates
declare #bgn_dt date = (select min(StartDate) from DateRanges)
, #end_dt date = (select max(EndDate) from DateRanges)
--use a recursive CTE to create a record for each day / Contract
; with dates as
(
select #bgn_dt as cal_date
union all
select dateadd(d, 1, a.cal_date) as cal_date
from dates as a
where a.cal_date < #end_dt
)
select d.cal_date
, c.Contract
into #contract_dates
from dates as d
cross apply (select distinct Contract from DateRanges) as c
option (maxrecursion 0)
--Final Select
select f.Contract
, f.Sector
, min(f.cal_date) as StartDate
, max(f.cal_date) as EndDate
from (
--Use the sum-over to obtain the Island Numbers
select dr.Contract
, dr.Sector
, dr.cal_date
, sum(dr.IslandBegin) over (partition by dr.Contract order by dr.cal_date asc) as IslandNbr
from (
--Determine if the record is the start of a new Island
select a.Contract
, a.Sector
, a.cal_date
, case when lag(a.Sector, 1, NULL) over (partition by a.Contract order by a.cal_date asc) = a.Sector then 0 else 1 end as IslandBegin
from (
--Determine which Contract/Date combinations are valid, and rank the Sectors that are in effect
select cd.cal_date
, dr.Contract
, dr.Sector
, dr.EndDate
, row_number() over (partition by dr.Contract, cd.cal_date order by dr.StartDate desc) as ConractSectorRnk
from #contract_dates as cd
left join DateRanges as dr on cd.Contract = dr.Contract
and cd.cal_date between dr.StartDate and dr.EndDate
) as a
where a.ConractSectorRnk = 1
and a.Contract is not null
) as dr
) as f
group by f.Contract
, f.Sector
, f.IslandNbr
order by f.Contract asc
, min(f.cal_date) asc
Output:
+----------+--------+------------+------------+
| Contract | Sector | StartDate | EndDate |
+----------+--------+------------+------------+
| 111 | 999 | 2014-01-01 | 2014-07-31 |
| 111 | 888 | 2014-08-01 | 2014-08-14 |
| 111 | 777 | 2014-08-15 | 2014-08-31 |
| 111 | 999 | 2014-09-01 | 2014-12-31 |
| 222 | 999 | 2014-01-01 | 2014-10-31 |
| 222 | 666 | 2014-11-01 | 2014-11-14 |
| 222 | 555 | 2014-11-15 | 2014-11-30 |
| 222 | 999 | 2014-12-01 | 2014-12-31 |
+----------+--------+------------+------------+

Extending a Pivot

I have a Documents table and an Events table.
Documents table has ID and a bunch of other fields not relevant to
this question.
Events table has DocID, EventType, EventDate, and UserID.
A document may have zero or more Events of any of these EventTypes:
0 = Created
1 = Modified
2 = Submitted
3 = Approved
DocID | EventType | EventDate | UserID
-----------------------------------------------
1 | 0 | 1-2-2017 | 123
1 | 1 | 1-3-2017 | 456
1 | 1 | 1-4-2017 | 489
1 | 2 | 1-5-2017 | 357
2 | 0 | 1-6-2017 | 951
2 | 1 | 1-7-2017 | 654
2 | 2 | 1-8-2017 | 654
2 | 3 | 1-9-2017 | 357
Pivoting the Events table is easy enough:
SELECT DocID, [0] AS CreatedDate, [1] AS ModifiedDate,
[2] AS SubmittedDate, [3] AS ApprovedDate
FROM (SELECT DocID, EventType, EventDate FROM Events
WHERE DocID IS NOT NULL AND EventDate IS NOT NULL) AS DocEvents
PIVOT (MAX(EventDate) FOR EventType IN ([0], [1], [2], [3]))
AS DocEventsPivot
For my purposes, the most recent event of a given type is wanted, thus the MAX aggregate:
DocID | CreatedDate | ModifiedDate | SubmittedDate | ApprovedDate
-----------------------------------------------------------------
1 | 1-2-2017 | 1-4-2017 | 1-5-2017 | NULL
2 | 1-6-2017 | 1-7-2017 | 1-8-2017 | 1-9-2017
How can I get the UserID translated to CreatedBy, ModifiedBy, SubmittedBy, and ApprovedBy to correspond to the dates of the appropriate EventType?
I will not know the possible values of UserID in advance.
Desired Output:
DocID | CreatedDate | ModifiedDate | SubmittedDate | ApprovedDate | CreatedBy | ModifiedBy | SubmittedBy | ApprovedBy
---------------------------------------------------------------------------------------------------------------------
1 | 1-2-2017 | 1-4-2017 | 1-5-2017 | NULL | 123 | 489 | 357 | NULL
2 | 1-6-2017 | 1-7-2017 | 1-8-2017 | 1-9-2017 | 951 | 654 | 654 | 657
Rather than using PIVOT another solution is using OUTER APPLY.
CREATE TABLE #Documents (ID int)
CREATE TABLE #Events (DocID int, EventType int, EventDate date, UserID int)
INSERT INTO #Documents VALUES
(1),
(2)
INSERT INTO #Events VALUES
(1, 0, '1-2-2017', 123),
(1, 1, '1-3-2017', 456),
(1, 1, '1-4-2017', 489),
(1, 2, '1-5-2017', 357),
(2, 0, '1-6-2017', 951),
(2, 1, '1-7-2017', 654),
(2, 2, '1-8-2017', 654),
(2, 3, '1-9-2017', 357)
SELECT
DOC.ID AS 'DocID',
CRT.EventDate AS 'CreatedDate',
MFY.EventDate AS 'ModifiedDate',
SUB.EventDate AS 'SubmittedDate',
APR.EventDate AS 'ApprovedDate',
CRT.UserID AS 'CreatedBy',
MFY.UserID AS 'ModifiedBy',
SUB.UserID AS 'SubmittedBy',
APR.UserID AS 'ApprovedBy'
FROM
#Documents AS DOC
OUTER APPLY (SELECT TOP 1 EventDate, UserID FROM #Events WHERE DocID = DOC.ID AND EventType = 0 ORDER BY EventDate DESC) AS CRT
OUTER APPLY (SELECT TOP 1 EventDate, UserID FROM #Events WHERE DocID = DOC.ID AND EventType = 1 ORDER BY EventDate DESC) AS MFY
OUTER APPLY (SELECT TOP 1 EventDate, UserID FROM #Events WHERE DocID = DOC.ID AND EventType = 2 ORDER BY EventDate DESC) AS SUB
OUTER APPLY (SELECT TOP 1 EventDate, UserID FROM #Events WHERE DocID = DOC.ID AND EventType = 3 ORDER BY EventDate DESC) AS APR
DROP TABLE #Documents
DROP TABLE #Events
Try the following
CREATE TABLE #Events (DocID int, EventType int, EventDate date, UserID int)
INSERT INTO #Events VALUES
(1, 0, '1-2-2017', 123),
(1, 1, '1-3-2017', 456),
(1, 1, '1-4-2017', 489),
(1, 2, '1-5-2017', 357),
(1, 2, '1-4-2017', 666),
(2, 0, '1-6-2017', 951),
(2, 1, '1-7-2017', 654),
(2, 2, '1-8-2017', 654),
(2, 3, '1-9-2017', 357)
SELECT
DocID,
MAX(CASE WHEN EventType=0 THEN EventDate END) [CreatedDate],
MAX(CASE WHEN EventType=1 THEN EventDate END) [ModifiedDate],
MAX(CASE WHEN EventType=2 THEN EventDate END) [SubmittedDate],
MAX(CASE WHEN EventType=3 THEN EventDate END) [ApprovedDate],
MAX(CASE WHEN EventType=0 THEN LastUserID END) [CreatedBy],
MAX(CASE WHEN EventType=1 THEN LastUserID END) [ModifiedBy],
MAX(CASE WHEN EventType=2 THEN LastUserID END) [SubmittedBy],
MAX(CASE WHEN EventType=3 THEN LastUserID END) [ApprovedBy]
FROM
(
SELECT
DocID,
EventDate,
EventType,
LAST_VALUE(UserID)OVER(
PARTITION BY DocID,EventType
ORDER BY EventDate
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) LastUserID
FROM #Events
) q
GROUP BY DocID
DROP TABLE #Events
Or you can use IFF instead CASE. I prefer use CASE because it isn't block ELSE
SELECT
DocID,
MAX(IIF(EventType=0,EventDate,NULL)) [CreatedDate],
MAX(IIF(EventType=1,EventDate,NULL)) [ModifiedDate],
MAX(IIF(EventType=2,EventDate,NULL)) [SubmittedDate],
MAX(IIF(EventType=3,EventDate,NULL)) [ApprovedDate],
MAX(IIF(EventType=0,LastUserID,NULL)) [CreatedBy],
MAX(IIF(EventType=1,LastUserID,NULL)) [ModifiedBy],
MAX(IIF(EventType=2,LastUserID,NULL)) [SubmittedBy],
MAX(IIF(EventType=3,LastUserID,NULL)) [ApprovedBy]
FROM
(
SELECT
DocID,
EventDate,
EventType,
LAST_VALUE(UserID)OVER(
PARTITION BY DocID,EventType
ORDER BY EventDate
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) LastUserID
FROM #Events
) q
GROUP BY DocID

(T-SQL) How to query an audit table, and find changes between 2 dates

The audit table looks like this:
Audit ID VendorID PaymentType CreateDateUTC
999 8048 2 2017-10-30-08:84:24
1000 1234 5 2017-10-31-01:17:34
1001 8048 7 2017-10-31-01:17:45
1002 1234 5 2017-10-31-01:17:53
1003 1234 7 2017-10-31-01:18:23
1004 1234 5 2017-11-01-01:18:45
In this example, you can see that say - VendorID 1234 started as PaymentType 5, then had another entry where it's still 5 (the audit table records additional changes not relevant to my query), then it changes to 7, but then back to 5.
Say I'd want to answer the question: 'Between now and date X, these VendorIDs had a change in PaymentType'. A bonus would be - this was the previous PaymentType.
Expected Results:
VendorID PaymentType Prev_PaymentType
8048 7 2
So say if I queried between now and 10-31-01:00:00, I'd want it to return VendorID 8048 as having changed (and as a bonus, that it's previous PaymentType was 2), but VendorID 1234 shouldn't show up, since at 2017-10-31-01:00:00 it was a 5, and now is still a 5, despite the intermittent changes.
How would one go about querying the VendorIDs whose payment type changed between 2 dates?
Thanks!
Here is an alternative approach that my prove useful, using OUTER APPLY. Note that the AuditID column is used as a tie-breaker mostly because the sample data does not have datetime values.
SQL Fiddle
CREATE TABLE AuditTable (
AuditID int
, VendorID int
, PaymentType int
, CreateDateUTC date
);
INSERT INTO AuditTable
VALUES (999, 8048, 2, '2017-10-30'),
(1000, 1234, 5, '2017-10-31'),
(1001, 8048, 7, '2017-10-31'),
(1002, 1234, 5, '2017-10-31'),
(1003, 1234, 7, '2017-10-31'),
(1004, 1234, 5, '2017-11-01');
Query 1:
select
*
from AuditTable a
outer apply (
select top(1) PaymentType, CreateDateUTC
from AuditTable t
where a.VendorID = t.VendorID
and a.CreateDateUTC >= t.CreateDateUTC
and a.AuditID > t.AuditID
order by CreateDateUTC DESC, AuditID DESC
) oa (PrevPaymentType, PrevDate)
order by
vendorid
, CreateDateUTC
Results:
| AuditID | VendorID | PaymentType | CreateDateUTC | PrevPaymentType | PrevDate |
|---------|----------|-------------|---------------|-----------------|------------|
| 1000 | 1234 | 5 | 2017-10-31 | (null) | (null) |
| 1002 | 1234 | 5 | 2017-10-31 | 5 | 2017-10-31 |
| 1003 | 1234 | 7 | 2017-10-31 | 5 | 2017-10-31 |
| 1004 | 1234 | 5 | 2017-11-01 | 7 | 2017-10-31 |
| 999 | 8048 | 2 | 2017-10-30 | (null) | (null) |
| 1001 | 8048 | 7 | 2017-10-31 | 2 | 2017-10-30 |
CREATE TABLE AuditTable (
AuditID INT,
VendorID INT,
PaymentType INT,
CreateDateUTC DATE
);
INSERT INTO AuditTable VALUES
(999 , 8048, 2, '2017-10-30'),
(1000, 1234, 5, '2017-10-31'),
(1001, 8048, 7, '2017-10-31'),
(1002, 1234, 5, '2017-10-31'),
(1003, 1234, 7, '2017-10-31'),
(1004, 1234, 5, '2017-11-01');
WITH CTE AS (
SELECT *,
ROW_NUMBER () OVER (PARTITION BY CreateDateUTC ORDER BY PaymentType) AS N1
FROM AuditTable
WHERE CreateDateUTC <= '2017-11-02' AND CreateDateUTC >= '2017-10-01'
) ,
MAXP AS(
SELECT VendorID, PaymentType, CreateDateUTC
FROM CTE
WHERE N1 = (SELECT MAX(N1) FROM CTE)
)
SELECT TOP 1 MAXP.VendorID, MAXP.PaymentType AS PaymentType, CTE.PaymentType AS Prev_PaymentType
FROM MAXP
JOIN CTE ON CTE.VendorID = MAXP.VendorID;
Result:
+----------+-------------+------------------+
| VendorID | PaymentType | Prev_PaymentType |
+----------+-------------+------------------+
| 8048 | 7 | 2 |
+----------+-------------+------------------+
Demo
Here is a variant without using LEAD() or LAG() but does use ROW_NUMBER and COUNT() OVER().
See this verision work at:SQL Fiddle
CREATE TABLE AuditTable (
AuditID int
, VendorID int
, PaymentType int
, CreateDateUTC date
);
INSERT INTO AuditTable
VALUES (999, 8048, 2, '2017-10-30'),
(1000, 1234, 5, '2017-10-31'),
(1001, 8048, 7, '2017-10-31'),
(1002, 1234, 5, '2017-10-31'),
(1003, 1234, 7, '2017-10-31'),
(1004, 1234, 5, '2017-11-01');
Query 1:
WITH
rowz AS (
SELECT *
, ROW_NUMBER() OVER (PARTITION BY VendorID
ORDER BY CreateDateUTC, AuditID) AS lagno
FROM AuditTable
),
cte AS (
SELECT *
, ROW_NUMBER() OVER (PARTITION BY VendorID, CreateDateUTC
ORDER BY c DESC, span_dt) rn
FROM (
SELECT r1.AuditID, r1.VendorID, r1.CreateDateUTC
, r1.PaymentType AS prevpaymenttype
, r2.PaymentType
, COALESCE(r2.CreateDateUTC, CAST(GETDATE() AS date)) span_dt
, COUNT(*) OVER (PARTITION BY r1.VendorID, r1.CreateDateUTC, r1.PaymentType) c
FROM rowz r1
LEFT JOIN rowz r2 ON r1.VendorID = r2.VendorID
AND r1.lagno = r2.lagno - 1
) d
)
SELECT
AuditID, VendorID, PrevPaymentType, PaymentType, CreateDateUTC
FROM (
SELECT
*
FROM cte
WHERE ('20171031' BETWEEN CreateDateUTC AND span_dt AND rn = 1)
OR (CAST(GETDATE() AS date) BETWEEN CreateDateUTC AND span_dt AND rn = 1)
) d
WHERE PaymentType <> PrevPaymentType
Results:
| AuditID | VendorID | PrevPaymentType | PaymentType | CreateDateUTC |
|---------|----------|-----------------|-------------|---------------|
| 999 | 8048 | 2 | 7 | 2017-10-30 |

Change output according to LAG

I have the current issue:
I'm trying to get the amount of time each of our workers have worked in a day to calculate our company's productivity. We have the time each of our workers has entered and left the building.
The rule is, sometimes our workers leaves the building to smoke or get something from the news stand outside, so we don't take that into consideration and count as if the person never left the building.
We have a cafeteria inside our building so most people don't actually leave the building to have lunch/dinner, so we just remove 1 hour from their productivity calculation, but, if they leave for more then 45 minutes, we will consider that the worker left to lunch/dinner.
I need the end result to look like this:
+----------+----------------+----------------+---------+----------+
| PersonID | IN | OUT | MINUTES | EatOut |
+----------+----------------+----------------+---------+----------+
| 1 | 20170807 08:00 | 20170807 17:25 | 465 | 1 |
+----------+----------------+----------------+---------+----------+
| 2 | 20170807 08:00 | 20170807 17:00 | 540 | 0 |
+----------+----------------+----------------+---------+----------+
My query I have so far:
DECLARE #mytable TABLE(
PersonId INT,
Situation VARCHAR(3),
SituationDtm DATETIME
);
INSERT INTO #mytable VALUES
(1, 'IN', '20170807 08:00'),
(1, 'OUT', '20170807 12:30'),
(1, 'IN', '20170807 14:00'),
(1, 'OUT', '20170807 17:15'),
(2, 'IN', '20170807 08:00'),
(2, 'OUT', '20170807 09:15'),
(2, 'IN', '20170807 09:30'),
(2, 'OUT', '20170807 17:00');
WITH CTE AS (
SELECT
[PersonId],
Situation AS 'CUR.Situation',
SituationDtm AS 'CUR.SituationDtm',
LEAD(Situation) OVER(PARTITION BY PersonId ORDER BY SituationDtm) AS 'NEXT.Situation',
LEAD(SituationDtm) OVER(PARTITION BY PersonId ORDER BY SituationDtm) AS 'NEXT.SituationDtm'
FROM
#mytable
)
SELECT
[CUR.Situation],
[CUR.SituationDtm],
[NEXT.Situation],
[NEXT.SituationDtm],
DATEDIFF(MINUTE, [CUR.SituationDtm], [NEXT.SituationDtm]) AS 'MINUTES'
FROM
CTE
Thanks in advance
You can further query as below: Since you are looking your solution in SQL Server 2008 where you do not have lead/lag you can query as below:
;With Cte as (
Select *, RowN = Row_Number() over(Partition by PersonId order by SituationDtm) from #mytable
), Cte2 as (
Select c1.*, c2.Situation as NextSituation, c2.SituationDtm as NextSituationDtm from cte c1 left join cte c2 on c1.RowN+1 = c2.RowN
and c1.PersonId = c2.PersonId
)
Select PersonId,
Min(SituationDTM) as [In],
Max(Situationdtm) as [Out],
Sum(Case when Situation = 'OUT' and NextSituation = 'IN' and datediff(mi,SituationDtm, NextSituationDTM) > 60 then 1 else 0 end) EatOut,
Sum(Case when Situation = 'OUT' and NextSituation = 'IN' and datediff(mi,SituationDtm, NextSituationDTM) > 60 then 0 else datediff(mi,SituationDtm, NextSituationDTM) end) as [minutes]
from Cte2
group by PersonId
In later versions after >= 2012 you can query as below:
Select PersonId,
Min(SituationDTM) as [In],
Max(Situationdtm) as [Out],
Sum(Case when Situation = 'OUT' and NextSituation = 'IN' and datediff(mi,SituationDtm, NextSituationDTM) > 60 then 1 else 0 end) EatOut,
Sum(Case when Situation = 'OUT' and NextSituation = 'IN' and datediff(mi,SituationDtm, NextSituationDTM) > 60 then 0 else datediff(mi,SituationDtm, NextSituationDTM) end) as [minutes]
from (
Select *, NextSituationDTM = lead(situationdtm) over (partition by personid order by situationdtm),
NextSituation = lead(Situation) over (partition by personid order by situationdtm) from #mytable
) a
group by PersonId
Output as below:
+----------+-------------------------+-------------------------+--------+---------+
| PersonId | In | Out | EatOut | minutes |
+----------+-------------------------+-------------------------+--------+---------+
| 1 | 2017-08-07 08:00:00.000 | 2017-08-07 17:15:00.000 | 1 | 465 |
| 2 | 2017-08-07 08:00:00.000 | 2017-08-07 17:00:00.000 | 0 | 540 |
+----------+-------------------------+-------------------------+--------+---------+

How to calculate SUM balance of all accounts in T-SQL?

I have this table and data
CREATE TABLE #transactions (
[transactionId] [int] NOT NULL,
[accountId] [int] NOT NULL,
[dt] [datetime] NOT NULL,
[balance] [smallmoney] NOT NULL,
CONSTRAINT [PK_transactions_1] PRIMARY KEY CLUSTERED
( [transactionId] ASC)
)
INSERT #transactions ([transactionId], [accountId], [dt], [balance]) VALUES
(1, 1, CAST(0x0000A13900107AC0 AS DateTime), 123.0000),
(2, 1, CAST(0x0000A13900107AC0 AS DateTime), 192.0000),
(3, 1, CAST(0x0000A13A00107AC0 AS DateTime), 178.0000),
(4, 2, CAST(0x0000A13B00107AC0 AS DateTime), 78.0000),
(5, 2, CAST(0x0000A13D011D1860 AS DateTime), 99.0000),
(6, 2, CAST(0x0000A13F00000000 AS DateTime), 97.0000),
(7, 1, CAST(0x0000A13D0141E640 AS DateTime), 201.0000),
(8, 3, CAST(0x0000A1420094DD60 AS DateTime), 4000.0000),
(9, 3, CAST(0x0000A14300956A00 AS DateTime), 4100.0000),
(10, 3, CAST(0x0000A14700000000 AS DateTime), 4200.0000),
(11, 2, CAST(0x0000A14B00B84BB0 AS DateTime), 110.0000)
I need two queries.
For each transaction, I want to return in a query the most recent balance for each account, and an extra column with a SUM of each account balance at that point in time.
Same as 1 but grouped by date without the time portion. So the latest account balance at the end of each day (where there is a transaction in any account) for each account, but SUMed together as in 1.
Data above is sample data that I just made up, but my real table has hundreds of rows and ten accounts (which may increase soon). Each account has a unique accountId. Seems quite a tricky piece of SQL.
EXAMPLE
For 1. I need a result like this:
+---------------+-----------+-------------------------+---------+-------------+
| transactionId | accountId | dt | balance | sumBalances |
+---------------+-----------+-------------------------+---------+-------------+
| 1 | 1 | 2013-01-01 01:00:00.000 | 123 | 123 |
| 2 | 1 | 2013-01-01 01:00:00.000 | 192 | 192 |
| 3 | 1 | 2013-01-02 01:00:00.000 | 178 | 178 |
| 4 | 2 | 2013-01-03 01:00:00.000 | 78 | 256 |
| 5 | 2 | 2013-01-05 17:18:00.000 | 99 | 277 |
| 7 | 1 | 2013-01-05 19:32:00.000 | 201 | 300 |
| 6 | 2 | 2013-01-07 00:00:00.000 | 97 | 298 |
| 8 | 3 | 2013-01-10 09:02:00.000 | 4000 | 4298 |
| 9 | 3 | 2013-01-11 09:04:00.000 | 4100 | 4398 |
| 10 | 3 | 2013-01-15 00:00:00.000 | 4200 | 4498 |
| 11 | 2 | 2013-01-19 11:11:00.000 | 110 | 4511 |
+---------------+-----------+-------------------------+---------+-------------+
So, for transactionId 8, I take the latest balance for each account in turn and then sum them. AccountID 1: is 201, AccountId 2 is 97 and AccountId 3 is 4000. Therefore the result for transactionId 8 will be 201+97+4000 = 4298. When calculating the set must be ordered by dt
For 2. I need this
+------------+-------------+
| date | sumBalances |
+------------+-------------+
| 01/01/2013 | 192 |
| 02/01/2013 | 178 |
| 03/01/2013 | 256 |
| 05/01/2013 | 300 |
| 07/01/2013 | 298 |
| 10/01/2013 | 4298 |
| 11/01/2013 | 4398 |
| 15/01/2013 | 4498 |
| 19/01/2013 | 4511 |
+------------+-------------+
So on date 15/01/2013 the latest account balance for each account in turn (1,2,3) is 201,97,4200. So the result for that date would be 201+97+4200 = 4498
This gives your first desired resultset (SQL Fiddle)
WITH T
AS (SELECT *,
balance -
isnull(lag(balance) OVER (PARTITION BY accountId
ORDER BY dt, transactionId), 0) AS B
FROM #transactions)
SELECT transactionId,
accountId,
dt,
balance,
SUM(B) OVER (ORDER BY dt, transactionId ROWS UNBOUNDED PRECEDING) AS sumBalances
FROM T
ORDER BY dt;
It subtracts the current balance of the account from the previous balance to get the net difference then calculates a running total of those differences.
And that can be used as a base for your second result
WITH T1
AS (SELECT *,
balance -
isnull(lag(balance) OVER (PARTITION BY accountId
ORDER BY dt, transactionId), 0) AS B
FROM #transactions),
T2 AS (
SELECT transactionId,
accountId,
dt,
balance,
ROW_NUMBER() OVER (PARTITION BY CAST(dt AS DATE) ORDER BY dt DESC, transactionId DESC) AS RN,
SUM(B) OVER (ORDER BY dt, transactionId ROWS UNBOUNDED PRECEDING) AS sumBalances
FROM T1)
SELECT CAST(dt AS DATE) AS [date], sumBalances
FROM T2
WHERE RN=1
ORDER BY [date];
Part 1
; WITH a AS (
SELECT *, r = ROW_NUMBER()OVER(PARTITION BY accountId ORDER BY dt)
FROM #transactions t
)
, b AS (
SELECT t.*
, transamount = t.balance - ISNULL(t0.balance,0)
FROM a t
LEFT JOIN a t0 ON t0.accountId = t.accountId AND t0.r + 1 = t.r
)
SELECT transactionId, accountId, dt, balance
, sumBalance = SUM(transamount)OVER(ORDER BY dt, transactionId)
FROM b
ORDER BY dt
Part 2
; WITH a AS (
SELECT *, r = ROW_NUMBER()OVER(PARTITION BY accountId ORDER BY dt)
FROM #transactions t
)
, b AS (
SELECT t.*
, transamount = t.balance - ISNULL(t0.balance,0)
FROM a t
LEFT JOIN a t0 ON t0.accountId = t.accountId AND t0.r + 1 = t.r
)
, c AS (
SELECT transactionId, accountId, dt, balance
, sumBalance = SUM(transamount)OVER(ORDER BY CAST(dt AS DATE))
, r1 = ROW_NUMBER()OVER(PARTITION BY accountId, CAST(dt AS DATE) ORDER BY dt DESC)
FROM b
)
SELECT dt = CAST(dt AS DATE)
, sumBalance
FROM c
WHERE r1 = 1
ORDER BY CAST(dt AS DATE)

Resources