I'm looking to create a query that compares the amount of a customer's most recent order with the amount of that customer's previous order. Please see the example data screenshot below:
Ideally I'd like the query to return these things in the results:
Total amount of the previous order, i.e. the order before the most recent order date (in this case 9/6/18 would be the most recent order and 2/2/17 would be the previous purchase)
Difference in amount between the most recent order and the previous order ($2000 - $25 = $1975)
A condition that looks for customers whose most recent order is more than $1000 above their previous purchase amount and whose account is more than 60 days old
Note: The conditions in the last bullet could be modified as needed (account age > 90 days, difference in order amount of $500, etc.)
Thank you for the assistance!
On SQL Server 2012 and later you can use LAG:
declare @amount decimal(16,2) = 1000
declare @days int = 60
select
*
-- order ascending so that lag() returns the customer's previous (earlier) order
,TotalWithPrevious = [Order Amount] + lag([Order Amount]) over (partition by UserID order by [Order Date])
,DifferenceofPrevious = [Order Amount] - lag([Order Amount]) over (partition by UserID order by [Order Date])
,CheckCondition = case
when [Order Amount] - lag([Order Amount]) over (partition by UserID order by [Order Date]) >= @amount
and datediff(day,lag([Order Date]) over (partition by UserID order by [Order Date]),[Order Date]) >= @days
then 'True'
else 'False'
end
from YourTable
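As a minimal sketch of how LAG behaves here: it looks one row back in the order given by its OVER clause, so an ascending date order makes it return the earlier order. The table variable @Orders, its column names, and the two rows are assumptions built from the figures quoted in the question, not the asker's real schema.

```sql
-- Hypothetical sample data: the two orders mentioned in the question.
DECLARE @Orders TABLE (UserID int, OrderDate date, OrderAmount decimal(16,2));
INSERT INTO @Orders (UserID, OrderDate, OrderAmount)
VALUES (1, '2017-02-02', 25.00),
       (1, '2018-09-06', 2000.00);

WITH Ranked AS (
    SELECT UserID, OrderDate, OrderAmount,
           -- ascending order: LAG returns the order placed just before this one
           LAG(OrderAmount) OVER (PARTITION BY UserID ORDER BY OrderDate) AS PreviousAmount,
           -- descending order: 1 marks each customer's most recent order
           ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY OrderDate DESC) AS rn
    FROM @Orders
)
SELECT UserID, OrderDate, OrderAmount, PreviousAmount,
       OrderAmount - PreviousAmount AS DifferenceOfPrevious
FROM Ranked
WHERE rn = 1;
-- one row: 1, 2018-09-06, 2000.00, 25.00, 1975.00
```

Filtering on rn = 1 keeps only the latest order per customer, which is where the comparison with the previous order is wanted.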
I'm having a problem with a query that populates the daily census(# of current inpatients) for a hospital unit. This previous post is where I found the query.
SELECT
[date], COUNT (DISTINCT
CASE WHEN admit_date <= [date] AND discharge_date >= [date] THEN id END) AS census
FROM
dbo.patients, dbo.census
GROUP BY
[date]
ORDER BY
[date]
There are 2 tables:
dbo.patients with id, admit_date, and discharge_date columns
dbo.census has a date column with every date since 2017, and a census column, which is blank
The query populates the census column, but toward the most recent dates the census counts are smaller than they should be. For example, there are 65 null values for discharge_date, so there should be a census count of 65 for today's date, but the query produces a count of 8.
You probably need to account for a NULL discharge date:
SELECT [date], COUNT (DISTINCT
CASE WHEN admit_date <= [date] AND COALESCE(discharge_date, GETDATE()) >= [date] THEN id END)
AS census
FROM dbo.patients
CROSS JOIN dbo.census
GROUP BY [date]
ORDER BY [date]
That is, assuming [date] is some sort of current date/time stamp. Also, as per Sean Lange's comment, if you really want a CROSS JOIN then you should specify that in the query.
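To see the effect of the NULL handling on a small scale, here is a rough sketch. The table and column names follow the question, but the table variables and the three patients are made up. It uses an explicit IS NULL test instead of COALESCE with GETDATE(), a variant that gives the same result without depending on the current date.

```sql
-- Hypothetical sample data; patients 2 and 3 have not been discharged yet.
DECLARE @patients TABLE (id int, admit_date date, discharge_date date NULL);
DECLARE @census TABLE ([date] date);

INSERT INTO @patients (id, admit_date, discharge_date)
VALUES (1, '2019-01-01', '2019-01-02'),
       (2, '2019-01-02', NULL),
       (3, '2019-01-03', NULL);
INSERT INTO @census ([date])
VALUES ('2019-01-01'), ('2019-01-02'), ('2019-01-03');

SELECT c.[date],
       COUNT(DISTINCT CASE
                 WHEN p.admit_date <= c.[date]
                  -- an open stay (NULL discharge) counts as still admitted
                  AND (p.discharge_date IS NULL OR p.discharge_date >= c.[date])
                 THEN p.id END) AS census
FROM @census AS c
CROSS JOIN @patients AS p
GROUP BY c.[date]
ORDER BY c.[date];
-- 2019-01-01 -> 1, 2019-01-02 -> 2, 2019-01-03 -> 2
```

Without the IS NULL branch the last two dates would come out as 1 and 0, because a comparison against NULL never evaluates to true.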
I'm trying to get a count of unique items in a column given an ID number and where the date is within the last 12 months. I need to iterate this over each row in my table.
I am using a combination of dense_rank() and OVER (PARTITION BY ...) to calculate the count of unique items, but I haven't been able to add the date filter successfully. The results so far show the count of distinct Unique_Code values for each row with the same ID, regardless of the date.
select ID,
Unique_Code,
Transaction_Date,
DATEADD(Month, -12, Transaction_Date) as L12M,
dense_rank() over (partition by ID order by Transaction_Date, Unique_Code) as [Unique_Count]
from (select *, (case when datediff(day, lag(Transaction_Date,1,Transaction_Date) over (partition by Unique_Code order by ID), Transaction_Date)
<= 1
then 1 else 2
end) as grp
from datatable1) as d
I expect each row to show the count of unique items in the Unique_Code column for that row's ID, counting only entries with the same ID whose transaction date falls between the row's transaction date minus 12 months and the row's transaction date. Right now I am seeing a count of unique items from the Unique_Code column across every entry with the same ID, regardless of the date range.
Unfortunately I do not have your source data to test, however, I've added an extra column to yours below:
select
ID
, Unique_Code
, Transaction_Date
, DATEADD(Month, -12, Transaction_Date) as L12M
, dense_rank() over (partition by ID order by Transaction_Date, Unique_Code) as [Unique_Count]
, rank() over (partition by Transaction_Date order by ID) NewUniqueCount
from (select *, (case when datediff(day, lag(Transaction_Date,1,Transaction_Date) over (partition by Unique_Code order by ID), Transaction_Date) <= 1
then 1 else 2 end) as grp from datatable1) as d
Let me know if it works; it should.
I’m struggling a bit here. The data is fabricated, but the query concept is very real.
I need to select the Customer, Current Amount, Previous Amount, Sequence and Date
WHERE DATE < 1190105
AND the DATE/SEQ is the maximum date/seq prior to that date point grouping by customer.
I’ve spent quite a few days trying all sorts of things using HAVING, nested select to try and obtain the max-date/amount and min-date/amount by customer and can’t quite get my head around it. I’m sure it should be quite easy, but any help you can offer would be really appreciated.
Thanks
SEQ  DATE     CUSTOMER  AMOUNT
1    1181225  Bob       400
2    1181226  Fred      300
3    1190101  Bob       100
4    1190104  Fred      500
5    1190104  George    200
6    1190105  Bob       150
7    1190106  Bob       200
8    1190110  Fred      160
9    1190110  Bob       300
10   1190112  Fred      400
Option 1: use the ROW_NUMBER and LAG functions
SELECT
ROW_NUMBER() OVER (Partition By Customer Order By [Date]) as Sec,
[Date],
Customer,
Amount as CurrentAmount,
LAG(Amount) OVER (Partition By Customer Order By [Date]) as PreviousAmount
FROM
YourTable
WHERE
[Date] < 1190105
Option 2: use OUTER APPLY
SELECT
ROW_NUMBER() OVER (Partition By T.Customer Order By T.[Date]) as Sec,
T.[Date],
T.Customer,
T.Amount as CurrentAmount,
Prev.Amount as PreviousAmount
FROM
YourTable T
OUTER APPLY (
SELECT TOP 1 P.Amount FROM YourTable P
WHERE P.Customer = T.Customer AND P.[Date] < T.[Date]
ORDER BY P.[Date] DESC
) Prev
WHERE
T.[Date] < 1190105
Option 3: use a correlated subquery
SELECT
ROW_NUMBER() OVER (Partition By T.Customer Order By T.[Date]) as Sec,
T.[Date],
T.Customer,
T.Amount as CurrentAmount,
(
SELECT TOP 1 P.Amount FROM YourTable P
WHERE P.Customer = T.Customer AND P.[Date] < T.[Date]
ORDER BY P.[Date] DESC
) as PreviousAmount
FROM YourTable T
WHERE
T.[Date] < 1190105
First restrict the rows with the date filter, then search for the max by customer.
Using GROUP BY:
DECLARE @FilterDate INT = 1190105
;WITH MaxDateByCustomer AS
(
SELECT
T.CUSTOMER,
MaxSEQ = MAX(T.SEQ)
FROM
YourTable AS T
WHERE
T.Date < @FilterDate
GROUP BY
T.CUSTOMER
)
SELECT
T.*
FROM
YourTable AS T
INNER JOIN MaxDateByCustomer AS M ON
T.CUSTOMER = M.CUSTOMER AND
T.SEQ = M.MaxSEQ
Using ROW_NUMBER window function:
DECLARE @FilterDate INT = 1190105
;WITH DateRankingByCustomer AS
(
SELECT
T.*,
DateRanking = ROW_NUMBER() OVER (PARTITION BY T.CUSTOMER ORDER BY T.SEQ DESC)
FROM
YourTable AS T
WHERE
T.Date < @FilterDate
)
SELECT
D.*
FROM
DateRankingByCustomer AS D
WHERE
D.DateRanking = 1
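If the previous amount is also wanted on the same row, one possible way to combine the ideas above is to compute LAG and ROW_NUMBER in the same CTE after the date filter. This is only a sketch: the table variable @T is an assumption that stands in for the real table, loaded with the sample rows from the question, and it needs SQL Server 2012 or later for LAG.

```sql
DECLARE @FilterDate int = 1190105;
-- Hypothetical stand-in for the real table, filled with the question's rows.
DECLARE @T TABLE (SEQ int, [DATE] int, CUSTOMER varchar(20), AMOUNT int);
INSERT INTO @T (SEQ, [DATE], CUSTOMER, AMOUNT)
VALUES (1, 1181225, 'Bob', 400),    (2, 1181226, 'Fred', 300),
       (3, 1190101, 'Bob', 100),    (4, 1190104, 'Fred', 500),
       (5, 1190104, 'George', 200), (6, 1190105, 'Bob', 150),
       (7, 1190106, 'Bob', 200),    (8, 1190110, 'Fred', 160),
       (9, 1190110, 'Bob', 300),    (10, 1190112, 'Fred', 400);

WITH Ranked AS (
    SELECT CUSTOMER, SEQ, [DATE],
           AMOUNT AS CurrentAmount,
           -- previous row for the same customer in date/sequence order
           LAG(AMOUNT) OVER (PARTITION BY CUSTOMER ORDER BY [DATE], SEQ) AS PreviousAmount,
           -- 1 marks the latest row per customer among the filtered rows
           ROW_NUMBER() OVER (PARTITION BY CUSTOMER ORDER BY [DATE] DESC, SEQ DESC) AS rn
    FROM @T
    WHERE [DATE] < @FilterDate
)
SELECT CUSTOMER, CurrentAmount, PreviousAmount, SEQ, [DATE]
FROM Ranked
WHERE rn = 1
ORDER BY CUSTOMER;
-- Bob     100  400   3  1190101
-- Fred    500  300   4  1190104
-- George  200  NULL  5  1190104
```

George has only one qualifying row, so his previous amount is NULL.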
I am using SUM ... OVER for the first time and have the following query now:
SELECT Id, Amount, Date, TotalAmount = SUM(Amount) OVER (order by Amount)
FROM Account
WHERE Date >= '2016-03-01' AND Date <= '2016-03-10' AND UserId = 'xyz'
ORDER BY ValutaDate
The TotalAmount should be a running total over the whole table for a specific user (so the SUM ... OVER clause should respect the WHERE clause for the user). On the other hand I just need a few records and not the whole table, that's why I added the WHERE clause specifying the date range. But now, of course, my sum gets calculated just for the range I specified.
What should I do to get just the few records specified by the date range, but still have the sum calculated over the whole table? Is there a performant way to accomplish this?
Thanks in advance for helping me out.
Break the running total into its own query
; WITH all_rows_one_user as (SELECT *
, TotalAmount = SUM(Amount) OVER (order by ValutaDate)
FROM Account
WHERE UserId = 'xyz')
SELECT Id, Amount, Date, TotalAmount
FROM all_rows_one_user
WHERE Date >= '2016-03-01' AND Date <= '2016-03-10'
ORDER BY ValutaDate
Same query, different syntax:
SELECT Id, Amount, Date, TotalAmount
FROM (SELECT *
, TotalAmount = SUM(Amount) OVER (order by ValutaDate)
FROM Account
WHERE UserId = 'xyz') AS all_rows_one_user
WHERE Date >= '2016-03-01' AND Date <= '2016-03-10'
ORDER BY ValutaDate
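A small sketch of why this works: the window function is evaluated inside the derived table over all of the user's rows, and only afterwards does the outer WHERE trim the output. The table variable @Account and its four rows are assumptions for illustration, with a single ValutaDate column standing in for both date columns. The ROWS UNBOUNDED PRECEDING frame and the Id tie-breaker are additions of mine; without them, rows sharing the same date would all receive the same total.

```sql
-- Hypothetical sample data: three rows for user 'xyz', one for another user.
DECLARE @Account TABLE (Id int, UserId varchar(10), ValutaDate date, Amount decimal(10,2));
INSERT INTO @Account (Id, UserId, ValutaDate, Amount)
VALUES (1, 'xyz', '2016-02-20', 100.00),
       (2, 'xyz', '2016-03-02', 50.00),
       (3, 'xyz', '2016-03-05', 25.00),
       (4, 'abc', '2016-03-03', 999.00);

SELECT Id, Amount, ValutaDate, TotalAmount
FROM (SELECT Id, Amount, ValutaDate,
             -- running total over every row of the user, not just the date range
             SUM(Amount) OVER (ORDER BY ValutaDate, Id
                               ROWS UNBOUNDED PRECEDING) AS TotalAmount
      FROM @Account
      WHERE UserId = 'xyz') AS all_rows_one_user
WHERE ValutaDate >= '2016-03-01' AND ValutaDate <= '2016-03-10'
ORDER BY ValutaDate;
-- Id 2 -> TotalAmount 150.00, Id 3 -> TotalAmount 175.00
```

The February row is missing from the output but still contributes 100.00 to both totals.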
The WHERE clause is applied first so the SUM can't access rows that don't match that.
You can use apply though. Note it will be reading the entire table for that user so might not perform too well without a decent index.
SELECT a.Id, a.Amount, a.Date, ta.TotalAmount
FROM Account a
OUTER APPLY (SELECT SUM(CASE WHEN a2.Date <= a.Date THEN a2.Amount ELSE 0 END) AS TotalAmount FROM Account a2 WHERE a2.UserId = a.UserId) ta
WHERE a.Date >= '2016-03-01' AND a.Date <= '2016-03-10' AND a.UserId = 'xyz'
*Edit (Hopefully to be more clear)
Table below, I would like to count ids and count duplicate ids where the createddate has a gap of 3 months or more for that ID.
Query I have so far...
if object_id('tempdb..#temp') is not null
begin drop table #temp end
select
top 100
a.id, a.CreatedDate
into #temp
from tbl a
where 1=1
--and year(CreatedDate) = '2015'
if object_id('tempdb..#temp2') is not null
begin drop table #temp2 end
select t.id, count(t.id) as Total_Cnt
into #temp2
from #temp t
group by id
select distinct #temp2.Total_Cnt, #temp2.id, #temp.CreatedDate, DENSE_RANK() over (partition by #temp.id order by createddate) RK
from #temp2
inner join #temp on #temp2.id = #temp.id
where 1=1
order by Total_Cnt desc
Results:
Total_cnt  id  createddate  rk
3          1   01-01-2015   1
3          1   03-02-2015   2
3          1   01-02-2015   3
2          2   05-01-2015   1
2          2   05-02-2015   2
1          3   06-01-2015   1
1          4   07-01-2015   1
Count ids and only count duplicate ids when the createddate from the id is greater than 3 months.
Something like this...
Total_cnt  id  Countwith3monthgap
3          1   2
2          2   1
1          3   1
1          4   1
You can use a CTE and ROW_NUMBER to get your order, then self join the CTE based on that order:
WITH cte AS
( SELECT
*,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY CreatedDate) Rn
FROM
Test
)
SELECT
c1.ID,
COUNT(CASE WHEN c2.CreatedDate IS NULL THEN 1
WHEN c1.CreatedDate >= DATEADD(month,3,c2.CreatedDate) THEN 1
END)
FROM
cte c1
LEFT JOIN cte c2 ON c1.ID = c2.ID
AND c1.RN = c2.RN + 1
GROUP BY
c1.ID
You also need to use a conditional count where the Previous CreatedDate is null or if the Current CreatedDate is >= the Previous CreatedDate + 3 months
If you happen to be using SQL 2012+ you can also use LAG here to get the same result
SELECT
ID,
COUNT(*)
FROM
(SELECT
ID,
CreatedDate CurrentDate,
LAG(CreatedDate) OVER (PARTITION BY ID ORDER BY CreatedDate) PreviousDate
FROM
Test
) T
WHERE
PreviousDate IS NULL
OR CurrentDate >= DATEADD(month, 3, PreviousDate)
GROUP BY
ID
You can use a lag to get the previous date, Null for the first in the list
SELECT
id,
lag(CreatedDate,1) OVER (PARTITION BY Id ORDER BY CreatedDate) AS PreviousCreateDate,
CreatedDate
FROM #t
You can use that as a subquery and get the difference in months using DATEDIFF
SELECT sub.id,DATEDiff(month, sub.PreviousCreateDate ,sub.CreatedDate)
FROM (SELECT
id,
lag(CreatedDate,1) OVER (PARTITION BY Id ORDER BY CreatedDate) AS PreviousCreateDate,
CreatedDate
FROM #t) sub
WHERE DATEDiff(month, sub.PreviousCreateDate ,sub.CreatedDate) >=3
OR sub.PreviousCreateDate IS NULL
You can then take your totals
SELECT sub.id,COUNT(sub.id) as cnt
FROM (SELECT
id,
lag(CreatedDate,1) OVER (PARTITION BY Id ORDER BY CreatedDate) AS PreviousCreateDate,
CreatedDate
FROM #t) sub
WHERE DATEDIFF(month, sub.PreviousCreateDate ,sub.CreatedDate) >=3
OR sub.PreviousCreateDate IS NULL
GROUP BY sub.id
Note that DATEDIFF counts the month boundaries crossed, so by its reckoning the last day of January is three months before the first day of April. That appears to be the logic you were after.
You might want to define your three month gap criteria as
WHERE sub.PreviousCreateDate <= DATEADD(month, -3, sub.CreatedDate)
OR sub.PreviousCreateDate IS NULL
or
WHERE sub.CreatedDate >= DATEADD(month, +3, sub.PreviousCreateDate )
OR sub.PreviousCreateDate IS NULL
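To make the difference between the two definitions concrete, here is a short sketch with one pair of dates chosen for illustration (they are not from the question's data). DATEDIFF with month only counts month boundaries crossed, while DATEADD moves the earlier date forward by whole calendar months.

```sql
-- Hypothetical pair of dates: end of January and start of April.
DECLARE @PreviousCreateDate date = '2015-01-31';
DECLARE @CreatedDate date = '2015-04-01';

SELECT DATEDIFF(month, @PreviousCreateDate, @CreatedDate) AS MonthBoundaries,   -- 3
       DATEADD(month, 3, @PreviousCreateDate)             AS ThreeMonthsLater,  -- 2015-04-30
       CASE WHEN DATEDIFF(month, @PreviousCreateDate, @CreatedDate) >= 3
            THEN 1 ELSE 0 END                             AS GapByDateDiff,     -- 1
       CASE WHEN @CreatedDate >= DATEADD(month, 3, @PreviousCreateDate)
            THEN 1 ELSE 0 END                             AS GapByDateAdd;      -- 0
```

The two dates are only 60 days apart, yet the DATEDIFF test treats them as a three-month gap; the DATEADD form does not.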
I'm guessing that your desired definition of three-month gap doesn't coincide with datediff()'s. Most of the logic here is to look back at the previous date and decide if the gap is big enough to qualify.
When datediff() counts three months difference we still need to make sure the day of month is later than the first one (per example and ID 5). If difference is more than three months then we're good automatically.
But I'm also assuming that you would want to treat the distance from November 30th to February 28th (or 29th in a leap year) as a full three months because the end date falls on the final day of the month. By adjusting the end date by an extra day this is an easy scenario to snag as it will bump the date into the following month and increase the month difference by one as well. If that's not what you want then just remove the dateadd(day, 1, ...) portion and use only the raw CreatedDate value.
Your sample data is limited so I'm also making the assumption that the gaps are measured between consecutive dates. If you want to find blocks of runs that don't span more than three months across the set, then that's a different problem and you should clarify with more information.
Since you've indicated that you're probably on SQL Server 2008 you'll have to do without the lag() function. Although the first query could be adjusted for that it's likely easier to go with the second approach at the end.
with diffs as (
select
ID,
row_number() over (partition by ID order by CreatedDate) as RN,
case when
datediff(
month,
lag(CreatedDate, 1) over (partition by ID order by CreatedDate),
CreatedDate
) = 3
and
datepart(
day,
lag(CreatedDate, 1) over (partition by ID order by CreatedDate)
) <= datepart(day, CreatedDate)
or
datediff(
month,
lag(CreatedDate, 1) over (partition by ID order by CreatedDate),
/* adding one day to handle gaps like Nov30 - Feb28/29 and Jan31 - Apr30 */
dateadd(day, 1, CreatedDate)
) >= 4
then 1
else 0
end as GapFlag
from <T> /* <--- your table name here */
), gaps as (
select
ID, RN,
sum(1 + GapFlag) over (partition by ID order by RN) as Counter
from diffs
)
select ID, count(distinct Counter - RN) as "Count"
from gaps
group by ID
The rest of the logic is a typical gaps and islands scenario looking for holes in the sum(1 + GapFlag) sequence, with the offset of 1 acting pretty much like row_number().
http://sqlfiddle.com/#!6/61b12/3
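For anyone who wants to see the Counter - RN arithmetic in isolation, here is a rough sketch on hand-made flags. The table variable @diffs and its five rows are assumptions standing in for the output of the diffs CTE above.

```sql
-- Hypothetical flags for one ID: gaps are detected at rows 3 and 5.
DECLARE @diffs TABLE (ID int, RN int, GapFlag int);
INSERT INTO @diffs (ID, RN, GapFlag)
VALUES (1, 1, 0), (1, 2, 0), (1, 3, 1), (1, 4, 0), (1, 5, 1);

WITH gaps AS (
    SELECT ID, RN, GapFlag,
           -- grows by 1 on ordinary rows and by 2 on rows that follow a gap
           SUM(1 + GapFlag) OVER (PARTITION BY ID ORDER BY RN) AS Counter
    FROM @diffs
)
SELECT ID, RN, GapFlag, Counter,
       Counter - RN AS Island   -- stays constant until the next gap
FROM gaps
ORDER BY ID, RN;
-- RN  GapFlag  Counter  Island
-- 1   0        1        0
-- 2   0        2        0
-- 3   1        4        1
-- 4   0        5        1
-- 5   1        7        2
```

Counter - RN is simply the running total of GapFlag, so COUNT(DISTINCT Counter - RN) returns 3 for this ID: the first row plus one for each of the two gaps.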
JamieD77's approach is also valid. I was originally thinking your problem involved more than looking at the rows in sequence. Here's how I would tweak it for the gap definition I've been running with:
with data as (
select ID, CreatedDate, row_number() over (partition by ID order by CreatedDate) as RN
from T
)
select d1.ID, count(*) as "Count"
from data d1 left outer join data d0
on d0.ID = d1.ID and d0.RN = d1.RN - 1 /* connect to the one before */
where
datediff(month, d0.CreatedDate, d1.CreatedDate) = 3
and datepart(day, d0.CreatedDate) <= datepart(day, d1.CreatedDate)
or datediff(month, d0.CreatedDate, dateadd(day, 1, d1.CreatedDate)) >= 4
or d0.ID is null
group by d1.ID
Edit: You have changed the question since yesterday.
Change this line in the first query to include the total count:
...
select count(*) as TotalCnt, ID, count(distinct Counter - RN) as GapCount
...
Second would look like:
with data as (
select ID, CreatedDate, row_number() over (partition by ID order by CreatedDate) as RN
from T
)
select
count(*) as TotalCnt, d1.ID,
count(case when
datediff(month, d0.CreatedDate, d1.CreatedDate) = 3
and datepart(day, d0.CreatedDate) <= datepart(day, d1.CreatedDate)
or datediff(month, d0.CreatedDate, dateadd(day, 1, d1.CreatedDate)) >= 4
or d0.ID is null then 1 end
) as GapCount
from data d1 left outer join data d0
on d0.ID = d1.ID and d0.RN = d1.RN - 1 /* connect to the one before */
group by d1.ID