Write a query that considers a date interval - sql-server

I have a table that contains customer transactions.
I need to find customers who had at least 2 transactions with Amount >= 20000 within three consecutive days in each month.
For example, if today is 2022/03/12, I should gather the transactions from 2022/02/13 to 2022/03/12, then check that data to see whether a customer had at least 2 transactions with Amount >= 20000 within three consecutive days.
For example, consider the table below:
Id CustomerId Transactiondate Amount
1 1 2022-01-01 50000
2 2 2022-02-01 20000
3 3 2022-03-05 30000
4 3 2022-03-07 40000
5 2 2022-03-07 20000
6 4 2022-03-07 30000
7 4 2022-03-07 30000
The output should be: CustomerId = 3 and CustomerId = 4.
I wrote a query that finds such customers for a specific day, but I don't know how to find them across a whole month without using a loop.
The query for a specific day is:
WITH cte AS (
    -- transactions in the three-day window ending on the chosen day
    SELECT CustomerId, Amount, TransactionDate
    FROM [Transaction]  -- TRANSACTION is a reserved word, so it needs brackets
    WHERE TransactionDate BETWEEN DATEADD(day, -2, '2022-03-12') AND '2022-03-12'
)
SELECT CustomerId, COUNT(*)
FROM cte
WHERE Amount >= 20000
GROUP BY CustomerId
HAVING COUNT(*) >= 2

Hi, there are many ways to achieve this.
I think the easiest (though maybe not the best-performing) is to use the LAG function:
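WITH lagged_days AS (
    SELECT
        -- nearest neighbouring transaction date: the previous one,
        -- or the next one for a customer's first row
        ISNULL(LAG(Transactiondate) OVER(PARTITION BY CustomerID ORDER BY id),
               LEAD(Transactiondate) OVER(PARTITION BY CustomerID ORDER BY id)) lagged_dt
        ,*
    FROM [Transaction]  -- bracketed because TRANSACTION is a reserved word
), valid_cust_base AS (
    SELECT
        *
    FROM lagged_days
    WHERE DATEPART(MONTH, lagged_dt) = DATEPART(MONTH, Transactiondate)
      -- ABS because lagged_dt may fall before (LAG) or after (LEAD) Transactiondate
      AND ABS(DATEDIFF(day, Transactiondate, lagged_dt)) <= 3
      AND Amount >= 20000
)
SELECT
    CustomerID
FROM valid_cust_base
GROUP BY CustomerID
HAVING COUNT(*) >= 2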
First I created the lagged TransactionDate per customer (I assume that id is incremental). Then I selected only transactions within the same month, with Amount >= 20000, and where the date difference between transactions is less than 4 days. Finally, I select the customers who have more than 1 such transaction.
With LAG, the first value per customer is always missing (NULL), but you still need to be able to say that the 1st and 2nd transactions are within 3 days of each other. That's why I replace the first NULL value with LEAD. It doesn't matter whether you use:
ISNULL(LAG(Transactiondate) OVER(PARTITION BY CustomerID ORDER BY id),
LEAD(Transactiondate) OVER(PARTITION BY CustomerID ORDER BY id)) lagged_dt
OR
ISNULL(LEAD(Transactiondate) OVER(PARTITION BY CustomerID ORDER BY id),
LAG(Transactiondate) OVER(PARTITION BY CustomerID ORDER BY id)) lagged_dt
The main goal is to have, for each transaction, the closest TransactionDate.
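A variation on the same idea can be sketched as follows. This is only a sketch under stated assumptions, not the answer's own query: it filters to qualifying transactions first, orders by date rather than id, and reads "three consecutive days" as a date difference of at most 2. The table variable @Transactions and the variables @StartDate and @EndDate are stand-ins introduced for illustration.

```sql
-- Sample rows from the question; @Transactions is a hypothetical stand-in table.
DECLARE @Transactions TABLE (
    Id int, CustomerId int, TransactionDate date, Amount int
);
INSERT INTO @Transactions VALUES
    (1, 1, '2022-01-01', 50000),
    (2, 2, '2022-02-01', 20000),
    (3, 3, '2022-03-05', 30000),
    (4, 3, '2022-03-07', 40000),
    (5, 2, '2022-03-07', 20000),
    (6, 4, '2022-03-07', 30000),
    (7, 4, '2022-03-07', 30000);

-- Assumed reporting window: the month ending "today" (2022-02-13 to 2022-03-12).
DECLARE @EndDate date = '2022-03-12';
DECLARE @StartDate date = DATEADD(day, 1, DATEADD(month, -1, @EndDate));

WITH qualifying AS (
    -- Keep only large transactions inside the window, then look up the
    -- previous qualifying transaction date of the same customer.
    SELECT CustomerId,
           TransactionDate,
           LAG(TransactionDate) OVER (PARTITION BY CustomerId
                                      ORDER BY TransactionDate, Id) AS PrevDate
    FROM @Transactions
    WHERE Amount >= 20000
      AND TransactionDate BETWEEN @StartDate AND @EndDate
)
SELECT DISTINCT CustomerId
FROM qualifying
-- PrevDate is NULL for a customer's first row, so that row is filtered out here.
WHERE DATEDIFF(day, PrevDate, TransactionDate) <= 2;
```

With the sample rows above this should return customers 3 and 4. Because the Amount filter runs before LAG, a small transaction sitting between two large ones does not hide the pair.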

Related

I would like the number '1000' to appear once only and then '0' for the remaining records until the next month appears - maybe a CASE expression?

I am using SQL Server 2018 and have a record set in which the first of every month appears multiple times. I would like the number '1000' to appear once per month and '0' for the remaining records until the next month appears, perhaps with a CASE expression or a PARTITION BY / ORDER BY window? The table below shows how I would like the data to appear.
Many Thanks :)
Date Amount
01/01/2022 1000
01/01/2022 0
01/01/2022 0
01/02/2022 1000
01/02/2022 0
01/02/2022 0
01/03/2022 1000
01/03/2022 0
Solution for your problem:
WITH CT1 AS
(
    SELECT *,
           ROW_NUMBER() OVER(PARTITION BY YEAR([Date]), MONTH([Date]) ORDER BY [Date]) AS rn
    FROM your_table
)
SELECT [Date],
       CASE WHEN rn = 1 THEN 1000 ELSE 0 END AS Amount
FROM CT1;
Given just a list of dates, you could use ROW_NUMBER and a conditional expression to arbitrarily assign a value of 1000 to one row of each month:
select *,
  Iif(Row_Number() over(partition by Year([date]), Month([date]) order by (select null)) = 1, 1000, 0) Amount
from t
order by [date], Amount desc;

Choosing distinct ID with differing column values

Let's say I have this query:
SELECT id, date, amount, cancelled
FROM transactions
Which gives me the following results:
id date amount cancelled
1 01/2019 25.10 0
1 02/2019 19.55 1
1 06/2019 20.33 0
2 10/2019 11.00 0
If there are duplicate IDs, how can I get the one with the latest date? So it would look like this:
id date amount cancelled
1 06/2019 20.33 0
2 10/2019 11.00 0
One method is with ROW_NUMBER and a common table expression like this example. In a multi-statement batch, be mindful to terminate the preceding statement with a semi-colon to avoid parsing errors.
WITH data_with_date_sequence AS (
SELECT
id
, date
, amount
, cancelled
, ROW_NUMBER() OVER(PARTITION BY id ORDER BY date DESC) AS seq
FROM dbo.SomeTable
)
SELECT
id
, date
, amount
, cancelled
FROM data_with_date_sequence
WHERE seq = 1;
One option is to use the ROW_NUMBER function, partitioning the rows by id and ordering them by date (latest first) within each id:
;WITH max_dates AS (
SELECT id
, date
, amount
, cancelled
, ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) AS Position
FROM transactions
)
SELECT * FROM max_dates WHERE Position = 1

SQL Server: How to get a rolling sum over 3 days for different customers within same table

This is the input table:
Customer_ID Date Amount
1 4/11/2014 20
1 4/13/2014 10
1 4/14/2014 30
1 4/18/2014 25
2 5/15/2014 15
2 6/21/2014 25
2 6/22/2014 35
2 6/23/2014 10
There is information pertaining to multiple customers and I want to get a rolling sum across a 3 day window for each customer.
The solution should be as below:
Customer_ID Date Amount Rolling_3_Day_Sum
1 4/11/2014 20 20
1 4/13/2014 10 30
1 4/14/2014 30 40
1 4/18/2014 25 25
2 5/15/2014 15 15
2 6/21/2014 25 25
2 6/22/2014 35 60
2 6/23/2014 10 70
The biggest issue is that I don't have transactions for every day, so partitioning by row number doesn't work.
The closest example I found on SO was:
SQL Query for 7 Day Rolling Average in SQL Server
but even in that case there were transactions every day, which accommodated the ROW_NUMBER()-based solutions.
The ROW_NUMBER query is as follows:
select customer_id, Date, Amount,
Rolling_3_day_sum = CASE WHEN ROW_NUMBER() OVER (partition by customer_id ORDER BY Date) > 2
THEN SUM(Amount) OVER (partition by customer_id ORDER BY Date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
END
from #tmp_taml9
order by customer_id
I was wondering if there is a way to replace "BETWEEN 2 PRECEDING AND CURRENT ROW" with "BETWEEN [DATE - 2] and [DATE]".
One option would be to use a calendar table (or something similar) to get the complete range of dates and left join your table with that and use the row_number based solution.
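The calendar idea could be sketched roughly like this. It is an illustration under assumptions, not a tested production query: the table variable @t stands in for #tmp_taml9, the calendar is built with a recursive CTE rather than a permanent calendar table, and each customer is assumed to have at most one transaction per day.

```sql
-- Sample rows from the question; @t is a hypothetical stand-in for #tmp_taml9.
DECLARE @t TABLE (Customer_ID int, [Date] date, Amount int);
INSERT INTO @t VALUES
    (1, '2014-04-11', 20), (1, '2014-04-13', 10),
    (1, '2014-04-14', 30), (1, '2014-04-18', 25),
    (2, '2014-05-15', 15), (2, '2014-06-21', 25),
    (2, '2014-06-22', 35), (2, '2014-06-23', 10);

WITH bounds AS (
    -- First and last transaction date per customer.
    SELECT Customer_ID, MIN([Date]) AS FirstDate, MAX([Date]) AS LastDate
    FROM @t
    GROUP BY Customer_ID
),
calendar AS (
    -- Recursive CTE: one row per customer per day between those bounds.
    SELECT Customer_ID, FirstDate AS [Date], LastDate
    FROM bounds
    UNION ALL
    SELECT Customer_ID, DATEADD(day, 1, [Date]), LastDate
    FROM calendar
    WHERE [Date] < LastDate
),
daily AS (
    -- Days without a transaction appear with Amount = NULL, so a
    -- 3-row frame now corresponds to a 3-day window.
    SELECT c.Customer_ID, c.[Date], t.Amount,
           SUM(t.Amount) OVER (PARTITION BY c.Customer_ID
                               ORDER BY c.[Date]
                               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS Rolling_3_Day_Sum
    FROM calendar AS c
    LEFT JOIN @t AS t
      ON t.Customer_ID = c.Customer_ID
     AND t.[Date] = c.[Date]
)
SELECT Customer_ID, [Date], Amount, Rolling_3_Day_Sum
FROM daily
WHERE Amount IS NOT NULL          -- drop the filler days again
ORDER BY Customer_ID, [Date]
OPTION (MAXRECURSION 0);          -- lift the default limit of 100 recursion levels
```

The window sum has to be computed before the filler days are filtered out, which is why it sits in its own CTE. For long date ranges a permanent calendar table would likely be cheaper than the recursive CTE.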
Another option that might work (not sure about performance) would be to use an apply query like this:
select customer_id, Date, Amount, coalesce(Rolling_3_day_sum, Amount) Rolling_3_day_sum
from #tmp_taml9 t1
cross apply (
    select sum(amount) Rolling_3_day_sum
    from #tmp_taml9
    where Customer_ID = t1.Customer_ID
      -- rows dated from two days before t1.Date up to t1.Date (a 3-day window)
      and datediff(day, date, t1.date) <= 2
      and t1.Date >= date
) o
order by customer_id;
I suspect performance might not be great though.

Difference of two months expenses category wise in sql

I am trying to find the difference of expenses of previous and current month in sql.
I have a table like this
Date Amount Category
2/18/2015 100 Salary
2/12/2015 150 Rent
2/21/2015 200 Allowances
1/4/2015 200 Salary
1/17/2015 50 Rent
1/20/2015 100 Allowances
Now I want a result like this
Category CurrentMonthAmount PreviousMonthAmount Difference
Salary 100 200 100
Rent 150 50 100
Allowances 200 100 100
Try using conditional aggregation:
;WITH cte
AS (SELECT Category,
-- Datediff(month, ...) counts month boundaries, so this also works across a year change
Max(CASE WHEN Datediff(month, [date], Getdate()) = 0 THEN amount END) CurrentMonthAmount,
Max(CASE WHEN Datediff(month, [date], Getdate()) = 1 THEN amount END) PreviousMonthAmount
FROM Yourtable
GROUP BY Category)
SELECT Category,
CurrentMonthAmount,
PreviousMonthAmount,
[Difference]=Abs(CurrentMonthAmount - PreviousMonthAmount)
FROM cte
Are you trying to do the computation in SQL only, or via a script such as PHP? Also, can you say whether you are selecting specific records (specific rows, I mean) for this operation? Please give some more clarification.

Custom Query Output Ordering with a CASE function

I often find when I am pulling data for analysis, that I group the number of orders a customer has placed into ranges, such as:
1-2
3-5
6-9
10-12
13-15
I do this with a CASE function. However, when you get the query results, the order ranges will be listed like:
1-2
10-12
13-15
3-5
6-9
This is easy to correct in Excel when you have 1 query and a few order range groups. However, when you're pulling many queries, it's a pain to correct this over and over.
What is the best way to pull a range and have it ordered correctly?
Here's an example of the query I would write:
SELECT
OrderRange = CASE
WHEN COUNT(OrderID) BETWEEN 1 AND 5 THEN '1-5'
WHEN COUNT(OrderID) BETWEEN 6 AND 10 THEN '6-10'
WHEN COUNT(OrderID) > 10 THEN '10+'
ELSE 'Error'
END
FROM Orders
GROUP BY CASE
WHEN COUNT(OrderID) BETWEEN 1 AND 5 THEN '1-5'
WHEN COUNT(OrderID) BETWEEN 6 AND 10 THEN '6-10'
WHEN COUNT(OrderID) > 10 THEN '10+'
ELSE 'Error'
END
ORDER BY... ?
I'd keep a table of ranges, e.g. (indices not written)
CREATE TABLE Ranges (RangeSet int, MinVal int, MaxVal int, Name varchar(50));
and then e.g.
INSERT INTO ranges VALUES
(1,1,5,'1-5'),(1,6,10,'6-10'),(1,11,-1,'11+'),
(2,1,10,'1-10'),(2,11,20,'11-20'),(2,21,30,'21-30'),(2,31,-1,'31+');
You get the idea. Now you do something like this (table and field names are made up):
SELECT
CustomerID,
count(OrderID) AS OrderCount
FROM Orders
WHERE <whatever, e.g order_date BETWEEN ... AND ...>
GROUP BY CustomerID
HAVING count(OrderID)>0
as you would normally expect, but wrap it in an outer query that joins to the Ranges table:
SELECT
BaseView.CustomerID as CustomerID,
Ranges.Name as OrderRange
FROM (
SELECT
CustomerID,
count(OrderID) AS OrderCount
FROM Orders
WHERE <whatever, e.g order_date BETWEEN ... AND ...>
GROUP BY CustomerID
HAVING count(OrderID)>0
) AS BaseView
INNER JOIN Ranges ON
Ranges.RangeSet=<id-of-required-rangeset>
AND BaseView.OrderCount>=Ranges.MinVal
AND (BaseView.OrderCount<=Ranges.MaxVal OR Ranges.MaxVal=-1)
ORDER BY Ranges.MinVal DESC
;
Now you just have to supply the RangeSet you want to apply, maybe creating a new one on occasion.
Disclaimer: This is a performance-killer
If I'm understanding you correctly, you want the list of customers and order ranges ordered from lowest to highest. You should be able to do that by just ordering by COUNT(OrderID):
SELECT CustomerID,
OrderRange = CASE
WHEN COUNT(OrderID) BETWEEN 1 AND 5 THEN '1-5'
WHEN COUNT(OrderID) BETWEEN 6 AND 10 THEN '6-10'
WHEN COUNT(OrderID) > 10 THEN '10+'
ELSE 'Error'
END
FROM Orders
GROUP BY CustomerID
order by count(orderid)
Results:
CustomerId OrderRange
CENTC 1-5
GROSR 1-5
LAZYK 1-5
...
ROMEY 1-5
VINET 1-5
ALFKI 6-10
CACTU 6-10
...
VICTE 6-10
WANDK 6-10
BLONP 10+
GREAL 10+
RICAR 10+
...
QUICK 10+
ERNSH 10+
SAVEA 10+
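If the goal is one row per range rather than one row per customer, a numeric sort key can be carried alongside the label so that ORDER BY does not depend on the text. The following is a sketch under assumptions: @Orders is a made-up stand-in for the Orders table, RangeStart is a hypothetical helper column, and the top bucket is labelled '11+' here.

```sql
-- Hypothetical sample data: A has 2 orders, B has 6, C has 1.
DECLARE @Orders TABLE (OrderID int, CustomerID varchar(10));
INSERT INTO @Orders VALUES
    (1, 'A'), (2, 'A'),
    (3, 'B'), (4, 'B'), (5, 'B'), (6, 'B'), (7, 'B'), (8, 'B'),
    (9, 'C');

WITH per_customer AS (
    -- Aggregate first, so the CASE below works on a plain column.
    SELECT CustomerID, COUNT(OrderID) AS OrderCount
    FROM @Orders
    GROUP BY CustomerID
),
ranged AS (
    SELECT CustomerID,
           CASE WHEN OrderCount BETWEEN 1 AND 5  THEN '1-5'
                WHEN OrderCount BETWEEN 6 AND 10 THEN '6-10'
                ELSE '11+' END AS OrderRange,
           -- Numeric lower bound of the range, used only for sorting.
           CASE WHEN OrderCount BETWEEN 1 AND 5  THEN 1
                WHEN OrderCount BETWEEN 6 AND 10 THEN 6
                ELSE 11 END AS RangeStart
    FROM per_customer
)
SELECT OrderRange, COUNT(*) AS Customers
FROM ranged
GROUP BY OrderRange, RangeStart
ORDER BY RangeStart;
```

Grouping by both the label and its sort key keeps the query valid while letting the output follow numeric order ('1-5' before '6-10' before '11+') instead of string order.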
