How to find sum of values of previous dates - sql-server

I have 3 columns in Invoice table.
InvoicePeriod
InvoiceType
Fees
I have data like this:
InvoicePeriod InvoiceType Fees
2020-06-30 ABC 10.0
2020-06-30 ABC 40.0
2020-06-30 ABC 32.0
2020-09-30 ABC 5.0
2020-09-30 XYZ 30.0
2020-12-31 ABC 20.0
2020-12-31 ABC 10.0
2021-01-31 XYZ 60.0
2021-02-01 DEF 36.0
Now I want the last (max) invoice period for each invoice type, together with the sum of the fees from that type's earlier dates.
Output:
InvoicePeriod InvoiceType Fees
2020-12-31 ABC 87.0
2021-01-31 XYZ 30.0
2021-02-01 DEF 0.0
How can I achieve this?
Thanks,
Ankit

You want to group by InvoiceType (since you want one row per type) and you want the aggregate functions max and sum to combine values within those groups.
So
SELECT MAX(InvoicePeriod), InvoiceType, SUM(Fees)
FROM mytable
GROUP BY InvoiceType
Edited to exclude the fees that match the max date, now that I understand the problem better:
SELECT t2.MaxPeriod AS InvoicePeriod, t2.InvoiceType, SUM(CASE WHEN t1.InvoicePeriod < t2.MaxPeriod THEN t1.Fees ELSE 0 END) AS Fees
FROM test t1 INNER JOIN
(
SELECT MAX(InvoicePeriod) MaxPeriod, InvoiceType
FROM test
GROUP BY InvoiceType
) t2 ON t1.InvoiceType = t2.InvoiceType
GROUP BY t2.MaxPeriod, t2.InvoiceType
There are different ways of doing this, but I think the above does what you want so you could build off of it. The inner query gets the max InvoicePeriod for each InvoiceType. The outer query uses that and also sums the Fees when the date is less than the max for that group.
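To try the idea against the sample rows from the question, here is a minimal sketch. The temp table name #Invoice and the column types are assumptions. Instead of the self-join, it tags each row with its type's latest period using MAX() OVER, which is a swapped-in variant of the same conditional sum rather than the query above.

```sql
-- Hypothetical temp table holding the question's sample rows
-- (DROP TABLE IF EXISTS needs SQL Server 2016 or later).
DROP TABLE IF EXISTS #Invoice;
CREATE TABLE #Invoice (
    InvoicePeriod date,
    InvoiceType   varchar(10),
    Fees          decimal(10, 1)
);
INSERT INTO #Invoice (InvoicePeriod, InvoiceType, Fees) VALUES
('2020-06-30', 'ABC', 10.0),
('2020-06-30', 'ABC', 40.0),
('2020-06-30', 'ABC', 32.0),
('2020-09-30', 'ABC', 5.0),
('2020-09-30', 'XYZ', 30.0),
('2020-12-31', 'ABC', 20.0),
('2020-12-31', 'ABC', 10.0),
('2021-01-31', 'XYZ', 60.0),
('2021-02-01', 'DEF', 36.0);

-- Tag every row with the latest period of its type, then sum only
-- the fees that fall strictly before that latest period.
WITH tagged AS (
    SELECT InvoicePeriod, InvoiceType, Fees,
           MAX(InvoicePeriod) OVER (PARTITION BY InvoiceType) AS MaxPeriod
    FROM #Invoice
)
SELECT MaxPeriod AS InvoicePeriod,
       InvoiceType,
       SUM(CASE WHEN InvoicePeriod < MaxPeriod THEN Fees ELSE 0 END) AS Fees
FROM tagged
GROUP BY InvoiceType, MaxPeriod
ORDER BY MaxPeriod;
```

With the question's data this gives 87.0 for ABC, 30.0 for XYZ and 0.0 for DEF, which matches the expected output.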

I think this is what you're looking for.
SELECT
MAX(in_main.InvoicePeriod) AS InvoicePeriod
, InvoiceType
/* Subtract out fees on last invoice date*/
, SUM(Fees) - (
SELECT COALESCE(SUM(Fees), 0)
FROM Invoice in_sub
WHERE (
in_sub.InvoiceType = in_main.InvoiceType
AND
in_sub.InvoicePeriod = MAX(in_main.InvoicePeriod)
)
) AS Fees
FROM Invoice in_main
GROUP BY InvoiceType
http://sqlfiddle.com/#!18/288adf/2/0

Steps:
Aggregate per period and type.
Get the sum of a type's previous periods.
Use TOP WITH TIES in combination with ROW_NUMBER in order to keep all types' last periods.
The query:
select top(1) with ties
invoiceperiod,
invoicetype,
coalesce(sum(sum(fees)) over (
partition by invoicetype
order by invoiceperiod
rows between unbounded preceding and 1 preceding
), 0.0) as sum_fees
from invoice
group by invoiceperiod, invoicetype
order by row_number() over (partition by invoicetype order by invoiceperiod desc);
Demo: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=6a651f85961b27be687026c2ce73c8f9

Related

Need help selecting a record between two date ranges?

I am trying to select a record between two date ranges, but I keep getting duplicate records when two date ranges overlap, as shown below.
Here is an example.
Policy Info
Policy #   Policy Effective Date   Policy Termination Date   Year
001        2018-10-01              2019-10-01                2018
002        2019-10-01              2020-10-01                2019
003        2020-10-01              2021-10-01                2020
004        2021-10-01              2022-10-01                2022
Policy Limit
LimitID   Effective Date   Termination Date   Limit
1         2018-10-01       2021-10-01         1000
2         2018-10-01       3000-01-01         2500
How can I select Limit ID 1 for policies 001, 002 and 003 (that is, for the years 2018, 2019 and 2020), and use Limit ID 2 for any policy with an effective date later than 2021-01-01?
I tried the following condition, but it keeps creating duplicates:
((limit.effective_from_date < policy.effective_to_date
AND limit.effective_to_date > policy.effective_from_date
)
OR
(limit.effective_from_date = policy.effective_from_date
AND limit.effective_to_date = CONVERT(datetime, '01/01/3000', 102)))
but the above condition creates duplicates. Is there an effective way of selecting a single record when the date ranges overlap?
Any help will be appreciated!
Your problem is that the Policy Limit periods overlap, so you need to choose one limit per policy. From what I understand of your data (and I'm inferring a lot), you want the first limit, ordered by effective date and then termination date, whose [Policy Limit].[Effective Date] is on or before the [Policy Info].[Policy Effective Date]
and whose [Policy Limit].[Termination Date] is on or after the [Policy Info].[Policy Termination Date].
If all my guessing is correct, you can do something like:
drop table if exists #PolicyInfo
drop table if exists #PolicyLimit
CREATE TABLE #PolicyInfo (
Policy INT,
Policy_Effective_Date DATE,
Policy_termination_date DATE,
[Year] int
)
CREATE TABLE #PolicyLimit(
LimitID INT,
Effective_Date DATE,
Termination_Date DATE,
Limit INT
)
INSERT INTO #PolicyInfo (Policy, Policy_Effective_Date, Policy_termination_date, [Year])
VALUES
(001, '2018-10-01', '2019-10-01', 2018),
(002, '2019-10-01', '2020-10-01', 2019),
(003, '2020-10-01', '2021-10-01', 2020),
(004, '2021-10-01', '2022-10-01', 2022)
INSERT INTO #PolicyLimit (LimitID, Effective_Date, Termination_Date, Limit)
VALUES
(1, '2018-10-01','2021-10-01',1000),
(2, '2018-10-01','3000-01-01',2500)
;with cte AS (
-- Join PolicyInfo with PolicyLimit, keeping only limits whose date range
-- covers the whole policy: both Policy_Effective_Date and
-- Policy_termination_date fall between pl.Effective_Date and pl.Termination_Date
SELECT *,
-- Number the matching limits within each Policy partition
ROW_NUMBER() OVER (PARTITION BY [pi].Policy ORDER BY pl.Effective_Date, pl.Termination_Date) rn
FROM #PolicyInfo [pi]
INNER JOIN #PolicyLimit pl ON
[pi].Policy_Effective_Date BETWEEN pl.Effective_Date AND pl.Termination_Date
AND [pi].Policy_termination_date BETWEEN pl.Effective_Date AND pl.Termination_Date
)
SELECT Policy, LimitID
FROM cte
WHERE rn = 1 -- Select the first Limit per partition
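The same "first matching limit per policy" rule can also be sketched with CROSS APPLY and TOP (1), reusing the temp tables created above. This is an alternative formulation under the same guessed matching rule, not a correction of the CTE version; the aliases p, pl and l are just illustrative.

```sql
-- For each policy, pick the first limit whose range covers the policy dates.
SELECT p.Policy, l.LimitID
FROM #PolicyInfo AS p
CROSS APPLY (
    SELECT TOP (1) pl.LimitID
    FROM #PolicyLimit AS pl
    WHERE p.Policy_Effective_Date   BETWEEN pl.Effective_Date AND pl.Termination_Date
      AND p.Policy_termination_date BETWEEN pl.Effective_Date AND pl.Termination_Date
    -- Same ordering as the ROW_NUMBER() in the CTE version
    ORDER BY pl.Effective_Date, pl.Termination_Date
) AS l
ORDER BY p.Policy;
```

Like the INNER JOIN, CROSS APPLY drops policies that have no covering limit; OUTER APPLY would keep them with a NULL LimitID. On the sample data, policies 1 to 3 get LimitID 1 and policy 4 gets LimitID 2.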

How to find ratio value between two consecutive dates

Given a table which consists of:
ID_User, Date
I'd like to find, for every two consecutive days, the ratio of the people who attended both day x and day x+1 to the people who attended day x.
Here is an example. Let's say the data is:
Bill 12155 2018-05-01
Jim 52135 2018-05-01
Homer 52135 2018-05-01
Jecki 56135 2018-05-01
Michael 45644 2018-05-02
Jim 52135 2018-05-02
Jessy 45645 2018-05-02
Homer 52135 2018-05-02
So the ratio would be 2/4 = 0.5
I tried resolving it on my own for the last day but had some struggles.
I started by grouping by date:
Select Date, ID_USER
GROUP BY DATE, ID_USER
ORDER BY DATE, ID_USER
Can someone please give me some pointers?
Thank you all!
Try this:
SELECT t1.[Date],
( CONVERT(decimal, SUM(CASE WHEN t2.[ID] IS NOT NULL THEN 1 ELSE 0 END) ) / COUNT(t1.[ID]) ) AS [Ratio]
FROM #YourTbl t1
LEFT OUTER JOIN #YourTbl t2 ON t2.[ID] = t1.[ID] AND t2.[Date] = DATEADD(DAY, 1, t1.[Date])
GROUP BY t1.[Date]
Group your data by the first date (in your sample, 2018-05-01).
Then self-join the table with a LEFT OUTER JOIN, so that t1 holds the full list of rows and t2 holds only the rows where the same user (based on ID) appears again on the next day (DATEADD(DAY, 1, ...)).
You can then tell whether a user attended two days in a row by checking any column of t2: if it is NULL, the user did not return the next day.
To get the ratio of users who attended on t1.[Date] and again on the next date, count the rows where t2.[ID] IS NOT NULL and divide that by the total count of users for that day in t1. Since SUM returns an INT in this case and integer division would truncate the result, CONVERT the SUM to DECIMAL and you will get a decimal number.
Here are the results for your sample data, after changing the ID of either Jim or Homer, since they originally shared the same ID:
Date Ratio
2018-05-01 0.50000000000
2018-05-02 0.00000000000
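To reproduce those numbers, a setup along the following lines should work. The table name follows the answer above; the Name column, the column types and Homer's replacement ID (52136) are assumptions made for this sketch. The ratio is written here as an average of 1.0/0.0 flags, which is just another way to spell the SUM divided by COUNT.

```sql
-- Hypothetical setup with the question's rows; Homer gets a distinct ID.
DROP TABLE IF EXISTS #YourTbl;
CREATE TABLE #YourTbl ([Name] varchar(20), [ID] int, [Date] date);
INSERT INTO #YourTbl ([Name], [ID], [Date]) VALUES
('Bill',    12155, '2018-05-01'),
('Jim',     52135, '2018-05-01'),
('Homer',   52136, '2018-05-01'),
('Jecki',   56135, '2018-05-01'),
('Michael', 45644, '2018-05-02'),
('Jim',     52135, '2018-05-02'),
('Jessy',   45645, '2018-05-02'),
('Homer',   52136, '2018-05-02');

-- A row of t1 has a match in t2 only if the same ID shows up the next day.
SELECT t1.[Date],
       AVG(CASE WHEN t2.[ID] IS NOT NULL THEN 1.0 ELSE 0.0 END) AS [Ratio]
FROM #YourTbl t1
LEFT OUTER JOIN #YourTbl t2
    ON t2.[ID] = t1.[ID]
   AND t2.[Date] = DATEADD(DAY, 1, t1.[Date])
GROUP BY t1.[Date]
ORDER BY t1.[Date];
```

Jim and Homer are the two of four users from 2018-05-01 who return on 2018-05-02, so the first ratio is 0.5; nobody can return after the last day, so the second is 0.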
The self-join solution is valid. You might try this approach as well:
with data as (
select "date",
case when dateadd(day, 1, "date") =
lead("date") over (partition by id order by "date")
then 1 end as returned
from T
)
select "date", count(returned) * 1. / count(*) as ratio
from data
group by "date";
If you want to eliminate the final date since it's always zero, you could easily add case when "date" <> max("date") over () then 1 end as notfinal and filter based on that.
https://rextester.com/HHL82126
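The filter on the final date mentioned above could be sketched as follows. It reuses the hypothetical table T and column names from that query, and only adds the notfinal flag and a WHERE clause.

```sql
with data as (
    select "date",
           -- 1 when the same id appears again on the following day
           case when dateadd(day, 1, "date") =
                     lead("date") over (partition by id order by "date")
                then 1 end as returned,
           -- 1 for every row except those on the latest date in the table
           case when "date" <> max("date") over ()
                then 1 end as notfinal
    from T
)
select "date", count(returned) * 1. / count(*) as ratio
from data
where notfinal = 1
group by "date";
```

The flag has to be computed inside the CTE because window functions are not allowed directly in a WHERE clause.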

How to get last record based on a date with some other aggregate column in a SQL query

I know this question has already been asked a hundred times, and I reviewed those answers, but I'm still stuck and have to ask for help.
I have table like this:
hivenumber Visitdate CombsNO WaxNo BeeBehave
------------------------------------------------
1 2017-11-10 10 2 4
2 2017-11-10 11 1 3
3 2017-11-10 12 3 3
1 2017-11-12 13 1 1
3 2017-11-11 14 5 2
At first I want to aggregate it by HiveNumber
Select HiveNumber
From tHivesDetails
Group BY HiveNumber
Then I want the last record of CombsNO for each HiveNumber
Select Top(1) CombsNO
From tHivesDetails
Order By VisitDate Desc
Then I need the sum of WaxNo for each HiveNumber
Select Sum(WaxNo)
From tHivesDetails
Group BY HiveNumber
and at the end I want average of BeeBehave
Select Avg(BeeBehave)
From tHivesDetails
Group By HiveNumber
I don't know how to combine these queries into one and get a single table with everything I need. I have read most of the similar questions but unfortunately couldn't figure out how to do it.
I want a result like this:
hivenumber Visitdate CombsNO WaxNo BeeBehave
------------------------------------------------
1 2017-11-12 13 Sum avg
2 2017-11-10 11 sum avg
3 2017-11-11 14 sum avg
"Window functions" to the rescue. You can use aggregate functions with an OVER clause to produce values on each row of a result. ROW_NUMBER() also takes an ORDER BY, so by ordering within each "partition" by the dates descending, the number 1 is given to the most recent visit (per hive, due to the partition).
select *
from (
Select *
, row_number() over(partition by HiveNumber order by VisitDate DESC) rn
, sum(WaxNo) over(partition by HiveNumber) sum_wax
, Avg(BeeBehave) over(partition by HiveNumber) avg_bb
From tHivesDetails
) d
where rn = 1
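As a runnable sketch of the approach above against the question's sample rows: the temp table name #tHivesDetails and the INT column types are assumptions. It also casts BeeBehave before averaging, because AVG over an INT column returns a truncated integer in SQL Server.

```sql
-- Hypothetical temp table with the sample data from the question.
DROP TABLE IF EXISTS #tHivesDetails;
CREATE TABLE #tHivesDetails (
    HiveNumber int,
    VisitDate  date,
    CombsNO    int,
    WaxNo      int,
    BeeBehave  int
);
INSERT INTO #tHivesDetails (HiveNumber, VisitDate, CombsNO, WaxNo, BeeBehave) VALUES
(1, '2017-11-10', 10, 2, 4),
(2, '2017-11-10', 11, 1, 3),
(3, '2017-11-10', 12, 3, 3),
(1, '2017-11-12', 13, 1, 1),
(3, '2017-11-11', 14, 5, 2);

SELECT HiveNumber, VisitDate, CombsNO, SumWaxNo, AvgBeeBehave
FROM (
    SELECT HiveNumber, VisitDate, CombsNO,
           -- 1 marks the most recent visit of each hive
           ROW_NUMBER() OVER (PARTITION BY HiveNumber ORDER BY VisitDate DESC) AS rn,
           SUM(WaxNo) OVER (PARTITION BY HiveNumber) AS SumWaxNo,
           -- cast first so the average keeps its fractional part
           AVG(CAST(BeeBehave AS decimal(10, 2))) OVER (PARTITION BY HiveNumber) AS AvgBeeBehave
    FROM #tHivesDetails
) AS d
WHERE rn = 1
ORDER BY HiveNumber;
```

For the sample data the WaxNo sums are 3, 1 and 8 for hives 1, 2 and 3, and the BeeBehave averages are 2.5, 3 and 2.5.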
Try this:
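-- tA: per-hive totals; tB: per-hive largest CombsNO.
-- Note: MAX(CombsNO) is the largest value, which equals the latest visit's
-- value only if CombsNO never decreases over time.
SELECT tA.HiveNumber, tA.WaxNoSum, tA.BeeBehaveAvg, tB.CombsNoLatest
FROM (SELECT HiveNumber, SUM(WaxNo) AS WaxNoSum, AVG(BeeBehave) AS BeeBehaveAvg
FROM tHivesDetails
GROUP BY HiveNumber) AS tA
LEFT JOIN (SELECT HiveNumber, MAX(CombsNO) AS CombsNoLatest
FROM tHivesDetails
GROUP BY HiveNumber) AS tB ON tA.HiveNumber = tB.HiveNumber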

SQL Server: update table with value from previous record

I have tried several ways using LAG(), ROW_NUMBER() and so on, but I cannot get it working... Please help.
Assume we have this table:
Date Time Amount Balance
---------------------------------------------
20171001 12:44:00 102.00 102.00
20171002 09:32:12 10.00 null
20171002 20:00:00 123.00 null
20171003 07:43:12 5.29 null
My goal is to update the Balance but these records are not ordered in this table.
I have tried to use this code:
with t1 as
(
select
Date, Time, Amount, Balance,
lag(Balance) over (order by Date, Time) Balance_old
from
table1
)
update table1
set Balance = Amount + Balance_old
where Balance_old is not null
However, this seems to only update 1 record instead of 3 in the above example. Even when I try to do something similar with ROW_NUMBER() then I do not get the results I require.
The results I would like to have are as follows:
Date Time Amount Balance
---------------------------------------------
20171001 12:44:00 102.00 102.00
20171002 09:32:12 10.00 112.00
20171002 20:00:00 123.00 235.00
20171003 07:43:12 5.29 240.29
Please notice: in my situation there is always a record which has a value in Balance. This is the starting point which can be 0 or <>0 (but not null).
One approach is simply to use the sum() over() window function to compute a running total.
-- set up
select *
into t1
from (
select cast('20171001' as date) Date1, cast('12:44:00' as time) Time1, 102.00 Amount, 102.00 Balance union all
select cast('20171002' as date), cast('09:32:12' as time), 10.00, null union all
select cast('20171002' as date), cast('20:00:00' as time), 123.00, null union all
select cast('20171003' as date), cast('07:43:12' as time), 5.29, null
) q
-- UPDATE statement
;with t2 as(
select date1
, time1
, amount
, balance
, sum(isnull(balance, amount)) over(order by date1, time1) as balance1
from t1
)
update t2
set balance = balance1
The result:
Date1 Time1 Amount Balance
---------- ---------------- ---------- -------------
2017-10-01 12:44:00.0000000 102.00 102.00
2017-10-02 09:32:12.0000000 10.00 112.00
2017-10-02 20:00:00.0000000 123.00 235.00
2017-10-03 07:43:12.0000000 5.29 240.29
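Before running an update like this, it can help to preview the computed values with a plain SELECT. The sketch below reuses the t1 table from the setup above and assumes it is still in its initial state, with only the starting row holding a non-null balance.

```sql
-- Preview the running balance without modifying t1.
-- ISNULL(balance, amount) takes the starting balance on the first row
-- and the amount on every row whose balance is still NULL.
SELECT date1, time1, amount, balance,
       SUM(ISNULL(balance, amount))
           OVER (ORDER BY date1, time1
                 ROWS UNBOUNDED PRECEDING) AS balance1
FROM t1
ORDER BY date1, time1;
```

The explicit ROWS frame only differs from the default frame when two rows share the same date1 and time1. Note that the calculation is not repeatable: once the balances have been filled in, running it again would add up balances instead of amounts.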

T-SQL Query to remove duplicate records in the output based on one particular column

I am running SQL Server 2014 and I have the following T-SQL query:
USE MYDATABASE
SELECT *
FROM RESERVATIONLIST
WHERE [MTH] IN ('JANUARY 2015','FEBRUARY 2015')
RESERVATIONLIST mentioned in the code above is a view. The query gives me the following output (extract):
ID NAME DOA DOD Nights Spent MTH
--------------------------------------------------------------------
251 AH 2015-01-12 2015-01-15 3 JANUARY 2015
258 JV 2015-01-28 2015-02-03 4 JANUARY 2015
258 JV 2015-01-28 2015-02-03 2 FEBRUARY 2015
The above output consists of around 12,000 records.
I need to modify my query so that it eliminates all duplicate IDs and gives me the following results:
ID NAME DOA DOD Nights Spent MTH
--------------------------------------------------------------------
251 AH 2015-01-12 2015-01-15 3 JANUARY 2015
258 JV 2015-01-28 2015-02-03 4 JANUARY 2015
I tried something like this, but it's not working:
USE MYDATABASE
SELECT *
FROM RESERVATIONLIST
WHERE [MTH] IN ('JANUARY 2015', 'FEBRUARY 2015')
GROUP BY [ID]
HAVING COUNT ([MTH]) > 1
The following query will return one row per ID:
SELECT * FROM
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY ID ORDER BY (SELECT NULL)) rn FROM RESERVATIONLIST
WHERE [MTH] IN ('JANUARY 2015','FEBRUARY 2015')
) T
WHERE rn = 1
Note: this returns an arbitrary row when several rows share the same ID. If you want a specific row, you have to define that in the ORDER BY. For example:
SELECT * FROM
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY ID ORDER BY DOA DESC) rn FROM RESERVATIONLIST
WHERE [MTH] IN ('JANUARY 2015','FEBRUARY 2015')
) T
WHERE rn = 1
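This returns the row with the latest DOA for each ID. If several rows share that DOA (as the two rows for ID 258 do), add another column to the ORDER BY to break the tie.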
You are trying to use GROUP BY, which IMHO is the right way to go. You should group by all the columns that stay constant and roll up the others. Depending on the values of DOD and DOA, I can see two solutions. Note that MTH is text, so min(MTH) and max(MTH) compare alphabetically rather than chronologically:
SELECT ID,NAME,DOA,DOD,SUM([Nights Spent]) as Nights,
min(MTH) as firstRes, max(MTH) as lastRes
FROM RESERVATIONLIST
GROUP BY ID,NAME,DOA,DOD
OR
SELECT ID,NAME,min(DOA) as firstDOA,max(DOD) as lastDOD,SUM([Nights Spent]) as Nights,
min(MTH) as firstRes, max(MTH) as lastRes
FROM RESERVATIONLIST
GROUP BY ID,NAME
