Cumulative Count of NULL restarting at NOT NULL - sql-server

I would like to add a column indicating the number of invites a person received before they accepted. The idea is to incrementally count the rows with a NULL ACCEPTED_DATE that precede a non-NULL one, partitioning by PERSON_ID and ordering by INVITED_DATE.
My table has the following format:
| UNIQUE_ID | PERSON_ID | INVITED_DATE | ACCEPTED_DATE |
| 12345 | 567 | 12-01-18 | NULL |
| 12346 | 567 | 12-02-18 | NULL |
| 12347 | 567 | 12-03-18 | NULL |
| 12348 | 567 | 12-04-18 | 12-04-18 |
| 12349 | 567 | 12-05-18 | NULL |
| 12350 | 568 | 12-01-18 | NULL |
| 12351 | 568 | 12-02-18 | 12-02-18 |
The output should ideally look like the following:
| UNIQUE_ID | PERSON_ID | INVITED_DATE | ACCEPTED_DATE | INVITES_BEFORE_ACCEPT |
| 12345 | 567 | 12-01-18 | NULL | 1 |
| 12346 | 567 | 12-02-18 | NULL | 2 |
| 12347 | 567 | 12-03-18 | NULL | 3 |
| 12348 | 567 | 12-04-18 | 12-04-18 | 0 |
| 12349 | 567 | 12-05-18 | NULL | 1 |
| 12350 | 568 | 12-01-18 | NULL | 1 |
| 12351 | 568 | 12-02-18 | 12-02-18 | 0 |
So far I've tried a number of variations of ROW_NUMBER with OVER and PARTITION BY, but it looks like it will need to be an OUTER APPLY. The following OUTER APPLY counts over the data but doesn't restart the count after a successful accept:
SELECT t.*, invites.INVITES_BEFORE_ACCEPT
FROM [table] t
OUTER APPLY (
    SELECT COUNT(*) INVITES_BEFORE_ACCEPT
    FROM [table] t2
    WHERE t.PERSON_ID = t2.PERSON_ID AND t.INVITED_DATE < t2.ACCEPTED_DATE
) invites

One way would be
WITH t
     AS (SELECT *,
                -- Running count of accepts: every accepted row starts a new group
                COUNT(ACCEPTED_DATE)
                  OVER (
                    PARTITION BY PERSON_ID
                    ORDER BY INVITED_DATE) AS Grp
         FROM   [table])
SELECT *,
       -- Running count of NULL rows inside each group
       SUM(CASE
             WHEN ACCEPTED_DATE IS NULL
               THEN 1
             ELSE 0
           END)
         OVER (
           PARTITION BY PERSON_ID, Grp
           ORDER BY INVITED_DATE) AS INVITES_BEFORE_ACCEPT
FROM   t
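To make the grouping trick easier to follow, here is a minimal Python sketch of the same two passes, run row by row over the sample data from the question. It is not T-SQL and not part of the original answer; the function name `invites_before_accept` and the ISO date strings are assumptions made for illustration, and it assumes one invite per person per date.

```python
# Sample rows: (UNIQUE_ID, PERSON_ID, INVITED_DATE, ACCEPTED_DATE).
# Dates are rewritten as ISO strings (an assumption) so they sort correctly.
rows = [
    (12345, 567, "2018-12-01", None),
    (12346, 567, "2018-12-02", None),
    (12347, 567, "2018-12-03", None),
    (12348, 567, "2018-12-04", "2018-12-04"),
    (12349, 567, "2018-12-05", None),
    (12350, 568, "2018-12-01", None),
    (12351, 568, "2018-12-02", "2018-12-02"),
]


def invites_before_accept(rows):
    """Return {UNIQUE_ID: count}, mimicking the two window functions."""
    result = {}
    accepted_so_far = {}  # per person: running COUNT(ACCEPTED_DATE), i.e. Grp
    nulls_in_group = {}   # per (person, Grp): running SUM of the NULL flag
    for uid, person, invited, accepted in sorted(rows, key=lambda r: (r[1], r[2])):
        if accepted is not None:
            # An accepted row bumps the running count, which opens a new group.
            accepted_so_far[person] = accepted_so_far.get(person, 0) + 1
        grp = (person, accepted_so_far.get(person, 0))
        if accepted is None:
            nulls_in_group[grp] = nulls_in_group.get(grp, 0) + 1
        result[uid] = nulls_in_group.get(grp, 0)
    return result


counts = invites_before_accept(rows)
for uid, n in counts.items():
    print(uid, n)
```

Because the accepted row itself belongs to the new group, its count is 0 and the next NULL row starts again at 1, which is the restart behaviour asked for.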

Related

SQL Server Dynamic Resetting Running Balance

My current issue is that I have a running balance that needs to reset whenever it exceeds another value. It must not only reset, but also take that other value as its new starting value and continue the balance from there.
Below is the table with data in it:
+-------------+--------+---------------------+-------------------+--------------+-------------+
| Tran_DateSK | Amount | Running_AccountFees | Overlimit_Balance | Restart_Calc | Actual_Calc |
+-------------+--------+---------------------+-------------------+--------------+-------------+
| 20200217 | 39 | 39 | 3867.76 | 0 | 39 |
| 20200217 | 50 | 89 | 3867.76 | 0 | 89 |
| 20200316 | 39 | 128 | 4735.52 | 0 | 128 |
| 20200316 | 50 | 178 | 4735.52 | 0 | 178 |
| 20200324 | 50 | 228 | 2685.52 | 0 | 228 |
| 20200330 | 50 | 278 | 49.52 | 1 | 49.52 |
| 20200415 | 39 | 317 | 49.52 | 1 | 49.52 |
| 20200515 | 39 | 356 | 3917.28 | 0 | 88.52 |
| 20200515 | 50 | 406 | 3917.28 | 0 | 138.52 |
| 20200519 | 50 | 456 | 3467.28 | 0 | 188.52 |
| 20200604 | 50 | 506 | 3017.28 | 0 | 238.52 |
| 20200609 | 50 | 556 | 2167.28 | 0 | 288.52 |
| 20200611 | 50 | 606 | 49.28 | 1 | 49.28 |
| 20200615 | 39 | 645 | 3917.04 | 0 | 88.28 |
| 20200615 | 50 | 695 | 3917.04 | 0 | 138.28 |
| 20200616 | 50 | 745 | 3017.04 | 0 | 188.28 |
| 20200616 | 50 | 795 | 3017.04 | 0 | 238.28 |
| 20200619 | 50 | 845 | 2567.04 | 0 | 288.28 |
| 20200624 | 50 | 895 | 47.04 | 1 | 47.04 |
| 20200715 | 39 | 934 | 47.04 | 1 | 47.04 |
+-------------+--------+---------------------+-------------------+--------------+-------------+
Actual_Calc is the desired outcome and Running_AccountFees is the issue.
Running_AccountFees is the running balance of Amount, and Overlimit_Balance is the test: the running balance must not be greater than Overlimit_Balance.
If it is, take the Overlimit_Balance value and start calculating again by adding Amount onto it.
My query that produced this:
SELECT
[Transaction].ReportDateSK AS 'Tran_DateSK'
,[Transaction].AmountChange/100.00 AS 'Amount'
,SUM([Transaction].AmountChange/100.00)
OVER (PARTITION BY [Transaction].AccountSK
ORDER BY [Transaction].ReportDateSK
ROWS BETWEEN UNBOUNDED PRECEDING AND 0 PRECEDING) AS 'Running_AccountFees'
,[Summary].Overlimit_Balance AS 'Overlimit_Balance'
,CASE
WHEN SUM([Transaction].AmountChange/100.00)
OVER (PARTITION BY [Transaction].AccountSK
ORDER BY [Transaction].ReportDateSK
ROWS BETWEEN UNBOUNDED PRECEDING AND 0 PRECEDING) > [Summary].Overlimit_Balance
THEN 1
ELSE 0
END AS 'Restart_Calc'
,'' AS 'Actual_Calc'
FROM
Fact.[Transaction] [Transaction]
INNER JOIN Fact.AccountSummary [Summary] ON [Summary].DateSK = [Transaction].ReportDateSK
AND [Summary].AccountSK = [Transaction].AccountSK
AND [Summary].[Current] = 1
WHERE IsFeeTransaction = 1
AND [Transaction].AccountSK = 725
AND [Transaction].ReportDateSK BETWEEN 20200217 AND 20200730
I realised that the data in your question is essentially the source data, and I have been able to come up with the query below. It isn't exactly pretty, but it provides the correct output. Explanations of how it works are in the comments:
declare @t table(Tran_DateSK int, Amount decimal(10,2), Running_AccountFees int, Overlimit_Balance decimal(10,2), Restart_Calc bit, Actual_Calc decimal(10,2));
insert into @t values(20200217,39,39,3867.76,0,39),(20200217,50,89,3867.76,0,89),(20200316,39,128,4735.52,0,128),(20200316,50,178,4735.52,0,178),(20200324,50,228,2685.52,0,228),(20200330,50,278,49.52,1,49.52),(20200415,39,317,49.52,1,49.52),(20200515,39,356,3917.28,0,88.52),(20200515,50,406,3917.28,0,138.52),(20200519,50,456,3467.28,0,188.52),(20200604,50,506,3017.28,0,238.52),(20200609,50,556,2167.28,0,288.52),(20200611,50,606,49.28,1,49.28),(20200615,39,645,3917.04,0,88.28),(20200615,50,695,3917.04,0,138.28),(20200616,50,745,3017.04,0,188.28),(20200616,50,795,3017.04,0,238.28),(20200619,50,845,2567.04,0,288.28),(20200624,50,895,47.04,1,47.04),(20200715,39,934,47.04,1,47.04);
with t as
(
    select Tran_DateSK
          ,Amount
          -- Check if the Running_AccountFees are over the Overlimit_Balance
          ,case when sum(Amount) over (order by Tran_DateSK,Amount,Overlimit_Balance rows unbounded preceding) > Overlimit_Balance
                -- If so, check if the Running_AccountFees in the previous row were also over the Overlimit_Balance
                then case when (sum(Amount) over (order by Tran_DateSK,Amount,Overlimit_Balance rows unbounded preceding) - Amount) > lag(Overlimit_Balance,1,0) over (order by Tran_DateSK,Amount,Overlimit_Balance)
                          then 0 -- and in those instances this means multiple Restart_Calcs in a row, so set the Amount to zero as we don't want to increase the fees when calculating the Actual_Calc
                          else Amount
                     end
                else Amount
           end as Amount_Adj
          ,sum(Amount) over (order by Tran_DateSK,Amount,Overlimit_Balance rows unbounded preceding) as Running_AccountFees
          ,lag(Overlimit_Balance,1,0) over (order by Tran_DateSK,Amount,Overlimit_Balance) as Prev_Overlimit_Balance
          ,Overlimit_Balance
          ,case when sum(Amount) over (order by Tran_DateSK,Amount,Overlimit_Balance rows unbounded preceding) > Overlimit_Balance
                then 1
                else 0
           end as Restart_Calc
    from @t
)
,b as
(
    select *
          ,case when Running_AccountFees > Overlimit_Balance -- If this row is the first in a possible series of balance resets
                 and sum(Amount_Adj) over (order by Tran_DateSK,Amount,Overlimit_Balance rows between unbounded preceding and 1 preceding) <= Prev_Overlimit_Balance
                then Overlimit_Balance -- Take the Overlimit_Balance and subtract the *Adjusted* Running_AccountFees
                     - sum(Amount_Adj) over (order by Tran_DateSK,Amount,Overlimit_Balance rows between unbounded preceding and 1 preceding)
                     - Amount_Adj
                else 0
           end as Reset_Bal
    from t
)
select Tran_DateSK
      ,Amount
      ,Running_AccountFees
      ,Overlimit_Balance
      ,Restart_Calc
      -- For each *Adjusted* Running_AccountFees, apply the most negative Reset_Bal value, as this will contain the entire amount that needs to be reset from the current *Adjusted* Running_AccountFees to get the correct Balance_Calc
      ,sum(Amount_Adj) over (order by Tran_DateSK,Amount,Overlimit_Balance rows unbounded preceding)
       + min(Reset_Bal) over (order by Tran_DateSK,Amount,Overlimit_Balance rows unbounded preceding)
       as Balance_Calc
from b
order by Tran_DateSK;
Output
+-------------+--------+---------------------+-------------------+--------------+--------------+
| Tran_DateSK | Amount | Running_AccountFees | Overlimit_Balance | Restart_Calc | Balance_Calc |
+-------------+--------+---------------------+-------------------+--------------+--------------+
| 20200217 | 39.00 | 39.00 | 3867.76 | 0 | 39.00 |
| 20200217 | 50.00 | 89.00 | 3867.76 | 0 | 89.00 |
| 20200316 | 39.00 | 128.00 | 4735.52 | 0 | 128.00 |
| 20200316 | 50.00 | 178.00 | 4735.52 | 0 | 178.00 |
| 20200324 | 50.00 | 228.00 | 2685.52 | 0 | 228.00 |
| 20200330 | 50.00 | 278.00 | 49.52 | 1 | 49.52 |
| 20200415 | 39.00 | 317.00 | 49.52 | 1 | 49.52 |
| 20200515 | 39.00 | 356.00 | 3917.28 | 0 | 88.52 |
| 20200515 | 50.00 | 406.00 | 3917.28 | 0 | 138.52 |
| 20200519 | 50.00 | 456.00 | 3467.28 | 0 | 188.52 |
| 20200604 | 50.00 | 506.00 | 3017.28 | 0 | 238.52 |
| 20200609 | 50.00 | 556.00 | 2167.28 | 0 | 288.52 |
| 20200611 | 50.00 | 606.00 | 49.28 | 1 | 49.28 |
| 20200615 | 39.00 | 645.00 | 3917.04 | 0 | 88.28 |
| 20200615 | 50.00 | 695.00 | 3917.04 | 0 | 138.28 |
| 20200616 | 50.00 | 745.00 | 3017.04 | 0 | 188.28 |
| 20200616 | 50.00 | 795.00 | 3017.04 | 0 | 238.28 |
| 20200619 | 50.00 | 845.00 | 2567.04 | 0 | 288.28 |
| 20200624 | 50.00 | 895.00 | 47.04 | 1 | 47.04 |
| 20200715 | 39.00 | 934.00 | 47.04 | 1 | 47.04 |
+-------------+--------+---------------------+-------------------+--------------+--------------+
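As a cross-check on the expected numbers, the rule from the question can also be read as a simple row-by-row loop: add the amount, and if the balance then exceeds the limit, restart from the limit. The Python sketch below is only an illustration of that reading, not the set-based approach used in the answer; the function name `reset_running_balance` is hypothetical, and the capping rule is an assumption that happens to reproduce the Actual_Calc column for this sample.

```python
from decimal import Decimal

# (Amount, Overlimit_Balance) pairs in the row order shown in the question.
raw = [
    ("39", "3867.76"), ("50", "3867.76"), ("39", "4735.52"), ("50", "4735.52"),
    ("50", "2685.52"), ("50", "49.52"), ("39", "49.52"), ("39", "3917.28"),
    ("50", "3917.28"), ("50", "3467.28"), ("50", "3017.28"), ("50", "2167.28"),
    ("50", "49.28"), ("39", "3917.04"), ("50", "3917.04"), ("50", "3017.04"),
    ("50", "3017.04"), ("50", "2567.04"), ("50", "47.04"), ("39", "47.04"),
]
transactions = [(Decimal(a), Decimal(limit)) for a, limit in raw]


def reset_running_balance(transactions):
    """Accumulate amounts, restarting from the limit whenever it is exceeded."""
    balance = Decimal("0")
    balances = []
    for amount, limit in transactions:
        balance += amount
        if balance > limit:
            # Over the limit: the limit becomes the new starting value.
            balance = limit
        balances.append(balance)
    return balances


balances = reset_running_balance(transactions)
for b in balances:
    print(b)
```

Decimal is used instead of float so that values such as 49.52 compare exactly, mirroring the decimal(10,2) columns.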

How to find max(sortnumber) on item code in SQL Server?

I have the following SQL Server table ITEM:
+------------+-----------+------+--------+-----------+------------+
| Date | item_code | name | in/out | total_qty | SortNumber |
+------------+-----------+------+--------+-----------+------------+
| 08/07/2019 | 001 | A | -50 | 100 | 8 |
| 07/07/2019 | 001 | A | 50 | 100 | 7 |
| 06/07/2019 | 003 | C | 25 | 25 | 6 |
| 05/07/2019 | 001 | A | 50 | 50 | 5 |
| 04/07/2019 | 002 | B | 100 | 200 | 4 |
| 03/07/2019 | 003 | C | -25 | 0 | 3 |
| 02/07/2019 | 003 | C | 25 | 25 | 2 |
| 01/07/2019 | 002 | B | 100 | 100 | 1 |
+------------+-----------+------+--------+-----------+------------+
I've tried:
select item_code, max(SortNumber)
from ITEM
group by item_code
order by item_code asc
but I want this result:
+---------------------+-----------+------------------+
| Distinct(item_code) | Total_qty | Max(Sort_Number) |
+---------------------+-----------+------------------+
| 001 | 100 | 8 |
| 002 | 200 | 4 |
| 003 | 25 | 6 |
+---------------------+-----------+------------------+
Can anyone help me?
The query below gives you the desired result:
With cteItem as
(
    select item_code, total_qty, SortNumber,
           Row_Number() over (partition by item_code order by SortNumber desc) maxSortNumber
    from ITEM
)
select item_code, total_qty, SortNumber from cteItem where maxSortNumber = 1
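You just need to add max(total_qty) to your query. Note that this returns the largest total_qty per item, which matches the sample data but is not guaranteed to be the value on the row with the highest SortNumber:
select item_code, max(total_qty), max(SortNumber)
from ITEM
group by item_code
order by item_code asc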
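The difference between the two answers is easier to see outside SQL. The following Python sketch, using the sample rows from the question, picks the whole row with the highest SortNumber per item_code, which is what the ROW_NUMBER approach does. The function name `latest_per_item` is hypothetical and the tuple layout is an assumption made for the example.

```python
# (Date, item_code, name, in_out, total_qty, SortNumber) from the question.
items = [
    ("08/07/2019", "001", "A", -50, 100, 8),
    ("07/07/2019", "001", "A", 50, 100, 7),
    ("06/07/2019", "003", "C", 25, 25, 6),
    ("05/07/2019", "001", "A", 50, 50, 5),
    ("04/07/2019", "002", "B", 100, 200, 4),
    ("03/07/2019", "003", "C", -25, 0, 3),
    ("02/07/2019", "003", "C", 25, 25, 2),
    ("01/07/2019", "002", "B", 100, 100, 1),
]


def latest_per_item(rows):
    """Keep, for each item_code, the row with the highest SortNumber."""
    best = {}
    for row in rows:
        code, sort_number = row[1], row[5]
        if code not in best or sort_number > best[code][5]:
            best[code] = row
    # Report (item_code, total_qty, SortNumber), ordered by item_code.
    return [(code, best[code][4], best[code][5]) for code in sorted(best)]


latest = latest_per_item(items)
for line in latest:
    print(line)
```

Here total_qty is read from the winning row rather than aggregated separately, so it stays correct even when an older row holds a larger quantity.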

T-SQL: Values are grouped by month, if there is no value for a month the month should also appear and display "NULL"

I have a SQL query that displays turnover, stock and other values for stores, grouped by month. Naturally, if there is no value for a month, that month doesn't appear. The goal is for the empty months to appear as well and display NULL for the values. The months should range from the @FROM to the @TO parameter (201807 to 201907 in this case).
Before:
+-------+--------+----------+----------+-------+
| Store | Month | Incoming | Turnover | Stock |
+-------+--------+----------+----------+-------+
| 123 | 201810 | 5 | 4 | 1 |
| 123 | 201811 | 0 | 1 | 0 |
| 123 | 201901 | 25 | 5 | 20 |
| 123 | 201902 | 5 | 10 | 15 |
| 123 | 201903 | 8 | 9 | 14 |
| 123 | 201904 | 5 | 4 | 15 |
| 123 | 201905 | 10 | 5 | 20 |
+-------+--------+----------+----------+-------+
After:
+-------+--------+----------+----------+-------+
| Store | Month | Incoming | Turnover | Stock |
+-------+--------+----------+----------+-------+
| 123 | 201807 | NULL | NULL | NULL |
| 123 | 201808 | NULL | NULL | NULL |
| 123 | 201809 | NULL | NULL | NULL |
| 123 | 201810 | 5 | 4 | 1 |
| 123 | 201811 | 0 | 1 | 0 |
| 123 | 201812 | NULL | NULL | NULL |
| 123 | 201901 | 25 | 5 | 20 |
| 123 | 201902 | 5 | 10 | 15 |
| 123 | 201903 | 8 | 9 | 14 |
| 123 | 201904 | 5 | 4 | 15 |
| 123 | 201905 | 10 | 5 | 20 |
| 123 | 201906 | NULL | NULL | NULL |
| 123 | 201907 | NULL | NULL | NULL |
+-------+--------+----------+----------+-------+
I have absolutely no idea how to solve this and will thank you in advance for your help! :)
You can use a recursive CTE to build a calendar table with one row per store and month, then LEFT JOIN your data to it:
;WITH CTE AS (
    -- Anchor: the first month of the range for every store
    SELECT CAST(CAST(@FROM AS VARCHAR(10)) + '01' AS DATE) fromDt,
           CAST(CAST(@TO AS VARCHAR(10)) + '01' AS DATE) toDt,
           Store
    FROM (SELECT DISTINCT Store FROM #Test) t1
    UNION ALL
    -- Recursive step: add one month until the end of the range is reached
    SELECT DATEADD(MONTH,1,fromDt), toDt, Store
    FROM CTE
    WHERE DATEADD(MONTH,1,fromDt) <= toDt
)
SELECT FORMAT(fromDt,'yyyyMM') Month,
       c.Store,
       t.Incoming,
       t.Turnover,
       t.Stock
FROM CTE c
LEFT JOIN #Test t
       ON c.fromDt = CAST(CAST(t.Month AS VARCHAR(10)) + '01' AS DATE)
      AND c.Store = t.Store
-- Without an ORDER BY the row order is not guaranteed
ORDER BY c.Store, c.fromDt
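The calendar-then-outer-join idea is independent of SQL. As a rough illustration, the Python sketch below generates every YYYYMM value in the requested range and looks each one up in the existing data, filling the gaps with None. The helper name `month_range` and the dictionary layout are assumptions made for this example.

```python
def month_range(start, end):
    """Yield YYYYMM integers from start to end, inclusive."""
    year, month = divmod(start, 100)
    while year * 100 + month <= end:
        yield year * 100 + month
        month += 1
        if month > 12:
            year, month = year + 1, 1


# Existing rows keyed by (Store, Month) -> (Incoming, Turnover, Stock).
data = {
    (123, 201810): (5, 4, 1),
    (123, 201811): (0, 1, 0),
    (123, 201901): (25, 5, 20),
    (123, 201902): (5, 10, 15),
    (123, 201903): (8, 9, 14),
    (123, 201904): (5, 4, 15),
    (123, 201905): (10, 5, 20),
}

stores = sorted({store for store, _ in data})
empty = (None, None, None)

# "Left join": every calendar month appears, with None where no data exists.
filled = [
    (store, month) + data.get((store, month), empty)
    for store in stores
    for month in month_range(201807, 201907)
]

for row in filled:
    print(row)
```

The calendar drives the result, so a month is present whether or not the data contains it, which is the same role the recursive CTE plays on the left side of the join.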

I made stored procedure, but I don't know what to put on my WHERE clause to filter the null column

I wrote an INNER JOIN in a stored procedure, but I don't know what to put in my WHERE clause to filter out the rows that have NULL in a particular column, so that only the rows where that column is not NULL are shown.
CREATE PROCEDURE [dbo].[25]
    @param1 int
AS
    SELECT c.Name, c.Age, c2.Name, c2.Country
    FROM Cus C
    INNER JOIN Cus2 C2 ON c.id = c2.id
    WHERE c2.country is not null and c2.id = @param1
    Order by c2.Country
RETURN 0
ID 1
+-----+----+---------+---------+
| QID | ID | Name | Country |
+-----+----+---------+---------+
| 1 | 1 | Null | PH |
| 2 | 1 | Null | CN |
| 3 | 1 | Japhet | USA |
| 4 | 1 | Abegail | UK |
| 5 | 1 | Norlee | Ger |
+-----+----+---------+---------+
ID 2
+-----+----+----------+---------+
| QID | ID | Name | Country |
+-----+----+----------+---------+
| 1 | 2 | Null | PH |
| 2 | 2 | Null | CN |
| 3 | 2 | Reynaldo | USA |
| 4 | 2 | Abegail | UK |
| 5 | 2 | Norlee | Ger |
+-----+----+----------+---------+
ID 3
+-----+----+----------+---------+
| QID | ID | Name | Country |
+-----+----+----------+---------+
| 1 | 3 | Gab | PH |
| 2 | 3 | Null | CN |
| 3 | 3 | Reynaldo | USA |
| 4 | 3 | Abegail | UK |
| 5 | 3 | Norlee | Ger |
+-----+----+----------+---------+
When I choose a user in the C table, I want to display that user's rows from the C2 child table, removing the rows where Name is NULL and keeping the rows where Name is not NULL.
Desired Result:
C Table (Parent)
+----+---------+-----+
| ID | Name | Age |
+----+---------+-----+
| 3 | Abegail | 31 |
+----+---------+-----+
C2 Table (Child)
+-----+----+----------+---------+
| QID | ID | Name | Country |
+-----+----+----------+---------+
| 1 | 3 | Gab | PH |
| 3 | 3 | Reynaldo | USA |
| 4 | 3 | Abegail | UK |
| 5 | 3 | Norlee | Ger |
+-----+----+----------+---------+
WHERE column IS NOT NULL is the syntax to filter out NULL values.
Solution 1: test for a non-NULL value
Example:
WHERE yourcolumn IS NOT NULL
Solution 2: use a comparison in your WHERE clause (a comparison against NULL never evaluates to true, so rows with NULL in that column are filtered out as well)
Examples:
WHERE yourcolumn = value
WHERE yourcolumn <> value
WHERE yourcolumn in ( value)
WHERE yourcolumn not in ( value)
WHERE yourcolumn between value1 and value2
WHERE yourcolumn not between value1 and value2
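Both solutions can be tried quickly with an in-memory database. The sketch below uses Python's built-in sqlite3 module, so it runs on SQLite rather than SQL Server, which is an assumption of this example; the NULL comparison behaviour it shows is standard SQL. The table mirrors the ID 3 rows of Cus2 from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Cus2 (QID INTEGER, ID INTEGER, Name TEXT, Country TEXT)")
conn.executemany(
    "INSERT INTO Cus2 VALUES (?, ?, ?, ?)",
    [
        (1, 3, "Gab", "PH"),
        (2, 3, None, "CN"),  # None is stored as NULL
        (3, 3, "Reynaldo", "USA"),
        (4, 3, "Abegail", "UK"),
        (5, 3, "Norlee", "Ger"),
    ],
)

# Solution 1: explicit NULL test on the Name column.
not_null = conn.execute(
    "SELECT QID, Name FROM Cus2 WHERE ID = ? AND Name IS NOT NULL ORDER BY QID",
    (3,),
).fetchall()

# Solution 2: a comparison; the NULL row is dropped too, because
# NULL <> 'Gab' evaluates to unknown rather than true.
not_equal = conn.execute(
    "SELECT QID, Name FROM Cus2 WHERE ID = ? AND Name <> ? ORDER BY QID",
    (3, "Gab"),
).fetchall()

print(not_null)
print(not_equal)
conn.close()
```

In the stored procedure from the question, the equivalent change would be to test c2.Name rather than c2.country with IS NOT NULL.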

How to retrieve the data on a single row?

I have a table with some data, something like this:
+---------+---------+---------+---------+-------------+
| Column1 | Column2 | Column3 | Column4 | Column5 |
+---------+---------+---------+---------+-------------+
| 38073 | 16 | abc | 444 | 4/28/2015 |
| 38076 | 70 | gug | 555 | 4/30/2015 |
| 38098 | 13 | yyy | 111 | 5/12/2015 |
| 38098 | 13 | yyy | 112 | 5/13/2015 |
| 38098 | 13 | yyy | 113 | 5/14/2015 |
| 38098 | 13 | yyy | 114 | 5/15/2015 |
| 38100 | 17 | abc | 115 | 5/13/2015 |
+---------+---------+---------+---------+-------------+
What I want to do is to have the values from Columns 4 and 5 on a single row, something like this :
+---------+----------+-----------+----------+-----------+----------+-----------+----------+-------------+
| Col1 | Col4Val1 | Col5Val1 | Col4Val2 | Col5Val2 | Col4Val3 | Col5Val3 | Col4Val4 | Col5Val4 |
+---------+----------+-----------+----------+-----------+----------+-----------+----------+-------------+
| 38073 | 444 | 4/28/2015 | null | null | null | null | null | null |
| 38076 | 555 | 4/30/2015 | null | null | null | null | null | null |
| 38098 | 111 | 5/12/2015 | 112 | 5/13/2015 | 113 | 5/14/2015 | 114 | 5/15/2015 |
+---------+----------+-----------+----------+-----------+----------+-----------+----------+-------------+
Appreciate the help if possible.
Thank you.
Bogdan
You can use a UNION ALL inside a CTE to unpivot the data, then PIVOT the columns. You can achieve this dynamically too; there are hundreds of articles that will show you how to do that:
;WITH CTE AS (
    -- Number the rows per Column1 with the same ordering in both halves,
    -- so that each Col4ValN is paired with the matching Col5ValN
    SELECT [Column1], CAST([Column4] AS VARCHAR(30)) AS [ColumnVals],
           'Col4Val' + CAST(ROW_NUMBER() OVER (PARTITION BY [Column1] ORDER BY [Column4]) AS VARCHAR(10)) AS [Pivot]
    FROM Table1
    UNION ALL
    -- The CAST keeps both halves of the UNION ALL the same type
    SELECT [Column1], CAST([Column5] AS VARCHAR(30)),
           'Col5Val' + CAST(ROW_NUMBER() OVER (PARTITION BY [Column1] ORDER BY [Column4]) AS VARCHAR(10)) AS [Pivot]
    FROM Table1)
SELECT [Column1], [Col4Val1], [Col5Val1], [Col4Val2], [Col5Val2], [Col4Val3], [Col5Val3], [Col4Val4], [Col5Val4]
FROM CTE
PIVOT (MAX([ColumnVals]) FOR [Pivot] IN ([Col4Val1], [Col5Val1], [Col4Val2], [Col5Val2], [Col4Val3], [Col5Val3], [Col4Val4], [Col5Val4])) PIV
Here's a working fiddle: http://sqlfiddle.com/#!6/e992f/1
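For readers who want to see what the pivot produces without a database, here is a small Python sketch of the same reshaping: rows are numbered within each Column1 value and their Column4/Column5 pairs are laid out side by side, padded with None. The function name `pivot_pairs`, the `width` parameter and the ordering by Column4 are assumptions made for this illustration.

```python
from collections import defaultdict

# (Column1, Column4, Column5) from the question's sample data.
rows = [
    (38073, 444, "4/28/2015"),
    (38076, 555, "4/30/2015"),
    (38098, 111, "5/12/2015"),
    (38098, 112, "5/13/2015"),
    (38098, 113, "5/14/2015"),
    (38098, 114, "5/15/2015"),
    (38100, 115, "5/13/2015"),
]


def pivot_pairs(rows, width=4):
    """Return one tuple per Column1: (Column1, Col4Val1, Col5Val1, ...)."""
    grouped = defaultdict(list)
    # Sorting by (Column1, Column4) fixes the numbering inside each group.
    for col1, col4, col5 in sorted(rows, key=lambda r: (r[0], r[1])):
        grouped[col1].append((col4, col5))
    result = []
    for col1 in sorted(grouped):
        flat = [value for pair in grouped[col1][:width] for value in pair]
        flat += [None] * (2 * width - len(flat))  # pad the missing slots
        result.append((col1, *flat))
    return result


pivoted = pivot_pairs(rows)
for line in pivoted:
    print(line)
```

As with the static PIVOT column list, the number of value pairs per row is fixed up front; anything beyond `width` pairs is dropped, which is why a dynamic pivot is needed when the maximum is not known in advance.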

Resources