I have a table that has the following columns: Item, Date, Status Description, and Stock Status. It contains historical data for the stocking status of items.
Below is a brief sample of the input in table form.
Item     Date         Status Description              Stock Status
ABC123   2020-10-02   Listed Out of Stock (ABC123)    Out of Stock
ABC123   2020-10-15   In Stock (ABC123)               In Stock
ABC123   2021-05-04   Listed Out of Stock (ABC123)    Out of Stock
ABC123   2021-07-15   Listed Out of Stock (ABC123)    Out of Stock
ABC123   2021-07-27   Listed Out of Stock (ABC123)    Out of Stock
ABC123   2021-08-09   Listed Out of Stock (ABC123)    Out of Stock
ABC123   2021-10-19   In Stock (ABC123)               In Stock
...
And here is a script to load the test input data into a temp table.
DROP TABLE IF EXISTS #OOS_History;
CREATE TABLE #OOS_History
(
[Item] NVARCHAR(100),
[Date] DATETIME2,
[Status Description] NVARCHAR(100),
[Stock Status] NVARCHAR(25)
);
INSERT INTO #OOS_History ( [Item], [Date], [Status Description], [Stock Status] )
VALUES
('ABC123', '2020-10-02 13:53', 'Listed Out of Stock (ABC123)', 'Out of Stock'),
('ABC123', '2020-10-15 09:20', 'In Stock (ABC123)', 'In Stock'),
('ABC123', '2021-05-04 08:22', 'Listed Out of Stock (ABC123)', 'Out of Stock'),
('ABC123', '2021-07-15 13:47', 'Listed Out of Stock (ABC123)', 'Out of Stock'),
('ABC123', '2021-07-27 08:04', 'Listed Out of Stock (ABC123)', 'Out of Stock'),
('ABC123', '2021-08-09 13:12', 'Listed Out of Stock (ABC123)', 'Out of Stock'),
('ABC123', '2021-10-19 08:04', 'In Stock (ABC123)', 'In Stock'),
('ABC123', '2021-10-28 11:52', 'Listed Out of Stock (ABC123)', 'Out of Stock'),
('ABC123', '2021-10-29 09:24', 'In Stock (ABC123)', 'In Stock'),
('ABC123', '2021-12-06 07:00', 'Listed Out of Stock (ABC123)', 'Out of Stock'),
('ABC123', '2021-12-06 07:02', 'Listed Out of Stock (ABC123)', 'Out of Stock'),
('ABC123', '2021-12-15 10:47', 'Listed Out of Stock (ABC123)', 'Out of Stock'),
('ABC123', '2022-02-21 14:25', 'In Stock (ABC123)', 'In Stock'),
('ABC123', '2022-04-07 08:36', 'Listed Out of Stock (ABC123)', 'Out of Stock'),
('ABC123', '2022-04-13 07:39', 'In Stock (ABC123)', 'In Stock'),
('ABC123', '2022-04-19 13:06', 'Listed Out of Stock (ABC123)', 'Out of Stock'),
('ABC123', '2022-04-22 14:07', 'In Stock (ABC123)', 'In Stock'),
('ABC123', '2022-04-28 11:30', 'Listed Out of Stock (ABC123)', 'Out of Stock'),
('ABC123', '2022-05-21 06:25', 'In Stock (ABC123)', 'In Stock'),
('ABC123', '2022-07-12 14:10', 'Listed Out of Stock (ABC123)', 'Out of Stock');
My goal is to create a new table that has the following 3 columns: Item, Start Date, and End Date. Each row will represent an interval that an item is out of stock.
Below is the desired output which should list the intervals (start and end date) that an item was out of stock.
Item     Start Date   End Date
ABC123   2020-10-02   2020-10-15
ABC123   2021-05-04   2021-10-19
ABC123   2021-10-28   2021-10-29
ABC123   2021-12-06   2022-02-21
ABC123   2022-04-07   2022-04-13
ABC123   2022-04-19   2022-04-22
ABC123   2022-04-28   2022-05-21
ABC123   2022-07-12   NULL
I believe I can do this by finding the groups that each row belongs in. In this case there should be 8 groups (rows 1-2), (rows 3-7), (rows 8-9), (rows 10-13), (rows 14-15), (rows 16-17), (rows 18-19), and (row 20). Then I can take the first row from each group as the starting date and the last row from each group as the ending date.
I've done some preliminary research and this appears to be a gaps-and-islands problem. I've found an answer on stackoverflow that looks like a good starting point, but I'm stuck on how to apply it to my problem.
I'm copying the SQL query structure from the answer on this stackoverflow post (Lag() with condition in sql server) in an attempt to find the groups within my data.
The script below containing the CTE shows what I have so far.
WITH Grouped AS
(
SELECT
[Item],
[Date],
[Status Description],
[Stock Status],
ROW_NUMBER() OVER(ORDER BY [Date] ASC) AS 'rn1',
ROW_NUMBER() OVER(PARTITION BY CASE [Stock Status] WHEN 'Out of Stock' THEN 1 END ORDER BY [Date] ASC) AS 'rn2' -- this line needs to be changed to fix my grouping issue?
FROM #OOS_History
),
OrderInGroup AS
(
SELECT
[Item],
[Date],
[Status Description],
[Stock Status],
[rn1],
[rn2],
[rn1] - [rn2] AS 'GroupNumber',
ROW_NUMBER() OVER(PARTITION BY [rn1]-[rn2] ORDER BY [Date] ASC) AS rank_asc,
ROW_NUMBER() OVER(PARTITION BY [rn1]-[rn2] ORDER BY [Date] DESC) AS rank_desc
FROM Grouped
)
SELECT
*,
LAG([Stock Status], rank_asc) OVER(ORDER BY [Date] ASC) AS 'LastEventOfPrevGroup',
LEAD([Stock Status], rank_desc) OVER(ORDER BY [Date] DESC) AS 'FirstEventOfNextGroup'
FROM OrderInGroup
ORDER BY [Date] ASC;
Attached is an image that shows the output of the above CTE query. The numbered red boxes around the rows indicate my desired grouping of the rows. The "GroupNumber" column indicates the actual grouping of the rows and it is not correct.
I'm assuming the issue lies within the "rn2" window function but I can't figure out what to change within the partition by and/or order by to get the desired grouping.
Here's a more intuitive approach, in my opinion:
WITH
OOS_History_With_Status_Change_Flag AS
(
SELECT
*,
CASE
WHEN LAG([Stock Status], 1, N'') OVER (PARTITION BY [Item] ORDER BY [Date]) <> [Stock Status]
THEN 1 -- i.e. the status changed on this date
END AS [Status Change]
FROM
#OOS_History
),
OOS_History_Only_Status_Changes AS
(
SELECT
*,
CAST([Date] AS DATE) AS [Start Date],
CAST(LEAD([Date], 1) OVER (PARTITION BY [Item] ORDER BY [Date]) AS DATE) AS [End Date]
FROM
OOS_History_With_Status_Change_Flag
WHERE
[Status Change] = 1
)
SELECT
[Item],
[Start Date],
[End Date]
FROM
OOS_History_Only_Status_Changes
WHERE
[Stock Status] = 'Out of Stock'
ORDER BY
[Start Date]
;
In essence, the only rows you care about from your initial dataset are ones that represent a status change (i.e. rows with a lagging row, when ordered by date, of differing status). The first CTE creates this flag column using the LAG() function; the second CTE uses this flag as a filter and defines the Start Date and End Date columns.
And just like that, you're done! Select rows where the status is out-of-stock, and you're golden.
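For completeness, the difference-of-row-numbers technique the question was reaching for can also produce the desired intervals; the missing piece is partitioning the second row number by [Stock Status] (and both row numbers by [Item]). The following is only a sketch against the same #OOS_History table; using LEAD() over each island's start date is one way to pick up the following 'In Stock' date as the interval's end:

```sql
WITH Grouped AS
(
    SELECT
        [Item],
        [Date],
        [Stock Status],
        -- rows with the same status in an unbroken run get the same difference
        ROW_NUMBER() OVER (PARTITION BY [Item] ORDER BY [Date]) -
        ROW_NUMBER() OVER (PARTITION BY [Item], [Stock Status] ORDER BY [Date]) AS GroupNumber
    FROM #OOS_History
),
Islands AS
(
    SELECT
        [Item],
        [Stock Status],
        CAST(MIN([Date]) AS DATE) AS [Start Date],
        -- the next island's first date (the following 'In Stock' row) closes this interval
        CAST(LEAD(MIN([Date])) OVER (PARTITION BY [Item] ORDER BY MIN([Date])) AS DATE) AS [End Date]
    FROM Grouped
    GROUP BY [Item], [Stock Status], GroupNumber
)
SELECT [Item], [Start Date], [End Date]
FROM Islands
WHERE [Stock Status] = 'Out of Stock'
ORDER BY [Item], [Start Date];
```

Because consecutive islands alternate status by construction, the island following an out-of-stock island is always the in-stock one, so its start date is the correct end date; the still-open final interval gets NULL from LEAD().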
Related
I am trying to create a view in SQL Server that renders columns from 3 tables and that creates 4 NEW columns. These new columns use the SQL "Case...When" syntax to conditionally render the SUM of existing fields depending on the values of an adjacent column [Portfolio Report Category] in the same table.
The objective is to display a single account's total market value for 4 distinct asset categories. Right now an [Account Number] can have MANY records with MANY [Investment Amount] for a single [Portfolio Report Category]. So obviously this view should allow an [Account Number] to have only ONE record with 4 summed monetary values that meet the "Case...When" conditions.
I have read similar threads, but none that addresses an aggregate method after the "Then" keyword in the conditional block. When I place the Sum() before "Case" and remove it after the "Then", there is no error but the calculations are wrong: in many cases, they are much higher. I was under the impression that a simple Try_Convert to money would be sufficient, and indeed if I execute SUM(Try_Convert(money, [Market Value])) where [Account Number] = 'x', the result is correct. But not when I'm creating the view.
Here is the error when testing the following code with SELECT:
Column 'KTHoldings.Portfolio Report Category' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Further below is the KTHoldings table returning all records for a single account. I basically need the [Other Total Value] column of my view to sum its [Market Value] data.
CREATE VIEW VW_KTAccountsADR AS (
SELECT adr.[Account Number], adr.[Account_ Display Name], adr.[Account Category Code], adr.[Division Code], adr.[Division Name], adr.[Market Value Amount],
CASE WHEN kth.[Portfolio Report Category] = '705' THEN SUM((TRY_CONVERT(money, kth.[Market Value]))) ELSE '0.00' END AS [Crypto Total Value],
CASE WHEN kth.[Portfolio Report Category] = '691' THEN SUM((TRY_CONVERT(money, kth.[Market Value]))) ELSE '0.00' END AS [Precious Metal Total Value],
CASE WHEN kth.[Portfolio Report Category] IN ('010', '011', '020', '025', '030') THEN SUM((TRY_CONVERT(money, kth.[Market Value]))) ELSE '0.00' END AS [Stock Total Value],
CASE WHEN kth.[Portfolio Report Category] NOT IN ('010', '011', '020', '025', '030', '691', '705') THEN SUM((TRY_CONVERT(money, kth.[Market Value]))) ELSE '0.00' END AS [Other Total Value],
ktc.[Available Cash Amount] AS [CASH TOTAL]
FROM KTAccountsADR adr
INNER JOIN KTHoldings kth
ON adr.[Account Number] = kth.[Account Number]
INNER JOIN KTCash ktc
ON adr.[Account Number] = ktc.[Account Number]
GROUP BY adr.[Account Number], adr.[Account_ Display Name], adr.[Account Category Code], adr.[Division Code], adr.[Division Code], adr.[Division Name], adr.[Market Value Amount], ktc.[Available Cash Amount]
)
KTAccountsADR Table
Account Number Display Name Division Code Market Value Amount
07007835 Frank C Thomas Roth IRA 27 390410.98
07007835 Frank C Thomas Roth IRA 27 390410.98
07007835 Frank C Thomas Roth IRA 27 390410.98
07007835 Frank C Thomas Roth IRA 27 390410.98
001000 Carl S Sykes Roth IRA 27 196338.1292
001000 Carl S Sykes Roth IRA 27 196338.1292
001000 Carl S Sykes Roth IRA 27 196338.1292
KTHoldings Table
Account Number Display Name Market Value Portfolio Report Category
001000 Carl S Sykes Roth IRA 9998.4792 600
001000 Carl S Sykes Roth IRA 43467.09 600
001000 Carl S Sykes Roth IRA 84524.71 600
KTCash Table
Account Number Available Cash Amount
001000 58347.85
You need to put your case inside your sum:
SUM(CASE WHEN kth.[Portfolio Report Category] = '705' THEN TRY_CONVERT(money, kth.[Market Value]) ELSE '0.00' END) AS [Crypto Total Value],
Here is a working example:
declare @Adr table ([Account Number] varchar(6), [Account_ Display Name] varchar(64), [Account Category Code] varchar(10), [Division Code] varchar(10), [Division Name] varchar(64), [Market Value Amount] money);
declare @Kth table ([Account Number] varchar(6), [Portfolio Report Category] varchar(3), [Market Value] money);
declare @Ktc table ([Account Number] varchar(6), [Available Cash Amount] money);
insert into @Adr ([Account Number], [Account_ Display Name])
values ('001000', 'Carl S Sykes Roth IRA');
insert into @Kth ([Account Number], [Market Value], [Portfolio Report Category])
values ('001000', 9998.4792, '600'),
('001000', 43467.09, '600'),
('001000', 84524.71, '600');
insert into @Ktc ([Account Number])
values ('001000');
SELECT adr.[Account Number], adr.[Account_ Display Name], adr.[Account Category Code], adr.[Division Code], adr.[Division Name], adr.[Market Value Amount]
, SUM(CASE WHEN kth.[Portfolio Report Category] = '705' THEN (TRY_CONVERT(money, kth.[Market Value])) ELSE 0 END) AS [Crypto Total Value]
, SUM(CASE WHEN kth.[Portfolio Report Category] = '691' THEN (TRY_CONVERT(money, kth.[Market Value])) ELSE 0 END) AS [Precious Metal Total Value]
, SUM(CASE WHEN kth.[Portfolio Report Category] IN ('010', '011', '020', '025', '030') THEN (TRY_CONVERT(money, kth.[Market Value])) ELSE 0 END) AS [Stock Total Value]
, SUM(CASE WHEN kth.[Portfolio Report Category] NOT IN ('010', '011', '020', '025', '030', '691', '705') THEN (TRY_CONVERT(money, kth.[Market Value])) ELSE 0 END) AS [Other Total Value]
, ktc.[Available Cash Amount] AS [CASH TOTAL]
FROM @Adr adr
INNER JOIN @Kth kth ON adr.[Account Number] = kth.[Account Number]
INNER JOIN @Ktc ktc ON adr.[Account Number] = ktc.[Account Number]
GROUP BY adr.[Account Number], adr.[Account_ Display Name], adr.[Account Category Code], adr.[Division Code], adr.[Division Name], adr.[Market Value Amount], ktc.[Available Cash Amount];
Which returns (un-necessary columns removed):
Account Number Crypto Total Value Precious Metal Total Value Stock Total Value Other Total Value CASH TOTAL
001000 0.00 0.00 0.00 137990.2792 NULL
Edit: As you have now added sample data, the reason you were getting incorrect values is that you have duplicate rows in your KTAccountsADR table, which then multiply the rows in the KTHoldings table. Resolve the duplicates and you will get the correct values when using CASE inside SUM.
Note, it's best practice to ensure you are returning the same datatype from all branches of a CASE expression.
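For example, the ELSE branch can be given the same type explicitly; the string literal '0.00' in the snippets above only works through implicit conversion to money:

```sql
SUM(CASE WHEN kth.[Portfolio Report Category] = '705'
         THEN TRY_CONVERT(money, kth.[Market Value])
         ELSE CAST(0 AS money)  -- explicit: both branches now return money
    END) AS [Crypto Total Value]
```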
I need to count the distinct values in a column, grouped by date. The real problem is that the count should not include a value if it already occurred in a previous result.
Eg: Consider the table tblcust
Date Customers
March 1 Mike
March 1 Yusuf
March 1 John
March 2 Ajay
March 2 Mike
March 2 Anna
The result should be
Date Customer_count
March 1 3
March 2 2
If I use
select [date], count(distinct customers) as customer_count
from tblcust
group by [date]
The Result I am getting is
Date customer_count
March 1 3
March 2 3
The customer Mike has visited twice; he should not be counted as a new customer on March 2.
You can achieve this using the SQL Server ROW_NUMBER() function.
create table tblCust (DtDate varchar(20), Customers Varchar(20))
insert into tblCust Values
('March 1', 'Mike'),
('March 1', 'Yusuf'),
('March 1', 'John'),
('March 2', 'Ajay'),
('March 2', 'Mike'),
('March 2', 'Anna')
Select dtDate
, Customers
, ROW_NUMBER() OVER (PARTITION BY Customers ORDER BY DtDate) as SrNo
from tblCust
order by DtDate, SrNo
select Dtdate,
count(distinct(customers)) as customer_count
from(Select dtDate
, Customers
, ROW_NUMBER() OVER (PARTITION BY Customers ORDER BY DtDate) as SrNo
from tblCust
)a where SrNo = 1
group by Dtdate
Here is the live db<>fiddle demo
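An equivalent formulation, assuming the same tblCust table, is to compute each customer's first visit date with MIN() and then count customers by that date. Note this sketch relies on DtDate sorting correctly as text ('March 1' < 'March 2'), which holds for this sample; a real date column would be safer:

```sql
SELECT FirstDate AS DtDate, COUNT(*) AS customer_count
FROM (
    SELECT Customers, MIN(DtDate) AS FirstDate  -- each customer's first visit
    FROM tblCust
    GROUP BY Customers
) f
GROUP BY FirstDate
ORDER BY FirstDate;
```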
I am still new to SQL and data manipulation, and I have not come across anything from searching that matches this. I am trying to aggregate data based on an active period between a start and end date. My current code, included below, should give an idea of the sort of aggregation I would like to perform.
I have tried to find a way to do this in SQL or Power BI but so far have come up short. Most of the examples I found perform calculations on a single column rather than the entire dataset. My original idea was a calculated column giving a list of months active, but I failed to aggregate on that enough to condense down the very large dataset.
SELECT [Group ID], [User], [Location], [category], [Setup Date], [End Date], Count([Name]) AS 'Number of Names',
Avg([Duration in Weeks]) AS 'Avg duration', Avg([Days since last production]) AS 'Avg days since last production', Avg([Losses]) AS 'Avg losses',
ROUND(AVG([Number produced]/NULLIF([duration in weeks],0)),2) AS Productivity
FROM [Summary].[dbo].[Summary$]
Group BY [Group ID], [User], [Location], [Category], [Setup Date], [End Date]
For example, if the setup date is "2017-01-09" and the end date is "2017-03-30", I would like to be able to aggregate this data into Jan 2017, Feb 2017 and Mar 2017, all in the same table. I hope I have given enough information and explained clearly, but please let me know if I need to provide anything else.
Small snippet of raw data
You can use a recursive CTE to create your periods:
DECLARE @yearfrom int = 2017, @yearto int = 2019
;WITH months AS (
SELECT 1 AS MonthNum, DATENAME(Month,DATEFROMPARTS(1,1,1)) AS MonthName
UNION ALL
SELECT MonthNum + 1, DATENAME(Month,DATEFROMPARTS(1, MonthNum + 1, 1)) AS MonthName
FROM months
WHERE MonthNum <= 11
),
years as (
SELECT @yearfrom as year
union all
select year + 1
from years where year < @yearto
),
dates as (
SELECT Year, MonthNum, MonthName, DATEFROMPARTS(Year, MonthNum, 1) AS DateStart, DATEADD(MONTH, 1, DATEFROMPARTS(Year, MonthNum, 1)) AS DateEnd
FROM years
CROSS JOIN months
)
SELECT d.year, d.MonthNum, d.MonthName, [Group ID], [User], [Location], [category], Count([Name]) AS 'Number of Names',
Avg([Duration in Weeks]) AS 'Avg duration', Avg([Days since last production]) AS 'Avg days since last production', Avg([Losses]) AS 'Avg losses',
ROUND(AVG([Number produced]/NULLIF([duration in weeks],0)),2) AS Productivity
FROM
DATES d
LEFT JOIN [Summary].[dbo].[Summary$] ON
-- overlap test: a row is active during the month if it starts before the month ends
-- and ends on or after the month starts, so months in the middle of a long
-- period (e.g. Feb for a Jan-Mar row) are included
([Setup Date] < d.DateEnd AND [End Date] >= d.DateStart)
Group BY d.year, d.monthnum, d.monthname, [Group ID], [User], [Location], [Category]
I'm looking to create a query that could compare a customer's latest order purchase amount to the previous order amount of the customer's last purchase. Please see example data screenshot below:
Ideally I'd like the query to look for these things in the results:
Total amount from previous order before most recent order date (in this case 9/6/18 would be most recent and 2/2/17 would be the last purchase)
Difference in amount between most recent order and last order amount ($2000-$25 = $1975)
Create a condition in the query to look for customers whose most recent order amount is at least $1,000 more than their last purchase amount, and whose account age is greater than 60 days
Note: The conditions in the last bullet could be modified as needed (customer's account age > 90 days, difference in order amount of $500, etc.)
Thank you for the assistance!
For SQL Server 2012 onward you can use LAG():
declare @amount decimal(16,2) = 1000
declare @days int = 60
select -- lag() over an ascending [Order Date] order returns each row's previous (earlier) order
*
,TotalWithPrevious = [Order Amount] + lag([Order Amount]) over (partition by UserID order by [Order Date])
,DifferenceofPrevious = [Order Amount] - lag([Order Amount]) over (partition by UserID order by [Order Date])
,CheckCondition = case
when [Order Amount] - lag([Order Amount]) over (partition by UserID order by [Order Date]) >= @amount
and datediff(day, lag([Order Date]) over (partition by UserID order by [Order Date]), [Order Date]) >= @days
then 'True'
else 'False'
end
from YourTable
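Since window functions can't appear directly in a WHERE clause, one way to actually filter to the qualifying customers is to wrap the windowed columns in a CTE and keep only each customer's most recent order. This is a sketch: the column names UserID, [Order Date], and [Order Amount] follow the answer above, and YourTable is a placeholder:

```sql
declare @amount decimal(16,2) = 1000
declare @days int = 60

;WITH Ordered AS
(
    select *
        ,DifferenceofPrevious = [Order Amount] - lag([Order Amount]) over (partition by UserID order by [Order Date])
        ,DaysBetween          = datediff(day, lag([Order Date]) over (partition by UserID order by [Order Date]), [Order Date])
        ,rn                   = row_number() over (partition by UserID order by [Order Date] desc)
    from YourTable
)
select UserID, [Order Date], [Order Amount], DifferenceofPrevious, DaysBetween
from Ordered
where rn = 1                           -- most recent order per customer
  and DifferenceofPrevious >= @amount  -- spent at least @amount more than last time
  and DaysBetween >= @days;            -- at least @days since the previous order
```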
I'm using MS SQL Server. I have a single table called 'Commissions' that contains customer, date, category, and commission amount. I'm trying to get a customer count by category.
I've figured out the SQL code that gives me a total customer count for a single category.
declare @category nvarchar(50)
set @category = 'Shirts'
select @category, COUNT(*)
from (
SELECT SUM(Commission) AS CommTotal, [Real Customer]
FROM dbo.Commissions
WHERE category = #category and ([Line Item Date] >= CONVERT(DATETIME, '2014-10-01', 102) AND [Line Item Date] < CONVERT(DATETIME, '2015-10-01', 102))
GROUP BY [Real Customer]
) as Agg
The output of this code produces
Shirts 652
What I would like to do next is the equivalent of adding a wrapper around this code that would give me a customer count for all 5 of the categories.
-- wrapper
select distinct category from commissions
Shirts 652
Pants 1420
Shoes 342
Socks 553
Hats 992
Any suggestions would be appreciated.
First: In your code you are selecting by a single category. In your output example you want all categories. So the selection has to be removed.
Second: I am assuming that the combination of customer and category is not unique but you want a customer only to be counted once per category.
Third: I am copying your date selection.
You need to group by category and then count the number of customers per group counting each customer just once.
Try this:
select category, count(distinct [Real Customer])
from Commissions
where ([Line Item Date] >= CONVERT(DATETIME, '2014-10-01', 102)
and [Line Item Date] < CONVERT(DATETIME, '2015-10-01', 102))
group by Category
Albert
You can use this query. It will count the number of distinct values of [Real Customer] for each category.
SELECT category, COUNT(DISTINCT [Real Customer])
FROM dbo.Commissions
WHERE [Line Item Date] >= CONVERT(DATETIME, '2014-10-01', 102) AND [Line Item Date] < CONVERT(DATETIME, '2015-10-01', 102)
GROUP BY category
You can also optionally add a where clause to limit the number of categories, if you want, to those that you listed. Or leave the query as above to get all categories.
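For instance, restricting to the five categories listed in the question (the category names here are taken from the sample output):

```sql
SELECT category, COUNT(DISTINCT [Real Customer]) AS customer_count
FROM dbo.Commissions
WHERE [Line Item Date] >= CONVERT(DATETIME, '2014-10-01', 102)
  AND [Line Item Date] <  CONVERT(DATETIME, '2015-10-01', 102)
  AND category IN ('Shirts', 'Pants', 'Shoes', 'Socks', 'Hats')
GROUP BY category;
```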