Imagine that I have a table in the database that keeps the history of customers' status changes.
If I want to get the customers who moved from, for example, status 1001 to 1002, it's simple:
Select * from TableName where StartStatus=1001 and EndStatus=1002
If I want to write a query that returns the customers who changed from status 1001 to 1005, how can I do that?
The result should be just one record per customer. I need to omit a customer's intermediate changes; for example, I don't need 1001 to 1002, 1002 to 1003, and 1003 to 1004.
For example, in this data the customer with ID 2 changed from 1006 to 1005, so the query shouldn't return it.
Assuming that we're not worried about customers moving 'backwards' into 1005, this should work as long as the customer has ever had a StartStatus of 1001 and an EndStatus of 1005:
CREATE TABLE #Customer (CustomerID INT, StartStatus INT, EndStatus INT)
INSERT INTO #Customer (CustomerID, StartStatus, EndStatus)
VALUES (1, 1000, 1001),
(1, 1001, 1002),
(1, 1002, 1003),
(1, 1003, 1004),
(1, 1004, 1005),
(2, 1006, 1005)
SELECT C1.CustomerID, C1.StartStatus, C2.EndStatus
FROM #Customer AS C1
INNER JOIN #Customer AS C2 ON C2.CustomerID = C1.CustomerID
WHERE C1.StartStatus = 1001 AND C2.EndStatus = 1005
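If the statuses must also form an unbroken chain (each change starting where the previous one ended), a recursive CTE is one possible way to check that. This is only a sketch against the same #Customer sample data; it assumes a customer's history contains no cycles, since a cycle would run into SQL Server's default recursion limit.

```sql
-- Sample data as in the answer above (skip these two statements if #Customer already exists)
CREATE TABLE #Customer (CustomerID INT, StartStatus INT, EndStatus INT);
INSERT INTO #Customer (CustomerID, StartStatus, EndStatus)
VALUES (1, 1000, 1001), (1, 1001, 1002), (1, 1002, 1003),
       (1, 1003, 1004), (1, 1004, 1005), (2, 1006, 1005);

WITH Chain AS (
    -- anchor: every change that starts at 1001
    SELECT CustomerID, StartStatus, EndStatus
    FROM #Customer
    WHERE StartStatus = 1001
    UNION ALL
    -- step: follow the same customer's next change until 1005 is reached
    SELECT ch.CustomerID, ch.StartStatus, c.EndStatus
    FROM Chain AS ch
    INNER JOIN #Customer AS c
        ON c.CustomerID = ch.CustomerID
       AND c.StartStatus = ch.EndStatus
    WHERE ch.EndStatus <> 1005
)
SELECT DISTINCT CustomerID, StartStatus, EndStatus
FROM Chain
WHERE EndStatus = 1005;
-- returns one row for the sample data: (1, 1001, 1005)
```

Unlike the self-join, this only returns a customer when every intermediate step between 1001 and 1005 is actually present in the table.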
Related
I need to find out when the Customer of an Order was changed and later reverted in the following data. The Order can be assigned to several different Customers before being reverted to the original Customer. Here is the raw data:
ORDER_NUM   CUSTOMER   LOAD_DATE
-----------------------------------------------
111         aaa        2023-02-09 04:49:41.335
111         bbb        2023-02-09 04:49:42.338
111         aaa        2023-02-09 04:49:43.278
222         aaa        2023-02-09 04:49:44.213
222         bbb        2023-02-09 04:49:45.254
333         aaa        2023-02-09 04:49:46.334
333         bbb        2023-02-09 04:49:47.101
333         ccc        2023-02-09 04:49:48.196
I developed the following MATCH_RECOGNIZE query in Oracle and it works:
select * from order_customer
match_recognize(
partition by order_number
order by load_date
one row per match
pattern (init modified+ reversed)
define
init as customer_id = customer_id,
modified as customer_id <> init.customer_id,
reversed as customer_id = init.customer_id
);
But it seems that Snowflake currently doesn't support correlated pattern definitions in MATCH_RECOGNIZE. What is the best way to implement this in Snowflake?
This use case could be resolved with FIRST_VALUE:
select * from order_customer
match_recognize(
partition by order_number
order by load_date
one row per match
pattern (init modified+ reversed)
define
init as customer_id = FIRST_VALUE(customer_id),
modified as customer_id <> FIRST_VALUE(customer_id),
reversed as customer_id = FIRST_VALUE(customer_id)
);
For input:
create or replace table order_customer (order_number number,
customer_id varchar(80),
load_date timestamp);
insert into order_customer values (111, 'aaa', CURRENT_TIMESTAMP);
insert into order_customer values (111, 'bbb', CURRENT_TIMESTAMP);
insert into order_customer values (111, 'aaa', CURRENT_TIMESTAMP);
insert into order_customer values (222, 'aaa', CURRENT_TIMESTAMP);
insert into order_customer values (222, 'bbb', CURRENT_TIMESTAMP);
insert into order_customer values (333, 'aaa', CURRENT_TIMESTAMP);
insert into order_customer values (333, 'bbb', CURRENT_TIMESTAMP);
insert into order_customer values (333, 'ccc', CURRENT_TIMESTAMP);
Output:
Simplifying the entire pattern to (init modified+ init):
select * from order_customer
match_recognize(
partition by order_number
order by load_date
one row per match
pattern (init modified+ init)
define
init as customer_id = FIRST_VALUE(customer_id),
modified as customer_id <> FIRST_VALUE(customer_id)
);
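If MATCH_RECOGNIZE is not an option at all, the same check could probably be sketched with plain window functions. The query below is an assumption-laden alternative, not part of the answer above: it relies on the order_customer table from the input script, and it assumes load_date is unique within an order so that the ordering is deterministic.

```sql
-- Assumes the order_customer table created in the answer above.
with tagged as (
    select order_number,
           customer_id,
           -- the customer the order started with
           first_value(customer_id) over (partition by order_number order by load_date) as first_customer,
           -- position of the row within the order's history
           row_number() over (partition by order_number order by load_date) as rn
    from order_customer
)
select order_number
from tagged
group by order_number
-- the original customer reappears after the first row that carries a different customer
having max(case when customer_id = first_customer then rn end)
     > min(case when customer_id <> first_customer then rn end);
```

For the sample rows this should leave only order 111: orders 222 and 333 change customer but never return to the original one, and an order that never changes yields a NULL comparison and is filtered out.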
The Problem
I'm trying to detect and react to changes in a table where each update is being recorded as a new row with some values being the same as the original, some changed (the ones I want to detect) and some NULL values (not considered changed).
For example, given the following table MyData, and assuming the OrderNumber is the common value,
ID  OrderNumber  CustomerName  PartNumber  Qty  Price  OrderDate
1   123          Acme Corp.    WG301       4    15.02  2020-01-02
2   456          Base Inc.     AL337       7    20.15  2020-02-03
3   123          NULL          WG301b      5    19.57  2020-01-02
If I execute the query for OrderNumber = 123 I would like the following data returned:
Column      OldValue  NewValue
ID          1         3
PartNumber  WG301     WG301b
Qty         4         5
Price       15.02     19.57
Or possibly a single row result with only the changes filled, like this (however, I would strongly prefer the former format):
ID  OrderNumber  CustomerName  PartNumber  Qty  Price  OrderDate
3   NULL         NULL          WG301b      5    19.57  NULL
My Solution
I have not had a chance to test this, but I was considering writing the query with the following approach (pseudo-code):
select
NewOrNull(last.ID, prev.ID) as ID,
NewOrNull(last.OrderNumber, prev.OrderNumber) as OrderNumber,
NewOrNull(last.CustomerName, prev.CustomerName) as CustomerName,
...
from last row with OrderNumber = 123
join previous row where OrderNumber = 123
Here the function NewOrNull(lastVal, prevVal) returns NULL if the values are equal or lastVal is NULL, and lastVal otherwise.
Why I'm Looking for an Answer
I'm afraid that the ugly join, the number of calls to the function, and the procedural approach may make this approach not scalable. Before I start down the rabbit hole, I was wondering...
The Question
...are there any other approaches I should try, or any best practices to solving this specific type of problem?
I came up with a solution for the second (less preferred) format:
The Data
Using the following data:
INSERT INTO MyData
([ID], [OrderNumber], [CustomerName], [PartNumber], [Qty], [Price], [OrderDate])
VALUES
(1, 123, 'Acme Corp.', 'WG301', '4', '15.02', '2020-01-02'),
(2, 456, 'Base Inc.', 'AL337', '7', '20.15', '2020-02-03'),
(3, 123, NULL, 'WG301b', '5', '19.57', '2020-01-02'),
(4, 123, 'ACME Corp.', 'WG301b', NULL, NULL, '2020-01-02'),
(6, 456, 'Base Inc.', NULL, '7', '20.15', '2020-02-05');
The Function
This function returns the updated value if it has changed, otherwise NULL:
CREATE FUNCTION dbo.NewOrNull
(
@newValue sql_variant,
@oldValue sql_variant
)
RETURNS sql_variant
AS
BEGIN
DECLARE @ret sql_variant
SELECT @ret = CASE
WHEN @newValue IS NULL THEN NULL
WHEN @oldValue IS NULL THEN @newValue
WHEN @newValue = @oldValue THEN NULL
ELSE @newValue
END
RETURN @ret
END;
The Query
This query returns the history of changes for the given order number:
select dbo.NewOrNull(new.ID, old.ID) as ID,
dbo.NewOrNull(new.OrderNumber, old.OrderNumber) as OrderNumber,
dbo.NewOrNull(new.CustomerName, old.CustomerName) as CustomerName,
dbo.NewOrNull(new.PartNumber, old.PartNumber) as PartNumber,
dbo.NewOrNull(new.Qty, old.Qty) as Qty,
dbo.NewOrNull(new.Price, old.Price) as Price,
dbo.NewOrNull(new.OrderDate, old.OrderDate) as OrderDate
from MyData new
left join MyData old
on old.ID = (
select top 1 ID
from MyData pre
where pre.OrderNumber = new.OrderNumber
and pre.ID < new.ID
order by pre.ID desc
)
where new.OrderNumber = 123
The Result
ID  OrderNumber  CustomerName  PartNumber  Qty     Price   OrderDate
1   123          Acme Corp.    WG301       4       15.02   2020-01-02
3   (null)       (null)        WG301b      5       19.57   (null)
4   (null)       ACME Corp.    (null)      (null)  (null)  (null)
The Fiddle
Here's the SQL Fiddle that shows the whole thing in action.
http://sqlfiddle.com/#!18/b720f/5/0
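For the first (preferred) Column / OldValue / NewValue format, one possible approach is to unpivot the latest row and its predecessor with CROSS APPLY (VALUES ...). The sketch below is an assumption, not part of the solution above: it relies on the MyData table from the question, compares only the two most recent rows of the order, and casts everything to varchar(50) so that values of different types fit in one pair of columns.

```sql
-- Assumes the MyData table from the question; varchar(50) is an arbitrary width.
WITH Ranked AS (
    SELECT *, ROW_NUMBER() OVER (ORDER BY ID DESC) AS rn
    FROM MyData
    WHERE OrderNumber = 123
)
SELECT v.ColumnName, v.OldValue, v.NewValue
FROM Ranked AS n
INNER JOIN Ranked AS o
    ON o.rn = n.rn + 1            -- o is the row recorded just before n
CROSS APPLY (VALUES
    ('ID',           CAST(o.ID AS varchar(50)),           CAST(n.ID AS varchar(50))),
    ('OrderNumber',  CAST(o.OrderNumber AS varchar(50)),  CAST(n.OrderNumber AS varchar(50))),
    ('CustomerName', CAST(o.CustomerName AS varchar(50)), CAST(n.CustomerName AS varchar(50))),
    ('PartNumber',   CAST(o.PartNumber AS varchar(50)),   CAST(n.PartNumber AS varchar(50))),
    ('Qty',          CAST(o.Qty AS varchar(50)),          CAST(n.Qty AS varchar(50))),
    ('Price',        CAST(o.Price AS varchar(50)),        CAST(n.Price AS varchar(50))),
    ('OrderDate',    CAST(o.OrderDate AS varchar(50)),    CAST(n.OrderDate AS varchar(50)))
) AS v (ColumnName, OldValue, NewValue)
WHERE n.rn = 1                    -- compare only the latest row with its predecessor
  AND v.NewValue IS NOT NULL      -- NULL in the new row means "not changed"
  AND (v.OldValue IS NULL OR v.NewValue <> v.OldValue);
```

On the three-row table from the question this should list the ID, PartNumber, Qty and Price rows, because CustomerName is NULL in the newer row and OrderNumber and OrderDate are unchanged.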
I'm trying to select the total deliveries for each store where the status is 100. In some cases one repair number has two rows with status 100 (delivery). How can I exclude every duplicated repair number from the count entirely, so that a duplicated repair is not counted even once? Please check my code below; this is what I have so far.
SELECT UL.StoreName, COUNT(DISTINCT JT.REPAIRNO) AS TotalDelivery
FROM DataDetails AS UL LEFT OUTER JOIN
JOBTRACKING AS JT ON UL.storeID = JT.store_code
WHERE (CAST(JT.created_Date AS date)='2017-03-08')
AND JT.JOBSTATUS=100
GROUP BY UL.StoreName
For example, it currently returns:
Name TotalDelivery
ABC 4
XYZ 4
which comes from this data:
RepairNo Store Status CreatedDate
1000 ABC 100 3/8/2017
1001 ABC 100 3/8/2017
1001 ABC 100 3/8/2017
1008 ABC 100 3/8/2017
1009 ABC 100 3/8/2017
1011 XYZ 100 3/8/2017
1011 XYZ 100 3/8/2017
1013 XYZ 100 3/8/2017
1014 XYZ 100 3/8/2017
1015 XYZ 100 3/8/2017
1015 XYZ 100 3/8/2017
I need the result to be:
Name TotalDelivery
ABC 3
XYZ 2
DISTINCT removes the duplication but still keeps one row from each duplicated set; I want to remove that one as well and count only the rows that have no duplicates. Thanks in advance.
If you want only the non-duplicated results, you need a subquery to filter the duplicates out. Try the query below.
Updated
SELECT UL.StoreName, COUNT(1) AS TotalDelivery
FROM DataDetails AS UL
LEFT OUTER JOIN JOBTRACKING AS JT
ON UL.storeID = JT.store_code
WHERE CAST(JT.created_Date AS date)='2017-03-08'
AND JT.JOBSTATUS=100
AND JT.REPAIRNO IN (SELECT REPAIRNO from JOBTRACKING j WHERE j.store_code = UL.storeID GROUP BY j.REPAIRNO HAVING COUNT(1) = 1)
GROUP BY UL.StoreName, UL.storeID
Test Script
CREATE TABLE #DataDetails
(
StoreName CHAR(3), storeID int
)
CREATE TABLE #JOBTRACKING
(
store_code int, REPAIRNO INT, JOBSTATUS INT, created_Date DATE
)
INSERT #DataDetails VALUES( 'ABC', 1), ('XYZ', 2)
INSERT #JOBTRACKING VALUES (1, 1000, 100, '2017-03-08'), (1, 1001, 100, '2017-03-08'), (1, 1001, 100, '2017-03-08'), (1, 1008, 100, '2017-03-08'), (1, 1009, 100, '2017-03-08')
,(2, 1011, 100, '2017-03-08'), (2, 1011, 100, '2017-03-08'), (2, 1013, 100, '2017-03-08'), (2, 1014, 100, '2017-03-08'), (2, 1015, 100, '2017-03-08'), (2, 1015, 100, '2017-03-08')
SELECT UL.StoreName, COUNT(1) AS TotalDelivery
FROM #DataDetails AS UL
LEFT OUTER JOIN #JOBTRACKING AS JT
ON UL.storeID = JT.store_code
WHERE CAST(JT.created_Date AS date)='2017-03-08'
AND JT.JOBSTATUS=100
AND JT.REPAIRNO IN (SELECT REPAIRNO from #JOBTRACKING j WHERE j.store_code = UL.storeID GROUP BY j.REPAIRNO HAVING COUNT(1) = 1)
GROUP BY UL.StoreName, UL.storeID
Results
+-----------+---------------+
| StoreName | TotalDelivery |
+-----------+---------------+
| ABC | 3 |
| XYZ | 2 |
+-----------+---------------+
I would need more details, but as I understand it, you should first prepare the full list of rows to count. For example, in an inner select you can group the data and eliminate all duplicated or redundant rows (or apply the WHERE criteria there), then use an outer select with GROUP BY UL.StoreName to get the correct answer. Do not use DISTINCT!
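The inner-select / outer-select idea described above could be sketched like this. It assumes the #DataDetails and #JOBTRACKING temp tables from the test script earlier in the thread, and it applies the date and status filter inside the inner select, which is a design choice of this sketch.

```sql
-- Assumes #DataDetails and #JOBTRACKING from the test script above.
SELECT UL.StoreName, COUNT(*) AS TotalDelivery
FROM #DataDetails AS UL
INNER JOIN (
    -- inner select: keep only repair numbers that occur exactly once
    -- per store for the given date and status
    SELECT store_code, REPAIRNO
    FROM #JOBTRACKING
    WHERE CAST(created_Date AS date) = '2017-03-08'
      AND JOBSTATUS = 100
    GROUP BY store_code, REPAIRNO
    HAVING COUNT(*) = 1
) AS JT
    ON JT.store_code = UL.storeID
-- outer select: count the surviving repairs per store
GROUP BY UL.StoreName;
-- for the test data: ABC 3, XYZ 2
```

Because the duplicates are removed before the join, no DISTINCT is needed in the outer query.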
I need to run some query against each rowset in a table (Azure SQL):
ID CustomerID MsgTimestamp Msg
-------------------------------------------------
1 123 2017-01-01 10:00:00 Hello
2 123 2017-01-01 10:01:00 Hello again
3 123 2017-01-01 10:02:00 Can you help me with my order
4 123 2017-01-01 11:00:00 Are you still there
5 456 2017-01-01 10:07:00 Hey I'm a new customer
What I want to do is extract the "chat sessions" for every customer from the message records; that is, if the gap between two consecutive messages from the same customer is less than 30 minutes, they belong to the same session. I need to record the start and end time of each session in a new table. In the example above, the start and end times of the first session for customer 123 are 10:00 and 10:02.
I know I can always use cursor and temp table to achieve that goal, but I'm thinking about utilizing any pre-built mechanism to reach better performance. Please kindly give me some input.
You can use window functions instead of a cursor. Something like this should work:
declare @t table (ID int, CustomerID int, MsgTimestamp datetime2(0), Msg nvarchar(100))
insert @t values
(1, 123, '2017-01-01 10:00:00', 'Hello'),
(2, 123, '2017-01-01 10:01:00', 'Hello again'),
(3, 123, '2017-01-01 10:02:00', 'Can you help me with my order'),
(4, 123, '2017-01-01 11:00:00', 'Are you still there'),
(5, 456, '2017-01-01 10:07:00', 'Hey I''m a new customer')
;with x as (
-- g = 1 when a message starts a new session (first message of a customer, or a gap of 30+ minutes)
select *, case when datediff(minute, lag(msgtimestamp, 1, '19000101') over(partition by customerid order by msgtimestamp), msgtimestamp) >= 30 then 1 else 0 end as g
from @t
),
y as (
-- running total of session starts per customer gives each message its session number
select *, sum(g) over(partition by customerid order by msgtimestamp) as gg
from x
)
select customerid, min(msgtimestamp) as session_start, max(msgtimestamp) as session_end
from y
group by customerid, gg
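Since the question also asks for the sessions to be stored in a new table, the same gaps-and-islands logic can feed an INSERT directly. This is a self-contained sketch: the temp tables #Msg and #ChatSession and their column names are hypothetical stand-ins for the real message table and target table.

```sql
-- #Msg and #ChatSession are hypothetical names used only for this sketch.
CREATE TABLE #Msg (ID int, CustomerID int, MsgTimestamp datetime2(0), Msg nvarchar(100));
INSERT #Msg VALUES
(1, 123, '2017-01-01 10:00:00', 'Hello'),
(2, 123, '2017-01-01 10:01:00', 'Hello again'),
(3, 123, '2017-01-01 10:02:00', 'Can you help me with my order'),
(4, 123, '2017-01-01 11:00:00', 'Are you still there'),
(5, 456, '2017-01-01 10:07:00', 'Hey I''m a new customer');

CREATE TABLE #ChatSession (CustomerID int, SessionStart datetime2(0), SessionEnd datetime2(0));

WITH flagged AS (
    -- IsStart = 0 only when the previous message of the same customer is less than 30 minutes older;
    -- the first message of a customer has no predecessor (NULL) and therefore starts a session
    SELECT CustomerID, MsgTimestamp,
           CASE WHEN DATEDIFF(minute,
                              LAG(MsgTimestamp) OVER (PARTITION BY CustomerID ORDER BY MsgTimestamp),
                              MsgTimestamp) < 30
                THEN 0 ELSE 1 END AS IsStart
    FROM #Msg
),
numbered AS (
    -- running count of session starts per customer = session number
    SELECT CustomerID, MsgTimestamp,
           SUM(IsStart) OVER (PARTITION BY CustomerID ORDER BY MsgTimestamp
                              ROWS UNBOUNDED PRECEDING) AS SessionNo
    FROM flagged
)
INSERT INTO #ChatSession (CustomerID, SessionStart, SessionEnd)
SELECT CustomerID, MIN(MsgTimestamp), MAX(MsgTimestamp)
FROM numbered
GROUP BY CustomerID, SessionNo;

SELECT * FROM #ChatSession ORDER BY CustomerID, SessionStart;
-- 123: 10:00:00 to 10:02:00, 123: 11:00:00 to 11:00:00, 456: 10:07:00 to 10:07:00
```

Note that DATEDIFF(minute, ...) counts minute boundaries rather than elapsed seconds, so gaps very close to 30 minutes may be classified slightly differently than an exact comparison would.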
I have a table that lists all users for my company. There are multiple entries for each staff member showing how they have been employed.
RowID UserID FirstName LastName Title StartDate Active EndDate
-----------------------------------------------------------------------------------
1 1 John Smith Manager 2017-01-01 0 2017-01-31
2 1 John Smith Director 2017-02-01 0 2017-02-28
3 1 John Smith CEO 2017-03-01 1 NULL
4 2 Sam Davey Manager 2017-01-01 0 2017-02-28
5 2 Sam Davey Manager 2017-03-01 0 NULL
6 3 Hugh Holland Admin 2017-02-01 1 NULL
7 4 David Smith Admin 2017-01-01 0 2017-02-28
I am trying to write a query that will tell me someone's length of service at any given time.
The part I am having trouble with is that a single person is represented by multiple rows as their information changes over time, so I need to combine multiple rows.
I have a query that reports who is employed at a point in time, which is as far as I have gotten.
DECLARE @DateCheck datetime
SET @DateCheck = '2017/05/10'
SELECT *
FROM UsersTest
WHERE @DateCheck >= StartDate AND @DateCheck <= ISNULL(EndDate, @DateCheck)
You need to use the DATEDIFF function. The key is choosing the appropriate unit: days, months, or years. The return value is an integer counting the unit boundaries crossed, so with years each result is a whole number, and this applies to each record before the values are summed, not to the total. I've chosen months below. A CTE has also been added to get the most recent name for each user:
WITH CurrentName AS
(SELECT UserID, FirstName, LastName
from
UserStartStop
where Active = 1 -- You can replace this with a date check
)
SELECT uss.UserID,
MAX(cn.FirstName) as FirstName, -- the max is necessary because we are
-- grouping. Could include in group by
MAX(cn.LastName) as LastName,
SUM(DATEDIFF(mm,uss.StartDate,COALESCE(uss.EndDate,GETDATE()))) as MonthsOfService
from UserStartStop uss
JOIN CurrentName cn
on uss.UserID = cn.UserID
GROUP BY uss.UserID -- qualified: UserID exists in both uss and cn
order by uss.UserID
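The per-record behaviour of DATEDIFF mentioned above can be checked with a few literal dates. This is just an illustrative sketch; the column aliases are made up for the example.

```sql
SELECT DATEDIFF(mm, '2017-01-01', '2017-01-31') AS SameMonth,      -- 0: no month boundary crossed
       DATEDIFF(mm, '2017-01-31', '2017-02-01') AS NextDay,        -- 1: one month boundary crossed
       DATEDIFF(yy, '2016-12-31', '2017-01-01') AS AcrossNewYear,  -- 1: one year boundary crossed
       DATEDIFF(d,  '2017-01-01', '2017-01-31') AS DaysInRow;      -- 30
```

So a row that runs from the first to the last day of a month contributes 0 months to the sum, which is why summing days (or taking the difference between the overall first and last date) may give a more faithful total.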
The following returns the length of service in days; for months in service, change 'd' to 'mm':
Create table #UsersTest (
RowId int
, UserID int
, FirstName nvarchar(100)
, LastName nvarchar(100)
, Title nvarchar(100)
, StartDate date
, Active bit
, EndDate date)
Insert #UsersTest values (1, 1, 'John', 'Smith', 'Manager', '2017-01-01', 0, '2017-01-31')
Insert #UsersTest values (2, 1, 'John', 'Smith', 'Director', '2017-02-01', 0, '2017-02-28')
Insert #UsersTest values (3, 1, 'John', 'Smith', 'CEO', '2017-03-01', 1, null)
Insert #UsersTest values (4, 2, 'Sam', 'Davey', 'Manager', '2017-01-01', 0, '2017-02-28')
Insert #UsersTest values (5, 2, 'Sam', 'Davey', 'Manager', '2017-03-01', 0, null)
Insert #UsersTest values (6, 3, 'Hugh', 'Holland', 'Admin', '2017-02-01', 1, null)
Insert #UsersTest values (7, 4, 'David', 'Smith', 'Admin', '2017-01-01', 0, '2017-02-28')
Declare @DateCheck as datetime = '2017/05/10'
Select UserID, FirstName, LastName
-- open-ended rows and end dates after @DateCheck are capped at @DateCheck
, Datediff(d, Min([StartDate]), Max(iif([EndDate] is null or [EndDate] > @DateCheck, @DateCheck, [EndDate]))) as [LengthOfService]
from #UsersTest
where [StartDate] <= @DateCheck -- ignore rows that start after the check date
Group by UserID, FirstName, LastName
Try this:
Select
FirstName,
LastName,
Min(StartDate) as StartDate,
Max(isnull(EndDate,getdate())) as EndDate
from UsersTest
group by UserID, FirstName, LastName