OrderID, OrderTime, OrderItem
1, 1 am, Orange
1, 2 am, Apple
2, 3 am, Grape
3, 2 am, Apple
3, 3 am, Coconut
1, 5 am, Banana
1, 6 am, Apple
The above is the original table. The below is the table I want. If the same order ID appears in consecutive rows, I want to change the order time of those rows to the minimum time in that run.
OrderID, OrderTime, OrderItem
1, 1 am, Orange
1, 1 am, Apple
2, 3 am, Grape
3, 2 am, Apple
3, 2 am, Coconut
1, 5 am, Banana
1, 5 am, Apple
There is no "natural" order to your data in SQL Server. If you don't specify an order, the order you get back is in no way guaranteed. You need a column that you can use in an ORDER BY to define the sequence. Once you have this, you can do something like:
declare @data table
(
OrderItemId int identity(1,1) primary key,
OrderID int,
OrderTime datetime,
OrderItem varchar(10))
-- Original rows from the question, inserted in display order
insert into @data (OrderID, OrderTime, OrderItem)
SELECT 1,'01:00:00','Orange' UNION ALL
SELECT 1,'02:00:00','Apple' UNION ALL
SELECT 2,'03:00:00','Grape' UNION ALL
SELECT 3,'02:00:00','Apple' UNION ALL
SELECT 3,'03:00:00','Coconut' UNION ALL
SELECT 1,'05:00:00','Banana' UNION ALL
SELECT 1,'06:00:00','Apple';
-- Grp is constant within each run of consecutive rows that share an OrderID
WITH cte1 AS
(
SELECT *, ROW_NUMBER() over (ORDER BY OrderItemId)-
ROW_NUMBER() over (PARTITION BY OrderID ORDER BY OrderItemId) AS Grp
FROM @data
), cte2 AS
(
SELECT *, MIN(OrderTime) OVER (PARTITION BY OrderID, Grp) AS MinOrderTime
FROM cte1
)
UPDATE cte2 SET OrderTime = MinOrderTime;
SELECT OrderID, convert(varchar(8), OrderTime, 8) AS OrderTime, OrderItem
FROM @data
ORDER BY OrderItemId
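To see why subtracting the two ROW_NUMBER values identifies each consecutive run, it can help to look at the intermediate columns. This is only a sketch: the table variable @demo, the explicit OrderItemId values and the aliases rn_all and rn_per_order are my own names for illustration, and the rows are the ones from the question.

```sql
-- Hypothetical demo table; OrderItemId is assigned explicitly to fix the row order
DECLARE @demo TABLE
(
    OrderItemId int PRIMARY KEY,
    OrderID     int,
    OrderTime   time(0),
    OrderItem   varchar(10)
);

INSERT INTO @demo (OrderItemId, OrderID, OrderTime, OrderItem)
VALUES (1, 1, '01:00', 'Orange'),
       (2, 1, '02:00', 'Apple'),
       (3, 2, '03:00', 'Grape'),
       (4, 3, '02:00', 'Apple'),
       (5, 3, '03:00', 'Coconut'),
       (6, 1, '05:00', 'Banana'),
       (7, 1, '06:00', 'Apple');

-- Show both row numbers and their difference side by side
SELECT OrderItemId,
       OrderID,
       ROW_NUMBER() OVER (ORDER BY OrderItemId)                      AS rn_all,
       ROW_NUMBER() OVER (PARTITION BY OrderID ORDER BY OrderItemId) AS rn_per_order,
       ROW_NUMBER() OVER (ORDER BY OrderItemId)
     - ROW_NUMBER() OVER (PARTITION BY OrderID ORDER BY OrderItemId) AS Grp
FROM @demo
ORDER BY OrderItemId;
```

Within one OrderID, rn_all and rn_per_order both increase by 1 while the rows are consecutive, so their difference stays constant; as soon as another OrderID appears in between, rn_all jumps ahead and the difference changes. Here the (OrderID, Grp) pairs come out as (1,0), (1,0), (2,2), (3,3), (3,3), (1,3), (1,3), which is why the MIN is partitioned by both OrderID and Grp.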
Given the following table structure
Column
Id
Name
DateCreated
with the following data
Id  Name   DateCreated
1   Joe    1/13/2021
2   Fred   1/13/2021
3   Bob    1/12/2021
4   Sue    1/12/2021
5   Sally  1/10/2021
6   Alex   1/9/2021
I need SQL that will page over the data based on DateCreated. The query should return the top 3 records, plus any record that shares a DateCreated with one of those top 3.
So given the data above, we should get back Joe, Fred and Bob (as the top 3 records) plus Sue, since Sue has the same date as Bob.
Is there something like ROW_NUMBER that increments only when it encounters a different value?
For some context this query is being used to generate an agenda type view, and once we select any date we want to keep all data for that date together.
EDIT
I do have a solution but it smells:
;WITH CTE AS ( SELECT ROW_NUMBER() OVER(ORDER BY DateCreated DESC) RowNum,CAST(DateCreated AS DATE) DateCreated,Name
FROM MyTable),
PAGE AS (SELECT *
FROM CTE
WHERE RowNum<=5)
SELECT *
FROM Page
UNION
SELECT *
FROM CTE
WHERE DateCreated=(SELECT MIN(DateCreated) FROM Page)
I've used a TOP 3 WITH TIES example and a ROW_NUMBER example and a CTE to return four records:
DROP TABLE IF EXISTS #tmp
GO
CREATE TABLE #tmp (
Id INT PRIMARY KEY,
name VARCHAR(20) NOT NULL,
dateCreated DATE
)
GO
INSERT INTO #tmp VALUES
( 1, 'Joe', '13 Jan 2021' ),
( 2, 'Fred', '13 Jan 2021' ),
( 3, 'Bob', '12 Jan 2021' ),
( 4, 'Sue', '12 Jan 2021' ),
( 5, 'Sally', '10 Jan 2021' ),
( 6, 'Alex', '9 Jan 2021' )
GO
-- Gets same result
SELECT TOP 3 WITH TIES *
FROM #tmp t
ORDER BY dateCreated DESC
;WITH cte AS (
SELECT ROW_NUMBER() OVER( ORDER BY dateCreated DESC ) rn, *
FROM #tmp
)
SELECT *
FROM #tmp t
WHERE EXISTS
(
SELECT *
FROM cte c
WHERE rn <=3
AND t.dateCreated = c.dateCreated
)
My results: both queries return four rows (Joe, Fred, Bob and Sue).
As @Charlieface suggested, we only need to replace ROW_NUMBER with DENSE_RANK, so that rows with the same DateCreated get the same number.
When we run the query:
SELECT DENSE_RANK () OVER(ORDER BY DateCreated DESC) RowNum,CAST(DateCreated AS DATE) DateCreated,Name
FROM MyTable
The result will show as follows:
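For comparison, here is a small sketch that puts ROW_NUMBER, RANK and DENSE_RANK next to each other on the sample data from the question. The table variable @MyTable and the aliases Rnk and DenseRnk are assumptions made for this example only.

```sql
-- Hypothetical stand-in for MyTable, filled with the rows from the question
DECLARE @MyTable TABLE (Id int PRIMARY KEY, Name varchar(20), DateCreated date);

INSERT INTO @MyTable (Id, Name, DateCreated)
VALUES (1, 'Joe',   '20210113'),
       (2, 'Fred',  '20210113'),
       (3, 'Bob',   '20210112'),
       (4, 'Sue',   '20210112'),
       (5, 'Sally', '20210110'),
       (6, 'Alex',  '20210109');

-- Same ordering, three different numbering functions
SELECT Name,
       DateCreated,
       ROW_NUMBER() OVER (ORDER BY DateCreated DESC) AS RowNum,
       RANK()       OVER (ORDER BY DateCreated DESC) AS Rnk,
       DENSE_RANK() OVER (ORDER BY DateCreated DESC) AS DenseRnk
FROM @MyTable
ORDER BY DateCreated DESC, Id;
```

With this data RowNum runs from 1 to 6 (the order inside each tie is arbitrary), Rnk is 1, 1, 3, 3, 5, 6 and DenseRnk is 1, 1, 2, 2, 3, 4. So DenseRnk <= 3 keeps the three most recent dates (five rows), while Rnk <= 3 keeps the top 3 rows plus their ties (four rows), the same as TOP 3 WITH TIES.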
So as a result, we can set RowNum<=3 in the query to get the top 3:
;WITH CTE AS ( SELECT DENSE_RANK() OVER(ORDER BY DateCreated DESC) RowNum,CAST(DateCreated AS DATE) DateCreated,Name
FROM MyTable),
PAGE AS (SELECT *
FROM CTE
WHERE RowNum<=3)
SELECT *
FROM Page
UNION
SELECT *
FROM CTE
WHERE DateCreated=(SELECT MIN(DateCreated) FROM Page)
The first query is yours and the second is the one above. The results of the two queries are the same.
Kindly let us know if you need more information.
I have...
ID SKU PRODUCT
=======================
1 FOO-23 Orange
2 BAR-23 Orange
3 FOO-24 Apple
4 FOO-25 Orange
5 FOO-25 null
6 FOO-25 null
expected result:
1 FOO-23 Orange
3 FOO-24 Apple
5 FOO-25 null
6 FOO-25 null
This query isn't getting me there. How can I SELECT DISTINCT on just one column and eliminate null in SELECT DISTINCT?
SELECT *
FROM (SELECT ID, SKU, Product,
ROW_NUMBER() OVER (PARTITION BY PRODUCT ORDER BY ID) AS RowNumber
FROM MyTable
WHERE SKU LIKE 'FOO%') AS a
WHERE a.RowNumber = 1
Perhaps one approach is using the WITH TIES in concert with a conditional PARTITION
Example
Declare @YourTable Table ([ID] int,[SKU] varchar(50),[PRODUCT] varchar(50))
Insert Into @YourTable Values
(1,'FOO-23','Orange')
,(2,'BAR-23','Orange')
,(3,'FOO-24','Apple')
,(4,'FOO-25','Orange')
,(5,'FOO-25',NULL)
,(6,'FOO-25',NULL)
Select top 1 with ties *
From @YourTable
Where SKU Like 'FOO%'
Order By Row_Number() over (Partition By IsNull(Product,NewID()) Order By ID)
Returns
ID SKU PRODUCT
6 FOO-25 NULL
5 FOO-25 NULL
3 FOO-24 Apple
1 FOO-23 Orange
Using John Cappelletti's sample data here is another approach. All you really needed was to add the OR predicate to your where clause.
Declare @YourTable Table ([ID] int,[SKU] varchar(50),[PRODUCT] varchar(50))
Insert Into @YourTable Values
(1,'FOO-23','Orange')
,(2,'BAR-23','Orange')
,(3,'FOO-24','Apple')
,(4,'FOO-25','Orange')
,(5,'FOO-25',NULL)
,(6,'FOO-25',NULL)
SELECT *
FROM
(
SELECT ID
, SKU
, Product
, ROW_NUMBER() OVER (PARTITION BY PRODUCT ORDER BY ID) AS RowNumber
FROM @YourTable
WHERE SKU LIKE 'FOO%'
) AS a
WHERE a.RowNumber = 1
OR a.PRODUCT IS NULL --This was the only part you were missing
I changed your row_number to dense rank:
Declare @YourTable Table ([ID] int,[SKU] varchar(50),[PRODUCT] varchar(50))
Insert Into @YourTable Values
(1,'FOO-23','Orange')
,(2,'BAR-23','Orange')
,(3,'FOO-24','Apple')
,(4,'FOO-25','Orange')
,(5,'FOO-25',NULL)
,(6,'FOO-25',NULL)
SELECT *
FROM (SELECT ID, SKU, Product,
Dense_RANK() OVER (PARTITION BY SKU ORDER BY Product) AS RowNumber
FROM @YourTable
WHERE left(SKU,3) = 'FOO') AS a
WHERE a.RowNumber = 1
Results:
ID SKU Product RowNumber
1 FOO-23 Orange 1
3 FOO-24 Apple 1
5 FOO-25 NULL 1
6 FOO-25 NULL 1
Good morning all,
I would appreciate any help you can give me on this subject.
I have a table that grows over time. Id1 stays the same,
but sometimes Id2 changes, like the history of an equipment park.
I would like to find the best way to write a query that retrieves
the rows where Id2 changes, along with the time of the change.
For example, if the table contents are
Id1 Id2 time
1 1 10:00
1 1 10:30
1 2 10:40
1 2 10:45
1 2 11:00
1 3 11:45
1 3 12:45
query output would be
Id1 oldId2 newId2 time
1 1 2 10:40
1 2 3 11:45
I have done this with a stored procedure, but I was wondering if there is a faster/cleaner way to get this.
Thanks in advance.
You can do this with ranking functions.
Schema:
CREATE TABLE #TAB (Id1 INT,Id2 INT, timeS TIME )
INSERT INTO #TAB
SELECT 1 AS Id1 , 1 Id2, '10:00' AS timeS
UNION ALL
SELECT 1, 1, '10:30'
UNION ALL
SELECT 1, 2, '10:40'
UNION ALL
SELECT 1, 2, '10:45'
UNION ALL
SELECT 1, 2, '11:00'
UNION ALL
SELECT 1, 3, '11:45'
UNION ALL
SELECT 1, 3, '12:45'
Now do select with ROW_NUMBER and CTE for retrieving previous/next row values.
;WITH CTE
AS (
SELECT ROW_NUMBER() OVER (ORDER BY timeS) AS RNO -- order by time so that consecutive RNO values are consecutive changes
,ID1
,ID2
,timeS
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY ID2 ORDER BY TIMES) AS SNO
,*
FROM #TAB
) A
WHERE SNO = 1
)
SELECT C1.Id1
,C1.Id2 AS OLD_ID2
,C2.Id2 AS NEW_ID2
,C2.timeS
FROM CTE C1
LEFT JOIN CTE C2 ON C1.RNO + 1 = C2.RNO
WHERE C2.Id1 IS NOT NULL
Result:
+-----+---------+---------+------------------+
| Id1 | OLD_ID2 | NEW_ID2 | timeS |
+-----+---------+---------+------------------+
| 1 | 1 | 2 | 10:40:00.0000000 |
| 1 | 2 | 3 | 11:45:00.0000000 |
+-----+---------+---------+------------------+
Note: If you want to get previous/next row values into the current row, you can use the LEAD and LAG functions, but they are supported only in SQL Server 2012 and later.
The above LEFT JOIN on the CTE will work for lower versions too.
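For SQL Server 2012 or later, the LAG version mentioned in the note could look roughly like this. It is a sketch under assumptions: the table variable @t and the alias PrevId2 are made up for the example, and I partition by Id1 on the assumption that changes should be tracked separately for each Id1.

```sql
-- Hypothetical table variable holding the sample rows from the question
DECLARE @t TABLE (Id1 int, Id2 int, timeS time(0));

INSERT INTO @t (Id1, Id2, timeS)
VALUES (1, 1, '10:00'),
       (1, 1, '10:30'),
       (1, 2, '10:40'),
       (1, 2, '10:45'),
       (1, 2, '11:00'),
       (1, 3, '11:45'),
       (1, 3, '12:45');

-- LAG fetches Id2 from the previous row (by time) within the same Id1
WITH Prev AS
(
    SELECT Id1,
           Id2,
           timeS,
           LAG(Id2) OVER (PARTITION BY Id1 ORDER BY timeS) AS PrevId2
    FROM @t
)
SELECT Id1,
       PrevId2 AS oldId2,
       Id2     AS newId2,
       timeS
FROM Prev
WHERE PrevId2 IS NOT NULL   -- the first row of each Id1 has no previous row
  AND PrevId2 <> Id2        -- keep only rows where Id2 actually changed
ORDER BY Id1, timeS;
```

For the sample rows this returns (1, 1, 2, 10:40) and (1, 2, 3, 11:45). Unlike partitioning by Id2, it also reports a change back to an Id2 value that was seen earlier.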
declare @t table (Id1 int, Id2 int, [time] time)
insert into @t
select 1, 1, '10:00' union
select 1, 1, '10:30' union
select 1, 2, '10:40' union
select 1, 2, '10:45' union
select 1, 2, '11:00' union
select 1, 3, '11:45' union
select 1, 3, '12:45'
-- oldId is the largest Id2 below the current one for the same Id1
select Id1, oldId = (select top 1 Id2 from @t where Id1=t.Id1 and Id2 < t.Id2 order by Id2 desc), newId = Id2, [time] = min([time])
from @t t
where Id2 > 1
group by Id1, Id2
I have made some changes to the code from Shakeer Mirza.
The practical problem that originated the question in the first place is this:
I have a table that represents the history of a piece of equipment, identified by the machine's internal id (Num_TPA).
Each time there is a malfunction, the machine is replaced by another one; it keeps the same Num_TPA but the Serial_number changes.
I needed to know the history for each internal id (Num_TPA): the new and the old Serial_number, and the date of replacement.
This is what came out:
;WITH CTE
AS (
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS RNO
,[Num_TPA]
,[Serial_number]
,[Data_Hora_Ficheiro]
,a.SNO
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY [Num_TPA]
ORDER BY [Data_Hora_Ficheiro]) AS SNO
,*
FROM tab_values
) A
WHERE SNO > 1
)
SELECT C1.[Num_TPA]
,C1.[Serial_number] AS OLD_ID2
,C2.[Serial_number] AS NEW_ID2
,C2.[Data_Hora_Ficheiro]
,c2.SNO
,c2.RNO
FROM tab_values C1
LEFT JOIN CTE C2 ON (
C1.[Num_TPA] = C2.[Num_TPA]
AND c1.[Serial_number] != c2.[Serial_number]
AND C2.[Data_Hora_Ficheiro] > C1.[Data_Hora_Ficheiro]
)
WHERE C2.[Num_TPA] IS NOT NULL
AND SNO = 2
UNION
SELECT C1.[Num_TPA]
,C1.[Serial_number] AS OLD_ID2
,C2.[Serial_number] AS NEW_ID2
,C2.[Data_Hora_Ficheiro]
,c2.SNO
,c2.RNO
FROM CTE C1
LEFT JOIN CTE C2 ON (
C1.SNO + 1 = C2.SNO
AND C1.[Num_TPA] = C2.[Num_TPA]
)
WHERE C2.[Num_TPA] IS NOT NULL
AND C2.SNO > 2
I have a login table that contains the ID of the customer and the timestamp of the login time (customerid, timestamp).
I am looking to get all the customer IDs that logged in at least three times within sixty minutes. By the way, the login table is huge. Self joining is not an option.
For example:
customer id | timestamp
1 | 2016-08-16 00:00
2 | 2016-08-16 00:00
3 | 2016-08-16 00:00
1 | 2016-08-16 00:25
2 | 2016-08-16 01:25
3 | 2016-08-16 00:25
1 | 2016-08-16 00:47
2 | 2016-08-16 01:27
3 | 2016-08-16 02:25
3 | 2016-08-16 03:25
1 | 2016-08-16 01:05
For this example, the query should return only customerid 1. Any ideas?
Tested with rexTester: http://rextester.com/RMST24716 (thanks TT.!)
CREATE TABLE loginTable (id INT NOT NULL, timestamp DATETIME NOT NULL);
INSERT INTO loginTable (id, timestamp) values
( 1, '2016-08-16 00:00'),
( 2, '2016-08-16 00:00'),
( 3, '2016-08-16 00:00'),
( 1, '2016-08-16 00:25'),
( 2, '2016-08-16 01:25'),
( 3, '2016-08-16 00:25'),
( 1, '2016-08-16 00:47'),
( 2, '2016-08-16 01:27'),
( 3, '2016-08-16 02:25'),
( 3, '2016-08-16 03:25'),
( 1, '2016-08-16 01:05');
SELECT distinct a.id
FROM loginTable as a
join loginTable as b on a.id = b.id and a.timestamp < b.timestamp
join loginTable as c on b.id = c.id and b.timestamp < c.timestamp
where Datediff(minute, a.timestamp, c.timestamp) <= 60;
Hope it helps (http://rextester.com/CTR13554):
SELECT a.id, a.timestamp, COUNT(DISTINCT b.timestamp)
FROM loginTable a
JOIN loginTable b ON a.id = b.id AND a.timestamp <= b.timestamp
JOIN loginTable c ON a.id = c.id AND a.timestamp <= c.timestamp
WHERE 1=1
AND ABS(DATEDIFF(minute,a.timestamp,b.timestamp)) <= 60
AND ABS(DATEDIFF(minute,a.timestamp,c.timestamp)) <= 60
GROUP BY a.id, a.timestamp
HAVING COUNT(DISTINCT b.timestamp) >= 3
By the way, in your example customer 1 logged in 3 times within an hour twice: [00:00; 00:25; 00:47] and [00:25; 00:47; 01:05].
Here is the code for a quick test of the query above:
CREATE TABLE loginTable (id INT NOT NULL, timestamp DATETIME NOT NULL)
INSERT INTO loginTable (id, timestamp)
SELECT 1, '2016-08-16 00:00'
UNION SELECT 2, '2016-08-16 00:00'
UNION SELECT 3, '2016-08-16 00:00'
UNION SELECT 1, '2016-08-16 00:25'
UNION SELECT 2, '2016-08-16 01:25'
UNION SELECT 3, '2016-08-16 00:25'
UNION SELECT 1, '2016-08-16 00:47'
UNION SELECT 2, '2016-08-16 01:27'
UNION SELECT 3, '2016-08-16 02:25'
UNION SELECT 3, '2016-08-16 03:25'
UNION SELECT 1, '2016-08-16 01:05'
I was only able to test this on rextester, where the following seems to work for MSSQL; hopefully your MSSQL version supports analytic functions too.
In this case no self-join is needed and the table is scanned only once.
CREATE TABLE loginTable (id INT NOT NULL, timestamp DATETIME NOT NULL)
INSERT INTO loginTable (id, timestamp)
SELECT 1, '2016-08-16 00:00'
UNION SELECT 2, '2016-08-16 00:00'
UNION SELECT 3, '2016-08-16 00:00'
UNION SELECT 1, '2016-08-16 00:25'
UNION SELECT 2, '2016-08-16 01:25'
UNION SELECT 3, '2016-08-16 00:25'
UNION SELECT 1, '2016-08-16 00:47'
UNION SELECT 2, '2016-08-16 01:27'
UNION SELECT 3, '2016-08-16 02:25'
UNION SELECT 3, '2016-08-16 03:25'
UNION SELECT 1, '2016-08-16 01:05';
select id, min_t, max_t from (
select id,
min(timestamp) over (partition by id order by id, timestamp rows between 2 preceding and current row) as min_t,
max(timestamp) over (partition by id order by id, timestamp rows between 2 preceding and current row) as max_t,
count(timestamp) over (partition by id order by id, timestamp rows between 2 preceding and current row) as num_t
from loginTable
) ts_data
where ABS(DATEDIFF(minute,min_t,max_t)) <= 60 and num_t=3;
(thanks to @Salvador for sharing some test scripts)
Explanation
The idea here is to scan loginTable just once, ordered by timestamp, and keep in memory the last three logins for every id (current one included).
If the minimum and the maximum timestamp of those three fall within a 60-minute period, we almost have the result.
Finally, we have to handle one corner case:
when we encounter the first or second login of a customer, the min and max timestamps can both fall within the 60-minute span (for the first login they are the same).
However, they would not satisfy the OP's requirement (three distinct logins), so we have to count the logins in the frame and make sure there are 3 (num_t=3).
Edited
(thanks again to @Salvador for the warning)
There was an error in the first version, where the window specification said "rows between 3 preceding". I did need to look at 3 rows, but the current one is included, so it should have been "rows between 2 preceding".
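A variation on the same idea, for SQL Server 2012 or later: instead of a three-row frame, compare each login with the login two rows earlier using LAG. This is a sketch, not the query above; the table variable @logins and the alias two_logins_back are hypothetical names used only here.

```sql
-- Hypothetical table variable with the sample logins from the question
DECLARE @logins TABLE (id int NOT NULL, [timestamp] datetime NOT NULL);

INSERT INTO @logins (id, [timestamp])
VALUES (1, '2016-08-16 00:00'),
       (2, '2016-08-16 00:00'),
       (3, '2016-08-16 00:00'),
       (1, '2016-08-16 00:25'),
       (2, '2016-08-16 01:25'),
       (3, '2016-08-16 00:25'),
       (1, '2016-08-16 00:47'),
       (2, '2016-08-16 01:27'),
       (3, '2016-08-16 02:25'),
       (3, '2016-08-16 03:25'),
       (1, '2016-08-16 01:05');

-- A customer qualifies if any login is at most 60 minutes after
-- that customer's login two rows earlier (three logins in total)
SELECT DISTINCT id
FROM (
    SELECT id,
           [timestamp],
           LAG([timestamp], 2) OVER (PARTITION BY id ORDER BY [timestamp]) AS two_logins_back
    FROM @logins
) AS x
WHERE DATEDIFF(minute, two_logins_back, [timestamp]) <= 60;
```

For the sample data this returns only id 1. Rows with fewer than two earlier logins get NULL from LAG and drop out of the WHERE clause, so no separate count is needed.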
This query gets the customerids that logged in at least three times within the last 60 minutes from NOW:
SELECT customerid FROM
(SELECT customerid, count(*) as loginnumber FROM LoginTable
WHERE [timestamp] > DATEADD(minute, -60, GetDate())
GROUP BY customerid ) LT
WHERE loginnumber >= 3
Using a self-join:
SELECT M.customer_id FROM (
SELECT Distinct T1.customer_id, T1.Time AS Time1,
T2.Time AS Time2,
Datediff(minute,T1.Time,T2.Time) as diff
FROM [Table] T1 JOIN [Table] T2 ON T1.customer_id=T2.customer_id
AND T1.Time<T2.Time
) M
WHERE diff<=60
Group By M.customer_id
Having count(*)>=3
an easy way to find the ones which don't span hour boundaries is:
select
id,
datepart(yy,timestamp) as yy,
datepart(mm,timestamp) as mm,
datepart(dd,timestamp) as dd,
datepart(hh,timestamp) as hh,
count(*)
from
logintable
group by
id,
datepart(yy,timestamp),
datepart(mm,timestamp),
datepart(dd,timestamp),
datepart(hh,timestamp)
having
count(*) >= 3
if your table is very large, you might knock it down to clients with at least three logins per day, and then self-join. It will still miss logins across day spans, but it's a simple solution that moves you forward while you work on a more sophisticated one.
I adapted my PgSQL solution to MSSQL.
You can see the result here: http://rextester.com/CBPW42897
WITH tbl AS (
SELECT id
, IIF( DATEDIFF(minute,
lag(ts, 1) OVER (PARTITION BY id ORDER BY ts asc ),
ts )<=60,
1, 0) as freq60
FROM loginTable
)
SELECT id FROM tbl
GROUP BY tbl.id HAVING SUM(freq60) >=3
ORDER BY tbl.id
I like MSSQL's handy IIF and DATEDIFF functions, but it's a bit awkward to specify the same window every time.
The following code works in PgSQL:
with tbl as (
select cust_id
,case when extract(epoch from (ts - lag(ts, 1) over w) ) < 3600 then 1
else 0
end as freq60
from loginTable
window w as (partition by cust_id order by ts asc )
)
select cust_id
from tbl
group by tbl.cust_id having sum(freq60) >=3
order by tbl.cust_id
The idea is pretty straightforward. Create a window per customer id, with rows sorted by time. Subtract the previous row's timestamp from each row's timestamp to get the interval; if the interval is within 60 minutes return 1, else 0, then aggregate the result and return the ids whose sum is >= 3. Note that this counts gaps between consecutive logins, so it flags ids with at least three such gaps in total rather than three logins inside a single 60-minute window.
There is only one sort by time, done within the window function. That is faster by orders of magnitude than a self-join when dealing with tens of thousands of records.
id freq60 prev_ts ts intval
1 0 null 2016-08-16 00:00:00
1 1 2016-08-16 00:00:00 2016-08-16 00:25:00 25
1 1 2016-08-16 00:25:00 2016-08-16 00:47:00 22
1 1 2016-08-16 00:47:00 2016-08-16 01:05:00 18
I am trying to get some sorting and keep together (not really grouping) working.
In my sample data I would like to keep the DealerIDs together, sorted by IsPrimaryDealer DESC, but show the group (ok maybe it is grouping) of dealers by the ones with the most recent entry.
Result set 2 is the closest, but Grant and his brother should be displayed as the first two rows, in that order. (Grant should be row 1, Grants Brother row 2 because Grants Brother was the most recently added)
DECLARE @temp TABLE (
DealerPK int not null IDENTITY(1,1), DealerID int,
IsPrimaryDealer bit, DealerName varchar(50), DateAdded datetime
)
INSERT INTO @temp VALUES
(1, 1, 'Bob', GETDATE() - 7),
(2, 1, 'Robert', GETDATE() - 7),
(3, 1, 'Grant', GETDATE() - 7),
(3, 0, 'Grants Brother', GETDATE() - 1),
(2, 0, 'Roberts Nephew', GETDATE() - 2),
(1, 0, 'Bobs Cousin', GETDATE() - 3)
-- Data As Entered
SELECT * FROM @temp
-- Data Attempt at Row Numbering
SELECT *, intPosition =
ROW_NUMBER() OVER (PARTITION BY IsPrimaryDealer ORDER BY DealerID, IsPrimaryDealer DESC)
FROM @temp
ORDER BY DateAdded DESC
-- Data Attempt By DateAdded
SELECT *, intPosition =
ROW_NUMBER() OVER (PARTITION BY DealerID ORDER BY DateAdded DESC)
FROM @temp
ORDER BY intPosition, DateAdded
Expected Result
PK DID IsPr Name DateAdded
3 3 1 Grant 2015-10-08 17:14:26.497
4 3 0 Grants Brother 2015-10-14 17:14:26.497
2 2 1 Robert 2015-10-08 17:14:26.497
5 2 0 Roberts Nephew 2015-10-13 17:14:26.497
1 1 1 Bob 2015-10-08 17:14:26.497
6 1 0 Bobs Cousin 2015-10-12 17:14:26.497
As requested by OP:
;WITH Cte AS(
SELECT *,
mx = MAX(DateAdded) OVER(PARTITION BY DealerID) FROM @temp
)
SELECT *
FROM Cte
ORDER BY mx DESC, DealerID, IsPrimaryDealer DESC
I hope I understood your question.
This query returns the expected output:
SELECT Row_number()
OVER (
PARTITION BY DealerID
ORDER BY DealerPK)RN,
DealerPK,
DealerID,
IsPrimaryDealer,
DealerName,
DateAdded
FROM @temp
ORDER BY DealerID DESC