I'm running Microsoft SQL Server 2014 - 12.0.4213.0 (X64).
(Apologies - I'm a newbie and I know I'm running an old version)
I have the following table:
ID  Name      Time
--  --------  ---------------------------
1   Finished  2022-07-13 17:09:48.0000000
1   Start     2022-07-13 17:00:48.0000000
2   Clean     2022-07-13 15:09:48.0000000
2   Waiting   2022-07-13 17:34:48.0000000
2   Clean     2022-07-13 12:09:48.0000000
3   Start     2022-07-12 18:09:48.0000000
3   Middle    2022-07-12 14:09:48.0000000
3   Middle    2022-06-13 17:09:48.0000000
I want to return a group that will show the max time for each ID number, but also return the Name value of that max row.
I can do a
SELECT
ID, MAX(Time)
FROM
...
WHERE
...
GROUP BY
(ID)
but I need to pull in the Name column as well. I just want one row per ID, returning the max Time for that ID and the Name associated with that Time and ID.
Any help would be great, thank you.
This kind of thing has been asked and answered so many times, but finding the right search term can be challenging. Here is how you can tackle this with your sample data.
-- Table variable holding the sample data from the question
declare @Something table
(
ID int
, Name varchar(20)
, Time datetime2
)
insert @Something values
(1, 'Finished', '2022-07-13 17:09:48.0000000')
, (1, 'Start', '2022-07-13 17:00:48.0000000')
, (2, 'Clean', '2022-07-13 15:09:48.0000000')
, (2, 'Waiting', '2022-07-13 17:34:48.0000000')
, (2, 'Clean', '2022-07-13 12:09:48.0000000')
, (3, 'Start', '2022-07-12 18:09:48.0000000')
, (3, 'Middle', '2022-07-12 14:09:48.0000000')
, (3, 'Middle', '2022-06-13 17:09:48.0000000')
-- Number the rows of each ID from newest to oldest, then keep only the newest
select ID
, Name
, Time
from
(
select *
, RowNum = ROW_NUMBER()over(partition by s.ID order by s.Time desc)
from @Something s
) x
where x.RowNum = 1
Just another option (a nudge less performant)
Select Top 1 with ties *
From YourTable
Order By row_number() over (partition by ID order by Time desc)
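To see why TOP 1 WITH TIES returns one row per ID, it can help to look at the value being sorted on. The following is only a sketch against the question's sample data; the table variable @Something is an illustrative name, not part of the answer above. The newest row of every ID gets row number 1, so all of those rows tie for first place and WITH TIES keeps every one of them.

```sql
-- Sample data from the question, held in a table variable (illustrative name).
declare @Something table
(
    ID int
    , Name varchar(20)
    , Time datetime2
);

insert @Something values
    (1, 'Finished', '2022-07-13 17:09:48')
    , (1, 'Start', '2022-07-13 17:00:48')
    , (2, 'Clean', '2022-07-13 15:09:48')
    , (2, 'Waiting', '2022-07-13 17:34:48')
    , (2, 'Clean', '2022-07-13 12:09:48')
    , (3, 'Start', '2022-07-12 18:09:48')
    , (3, 'Middle', '2022-07-12 14:09:48')
    , (3, 'Middle', '2022-06-13 17:09:48');

-- Step 1: expose the sort key. Each ID has exactly one row with RowNum = 1.
select ID, Name, Time
    , RowNum = row_number() over (partition by ID order by Time desc)
from @Something
order by RowNum, ID;

-- Step 2: TOP 1 alone would keep a single row; WITH TIES also keeps every
-- other row whose sort key equals that of the first row (RowNum = 1).
select top 1 with ties ID, Name, Time
from @Something
order by row_number() over (partition by ID order by Time desc);
```

With this data the second query should return Finished for ID 1, Waiting for ID 2 and Start for ID 3. The order of those three rows is not guaranteed, because they all share the same sort key.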
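This can also work, as long as the subquery is correlated on ID (otherwise a row whose Time happens to equal another ID's maximum would be returned too):
select t.* from YourTable t
where t.Time = (select max(t2.Time) from YourTable t2 where t2.ID = t.ID)
But the other answers seem more efficient.
I have not tested this; if it's wrong, I will delete the answer.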
I have a self-referencing table of parents and children and I have written a recursive CTE so I now have a list of parent-child relationships with their depths against them i.e. which generation they are in.
Is it now possible to pivot this to show great-grandparents' Ids in the left column, then grandparents in the next column, then parents, then children etc. with their respective generations as the column headings please?
e.g. with the data you see I'm inserting into my temp table, can I get this please?
Gen0  Gen1  Gen2  Gen3  Gen4
1     2     3     4     5
10    20    100   1000
10    20    200
10    30
10    40
create table [#Data] ([ParentId] int, [ChildId] int)
insert [#Data] values
(1, 2)
, (2, 3)
, (3, 4)
, (4, 5 )
, (10, 20)
, (10, 30)
, (10, 40)
, (20, 100)
, (20, 200)
, (200, 1000)
;with [CTE] AS
(
select [A].[ParentId], [A].[ChildId], 1 as [Generation]
from [#Data] [A]
left join [#Data] [B]
on [A].[ParentId] = [B].[ChildId]
where [B].[ChildId] is null
union all
select [D].[ParentId], [D].[ChildId], [Generation] + 1
from [CTE] [C]
join [#Data] [D]
on [C].[ChildId] = [D].[ParentId]
)
select * from [CTE] order by [ParentId], [ChildId]
I am using SQL 2017.
Many thanks for looking.
You can use something like this to query your CTE:
;with [CTE] AS
(/*your code here*/)
select
g1.[ParentId] as Gen0
, coalesce(g1.[ChildId], g2.[ParentId]) as Gen1
, coalesce(g2.[ChildId], g3.[ParentId]) as Gen2
, coalesce(g3.[ChildId], g4.[ParentId]) as Gen3
, g4.[ChildId] as Gen4
from
[CTE] as g1
left join [CTE] as g2 on g1.[ChildId] = g2.[ParentId]
left join [CTE] as g3 on g2.[ChildId] = g3.[ParentId]
left join [CTE] as g4 on g3.[ChildId] = g4.[ParentId]
where g1.[Generation] = 1
See the complete code here. This is a static solution; if you want something that works with an arbitrary number of generations, you'll have to turn this code into dynamic T-SQL.
P.S. I think there is probably a typo in your expected results: in the Gen3 column, 1000 should be on the third row, not the second, since 1000 is a child of 200, not of 100.
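For reference, here is a self-contained sketch that puts the question's data, the recursive CTE and the chained self-joins together. It assumes a table variable named @Data in place of the temp table (so the script can be re-run without cleanup) and adds an ORDER BY purely to make the output stable; neither is part of the original posts.

```sql
-- Parent/child pairs from the question, in a table variable (illustrative).
declare @Data table (ParentId int, ChildId int);

insert @Data values
    (1, 2), (2, 3), (3, 4), (4, 5)
    , (10, 20), (10, 30), (10, 40)
    , (20, 100), (20, 200)
    , (200, 1000);

;with CTE as
(
    -- Anchor: pairs whose parent never appears as a child, i.e. root parents.
    select A.ParentId, A.ChildId, 1 as Generation
    from @Data A
    left join @Data B on A.ParentId = B.ChildId
    where B.ChildId is null
    union all
    -- Recursive step: walk down one generation at a time.
    select D.ParentId, D.ChildId, C.Generation + 1
    from CTE C
    join @Data D on C.ChildId = D.ParentId
)
select
    g1.ParentId as Gen0
    , coalesce(g1.ChildId, g2.ParentId) as Gen1
    , coalesce(g2.ChildId, g3.ParentId) as Gen2
    , coalesce(g3.ChildId, g4.ParentId) as Gen3
    , g4.ChildId as Gen4
from CTE as g1
left join CTE as g2 on g1.ChildId = g2.ParentId
left join CTE as g3 on g2.ChildId = g3.ParentId
left join CTE as g4 on g3.ChildId = g4.ParentId
where g1.Generation = 1
order by Gen0, Gen1, Gen2;

-- Rows this should produce for the data above:
-- 1   2   3    4     5
-- 10  20  100  NULL  NULL
-- 10  20  200  1000  NULL
-- 10  30  NULL NULL  NULL
-- 10  40  NULL NULL  NULL
```

Each extra generation needs one more left join and one more output column, which is why an arbitrary depth calls for dynamic SQL.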
I am trying to compare two strings and pick the longest if they are similar. I have managed to pick the longest by using the following code:
SELECT D.RID, ProductID, Product, [Length] FROM (
SELECT RID, MAX([Length]) AS theLength FROM SortData GROUP BY RID)
AS X INNER JOIN SortData AS D ON D.RID = X.RID AND D.[Length] = X.theLength
But I am now trying to make sure that the code only picks the longest string if it is LIKE the word it is being compared to. I have attempted the following code in a few ways, but I would be grateful if somebody could help me:
SELECT D.RID, D.ProductID, Product, [Length] FROM (
SELECT RID, Product, MAX([Length]) AS theLength FROM SortData GROUP BY RID)
AS X INNER JOIN SortData AS D ON D.RID = X.RID AND D.[Length] = X.theLength WHERE
D.Product LIKE Product
Using this code I get the following errors:
Msg 8120, Level 16, State 1, Line 3
Column 'SortData.Product' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Msg 209, Level 16, State 1, Line 5
Ambiguous column name 'Product'.
Msg 209, Level 16, State 1, Line 2
Ambiguous column name 'Product'.
Example of the Data I would Like to pick:
1 Sam
1 Samantha
2 Oliver
3 Ollie
4 Benjamin
4 Ben
...
I would expect the output list to be like:
1 Samantha
2 Oliver
3 Ollie
4 Benjamin
...
To clarify what I am trying to do in the context of this example: I am trying to compare the two names, and if they are LIKE each other (e.g. x.Name LIKE Name), then pick the longest...
As Requested here is further test data:
1 Hydrogen
1 Hydrogen Oxide
1 Carbon Monoxide
2 Carbon
2 Carbon
2 Carbon Dioxide
3 Carbon Monoxide
3 Carbon Dioxide
3 Oxygen
4 Hydrogen Dioxide
Desired results are as follows:
1 Hydrogen Oxide
1 Carbon Monoxide
2 Carbon Dioxide
3 Carbon Monoxide
3 Oxygen
4 Hydrogen Dioxide
Perhaps another option: The WITH TIES clause in concert with Row_Number()
Example
Select Top 1 with ties *
From YourTable
Order By Row_Number() over (Partition by ID Order By Len(Name) desc)
Your query doesn't come close to your sample data and output. So I built this around the sample data provided to demonstrate one way of solving this.
declare @Something table
(
Col1 int
, Col2 varchar(20)
)
insert @Something values
(1, 'Sam')
, (1, 'Samantha')
, (2, 'Oliver')
, (3, 'Ollie')
-- Within each Col1 group, rank rows by string length and keep the longest
select x.Col1
, x.Col2
from
(
select *
, RowNum = ROW_NUMBER() over(partition by Col1 order by LEN(Col2) desc)
from @Something
) x
where x.RowNum = 1
---EDIT---
To demonstrate that this code still returns the desired output from your new sample data...
declare @Something table
(
Col1 int
, Col2 varchar(20)
)
insert @Something values
(1, 'Sam')
, (1, 'Samantha')
, (2, 'Oliver')
, (3, 'Ollie')
, (4, 'Benjamin')
, (4, 'Ben')
select x.Col1
, x.Col2
from
(
select *
, RowNum = ROW_NUMBER() over(partition by Col1 order by LEN(Col2) desc)
from @Something
) x
where x.RowNum = 1
This returns:
1 Samantha
2 Oliver
3 Ollie
4 Benjamin
Since you say this still doesn't work, you need to provide an example of how or why it doesn't work. You keep mentioning LIKE but have not explained or demonstrated how that comes into play here. Help me understand the problem and I can help you find a solution.
I ended up figuring it out and using the following code:
SELECT D.RID, ProductID, D.Product, [Length] FROM
(
SELECT RID, MAX([Length]) AS theLength
FROM SortData GROUP BY RID
) AS X
INNER JOIN SortData AS D ON D.RID = X.RID AND D.[Length] = X.theLength
WHERE D.Product LIKE Product
GO
I am having an issue with my date values. The data type of the date field is datetime, and I am getting a lot of records for the same ID within 48 hours. The goal is to return only one record if a patient visits the hospital more than once within 48 hours. For example, if patient A goes to the ER on 1/1/2014 and goes back on 1/2/2014, then I only want to show the first visit, which is 1/1/2014. I really believe the issue is at this line:
AND A.[ADMT_TS] < DateAdd(d, 2, ADMT_TS)
and I think I need to do some conversion first in order to get the correct values.
Here is my query. Please note that I have other queries before this SELECT statement, but I am only posting the section where I am trying to get the first visit within 48 hours.
SELECT [ID], [LOCATION], [ADMT_TS]
FROM ERS WHERE RN = 1
UNION ALL
SELECT [ID], [LOCATION], [ADMT_TS]
FROM ERS A
WHERE RN > 1 AND EXISTS (SELECT 1 FROM ERS WHERE RN = 1 AND [ID] = A.[ID])
AND NOT EXISTS(SELECT 1 FROM ERS WHERE RN = 1 AND [ID] = A.[ID] AND A.[ADMT_TS] < DateAdd(d, 2, ADMT_TS))
This will work but may not be the best option. If you post some data and give us an idea of how many rows will be in the ERS table, I can adjust the query if needed.
SELECT [Id]
,[Loc]
,MIN([admt_ts])
FROM [NewJunk].[dbo].[ERS]
WHERE RN = 1
GROUP BY id, loc
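As one possible reading of the 48-hour rule, a visit can be kept only when the same patient has no earlier visit in the 48 hours before it. The sketch below uses made-up rows in a table variable called @Visits (a hypothetical stand-in for ERS with the same three columns) and treats exactly 48 hours as still "within 48 hours"; both choices are assumptions.

```sql
-- Hypothetical sample visits; column names follow the question's ERS query.
declare @Visits table (ID int, LOCATION varchar(10), ADMT_TS datetime);

insert @Visits values
    (1, 'ER', '2014-01-01T08:00:00')
    , (1, 'ER', '2014-01-02T09:00:00')   -- 25 hours after the first visit
    , (1, 'ER', '2014-01-10T10:00:00')   -- well outside the 48-hour window
    , (2, 'ER', '2014-01-05T12:00:00');

-- Keep a visit only if the same ID has no earlier visit within the
-- preceding 48 hours.
select v.ID, v.LOCATION, v.ADMT_TS
from @Visits v
where not exists
(
    select 1
    from @Visits p
    where p.ID = v.ID
      and p.ADMT_TS < v.ADMT_TS
      and p.ADMT_TS >= dateadd(hour, -48, v.ADMT_TS)
)
order by v.ID, v.ADMT_TS;
```

For these rows the query should return the 2014-01-01 and 2014-01-10 visits for ID 1 and the single visit for ID 2. Note that a chain of visits, each within 48 hours of the one before, collapses to its first visit under this rule; if the window should instead restart from the last visit that was kept, a different approach is needed.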
I need to get an ordered hierarchy of a tree, in a specific way. The table in question looks a bit like this (all ID fields are uniqueidentifiers, I've simplified the data for sake of example):
EstimateItemID EstimateID ParentEstimateItemID ItemType
-------------- ---------- -------------------- --------
1 A NULL product
2 A 1 product
3 A 2 service
4 A NULL product
5 A 4 product
6 A 5 service
7 A 1 service
8 A 4 product
Graphical view of the tree structure (* denotes 'service'):
A
___/ \___
/ \
1 4
/ \ / \
2 7* 5 8
/ /
3* 6*
Using this query, I can get the hierarchy (just pretend 'A' is a uniqueidentifier, I know it isn't in real life):
DECLARE @EstimateID uniqueidentifier
SELECT @EstimateID = 'A'
;WITH temp as(
SELECT * FROM EstimateItem
WHERE EstimateID = @EstimateID
UNION ALL
SELECT ei.* FROM EstimateItem ei
INNER JOIN temp x ON ei.ParentEstimateItemID = x.EstimateItemID
)
SELECT * FROM temp
This gives me the children of EstimateID 'A', but in the order in which they appear in the table, i.e.:
EstimateItemID
--------------
1
2
3
4
5
6
7
8
Unfortunately, what I need is an ordered hierarchy with a result set that follows the following constraints:
1. each branch must be grouped
2. records with ItemType 'product' and NULL parent are the top node
3. records with ItemType 'product' and non-NULL parent grouped after top node
4. records with ItemType 'service' are bottom node of a branch
So, the order that I need the results, in this example, is:
EstimateItemID
--------------
1
2
3
7
4
5
8
6
What do I need to add to my query to accomplish this?
Try this:
;WITH items AS (
SELECT EstimateItemID, ItemType
, 0 AS Level
, CAST(EstimateItemID AS VARCHAR(255)) AS Path
FROM EstimateItem
WHERE ParentEstimateItemID IS NULL AND EstimateID = @EstimateID
UNION ALL
SELECT i.EstimateItemID, i.ItemType
, Level + 1
, CAST(Path + '.' + CAST(i.EstimateItemID AS VARCHAR(255)) AS VARCHAR(255))
FROM EstimateItem i
INNER JOIN items itms ON itms.EstimateItemID = i.ParentEstimateItemID
)
SELECT * FROM items ORDER BY Path
With Path, rows are sorted by their parent nodes.
If you want to sort child nodes by ItemType at each level, then you can play with Level and a SUBSTRING of the Path column.
Here is a SQLFiddle with sample data.
This is an add-on to Fabio's great idea from above. As I said in my reply to his original post, I have re-posted his idea using more common data, table names, and fields to make it easier for others to follow.
Thank you, Fabio! Great name, by the way.
First some data to work with:
CREATE TABLE tblLocations (ID INT IDENTITY(1,1), Code VARCHAR(1), ParentID INT, Name VARCHAR(20));
INSERT INTO tblLocations (Code, ParentID, Name) VALUES
('A', NULL, 'West'),
('A', 1, 'WA'),
('A', 2, 'Seattle'),
('A', NULL, 'East'),
('A', 4, 'NY'),
('A', 5, 'New York'),
('A', 1, 'NV'),
('A', 7, 'Las Vegas'),
('A', 2, 'Vancouver'),
('A', 4, 'FL'),
('A', 5, 'Buffalo'),
('A', 1, 'CA'),
('A', 10, 'Miami'),
('A', 12, 'Los Angeles'),
('A', 7, 'Reno'),
('A', 12, 'San Francisco'),
('A', 10, 'Orlando'),
('A', 12, 'Sacramento');
Now the recursive query:
-- Note: The 'Code' field isn't used, but you could add it to display more info.
;WITH MyCTE AS (
SELECT ID, Name, 0 AS TreeLevel, CAST(ID AS VARCHAR(255)) AS TreePath
FROM tblLocations T1
WHERE ParentID IS NULL
UNION ALL
SELECT T2.ID, T2.Name, TreeLevel + 1, CAST(TreePath + '.' + CAST(T2.ID AS VARCHAR(255)) AS VARCHAR(255)) AS TreePath
FROM tblLocations T2
INNER JOIN MyCTE itms ON itms.ID = T2.ParentID
)
-- Note: The 'replicate' function is not needed. Added it to give a visual of the results.
SELECT ID, Replicate('.', TreeLevel * 4)+Name 'Name', TreeLevel, TreePath
FROM MyCTE
ORDER BY TreePath;
I believe that you need to add the following to the results of your CTE...
BranchID = some kind of identifier that uniquely identifies the branch. Forgive me for not being more specific, but I'm not sure what identifies a branch for your needs. Your example shows a binary tree in which all branches flow back to the root.
ItemTypeID where (for example) 0 = Product and 1 = service.
Parent = identifies the parent.
If those exist in the output, I think you should be able to use the output from your query as either another CTE or as the FROM clause in a query. Order by BranchID, ItemTypeID, Parent.
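A rough sketch of that ordering idea, combined with the Path column from the earlier answer, might look like the following. It assumes that a branch is identified by its root item, uses integer IDs in a table variable named @EstimateItem instead of uniqueidentifiers, and uses Path rather than Parent as the final sort key; all three are simplifications for illustration.

```sql
-- Simplified copy of the question's data (int IDs, illustrative names).
declare @EstimateItem table
(
    EstimateItemID int
    , EstimateID char(1)
    , ParentEstimateItemID int
    , ItemType varchar(10)
);

insert @EstimateItem values
    (1, 'A', null, 'product')
    , (2, 'A', 1, 'product')
    , (3, 'A', 2, 'service')
    , (4, 'A', null, 'product')
    , (5, 'A', 4, 'product')
    , (6, 'A', 5, 'service')
    , (7, 'A', 1, 'service')
    , (8, 'A', 4, 'product');

;with items as
(
    -- Anchor: top-level items; each one starts its own branch.
    select EstimateItemID, ItemType
        , EstimateItemID as BranchID
        , cast(EstimateItemID as varchar(255)) as Path
    from @EstimateItem
    where ParentEstimateItemID is null and EstimateID = 'A'
    union all
    -- Recursive step: children inherit the BranchID of their root.
    select i.EstimateItemID, i.ItemType
        , itms.BranchID
        , cast(itms.Path + '.' + cast(i.EstimateItemID as varchar(255)) as varchar(255))
    from @EstimateItem i
    inner join items itms on itms.EstimateItemID = i.ParentEstimateItemID
)
select EstimateItemID, ItemType, BranchID, Path
from items
order by BranchID                                       -- group each branch
    , case ItemType when 'service' then 1 else 0 end    -- products before services
    , Path;                                             -- parents before children
```

For the sample data this should give the order 1, 2, 3, 7, 4, 5, 8, 6 requested in the question. Path is compared as text, so multi-digit or uniqueidentifier keys would need padding or a different sort key to order reliably.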
I am trying to have a running average column in the SELECT statement based on a column from the n previous rows in the same SELECT statement. The average I need is based on the n previous rows in the resultset.
Let me explain
Id Number Average
1 1 NULL
2 3 NULL
3 2 NULL
4 4 2 <----- Average of (1, 3, 2),Numbers from previous 3 rows
5 6 3 <----- Average of (3, 2, 4),Numbers from previous 3 rows
. . .
. . .
The first 3 rows of the Average column are null because there are no previous rows. The row 4 in the Average column shows the average of the Number column from the previous 3 rows.
I need some help trying to construct a SQL Select statement that will do this.
This should do it:
--Test Data
CREATE TABLE RowsToAverage
(
ID int NOT NULL,
Number int NOT NULL
)
INSERT RowsToAverage(ID, Number)
SELECT 1, 1
UNION ALL
SELECT 2, 3
UNION ALL
SELECT 3, 2
UNION ALL
SELECT 4, 4
UNION ALL
SELECT 5, 6
UNION ALL
SELECT 6, 8
UNION ALL
SELECT 7, 10
--The query
;WITH NumberedRows
AS
(
SELECT rta.*, row_number() OVER (ORDER BY rta.ID ASC) AS RowNumber
FROM RowsToAverage rta
)
SELECT nr.ID, nr.Number,
CASE
WHEN nr.RowNumber <=3 THEN NULL
ELSE ( SELECT avg(Number)
FROM NumberedRows
WHERE RowNumber < nr.RowNumber
AND RowNumber >= nr.RowNumber - 3
)
END AS MovingAverage
FROM NumberedRows nr
Assuming that the Id column is sequential, here's a simplified query for a table named "MyTable":
SELECT
b.Id,
b.Number,
(
SELECT
AVG(a.Number)
FROM
MyTable a
WHERE
a.id >= (b.Id - 3)
AND a.id < b.Id
AND b.Id > 3
) as Average
FROM
MyTable b;
Edit: I missed the point that it should average the three previous records...
For a general running average, I think something like this would work:
SELECT
id, number,
SUM(number) OVER (ORDER BY ID) /
ROW_NUMBER() OVER (ORDER BY ID) AS [RunningAverage]
FROM myTable
ORDER BY ID
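On SQL Server 2012 or later, the "previous three rows" average from the question can also be expressed with a window frame instead of a self-join or subquery. This is only a sketch; the table variable @RowsToAverage mirrors the test data from the first answer and is not part of this answer.

```sql
-- Test data matching the first answer (table variable is illustrative).
declare @RowsToAverage table (ID int not null, Number int not null);

insert @RowsToAverage values
    (1, 1), (2, 3), (3, 2), (4, 4), (5, 6), (6, 8), (7, 10);

select ID, Number
    , case
        -- The first three rows do not have three full predecessors.
        when row_number() over (order by ID) <= 3 then null
        -- Frame covers the three rows before the current one, excluding it.
        else avg(Number) over (order by ID
                               rows between 3 preceding and 1 preceding)
      end as MovingAverage
from @RowsToAverage
order by ID;

-- MovingAverage for IDs 1 to 7: NULL, NULL, NULL, 2, 3, 4, 6
```

Because Number is an int, AVG performs integer division here; cast Number to a decimal type first if fractional averages are wanted.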
A simple self join would seem to perform much better than a row-referencing subquery.
Generate 10k rows of test data:
if object_id('test10k') is not null drop table test10k
create table test10k (Id int, Number int, constraint test10k_cpk primary key clustered (id))
;WITH digits AS (
SELECT 0 as Number
UNION SELECT 1
UNION SELECT 2
UNION SELECT 3
UNION SELECT 4
UNION SELECT 5
UNION SELECT 6
UNION SELECT 7
UNION SELECT 8
UNION SELECT 9
)
,numbers as (
SELECT
(thousands.Number * 1000)
+ (hundreds.Number * 100)
+ (tens.Number * 10)
+ ones.Number AS Number
FROM digits AS ones
CROSS JOIN digits AS tens
CROSS JOIN digits AS hundreds
CROSS JOIN digits AS thousands
)
insert test10k (Id, Number)
select Number, Number
from numbers
I would pull the special case of the first 3 rows out of the main query, you can UNION ALL those back in if you really want it in the row set. Self join query:
;WITH NumberedRows
AS
(
SELECT rta.*, row_number() OVER (ORDER BY rta.ID ASC) AS RowNumber
FROM test10k rta
)
SELECT nr.ID, nr.Number,
avg(trailing.Number) as MovingAverage
FROM NumberedRows nr
join NumberedRows as trailing on trailing.RowNumber between nr.RowNumber-3 and nr.RowNumber-1
where nr.RowNumber > 3
group by nr.id, nr.Number
On my machine this takes about 10 seconds. The subquery approach that Aaron Alton demonstrated takes about 45 seconds (after I changed it to reflect my test source table):
;WITH NumberedRows
AS
(
SELECT rta.*, row_number() OVER (ORDER BY rta.ID ASC) AS RowNumber
FROM test10k rta
)
SELECT nr.ID, nr.Number,
CASE
WHEN nr.RowNumber <=3 THEN NULL
ELSE ( SELECT avg(Number)
FROM NumberedRows
WHERE RowNumber < nr.RowNumber
AND RowNumber >= nr.RowNumber - 3
)
END AS MovingAverage
FROM NumberedRows nr
If you do a SET STATISTICS PROFILE ON, you can see the self join has 10k executes on the table spool. The subquery has 10k executes on the filter, aggregate, and other steps.
Check out some solutions here. I'm sure that you could adapt one of them easily enough.
If you want this to be truly performant, and aren't afraid to dig into a seldom-used area of SQL Server, you should look into writing a custom aggregate function. SQL Server 2005 and 2008 brought CLR integration to the table, including the ability to write user-defined aggregate functions. A custom running total aggregate would be the most efficient way to calculate a running average like this, by far.
Alternatively you can denormalize and store precalculated running values. Described here:
http://sqlblog.com/blogs/alexander_kuznetsov/archive/2009/01/23/denormalizing-to-enforce-business-rules-running-totals.aspx
Performance of selects is as fast as it goes. Of course, modifications are slower.