Max Value with unique values in more than one column - sql-server

I feel like I'm missing something really obvious here.
Using T-SQL/SQL-Server:
I have unique values in more than one column but want to select the max version based on one particular column.
Dataset:
Example
ID | Name| Version | Code
------------------------
1 | Car | 3 | NULL
1 | Car | 2 | 1000
1 | Car | 1 | 2000
Target status: I want my query to only select the row with the highest version value. Running a MAX on the version column pulls all three because of the distinct values in the 'Code' column:
SELECT ID
,Name
,MAX(Version)
,Code
FROM Table
GROUP BY ID, Name, Code
The net result is that I get all three entries as per the data set due to the unique values in the Code column, but I only want the top row (Version 3).
Any help would be appreciated.

You need to identify the highest version in one query, then use an outer query that joins back to the table to pull out all the fields for that row. Like so:
SELECT t.ID, t.Name, GRP.Version, t.Code
FROM (
SELECT ID
,Name
,MAX(Version) as Version
FROM Table
GROUP BY ID, Name
) GRP
INNER JOIN Table t on GRP.ID = t.ID and GRP.Name = t.Name and GRP.Version = t.Version
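As a rough way to check the join-back idea without a SQL Server instance, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. The table name Items and the helper latest_rows are made up for illustration, and SQLite is only a stand-in for SQL Server here.

```python
import sqlite3

def latest_rows(rows):
    """Return the full row holding the highest Version per (ID, Name)."""
    conn = sqlite3.connect(":memory:")
    # "Items" is a hypothetical table name; the question calls it "Table".
    conn.execute(
        "CREATE TABLE Items (ID INTEGER, Name TEXT, Version INTEGER, Code INTEGER)"
    )
    conn.executemany("INSERT INTO Items VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT t.ID, t.Name, grp.Version, t.Code
        FROM (
            -- inner query: highest version per ID/Name, Code left out of the grouping
            SELECT ID, Name, MAX(Version) AS Version
            FROM Items
            GROUP BY ID, Name
        ) AS grp
        INNER JOIN Items AS t
            ON grp.ID = t.ID AND grp.Name = t.Name AND grp.Version = t.Version
        ORDER BY t.ID
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

sample = [(1, "Car", 3, None), (1, "Car", 2, 1000), (1, "Car", 1, 2000)]
print(latest_rows(sample))  # [(1, 'Car', 3, None)]
```

Because Code is no longer part of the GROUP BY, its distinct values cannot split the group, and the join simply fetches Code from the winning row.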

You can also use row_number() to do this kind of logic, for example like this:
select ID, Name, Version, Code
from (
select *, row_number() over (partition by ID order by Version desc) as RN
from Table1
) X where RN = 1

Add TOP 1 to force the query to return a single row, and add an ORDER BY so that the row with the highest version comes first:
SELECT top 1 ID
,Name
,MAX(Version)
,Code
FROM Table
GROUP BY ID, Name, Code
order by max(version) desc

Related

Select rows where a value is maximum, and a column is null

I have a table, products, that looks along these lines:
productID | version | done
1 | 1 | 2000-01-01
1 | 2 | NULL
2 | 1 | NULL
2 | 2 | 2000-01-01
Version is assumed to be increasing.
What I want is a query that returns a ProductID and its highest / current Version, if the Done column for that version is NULL. In plain English, I want all products where the latest version is not Done, and the corresponding version. The goal: among products, find the ones with a new version that have not been "done" / processed yet.
Note: in the example above, I would expect the query to return ProductID 1, Version 2 only. I do not want the highest not-done version of a product, I want the highest version of a product, if it is not-done. Sorry if the clarification is overkill.
I wrote a query which appears to do what I want:
SELECT productID ProductID, version Version
FROM products
WHERE done IS NULL
AND version IN (
SELECT MAX(version)
FROM products
GROUP BY productID
)
However, it also appears to not be very efficient. So my question is, is there a better way to approach this query?
We can try using ROW_NUMBER here:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY productID ORDER BY version DESC) rn
FROM products
)
SELECT productID, version
FROM cte
WHERE rn = 1 AND done IS NULL;
The CTE above assigns row number 1 to the latest record of each product, ordered by version descending. The outer query then keeps only those products whose latest record has no value in the done column.
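To see the ROW_NUMBER approach at work on the sample data, here is a small sketch using Python's built-in sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer (window functions), and the helper name unfinished_latest is hypothetical.

```python
import sqlite3

def unfinished_latest(rows):
    """Return (productID, version) where the latest version has done IS NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (productID INTEGER, version INTEGER, done TEXT)")
    conn.executemany("INSERT INTO products VALUES (?, ?, ?)", rows)
    query = """
        WITH cte AS (
            SELECT productID, version, done,
                   -- rn = 1 marks the highest version within each product
                   ROW_NUMBER() OVER (PARTITION BY productID ORDER BY version DESC) AS rn
            FROM products
        )
        SELECT productID, version
        FROM cte
        WHERE rn = 1 AND done IS NULL
        ORDER BY productID
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

sample = [
    (1, 1, "2000-01-01"),
    (1, 2, None),
    (2, 1, None),
    (2, 2, "2000-01-01"),
]
print(unfinished_latest(sample))  # [(1, 2)]
```

Product 2 is dropped because its latest version (2) is done, even though its older version 1 is not.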
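Your query is almost correct. What's missing is the correlation between the productID in your subquery and the productID in your main table: without it, a version only has to match the maximum version of any product, not of its own product.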
SELECT t.productID ProductID, t.version Version
FROM products t
WHERE t.done IS NULL
AND version IN (
SELECT MAX(p.version)
FROM products p
WHERE p.productID = t.productID
GROUP BY p.productID
)
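The sample data in the question happens to hide the difference, so here is a sketch with one extra row (a hypothetical product 3 that only has version 1) run through SQLite from Python. The helper name compare_queries is made up for illustration.

```python
import sqlite3

def compare_queries(rows):
    """Run the uncorrelated and the correlated variant and return both results."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (productID INTEGER, version INTEGER, done TEXT)")
    conn.executemany("INSERT INTO products VALUES (?, ?, ?)", rows)
    uncorrelated = conn.execute("""
        SELECT productID, version FROM products
        WHERE done IS NULL
          AND version IN (SELECT MAX(version) FROM products GROUP BY productID)
        ORDER BY productID, version
    """).fetchall()
    correlated = conn.execute("""
        SELECT t.productID, t.version FROM products t
        WHERE t.done IS NULL
          AND t.version IN (SELECT MAX(p.version) FROM products p
                            WHERE p.productID = t.productID)
        ORDER BY t.productID, t.version
    """).fetchall()
    conn.close()
    return uncorrelated, correlated

rows = [
    (1, 1, "2000-01-01"), (1, 2, None),
    (2, 1, None), (2, 2, "2000-01-01"),
    (3, 1, None),  # extra row, not in the question: product with a single version
]
uncorrelated, correlated = compare_queries(rows)
print(uncorrelated)  # [(1, 2), (2, 1), (3, 1)]
print(correlated)    # [(1, 2), (3, 1)]
```

In the uncorrelated version, product 2's old version 1 slips through because 1 is the maximum version of product 3.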
Another solution is to use a join:
select t1.* from products t1
inner join
(select max(version) as versionId, productID
from products
group by productID) t2 on t2.productID = t1.productID and t2.versionId = t1.version
where t1.done is null

SQL GROUP BY with columns which contain mirrored values

Sorry for the bad title. I couldn't think of a better way to describe my issue.
I have the following table:
Category | A | B
A | 1 | 2
A | 2 | 1
B | 3 | 4
B | 4 | 3
I would like to group the data by Category, return only 1 line per category, but provide both values of columns A and B.
So the result should look like this:
category | resultA | resultB
A | 1 | 2
B | 4 | 3
How can this be achieved?
I tried this statement:
SELECT category, a, b
FROM table
GROUP BY category
but obviously, I get the following errors:
Column 'a' is invalid in the select list because it is not contained
in either an aggregate function or the GROUP BY clause.
Column 'b' is invalid in the select list because it is not contained in either an
aggregate function or the GROUP BY clause.
How can I achieve the desired result?
Try this:
SELECT category, MIN(a) AS resultA, MAX(a) AS resultB
FROM table
GROUP BY category
If the values are mirrored then you can get both values using MIN, MAX applied on a single column like a.
It seems you don't really want to aggregate per category, but rather remove duplicate rows from your result (or rather rows that you consider duplicates).
You consider a pair (x,y) equal to the pair (y,x). To find duplicates, you can put the lower value in the first place and the greater in the second and then apply DISTINCT on the rows:
select distinct
category,
case when a < b then a else b end as attr1,
case when a < b then b else a end as attr2
from mytable;
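For a quick check of the normalise-then-DISTINCT idea, here is a minimal sketch that runs the same query against SQLite from Python. The helper name distinct_pairs is hypothetical; the table and column names follow the answer above.

```python
import sqlite3

def distinct_pairs(rows):
    """Treat (x, y) and (y, x) as the same pair and return one row per pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (category TEXT, a INTEGER, b INTEGER)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    query = """
        SELECT DISTINCT
            category,
            CASE WHEN a < b THEN a ELSE b END AS attr1,  -- smaller value first
            CASE WHEN a < b THEN b ELSE a END AS attr2   -- larger value second
        FROM mytable
        ORDER BY category
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

sample = [("A", 1, 2), ("A", 2, 1), ("B", 3, 4), ("B", 4, 3)]
print(distinct_pairs(sample))  # [('A', 1, 2), ('B', 3, 4)]
```

Note that category B comes back as (3, 4) rather than the (4, 3) shown in the question, since the pair is always emitted smaller value first.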
Assuming you want an arbitrary record from the duplicates in each category, here is one trick using a table value constructor and the Row_Number window function:
;with cte as
(
SELECT *,
(SELECT Min(min_val) FROM (VALUES (a),(b))tc(min_val)) min_val,
(SELECT Max(max_val) FROM (VALUES (a),(b))tc(max_val)) max_val
FROM (VALUES ('A',1,2),
('A',2,1),
('B',3,4),
('B',4,3)) tc(Category, A, B)
)
select Category,A,B from
(
Select Row_Number()Over(Partition by category,min_val,max_val order by (select NULL)) as Rn,*
From cte
) A
Where Rn = 1

SQL MAX Date Does Not Decipher Seconds

I have a table which contains the following data:
ID | ObjectID | ActionDate
=======================================
12345 | 422107 | 2016-10-05 11:24:23.790
12346 | 422107 | 2016-10-05 11:24:28.797
I want to return the ID and max date, but the MAX function does not seem to be calculating down to seconds value (SS). Am I missing something, or is this a limitation with the MAX function? Here is the code I am using:
SELECT
TMOA.ObjectID AS [ObjID]
, TMOA.ID AS [ObjActionID]
, MAX(TMOA.ActionDate) AS [PrepDate]
FROM
TM_Procedure AS TMPRD
left join TM_ObjectAction AS TMOA ON TMPRD.ID = TMOA.ObjectID
GROUP BY
TMOA.ObjectID
, TMPRD.ID
, TMOA.ID
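It looks like you're grouping by TMOA.ID, which is unique per row, so every row ends up in its own group and MAX has only one value to choose from. More than likely that's why you're getting a record that you don't want; MAX itself compares the full datetime value, seconds included. Just select the MAX(ActionDate) and see what you get.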
If you get the records you want, then you have to figure out which column you are selecting/grouping by that is causing the records you don't want. My guess is that it's either TMOA.ObjectID or TMOA.ID
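To illustrate the grouping effect, here is a sketch that runs both groupings against SQLite from Python. SQLite has no datetime type, so the timestamps are stored as ISO-formatted text (which sorts chronologically); that is an assumption of this sketch rather than how SQL Server stores them, and the helper name max_dates is made up.

```python
import sqlite3

def max_dates(rows):
    """Return MAX(ActionDate) grouped with and without the unique ID column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TM_ObjectAction (ID INTEGER, ObjectID INTEGER, ActionDate TEXT)")
    conn.executemany("INSERT INTO TM_ObjectAction VALUES (?, ?, ?)", rows)
    # Grouping by the unique ID: one group per row, so MAX changes nothing.
    per_id = conn.execute("""
        SELECT ObjectID, ID, MAX(ActionDate)
        FROM TM_ObjectAction
        GROUP BY ObjectID, ID
        ORDER BY ID
    """).fetchall()
    # Grouping by ObjectID only: a single group, MAX picks the later timestamp.
    per_object = conn.execute("""
        SELECT ObjectID, MAX(ActionDate)
        FROM TM_ObjectAction
        GROUP BY ObjectID
    """).fetchall()
    conn.close()
    return per_id, per_object

sample = [
    (12345, 422107, "2016-10-05 11:24:23.790"),
    (12346, 422107, "2016-10-05 11:24:28.797"),
]
per_id, per_object = max_dates(sample)
print(per_id)      # both rows come back
print(per_object)  # [(422107, '2016-10-05 11:24:28.797')]
```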
One option is to use the window function Row_Number()
Select *
From (
Select *
,RowNr=Row_Number() over (Partition By ObjectID Order by ActionDate Desc)
From YourTable
) A
Where RowNr=1

Select Different Column Value for Row with Max Value

I'm hoping for a cleaner way to do something that I know how to do one way. I want to retrieve the UserId for the MAX ID value as well as that MAX ID value. Let's say I have a table with data like this:
ID UserId Value
1 10 'Foo'
2 15 'Blah'
3 10 'Blech'
4 20 'Qwerty'
I want to retrieve:
ID UserId
4 20
I know I could do this like so:
SELECT
t.ID,
t.UserID
FROM
(
SELECT MAX(ID) as [MaxID]
FROM table
) as m
JOIN table as t ON m.MaxID = t.ID
I'm only vaguely familiar with the ROW_NUMBER(), RANK() and other similar methods and I can't help believing that this scenario could benefit from some such method to get rid of joining back to the table.
You can definitely use ROW_NUMBER for something like this:
with t1Rank as
(
select *
, t1Rank = row_number() over (order by ID desc)
from t1
)
select ID, UserID
from t1Rank
where t1Rank = 1
The advantage with this approach is you can bring Value (or other fields as required) into the result set, too. Plus you can tweak the ordering/grouping as required.
You could also just do it with a sub-query like this:
SELECT ID ,
UserID
FROM table
WHERE ID = ( SELECT MAX(ID)
FROM table
);
Or simply take the first row when ordering by ID descending:
SELECT TOP 1 ID, UserID FROM <table> ORDER BY ID DESC
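As a sanity check that the join-free variants agree, here is a sketch run against SQLite from Python. SQLite has no TOP, so LIMIT 1 is used as its equivalent; the table name t1 is borrowed from the ROW_NUMBER answer and the helper name row_with_max_id is hypothetical.

```python
import sqlite3

def row_with_max_id(rows):
    """Fetch (ID, UserId) for the highest ID in two ways and return both results."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (ID INTEGER, UserId INTEGER, Value TEXT)")
    conn.executemany("INSERT INTO t1 VALUES (?, ?, ?)", rows)
    # Scalar subquery: filter on the single maximum ID.
    by_subquery = conn.execute(
        "SELECT ID, UserId FROM t1 WHERE ID = (SELECT MAX(ID) FROM t1)"
    ).fetchall()
    # Ordering: LIMIT 1 plays the role of T-SQL's TOP 1 here.
    by_ordering = conn.execute(
        "SELECT ID, UserId FROM t1 ORDER BY ID DESC LIMIT 1"
    ).fetchall()
    conn.close()
    return by_subquery, by_ordering

sample = [(1, 10, "Foo"), (2, 15, "Blah"), (3, 10, "Blech"), (4, 20, "Qwerty")]
by_subquery, by_ordering = row_with_max_id(sample)
print(by_subquery, by_ordering)  # [(4, 20)] [(4, 20)]
```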

Getting filtered results with subquery

I have a table with something like the following:
ID Name Color
------------
1 Bob Blue
2 John Yellow
1 Bob Green
3 Sara Red
3 Sara Green
What I would like to do is return a filtered list of results whereby the following data is returned:
ID Name Color
------------
1 Bob Blue
2 John Yellow
3 Sara Red
i.e. I would like to return one row per user. (I do not mind which row is returned for a particular user; I just need [ID] to be unique.) I already have something that works but is really slow: I create a temp table containing all the IDs and then use an OUTER APPLY to select the top 1 row from the same table, i.e.
CREATE TABLE #tb
(
[ID] [int]
)
INSERT INTO #tb
select distinct [ID] from MyTable
select
T1.[ID],
V2.[Name],
V2.Color
from
#tb T1
OUTER APPLY
(
SELECT TOP 1 * FROM MyTable T2 WHERE T2.[ID] = T1.[ID]
) AS V2
DROP TABLE #tb
Can somebody suggest how I may improve it?
Thanks
Try:
WITH CTE AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) AS 'RowNo',
ID, Name, Color
FROM table
)
SELECT ID,Name,color
FROM CTE
WHERE RowNo = 1
or
select ID, Name, Color
from
(
select
ID, Name, Color,
-- rank rows within each ID; rows tied on Color would both get rank 1
rank() over (partition by ID order by Color) as Rnk
from
MyTable
)
HRRanks
where
Rnk = 1
If you're using SQL Server 2005 or higher, you could use the Ranking functions and just grab the first one in the list.
http://msdn.microsoft.com/en-us/library/ms189798.aspx
