Getting filtered results with subquery - sql-server

I have a table with something like the following:
ID Name Color
------------
1 Bob Blue
2 John Yellow
1 Bob Green
3 Sara Red
3 Sara Green
What I would like to do is return a filtered list of results whereby the following data is returned:
ID Name Color
------------
1 Bob Blue
2 John Yellow
3 Sara Red
That is, I would like to return one row per user. (I do not mind which row is returned for a particular user; I just need the [ID] to be unique.) I already have something that works but is really slow: I create a temp table holding all the IDs and then use an OUTER APPLY to select the top 1 row from the same table, i.e.
CREATE TABLE #tb
(
[ID] [int]
)
INSERT INTO #tb
select distinct [ID] from MyTable
select
T1.[ID],
V2.[Name],
V2.Color
from
#tb T1
OUTER APPLY
(
SELECT TOP 1 * FROM MyTable T2 WHERE T2.[ID] = T1.[ID]
) AS V2
DROP TABLE #tb
Can somebody suggest how I may improve it?
Thanks

Try:
WITH CTE AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) AS RowNo,
ID, Name, Color
FROM MyTable
)
SELECT ID, Name, Color
FROM CTE
WHERE RowNo = 1
or
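select
ID, Name, Color
from
(
select
ID, Name, Color,
-- row_number() instead of rank(): rank() gives tied rows the same number, so more than one row per ID could pass the filter
row_number() over (partition by ID order by Color) as RowNo
from
MyTable
)
Ranked
where
RowNo = 1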

If you're using SQL Server 2005 or higher, you could use the Ranking functions and just grab the first one in the list.
http://msdn.microsoft.com/en-us/library/ms189798.aspx
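A minimal sketch of the ROW_NUMBER() approach from the answers above, run through Python's sqlite3 module so it can be tried without a SQL Server instance. SQLite is only a stand-in here (it needs version 3.25 or later for window functions), the CTE name Numbered is made up for the example, and ordering by Color is an assumption chosen to make the picked row repeatable; the question accepts any row per ID.

```python
import sqlite3

# SQLite stands in for SQL Server; the sample rows come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER, Name TEXT, Color TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [(1, "Bob", "Blue"), (2, "John", "Yellow"), (1, "Bob", "Green"),
     (3, "Sara", "Red"), (3, "Sara", "Green")],
)

# Number the rows inside each ID group, then keep only row 1 of each group.
# ORDER BY Color is an arbitrary (assumed) tie-breaker, not part of the question.
query = """
WITH Numbered AS (
    SELECT ID, Name, Color,
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Color) AS RowNo
    FROM MyTable
)
SELECT ID, Name, Color
FROM Numbered
WHERE RowNo = 1
ORDER BY ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample data this keeps Blue for Bob and Green for Sara, because those colors sort first within their ID group. Ordering by a different column changes which row is kept, not the number of rows.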

Related

Max Value with unique values in more than one column

I feel like I'm missing something really obvious here.
Using T-SQL/SQL-Server:
I have unique values in more than one column but want to select the max version based on one particular column.
Dataset:
Example
ID | Name| Version | Code
------------------------
1 | Car | 3 | NULL
1 | Car | 2 | 1000
1 | Car | 1 | 2000
Goal: I want my query to select only the row with the highest Version value. Running MAX on the Version column returns all three rows because of the distinct values in the Code column:
SELECT ID
,Name
,MAX(Version)
,Code
FROM Table
GROUP BY ID, Name, Code
The net result is that I get all three entries as per the data set due to the unique values in the Code column, but I only want the top row (Version 3).
Any help would be appreciated.
You need to identify the row with the highest version in one query, then use an outer query to pull in all the fields for that row. Like so:
SELECT t.ID, t.Name, GRP.Version, t.Code
FROM (
SELECT ID
,Name
,MAX(Version) as Version
FROM Table
GROUP BY ID, Name
) GRP
INNER JOIN Table t on GRP.ID = t.ID and GRP.Name = t.Name and GRP.Version = t.Version
You can also use row_number() for this kind of logic, for example like this (with more than one ID in the table, add partition by ID inside over() to get the top row per ID):
select ID, Name, Version, Code
from (
select *, row_number() over (order by Version desc) as RN
from Table1
) X where RN = 1
Example in SQL Fiddle
Add TOP 1 to force the query to return a single row, and add an ORDER BY clause so that the row with the highest version comes first:
SELECT top 1 ID
,Name
,MAX(Version)
,Code
FROM Table
GROUP BY ID, Name, Code
order by max(version) desc
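To see why grouping by Code defeats MAX(Version), here is a small sketch that runs both the question's query and the grouped-subquery join against the sample data. It uses Python's sqlite3 module as a stand-in for SQL Server, and the table name Versions is hypothetical (the question just calls it Table).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "Versions" is a made-up table name for this sketch.
conn.execute(
    "CREATE TABLE Versions (ID INTEGER, Name TEXT, Version INTEGER, Code INTEGER)"
)
conn.executemany(
    "INSERT INTO Versions VALUES (?, ?, ?, ?)",
    [(1, "Car", 3, None), (1, "Car", 2, 1000), (1, "Car", 1, 2000)],
)

# Grouping by Code: every distinct Code value forms its own group,
# so MAX(Version) is computed once per row here.
per_code = conn.execute(
    "SELECT ID, Name, MAX(Version), Code FROM Versions GROUP BY ID, Name, Code"
).fetchall()

# Group without Code, then join back to pick up Code for the winning row.
latest = conn.execute("""
    SELECT t.ID, t.Name, t.Version, t.Code
    FROM (SELECT ID, Name, MAX(Version) AS Version
          FROM Versions
          GROUP BY ID, Name) AS grp
    JOIN Versions AS t
      ON t.ID = grp.ID AND t.Name = grp.Name AND t.Version = grp.Version
""").fetchall()

print(len(per_code))
print(latest)
```

The first query returns one row per Code value, while the second returns only the Version 3 row, with its Code still NULL.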

Returning earliest ID from table including other rows

Given this data:
itemID note color updateddate description
AA123 not unique blue 2014-01-01 duplicate 1
AB789 unique green 2013-11-20 unique 1
AA123 not unique pink 2012-01-01 duplicate 2
CC123 unique blue 2014-12-11 unique 2
CA123 unique red 2014-08-06 unique 3
CB333 unique red 2014-03-03 unique 4
CX123 unique brown 2014-09-01 unique 5
XX111 not unique red 2014-07-07 duplicate 3
XX111 not unique yellow 2014-06-06 duplicate 4
XX111 not unique purple 2014-05-05 duplicate 5
How can I select from it, returning all rows in full, but where there are duplicate IDs return only the earliest one by its updateddate? In MySQL I understand this is fairly easy to do, but in MSSQL I can't fathom it.
You can achieve this with a self join and a CTE:
WITH MinDates_CTE (itemID, MinDate) AS
(
SELECT itemID,
MIN(updateddate) AS MinDate
FROM MyTable
GROUP BY itemID
)
SELECT MyTable.*
FROM MyTable
JOIN MinDates_CTE
ON MinDates_CTE.itemID = MyTable.itemID
AND MinDates_CTE.MinDate = MyTable.updateddate
Alternatively you can use a window function:
SELECT itemID, note, color, updateddate, description
FROM (SELECT t.itemID,
t.note,
t.color,
t.updateddate,
MIN(t.updateddate) OVER (PARTITION BY t.itemID) as MinDate,
t.description
FROM MyTable t) u
where updateddate = MinDate
It can also be done by combining ROW_NUMBER with PARTITION BY. Assuming your table is named Test, the following SELECT statement returns the desired output:
SELECT *
FROM
(SELECT itemID,
Note,
Color,
UpdatedDate,
Description,
RowOrder = ROW_NUMBER() OVER(PARTITION BY itemID
ORDER BY updateddate ASC)
FROM dbo.Test
) AS TempTab
WHERE RowOrder = 1;
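The approaches above differ when two rows of the same itemID share the earliest updateddate. The sketch below illustrates that with Python's sqlite3 module standing in for SQL Server (SQLite 3.25+ is needed for ROW_NUMBER). The tied XX111 rows are a hypothetical addition, not part of the question's data, and only three of the columns are modelled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (itemID TEXT, color TEXT, updateddate TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [("AA123", "blue", "2014-01-01"),
     ("AA123", "pink", "2012-01-01"),
     ("AB789", "green", "2013-11-20"),
     # Hypothetical tie: two XX111 rows share the earliest date.
     ("XX111", "red", "2014-05-05"),
     ("XX111", "purple", "2014-05-05")],
)

# MIN() per group joined back: every row matching the earliest date survives.
min_join = conn.execute("""
    SELECT t.itemID, t.color
    FROM MyTable AS t
    JOIN (SELECT itemID, MIN(updateddate) AS MinDate
          FROM MyTable
          GROUP BY itemID) AS m
      ON m.itemID = t.itemID AND m.MinDate = t.updateddate
    ORDER BY t.itemID, t.color
""").fetchall()

# ROW_NUMBER(): exactly one row per itemID; color is an assumed tie-breaker.
numbered_rows = conn.execute("""
    SELECT itemID, color
    FROM (SELECT itemID, color,
                 ROW_NUMBER() OVER (PARTITION BY itemID
                                    ORDER BY updateddate, color) AS rn
          FROM MyTable) AS numbered
    WHERE rn = 1
    ORDER BY itemID
""").fetchall()

print(min_join)
print(numbered_rows)
```

The join on MIN(updateddate) returns both tied XX111 rows, whereas ROW_NUMBER() returns one row per itemID. Which behaviour is wanted depends on whether duplicates on the earliest date should be kept.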

need to update first non null field_x in non-normalised table

I have the following table that I have to work with.
SQL Fiddle
Basically, it is a product table that stores up to 10 barcodes per product code (simplified example). At any time, any number of those 10 barcode fields might have a value.
I have another table that has a list of product codes and barcodes, and need to add these to the product barcode table.
I need to perform an update so that each barcode in barcodes_to_import is added to the product_barcodes table, in the first barcode column that is still null (empty).
table product_barcodes
product_Code barcode_1 barcode_2 barcode_3 barcode_4 barcode_5
ABC 1 2 3
BCD 4
table barcodes_to_import
product_code barcode
ABC 7
BCD 8
Expected output:
product_Code barcode_1 barcode_2 barcode_3 barcode_4 barcode_5
ABC 1 2 3 7
BCD 4 8
create table product_barcodes(product_Code varchar(10),barcode_1 int,barcode_2 int,barcode_3 int
,barcode_4 int,barcode_5 int,barcode_6 int,barcode_7 int,barcode_8 int,barcode_9 int,barcode_10 int)
create table barcodes_to_import(product_code varchar(10),barcode int)
--Inserted Sample values as below
SELECT * FROM product_barcodes
SELECT * FROM barcodes_to_import
--Output Query
;with cte
as
(
select product_code,data,col_name
from product_barcodes
unpivot
(
data for col_name in (
barcode_1,barcode_2,barcode_3,barcode_4,barcode_5
,barcode_6,barcode_7,barcode_8,barcode_9,barcode_10
)
) upvt
)
,cte1
as
(
select *,ROW_NUMBER() OVER(PARTITION BY product_code ORDER BY col_name) as rn
from
(
select product_code, data,col_name from cte
union all
select product_code,barcode,'barcode_z' as col_name from barcodes_to_import
) t
)
select
product_Code
,SUM([1]) as barcode_1
,SUM([2]) as barcode_2
,SUM([3]) as barcode_3
,SUM([4]) as barcode_4
,SUM([5]) as barcode_5
,SUM([6]) as barcode_6
,SUM([7]) as barcode_7
,SUM([8]) as barcode_8
,SUM([9]) as barcode_9
,SUM([10]) as barcode_10
from cte1
PIVOT
(
AVG(data) for rn in ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10])
) pvt
) pvt
GROUP BY product_Code
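The query above produces the desired layout as a SELECT. If an in-place UPDATE is wanted instead, one possible sketch is a CASE expression per column that fires only when that column is the first NULL one. This is a different technique from the UNPIVOT/PIVOT answer, shown here with Python's sqlite3 module standing in for SQL Server. It assumes at most one row per product in barcodes_to_import, models only three barcode columns to stay short, and uses adjusted sample values for that reason.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Only three barcode columns are modelled; the real table has ten.
conn.execute("""CREATE TABLE product_barcodes (
    product_code TEXT, barcode_1 INTEGER, barcode_2 INTEGER, barcode_3 INTEGER)""")
conn.execute("CREATE TABLE barcodes_to_import (product_code TEXT, barcode INTEGER)")
conn.executemany("INSERT INTO product_barcodes VALUES (?, ?, ?, ?)",
                 [("ABC", 1, 2, None), ("BCD", 4, None, None)])
conn.executemany("INSERT INTO barcodes_to_import VALUES (?, ?)",
                 [("ABC", 7), ("BCD", 8)])

# Correlated lookup of the barcode to import (assumes one row per product).
imported = ("(SELECT i.barcode FROM barcodes_to_import AS i "
            "WHERE i.product_code = product_barcodes.product_code)")

# All right-hand sides are evaluated against the row as it was before the
# UPDATE, so at most one column (the first NULL one) receives the new value.
conn.execute(f"""
    UPDATE product_barcodes
    SET barcode_1 = CASE WHEN barcode_1 IS NULL
                         THEN {imported} ELSE barcode_1 END,
        barcode_2 = CASE WHEN barcode_1 IS NOT NULL AND barcode_2 IS NULL
                         THEN {imported} ELSE barcode_2 END,
        barcode_3 = CASE WHEN barcode_1 IS NOT NULL AND barcode_2 IS NOT NULL
                              AND barcode_3 IS NULL
                         THEN {imported} ELSE barcode_3 END
    WHERE product_code IN (SELECT product_code FROM barcodes_to_import)
""")

rows = conn.execute(
    "SELECT * FROM product_barcodes ORDER BY product_code").fetchall()
print(rows)
```

A row whose barcode columns are all filled is left unchanged, so the imported barcode is silently dropped in that case. Extending the sketch to ten columns means repeating the pattern, with each CASE checking that all earlier columns are NOT NULL.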

T-SQL order by, based on other column value

I'm stuck with a query which should be pretty simple but, for reasons unknown, my brain is not playing ball here ...
Table:
id(int) | strategy (varchar) | value (whatever)
1 "ABC" whatevs
2 "ABC" yeah
3 "DEF" hello
4 "DEF" kitty
5 "QQQ" hurrr
The query should select ALL rows grouped on strategy but only one row per strategy - the one with the highest id.
In the case above, it should return rows with id 2, 4 and 5
SELECT id, strategy , value
FROM (
SELECT id, strategy , value
,ROW_NUMBER() OVER (PARTITION BY strategy ORDER BY ID DESC) rn
FROM Table_Name
) Sub
WHERE rn = 1
Working SQL FIDDLE
You can use a window function to get the result you want. Fiddle here
with cte as
(
select
rank()over(partition by strategy order by id desc) as rnk,
id, strategy, value from myT
)
select id, strategy, value from
cte where rnk = 1;
Try this:
SELECT T2.id,T1.strategy,T1.value
FROM TableName T1
INNER JOIN
(SELECT MAX(id) as id,strategy
FROM TableName
GROUP BY strategy) T2
ON T1.id=T2.id
Result:
ID STRATEGY VALUE
2 ABC yeah
4 DEF kitty
5 QQQ hurrr
See result in SQL Fiddle.
SELECT id, strategy , value
FROM (
SELECT id, strategy , value
,MAX(id) OVER (PARTITION BY strategy) MaxId
FROM YourTable
) Sub
WHERE id=MaxId
You may try this one as well:
SELECT id, strategy, value FROM TableName WHERE id IN (
SELECT MAX(id) FROM TableName GROUP BY strategy
)
Which is faster depends on your data: this version may return results sooner because it does no sorting, but on the other hand it uses IN, which can slow things down if there are many strategies.

Select Different Column Value for Row with Max Value

I'm hoping for a cleaner way to do something that I know how to do one way. I want to retrieve the UserId for the MAX ID value as well as that MAX ID value. Let's say I have a table with data like this:
ID UserId Value
1 10 'Foo'
2 15 'Blah'
3 10 'Blech'
4 20 'Qwerty'
I want to retrieve:
ID UserId
4 20
I know I could do this like so:
SELECT
t.ID,
t.UserID
FROM
(
SELECT MAX(ID) as [MaxID]
FROM table
) as m
JOIN table as t ON m.MaxID = t.ID
I'm only vaguely familiar with the ROW_NUMBER(), RANK() and other similar methods and I can't help believing that this scenario could benefit from some such method to get rid of joining back to the table.
You can definitely use ROW_NUMBER for something like this:
with t1Rank as
(
select *
, t1Rank = row_number() over (order by ID desc)
from t1
)
select ID, UserID
from t1Rank
where t1Rank = 1
SQL Fiddle with demo.
The advantage with this approach is you can bring Value (or other fields as required) into the result set, too. Plus you can tweak the ordering/grouping as required.
You could also just do it with a sub-query like this:
SELECT ID ,
UserID
FROM table
WHERE ID = ( SELECT MAX(ID)
FROM table
);
SELECT TOP 1 ID, UserID FROM <table> ORDER BY ID DESC

Resources