Return column names based on which holds the maximum value in the record - sql-server

I have a table with the following structure ...
+--------+------+------+------+------+------+
| ID | colA | colB | colC | colD | colE | [...] etc.
+--------+------+------+------+------+------+
| 100100 | 15 | 100 | 90 | 80 | 10 |
+--------+------+------+------+------+------+
| 100200 | 10 | 80 | 90 | 100 | 10 |
+--------+------+------+------+------+------+
| 100300 | 100 | 90 | 10 | 10 | 80 |
+--------+------+------+------+------+------+
I need to return a concatenated value of column names which hold the maximum 3 values per row ...
+--------+----------------------------------+
| ID | maxCols |
+--------+----------------------------------+
| 100100 | colB,colC,colD |
+--------+----------------------------------+
| 100200 | colD,colC,colB |
+--------+----------------------------------+
| 100300 | colA,colB,colE |
+--------+----------------------------------+
It's okay to not concatenate the column names, and have maxCol1 | maxCol2 | maxCol3 if that's simpler
The order of the columns is important when concatenating them
The number of columns is limited and not dynamic
The number of rows is many

You could use UNPIVOT and get TOP 3 for each ID
;WITH temp AS
(
    SELECT ID, ColValue, ColName
    FROM #SampleData sd
    UNPIVOT
    (
        ColValue FOR ColName IN ([colA], [colB], [colC], [colD], [colE])
    ) unp
)
SELECT sd.ID, ca.ColMax
FROM #SampleData sd
CROSS APPLY
(
    SELECT STUFF(
        (
            SELECT TOP 3 WITH TIES
                ',' + t.ColName
            FROM temp t
            WHERE t.ID = sd.ID
            ORDER BY t.ColValue DESC
            FOR XML PATH('')
        )
        ,1,1,'') AS ColMax
) ca
See demo here: http://rextester.com/CZCPU51785

Here is one trick to do it using Cross Apply and Table Valued Constructor
SELECT Id,
maxCols= Stuff(cs.maxCols, 1, 1, '')
FROM Yourtable
CROSS apply(SELECT(SELECT TOP 3 ',' + NAME
FROM (VALUES (colA,'colA'),(colB,'colB'),(colC,'colC'),
(colD,'colD'),(colE,'colE')) tc (val, NAME)
ORDER BY val DESC
FOR xml path, type).value('.[1]', 'nvarchar(max)')) cs (maxCols)
If needed it can be made dynamic using Information_schema.Columns
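To check the expected output without a SQL Server instance, the same idea can be sketched with Python's built-in sqlite3 module. This is only an illustration, not either answer's method: SQLite has no UNPIVOT or CROSS APPLY, so the unpivot is written as a UNION ALL, the table is named SampleData (no # prefix), and the concatenation is done in Python. Ties are broken by column name here, which is an assumption the question does not specify.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SampleData "
    "(ID INTEGER, colA INTEGER, colB INTEGER, colC INTEGER, colD INTEGER, colE INTEGER)"
)
conn.executemany(
    "INSERT INTO SampleData VALUES (?, ?, ?, ?, ?, ?)",
    [
        (100100, 15, 100, 90, 80, 10),
        (100200, 10, 80, 90, 100, 10),
        (100300, 100, 90, 10, 10, 80),
    ],
)

# Manual unpivot: one (ID, ColName, ColValue) row per source column.
unpivot_sql = """
SELECT ID, 'colA' AS ColName, colA AS ColValue FROM SampleData
UNION ALL SELECT ID, 'colB', colB FROM SampleData
UNION ALL SELECT ID, 'colC', colC FROM SampleData
UNION ALL SELECT ID, 'colD', colD FROM SampleData
UNION ALL SELECT ID, 'colE', colE FROM SampleData
ORDER BY ID, ColValue DESC, ColName
"""

# Rows arrive grouped by ID with the largest values first, so the first
# three names of each group are the top three columns (ties broken by name).
max_cols = {
    row_id: ",".join(name for _, name, _ in list(group)[:3])
    for row_id, group in groupby(conn.execute(unpivot_sql), key=lambda r: r[0])
}

for row_id, cols in max_cols.items():
    print(row_id, cols)
```

Unlike TOP 3 WITH TIES, this sketch always returns exactly three names per row, even when the third and fourth values are equal.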

Related

How to check if SQL records are in a specific order

I'm having trouble figuring out how I can check if records on a table are in a specific order. The simplified table design is essentially this:
+------------+----------------+--------+
| ID (GUID) | StartDate | NumCol |
+------------+----------------+--------+
| CEE8C17... | 8/17/2019 3:11 | 22 |
| BB22001... | 8/17/2019 3:33 | 21 |
| 4D40B12... | 8/17/2019 3:47 | 21 |
| 3655125... | 8/17/2019 4:06 | 20 |
| 3456CD1... | 8/17/2019 4:22 | 20 |
| 38BAF92... | 8/17/2019 4:40 | 19 |
| E60CBE8... | 8/17/2019 5:09 | 19 |
| 5F2756B... | 8/17/2019 5:24 | 18 |
+------------+----------------+--------+
The ID column is a non-sequential GUID. The table is sorted by default on the StartDate when data is entered. However I am trying to flag instances where the NumCol values are out of descending order. The NumCol values can be identical on adjacent records, but ultimately they must be descending.
+--------+
| NumCol |
+--------+
| 22 |
| *20 | <- I want the ID where this occurs
| 21 |
| 20 |
| 20 |
| 19 |
| 19 |
| 18 |
+--------+
I've tried LEFT JOIN this table to itself, but can't seem to come up with an ON clause that gives the right results:
ON a.ID <> b.ID AND a.NumCol > b.NumCol
I also thought I could use OFFSET n ROWS to compare the default sorted table against one with an ORDER BY NumCol performed on it. I can't come up with anything that works.
I need a solution that will work for both SQL Server and SQL Compact.
With EXISTS:
select t.* from tablename t
where exists (
select 1 from tablename
where numcol > t.numcol and startdate > t.startdate
)
Or with row_number() window function:
select t.id, t.startdate, t.numcol
from (
select *,
row_number() over (order by startdate desc) rn1,
row_number() over (order by numcol) rn2
from tablename
) t
where rn1 > rn2
This might be easiest:
select * from T t1
where NumCol < (select max(NumCol) from T t2 where t2.StartDate > t1.StartDate);
The exists version is probably better to optimize though.
Using analytic functions you could try this approach which finds breaks in the monotonicity of consecutive rows. It might not return all the rows you're interested in seeing:
with data as (
select *, lag(NumCol) over (order by StartDate desc) as prevNumCol
from T
)
select * from data where prevNumCol > NumCol;
Here's a better solution that's probably not available in both of your environments:
with data as (
select *,
max(NumCol) over (
order by StartDate desc
rows between unbounded preceding and current row
) as prevMax
from T
)
select * from data where prevMax > NumCol;
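The EXISTS query and the correlated MAX query above use no window functions, so they can be tried out in almost any engine. The sketch below runs both against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for SQL Server and SQL Compact here, the row IDs (row1 to row8) are hypothetical replacements for the truncated GUIDs, and the dates are stored as ISO strings so that string comparison matches chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (ID TEXT, StartDate TEXT, NumCol INTEGER)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [
        ("row1", "2019-08-17 03:11", 22),
        ("row2", "2019-08-17 03:33", 20),  # out of order: a later row holds 21
        ("row3", "2019-08-17 03:47", 21),
        ("row4", "2019-08-17 04:06", 20),
        ("row5", "2019-08-17 04:22", 20),
        ("row6", "2019-08-17 04:40", 19),
        ("row7", "2019-08-17 05:09", 19),
        ("row8", "2019-08-17 05:24", 18),
    ],
)

# Flag a row when some later row has a larger NumCol.
exists_sql = """
SELECT cur.ID FROM T cur
WHERE EXISTS (
    SELECT 1 FROM T later
    WHERE later.NumCol > cur.NumCol AND later.StartDate > cur.StartDate
)
"""

# Same condition, expressed through the maximum of all later rows.
max_sql = """
SELECT cur.ID FROM T cur
WHERE cur.NumCol < (
    SELECT MAX(later.NumCol) FROM T later
    WHERE later.StartDate > cur.StartDate
)
"""

flagged_by_exists = [r[0] for r in conn.execute(exists_sql)]
flagged_by_max = [r[0] for r in conn.execute(max_sql)]
print(flagged_by_exists, flagged_by_max)
```

Both queries flag only the row whose value 20 appears before the 21. For the last row the MAX subquery returns NULL, so the comparison is not true and the row is not flagged.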

Find records of nearest date SQL

I have a table dbo.X with a DateTime column lastUpdated and a product code column CodeProd. It may have hundreds of records, with CodeProd duplicated because the table is used as a "stock history".
My stored procedure has a parameter @Date. I want to get, for each CodeProd, the record nearest to that date. For example, if I have:
+----------+--------------+--------+
| CODEPROD | lastUpdated | STATUS |
+----------+--------------+--------+
| 10 | 2-1-2019 | C1 |
| 10 | 1-1-2019 | C2 |
| 10 | 31-12-2019 | C1 |
| 11 | 31-12-2018 | C1 |
| 11 | 30-12-2018 | C1 |
| 12 | 30-8-2018 | C3 |
+----------+--------------+--------+
and @Date = '1-1-2019'
I want to get:
+----+--------------+------+
| 10 | 1-1-2019 | C2 |
| 11 | 31-12-2018 | C1 |
| 12 | 30-8-2018 | C3 |
+----+--------------+------+
How to find it?
You can use TOP(1) WITH TIES to get, for each CODEPROD, the one row with the nearest date that is not later than the provided date.
Try the following code.
SELECT TOP(1) WITH TIES *
FROM [YourTableName]
WHERE lastupdated <= @date
ORDER BY Row_number()
OVER (
partition BY [CODEPROD]
ORDER BY lastupdated DESC);
You can use APPLY:
select distinct t.CODEPROD, t1.lastUpdated, t1.STATUS
from dbo.X t cross apply
( select top (1) t1.*
from dbo.X t1
where t1.CODEPROD = t.CODEPROD and t1.lastUpdated <= @date
order by t1.lastUpdated desc
) t1;
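For a quick check of the expected rows, the "latest record on or before the date" logic can also be written as a correlated MAX subquery, which runs in engines that lack TOP ... WITH TIES or APPLY. The sketch below uses Python's sqlite3 module as a stand-in for SQL Server. It is an alternative formulation, not either answer's query, and it assumes the dates are stored as ISO strings (yyyy-mm-dd) rather than the d-m-yyyy display format of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE X (CodeProd INTEGER, lastUpdated TEXT, Status TEXT)")
conn.executemany(
    "INSERT INTO X VALUES (?, ?, ?)",
    [
        (10, "2019-01-02", "C1"),
        (10, "2019-01-01", "C2"),
        (10, "2019-12-31", "C1"),
        (11, "2018-12-31", "C1"),
        (11, "2018-12-30", "C1"),
        (12, "2018-08-30", "C3"),
    ],
)

# Keep a row only if its date is the latest one for its CodeProd
# that is not after the supplied date.
nearest_sql = """
SELECT a.CodeProd, a.lastUpdated, a.Status
FROM X a
WHERE a.lastUpdated = (
    SELECT MAX(b.lastUpdated) FROM X b
    WHERE b.CodeProd = a.CodeProd AND b.lastUpdated <= :date
)
ORDER BY a.CodeProd
"""

rows = conn.execute(nearest_sql, {"date": "2019-01-01"}).fetchall()
for row in rows:
    print(row)
```

If two records of one CodeProd share the same lastUpdated value, this form returns both of them, much like TOP(1) WITH TIES would.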

Selecting rows with minimal values [col1] of maximal values [col2] by id sql server

Sample data on
MAIN_TABLE:
+-----+--------+-------+
| ID | HEIGHT | STOCK |
+-----+--------+-------+
| ID1 | 180 | 680 |
| ID1 | 170 | 680 |
| ID1 | 130 | 360 |
| ID2 | 250 | 420 |
| ID2 | 190 | 420 |
| ID2 | 70 | 120 |
| ... | ... | ... |
+-----+--------+-------+
I need to select distinct ID rows that have max STOCK with the min HEIGHT.
The desired result would be:
+-----+--------+-------+
| ID | HEIGHT | STOCK |
+-----+--------+-------+
| ID1 | 170 | 680 |
| ID2 | 190 | 420 |
| ... | ... | ... |
+-----+--------+-------+
The query I'm using to achieve it:
WITH MAX_STOCK (ID, maxstock) as
(
select ID, max(STOCK) as maxstock
from MAIN_TABLE
group by ID
),
TABLE_STOCK (ID, HEIGHT, STOCK) AS
(
select a.ID, a.HEIGHT, a.STOCK
from MAIN_TABLE a join MAX_STOCK b
on a.ID= b.ID and a.STOCK = b.maxstock
),
MIN_HEIGHT (ID, minheight) as
(
select ID, min(HEIGHT) as minheight
from TABLE_STOCK
group by ID
),
TABLE_HEIGHT (ID, HEIGHT, STOCK) AS
(
select a.ID, a.HEIGHT, a.STOCK
from TABLE_STOCK a join MIN_HEIGHT b
on a.ID= b.ID and a.HEIGHT = b.minheight
)
If I select from any of MAX_STOCK, TABLE_STOCK, or MIN_HEIGHT,
I get results in 1-2 seconds.
But selecting from TABLE_HEIGHT, which would be my desired result,
runs for 6+ minutes with no answer on data with 600 rows.
How should I write this query to get the result in a reasonable time?
I think you can use window functions to achieve this. Try the following example.
create table #Main(
Id varchar(50),
HEIGHT int,
STOCK int
)
insert into #Main values('ID1',180,680),('ID1',170,680),('ID1',130,360 ),('ID2',250,420),('ID2',190,420),('ID2',70,120)
select * from (
select *,dense_rank() over(partition by Id order by STOCK desc,HEIGHT asc) as sira
from #Main
) k
where sira=1
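Where window functions are not available, the four CTEs of the question can probably be collapsed into one join plus one GROUP BY. The sketch below is an alternative under that assumption, not the answer's dense_rank approach. It runs the SQL through Python's sqlite3 module as a stand-in for SQL Server, using the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MAIN_TABLE (ID TEXT, HEIGHT INTEGER, STOCK INTEGER)")
conn.executemany(
    "INSERT INTO MAIN_TABLE VALUES (?, ?, ?)",
    [
        ("ID1", 180, 680), ("ID1", 170, 680), ("ID1", 130, 360),
        ("ID2", 250, 420), ("ID2", 190, 420), ("ID2", 70, 120),
    ],
)

# Step 1 (derived table s): the maximum STOCK per ID.
# Step 2 (outer query): among rows holding that STOCK, the minimum HEIGHT.
query = """
SELECT m.ID, MIN(m.HEIGHT) AS HEIGHT, m.STOCK
FROM MAIN_TABLE m
JOIN (
    SELECT ID, MAX(STOCK) AS maxstock
    FROM MAIN_TABLE
    GROUP BY ID
) s ON s.ID = m.ID AND s.maxstock = m.STOCK
GROUP BY m.ID, m.STOCK
ORDER BY m.ID
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Because the outer GROUP BY aggregates the height, this form returns exactly one row per ID even when several rows share the same maximum stock and minimum height.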

mssql - retrieve unique values of a column based on another column

I have a table with two columns: ColumnA, ColumnB, with rows:
| A | 1 |
| B | 1 |
| B | 2 |
| C | 1 |
| C | 1 |
| C | 1 |
| A | 2 |
| B | 1 |
| A | 2 |
| A | 1 |
I would like to write a query that would return all unique values for ColumnB, for each unique value of ColumnA, where ColumnA has more than 1 value in ColumnB i.e.
| A | 1 |
| A | 2 |
| B | 1 |
| B | 2 |
C 1 should be omitted because there is only one distinct ColumnB value for ColumnA = 'C'.
There might be a simpler approach but this works:
SELECT t.ColumnA, t2.ColumnB
FROM ( select ColumnA
from dbo.TableName t
group by t.ColumnA
having count(distinct t.ColumnB) > 1) t
CROSS APPLY ( select distinct t2.ColumnB
from dbo.TableName t2
where t.ColumnA=t2.ColumnA ) t2
The first subquery returns all unique ColumnA values that have multiple (different) ColumnB values. The 2nd subquery returns all distinct ColumnB values of those ColumnA-values with CROSS APPLY.
SELECT DISTINCT * FROM x WHERE ColumnA IN(
SELECT xd.ColumnA
FROM (
SELECT DISTINCT ColumnA, ColumnB FROM x
) xd
GROUP BY xd.ColumnA HAVING COUNT(*) > 1
)
SELECT y.ColumnA, y.ColumnB
FROM (
SELECT ColumnA, ColumnB, COUNT(*) OVER (PARTITION BY ColumnA) m
FROM x
GROUP BY ColumnA, ColumnB
) y
WHERE m > 1
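The second query above (SELECT DISTINCT with an IN subquery) uses only standard SQL, so it can be checked against the sample rows in any engine. A minimal sketch with Python's sqlite3 module follows. SQLite stands in for SQL Server, the table name x is taken from that answer, and the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (ColumnA TEXT, ColumnB INTEGER)")
conn.executemany(
    "INSERT INTO x VALUES (?, ?)",
    [
        ("A", 1), ("B", 1), ("B", 2), ("C", 1), ("C", 1),
        ("C", 1), ("A", 2), ("B", 1), ("A", 2), ("A", 1),
    ],
)

# Inner query: ColumnA values that have more than one distinct ColumnB.
# Outer query: the distinct (ColumnA, ColumnB) pairs for those values.
query = """
SELECT DISTINCT ColumnA, ColumnB
FROM x
WHERE ColumnA IN (
    SELECT xd.ColumnA
    FROM (SELECT DISTINCT ColumnA, ColumnB FROM x) xd
    GROUP BY xd.ColumnA
    HAVING COUNT(*) > 1
)
ORDER BY ColumnA, ColumnB
"""

pairs = conn.execute(query).fetchall()
print(pairs)
```

The three identical C rows collapse to one distinct pair in the inner query, so C fails the HAVING test and is omitted.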

Update All other Records Based on a single record

I have a table with a million records. I need to update some columns that are null, based on the existing 'not null' record for the same ID. I've tried one query and it seems to work, but I'm not confident that it will update all 1 million records exactly the way I need. Below is some sample data showing what my table looks like. Any help will be appreciated.
SELECT * INTO #TEST FROM (
SELECT 1 AS EMP_ID,10 AS DEPT_ID,15 AS ITEM_NBR ,NULL AS AMOUNT,NULL AS ITEM_NME
UNION ALL
SELECT 1,20,16,500,'ABCD'
UNION ALL
SELECT 1,30,17,NULL,NULL
UNION ALL
SELECT 2,10,15,1000,'XYZ'
UNION ALL
SELECT 2,30,16,NULL,NULL
UNION ALL
SELECT 2,40,17,NULL,NULL
) AS A
Sample data:
+--------+---------+----------+--------+----------+
| EMP_ID | DEPT_ID | ITEM_NBR | AMOUNT | ITEM_NME |
+--------+---------+----------+--------+----------+
| 1 | 10 | 15 | NULL | NULL |
| 1 | 20 | 16 | 500 | ABCD |
| 1 | 30 | 17 | NULL | NULL |
| 2 | 10 | 15 | 1000 | XYZ |
| 2 | 30 | 16 | NULL | NULL |
| 2 | 40 | 17 | NULL | NULL |
+--------+---------+----------+--------+----------+
Expected result:
+--------+---------+----------+--------+----------+
| EMP_ID | DEPT_ID | ITEM_NBR | AMOUNT | ITEM_NME |
+--------+---------+----------+--------+----------+
| 1 | 10 | 15 | 500 | ABCD |
| 1 | 20 | 16 | 500 | ABCD |
| 1 | 30 | 17 | 500 | ABCD |
| 2 | 10 | 15 | 1000 | XYZ |
| 2 | 30 | 16 | 1000 | XYZ |
| 2 | 40 | 17 | 1000 | XYZ |
+--------+---------+----------+--------+----------+
I tried this but I'm unable to conclude whether it is updating all the 1 million records properly.
SELECT * FROM #TEST T
inner JOIN #TEST T1 ON T1.EMP_ID=T.EMP_ID
WHERE T1.AMOUNT IS NOT NULL
UPDATE T SET AMOUNT=T1.AMOUNT
FROM #TEST T
inner JOIN #TEST T1 ON T1.EMP_ID=T.EMP_ID
WHERE T1.AMOUNT IS not NULL
I have used UPDATE with an inner join:
UPDATE T
SET T.AMOUNT = X.AMT,T.ITEM_NME=X.I_N
FROM #TEST T
JOIN
(SELECT EMP_ID,MAX(AMOUNT) AS AMT,MAX(ITEM_NME) AS I_N
FROM #TEST
GROUP BY EMP_ID) X ON X.EMP_ID = T.EMP_ID
SELECT * into #Test1
FROM #TEST
WHERE AMOUNT IS NOT NULL
To validate the records, run this query first:
SELECT T.AMOUNT, T1.AMOUNT, T.EMP_ID, T1.EMP_ID
FROM #TEST T
inner JOIN #TEST1 T1 ON T1.EMP_ID=T.EMP_ID
WHERE T.AMOUNT IS NULL
BEGIN TRAN
UPDATE T
SET T.AMOUNT=T1.AMOUNT, T.ITEM_NME = T1.ITEM_NME
FROM #TEST T
inner JOIN #TEST1 T1 ON T1.EMP_ID=T.EMP_ID
WHERE T.AMOUNT IS NULL
ROLLBACK
SELECT EMP_ID, MAX(AMOUNT) as AMOUNT, MAX(ITEM_NME) as ITEM_NME
INTO #t
FROM #TEST
GROUP BY EMP_ID
UPDATE t SET t.AMOUNT = t1.AMOUNT, t.ITEM_NME = t1.ITEM_NME
FROM #TEST t INNER JOIN #t t1
ON t.emp_id = t1.emp_id
WHERE t.AMOUNT IS NULL and t.ITEM_NME IS NULL
Use the MAX aggregate function to get the amount and item name for each employee, then replace the null values of amount and item name with those values. For validation, use the COUNT function to count the rows where amount or item name is still null. If that count is zero, the table was updated correctly.
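The fill-then-count validation described above can be rehearsed on the sample data before touching the real table. The sketch below uses Python's sqlite3 module as a stand-in for SQL Server, with a regular table named TEST instead of #TEST. It is only an illustration: the UPDATE uses correlated subqueries rather than T-SQL's UPDATE ... FROM, and it assumes each EMP_ID has a single non-null AMOUNT and ITEM_NME, so that MAX simply picks that one value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TEST "
    "(EMP_ID INTEGER, DEPT_ID INTEGER, ITEM_NBR INTEGER, AMOUNT INTEGER, ITEM_NME TEXT)"
)
conn.executemany(
    "INSERT INTO TEST VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, 15, None, None),
        (1, 20, 16, 500, "ABCD"),
        (1, 30, 17, None, None),
        (2, 10, 15, 1000, "XYZ"),
        (2, 30, 16, None, None),
        (2, 40, 17, None, None),
    ],
)

# Fill the NULL rows from the non-null record of the same employee.
# MAX ignores NULLs, so it returns the one populated value per EMP_ID.
conn.execute("""
UPDATE TEST
SET AMOUNT = (SELECT MAX(s.AMOUNT) FROM TEST s WHERE s.EMP_ID = TEST.EMP_ID),
    ITEM_NME = (SELECT MAX(s.ITEM_NME) FROM TEST s WHERE s.EMP_ID = TEST.EMP_ID)
WHERE AMOUNT IS NULL
""")

# Validation: no row should still have a NULL amount or item name.
remaining_nulls = conn.execute(
    "SELECT COUNT(*) FROM TEST WHERE AMOUNT IS NULL OR ITEM_NME IS NULL"
).fetchone()[0]

rows = conn.execute("SELECT * FROM TEST ORDER BY EMP_ID, DEPT_ID").fetchall()
print(remaining_nulls, rows)
```

On the real table, a non-zero count after the update would point to employees that have no populated record at all, which is worth checking separately.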
