SQL Server UPDATE - GROUP BY - MAX

This is SQL Server 2016. I have the following data in only one table:
custID | prodID | title | titleCount | isMasterTitle
--------+--------+--------+-------------+-----------
266 | 191750 | prod01 | 1 | 0
266 | 191750 | prod02 | 4 | 0
266 | 191750 | prod03 | 25 | 0
300 | 20125 | prod04 | 3 | 0
300 | 20125 | prod05 | 15 | 0
I want to group by custID and prodID and set isMasterTitle to 1 for the row with the maximum titleCount in each group.
So, I want the following:
custID | prodID | title | titleCount | isMasterTitle
--------+--------+----------+------------+---------------
266 | 191750 | prod01 | 1 | 0
266 | 191750 | prod02 | 4 | 0
266 | 191750 | prod03 | 25 | 1
300 | 20125 | prod04 | 3 | 0
300 | 20125 | prod05 | 15 | 1
I'm trying the following:
UPDATE [dbo].[_Variations]
SET isMasterTitle = 1
FROM [dbo].[_Variations] v1
INNER JOIN (SELECT custID, prodID, MAX(titleCount) AS mtitleCount
            FROM [dbo].[_Variations]
            GROUP BY custID, prodID) AS v2
    ON v1.custID = v2.custID
   AND v1.prodID = v2.prodID
   AND v1.titleCount = v2.mtitleCount

Try the following:
;with cte
as
(
select isMasterTitle, ROW_NUMBER() over (partition by custID, prodID order by titleCount desc) rn
from #t
)
update cte
set isMasterTitle = 1
where rn = 1
select * from #t
Your given code also works fine.

I would recommend leveraging a powerful feature of SQL Server: the updatable common table expression (CTE).
You can build a CTE that uses window functions to identify which row should be updated, and then update it directly; there is no need to join the original table again in the outer query. This makes the query shorter and typically more efficient:
with cte as (
select
isMasterTitle,
row_number() over(partition by custID, prodID order by titleCount desc) rn
from [dbo].[_Variations]
)
update cte set isMasterTitle = 1 where rn = 1
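One behavioural difference between the answers is worth a sketch: the join on MAX(titleCount) flags every row that ties for the maximum, whereas ROW_NUMBER() flags exactly one row per group. The example below is only a minimal sketch, not the answerers' code. It assumes a scratch temp table named #VariationsDemo (a hypothetical stand-in for [dbo].[_Variations]), adds title as an assumed tie-breaker so the chosen row is deterministic, and resets the other rows to 0 so the statement can be re-run safely.

```sql
-- #VariationsDemo is a hypothetical stand-in for [dbo].[_Variations].
CREATE TABLE #VariationsDemo (
    custID        int,
    prodID        int,
    title         varchar(20),
    titleCount    int,
    isMasterTitle bit
);

INSERT INTO #VariationsDemo (custID, prodID, title, titleCount, isMasterTitle)
VALUES (266, 191750, 'prod01',  1, 0),
       (266, 191750, 'prod02',  4, 0),
       (266, 191750, 'prod03', 25, 0),
       (300,  20125, 'prod04',  3, 0),
       (300,  20125, 'prod05', 15, 0);

-- Number the rows inside each (custID, prodID) group, highest titleCount first.
-- title is an assumed tie-breaker for rows that share the same titleCount.
WITH ranked AS (
    SELECT isMasterTitle,
           ROW_NUMBER() OVER (PARTITION BY custID, prodID
                              ORDER BY titleCount DESC, title) AS rn
    FROM #VariationsDemo
)
UPDATE ranked
SET isMasterTitle = CASE WHEN rn = 1 THEN 1 ELSE 0 END;

-- Inspect the result: prod03 and prod05 should now carry the flag.
SELECT custID, prodID, title, titleCount, isMasterTitle
FROM #VariationsDemo
ORDER BY custID, prodID, titleCount;

DROP TABLE #VariationsDemo;
```

If all tied rows should be flagged, as the join-based UPDATE does, replacing ROW_NUMBER() with RANK() and dropping the tie-breaker would give that behaviour.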

Related

How to check if SQL records are in a specific order

I'm having trouble figuring out how I can check if records on a table are in a specific order. The simplified table design is essentially this:
+------------+----------------+--------+
| ID (GUID) | StartDate | NumCol |
+------------+----------------+--------+
| CEE8C17... | 8/17/2019 3:11 | 22 |
| BB22001... | 8/17/2019 3:33 | 21 |
| 4D40B12... | 8/17/2019 3:47 | 21 |
| 3655125... | 8/17/2019 4:06 | 20 |
| 3456CD1... | 8/17/2019 4:22 | 20 |
| 38BAF92... | 8/17/2019 4:40 | 19 |
| E60CBE8... | 8/17/2019 5:09 | 19 |
| 5F2756B... | 8/17/2019 5:24 | 18 |
+------------+----------------+--------+
The ID column is a non-sequential GUID. The table is sorted by default on StartDate as data is entered. However, I am trying to flag instances where the NumCol values are out of descending order. The NumCol values can be identical on adjacent records, but overall they must be descending.
+--------+
| NumCol |
+--------+
| 22 |
| *20 | <- I want the ID where this occurs
| 21 |
| 20 |
| 20 |
| 19 |
| 19 |
| 18 |
+--------+
I've tried LEFT JOIN this table to itself, but can't seem to come up with an ON clause that gives the right results:
ON a.ID <> b.ID AND a.NumCol > b.NumCol
I also thought I could use OFFSET n ROWS to compare the default sorted table against one with an ORDER BY NumCol performed on it. I can't come up with anything that works.
I need a solution that will work for both SQL Server and SQL Compact.
With EXISTS:
select t.* from tablename t
where exists (
select 1 from tablename
where numcol > t.numcol and startdate > t.startdate
)
Or with row_number() window function:
select t.id, t.startdate, t.numcol
from (
select *,
row_number() over (order by startdate desc) rn1,
row_number() over (order by numcol) rn2
from tablename
) t
where rn1 > rn2
This might be easiest:
select * from T t1
where NumCol < (select max(NumCol) from T t2 where t2.StartDate > t1.StartDate);
The exists version is probably better to optimize though.
Using analytic functions you could try this approach which finds breaks in the monotonicity of consecutive rows. It might not return all the rows you're interested in seeing:
with data as (
select *, lag(NumCol) over (order by StartDate desc) as prevNumCol
from T
)
select * from data where prevNumCol > NumCol;
Here's a better solution that's probably not available in both of your environments:
with data as (
select *,
max(NumCol) over (
order by StartDate desc
rows between unbounded preceding and current row
) as prevMax
from T
)
select * from data where prevMax > NumCol;
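To see what the queries above flag, here is a small self-contained sketch against the out-of-order sample from the question. It assumes a table variable @T with an integer RowTag column standing in for the GUID ID (both names are hypothetical) and pairs the question's StartDate values with its out-of-order NumCol values. Only the EXISTS form is shown, since it needs nothing beyond a correlated subquery.

```sql
-- @T and RowTag are hypothetical; RowTag stands in for the GUID ID column.
DECLARE @T TABLE (RowTag int, StartDate datetime, NumCol int);

INSERT INTO @T (RowTag, StartDate, NumCol)
VALUES (1, '2019-08-17T03:11:00', 22),
       (2, '2019-08-17T03:33:00', 20),  -- out of order: a later row has 21
       (3, '2019-08-17T03:47:00', 21),
       (4, '2019-08-17T04:06:00', 20),
       (5, '2019-08-17T04:22:00', 20),
       (6, '2019-08-17T04:40:00', 19),
       (7, '2019-08-17T05:09:00', 19),
       (8, '2019-08-17T05:24:00', 18);

-- A row is out of order when any later row has a larger NumCol.
SELECT t.RowTag, t.StartDate, t.NumCol
FROM @T AS t
WHERE EXISTS (
    SELECT 1
    FROM @T AS later
    WHERE later.StartDate > t.StartDate
      AND later.NumCol > t.NumCol
);
```

With this data only the row tagged 2 qualifies, because it is the only row followed by a larger NumCol.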

Using groupBy to improve my Select (select count) query

Let's say we want to see all tasks that haven't been done yet, plus an additional column showing how many open tasks are left for that customer.
I have a table like this in my database:
+------------+--------------------------+-------+
| CustomerID | Task | Done |
+------------+--------------------------+-------+
| 1 | CleanRoom | False |
| 1 | Cleandishes | True |
| 1 | WashClothes | False |
| 2 | TakeDogsOut | True |
| 2 | PlayWithKids | True |
| 3 | HaveFunWithMrSamplesWife | True |
| 3 | CleanMrSamplesCar | False |
+------------+--------------------------+-------+
I need this as returned table:
+------------+-------------------+-------------+
| CustomerID | Task | DoneOverAll |
+------------+-------------------+-------------+
| 1 | CleanRoom | 2 |
| 1 | WashClothes | 2 |
| 3 | CleanMrSamplesCar | 1 |
+------------+-------------------+-------------+
The ideal result would look like the table below, but I can build that myself once I have the result above.
One question about it: producing it will probably require string concatenation. Should I do that in the SELECT statement, or would it be more advisable to do it in the final application on the client computer?
+------------+-------------------+-------------+
| CustomerID | Task | DoneOverAll |
+------------+-------------------+-------------+
| 1 | CleanRoom | 1/3 |
| 1 | WashClothes | 1/3 |
| 3 | CleanMrSamplesCar | 1/2 |
+------------+-------------------+-------------+
I know I could go like
SELECT
a.CustomerID,
a.Task,
(
Select count(*) from myTable where
customerID = a.CustomerID and
done = False
) as DoneOverAll
FROM myTable as a
WHERE Done = False
But I think this is very inefficient, since it would execute a SELECT COUNT for each row in my table. Is there a way to achieve this with a JOIN using GROUP BY or something? I'm not familiar with GROUP BY yet.
Okay, I should have tried first. I came up with the following:
Select count(*), CustomerID from myTable group by CustomerID
All I need to do now is to get this into a join.
Okay, got it. Sorry again for not trying first!
SELECT
a.CustomerID,
a.Task,
b.cnt
FROM myTable as a
LEFT JOIN (select count(*) AS cnt, CustomerID FROM myTable WHERE Done = False GROUP BY CustomerID) as b on a.CustomerID = b.CustomerID
WHERE a.Done = False
One question remains, the one asked above: should the "1/3" style string be built in the SELECT statement or in the client application?
I'm not sure why Done = False, but this is your logic. :-)
Here's what I would do, without the LEFT JOIN.
SELECT
a.CustomerID,
a.Task,
SUM(CASE WHEN a.Done = 'False' THEN 1 ELSE 0 END) DoneOverAll,
SUM(Case WHEN a.Done = 'True' THEN 1 ELSE 0 END) NotDone
FROM myTable as a
Group By a.CustomerID, a.Task
Calculate the two counts separately:
;with tempfalse as (
SELECT
a.CustomerID,
a.Task,
count(*) as DoneOverAll
FROM myTable as a
WHERE Done = False
group by a.CustomerID, a.Task
)
, temptrue as (
SELECT
a.CustomerID,
a.Task,
count(*) as total
FROM myTable as a
group by a.CustomerID, a.Task
)
SELECT
a.CustomerID,
a.Task,
cast(NULLIF(b.DoneOverAll,0) as varchar (10) ) + '/' + cast(NULLIF(a.total,0) as varchar (10) ) as DoneOverAll
from temptrue as a left join tempfalse as b
on a.CustomerID = b.CustomerID and
a.Task = b.Task
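As a further sketch, window functions can produce the "done/total" string from the question in a single pass over the table. This assumes SQL Server and that Done is a bit column; the table variable @myTable and the column names DoneCount and TotalCount are hypothetical stand-ins, not names from the original post.

```sql
-- @myTable is a hypothetical stand-in for myTable; Done is assumed to be a bit.
DECLARE @myTable TABLE (CustomerID int, Task varchar(50), Done bit);

INSERT INTO @myTable (CustomerID, Task, Done)
VALUES (1, 'CleanRoom',                0),
       (1, 'Cleandishes',              1),
       (1, 'WashClothes',              0),
       (2, 'TakeDogsOut',              1),
       (2, 'PlayWithKids',             1),
       (3, 'HaveFunWithMrSamplesWife', 1),
       (3, 'CleanMrSamplesCar',        0);

-- Compute the per-customer counts once, then keep only the open tasks.
WITH counted AS (
    SELECT CustomerID,
           Task,
           Done,
           SUM(CASE WHEN Done = 1 THEN 1 ELSE 0 END)
               OVER (PARTITION BY CustomerID) AS DoneCount,
           COUNT(*) OVER (PARTITION BY CustomerID) AS TotalCount
    FROM @myTable
)
SELECT CustomerID,
       Task,
       CAST(DoneCount AS varchar(10)) + '/' + CAST(TotalCount AS varchar(10)) AS DoneOverAll
FROM counted
WHERE Done = 0
ORDER BY CustomerID, Task;
```

The filter on Done sits in the outer query on purpose: applying it inside the CTE would remove the finished tasks before they are counted.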

Update All other Records Based on a single record

I have a table with a million records. I need to fill in some columns that are NULL, using the existing non-NULL record with the same ID. I tried one query and it seems to work, but I'm not confident that it will update all 1 million records exactly the way I need. Here is some sample data showing what my table looks like. Any help will be appreciated.
SELECT * INTO #TEST FROM (
SELECT 1 AS EMP_ID,10 AS DEPT_ID,15 AS ITEM_NBR ,NULL AS AMOUNT,NULL AS ITEM_NME
UNION ALL
SELECT 1,20,16,500,'ABCD'
UNION ALL
SELECT 1,30,17,NULL,NULL
UNION ALL
SELECT 2,10,15,1000,'XYZ'
UNION ALL
SELECT 2,30,16,NULL,NULL
UNION ALL
SELECT 2,40,17,NULL,NULL
) AS A
Sample data:
+--------+---------+----------+--------+----------+
| EMP_ID | DEPT_ID | ITEM_NBR | AMOUNT | ITEM_NME |
+--------+---------+----------+--------+----------+
| 1 | 10 | 15 | NULL | NULL |
| 1 | 20 | 16 | 500 | ABCD |
| 1 | 30 | 17 | NULL | NULL |
| 2 | 10 | 15 | 1000 | XYZ |
| 2 | 30 | 16 | NULL | NULL |
| 2 | 40 | 17 | NULL | NULL |
+--------+---------+----------+--------+----------+
Expected result:
+--------+---------+----------+--------+----------+
| EMP_ID | DEPT_ID | ITEM_NBR | AMOUNT | ITEM_NME |
+--------+---------+----------+--------+----------+
| 1 | 10 | 15 | 500 | ABCD |
| 1 | 20 | 16 | 500 | ABCD |
| 1 | 30 | 17 | 500 | ABCD |
| 2 | 10 | 15 | 1000 | XYZ |
| 2 | 30 | 16 | 1000 | XYZ |
| 2 | 40 | 17 | 1000 | XYZ |
+--------+---------+----------+--------+----------+
I tried this but I'm unable to conclude whether it is updating all the 1 million records properly.
SELECT * FROM #TEST T
inner JOIN #TEST T1 ON T1.EMP_ID=T.EMP_ID
WHERE T1.AMOUNT IS NOT NULL
UPDATE T SET AMOUNT=T1.AMOUNT
FROM #TEST T
inner JOIN #TEST T1 ON T1.EMP_ID=T.EMP_ID
WHERE T1.AMOUNT IS not NULL
I used an UPDATE with an inner join:
UPDATE T
SET T.AMOUNT = X.AMT,T.ITEM_NME=X.I_N
FROM #TEST T
JOIN
(SELECT EMP_ID,MAX(AMOUNT) AS AMT,MAX(ITEM_NME) AS I_N
FROM #TEST
GROUP BY EMP_ID) X ON X.EMP_ID = T.EMP_ID
SELECT * into #Test1
FROM #TEST
WHERE AMOUNT IS NOT NULL
To validate the records, run this query first:
SELECT T.AMOUNT, T1.AMOUNT, T.EMP_ID, T1.EMP_ID
FROM #TEST T
inner JOIN #TEST1 T1 ON T1.EMP_ID=T.EMP_ID
WHERE T.AMOUNT IS NULL
Then run the update inside a transaction so it can be rolled back after checking the result:
BEGIN TRAN
UPDATE T
SET T.AMOUNT=T1.AMOUNT, T.ITEM_NME=T1.ITEM_NME
FROM #TEST T
inner JOIN #TEST1 T1 ON T1.EMP_ID=T.EMP_ID
WHERE T.AMOUNT IS NULL
ROLLBACK
SELECT EMP_ID, MAX(AMOUNT) as AMOUNT, MAX(ITEM_NME) as ITEM_NME
INTO #t
FROM #TEST
GROUP BY EMP_ID
UPDATE t SET t.AMOUNT = t1.AMOUNT, t.ITEM_NME = t1.ITEM_NME
FROM #TEST t INNER JOIN #t t1
ON t.emp_id = t1.emp_id
WHERE t.AMOUNT IS NULL and t.ITEM_NME IS NULL
Use the MAX aggregate function to get the amount and item name for each employee, then replace the NULL amount and item name values with those values. For validation, use COUNT to find the number of rows where the amount or item name is still NULL. If that number is zero, the table was updated correctly.
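The validation idea can be sketched end to end as follows. This is a minimal sketch under stated assumptions: the table variable @test is a hypothetical stand-in for #TEST with explicitly declared column types, and the pre-check for conflicting values is an addition of this sketch, motivated by the fact that MAX silently picks one value when an employee has several different non-NULL ones.

```sql
-- @test is a hypothetical stand-in for #TEST with explicit column types.
DECLARE @test TABLE (
    EMP_ID   int,
    DEPT_ID  int,
    ITEM_NBR int,
    AMOUNT   int NULL,
    ITEM_NME varchar(10) NULL
);

INSERT INTO @test (EMP_ID, DEPT_ID, ITEM_NBR, AMOUNT, ITEM_NME)
VALUES (1, 10, 15, NULL, NULL),
       (1, 20, 16,  500, 'ABCD'),
       (1, 30, 17, NULL, NULL),
       (2, 10, 15, 1000, 'XYZ'),
       (2, 30, 16, NULL, NULL),
       (2, 40, 17, NULL, NULL);

-- Pre-check (assumed extra step): employees with more than one distinct
-- non-NULL value, for whom MAX would hide a conflict. COUNT(DISTINCT ...)
-- ignores NULLs, so this returns no rows for the sample data.
SELECT EMP_ID
FROM @test
GROUP BY EMP_ID
HAVING COUNT(DISTINCT AMOUNT) > 1 OR COUNT(DISTINCT ITEM_NME) > 1;

-- Fill the NULL rows from the per-employee MAX values.
UPDATE T
SET T.AMOUNT = X.AMT,
    T.ITEM_NME = X.I_N
FROM @test AS T
JOIN (SELECT EMP_ID, MAX(AMOUNT) AS AMT, MAX(ITEM_NME) AS I_N
      FROM @test
      GROUP BY EMP_ID) AS X
    ON X.EMP_ID = T.EMP_ID
WHERE T.AMOUNT IS NULL OR T.ITEM_NME IS NULL;

-- Post-check: rows that are still incomplete. Zero means every row was filled.
SELECT COUNT(*) AS RemainingNulls
FROM @test
WHERE AMOUNT IS NULL OR ITEM_NME IS NULL;
```

A non-zero RemainingNulls would point at employees that have no non-NULL row at all, which no variant of the UPDATE can fill.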

Where clause if there are multiple of the same ID

I have following table:
ID | source | Name | Age | ... | ...
1 | SQL | John | 18 | ... | ...
2 | SAP | Mike | 21 | ... | ...
2 | SQL | Mike | 20 | ... | ...
3 | SAP | Jill | 25 | ... | ...
I want to have one record for each ID. The idea is that if an ID occurs only once (no matter the source), that record is taken. But if there are two records for one ID, the one with SQL as its source is used.
So, in this case, the result will be:
ID | source | Name | Age | ... | ...
1 | SQL | John | 18 | ... | ...
2 | SQL | Mike | 20 | ... | ...
3 | SAP | Jill | 25 | ... | ...
I did this with a partition ordered by Source desc, but that wouldn't work well if a third source is added one day.
Any other options/ideas?
The easiest approach (in my opinion) is to use a CTE with a ranking function:
with cte as
(
select ID, source, Name, Age, ... ,
rn = row_number() over (partition by ID order by case when source = 'sql'
then 0 else 1 end asc)
from dbo.tablename
)
select ID, source, Name, Age, ...
from cte
where rn = 1
You can use ROW_NUMBER:
WITH CTE AS
(
SELECT *,
RN = ROW_NUMBER() OVER( PARTITION BY ID
ORDER BY CASE WHEN [Source] = 'SQL' THEN 1 ELSE 2 END)
FROM dbo.YourTable
)
SELECT *
FROM CTE
WHERE RN = 1;
You can use the WITH TIES clause and the window function Row_Number()
Select Top 1 With Ties *
From YourTable
Order By Row_Number() over (Partition By ID Order By Case When Source = 'SQL' Then 0 Else 1 End)
How about:
SELECT *
FROM YourTable
WHERE ID IN (
SELECT ID FROM YourTable
GROUP BY ID
HAVING COUNT(ID) = 1)
OR source = 'SQL'
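The question worries about a third source appearing later. One way the ROW_NUMBER() answers could cope with that is an explicit priority list in the CASE expression. The sketch below is illustrative only: the table variable @people and the priority order (SQL first, then SAP, then anything else) are assumptions, not part of the original post.

```sql
-- @people is a hypothetical stand-in for the table in the question.
DECLARE @people TABLE (ID int, [source] varchar(10), Name varchar(20), Age int);

INSERT INTO @people (ID, [source], Name, Age)
VALUES (1, 'SQL', 'John', 18),
       (2, 'SAP', 'Mike', 21),
       (2, 'SQL', 'Mike', 20),
       (3, 'SAP', 'Jill', 25);

-- Rank each ID's rows by an assumed source priority; unknown sources sort
-- last, and [source] itself breaks ties between them deterministically.
WITH ranked AS (
    SELECT ID, [source], Name, Age,
           ROW_NUMBER() OVER (
               PARTITION BY ID
               ORDER BY CASE [source]
                            WHEN 'SQL' THEN 1
                            WHEN 'SAP' THEN 2
                            ELSE 3
                        END,
                        [source]) AS rn
    FROM @people
)
SELECT ID, [source], Name, Age
FROM ranked
WHERE rn = 1
ORDER BY ID;
```

Adding a new source then only means adding one WHEN branch at the right priority; if the list grows, the same ordering could come from a small lookup table joined in place of the CASE.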

TSQL pivot issue

Hello I have a temp table (#tempResult) that contains results like the following...
-----------------------------------------
| DrugAliasID | Dosage1 | Unit1 | rowID |
-----------------------------------------
| 322 | 10 | MG | 1 |
| 322 | 50 | ML | 2 |
| 441 | 20 | ML | 3 |
| 443 | 15 | ML | 4 |
-----------------------------------------
I'm looking to get the results to be like the following, pivoting the rows that have the same DrugAliasID.
--------------------------------------------------
| DrugAliasID | Dosage1 | Unit1 | Dosage2 | Unit2 |
--------------------------------------------------
| 322 | 10 | MG | 50 | ML |
| 441 | 20 | ML | NULL | NULL |
| 443 | 15 | ML | NULL | NULL |
--------------------------------------------------
So far I have a solution that isn't using pivot. I'm not too good with pivot and was wondering if anyone knew how to use it in this scenario. Or solve it some other way. Thanks
SELECT
tr.drugAliasID,
MIN(trmin.dosage1) AS dosage1,
MIN(trmin.unit1) AS unit1,
MIN(trmax.dosage1) AS dosage2,
MIN(trmax.unit1) AS unit2
FROM
#tempResult tr
JOIN
#tempResult trmin ON trmin.RowID = tr.rowid AND trmin.drugAliasID = tr.drugAliasID
JOIN
#tempResult trmax ON trmax.RowID = tr.rowid AND trmax.drugAliasID = tr.drugAliasID
JOIN
(SELECT
MIN(RowID) AS rowid,
drugAliasID
FROM
#tempResult
GROUP BY
drugAliasID) tr1 ON tr1.rowid = trmin.RowID
JOIN
(SELECT
MAX(RowID) AS rowid,
drugAliasID
FROM
#tempResult
GROUP BY
drugAliasID) tr2 ON tr2.rowid = tr.RowID
GROUP BY
tr.drugAliasID
HAVING
count(tr.drugAliasID) > 1
Assuming your version of SQL Server supports the use of CTEs, you can simplify your query thus:
;with cte as
(select *, row_number() over (partition by drugaliasid order by rowid) rn
from #tempResult
)
select c.drugaliasid, c.dosage1, c.unit1, c2.dosage1 as dosage2, c2.unit1 as unit2
from cte c
left join cte c2 on c.drugaliasid = c2.drugaliasid and c.rn = 1 and c2.rn = 2
where c.rn = 1
This will give you the desired result, without having to use the pivot keyword.
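Another common alternative to the PIVOT keyword is conditional aggregation over the same ROW_NUMBER() numbering. The following is a minimal sketch, assuming a table variable @tempResult as a hypothetical stand-in for #tempResult and at most two rows per DrugAliasID, as in the sample data.

```sql
-- @tempResult is a hypothetical stand-in for #tempResult.
DECLARE @tempResult TABLE (DrugAliasID int, Dosage1 int, Unit1 varchar(5), rowID int);

INSERT INTO @tempResult (DrugAliasID, Dosage1, Unit1, rowID)
VALUES (322, 10, 'MG', 1),
       (322, 50, 'ML', 2),
       (441, 20, 'ML', 3),
       (443, 15, 'ML', 4);

-- Number the rows per drug, then fold row 1 and row 2 into separate columns.
-- A CASE without ELSE yields NULL, so drugs with one row get NULL in slot 2.
WITH numbered AS (
    SELECT DrugAliasID, Dosage1, Unit1,
           ROW_NUMBER() OVER (PARTITION BY DrugAliasID ORDER BY rowID) AS rn
    FROM @tempResult
)
SELECT DrugAliasID,
       MAX(CASE WHEN rn = 1 THEN Dosage1 END) AS Dosage1,
       MAX(CASE WHEN rn = 1 THEN Unit1   END) AS Unit1,
       MAX(CASE WHEN rn = 2 THEN Dosage1 END) AS Dosage2,
       MAX(CASE WHEN rn = 2 THEN Unit1   END) AS Unit2
FROM numbered
GROUP BY DrugAliasID
ORDER BY DrugAliasID;
```

Supporting a third dosage would mean adding another pair of MAX(CASE WHEN rn = 3 ...) columns, which avoids another self-join.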
