I have a recursive query that works as intended for calculating weighted average cost for inventory. My problem is that I need several weighted averages from the same query, each grouped by a different column. I know I can solve this by running the calculation multiple times, once for each key column, but for performance reasons I want the data to be traversed only once. Sometimes I have 1M+ rows.
I have simplified the data and replaced the weighted average with a simple sum to make the problem easier to follow.
How can I get the result below using a recursive CTE? Remember that I have to use a recursive query to calculate the weighted average cost. I am on SQL Server 2016.
Example data (Id is also the sort order. Id and Key are unique together.)
Id Key1 Key2 Key3 Value
1 1 1 1 10
2 1 1 1 10
3 1 2 1 10
4 2 2 1 10
5 1 2 1 10
6 1 1 2 10
7 1 1 1 10
8 3 3 1 10
Expected result
Id Key1 Key2 Key3 Value Key1Sum Key2Sum Key3Sum
1 1 1 1 10 10 10 10
2 1 1 1 10 20 20 20
3 1 2 1 10 30 10 30
4 2 2 1 10 10 20 40
5 1 2 1 10 40 30 50
6 1 1 2 10 50 30 10
7 1 1 1 10 60 40 60
8 3 3 1 10 10 10 70
EDIT
After some well-deserved criticism, I need to do a much better job of asking the question.
Here is an example that shows why I need a recursive query. In the example I get the result for Key1, but I need it for Key2 and Key3 as well in the same query. I know that I can repeat the same query three times, but that is not preferable.
DECLARE @InventoryItem AS TABLE (
    IntentoryItemId INT NULL,
    InventoryOrder INT,
    Key1 INT NULL,
    Key2 INT NULL,
    Key3 INT NULL,
    Quantity NUMERIC(22,9) NOT NULL,
    Price NUMERIC(16,9) NOT NULL
);
INSERT INTO @InventoryItem (
    IntentoryItemId,
    InventoryOrder,
    Key1,
    Key2,
    Key3,
    Quantity,
    Price
)
VALUES
    (1, NULL, 1, 1, 1, 10, 1),
    (2, NULL, 1, 1, 1, 10, 2),
    (3, NULL, 1, 2, 1, 10, 2),
    (4, NULL, 2, 2, 1, 10, 1),
    (5, NULL, 1, 2, 1, 10, 5),
    (6, NULL, 1, 1, 2, 10, 3),
    (7, NULL, 1, 1, 1, 10, 3),
    (8, NULL, 3, 3, 1, 10, 1);
-- The steps below will give me the cost "grouped" by Key1
-- Step 1: number the rows within each Key1 value
WITH Key1RowNumber AS (
    SELECT
        IntentoryItemId,
        ROW_NUMBER() OVER (PARTITION BY Key1 ORDER BY IntentoryItemId) AS RowNumber
    FROM @InventoryItem
)
UPDATE @InventoryItem
SET InventoryOrder = Key1RowNumber.RowNumber
FROM @InventoryItem InventoryItem
INNER JOIN Key1RowNumber
    ON Key1RowNumber.IntentoryItemId = InventoryItem.IntentoryItemId;
-- Step 2: walk the rows of each Key1 value in order, carrying quantity and average price
WITH cte AS (
    -- Anchor: first row of every Key1 value
    SELECT
        IntentoryItemId,
        InventoryOrder,
        Key1,
        Quantity,
        Price,
        CONVERT(NUMERIC(22,9), InventoryItem.Quantity) AS CurrentQuantity,
        CONVERT(NUMERIC(22,9), (InventoryItem.Quantity * InventoryItem.Price) / NULLIF(InventoryItem.Quantity, 0)) AS AvgPrice
    FROM @InventoryItem InventoryItem
    WHERE InventoryItem.InventoryOrder = 1
    UNION ALL
    -- Recursive member: combine the previous running state with the next row
    SELECT
        Sub.IntentoryItemId,
        Sub.InventoryOrder,
        Sub.Key1,
        Sub.Quantity,
        Sub.Price,
        CONVERT(NUMERIC(22,9), Main.CurrentQuantity + Sub.Quantity) AS CurrentQuantity,
        CONVERT(NUMERIC(22,9),
            ((Main.CurrentQuantity) * Main.AvgPrice + Sub.Quantity * Sub.Price)
            /
            NULLIF((Main.CurrentQuantity) + Sub.Quantity, 0)
        ) AS AvgPrice
    FROM cte Main
    INNER JOIN @InventoryItem Sub
        ON Main.Key1 = Sub.Key1
        AND Sub.InventoryOrder = Main.InventoryOrder + 1
)
SELECT cte.IntentoryItemId, cte.AvgPrice
FROM cte
ORDER BY IntentoryItemId
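To make the target concrete, here is a minimal sketch in Python (not T-SQL) of the same recurrence applied to all three key columns in a single pass over the rows. The function name `running_weighted_avg` and the dictionary-based state are illustrative assumptions, not part of the original query; the point is only to show which per-row values a single-traversal solution would have to produce.

```python
from collections import defaultdict

# Sample rows from the question: (item_id, key1, key2, key3, quantity, price).
ROWS = [
    (1, 1, 1, 1, 10, 1),
    (2, 1, 1, 1, 10, 2),
    (3, 1, 2, 1, 10, 2),
    (4, 2, 2, 1, 10, 1),
    (5, 1, 2, 1, 10, 5),
    (6, 1, 1, 2, 10, 3),
    (7, 1, 1, 1, 10, 3),
    (8, 3, 3, 1, 10, 1),
]


def running_weighted_avg(rows):
    """Return {item_id: [avg_by_key1, avg_by_key2, avg_by_key3]} in one pass."""
    # One dictionary per key column; each maps key value -> (quantity, avg price).
    state = [defaultdict(lambda: (0.0, 0.0)) for _ in range(3)]
    result = {}
    for item_id, key1, key2, key3, qty, price in sorted(rows):
        avgs = []
        for column, key_value in enumerate((key1, key2, key3)):
            cur_qty, cur_avg = state[column][key_value]
            new_qty = cur_qty + qty
            # Same recurrence as the recursive member of the CTE.
            # A zero total gives None here, like NULLIF(..., 0) gives NULL.
            # Unlike SQL, the next row then restarts from 0 instead of
            # propagating NULL (a simplification of this sketch).
            new_avg = (cur_qty * cur_avg + qty * price) / new_qty if new_qty else None
            state[column][key_value] = (new_qty, new_avg or 0.0)
            avgs.append(new_avg)
        result[item_id] = avgs
    return result


if __name__ == "__main__":
    for item_id, avgs in running_weighted_avg(ROWS).items():
        print(item_id, [round(a, 4) for a in avgs])
```

For item 5, for example, the Key1 average is 2.5 (prices 1, 2, 2, 5 at equal quantities), while the Key2 average is about 2.6667 (prices 2, 1, 5).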
Why do you want to calculate over 1M+ rows?
Secondly, I think your database design may be wrong: Key1, Key2 and Key3 should be unpivoted into one column called Keys, plus one more column to identify each key group.
The example below should make this clear.
If I can optimize a query, I will consider running it over many rows; otherwise I try to limit the number of rows.
Also, if possible, consider storing AvgPrice as a calculated column, i.e. calculate and store it when the table is populated.
First, let us know whether the output is correct.
DECLARE @InventoryItem AS TABLE (
    IntentoryItemId INT NULL,
    InventoryOrder INT,
    Key1 INT NULL,
    Key2 INT NULL,
    Key3 INT NULL,
    Quantity NUMERIC(22,9) NOT NULL,
    Price NUMERIC(16,9) NOT NULL
);
INSERT INTO @InventoryItem (
    IntentoryItemId,
    InventoryOrder,
    Key1,
    Key2,
    Key3,
    Quantity,
    Price
)
VALUES
    (1, NULL, 1, 1, 1, 10, 1),
    (2, NULL, 1, 1, 1, 10, 2),
    (3, NULL, 1, 2, 1, 10, 2),
    (4, NULL, 2, 2, 1, 10, 1),
    (5, NULL, 1, 2, 1, 10, 5),
    (6, NULL, 1, 1, 2, 10, 3),
    (7, NULL, 1, 1, 1, 10, 3),
    (8, NULL, 3, 3, 1, 10, 1);
--select * from @InventoryItem
--return
-- cte: number the rows once per key column
;with cte as
(
    select *
        , ROW_NUMBER() OVER (PARTITION BY Key1 ORDER BY IntentoryItemId) AS rn1
        , ROW_NUMBER() OVER (PARTITION BY Key2 ORDER BY IntentoryItemId) AS rn2
        , ROW_NUMBER() OVER (PARTITION BY Key3 ORDER BY IntentoryItemId) AS rn3
    from @InventoryItem
)
-- cte1: unpivot Key1/Key2/Key3 into one Keys column; pk identifies the key group
,cte1 AS (
    SELECT
        IntentoryItemId,
        Key1 keys,
        Quantity,
        Price
        ,rn1
        ,rn1 rn
        ,1 pk
    FROM cte c
    union ALL
    SELECT
        IntentoryItemId,
        Key2 keys,
        Quantity,
        Price
        ,rn1
        ,rn2 rn
        ,2 pk
    FROM cte c
    union ALL
    SELECT
        IntentoryItemId,
        Key3 keys,
        Quantity,
        Price
        ,rn1
        ,rn3 rn
        ,3 pk
    FROM cte c
)
-- cte2: one recursion that walks every (pk, Keys) chain
, cte2 AS (
    SELECT
        IntentoryItemId,
        rn,
        Keys,
        Quantity,
        Price,
        CONVERT(NUMERIC(22,9), InventoryItem.Quantity) AS CurrentQuantity,
        CONVERT(NUMERIC(22,9), (InventoryItem.Quantity * InventoryItem.Price)) a,
        CONVERT(NUMERIC(22,9), InventoryItem.Price) b,
        CONVERT(NUMERIC(22,9), (InventoryItem.Quantity * InventoryItem.Price) / NULLIF(InventoryItem.Quantity, 0)) AS AvgPrice
        ,pk
    FROM cte1 InventoryItem
    WHERE InventoryItem.rn = 1
    UNION ALL
    SELECT
        Sub.IntentoryItemId,
        Sub.rn,
        Sub.Keys,
        Sub.Quantity,
        Sub.Price,
        CONVERT(NUMERIC(22,9), Main.CurrentQuantity + Sub.Quantity) AS CurrentQuantity,
        CONVERT(NUMERIC(22,9), Main.CurrentQuantity * Main.AvgPrice),
        CONVERT(NUMERIC(22,9), Sub.Quantity * Sub.Price),
        CONVERT(NUMERIC(22,9),
            ((Main.CurrentQuantity * Main.AvgPrice) + (Sub.Quantity * Sub.Price))
            /
            NULLIF(((Main.CurrentQuantity) + Sub.Quantity), 0)
        ) AS AvgPrice
        ,Sub.pk
    FROM cte2 Main
    INNER JOIN cte1 Sub
        ON Main.Keys = Sub.Keys and Main.pk = Sub.pk
        AND Sub.rn = Main.rn + 1
        --and Sub.InventoryOrder<=2
)
-- One output row per item: AvgPrice is for Key1 (pk=1); the subqueries fetch Key2 and Key3
select *
    ,(select AvgPrice from cte2 c1 where pk=2 and c1.IntentoryItemId=c.IntentoryItemId ) AvgPrice2
    ,(select AvgPrice from cte2 c1 where pk=3 and c1.IntentoryItemId=c.IntentoryItemId ) AvgPrice3
from cte2 c
where pk=1
ORDER BY pk,rn
Alternate solution (for SQL Server 2012+), with many thanks to Jason:
SELECT *
,CONVERT(NUMERIC(22,9),avg((Quantity * Price) / NULLIF(Quantity, 0))
OVER(PARTITION BY Key1 ORDER by IntentoryItemId ROWS UNBOUNDED PRECEDING))AvgKey1Price
,CONVERT(NUMERIC(22,9),avg((Quantity * Price) / NULLIF(Quantity, 0))
OVER(PARTITION BY Key2 ORDER by IntentoryItemId ROWS UNBOUNDED PRECEDING))AvgKey2Price
,CONVERT(NUMERIC(22,9),avg((Quantity * Price) / NULLIF(Quantity, 0))
OVER(PARTITION BY Key3 ORDER by IntentoryItemId ROWS UNBOUNDED PRECEDING))AvgKey3Price
from @InventoryItem
order by IntentoryItemId
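One thing worth checking before relying on the windowed AVG above: `(Quantity * Price) / Quantity` reduces to `Price`, so the expression is a running average of prices without any quantity weighting. The sample data uses a quantity of 10 on every row, so both forms agree there. The Python sketch below (an arithmetic illustration only, with hypothetical lot data and function names) compares the two on unequal quantities. In SQL, the weighted form would presumably be a running `SUM(Quantity * Price)` divided by a running `SUM(Quantity)` over the same window.

```python
def running_simple_avg(lots):
    """Running mean of price, ignoring quantity (what AVG((Q*P)/Q) computes)."""
    out, total = [], 0.0
    for count, (qty, price) in enumerate(lots, start=1):
        total += (qty * price) / qty  # reduces to price when qty is non-zero
        out.append(total / count)
    return out


def running_weighted_price(lots):
    """Running sum(Q*P) / sum(Q), i.e. the quantity-weighted average price."""
    out, value, quantity = [], 0.0, 0.0
    for qty, price in lots:
        value += qty * price
        quantity += qty
        out.append(value / quantity)
    return out


equal_lots = [(10, 1.0), (10, 2.0), (10, 2.0)]  # equal quantities, as in the sample
unequal_lots = [(10, 1.0), (30, 2.0)]           # hypothetical unequal quantities

if __name__ == "__main__":
    print(running_simple_avg(unequal_lots))      # [1.0, 1.5]
    print(running_weighted_price(unequal_lots))  # [1.0, 1.75]
```

With equal quantities the two functions return the same series, which is why the sample data cannot reveal the difference.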
Here's how to do it in SQL Server 2012 & later...
IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL
DROP TABLE #TestData;
CREATE TABLE #TestData (
Id INT,
Key1 INT,
Key2 INT,
Key3 INT,
[Value] INT
);
INSERT #TestData(Id, Key1, Key2, Key3, Value) VALUES
(1, 1, 1, 1, 10),
(2, 1, 1, 1, 10),
(3, 1, 2, 1, 10),
(4, 2, 2, 1, 10),
(5, 1, 2, 1, 10),
(6, 1, 1, 2, 10),
(7, 1, 1, 1, 10),
(8, 3, 3, 1, 10);
--=============================================================
SELECT
td.Id, td.Key1, td.Key2, td.Key3, td.Value,
Key1Sum = SUM(td.[Value]) OVER (PARTITION BY td.Key1 ORDER BY td.Id ROWS UNBOUNDED PRECEDING),
Key2Sum = SUM(td.[Value]) OVER (PARTITION BY td.Key2 ORDER BY td.Id ROWS UNBOUNDED PRECEDING),
Key3Sum = SUM(td.[Value]) OVER (PARTITION BY td.Key3 ORDER BY td.Id ROWS UNBOUNDED PRECEDING)
FROM
#TestData td
ORDER BY
td.Id;
results...
Id Key1 Key2 Key3 Value Key1Sum Key2Sum Key3Sum
----------- ----------- ----------- ----------- ----------- ----------- ----------- -----------
1 1 1 1 10 10 10 10
2 1 1 1 10 20 20 20
3 1 2 1 10 30 10 30
4 2 2 1 10 10 20 40
5 1 2 1 10 40 30 50
6 1 1 2 10 50 30 10
7 1 1 1 10 60 40 60
8 3 3 1 10 10 10 70
First, I apologize if the title won't make sense but below is the detailed scenario.
Say I have a document_revision table
id document_id phase_id user_id
1 1 3 1
2 1 2 1
3 1 1 1
4 2 3 2
5 2 2 2
where phase_id is: transcribe = 3; proof = 2; and submit = 1.
I would like to write a query that filters the revision records so that a proof phase is disregarded if the same user did both the transcribe and the proof. So the output would be:
id document_id phase_id user_id
1 1 3 1
3 1 1 1
4 2 3 2
I've been struggling for hours figuring out a query for this but no luck so far.
Assuming you only want the phase 3 for any case where a user_id was involved in phase 2 and 3, then one way you could do this is with ROW_NUMBER(), e.g.:
DECLARE @T TABLE (ID INT IDENTITY(1, 1), Document_ID INT, Phase_ID INT, [User_ID] INT);
INSERT @T (Document_ID, Phase_ID, [User_ID]) VALUES
    (1, 1, 1), (1, 2, 1), (1, 3, 1), (2, 3, 2), (2, 2, 2), (3, 1, 1), (3, 2, 1), (3, 3, 2);
SELECT ID, Document_ID, Phase_ID, [User_ID]
FROM
(
    -- Phases 2 and 3 share one partition per document and user, so only the highest of them gets RN = 1
    SELECT *, RN = ROW_NUMBER() OVER (PARTITION BY Document_ID, [User_ID], CASE WHEN Phase_ID IN (2, 3) THEN 2 ELSE Phase_ID END ORDER BY Phase_ID DESC)
    FROM @T
) AS T
WHERE RN = 1;
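The partitioning trick above can be traced by hand. The following Python sketch (not T-SQL; the function name `filter_revisions` is an assumption for illustration) applies the same idea to the rows from the question: phases 2 and 3 fall into one bucket per document and user, and only the row with the highest phase in each bucket is kept.

```python
# Rows from the question: (id, document_id, phase_id, user_id).
REVISIONS = [
    (1, 1, 3, 1),
    (2, 1, 2, 1),
    (3, 1, 1, 1),
    (4, 2, 3, 2),
    (5, 2, 2, 2),
]


def filter_revisions(rows):
    """Keep the highest phase per (document, user, bucket); 2 and 3 share a bucket."""
    best = {}
    for row in rows:
        _id, document_id, phase_id, user_id = row
        bucket = 2 if phase_id in (2, 3) else phase_id
        key = (document_id, user_id, bucket)
        # Equivalent to ORDER BY Phase_ID DESC and keeping row number 1.
        if key not in best or phase_id > best[key][2]:
            best[key] = row
    return sorted(best.values())


if __name__ == "__main__":
    for row in filter_revisions(REVISIONS):
        print(row)
```

On the question's data this keeps ids 1, 3 and 4, which matches the output the question asks for.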
DECLARE @document_revision TABLE (
    id INT IDENTITY(1,1),
    document_id INT,
    phase_id INT,
    user_id INT
);
INSERT INTO @document_revision
    (document_id, phase_id, user_id)
VALUES
    (1, 3, 1),
    (1, 2, 1),
    (1, 1, 1),
    (2, 3, 2),
    (2, 2, 2),
    -- To test a scenario where there is a proof and a submit with no transcribe phase for the same document
    (3, 2, 3),
    (3, 1, 3),
    -- To test a scenario where there is a transcribe and a submit with no proof phase for the same document
    (4, 3, 4),
    (4, 1, 4),
    -- To test a scenario where there is a proof with no transcribe phase (document_id 5), on a different document but with the same user as above
    (5, 2, 4);
SELECT dr.id
    , dr.document_id
    , dr.phase_id
    , dr.user_id
FROM @document_revision AS dr
WHERE NOT EXISTS ( SELECT 1
                   FROM @document_revision AS temp
                   -- Same user
                   WHERE temp.user_id = dr.user_id
                   -- Same document
                   AND temp.document_id = dr.document_id
                   -- To check if there is already a transcribe phase_id with the same user_id and document_id
                   AND temp.phase_id = 3
                   -- To check if the current row is a proof phase_id
                   AND dr.phase_id = 2 )
results:
id document_id phase_id user_id
1 1 3 1
3 1 1 1
4 2 3 2
6 3 2 3
7 3 1 3
8 4 3 4
9 4 1 4
10 5 2 4
I have data that looks like ID and Col1, where the value 01 in Col1 denotes the start of a related group of rows lasting until the next 01.
Sample Data:
ID Col1
1 01
2 02
3 02
---------
4 01
5 02
6 03
7 03
----------
8 01
9 03
----------
10 01
I need to calculate GroupTotal, which is a running count of the '01' values in Col1, and also GroupID, which is an incrementing ID that resets at every instance of '01' in Col1. Row order must be preserved by ID.
Desired Results:
ID Col1 GroupTotal GroupID
1 01 1 1
2 02 1 2
3 02 1 3
----------------------------
4 01 2 1
5 02 2 2
6 03 2 3
7 03 2 4
----------------------------
8 01 3 1
9 03 3 2
----------------------------
10 01 4 1
I've been messing with OVER, PARTITION BY etc. and cannot crack either.
Thanks
I believe what the OP is saying is that the only data available is a table with the id and col1 data, and that the desired results is what is currently posted in the question.
If that is the case, you just need the following.
Sample Data Setup:
declare @grp_tbl table (id int, col1 int)
insert into @grp_tbl (id, col1)
values (1, 1),(2, 2),(3, 2),(4, 1),(5, 2),(6, 3),(7, 3),(8, 1),(9, 3),(10, 1)
Answer:
declare @max_id int = (select max(id) from @grp_tbl)
; with grp_cnt as
(
    --getting the range of ids that are in each group
    --and ranking them
    select gt.id
        , lead(gt.id - 1, 1, @max_id) over (order by gt.id asc) as id_max --max id in the group
        , row_number() over (order by gt.id asc) as grp_ttl
    from @grp_tbl as gt
    where 1=1
    and gt.col1 = 1
)
--ranking the range of ids inside each group
select gt.id
    , gt.col1
    , gc.grp_ttl as group_total
    , row_number() over (partition by gc.grp_ttl order by gt.id asc) as group_id
from @grp_tbl as gt
left join grp_cnt as gc on gt.id between gc.id and gc.id_max
Final Results:
id col1 group_total group_id
1 1 1 1
2 2 1 2
3 2 1 3
4 1 2 1
5 2 2 2
6 3 2 3
7 3 2 4
8 1 3 1
9 3 3 2
10 1 4 1
If I understood correctly, this is what you want:
CREATE TABLE #tmp
([ID] int, [Col1] int, [GroupTotal] int, [GroupID] int)
;
INSERT INTO #tmp
([ID], [Col1], [GroupTotal], [GroupID])
VALUES
(1, 01, 1, 1),
(2, 02, 1, 2),
(3, 02, 1, 3),
(4, 01, 2, 1),
(5, 02, 2, 2),
(6, 03, 2, 3),
(7, 03, 2, 4),
(8, 01, 3, 1),
(9, 03, 3, 2),
(10, 01, 4, 1)
;
select *, row_number() over (partition by Grp order by ID) as GrpID From (
    select ID, Col1, [GroupTotal],
        sum(case when Col1 = '01' then 1 else 0 end) over (Order by ID) as Grp,
        [GroupID]
    from #tmp
) X
The SUM with CASE builds the groups: it adds 1 every time Col1 = 01, and that running total is then used to partition the ROW_NUMBER().
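The running-total idea can be followed row by row. Here is a small Python sketch of the same logic (an illustration only; `tag_groups` is a hypothetical name): the group counter goes up by one at every row where Col1 is 1, and the position inside the group restarts at that row.

```python
# (id, col1) pairs from the question, with '01' stored as the integer 1.
SAMPLE = [(1, 1), (2, 2), (3, 2), (4, 1), (5, 2),
          (6, 3), (7, 3), (8, 1), (9, 3), (10, 1)]


def tag_groups(rows):
    """Return (id, col1, group_total, group_id) tuples in id order."""
    out = []
    group_total = 0  # running count of rows with col1 == 1 (the SUM(CASE ...) OVER)
    group_id = 0     # position inside the current group (the ROW_NUMBER())
    for row_id, col1 in sorted(rows):
        if col1 == 1:
            group_total += 1
            group_id = 0
        group_id += 1
        out.append((row_id, col1, group_total, group_id))
    return out


if __name__ == "__main__":
    for row in tag_groups(SAMPLE):
        print(row)
```

Rows that appear before the first 01 would all land in group 0 in this sketch, which is also what the running SUM produces for them.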
I'm not really sure what you are after, but you are on the right track with partitioning functions. The following calculates a running total of GroupID within each GroupTotal. I'm sure that's not what you want, but it shows you how such a calculation can be done.
select *, SUM(GroupId) over (partition by grouptotal order by id)
from #tmp
order by grouptotal, id
I have a data set in Microsoft SQL Server that looks like this:
ID Value1 Value2
1 8 4
1 4 2
1 9 3
1 3 1
2 4 9
2 5 7
2 6 4
2 7 5
2 1 1
I am trying to pull, for each ID, only the row that contains the maximum value of Value1. The result should be as follows:
ID Value1 Value2
1 9 3
2 7 5
The following is what I have tried, without success. It works if Value2 is removed.
USE [Database]
SELECT [ID],
MAX([Value1]) as Value1,
[Value2]
FROM [dbo].[Datatable]
GROUP BY [ID]
The ROW_NUMBER() window function can be used to partition the table by ID and rank the data within (by Value1 in descending order, in this case).
DECLARE @DataTable TABLE (ID INT, Value1 INT, Value2 INT);
INSERT @DataTable (ID, Value1, Value2)
VALUES (1, 8, 4)
    , (1, 4, 2)
    , (1, 9, 3)
    , (1, 3, 1)
    , (2, 4, 9)
    , (2, 5, 7)
    , (2, 6, 4)
    , (2, 7, 5)
    , (2, 1, 1);
SELECT ID, Value1, Value2
FROM (
    -- RN = 1 marks the row with the highest Value1 within each ID
    SELECT ID, Value1, Value2, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Value1 DESC) RN
    FROM @DataTable) T
WHERE RN = 1;
Alternatively, if several rows can tie on the maximum value of Value1 and you want all of them, use RANK() (or DENSE_RANK()) instead.
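The difference between the two choices only shows up when there is a tie. The Python sketch below (an illustration, not T-SQL; both function names and the extra tied row are hypothetical) mimics keeping `ROW_NUMBER() = 1` versus keeping `RANK() = 1`. Note that in SQL Server the row picked by ROW_NUMBER() among tied rows is not guaranteed unless the ORDER BY is made unique; this sketch simply keeps the first one it sees.

```python
# (ID, Value1, Value2) rows from the question.
DATA = [
    (1, 8, 4), (1, 4, 2), (1, 9, 3), (1, 3, 1),
    (2, 4, 9), (2, 5, 7), (2, 6, 4), (2, 7, 5), (2, 1, 1),
]


def top_row_per_id(rows):
    """One row per ID with the largest Value1, like ROW_NUMBER() ... = 1."""
    best = {}
    for row in rows:
        row_id, value1, _value2 = row
        if row_id not in best or value1 > best[row_id][1]:
            best[row_id] = row
    return sorted(best.values())


def top_rows_with_ties(rows):
    """Every row per ID that shares the largest Value1, like RANK() ... = 1."""
    max_value1 = {}
    for row_id, value1, _value2 in rows:
        max_value1[row_id] = max(value1, max_value1.get(row_id, value1))
    return sorted(row for row in rows if row[1] == max_value1[row[0]])


if __name__ == "__main__":
    print(top_row_per_id(DATA))                     # [(1, 9, 3), (2, 7, 5)]
    print(top_rows_with_ties(DATA + [(2, 7, 8)]))   # adds a hypothetical tied row
```

Both functions return whole rows, so Value2 always comes from the same row as the maximum Value1.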
Did you try taking the max of Value2 as well? Like this:
USE [Database]
SELECT [ID],
MAX([Value1]) as Value1,
MAX([Value2]) as Value2
FROM [dbo].[Datatable]
GROUP BY [ID]
I am trying to work out how to tag unique blocks (or segments, if you will), where a block is a run of consecutive rows for a 'trip', ordered by 'epoch', that share the same 'code'. A plain GROUP BY 'trip', 'code' will not work here, because I need to measure how long the 'code' remains constant within the trip. I've tried to use a CTE, but I have been unable to partition the data in a way that gives the desired result shown below. The block number could be any value, as long as it is unique and tags the consecutive occurrences of the same 'code' on the trip in order of 'epoch'.
Any ideas?
declare @data table (id int, trip int, code int NULL, epoch int, value1 int, value2 int);
insert into @data (id, trip, code, epoch, value1, value2)
values
    (1, 1, null, 31631613, 0, 0),
    (2, 2, 1, 31631614, 10, 40),
    (3, 1, 1, 31631616, 10, 60),
    (4, 1, 1, 31631617, 40, 60),
    (5, 2, 1, 31631617, 23, 40),
    (6, 2, 2, 31631620, 27, 40),
    (7, 2, 2, 31631629, 23, 40),
    (9, 1, 1, 31631618, 39, 60),
    (10, 1, null, 31631621, 38, 60),
    (12, 1, null, 31631625, 37, 60),
    (15, 1, null, 31631627, 35, 60),
    (19, 1, 1, 31631630, 39, 60),
    (20, 1, 1, 31631632, 40, 60),
    (21, 2, 1, 31631629, 23, 40);
block id trip code epoch value1 value2
1 1 1 NULL 31631613 0 0
2 2 2 1 31631614 10 40
2 5 2 1 31631617 23 40
3 3 1 1 31631616 10 60
3 4 1 1 31631617 40 60
3 9 1 1 31631618 39 60
4 6 2 2 31631620 27 40
4 7 2 2 31631629 23 40
5 10 1 NULL 31631621 38 60
5 12 1 NULL 31631625 37 60
5 15 1 NULL 31631627 35 60
6 19 1 1 31631630 39 60
6 20 1 1 31631632 40 60
7 21 2 1 31631629 23 40
You didn't update your expected output so I'm still not 100% sure this is what you want, but give it a try...
SELECT
DENSE_RANK() OVER (ORDER BY trip, code),
*
FROM
@data
ORDER BY
trip, code, epoch
OK, it's far from perfect, but it is a start that at least identifies the start and end of each contiguous block where the 'code' has remained the same for the trip. For the sake of contributing something, I'll post what I jury-rigged. If I ever get time to do a proper job, I'll post it.
declare @minint int; set @minint = -2147483648;
declare @maxint int; set @maxint = 2147483647;
declare @id_data table (pk int IDENTITY(1,1), id int, trip int, code int NULL, epoch int, value1 int, value2 int);
-- Sentinel rows before and after the data, so the first real row always starts a group
insert into @id_data VALUES(@minint, @minint, @minint, @minint, @minint, @minint);
insert into @id_data
SELECT id, trip, coalesce(code,0), epoch, value1, value2
FROM @data
order by trip, epoch, code;
insert into @id_data VALUES(@maxint, @maxint, @maxint, @maxint, @maxint, @maxint);
WITH CTE as
(
    SELECT pk, id, trip, code, epoch, value1, value2, ROW_NUMBER() OVER (PARTITION BY trip ORDER BY epoch) as row_num
    FROM @id_data
)
SELECT B.*, A.code, C.min_next_code
FROM CTE A
INNER JOIN CTE B ON (B.pk = A.pk + 1) AND (A.code != B.code) -- SELECTS THE RECORDS THAT START A NEW GROUP
OUTER APPLY (
    SELECT min_next_code = MIN(pk) - 1 -- LOCATION OF NEXT GROUP
    FROM CTE
    WHERE pk > B.pk AND (trip = B.trip) AND (code != B.code)
) C
WHERE B.id < @maxint
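For reference, the block-tagging rule described in the question can be written as a simple scan. The Python sketch below (an illustration, not T-SQL) sorts the rows by trip and epoch and starts a new block whenever the trip or the code changes from the previous row. The name `tag_blocks`, the omission of value1 and value2, and breaking epoch ties by id are all assumptions of this sketch. The block numbers differ from the ones in the question, which is allowed since they only need to be unique.

```python
# (id, trip, code, epoch) from the question; value1 and value2 are left out.
DATA = [
    (1, 1, None, 31631613), (2, 2, 1, 31631614), (3, 1, 1, 31631616),
    (4, 1, 1, 31631617), (5, 2, 1, 31631617), (6, 2, 2, 31631620),
    (7, 2, 2, 31631629), (9, 1, 1, 31631618), (10, 1, None, 31631621),
    (12, 1, None, 31631625), (15, 1, None, 31631627), (19, 1, 1, 31631630),
    (20, 1, 1, 31631632), (21, 2, 1, 31631629),
]


def tag_blocks(rows):
    """Return {id: block}; a block is a run of consecutive rows with the same trip and code."""
    # Order by trip, then epoch; id breaks epoch ties (an assumption of this sketch).
    ordered = sorted(rows, key=lambda r: (r[1], r[3], r[0]))
    blocks = {}
    block = 0
    previous = None  # (trip, code) of the previous row in scan order
    for row_id, trip, code, _epoch in ordered:
        if (trip, code) != previous:
            block += 1
            previous = (trip, code)
        blocks[row_id] = block
    return blocks


if __name__ == "__main__":
    for row_id, block in sorted(tag_blocks(DATA).items(), key=lambda kv: (kv[1], kv[0])):
        print(block, row_id)
```

On the sample data this yields seven blocks with the same membership as the desired output, for example ids 3, 4 and 9 together and id 21 on its own.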
For every unique combination of BoxId and Revision with a single UnitTypeId of 1 and a single UnitTypeId of 2 both having a NULL SetNumber, assign a SetNumber of 1.
Table and data setup:
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[UnitTypes]') AND type in (N'U'))
Drop Table dbo.UnitTypes
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[Tracking]') AND type in (N'U'))
DROP TABLE [dbo].[Tracking]
GO
CREATE TABLE dbo.UnitTypes
(
Id int NOT NULL,
Notes varchar(80)
)
GO
CREATE TABLE dbo.Tracking
(
Id int NOT NULL IDENTITY (1, 1),
BoxId int NOT NULL,
Revision int NOT NULL,
UnitValue int NULL,
UnitTypeId int NULL,
SetNumber int NULL
)
GO
ALTER TABLE dbo.Tracking ADD CONSTRAINT
PK_Tracking PRIMARY KEY CLUSTERED
(
Id
)
GO
Insert Into dbo.UnitTypes (Id, Notes) Values (1, 'X Coord'),
(2, 'Y Coord'),
(3, 'Weight'),
(4, 'Length')
Go
Insert Into dbo.Tracking (BoxId, Revision, UnitValue, UnitTypeId, SetNumber)
Values (1165, 1, 150, 1, NULL),
(1165, 1, 1477, 2, NULL),
(1165, 1, 31, 4, NULL),
(1166, 1, 425, 1, 1),
(1166, 1, 1146, 2, 1),
(1166, 1, 438, 1, NULL),
(1166, 1, 1163, 2, NULL),
(1167, 1, 560, 1, NULL),
(1167, 1, 909, 2, NULL),
(1167, 1, 12763, 3, NULL),
(1168, 1, 21, 1, NULL),
(1168, 1, 13109, 3, NULL)
The ideal results would be:
Id BoxId Revision UnitValue UnitTypeId SetNumber
1 1165 1 150 1 1
2 1165 1 1477 2 1
3 1165 1 31 4 1
4 1166 1 425 1 1
5 1166 1 1146 2 1
6 1166 1 438 1 NULL <--NULL Because there is already an existing Set
7 1166 1 1163 2 NULL <--NULL Because there is already an existing Set
8 1167 1 560 1 1
9 1167 1 909 2 1
10 1167 1 12763 3 1
11 1168 1 21 1 NULL <--NULL Because there is not exactly one UnitTypeId of 1 and exactly one UnitTypeId of 2 for this BoxId\Revision combination.
12 1168 1 13109 3 NULL <--NULL Because there is not exactly one UnitTypeId of 1 and exactly one UnitTypeId of 2 for this BoxId\Revision combination.
EDIT:
The question is how can I update the SetNumber, given the constraints above, using pure TSQL?
If I understand your question correctly, you could do this with a subquery that demands all conditions are met:
update t1
set SetNumber = 1
from dbo.Tracking t1
where SetNumber is null
and 1 =
(
select case
when count(case when t2.UnitTypeId = 1 then 1 end) <> 1 then 0
when count(case when t2.UnitTypeId = 2 then 1 end) <> 1 then 0
when count(t2.SetNumber) <> 0 then 0
else 1
end
from dbo.Tracking t2
where t1.BoxId = t2.BoxId
and t1.Revision = t2.Revision
)
The count(t2.SetNumber) is a bit tricky: this will only count rows where SetNumber is not null. So this meets the criterion that no other set with the same (BoxId, Revision) exists.
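The three conditions in the subquery can be checked by hand against the sample data. The Python sketch below (an illustration, not T-SQL; `assign_set_numbers` is a hypothetical name) groups the rows by (BoxId, Revision), counts unit types 1 and 2 and the non-NULL SetNumber values, and sets SetNumber to 1 on the NULL rows of every group that passes.

```python
from collections import defaultdict

# (Id, BoxId, Revision, UnitValue, UnitTypeId, SetNumber) from the question.
TRACKING = [
    (1, 1165, 1, 150, 1, None), (2, 1165, 1, 1477, 2, None), (3, 1165, 1, 31, 4, None),
    (4, 1166, 1, 425, 1, 1), (5, 1166, 1, 1146, 2, 1),
    (6, 1166, 1, 438, 1, None), (7, 1166, 1, 1163, 2, None),
    (8, 1167, 1, 560, 1, None), (9, 1167, 1, 909, 2, None), (10, 1167, 1, 12763, 3, None),
    (11, 1168, 1, 21, 1, None), (12, 1168, 1, 13109, 3, None),
]


def assign_set_numbers(rows):
    """Set SetNumber = 1 where the (BoxId, Revision) group meets all conditions."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row[1], row[2])].append(row)

    qualifying = set()
    for key, members in groups.items():
        type1 = sum(1 for r in members if r[4] == 1)
        type2 = sum(1 for r in members if r[4] == 2)
        # Like count(t2.SetNumber): only rows with a non-NULL SetNumber are counted.
        existing = sum(1 for r in members if r[5] is not None)
        if type1 == 1 and type2 == 1 and existing == 0:
            qualifying.add(key)

    return [
        r[:5] + (1,) if r[5] is None and (r[1], r[2]) in qualifying else r
        for r in rows
    ]


if __name__ == "__main__":
    for row in assign_set_numbers(TRACKING):
        print(row)
```

Boxes 1165 and 1167 qualify, box 1166 fails because it has two rows of each type and an existing set, and box 1168 fails because it has no UnitTypeId of 2.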
Try this out; it returns the same results that you gave. The WITH statement sets up a CTE to query from. The ROW_NUMBER() function is a partitioning function that does what you want:
;WITH BoxSets AS (
SELECT
ID
,BoxId
,Revision
,UnitValue
,UnitTypeId
,CASE WHEN UnitTypeId IN (1,2) THEN 1 ELSE 0 END ValidUnit
,ROW_NUMBER() OVER (PARTITION BY BoxID,UnitTypeID ORDER BY BoxID,UnitTypeID,UnitValue ) SetNumber
FROM Tracking
)
SELECT
b.ID
,b.BoxId
,b.Revision
,b.UnitValue
,b.UnitTypeId
,CASE ISNULL(b1.ValidUnits,0) WHEN 0 THEN NULL ELSE CASE b.SetNumber WHEN 1 THEN b.SetNumber ELSE NULL END END
FROM BoxSets AS b
LEFT JOIN (SELECT
BoxID
,SUM(ValidUnit) AS ValidUnits
FROM BoxSets
GROUP BY BoxId
HAVING SUM(ValidUnit) > 1) AS b1 ON b.BoxId = b1.BoxId