Fill gaps with set based on priority - sql-server

Goal
For each Foo we should try to fill the time frames as much as possible with records from FooBar. It's ok to have empty time frames when no records exist in FooBar.
The records in FooBar behave as a set.
Meaning: if FooId and the time frame are (exactly) the same, all of those BarIds are valid in that time frame.
Gaps should be filled based on each set's priority.
Table structure
CREATE TABLE Foo(
FooId INT NOT NULL,
ValidFrom DATETIME NOT NULL,
ValidUntil DATETIME NOT NULL,
)
CREATE TABLE FooBar (
FooId INT NOT NULL,
BarId INT NOT NULL,
ValidFrom DATETIME NOT NULL,
ValidUntil DATETIME NOT NULL,
Priority TINYINT NOT NULL,
)
Sample data
INSERT INTO Foo(FooId, ValidFrom, ValidUntil)
VALUES
(1, '2020-01-01', '2021-12-31')
, (2, '2020-01-01', '2021-06-30')
INSERT INTO FooBar(FooId, BarId, ValidFrom, ValidUntil, Priority)
VALUES
-- First set for FooId = 1 with prio 1
(1, 1, '2021-01-01', '2021-03-01', 1)
, (1, 2, '2021-01-01', '2021-03-01', 1)
, (1, 3, '2021-01-01', '2021-03-01', 1)
-- Second set for FooId = 1 with prio 2
, (1, 1, '2021-02-01', '2021-06-01', 2)
, (1, 2, '2021-02-01', '2021-06-01', 2)
-- Third set for FooId = 1 with prio 3
, (1, 1, '2021-01-01', '2021-12-31', 3)
, (1, 2, '2021-01-01', '2021-12-31', 3)
, (1, 3, '2021-01-01', '2021-12-31', 3)
-- Fourth set for FooId = 1 with Prio 1
, (1, 4, '2021-04-01', '2021-04-02', 1)
, (1, 5, '2021-04-01', '2021-04-02', 1)
-- First set for FooId = 2 with prio 3
, (2, 6, '2021-01-01', '2021-04-02', 3)
Expected result
The Origin column is only there for clarification; it shouldn't be part of the generated result set.
FooId BarId ValidFrom  ValidUntil Origin
----- ----- ---------- ---------- ---------------------
1     1     2021-01-01 2021-03-01 First set
1     2     2021-01-01 2021-03-01 First set
1     3     2021-01-01 2021-03-01 First set
1     1     2021-03-02 2021-03-31 Second set
1     2     2021-03-02 2021-03-31 Second set
1     4     2021-04-01 2021-04-02 Fourth set
1     5     2021-04-01 2021-04-02 Fourth set
1     1     2021-04-03 2021-06-01 Second set
1     2     2021-04-03 2021-06-01 Second set
1     1     2021-06-02 2021-12-31 Third set
1     2     2021-06-02 2021-12-31 Third set
1     3     2021-06-02 2021-12-31 Third set
2     6     2021-01-01 2021-12-31 First set (FooId = 2)
I know this is possible with a cursor or while loop, but I'm looking for a more performant/elegant solution.
Compatibility level is: 130

"It's ok to have empty time frames when no records exists in FooBar."
Does this mean that a solution without empty frames is also acceptable?
If so, then, for example, the third set for FooId = 1 also defines BarId = 3 for the period 2021-03-02 -> 2021-03-31.
Sample data
Tweaked the data model a bit to not have those timestamps (00:00:00.000) in the result.
Also added a set identifier (FooBar.SetId) for easy origin tracing.
CREATE TABLE Foo(
FooId INT NOT NULL,
ValidFrom DATE/*TIME*/ NOT NULL,
ValidUntil DATE/*TIME*/ NOT NULL
)
CREATE TABLE FooBar (
FooId INT NOT NULL,
BarId INT NOT NULL,
ValidFrom DATE/*TIME*/ NOT NULL,
ValidUntil DATE/*TIME*/ NOT NULL,
Priority TINYINT NOT NULL,
SetId nvarchar(5)
)
INSERT INTO Foo(FooId, ValidFrom, ValidUntil)
VALUES
(1, '2020-01-01', '2021-12-31')
, (2, '2020-01-01', '2021-06-30')
INSERT INTO FooBar(FooId, BarId, ValidFrom, ValidUntil, Priority, SetId)
VALUES
-- First set for FooId = 1 with prio 1
(1, 1, '2021-01-01', '2021-03-01', 1, 'Set 1')
, (1, 2, '2021-01-01', '2021-03-01', 1, 'Set 1')
, (1, 3, '2021-01-01', '2021-03-01', 1, 'Set 1')
-- Second set for FooId = 1 with prio 2
, (1, 1, '2021-02-01', '2021-06-01', 2, 'Set 2')
, (1, 2, '2021-02-01', '2021-06-01', 2, 'Set 2')
-- Third set for FooId = 1 with prio 3
, (1, 1, '2021-01-01', '2021-12-31', 3, 'Set 3')
, (1, 2, '2021-01-01', '2021-12-31', 3, 'Set 3')
, (1, 3, '2021-01-01', '2021-12-31', 3, 'Set 3')
-- Fourth set for FooId = 1 with Prio 1
, (1, 4, '2021-04-01', '2021-04-02', 1, 'Set 4')
, (1, 5, '2021-04-01', '2021-04-02', 1, 'Set 4')
-- First set for FooId = 2 with prio 3
, (2, 6, '2021-01-01', '2021-04-02', 3, 'Set 1')
Solution
The common table expressions (CTEs) ValidFrom and ValidPeriod cut all period information from Foo and FooBar into the smallest individual periods.
This step also produces an extra trailing, incomplete period for each FooId, which is removed with the exists clause.
Then, for each individual period, fetch the FooBar record with the lowest priority value (by requiring that no record for the same BarId with an even smaller priority exists: not exists ... fb2.Priority < fb.Priority).
This gives:
with ValidFrom as
(
select f.FooId,
f.ValidFrom
from Foo f
union
select f.FooId,
dateadd(day, 1, f.ValidUntil)
from Foo f
union
select fb.FooId,
fb.ValidFrom
from FooBar fb
union
select fb.FooId,
dateadd(day, 1, fb.ValidUntil)
from FooBar fb
),
ValidPeriod as
(
select vf.FooId,
vf.ValidFrom,
dateadd(day, -1, lead(vf.ValidFrom) over(partition by vf.FooId order by vf.ValidFrom)) as ValidUntil
from ValidFrom vf
)
select vp.FooId,
fb.BarId,
vp.ValidFrom,
vp.ValidUntil,
--fb.ValidFrom,
--fb.ValidUntil,
--fb.Priority,
fb.SetId
from ValidPeriod vp
left join FooBar fb
on fb.FooId = vp.FooId
and fb.ValidFrom <= vp.ValidUntil
and fb.ValidUntil >= vp.ValidFrom
and not exists ( select 'x'
from FooBar fb2
where fb2.FooId = fb.FooId
and fb2.BarId = fb.BarId
and fb2.ValidFrom <= vp.ValidUntil
and fb2.ValidUntil >= vp.ValidFrom
and fb2.Priority < fb.Priority )
where exists ( select 'x'
from ValidPeriod vp2
where vp2.FooId = vp.FooId
and vp2.ValidFrom > vp.ValidFrom )
order by vp.FooId,
vp.ValidFrom,
fb.BarId;
Result
This result contains more period information than you requested in your expected result. Removing the unions with Foo from the first CTE will remove the null values and limit the periods to the period information available in FooBar alone (in fact, this would eliminate Foo from the solution entirely).
With vp.ValidFrom and vp.ValidUntil as result periods:
FooId BarId ValidFrom ValidUntil SetId
----- ----- ---------- ---------- -----
1 null 2020-01-01 2020-12-31 null -- extra row
1 1 2021-01-01 2021-01-31 Set 1
1 2 2021-01-01 2021-01-31 Set 1
1 3 2021-01-01 2021-01-31 Set 1
1 1 2021-02-01 2021-03-01 Set 1 -- extra row
1 2 2021-02-01 2021-03-01 Set 1 -- extra row
1 3 2021-02-01 2021-03-01 Set 1 -- extra row
1 1 2021-03-02 2021-03-31 Set 2
1 2 2021-03-02 2021-03-31 Set 2
1 3 2021-03-02 2021-03-31 Set 3 -- extra row
1 1 2021-04-01 2021-04-02 Set 2 -- extra row
1 2 2021-04-01 2021-04-02 Set 2 -- extra row
1 3 2021-04-01 2021-04-02 Set 3 -- extra row
1 4 2021-04-01 2021-04-02 Set 4
1 5 2021-04-01 2021-04-02 Set 4
1 1 2021-04-03 2021-06-01 Set 2
1 2 2021-04-03 2021-06-01 Set 2
1 3 2021-04-03 2021-06-01 Set 3 -- extra row
1 1 2021-06-02 2021-12-31 Set 3
1 2 2021-06-02 2021-12-31 Set 3
1 3 2021-06-02 2021-12-31 Set 3
2 null 2020-01-01 2020-12-31 null -- extra row
2 6 2021-01-01 2021-04-02 Set 1
2 null 2021-04-03 2021-06-30 null -- extra row
With fb.ValidFrom and fb.ValidUntil as result periods:
FooId BarId ValidFrom ValidUntil SetId
----- ----- ---------- ---------- -----
1 null null null null -- extra row
1 1 2021-01-01 2021-03-01 Set 1
1 2 2021-01-01 2021-03-01 Set 1
1 3 2021-01-01 2021-03-01 Set 1
1 1 2021-01-01 2021-03-01 Set 1 -- extra row
1 2 2021-01-01 2021-03-01 Set 1 -- extra row
1 3 2021-01-01 2021-03-01 Set 1 -- extra row
1 1 2021-02-01 2021-06-01 Set 2
1 2 2021-02-01 2021-06-01 Set 2
1 3 2021-01-01 2021-12-31 Set 3 -- extra row
1 1 2021-02-01 2021-06-01 Set 2 -- extra row
1 2 2021-02-01 2021-06-01 Set 2 -- extra row
1 3 2021-01-01 2021-12-31 Set 3 -- extra row
1 4 2021-04-01 2021-04-02 Set 4
1 5 2021-04-01 2021-04-02 Set 4
1 1 2021-02-01 2021-06-01 Set 2
1 2 2021-02-01 2021-06-01 Set 2
1 3 2021-01-01 2021-12-31 Set 3 -- extra row
1 1 2021-01-01 2021-12-31 Set 3
1 2 2021-01-01 2021-12-31 Set 3
1 3 2021-01-01 2021-12-31 Set 3
2 null null null null -- extra row
2 6 2021-01-01 2021-04-02 Set 1
2 null null null null -- extra row
Fiddle to see things in action.
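The query above resolves priority per BarId, which is where the rows marked as extra come from. If the sets should instead compete as a whole, so that only the best-priority set covering a period contributes rows, one possible variant ranks the overlapping FooBar rows per elementary period and keeps rank 1. This is only a sketch, not part of the original answer: it assumes the Foo and FooBar tables (including SetId) defined above, and the Ranked CTE and the PriorityRank column are names introduced here for illustration.

```sql
with ValidFrom as
(
    -- every date on which something can start, per FooId
    select f.FooId, f.ValidFrom from Foo f
    union
    select f.FooId, dateadd(day, 1, f.ValidUntil) from Foo f
    union
    select fb.FooId, fb.ValidFrom from FooBar fb
    union
    select fb.FooId, dateadd(day, 1, fb.ValidUntil) from FooBar fb
),
ValidPeriod as
(
    -- smallest individual periods; the last one per FooId has ValidUntil = null
    select vf.FooId,
           vf.ValidFrom,
           dateadd(day, -1, lead(vf.ValidFrom) over(partition by vf.FooId order by vf.ValidFrom)) as ValidUntil
    from ValidFrom vf
),
Ranked as
(
    -- rank all FooBar rows overlapping a period by priority (1 = best);
    -- rows of the same set share a priority and therefore a rank
    select vp.FooId,
           fb.BarId,
           vp.ValidFrom,
           vp.ValidUntil,
           fb.SetId,
           rank() over(partition by vp.FooId, vp.ValidFrom order by fb.Priority) as PriorityRank
    from ValidPeriod vp
    join FooBar fb
      on fb.FooId = vp.FooId
     and fb.ValidFrom <= vp.ValidUntil
     and fb.ValidUntil >= vp.ValidFrom
)
select r.FooId,
       r.BarId,
       r.ValidFrom,
       r.ValidUntil,
       r.SetId
from Ranked r
where r.PriorityRank = 1
order by r.FooId,
         r.ValidFrom,
         r.BarId;
```

Because of the inner join, periods without any FooBar record produce no rows, and the trailing period with a null ValidUntil never matches. Adjacent periods won by the same set (for example Set 1 before and after 2021-02-01) still come back as separate rows; merging them would need an additional gaps-and-islands step. If two different sets with the same priority overlap, rank() keeps both.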

Related

Create comma separated value strings using data from different tables in SQL Server

I have the following database model:
criteria table:
criteria_id criteria_name is_range
1 product_category 0
2 product_subcategory 0
3 items 1
4 criteria_4 1
evaluation_grid table:
evaluation_grid_id criteria_id start_value end_value provider property_1 property_2 property_3
1 1 3 NULL internal 1 1 1
2 1 1 NULL internal 1 1 1
3 2 1 NULL internal 1 2 1
4 3 1 100 internal 2 1 1
5 4 1 50 internal 2 2 1
6 1 2 NULL external 2 8 1
7 2 2 NULL external 2 5 1
8 3 1 150 external 2 2 2
9 3 1 100 external 2 3 1
product_category table:
id name
1 test1
2 test2
3 test3
product_subcategory table:
id name
1 producttest1
2 producttest2
3 producttest3
What I am trying to achieve is returning the values like this:
criteria start_value end_value provider property_1 property_2 property_3
product_category test3, test1 NULL internal 1 1 1
product_subcategory producttest1 NULL internal 1 2 1
items 1 100 internal 2 1 1
criteria_4 1 50 internal 2 2 1
product_category test2 NULL external 2 8 1
product_subcategory producttest2 NULL external 2 5 1
items 1 150 external 2 2 2
criteria_4 1 100 external 2 3 1
Basically, I want to keep the order from table evaluation_grid but group only the criteria which are not ranges
into comma separated value strings based on start_value, end_value, provider, property_1, property_2 and property_3.
I tried like this:
SELECT c.criteria_name AS criteria
    ,CASE WHEN c.criteria_id = 1
        THEN
            (IsNull(STUFF((SELECT ', ' + RTRIM(LTRIM(pc.name))
                FROM product_category pc
                INNER JOIN [evaluation_grid] eg ON eg.start_value=pc.id
                WHERE eg.criteria_id=c.criteria_id
                FOR XML PATH('')), 1, 2, ''), ''))
        WHEN c.criteria_id = 2
        THEN (IsNull(STUFF((SELECT ' , ' + RTRIM(LTRIM(psc.name))
                FROM product_subcategory psc
                INNER JOIN [evaluation_grid] eg ON eg.start_value=psc.id
                WHERE eg.criteria_id=c.criteria_id
                FOR XML PATH('')
                ), 1, 3, ''), ''))
        ELSE
            CAST(eg.start_value AS VARCHAR)
    END AS start_value
    ,eg.end_value AS end_value
    ,eg.provider AS provider
    ,eg.property_1 AS property_1
    ,eg.property_2 AS property_2
    ,eg.property_3 AS property_3
FROM [evaluation_grid] eg
INNER JOIN criteria c ON eg.criteria_id = c.criteria_id
GROUP BY c.criteria_name,c.criteria_id,c.is_range,eg.start_value,eg.end_value,eg.provider,eg.property_1,eg.property_2,eg.property_3
But it is returning wrong data, like this:
criteria start_value end_value provider property_1 property_2 property_3
product_category test3, test1, test2 NULL internal 1 1 1
product_category test3, test1, test2 NULL external 2 8 1
product_category test3, test1, test2 NULL internal 1 1 1
product_subcategory producttest1,producttest2 NULL internal 1 2 1
product_subcategory producttest1,producttest2 NULL external 2 5 1
items 1 100 internal 1 1 1
items 1 150 external 2 2 2
criteria_4 1 50 internal 2 2 1
criteria_4 1 100 external 2 3 1
I tried some versions with "with cte;" as well but didn't manage to find the solution yet and yes, I checked the similar questions already. :)
PS: I cannot use STRING_AGG because we have below 2017 Sql Server version.
Any suggestion will be highly appreciated, thanks !
As far as I can tell this query returns the exact output you're looking for.
with cte as (
select c.criteria_name,
eg.evaluation_grid_id,
case when c.criteria_id = 1 then pc.[name]
when c.criteria_id = 2 then psc.[name]
else null end pc_cat,
c.criteria_id,c.is_range, eg.start_value, eg.end_value,
eg.[provider], eg.property_1, eg.property_2,eg.property_3
from #evaluation_grid eg
join #criteria c ON eg.criteria_id = c.criteria_id
left join #product_category pc on eg.start_value=pc.id
left join #product_subcategory psc on eg.start_value=psc.id)
select c.criteria_name as criteria,
case when c.is_range=0 then
STUFF((SELECT ', ' + RTRIM(LTRIM(c2.pc_cat))
FROM cte c2
WHERE c2.criteria_id=c.criteria_id
and c2.is_range=c.is_range
and c2.[provider]=c.[provider]
and c2.property_1=c.property_1
and c2.property_2=c.property_2
and c2.property_3=c.property_3
FOR XML PATH('')), 1, 2, '')
else max(cast(c.start_value as varchar(50))) end as start_value,
c.end_value, c.[provider], c.property_1, c.property_2, c.property_3
from cte c
group by c.criteria_name, c.criteria_id, c.is_range, c.end_value,
c.[provider], c.property_1, c.property_2, c.property_3
order by max(c.evaluation_grid_id);
Output
criteria start_value end_value provider property_1 property_2 property_3
product_category test3, test1 NULL internal 1 1 1
product_subcategory producttest1 NULL internal 1 2 1
items 1 100 internal 2 1 1
criteria_4 1 50 internal 2 2 1
product_category test2 NULL external 2 8 1
product_subcategory producttest2 NULL external 2 5 1
items 1 150 external 2 2 2
criteria_4 1 100 external 2 3 1
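One caveat with the STUFF / FOR XML PATH('') pattern used here: without the TYPE directive, characters such as & or < in the concatenated values come back as XML entities (&amp;amp;, &amp;lt;). A commonly used workaround is to add TYPE and read the text back with .value(). The following is a minimal sketch on a made-up table variable (@names, grp and name are hypothetical and not part of the question's schema):

```sql
-- hypothetical sample data, only to show the escaping behaviour
declare @names table (grp int, name varchar(50));
insert into @names (grp, name)
values (1, 'a & b'), (1, 'c'), (2, 'd');

select g.grp,
       stuff((select ', ' + n.name
              from @names n
              where n.grp = g.grp
              order by n.name
              -- TYPE returns xml; .value() converts it back to text without entities
              for xml path(''), type).value('.', 'varchar(max)'), 1, 2, '') as names
from (select distinct grp from @names) g
order by g.grp;
```

With plain FOR XML PATH('') the first group would contain "a &amp;amp; b"; with TYPE and .value() the ampersand should come back unescaped. The names in the question's sample data contain no such characters, so this only matters for real data.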
It's a bit difficult to follow the requirements. Can you review the setup & results below and let us know what the desired resultset should be?
declare @criteria table (criteria_id int, criteria_name varchar(50), is_range bit)
insert into @criteria
values(1, 'product_category', 0), (2, 'product_subcategory', 0), (3, 'items', 1), (4, 'criteria_4', 1);
declare @evaluation_grid table (evaluation_grid_id int, criteria_id int, start_value int, end_value int, [provider] varchar(50), property_1 int, property_2 int, property_3 int);
insert into @evaluation_grid
values
(1, 1, 3, NULL, 'internal', 1, 1, 1),
(2, 1, 1, NULL, 'internal', 1, 1, 1),
(3, 2, 1, NULL, 'internal', 1, 2, 1),
(4, 3, 1, 100, 'internal', 2, 1, 1),
(5, 4, 1, 50, 'internal', 2, 2, 1),
(6, 1, 2, NULL, 'external', 2, 8, 1),
(7, 2, 2, NULL, 'external', 2, 5, 1),
(8, 3, 1, 150, 'external', 2, 2, 2),
(9, 4, 1, 100, 'external', 2, 3, 1)
declare @product_category table (id int, [name] varchar(50))
insert into @product_category
values (1, 'test1'), (2, 'test2'), (3, 'test3'), (4, 'test4');
declare @product_subcategory table (id int, [name] varchar(50))
insert into @product_subcategory
values (1, 'producttest1'), (2, 'producttest2'), (3, 'producttest3');
select c.criteria_name,
    stuff(( select ',' + ipc.[name]
            from @evaluation_grid ieg
            join @product_category ipc on ieg.start_value = ipc.id
            where [provider] = eg.[provider] and property_1 = eg.property_1 and property_2 = eg.property_2 and property_3 = eg.property_3
            order by ieg.evaluation_grid_id
            for xml path('')), 1 ,1, '') as start_value,
    end_value,
    [provider],
    property_1,
    property_2,
    property_3
from @evaluation_grid eg
join @criteria c on eg.criteria_id = c.criteria_id
where c.is_range = 0
group
by c.criteria_name, end_value, [provider], property_1, property_2, property_3
union all
select c.criteria_name, cast(start_value as varchar(10)), end_value, [provider], property_1, property_2, property_3
from @evaluation_grid eg
join @criteria c on eg.criteria_id = c.criteria_id
where c.is_range = 1;

Rows in pivot table not merging

I have created the following table to illustrate what is happening
create table weather (
WDate varchar(10),
ItemCode varchar(8),
ItemValue int,
ItemUnits varchar(8))
insert into Weather values
('2020-02-10', 'MAXTEMP', 6, 'degC'),
('2020-02-10', 'MINTEMP', 2, 'degC'),
('2020-02-10', 'RAIN', 0, 'mm'),
('2020-02-11', 'MAXTEMP', 5, 'degC'),
('2020-02-11', 'RAIN', 20, 'mm'),
('2020-02-11', 'MINTEMP', 1, 'degC'),
('2020-02-12', 'RAIN', 5, 'mm'),
('2020-02-12', 'MAXTEMP', 8, 'degC'),
('2020-02-12', 'MINTEMP', 2, 'degC')
The data is not always in the same order because it can come from equipment that may not be time sync'ed. When I run the following query
SELECT
[wdate] as 'Date',
[MINTEMP] as 'Min Temp',
[MAXTEMP] as 'Max Temp',
[RAIN] as 'Rain'
FROM
(
SELECT
*
FROM
weather
) rawdata
PIVOT
(
min(ItemValue)
FOR ItemCode IN ([MINTEMP], [MAXTEMP], [RAIN])
) pitem
ORDER BY WDate
I get
WDate Min Temp Max Temp Rain
2020-02-10 2 6 NULL
2020-02-10 NULL NULL 0
2020-02-11 1 5 NULL
2020-02-11 NULL NULL 20
2020-02-12 2 8 NULL
2020-02-12 NULL NULL 5
I can't figure out why the Rain data doesn't end up on the same row as the Min and Max Temp. I was expecting
WDate Min Temp Max Temp Rain
2020-02-10 2 6 0
2020-02-11 1 5 20
2020-02-12 2 8 5
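PIVOT implicitly groups by every column of its source that is neither the pivoted column (ItemCode) nor the aggregated column (ItemValue). With SELECT *, that includes ItemUnits, so the degC rows and the mm rows for the same date end up in different groups. You must "feed" the pivot only the columns it needs. Notice that ItemUnits is missing from the sub-select rawdata.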
Example
SELECT
[wdate] as 'Date',
[MINTEMP] as 'Min Temp',
[MAXTEMP] as 'Max Temp',
[RAIN] as 'Rain'
FROM
(
Select WDate
,ItemCode
,ItemValue
from Weather
) rawdata
PIVOT
(
min(ItemValue)
FOR ItemCode IN ([MINTEMP], [MAXTEMP], [RAIN])
) pitem
ORDER BY WDate
Returns
Date Min Temp Max Temp Rain
2020-02-10 2 6 0
2020-02-11 1 5 20
2020-02-12 2 8 5
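The same result can also be produced without PIVOT, using conditional aggregation, which makes the grouping explicit: only WDate appears in GROUP BY, so ItemUnits cannot split the rows. A minimal sketch against the weather table from the question:

```sql
select WDate as [Date],
       -- each CASE yields the value for one ItemCode and null for all others;
       -- min() ignores the nulls and keeps the single matching value per date
       min(case when ItemCode = 'MINTEMP' then ItemValue end) as [Min Temp],
       min(case when ItemCode = 'MAXTEMP' then ItemValue end) as [Max Temp],
       min(case when ItemCode = 'RAIN'    then ItemValue end) as [Rain]
from weather
group by WDate
order by WDate;
```

This form is a bit more verbose than PIVOT, but the grouping columns are spelled out rather than inferred from whatever the sub-select happens to return.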

Update JSON value - MSSQL

In MSSQL I need to update the actualWatchedTime with totalDuration.
Current Table
Id VideoId UserId ProgressJson
1 1 1 {"actualWatchedTime":228,"currentWatchTime":3,"totalDuration":657}
2 2 1 {"actualWatchedTime":328,"currentWatchTime":23,"totalDuration":349}
3 3 1 {"actualWatchedTime":28,"currentWatchTime":2,"totalDuration":576}
4 1 2 {"actualWatchedTime":82,"currentWatchTime":103,"totalDuration":576}
5 2 2 {"actualWatchedTime":280,"currentWatchTime":253,"totalDuration":456}
Expected Table
Id VideoId UserId ProgressJson
1 1 1 {"actualWatchedTime":657,"currentWatchTime":3,"totalDuration":657}
2 2 1 {"actualWatchedTime":349,"currentWatchTime":23,"totalDuration":349}
3 3 1 {"actualWatchedTime":576,"currentWatchTime":2,"totalDuration":576}
4 1 2 {"actualWatchedTime":576,"currentWatchTime":103,"totalDuration":576}
5 2 2 {"actualWatchedTime":456,"currentWatchTime":253,"totalDuration":456}
How do I do that?
declare @t table
(
    id int,
    ProgressJson nvarchar(500)
);
insert into @t(id, ProgressJson)
values
(1, N'{"actualWatchedTime":228,"currentWatchTime":3,"totalDuration":657}'),
(2, N'{"actualWatchedTime":328,"currentWatchTime":23,"totalDuration":349}'),
(3, N'{"actualWatchedTime":28,"currentWatchTime":2,"totalDuration":576}'),
(4, N'{"actualWatchedTime":82,"currentWatchTime":103,"totalDuration":576}'),
(5, N'{"actualWatchedTime":280,"currentWatchTime":253,"totalDuration":456}');
-- before
select *
from @t;
-- the cast to int makes JSON_MODIFY write a JSON number instead of a string
update @t
set ProgressJson = JSON_MODIFY(ProgressJson,'$.actualWatchedTime', cast(json_value(ProgressJson, '$.totalDuration') as int));
-- after
select *
from @t;
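One thing to watch with this update: in the default lax mode, JSON_MODIFY removes the key when the new value is NULL, so a row whose JSON has no totalDuration would lose its actualWatchedTime. A guarded variant could look like the sketch below. It assumes SQL Server 2016 or later and that the column always holds valid JSON; the second sample row is made up to show the guard.

```sql
declare @t table (id int, ProgressJson nvarchar(500));
insert into @t (id, ProgressJson)
values (1, N'{"actualWatchedTime":228,"currentWatchTime":3,"totalDuration":657}'),
       (2, N'{"actualWatchedTime":10,"currentWatchTime":3}');  -- hypothetical row without totalDuration

update @t
set ProgressJson = json_modify(ProgressJson, '$.actualWatchedTime',
                               cast(json_value(ProgressJson, '$.totalDuration') as int))
-- skip rows where totalDuration is missing, so the key is not deleted
where json_value(ProgressJson, '$.totalDuration') is not null;

select id, ProgressJson
from @t;
```

Row 1 is updated as in the answer above, while row 2 is left untouched because JSON_VALUE returns NULL for the missing path.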

Separate a group of values in to A or B (1 or 2) SQL

Note: I have already asked a similar question but had omitted a key part in the fact that a tool has many components.
I have a list of multiple tools and their components that all have a model number. I want to group every second tool based on the model it belongs to.
DerivedColumn is the column I want the query to return.
declare @t table (Model int, toolID INT ,Component INT,DerivedColumn int);
insert into @t values (1,1,1,1),(1,1,2,1),(1,1,3,1),(1,2,1,2),(1,2,2,2),(1,2,3,2),(1,3,1,1),(1,3,2,1),(1,3,3,1),(1,4,1,2),(1,4,2,2),(1,4,3,2),(1,5,1,1),(1,5,2,1),(1,5,3,1),(2,1,1,1),(2,1,2,1),(2,2,1,2),(2,2,2,2),(2,3,1,1),(2,3,2,1)
SELECT * FROM @t
Model toolID Component DerivedColumn
1 1 1 1
1 1 2 1
1 1 3 1
1 2 1 2
1 2 2 2
1 2 3 2
1 3 1 1
1 3 2 1
1 3 3 1
1 4 1 2
1 4 2 2
1 4 3 2
1 5 1 1
1 5 2 1
1 5 3 1
2 1 1 1
2 1 2 1
2 2 1 2
2 2 2 2
2 3 1 1
2 3 2 1
Every second tool belonging to a model should have an alternative group number.
I believe I have to use a window function but haven't been able to solve it.
You could use dense_rank() to number the tools within each model and the modulo operator (% 2) to alternate between the two groups:
DECLARE @SampleData AS TABLE
(
    Model int,
    ToolId int,
    Component int
)
INSERT INTO @SampleData
(
    Model, ToolId, Component
)
VALUES
(1, 1, 1),(1, 1, 2),(1, 1, 3),(1, 2, 1),
(1, 2, 2),(1, 2, 3),(1, 3, 1),(1, 3, 2),
(1, 3, 3),(1, 4, 1),(1, 4, 2),(1, 4, 3),
(1, 5, 1),(1, 5, 2),(1, 5, 3),(2, 1, 1),
(2, 1, 2),(2, 2, 1),(2, 2, 2),(2, 3, 1),
(2, 3, 2)
SELECT *,
    -- odd-ranked tools get group 1, even-ranked tools get group 2
    CASE (dense_rank() OVER(PARTITION BY sd.Model ORDER BY sd.ToolId) + 1) % 2
        WHEN 1 THEN 2
        WHEN 0 THEN 1
    END as DerivedColumn
FROM @SampleData sd
ORDER BY sd.Model, sd.ToolId
Demo link: http://rextester.com/LIQL79881
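The CASE expression can also be folded into plain arithmetic, since the rank modulo 2 is 1 for odd ranks and 0 for even ranks. A shorter sketch of the same idea, on a reduced, made-up sample (@Tools is a name introduced here for illustration):

```sql
declare @Tools table (Model int, ToolId int, Component int);
insert into @Tools (Model, ToolId, Component)
values (1, 1, 1), (1, 1, 2), (1, 2, 1), (1, 3, 1), (2, 1, 1), (2, 2, 1);

select t.Model,
       t.ToolId,
       t.Component,
       -- odd rank: 2 - 1 = 1, even rank: 2 - 0 = 2
       2 - (dense_rank() over(partition by t.Model order by t.ToolId) % 2) as DerivedColumn
from @Tools t
order by t.Model, t.ToolId, t.Component;
```

Using dense_rank() rather than ToolId % 2 directly keeps the groups alternating even when the ToolId values within a model have gaps, and all components of one tool share the same rank and therefore the same group.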
Hope this helps:
DECLARE @T TABLE (Model INT, toolID INT ,Component INT,DerivedColumn INT);
INSERT INTO @T VALUES (1,1,1,1),(1,1,2,1),(1,1,3,1),(1,2,1,2),(1,2,2,2),(1,2,3,2),(1,3,1,1),(1,3,2,1),(1,3,3,1),(1,4,1,2),(1,4,2,2),(1,4,3,2),(1,5,1,1),(1,5,2,1),(1,5,3,1),(2,1,1,1),(2,1,2,1),(2,2,1,2),(2,2,2,2),(2,3,1,1),(2,3,2,1)
SELECT Model
    ,toolID
    ,ROW_NUMBER()Over(Partition by toolID order by Model) AS AlternativetoolID
    ,Component
    ,DerivedColumn
from @T;

Create "Sets" within the same table based on multi-column criteria

For every unique combination of BoxId and Revision with a single UnitTypeId of 1 and a single UnitTypeId of 2 both having a NULL SetNumber, assign a SetNumber of 1.
Table and data setup:
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[UnitTypes]') AND type in (N'U'))
Drop Table dbo.UnitTypes
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[Tracking]') AND type in (N'U'))
DROP TABLE [dbo].[Tracking]
GO
CREATE TABLE dbo.UnitTypes
(
Id int NOT NULL,
Notes varchar(80)
)
GO
CREATE TABLE dbo.Tracking
(
Id int NOT NULL IDENTITY (1, 1),
BoxId int NOT NULL,
Revision int NOT NULL,
UnitValue int NULL,
UnitTypeId int NULL,
SetNumber int NULL
)
GO
ALTER TABLE dbo.Tracking ADD CONSTRAINT
PK_Tracking PRIMARY KEY CLUSTERED
(
Id
)
GO
Insert Into dbo.UnitTypes (Id, Notes) Values (1, 'X Coord'),
(2, 'Y Coord'),
(3, 'Weight'),
(4, 'Length')
Go
Insert Into dbo.Tracking (BoxId, Revision, UnitValue, UnitTypeId, SetNumber)
Values (1165, 1, 150, 1, NULL),
(1165, 1, 1477, 2, NULL),
(1165, 1, 31, 4, NULL),
(1166, 1, 425, 1, 1),
(1166, 1, 1146, 2, 1),
(1166, 1, 438, 1, NULL),
(1166, 1, 1163, 2, NULL),
(1167, 1, 560, 1, NULL),
(1167, 1, 909, 2, NULL),
(1167, 1, 12763, 3, NULL),
(1168, 1, 21, 1, NULL),
(1168, 1, 13109, 3, NULL)
The ideal results would be:
Id BoxId Revision UnitValue UnitTypeId SetNumber
1 1165 1 150 1 1
2 1165 1 1477 2 1
3 1165 1 31 4 1
4 1166 1 425 1 1
5 1166 1 1146 2 1
6 1166 1 438 1 NULL <--NULL Because there is already an existing Set
7 1166 1 1163 2 NULL <--NULL Because there is already an existing Set
8 1167 1 560 1 1
9 1167 1 909 2 1
10 1167 1 12763 3 1
11 1168 1 21 1 NULL <--NULL Because there is not exactly one UnitTypeId of 1 and exactly one UnitTypeId of 2 for this BoxId\Revision combination.
12 1168 1 13109 3 NULL <--NULL Because there is not exactly one UnitTypeId of 1 and exactly one UnitTypeId of 2 for this BoxId\Revision combination.
EDIT:
The question is how can I update the SetNumber, given the constraints above, using pure TSQL?
If I understand your question correctly, you could do this with a subquery that demands all conditions are met:
update t1
set SetNumber = 1
from dbo.Tracking t1
where SetNumber is null
and 1 =
(
select case
when count(case when t2.UnitTypeId = 1 then 1 end) <> 1 then 0
when count(case when t2.UnitTypeId = 2 then 1 end) <> 1 then 0
when count(t2.SetNumber) <> 0 then 0
else 1
end
from dbo.Tracking t2
where t1.BoxId = t2.BoxId
and t1.Revision = t2.Revision
)
The count(t2.SetNumber) is a bit tricky: this will only count rows where SetNumber is not null. So this meets the criterion that no other set with the same (BoxId, Revision) exists.
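The difference between count(*) and count(column) that this relies on can be seen in isolation with a tiny sketch (the @demo table variable is made up for illustration):

```sql
declare @demo table (SetNumber int null);
insert into @demo (SetNumber)
values (1), (null), (null);

select count(*)         as AllRows,      -- counts every row: 3
       count(SetNumber) as NonNullRows   -- counts only rows where SetNumber is not null: 1
from @demo;
```

So count(t2.SetNumber) <> 0 in the update is true exactly when at least one row of the (BoxId, Revision) group already has a SetNumber assigned.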
Try this out; it returns the same results that you gave. The WITH statement sets up a CTE to query from. ROW_NUMBER() is a window function that numbers the rows within each partition, which does what you want:
;WITH BoxSets AS (
SELECT
ID
,BoxId
,Revision
,UnitValue
,UnitTypeId
,CASE WHEN UnitTypeId IN (1,2) THEN 1 ELSE 0 END ValidUnit
,ROW_NUMBER() OVER (PARTITION BY BoxID,UnitTypeID ORDER BY BoxID,UnitTypeID,UnitValue ) SetNumber
FROM Tracking
)
SELECT
b.ID
,b.BoxId
,b.Revision
,b.UnitValue
,b.UnitTypeId
,CASE ISNULL(b1.ValidUnits,0) WHEN 0 THEN NULL ELSE CASE b.SetNumber WHEN 1 THEN b.SetNumber ELSE NULL END END
FROM BoxSets AS b
LEFT JOIN (SELECT
BoxID
,SUM(ValidUnit) AS ValidUnits
FROM BoxSets
GROUP BY BoxId
HAVING SUM(ValidUnit) > 1) AS b1 ON b.BoxId = b1.BoxId

Resources