Create Tree Query From Numeric Mapping Table in SQL (Specific Format) - sql-server

I have an exported table from accounting software like below.
AccountID AccountName
--------- -----------
11 Acc11
12 Acc12
13 Acc13
11/11 Acc11/11
11/12 Acc11/12
11/111 Acc11/111
11/11/001 Acc11/11/001
11/11/002 Acc11/11/002
12/111 Acc12/111
12/112 Acc12/112
I want to convert it to a tree query in MS SQL Server 2008 so I can use it as a TreeList data source in my Windows application.
I asked this question before, and the answer I got was very slow on my large table of more than 5000 records (Create Tree Query From Numeric Mapping Table in SQL). I think counting the "/" characters and splitting the AccountID field on "/" could solve my problem more simply and much faster.
My expected result looks like this:
AccountID AccountName ID ParentID Level HasChild
--------- ----------- --- --------- ------ --------
11 Acc11 1 Null 1 1
12 Acc12 2 Null 1 1
13 Acc13 3 Null 1 0
11/11 Acc11/11 4 1 2 1
11/12 Acc11/12 5 1 2 0
11/111 Acc11/111 6 1 2 0
11/11/001 Acc11/11/001 7 4 3 0
11/11/002 Acc11/11/002 8 4 3 0
12/111 Acc12/111 9 2 2 0
12/112 Acc12/112 10 2 2 0
Please Help Me.

I modified my answer given in the first question...
It would be best, if your table would keep the relation data directly in indexed columns. Before you change your table's structure you might try this:
A table with test data
CREATE TABLE #tbl ( AccountID VARCHAR(100), AccountName VARCHAR(100));
INSERT INTO #tbl VALUES
('11','Acc11')
,('12','Acc12')
,('13','Acc13')
,('11/11','Acc11/11')
,('11/12','Acc11/12')
,('11/111','Acc11/111')
,('11/11/001','Acc11/11/001')
,('11/11/002','Acc11/11/002')
,('12/111','Acc12/111')
,('12/112','Acc12/112');
This will get the needed data into a newly created temp table called #tempHierarchy
SELECT AccountID
,AccountName
,ROW_NUMBER() OVER(ORDER BY LEN(AccountID)-LEN(REPLACE(AccountID,'/','')),AccountID) AS ID
,Extended.HierarchyLevel
,STUFF(
(
SELECT '/' + A.B.value('.','varchar(10)')
FROM Extended.IDsXML.nodes('/x[position() <= sql:column("HierarchyLevel")]') AS A(B)
FOR XML PATH('')
),1,2,'') AS ParentPath
,Extended.IDsXML.value('/x[sql:column("HierarchyLevel")+1][1]','varchar(10)') AS ownID
,Extended.IDsXML.value('/x[sql:column("HierarchyLevel")][1]','varchar(10)') AS ancestorID
INTO #tempHierarchy
FROM #tbl
CROSS APPLY(SELECT LEN(AccountID)-LEN(REPLACE(AccountID,'/','')) + 1 AS HierarchyLevel
,CAST('<x></x><x>' + REPLACE(AccountID,'/','</x><x>') + '</x>' AS XML) AS IDsXML) AS Extended
;
The intermediate result
+-----------+--------------+----+----------------+------------+-------+------------+
| AccountID | AccountName | ID | HierarchyLevel | ParentPath | ownID | ancestorID |
+-----------+--------------+----+----------------+------------+-------+------------+
| 11 | Acc11 | 1 | 1 | | 11 | |
+-----------+--------------+----+----------------+------------+-------+------------+
| 12 | Acc12 | 2 | 1 | | 12 | |
+-----------+--------------+----+----------------+------------+-------+------------+
| 13 | Acc13 | 3 | 1 | | 13 | |
+-----------+--------------+----+----------------+------------+-------+------------+
| 11/11 | Acc11/11 | 4 | 2 | 11 | 11 | 11 |
+-----------+--------------+----+----------------+------------+-------+------------+
| 11/111 | Acc11/111 | 5 | 2 | 11 | 111 | 11 |
+-----------+--------------+----+----------------+------------+-------+------------+
| 11/12 | Acc11/12 | 6 | 2 | 11 | 12 | 11 |
+-----------+--------------+----+----------------+------------+-------+------------+
| 12/111 | Acc12/111 | 7 | 2 | 12 | 111 | 12 |
+-----------+--------------+----+----------------+------------+-------+------------+
| 12/112 | Acc12/112 | 8 | 2 | 12 | 112 | 12 |
+-----------+--------------+----+----------------+------------+-------+------------+
| 11/11/001 | Acc11/11/001 | 9 | 3 | 11/11 | 001 | 11 |
+-----------+--------------+----+----------------+------------+-------+------------+
| 11/11/002 | Acc11/11/002 | 10 | 3 | 11/11 | 002 | 11 |
+-----------+--------------+----+----------------+------------+-------+------------+
And now a similar recursive approach takes place as in my first answer. But - as it is using a real table now and all the string splitting has taken place already - it should be faster...
WITH RecursiveCTE AS
(
SELECT th.*
,CAST(NULL AS BIGINT) AS ParentID
,CASE WHEN EXISTS(SELECT 1 FROM #tempHierarchy AS x WHERE x.ParentPath=th.AccountID) THEN 1 ELSE 0 END AS HasChild
FROM #tempHierarchy AS th WHERE th.HierarchyLevel=1
UNION ALL
SELECT sa.AccountID
,sa.AccountName
,sa.ID
,sa.HierarchyLevel
,sa.ParentPath
,sa.ownID
,sa.ancestorID
,(SELECT x.ID FROM #tempHierarchy AS x WHERE x.AccountID=sa.ParentPath)
,CASE WHEN EXISTS(SELECT 1 FROM #tempHierarchy AS x WHERE x.ParentPath=sa.AccountID) THEN 1 ELSE 0 END AS HasChild
FROM RecursiveCTE AS r
INNER JOIN #tempHierarchy AS sa ON sa.HierarchyLevel=r.HierarchyLevel+1
AND r.AccountID=sa.ParentPath
)
SELECT r.AccountID
,r.AccountName
,r.ID
,r.ParentID
,r.HierarchyLevel
,r.HasChild
FROM RecursiveCTE AS r
ORDER BY HierarchyLevel,ParentID;
And finally I clean up
DROP TABLE #tempHierarchy;
And here's the final result
+-----------+--------------+----+----------+----------------+----------+
| AccountID | AccountName | ID | ParentID | HierarchyLevel | HasChild |
+-----------+--------------+----+----------+----------------+----------+
| 11 | Acc11 | 1 | NULL | 1 | 1 |
+-----------+--------------+----+----------+----------------+----------+
| 12 | Acc12 | 2 | NULL | 1 | 1 |
+-----------+--------------+----+----------+----------------+----------+
| 13 | Acc13 | 3 | NULL | 1 | 0 |
+-----------+--------------+----+----------+----------------+----------+
| 11/11 | Acc11/11 | 4 | 1 | 2 | 1 |
+-----------+--------------+----+----------+----------------+----------+
| 11/111 | Acc11/111 | 5 | 1 | 2 | 0 |
+-----------+--------------+----+----------+----------------+----------+
| 11/12 | Acc11/12 | 6 | 1 | 2 | 0 |
+-----------+--------------+----+----------+----------------+----------+
| 12/111 | Acc12/111 | 7 | 2 | 2 | 0 |
+-----------+--------------+----+----------+----------------+----------+
| 12/112 | Acc12/112 | 8 | 2 | 2 | 0 |
+-----------+--------------+----+----------+----------------+----------+
| 11/11/001 | Acc11/11/001 | 9 | 4 | 3 | 0 |
+-----------+--------------+----+----------+----------------+----------+
| 11/11/002 | Acc11/11/002 | 10 | 4 | 3 | 0 |
+-----------+--------------+----+----------+----------------+----------+
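The idea from the question (count the "/" separators for the level, cut off the last segment to find the parent) can also be checked outside the database. Below is a minimal Python sketch of that logic, not a replacement for the T-SQL above. The function name build_tree and the dictionary keys are illustrative assumptions, and rows are numbered in plain string order, which matches the final result table here but may differ from a SQL Server collation for other data.

```python
# Sample rows from the question: (AccountID, AccountName).
accounts = [
    ("11", "Acc11"), ("12", "Acc12"), ("13", "Acc13"),
    ("11/11", "Acc11/11"), ("11/12", "Acc11/12"), ("11/111", "Acc11/111"),
    ("11/11/001", "Acc11/11/001"), ("11/11/002", "Acc11/11/002"),
    ("12/111", "Acc12/111"), ("12/112", "Acc12/112"),
]


def build_tree(rows):
    """Return one dict per account with ID, ParentID, Level and HasChild."""
    # Level = number of '/' separators + 1; number rows by level, then path.
    ordered = sorted(rows, key=lambda r: (r[0].count("/"), r[0]))
    ids = {account_id: n for n, (account_id, _) in enumerate(ordered, start=1)}
    # The parent path is everything before the last '/' (empty for roots).
    parent_of = {a: a.rpartition("/")[0] for a, _ in ordered}
    parents_in_use = set(parent_of.values())
    return [
        {
            "AccountID": a,
            "AccountName": name,
            "ID": ids[a],
            "ParentID": ids.get(parent_of[a]),  # None for root accounts
            "Level": a.count("/") + 1,
            "HasChild": int(a in parents_in_use),
        }
        for a, name in ordered
    ]


tree = build_tree(accounts)
for node in tree:
    print(node)
```

Because every lookup is a dictionary access, the work grows roughly linearly with the number of accounts (plus the sort), which is the property the question was hoping for.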

Related

SQL-Server Closure table query

I need a hierarchy for my database and decided to use the closure table model. The hierarchy tables have the usual structure, like this:
locations table
+----+---------+
| id | name |
+----+---------+
| 1 | Europe |
| 2 | France |
| 3 | Germany |
| 4 | Spain |
| 5 | Paris |
| 6 | Nizza |
| 7 | Berlin |
| 8 | Munich |
| 9 | Madrid |
+----+---------+
CREATE TABLE locations (
id int IDENTITY(1,1) PRIMARY KEY,
name varchar(30)
)
locations_relation table
+----+--------+--------+-------+
| id | src_id | dst_id | depth |
+----+--------+--------+-------+
| 1 | 1 | 1 | 0 |
| 2 | 2 | 2 | 0 |
| 3 | 1 | 2 | 1 |
| 4 | 3 | 3 | 0 |
| 5 | 1 | 3 | 1 |
| 6 | 4 | 4 | 0 |
| 7 | 1 | 4 | 1 |
| 8 | 5 | 5 | 0 |
| 9 | 2 | 5 | 1 |
| 10 | 1 | 5 | 2 |
| 11 | 6 | 6 | 0 |
| 12 | 2 | 6 | 1 |
| 13 | 1 | 6 | 2 |
| 14 | 7 | 7 | 0 |
| 15 | 3 | 7 | 1 |
| 16 | 1 | 7 | 2 |
| 17 | 8 | 8 | 0 |
| 18 | 3 | 8 | 1 |
| 19 | 1 | 8 | 2 |
| 20 | 9 | 9 | 0 |
| 21 | 4 | 9 | 1 |
| 22 | 1 | 9 | 2 |
+----+--------+--------+-------+
CREATE TABLE locations_relation (
id int IDENTITY(1,1) PRIMARY KEY,
src_id int,
dst_id int,
depth int,
CONSTRAINT FK_src FOREIGN KEY (src_id)
REFERENCES locations (id),
CONSTRAINT FK_dst FOREIGN KEY (dst_id)
REFERENCES locations (id)
)
Now there is a third table, which holds information about documents and is referencing the locations table, which looks like this:
closure_junction
+----+------------+-------------+
| id | country_id | document_id |
+----+------------+-------------+
| 1 | 2 | 1 |
| 2 | 2 | 2 |
| 3 | 6 | 2 |
| 4 | 6 | 3 |
| 5 | 5 | 2 |
| 6 | 5 | 4 |
+----+------------+-------------+
CREATE TABLE closure_junction (
id int IDENTITY(1,1) PRIMARY KEY,
country_id int NOT NULL,
document_id int,
CONSTRAINT FK_countries FOREIGN KEY (country_id)
REFERENCES locations (id)
)
What I'd like to have is a single SQL query that counts the documents per location; if there are documents in a child, they should also be counted in the parent. For example, if Paris holds 2 documents, then France should automatically hold those 2 documents as well. The query should also output the path of each node to the root as well as the depth of the node. I know there is a way to do this recursively, but I'd like to avoid that.
I have a query which gives me the correct result, but I'm not satisfied with how it works. Is there a way to circumvent storing the children in a column?
This is my query with the correct output:
;WITH cte (name, path, depth, children) AS
(
SELECT
node.name,
STRING_AGG(locations.name, ' / ' ) WITHIN GROUP (ORDER BY relation.depth DESC) as path,
MAX(relation.depth) as depth,
STRING_AGG(locations.id, ' ') as children
FROM locations node
INNER JOIN locations_relation relation
ON node.id = relation.dst_id
INNER JOIN locations
ON relation.src_id = locations.id
GROUP BY node.name
)
SELECT
name,
path,
depth,
COUNT(DISTINCT document_id) as count_docs
FROM cte
CROSS APPLY string_split(children, ' ')
LEFT JOIN closure_junction ON
closure_junction.country_id = value
GROUP BY name, path, depth
ORDER BY depth ASC
+---------+---------------------------+-------+------------+
| name | path | depth | count_docs |
+---------+---------------------------+-------+------------+
| Europe | Europe | 0 | 0 |
| France | Europe / France | 1 | 2 |
| Germany | Europe / Germany | 1 | 0 |
| Spain | Europe / Spain | 1 | 0 |
| Berlin | Europe / Germany / Berlin | 2 | 0 |
| Madrid | Europe / Spain / Madrid | 2 | 0 |
| Munich | Europe / Germany / Munich | 2 | 0 |
| Nizza | Europe / France / Nizza | 2 | 3 |
| Paris | Europe / France / Paris | 2 | 3 |
+---------+---------------------------+-------+------------+
Would be great if someone could give me a clue on how to accomplish this.
You can easily replace the count with a simple LEFT JOIN, but you will still need to concatenate the path somehow.
Something like this:
WITH CTE_path
AS
( SELECT node.id,
STRING_AGG(locations.name, ' / ' ) WITHIN GROUP (ORDER BY relation.depth DESC) as path
FROM locations node
INNER JOIN locations_relation relation
ON node.id = relation.dst_id
INNER JOIN locations
ON relation.src_id = locations.id
GROUP BY node.id)
SELECT l.name,count(DISTINCT cj.document_id),pa.path
FROM locations l
JOIN CTE_path pa
ON pa.id = l.id
LEFT JOIN locations_relation lr
ON l.id = lr.dst_id
LEFT JOIN closure_junction cj
ON cj.country_id = lr.src_id
GROUP BY l.name,pa.path
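The question's stated goal is that documents attached to a child also count toward its ancestors. Below is a minimal Python sketch of that roll-up over closure rows, using the sample data. The parent map and the function names are hypothetical helpers introduced only to regenerate the (src_id, dst_id, depth) rows of locations_relation. Note that this counts over each node's descendants, so the numbers differ from the result table in the question, whose query aggregates documents over each node's ancestors.

```python
from collections import defaultdict

# Sample data from the question. `parent` is a hypothetical adjacency map used
# only to regenerate the closure rows shown in locations_relation.
names = {1: "Europe", 2: "France", 3: "Germany", 4: "Spain", 5: "Paris",
         6: "Nizza", 7: "Berlin", 8: "Munich", 9: "Madrid"}
parent = {2: 1, 3: 1, 4: 1, 5: 2, 6: 2, 7: 3, 8: 3, 9: 4}
# closure_junction rows as (country_id, document_id).
junction = [(2, 1), (2, 2), (6, 2), (6, 3), (5, 2), (5, 4)]


def closure_rows(parent, ids):
    """Yield (src_id, dst_id, depth) for every ancestor/descendant pair."""
    for dst in ids:
        src, depth = dst, 0
        while src is not None:
            yield src, dst, depth
            src, depth = parent.get(src), depth + 1


def summarize(names, parent, junction):
    """Return {name: (path, depth, count_docs)} with child documents rolled up."""
    ancestors = defaultdict(list)    # dst_id -> [(depth, src_id), ...]
    descendants = defaultdict(set)   # src_id -> {dst_id, ...}, including itself
    for src, dst, depth in closure_rows(parent, names):
        ancestors[dst].append((depth, src))
        descendants[src].add(dst)
    result = {}
    for node, name in names.items():
        chain = sorted(ancestors[node], reverse=True)  # deepest ancestor (root) first
        path = " / ".join(names[src] for _, src in chain)
        # Distinct documents attached to the node itself or any descendant.
        docs = {doc for loc, doc in junction if loc in descendants[node]}
        result[name] = (path, len(chain) - 1, len(docs))
    return result


summary = summarize(names, parent, junction)
for name, row in summary.items():
    print(name, row)
```

In SQL terms the roll-up direction corresponds to matching the node on src_id and the documents on dst_id; the sketch only illustrates which rows end up in each count.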

SQL Server Lag by partitioned group

I have a table of data as follows:
+----+-------+----------+
| id | value | group_id |
+----+-------+----------+
| 1 | -200 | 0 |
| 2 | -620 | 0 |
| 3 | -310 | 0 |
| 4 | 400 | 1 |
| 5 | 300 | 1 |
| 6 | 100 | 1 |
| 7 | -200 | 2 |
| 8 | -400 | 2 |
| 9 | -500 | 2 |
+----+-------+----------+
What I would like to do is produce a 4th column that, for each record, shows the last value of the preceding group_id.
So the result I want is as follows:
+----+-------+----------+----------------+
| id | value | group_id | LastValByGroup |
+----+-------+----------+----------------+
| 1 | -200 | 0 | 0 |
| 2 | -620 | 0 | 0 |
| 3 | -310 | 0 | 0 |
| 4 | 400 | 1 | -310 |
| 5 | 300 | 1 | -310 |
| 6 | 100 | 1 | -310 |
| 7 | -200 | 2 | 100 |
| 8 | -400 | 2 | 100 |
| 9 | -500 | 2 | 100 |
+----+-------+----------+----------------+
What I have done so far is in 2 parts. First I use the LAST_VALUE function to get the last Value in each group. Then I have tried to use the LAG function to get the last value from the previous group. Unfortunately the second part of my code isn't working as desired.
Here is my code:
CREATE TABLE #temp
(
id int identity(1,1),
value int,
group_id int
)
INSERT #temp VALUES(-200,0)
INSERT #temp VALUES(-620,0)
INSERT #temp VALUES(-310,0)
INSERT #temp VALUES(400,1)
INSERT #temp VALUES(300,1)
INSERT #temp VALUES(100,1)
INSERT #temp VALUES(-200,3)
INSERT #temp VALUES(-400,3)
INSERT #temp VALUES(-500,3)
;WITH cte AS
(
SELECT
*,
LastValByGroup = LAST_VALUE(Value) OVER(Partition By group_id ORDER BY id
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
FROM
#temp
), lagged AS
(
SELECT
*,
LaggedLastValByGroup = LAG(LastValByGroup,1,0) OVER(Partition By group_id ORDER BY id)
FROM
cte
)
SELECT * FROM lagged ORDER BY id
DROP TABLE #temp
And this is the result I get:
+----+-------+----------+----------------+----------------------+
| id | value | group_id | LastValByGroup | LaggedLastValByGroup |
+----+-------+----------+----------------+----------------------+
| 1 | -200 | 0 | -310 | 0 |
| 2 | -620 | 0 | -310 | -310 |
| 3 | -310 | 0 | -310 | -310 |
| 4 | 400 | 1 | 100 | 0 |
| 5 | 300 | 1 | 100 | 100 |
| 6 | 100 | 1 | 100 | 100 |
| 7 | -200 | 3 | -500 | 0 |
| 8 | -400 | 3 | -500 | -500 |
| 9 | -500 | 3 | -500 | -500 |
+----+-------+----------+----------------+----------------------+
Any help is much appreciated.
Thanks
You can use FIRST_VALUE as follows to get the desired result. Note that the join condition t1.group_id + 1 = t2.group_id assumes consecutive group_id values, as in the 0, 1, 2 sample table.
select distinct t2.*, ISNULL(FIRST_VALUE(t1.[value]) over(partition by t1.group_id order by t1.id desc), 0) LastValByGroup
from #temp t1
right join #temp t2 on t1.group_id + 1 = t2.group_id
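To check the expected LastValByGroup column independently of the database, here is a small Python sketch of the same idea: take the last value of each group, then hand it to the next group. The function name is an illustrative assumption. Unlike a join on group_id + 1, this version pairs each group with the one before it in sorted order, so it also copes with gaps such as the 0, 1, 3 groups in the question's INSERT statements.

```python
# Sample rows from the question's table: (id, value, group_id).
rows = [
    (1, -200, 0), (2, -620, 0), (3, -310, 0),
    (4, 400, 1), (5, 300, 1), (6, 100, 1),
    (7, -200, 2), (8, -400, 2), (9, -500, 2),
]


def last_value_of_previous_group(rows, default=0):
    """Return (id, value, group_id, LastValByGroup) tuples ordered by id."""
    ordered = sorted(rows)  # tuples sort by id first
    # Later ids overwrite earlier ones, leaving each group's last value.
    last_in_group = {group: value for _, value, group in ordered}
    # Pair each group with the group before it in sorted order, so gaps in
    # group_id (for example 0, 1, 3) still resolve to the preceding group.
    groups = sorted(last_in_group)
    previous = dict(zip(groups[1:], groups))
    return [
        (id_, value, group,
         last_in_group[previous[group]] if group in previous else default)
        for id_, value, group in ordered
    ]


result = last_value_of_previous_group(rows)
for row in result:
    print(row)
```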

Divide selected value by count(*)

I have a Microsoft SQL Server with the following tables:
Projects
BookedHours (with fk_Project = Projects.ID)
Products
ProjectsToProducts (n:m with fk_Projects = Projects.ID and fk_Products = Products.ID)
I now want to select how many hours are booked to which product per month. The problem is, that one project can have multiple products (that's why I need the n:m table).
If I do the following, it will count the hours twice if a project has two products.
SELECT
P.ID AS fk_Product, MONTH(B.Datum) AS Monat, SUM(B.Hours) AS Stunden
FROM
tbl_BookedHours AS B
INNER JOIN
tbl_Projects AS M on B.fk_Project = M.ID
INNER JOIN
tbl_ProjectProduct AS PP ON PP.fk_Project = M.ID
INNER JOIN
tbl_Products AS P ON PP.fk_Product = P.ID
WHERE
YEAR(B.Datum) = 2020
GROUP BY
P.ID, MONTH(B.Datum)
ORDER BY
P.ID, MONTH(B.Datum)
I can get the number of products for each project with this SQL:
SELECT fk_Project, COUNT(*) AS Cnt
FROM tbl_ProjectProduct
GROUP BY fk_Project
But how can I now divide the hours for each project by its individual factor and add it all up per product and month?
I could do it in my C# program or I could use a cursor and iterate through all projects, but I think there should be a more elegant way.
Edit with sample data:
|----------------| |----------------| |------------------------------|
| tbl_Projects | | tbl_Products | | tbl_ProjectProduct |
|----------------| |----------------| |------------------------------|
| ID | Name | | ID | Name | | ID | fk_Project | fk_Product |
|----+-----------| |----+-----------| |------------------------------|
| 1 | Project 1 | | 1 | Product 1 | | 1 | 1 | 1 |
| 2 | Project 2 | | 2 | Product 2 | | 2 | 1 | 2 |
| 3 | Project 3 | | 3 | Product 3 | | 3 | 2 | 1 |
| 4 | Project 4 | | 4 | Product 4 | | 4 | 3 | 3 |
|----------------| |----------------| | 5 | 4 | 1 |
| 6 | 4 | 2 |
| 7 | 4 | 4 |
|------------------------------|
|--------------------------------------|
| tbl_BookedHours |
|--------------------------------------|
| ID | fk_Project | Hours | Date |
|--------------------------------------|
| 1 | 1 | 10 | 2020-01-15 |
| 2 | 1 | 20 | 2020-01-20 |
| 3 | 2 | 10 | 2020-01-15 |
| 4 | 3 | 30 | 2020-01-18 |
| 5 | 2 | 20 | 2020-01-20 |
| 6 | 4 | 30 | 2020-01-25 |
| 7 | 1 | 10 | 2020-02-15 |
| 8 | 1 | 20 | 2020-02-20 |
| 9 | 2 | 10 | 2020-02-15 |
| 10 | 3 | 30 | 2020-03-18 |
| 11 | 2 | 20 | 2020-03-20 |
| 12 | 4 | 30 | 2020-03-25 |
|--------------------------------------|
The Result should be:
|----------------------------|
| fk_Product | Month | Hours |
|----------------------------|
| 1 | 1 | 55 |
| 2 | 1 | 25 |
| 3 | 1 | 30 |
| 4 | 1 | 10 |
| 1 | 2 | 25 |
| 2 | 2 | 15 |
| 1 | 3 | 30 |
| 2 | 3 | 10 |
| 3 | 3 | 30 |
| 4 | 3 | 10 |
|----------------------------|
For example, booking no. 1 has to be divided by 2 (because Project 1 has two products), with one half of the amount added to Product 1 and the other half to Product 2 (both in January). Booking no. 4 should not be divided, because Project 3 only has one product. Booking no. 12, for example, has to be divided by 3.
So that in total the Hours in the end add up to the same total.
I hope it's clearer now.
*** EDIT 2***
DECLARE @tbl_Projects TABLE (ID INT, [Name] VARCHAR(MAX))
INSERT INTO @tbl_Projects VALUES
(1,'Project 1'),
(2,'Project 2'),
(3,'Project 3'),
(4,'Project 4')
DECLARE @tbl_Products TABLE (ID INT, [Name] VARCHAR(MAX))
INSERT INTO @tbl_Products VALUES
(1,'Product 1'),
(2,'Product 2'),
(3,'Product 3'),
(4,'Product 4')
DECLARE @tbl_ProjectProduct TABLE (ID INT, fk_Project int, fk_Product int)
INSERT INTO @tbl_ProjectProduct VALUES
(1,1,1),
(2,1,2),
(3,2,1),
(4,3,3),
(5,4,1),
(6,4,2),
(7,4,4)
DECLARE @tbl_BookedHours TABLE (ID INT, fk_Project int, Hours int, [Date] Date)
INSERT INTO @tbl_BookedHours VALUES
(1,1,10,'2020-01-15'),
(2,1,20,'2020-01-20'),
(3,2,10,'2020-01-15'),
(4,3,30,'2020-01-18'),
(5,2,20,'2020-01-20'),
(6,4,30,'2020-01-25'),
(7,1,10,'2020-02-15'),
(8,1,20,'2020-02-20'),
(9,2,10,'2020-02-15'),
(10,3,30,'2020-03-18'),
(11,2,20,'2020-03-20'),
(12,4,30,'2020-03-25')
SELECT P.ID AS fk_Product, MONTH(B.Date) AS Month, SUM(B.Hours) AS SumHours
FROM @tbl_BookedHours AS B INNER JOIN @tbl_Projects AS M on B.fk_Project = M.ID
INNER JOIN @tbl_ProjectProduct AS PP ON PP.fk_Project = M.ID
INNER JOIN @tbl_Products AS P ON PP.fk_Product = P.ID
GROUP BY P.ID,MONTH(B.Date)
ORDER BY P.ID, MONTH(B.Date)
This gives me the wrong result, because it counts the full hours for every product of a project:
| fk_Product | Month | SumHours |
|-------------------------------|
| 1 | 1 | 90 |
| 1 | 2 | 40 |
| 1 | 3 | 50 |
| 2 | 1 | 60 |
| 2 | 2 | 30 |
| 2 | 3 | 30 |
| 3 | 1 | 30 |
| 3 | 3 | 30 |
| 4 | 1 | 30 |
| 4 | 3 | 30 |
|-------------------------------|
Consider the following query. I modified your table variables to temp tables so it was easier to debug.
;WITH CTE AS
(
SELECT fk_Project, count(fk_Product) CNT
FROM #tbl_ProjectProduct
GROUP BY fk_Project
)
,CTE2 AS
(
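-- Multiply by 1.0 so the division is not integer division (10/3 would otherwise truncate to 3)
SELECT t1.Date, t2.fk_Project, Hours * 1.0 / CNT NewHours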
FROM #tbl_BookedHours t1
INNER JOIN CTE t2 on t1.fk_Project = t2.fk_Project
)
SELECT t4.ID fk_Product, MONTH(date) MN, SUM(NewHours) HRS
FROM CTE2 t1
INNER JOIN #tbl_Projects t2 on t1.fk_Project = t2.id
INNER JOIN #tbl_ProjectProduct t3 on t3.fk_Project = t2.ID
INNER JOIN #tbl_Products t4 on t4.ID = t3.fk_Product
GROUP BY t4.ID,MONTH(date)
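The arithmetic behind this query can be checked with a short Python sketch: each booking is divided by the number of products of its project, and the shares are summed per product and month. The function name and the tuple layout are illustrative assumptions. Fraction is used so the shares stay exact and the grand total is preserved even when the hours do not divide evenly.

```python
from collections import defaultdict
from fractions import Fraction

# Sample data from the question.
project_products = [(1, 1), (1, 2), (2, 1), (3, 3), (4, 1), (4, 2), (4, 4)]
booked_hours = [  # (fk_Project, Hours, month of booking)
    (1, 10, 1), (1, 20, 1), (2, 10, 1), (3, 30, 1), (2, 20, 1), (4, 30, 1),
    (1, 10, 2), (1, 20, 2), (2, 10, 2),
    (3, 30, 3), (2, 20, 3), (4, 30, 3),
]


def hours_per_product(project_products, booked_hours):
    """Return {(product, month): hours} with each booking split evenly."""
    products_of = defaultdict(list)
    for project, product in project_products:
        products_of[project].append(product)
    totals = defaultdict(Fraction)
    for project, hours, month in booked_hours:
        products = products_of[project]
        if not products:
            continue  # like the INNER JOIN: projects without products drop out
        share = Fraction(hours, len(products))  # exact, no truncation
        for product in products:
            totals[(product, month)] += share
    return dict(totals)


totals = hours_per_product(project_products, booked_hours)
for (product, month), hours in sorted(totals.items(), key=lambda kv: (kv[0][1], kv[0][0])):
    print(product, month, float(hours))
```

For the sample data every share is a whole number, so the output matches the expected result table in the question.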

Creating a conditional ROW_NUMBER() Partition clause based on previous row value

I have a table that looks like this:
+----------------+--------+
| EvidenceNumber | ID |
+----------------+--------+
| 001 | 8 |
| 001.A | 8 |
| 001.A.01 | 8 |
| 001.A.02 | 8 |
| 001.B | 8 |
| 001.C | 8 |
| 001.D | 8 |
| 001.E | 8 |
| 001.F | 8 |
| 001.G | 8 |
| 001.G.01 | 8 |
+----------------+--------+
If 001 were a bag, inside of it was 001.A, 001.B, and so on through to 001.G
In the output above, 001.A was another bag, and that bag contained 001.A.01 and 001.A.02. The same thing can be seen with 001.G.01.
Every entry in this table is either a bag or an item. I am only interested in counting the number of items per ID.
Since 001.A.01 and 001.A.02 is the last we see of the "001.A's" we know A.01 and A.02 were items.
Since we see 001.B only once, that was an item as well.
001.G was a bag, but 001.G.01 was an item.
The above output is showing 8 items and 3 bags.
I feel like Row_number and the Partition clause is the perfect tool for the job, but I can't find a way to partition based on a clause that uses a previous row's value.
Maybe something like that isn't even necessary here, but I pictured it like:
{001} -- variable
{001}.A -- variable seen again, obviously 001 was a bag. Create new variable {001.A} and move on.
{001.A}.01 -- same thing.
{001.A.01} -- Unique variable. This is a final step. This is an item and should be Row number 1.
Obviously, the code below just makes "ItemNum" 1 for each row, since there are no duplicates.
SELECT
ROW_NUMBER() OVER(Partition BY EvidenceNumber ORDER BY EvidenceNumber) AS ItemNum,
EvidenceNumber,
ID
FROM EVIDENCE
WHERE ID = '8'
ORDER BY EvidenceNumber
+---------+----------------+--------+
| ItemNum | EvidenceNumber | ID |
+---------+----------------+--------+
| 1 | 001 | 8 |
| 1 | 001.A | 8 |
| 1 | 001.A.01 | 8 |
| 1 | 001.A.02 | 8 |
| 1 | 001.B | 8 |
| 1 | 001.C | 8 |
| 1 | 001.D | 8 |
| 1 | 001.E | 8 |
| 1 | 001.F | 8 |
| 1 | 001.G | 8 |
| 1 | 001.G.01 | 8 |
+---------+----------------+--------+
Ideally, it would partition on the items only, so in this case:
+---------+----------------+----+
| ItemNum | EvidenceNumber | ID |
+---------+----------------+----+
| 0 | 001 | 8 |
| 0 | 001.A | 8 |
| 1 | 001.A.01 | 8 |
| 2 | 001.A.02 | 8 |
| 3 | 001.B | 8 |
| 4 | 001.C | 8 |
| 5 | 001.D | 8 |
| 6 | 001.E | 8 |
| 7 | 001.F | 8 |
| 0 | 001.G | 8 |
| 8 | 001.G.01 | 8 |
+---------+----------------+----+
I don't think window functions alone are the best approach. Instead:
select t.*,
(case when exists (select 1
from evidence t2
where t2.caseid = t.caseid and
t2.EvidenceNumber like t.EvidenceNumber + '.%'
)
then 0 else 1
end) as is_item
from evidence t ;
Then sum these up using another subquery:
select t.*,
sum(is_item) over (partition by caseid order by EvidenceNumber) as item_counter
from (select t.*,
(case when exists (select 1
from evidence t2
where t2.caseid = t.caseid and
t2.EvidenceNumber like t.EvidenceNumber + '.%'
)
then 0 else 1
end) as is_item
from evidence t
) t;
Another option is a trick with LEAD and ROW_NUMBER:
CREATE TABLE #Table (
EvidenceNumber varchar(64),
Id int
)
INSERT INTO #Table VALUES
('001',8),
('001.A',8),
('001.A.01',8),
('001.A.02',8),
('001.B',8),
('001.C',8),
('001.D',8),
('001.E',8),
('001.F',8),
('001.G',8),
('001.G.01',8);
WITH CTE AS (
SELECT
[IsBag] = PATINDEX(EvidenceNumber+'%',
IsNull(LEAD(EvidenceNumber) OVER (ORDER BY EvidenceNumber),0)
),
[EvidenceNumber],
[Id]
FROM
#Table
)
SELECT
[NumItem] = IIF(IsBag = 0,ROW_NUMBER() OVER (PARTITION BY [IsBag] ORDER BY [EvidenceNumber]),0),
[EvidenceNumber],
[Id]
FROM
CTE
ORDER BY EvidenceNumber
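The rule both answers rely on is that an entry is a bag exactly when some other entry starts with its number followed by a dot. Below is a minimal Python sketch of that rule, numbering items sequentially and giving bags 0 as in the desired output of the question. The function name is an illustrative assumption, and the check compares every pair of entries, so it is meant for verifying small samples rather than for speed.

```python
# EvidenceNumber values from the question (all with ID 8).
evidence = ["001", "001.A", "001.A.01", "001.A.02", "001.B", "001.C",
            "001.D", "001.E", "001.F", "001.G", "001.G.01"]


def number_items(evidence_numbers):
    """Return (ItemNum, EvidenceNumber) pairs; bags get 0, items get 1, 2, ..."""
    ordered = sorted(evidence_numbers)
    numbered = []
    counter = 0
    for current in ordered:
        # A bag has at least one entry nested below it: "<current>.<something>".
        is_bag = any(other.startswith(current + ".") for other in ordered)
        if is_bag:
            numbered.append((0, current))
        else:
            counter += 1
            numbered.append((counter, current))
    return numbered


numbered = number_items(evidence)
for item_num, evidence_number in numbered:
    print(item_num, evidence_number)
```

Including the dot in the prefix test matters: without it, an entry such as 001.A would wrongly be treated as the bag of a hypothetical sibling like 001.AB.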

Adding a count column in SQL Server for groups of records

I am trying to update an existing table with an individual count of the record on each row in a count column.
The table has the following columns that need to be incremented:
MBR_NO, CLAIM_N0, Effective_Dt, incr_count
So a sample might look like this before the run:
MBR_NO | CLAIM_N0 | Effective_Dt | incr_count |
-------+----------+----------------+------------+
1 | 2 | 1/1/2015 | NULL |
1 | 4 | 5/5/2015 | NULL |
1 | 5 | 6/7/2016 | NULL |
1 | 7 | 8/7/2016 | NULL |
2 | 2 | 4/3/2015 | NULL |
2 | 5 | 5/21/2015 | NULL |
3 | 8 | 3/27/2015 | NULL |
I want to count by MBR_NO and update the Incr_count to look like this:
MBR_NO | CLAIM_N0 | Effective_Dt | incr_count |
-------+----------+----------------+------------+
1 | 2 | 1/1/2015 | 1 |
1 | 4 | 5/5/2015 | 2 |
1 | 5 | 6/7/2016 | 3 |
1 | 7 | 8/7/2016 | 4 |
2 | 2 | 4/3/2015 | 1 |
2 | 5 | 5/21/2015 | 2 |
3 | 8 | 3/27/2015 | 1 |
I need to change that field for later processing.
I know this is not that complex, but the other topics I found seemed to offer solutions that don't update incrementally. Any help would be appreciated.
You could compute this in a query with
ROW_NUMBER() OVER (PARTITION BY MBR_NO ORDER BY Effective_Dt)
But does it matter if the number changes? For example, suppose you had
MBR_NO EffectiveDate RowNumber
------------------------------------
2 1/1/2017 1
2 5/1/2017 2
If you then inserted a row with an effective date of, say, 3/1/2017, the row number of the 5/1/2017 row would change:
MBR_NO EffectiveDate RowNumber
------------------------------------
2 1/1/2017 1
2 3/1/2017 2
2 5/1/2017 3
You can query as below:
Select MBR_NO, CLAIM_N0, Effective_Dt,
incr_count = count(MBR_NO) over(Partition by MBR_NO order by Effective_Dt)
from yourtable
Output as below:
+--------+----------+--------------+------------+
| MBR_NO | CLAIM_N0 | Effective_Dt | incr_count |
+--------+----------+--------------+------------+
| 1 | 2 | 2015-01-01 | 1 |
| 1 | 4 | 2015-05-05 | 2 |
| 1 | 5 | 2016-06-07 | 3 |
| 1 | 7 | 2016-08-07 | 4 |
| 2 | 2 | 2015-04-03 | 1 |
| 2 | 5 | 2015-05-21 | 2 |
| 3 | 8 | 2015-03-27 | 1 |
+--------+----------+--------------+------------+
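The per-member numbering can be reproduced with a short Python sketch: sort by member and effective date, then keep one counter per MBR_NO. The function name is an illustrative assumption, the dates are read as month/day/year to match the output above, and breaking date ties by claim number is an assumption as well, since the question does not say how ties should be ordered.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (MBR_NO, CLAIM_N0, Effective_Dt).
claims = [
    (1, 2, date(2015, 1, 1)), (1, 4, date(2015, 5, 5)),
    (1, 5, date(2016, 6, 7)), (1, 7, date(2016, 8, 7)),
    (2, 2, date(2015, 4, 3)), (2, 5, date(2015, 5, 21)),
    (3, 8, date(2015, 3, 27)),
]


def add_incr_count(rows):
    """Return rows extended with a 1-based counter that restarts per MBR_NO."""
    counters = defaultdict(int)
    numbered = []
    # Order by member, then effective date; claim number breaks ties (assumption).
    for mbr_no, claim_no, effective_dt in sorted(rows, key=lambda r: (r[0], r[2], r[1])):
        counters[mbr_no] += 1
        numbered.append((mbr_no, claim_no, effective_dt, counters[mbr_no]))
    return numbered


numbered_claims = add_incr_count(claims)
for row in numbered_claims:
    print(row)
```

As the first answer points out, these numbers are positions in the current ordering, so inserting an earlier-dated row for a member shifts the later rows' values.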
