Recursing down a tree - from parent root down to children - sql-server

I need to write a query for a database structure where a parent element has 0-many child elements, but a child element can "have" many parents.
Note that this is not the usual problem in which each child row stores its parent's ID, where you can simply write a recursive CTE that "starts" at a child and joins back onto itself until it hits a root element (i.e., an element with a NULL ParentID).
For this I need specifically to start at a parent and work my way down finding all children, grandchildren etc. So my database structure currently is as follows:
create table Element (
Id int identity(1, 1) not null,
Name varchar(100) not null,
-- other stuff ...
)
create table ElementRelation (
ParentElementId int not null, -- foreign key to Element.Id
ChildElementId int not null, -- foreign key to Element.Id
-- other stuff...
)
select * from Element
/* returns
Id | Name
------|---------
1 | ElementA
2 | ElementB
3 | ElementC
4 | ElementD
*/
select * from ElementRelation
/* returns
ParentElementId | ChildElementId
----------------|---------------
1 | 2
1 | 3
1 | 4
2 | 3
2 | 4
3 | 4
*/
Which results in this tree structure (pardon the crude Paint doodle):
So you can see the typical solution of having a leaf-first table with a ParentId foreign key column doesn't work - element 4 has three immediate parents, element 3 has two, etc. It would be inappropriate for the child to declare its parent elements.
What I effectively need is a query that, given a starting parent element, finds all its immediate children, then for each of those children finds all of their immediate children, and so on until every path has reached a leaf node. With this example data, a query against element 1 would return {1,2}, {1,3}, {1,4}, {2,3}, {2,4}, {3,4}, {3,4} (it doesn't matter whether the query returns only a distinct list), and a query against element 2 would return {2,4}, {2,3}, {3,4}.
I could solve this with a cursor, but if there's a faster set-based means that would be preferred. If there's a better approach that would redefine the fundamental structure, that's also acceptable.
In terms of "what have you tried?" - several variants on a CTE query based on child-to-parent recursion, none of which came close to solving the problem, so I won't share them here.

One option is a recursive CTE:
DROP TABLE IF EXISTS [#ElementRelation];
CREATE TABLE [#ElementRelation]
(
[ParentElementId] INT
, [ChildElementId] INT
);
INSERT INTO [#ElementRelation] (
[ParentElementId]
, [ChildElementId]
)
VALUES ( 1, 2 )
, ( 1, 3 )
, ( 1, 4 )
, ( 2, 3 )
, ( 2, 4 )
, ( 3, 4 );
DECLARE @Start INT = 1;
;WITH [cte]
AS (
-- Anchor member: where the recursion starts
SELECT [p].[ParentElementId]
, [p].[ChildElementId]
FROM [#ElementRelation] [p]
WHERE [p].[ParentElementId] = @Start
UNION ALL
-- Recursive member: select from the table again, joining to the CTE; note the ON clause [d].[ChildElementId] = [c].[ParentElementId]
SELECT [c].[ParentElementId]
, [c].[ChildElementId]
FROM [#ElementRelation] [c]
INNER JOIN [cte] [d]
ON [d].[ChildElementId] = [c].[ParentElementId] )
SELECT *
FROM [cte];
Giving you the results:
ParentElementId ChildElementId
--------------- --------------
1 2
1 3
1 4
3 4
2 3
2 4
3 4
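For anyone who wants to try the parent-to-child recursion without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It is an approximation only: SQLite needs the RECURSIVE keyword and has no temp-table `#` prefix, and the helper name `descendant_edges` is made up for illustration. The SELECT DISTINCT at the end removes the repeated {3,4} row.

```python
import sqlite3


def descendant_edges(conn, start_id):
    """Return the distinct (parent, child) edges reachable from start_id."""
    return conn.execute(
        """
        WITH RECURSIVE cte(ParentElementId, ChildElementId) AS (
            -- Anchor member: edges that leave the starting element
            SELECT ParentElementId, ChildElementId
            FROM ElementRelation
            WHERE ParentElementId = ?
            UNION ALL
            -- Recursive member: edges that leave any child found so far
            SELECT c.ParentElementId, c.ChildElementId
            FROM ElementRelation AS c
            JOIN cte AS d ON d.ChildElementId = c.ParentElementId
        )
        SELECT DISTINCT ParentElementId, ChildElementId
        FROM cte
        ORDER BY ParentElementId, ChildElementId
        """,
        (start_id,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ElementRelation (ParentElementId INT, ChildElementId INT)")
conn.executemany(
    "INSERT INTO ElementRelation VALUES (?, ?)",
    [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)],
)

print(descendant_edges(conn, 2))  # [(2, 3), (2, 4), (3, 4)]
```

Note that with UNION ALL this only terminates because the sample data has no cycles.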

Doing this in a single query is going to be difficult. A CTE is probably the way to go, but removing duplicates could be an issue, and there is always the possibility of cycles. Alternatively, you could loop and add all children that aren't already in your list:
declare @ids table (id int primary key);
insert @ids values (1); -- id you are finding descendants for
while @@ROWCOUNT > 0
begin
insert @ids (id)
select distinct e.ChildElementId
from @ids i
join ElementRelation e on e.ParentElementId = i.id
where not exists (select 1 from @ids i2 where i2.id = e.ChildElementId);
end
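The same idea as the loop above, sketched in plain Python to show why it copes with duplicates and cycles: each pass only adds ids that have not been seen yet, so it stops once a pass adds nothing. The function name `descendants` and the extra (4, 1) edge are assumptions made up for the illustration.

```python
def descendants(edges, start):
    """Collect every id reachable from start, visiting each id only once."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)

    seen = {start}          # plays the role of the @ids table
    frontier = [start]      # ids added by the previous pass
    while frontier:         # equivalent to WHILE @@ROWCOUNT > 0
        next_frontier = []
        for node in frontier:
            for child in children.get(node, []):
                if child not in seen:   # the NOT EXISTS check
                    seen.add(child)
                    next_frontier.append(child)
        frontier = next_frontier
    return seen


edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
cyclic = edges + [(4, 1)]  # hypothetical edge that closes a cycle

print(sorted(descendants(edges, 2)))   # [2, 3, 4]
print(sorted(descendants(cyclic, 2)))  # [1, 2, 3, 4]
```

As in the T-SQL version, the starting id itself is part of the result.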

Related

Add a where clause in SQL Server CTE

I have used a SQL Server CTE for selecting data from my self-referencing table whose schema is
CREATE TABLE BaseDomainTable
(
[BaseModelId] INT,
[Comments] NVARCHAR(100) NULL,
[ParentModelId] INT NULL
)
Each ModelId has comments and may or may not have a parent, i.e. any model can be a first model, and any model can start a new branch of its own from any parent.
INSERT INTO BaseDomainTable ([BaseModelId],[Comments],[ParentModelId])
VALUES (1, 'Comments 1', NULL), (2, 'Comments 2', 1),
(3, 'Comments for 3', 2), (4, 'Comments 4', 2)
After this insert, 1 is my base parent, 2 is derived from 1, 3 and 4 are derived from 2.
To get the data in a hierarchical format I have added a cte.
WITH ParentCTECheck (BaseModelId, Comments, ParentModelId) AS
(
SELECT
Parent.BaseModelId, Parent.Comments, Parent.ParentModelId
FROM
BaseDomainTable Parent
WHERE
Parent.ParentModelId IS NULL
AND Parent.Comments IS NOT NULL
UNION ALL
SELECT
Derived.BaseModelId, Derived.Comments, Derived.ParentModelId
FROM
BaseDomainTable Derived
JOIN
ParentCTECheck ON ParentCTECheck.BaseModelId = Derived.ParentModelId
WHERE
Derived.ParentModelId IS NOT NULL
AND Derived.Comments IS NOT NULL
)
SELECT *
FROM ParentCTECheck
And I am getting this output:
BaseModelId Comments ParentModelId
------------------------------------------
1 Comments 1 NULL
2 Comments 2 1
3 Comments for 3 2
4 Comments 4 2
I want to change it so that if I pass in BaseModelId 4, the CTE walks up from 4, skips everything related to 3, and returns the data for 4, 2, 1. Likewise, when I pass 2 it should skip both 3 and 4 and return only 2 and 1.
Is there a way that this can be done?
You have to traverse the tree bottom up:
;WITH ParentCTECheck AS (
-- Anchor query: get leaf node
SELECT BaseModelId, Comments, ParentModelId
FROM BaseDomainTable
WHERE BaseModelId = 4 -- <-- Id of leaf node
UNION ALL
-- Recursive query: go up the tree and get next level nodes.
-- Recursion terminates once the root node (ParentModelId IS NULL) has been reached.
SELECT bt.BaseModelId, bt.Comments, bt.ParentModelId
FROM BaseDomainTable AS bt
JOIN ParentCTECheck AS ct ON bt.BaseModelId = ct.ParentModelId
)
SELECT *
FROM ParentCTECheck
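A quick way to check the bottom-up traversal is to run the same shape of query in SQLite from Python. This is only a sketch: the `depth` column and the `ancestors` helper are additions of mine, used to return the chain in leaf-to-root order, and SQLite syntax differs slightly from SQL Server (RECURSIVE keyword).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BaseDomainTable (BaseModelId INT, Comments TEXT, ParentModelId INT)"
)
conn.executemany(
    "INSERT INTO BaseDomainTable VALUES (?, ?, ?)",
    [
        (1, "Comments 1", None),
        (2, "Comments 2", 1),
        (3, "Comments for 3", 2),
        (4, "Comments 4", 2),
    ],
)


def ancestors(conn, model_id):
    """Return model_id followed by its parent, grandparent, ... up to the root."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(BaseModelId, ParentModelId, depth) AS (
            -- Anchor member: the node we start from
            SELECT BaseModelId, ParentModelId, 0
            FROM BaseDomainTable
            WHERE BaseModelId = ?
            UNION ALL
            -- Recursive member: step up to the parent of the previous row
            SELECT bt.BaseModelId, bt.ParentModelId, chain.depth + 1
            FROM BaseDomainTable AS bt
            JOIN chain ON bt.BaseModelId = chain.ParentModelId
        )
        SELECT BaseModelId FROM chain ORDER BY depth
        """,
        (model_id,),
    ).fetchall()
    return [row[0] for row in rows]


print(ancestors(conn, 4))  # [4, 2, 1]
print(ancestors(conn, 2))  # [2, 1]
```

The recursion stops by itself at the root, because a NULL ParentModelId never matches any BaseModelId in the join.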

How can I automatically update all other rows in the table after insert one new one in sql?

I have a table like this where I keep a list of appointments which increment based on the combination of A and B:
ID A B Appointment Count
-----------------------------
1 abc 0 2010-10-20 1
2 abc 0 2010-10-25 2
3 abc 0 2010-10-30 3
3 abc 1 2010-10-30 1
4 xyz 1 2010-08-18 1
5 xyz 1 2010-08-19 2
6 xyz 1 2010-08-20 3
And a function like this:
CREATE FUNCTION dbo.GenerateCount
(
@id int,
@A int,
@B int,
@appt_date date
)
RETURNS Int
AS
BEGIN
RETURN
(
SELECT COUNT(*)
FROM dbo.test_seq
WHERE A = @A
AND B = @B
AND id <= @id
AND appt_date <= @appt_date
)
END
With the table defined like this:
CREATE TABLE [dbo].[test_seq](
[id] [int] IDENTITY(1,1) NOT NULL,
[A] [int] NOT NULL,
[B] [int] NOT NULL,
[appt_date] [date] NOT NULL,
[count] AS dbo.GenerateCount(id, A, B, appt_date)
)
When I insert a new entry in the table, it increments the count as expected. However if I insert a new entry with a date in the middle, say if I want to add:
ID A B Appointment Count
-----------------------------
1 abc 0 2010-10-21
it has the correct count assigned, but the other rows don't get updated. How can I trigger a table update for all the other records after that date so they are corrected with the relevant count values?
I tried creating a trigger on insert/update/delete, but that only applies to the row being inserted and not the whole table.
If I get this correctly, the simple answer is: Don't!
A SQL Server table is not Excel...
You must decide:
Do you want to store the value persistently (in other words, as a kind of key)?
Or do you just want to number the rows at query time?
Create a VIEW upon your table (according to the approach you find in your last question).
This will compute the correct numbers whenever you call that.
Why not just create a view?
SELECT seq.*
, row_number() over (partition by A, B order by appt_date, id) as [count]
FROM dbo.test_seq seq
Why the AND id <= @id? That is just going to break stuff if you do insert a date in the middle.
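To see why numbering at query time avoids the stale-count problem, here is a small Python sketch of what row_number() over (partition by A, B order by date, id) computes. The function name `numbered` and the id 7 for the late-inserted row are assumptions for the example; the other rows follow the sample data in the question.

```python
from itertools import groupby


def numbered(rows):
    """rows are (id, a, b, appt_date) tuples; append a per-(a, b) running count."""
    # Sort by partition key first, then by the ordering inside each partition
    ordered = sorted(rows, key=lambda r: (r[1], r[2], r[3], r[0]))
    result = []
    for _, group in groupby(ordered, key=lambda r: (r[1], r[2])):
        for count, row in enumerate(group, start=1):
            result.append(row + (count,))
    return result


rows = [
    (1, "abc", 0, "2010-10-20"),
    (2, "abc", 0, "2010-10-25"),
    (3, "abc", 0, "2010-10-30"),
    (7, "abc", 0, "2010-10-21"),  # inserted later, but dated in the middle
]

for row in numbered(rows):
    print(row)
# (1, 'abc', 0, '2010-10-20', 1)
# (7, 'abc', 0, '2010-10-21', 2)
# (2, 'abc', 0, '2010-10-25', 3)
# (3, 'abc', 0, '2010-10-30', 4)
```

Because the counts are derived every time the rows are read, nothing has to be updated when a row lands in the middle of the date range.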

How do I look for matching data in SQL?

I have a database table for a todo list application, and I need a way to track tasks that depend on other tasks. I already have a table with ID, title, description, IsComplete and a DependsOnTask column, which contains the unique identifier of the task that a given task depends on.
The problem is that when I try the query below, it doesn't return any results!
SELECT TOP 1000 [id]
,[title]
,[description]
,[complete]
,[DependsOnTask]
FROM [master].[dbo].[ToDoItems] where ToDoItems.id =ToDoItems.DependsOnTask;
So my question is, is there a way to find all records with a unique identifier matching DependsOnTask?
Thanks in advance :)
You are missing a JOIN:
SELECT tdi.*, dot.*
FROM dbo.ToDoItems tdi JOIN
dbo.ToDoItems dot
ON dot.id = tdi.DependsOnTask;
This returns all tasks where DependsOnTask is not null, along with information from that record.
Notes:
You don't need to use square braces when they are not necessary. They just clutter up queries.
Use table aliases and qualify column names, so you know where columns are coming from.
You need to use an explicit JOIN for references back to the same table.
If you have a hierarchical structure, where a task can have a parent that is itself a child of another task, you can use a recursive CTE to find the whole hierarchy of a given task.
Let me show an example.
Say you have a structure like this:
SELECT *
FROM (VALUES
(1,'Title1','Do some stuff 1', 0, NULL),
(2,'Title2','Do some stuff 2', 0, NULL),
(3,'Title3','Do some stuff 3', 1, 1),
(4,'Title4','Do some stuff 4', 1, 1),
(5,'Title5','Do some stuff 5', 0, 2),
(6,'Title6','Do some stuff 6', 1, 2),
(7,'Title7','Do some stuff 7', 0, 4),
(8,'Title8','Do some stuff 8', 0, NULL)
) as t([id],[title],[description],[complete],[DependsOnTask])
So task 1 has 2 child tasks, 3 and 4, and task 4 has 1 child, 7. Suppose you want to get all child tasks of the task with id = 1:
DECLARE @taskid int = 1
;WITH cte AS (
SELECT [id]
,[title]
,[description]
,[complete]
,[DependsOnTask]
FROM [ToDoItems]
WHERE [id] = @taskid
UNION ALL
SELECT t.*
FROM [ToDoItems] t
INNER JOIN cte c
ON c.id = t.DependsOnTask
)
SELECT *
FROM cte
Output:
id title description complete DependsOnTask
1 Title1 Do some stuff 1 0 NULL
3 Title3 Do some stuff 3 1 1
4 Title4 Do some stuff 4 1 1
7 Title7 Do some stuff 7 0 4
So if you change the last select to:
SELECT @taskid as main,
id,
DependsOnTask
FROM cte
You will get:
main id DependsOnTask
1 1 NULL
1 3 1
1 4 1
1 7 4
So you get all child tasks of Task1.
If you change CTE like this:
;WITH cte AS (
SELECT [id]
,[title]
,[description]
,[complete]
,[DependsOnTask]
,[id] as Parent
FROM [ToDoItems]
WHERE [DependsOnTask] IS NULL
UNION ALL
SELECT t.*,
c.Parent
FROM [ToDoItems] t
INNER JOIN cte c
ON c.id = t.DependsOnTask
)
SELECT Parent,
id,
DependsOnTask
FROM cte
You will get everything you need: the parent task, its child tasks, and what they depend on:
Parent id DependsOnTask
1 1 NULL
2 2 NULL
8 8 NULL
2 5 2
2 6 2
1 3 1
1 4 1
1 7 4

How to copy a node's children in an adjacency list

I have an adjacency list hierarchy model that makes up a topic structure
ID Parent_Id Topic_Name
1 Null Topic 1
2 Null Topic 2
3 2 Topic 3
4 3 Topic 4
5 2 Topic 5
6 Null Topic 6
I want to specify a topic id and then copy it to a new topic id at a certain position and retain the levels / structure underneath
So in my example I could specify topic_id 2 with pos_id 1 and it would create
ID Parent_Id Topic_Name
1 Null Topic 1
7 Null Topic 2
8 7 Topic 3
9 8 Topic 4
10 7 Topic 5
2 Null Topic 2
3 2 Topic 3
4 3 Topic 4
5 2 Topic 5
6 Null Topic 6
topic_id being the node to copy and pos_id is the node to insert the copy after
Auto numbering is on for the ID, but I can't guarantee that subnodes will always be the next id number up from the parent.
I think you can do this in a single statement. Here is the idea.
First, expand the data for all parents (at whatever level) for each id. This uses a recursive CTE.
Then, go back to the original list and choose only those who are descendants of 2.
Then assign a new id to each of the ids found in this group. The following query gets that maximum id and adds a row_number() constant to it.
Then, for each record in the subtree, lookup the new id's in the record, and then insert the results.
The following query takes this approach. I haven't tested it:
with Parents as (
-- Anchor: pair every node with itself
select id, id as ancestor_id
from AdjList
union all
-- Recursive step: pair each node with the next ancestor up the tree
select p.id, al.Parent_Id
from Parents p join
AdjList al
on al.id = p.ancestor_id
where al.Parent_Id is not null
),
LookingFor as (
-- Node 2 together with all of its descendants
select *
from AdjList
where id in (select id from Parents where ancestor_id = 2)
),
NewIds as (
select t.id, const.maxid + ROW_NUMBER() over (order by t.id) as new_id
from LookingFor t cross join
(select MAX(id) as maxid from AdjList) const
)
-- If Id is an IDENTITY column, explicit values need SET IDENTITY_INSERT AdjList ON
insert into AdjList(Id, Parent_Id, Topic_Name)
-- Parents inside the subtree map to their new ids; the copied root keeps its original parent
select ni1.new_id, coalesce(ni2.new_id, lf.Parent_Id), lf.Topic_Name
from LookingFor lf join
NewIds ni1
on lf.id = ni1.id left outer join
NewIds ni2
on lf.Parent_Id = ni2.id
You might want to have a look at the nested set model, which I think would be much better for your purpose.
Great explanation here:
http://en.wikipedia.org/wiki/Nested_set_model
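For comparison, the id-remapping step of the copy can be sketched outside the database. The Python below is an illustration only: `copy_subtree` is a made-up helper, it assumes the data really is a tree (no cycles), and it does not deal with pos_id, since the sample table has no ordering column.

```python
def copy_subtree(rows, topic_id):
    """rows are (id, parent_id, name) tuples; return new rows copying topic_id's subtree."""
    by_id = {}
    children = {}
    for row in rows:
        by_id[row[0]] = row
        children.setdefault(row[1], []).append(row[0])

    # Depth-first walk so that every parent is numbered before its children
    order = []
    stack = [topic_id]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(reversed(children.get(node, [])))

    # Hand out fresh ids above the current maximum
    next_id = max(by_id) + 1
    new_id = {old: next_id + offset for offset, old in enumerate(order)}

    copies = []
    for old in order:
        _, parent, name = by_id[old]
        # Parents inside the subtree are remapped; the copied root keeps its parent
        copies.append((new_id[old], new_id.get(parent, parent), name))
    return copies


topics = [
    (1, None, "Topic 1"),
    (2, None, "Topic 2"),
    (3, 2, "Topic 3"),
    (4, 3, "Topic 4"),
    (5, 2, "Topic 5"),
    (6, None, "Topic 6"),
]

for row in copy_subtree(topics, 2):
    print(row)
# (7, None, 'Topic 2')
# (8, 7, 'Topic 3')
# (9, 8, 'Topic 4')
# (10, 7, 'Topic 5')
```

The ids 7 to 10 match the rows shown in the question's expected result.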

How do I get the "Next available number" from an SQL Server? (Not an Identity column)

Technologies: SQL Server 2008
So I've tried a few options that I've found on SO, but nothing really provided me with a definitive answer.
I have a table with two columns, (Transaction ID, GroupID) where neither has unique values. For example:
TransID | GroupID
-----------------
23 | 4001
99 | 4001
63 | 4001
123 | 4001
77 | 2113
2645 | 2113
123 | 2113
99 | 2113
Originally, the groupID was just chosen at random by the user, but now we're automating it. Thing is, we're keeping the existing DB without any changes to the existing data(too much work, for too little gain)
Is there a way to query "GroupID" on table "GroupTransactions" for the next available value of GroupID > 2000?
I think from the question you're after the next available, although that may not be the same as max+1 right? - In that case:
Start with a list of integers, and look for those that aren't there in the groupid column, for example:
;WITH CTE_Numbers AS (
SELECT n = 2001
UNION ALL
SELECT n + 1 FROM CTE_Numbers WHERE n < 4000
)
SELECT top 1 n
FROM CTE_Numbers num
WHERE NOT EXISTS (SELECT 1 FROM MyTable tab WHERE num.n = tab.groupid)
ORDER BY n
OPTION (MAXRECURSION 2000) -- the default limit of 100 recursion levels is too low for 2001..4000
Note: you need to tweak the 2001/4000 values in the CTE to allow for the range you want. I assumed the name of your table to be MyTable.
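The logic of "walk the candidate numbers and take the first one that is not used" is easy to check in a few lines of Python. This is a sketch of the idea, not of the query plan; `next_free` and its `floor` parameter are names made up for the example.

```python
def next_free(used, floor=2000):
    """Return the smallest integer greater than floor that is not in used."""
    taken = set(used)
    candidate = floor + 1
    while candidate in taken:
        candidate += 1
    return candidate


# GroupID values from the sample table in the question
group_ids = [4001, 4001, 4001, 4001, 2113, 2113, 2113, 2113]

print(next_free(group_ids))           # 2001
print(next_free([2001, 2002, 2004]))  # 2003
```

Unlike max(groupid) + 1, this reuses gaps, which may or may not be what you want.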
select max(groupid) + 1 from GroupTransactions
The following will find the next gap above 2000:
SELECT MIN(t.GroupID)+1 AS NextID
FROM GroupTransactions t (updlock)
WHERE NOT EXISTS
(SELECT NULL FROM GroupTransactions n WHERE n.GroupID=t.GroupID+1 AND n.GroupID>2000)
AND t.GroupID>2000
There are always many ways to do everything. I resolved this problem by doing like this:
declare @i int = null
declare @t table (i int)
insert into @t values (1)
insert into @t values (2)
--insert into @t values (3)
--insert into @t values (4)
insert into @t values (5)
--insert into @t values (6)
--get the first missing number
select @i = min(RowNumber)
from (
select ROW_NUMBER() OVER(ORDER BY i) AS RowNumber, i
from (
--select distinct in case a number is in there multiple times
select distinct i
from @t
--start after 0 in case there are negative or 0 number
where i > 0
) as a
) as b
where RowNumber <> i
--if there are no missing numbers or no records, get the max record
if @i is null
begin
select @i = isnull(max(i),0) + 1 from @t
end
select @i
In my situation I have a system that generates message numbers or file/case/reservation numbers sequentially from 1 every year. In some situations a number does not get used (the user was testing or practicing, or whatever the reason) and the number is deleted.
You can use a where clause to filter by year if all entries are in the same table, and make it dynamic (my example is hardcoded). If you archive your yearly data, the filter is not needed. The sub-query parts for mID and mID2 must be identical.
The "union select 0 as seq" for mID is there in case your table is empty; this is the base seed number. It can be anything, e.g. 3000000 or {prefix}0000. The field is an integer. If you omit "union select 0 as seq", the query will not work on an empty table, and when the table is missing ID 1 it will give you the ID after the first existing one (if the first number is 4, the value returned will be 5).
This query is very quick (hint: the field must be indexed); it was tested on a table of 100,000+ rows. I found that using a domain aggregate gets slower as the table increases in size.
If you remove the "top 1" you will get a list of 'next numbers', but not all the missing numbers in a sequence; i.e. if you have 1 2 4 7 the result will be 3 5 8.
set @newID = (select top 1 mID.seq + 1 as seq from
(select a.[msg_number] as seq from [tblMSG] a --where a.[msg_date] between '2023-01-01' and '2023-12-31'
union select 0 as seq ) as mID
left outer join
(Select b.[msg_number] as seq from [tblMSG] b --where b.[msg_date] between '2023-01-01' and '2023-12-31'
) as mID2 on mID.seq + 1 = mID2.seq where mID2.seq is null order by mID.seq)
-- Next: a statement to insert a row with @newID immediately in tblMSG (in a transaction block).
-- Then the row can be updated by your app.
