I need to write a query for a database structure where a parent element has 0-many child elements, but a child element can "have" many parents.
Note that the problem I'm trying to solve is not the one where the child element's definition contains its parent element, where you can simply write a recursive CTE that "starts" at a child and joins back up to itself until it hits a root element (i.e., an element with a NULL ParentID).
For this I need specifically to start at a parent and work my way down finding all children, grandchildren etc. So my database structure currently is as follows:
create table Element (
Id int identity(1, 1) not null,
Name varchar(100) not null,
-- other stuff ...
)
create table ElementRelation (
ParentElementId int not null, -- foreign key to Element.Id
ChildElementId int not null, -- foreign key to Element.Id
-- other stuff...
)
select * from Element
/* returns
Id | Name
------|---------
1 | ElementA
2 | ElementB
3 | ElementC
4 | ElementD
*/
select * from ElementRelation
/* returns
ParentElementId | ChildElementId
----------------|---------------
1 | 2
1 | 3
1 | 4
2 | 3
2 | 4
3 | 4
*/
Which results in this tree structure (pardon the crude Paint doodle):
So you can see the typical solution of having a leaf-first table with a ParentId foreign key column doesn't work - element 4 has three immediate parents, element 3 has two, etc. It would be inappropriate for the child to declare its parent elements.
What I effectively need is a query that, given a starting parent element, finds all its immediate children, then finds all immediate children of each of those children, and so on until every path has reached a leaf node. With this example data, a query against element 1 would return {1,2}, {1,3}, {1,4}, {2,3}, {2,4}, {3,4}, {3,4} (it doesn't matter whether or not the query returns only a distinct list), and a query against element 2 would return {2,4}, {2,3}, {3,4}.
I could solve this with a cursor, but if there's a faster set-based means that would be preferred. If there's a better approach that would redefine the fundamental structure, that's also acceptable.
In terms of "what have you tried?" - several variants on a CTE query based on child-to-parent recursion, none of which came close to solving the problem, so I won't share them here.
One option would be a recursive CTE:
DROP TABLE IF EXISTS [#ElementRelation];
CREATE TABLE [#ElementRelation]
(
[ParentElementId] INT
, [ChildElementId] INT
);
INSERT INTO [#ElementRelation] (
[ParentElementId]
, [ChildElementId]
)
VALUES ( 1, 2 )
, ( 1, 3 )
, ( 1, 4 )
, ( 2, 3 )
, ( 2, 4 )
, ( 3, 4 );
DECLARE @Start INT = 1;
;WITH [cte]
AS (
-- Anchor member: the rows where the recursion starts
SELECT [p].[ParentElementId]
, [p].[ChildElementId]
FROM [#ElementRelation] [p]
WHERE [p].[ParentElementId] = @Start
UNION ALL
-- Recursive member: select from the table again, joined to the CTE itself; note the ON clause [d].[ChildElementId] = [c].[ParentElementId]
SELECT [c].[ParentElementId]
, [c].[ChildElementId]
FROM [#ElementRelation] [c]
INNER JOIN [cte] [d]
ON [d].[ChildElementId] = [c].[ParentElementId] )
SELECT *
FROM [cte];
Giving you the results:
ParentElementId ChildElementId
--------------- --------------
1 2
1 3
1 4
3 4
2 3
2 4
3 4
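For anyone who wants to try the same recursion without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite's WITH RECURSIVE syntax rather than T-SQL, and the function name descendant_edges is made up for illustration. It also uses UNION instead of UNION ALL, which SQLite accepts in a recursive CTE and which drops the repeated {3,4} row; SQL Server requires UNION ALL there, so in T-SQL you would deduplicate in the outer SELECT instead.

```python
import sqlite3

def descendant_edges(conn, start_id):
    """Return every (parent, child) edge reachable from start_id, without duplicates."""
    sql = """
        WITH RECURSIVE cte(ParentElementId, ChildElementId) AS (
            -- anchor: edges leaving the starting element
            SELECT ParentElementId, ChildElementId
            FROM ElementRelation
            WHERE ParentElementId = ?
            UNION  -- UNION (not UNION ALL) discards rows already produced
            -- recursive member: edges leaving any child found so far
            SELECT c.ParentElementId, c.ChildElementId
            FROM ElementRelation AS c
            JOIN cte AS d ON d.ChildElementId = c.ParentElementId
        )
        SELECT ParentElementId, ChildElementId
        FROM cte
        ORDER BY ParentElementId, ChildElementId
    """
    return conn.execute(sql, (start_id,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ElementRelation ("
    "ParentElementId INT NOT NULL, ChildElementId INT NOT NULL)"
)
conn.executemany(
    "INSERT INTO ElementRelation VALUES (?, ?)",
    [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)],
)

edges_from_1 = descendant_edges(conn, 1)
edges_from_2 = descendant_edges(conn, 2)
print(edges_from_1)
print(edges_from_2)
```

Starting from element 2 this returns only the three edges below it, which matches what the question asks for.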
Doing this in a single query is going to be difficult. A CTE is probably the way to go, but removing duplicates could be an issue, and there is always the possibility of cycles. Alternatively, you could loop and add all children that aren't already in your list:
declare @ids table (id int primary key);
insert @ids values (1); -- id you are finding descendants for
while @@ROWCOUNT > 0
begin
insert @ids (id)
select distinct e.ChildElementId
from @ids i
join ElementRelation e on e.ParentElementId = i.id
where not exists (select 1 from @ids i2 where i2.id = e.ChildElementId);
end
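The loop above is essentially a breadth-first walk with a "seen" set. As a rough illustration of why that approach copes with duplicates and cycles, here is the same idea in plain Python. The function name descendants and the extra (4, 1) back-edge are assumptions added for the example; unlike the table variable, the function drops the starting id from its result.

```python
from collections import deque

def descendants(edges, start):
    """Breadth-first walk over (parent, child) pairs; each id is visited once."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:  # plays the role of the NOT EXISTS check
                seen.add(child)
                queue.append(child)
    seen.discard(start)
    return sorted(seen)

edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
cyclic = edges + [(4, 1)]  # hypothetical back-edge, to show the walk still ends

from_1 = descendants(edges, 1)
from_2_cyclic = descendants(cyclic, 2)
print(from_1)
print(from_2_cyclic)
```

Because an id is only queued the first time it is seen, the walk terminates even when the data contains a cycle.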
I have a database table for a todo list application, and I need a way to track tasks which are dependent on other tasks. I already have a table with ID, title, description, IsComplete and a DependsOnTask column, which contains the unique identifier of the task that a given task depends on.
The problem is, when I try the query below in SQL it doesn't give any results!
SELECT TOP 1000 [id]
,[title]
,[description]
,[complete]
,[DependsOnTask]
FROM [master].[dbo].[ToDoItems] where ToDoItems.id =ToDoItems.DependsOnTask;
So my question is, is there a way to find all records with a unique identifier matching DependsOnTask?
Thanks in advance :)
You are missing a JOIN:
SELECT tdi.*, dot.*
FROM dbo.ToDoItems tdi JOIN
dbo.ToDoItems dot
ON dot.id = tdi.DependsOnTask;
This returns all tasks where DependsOnTask is not null, along with information from that record.
Notes:
You don't need square brackets around identifiers unless they are actually required. They just clutter up queries.
Use table aliases and qualify column names, so you know where columns are coming from.
You need to use an explicit JOIN for references back to the same table.
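To make the difference concrete, here is a small sketch using Python's sqlite3 module. The three sample rows are invented for the example, and the table is trimmed to the columns that matter. The original WHERE filter compares each row with itself, so it can only match a task that depends on itself; the self-join pairs each task with the task it depends on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ToDoItems ("
    "id INTEGER PRIMARY KEY, title TEXT, complete INT, DependsOnTask INT)"
)
conn.executemany(
    "INSERT INTO ToDoItems VALUES (?, ?, ?, ?)",
    [
        (1, "Buy paint", 1, None),   # hypothetical sample rows
        (2, "Paint wall", 0, 1),
        (3, "Hang picture", 0, 2),
    ],
)

# Original filter: both columns come from the same row, so nothing matches.
no_join = conn.execute(
    "SELECT id FROM ToDoItems WHERE id = DependsOnTask"
).fetchall()

# Self-join: tdi is the dependent task, dot is the task it depends on.
joined = conn.execute(
    """
    SELECT tdi.id, tdi.title, dot.id, dot.title
    FROM ToDoItems AS tdi
    JOIN ToDoItems AS dot ON dot.id = tdi.DependsOnTask
    ORDER BY tdi.id
    """
).fetchall()

print(no_join)
print(joined)
```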
If you have a hierarchical structure, where a task can have a parent and that parent is itself a child of another task, you can use a recursive CTE to find the whole hierarchy below a given task.
Let me show an example.
Say you have a structure like this:
SELECT *
FROM (VALUES
(1,'Title1','Do some stuff 1', 0, NULL),
(2,'Title2','Do some stuff 2', 0, NULL),
(3,'Title3','Do some stuff 3', 1, 1),
(4,'Title4','Do some stuff 4', 1, 1),
(5,'Title5','Do some stuff 5', 0, 2),
(6,'Title6','Do some stuff 6', 1, 2),
(7,'Title7','Do some stuff 7', 0, 4),
(8,'Title8','Do some stuff 8', 0, NULL)
) as t([id],[title],[description],[complete],[DependsOnTask])
So task 1 has 2 child tasks, 3 and 4, and task 4 has 1 child, 7. You want to get all child tasks of the task with id = 1:
DECLARE @taskid int = 1
;WITH cte AS (
SELECT [id]
,[title]
,[description]
,[complete]
,[DependsOnTask]
FROM [ToDoItems]
WHERE [id] = @taskid
UNION ALL
SELECT t.*
FROM [ToDoItems] t
INNER JOIN cte c
ON c.id = t.DependsOnTask
)
SELECT *
FROM cte
Output:
id title description complete DependsOnTask
1 Title1 Do some stuff 1 0 NULL
3 Title3 Do some stuff 3 1 1
4 Title4 Do some stuff 4 1 1
7 Title7 Do some stuff 7 0 4
So if you change the last select to:
SELECT @taskid as main,
id,
DependsOnTask
FROM cte
You will get:
main id DependsOnTask
1 1 NULL
1 3 1
1 4 1
1 7 4
So you get all child tasks of Task1.
If you change the CTE like this:
;WITH cte AS (
SELECT [id]
,[title]
,[description]
,[complete]
,[DependsOnTask]
,[id] as Parent
FROM [ToDoItems]
WHERE [DependsOnTask] IS NULL
UNION ALL
SELECT t.*,
c.Parent
FROM [ToDoItems] t
INNER JOIN cte c
ON c.id = t.DependsOnTask
)
SELECT Parent,
id,
DependsOnTask
FROM cte
You will get everything you need: the parent task, its child tasks, and what each of them depends on:
Parent id DependsOnTask
1 1 NULL
2 2 NULL
8 8 NULL
2 5 2
2 6 2
1 3 1
1 4 1
1 7 4
I'm trying to build a CTE which will pull back all records which are related to a given, arbitrary record in the database.
Create table Requests (
Id bigint,
OriginalId bigint NULL,
FollowupId bigint NULL
)
insert into Requests VALUES (1, null, 3)
insert into Requests VALUES (2, 1, 8)
insert into Requests VALUES (3, 1, 4)
insert into Requests VALUES (4, 3, null)
insert into Requests VALUES (5, null, null)
insert into Requests VALUES (6, null, 7)
insert into Requests VALUES (7, 6, null)
insert into Requests VALUES (8, 2, null)
OriginalId is always the Id of a previous record (or null). FollowupId points to the most recent followup record (which, in turn, points back via OriginalId) and can probably be ignored, but it's there if it's helpful.
I can easily pull back either all ancestors or all descendants of a given record using the following CTE
;With TransactionList (Id, originalId, followupId, Steps)
AS
(
Select Id, originalId, followupId, 0 as Steps from requests where Id = @startId
union all
select reqs.Id, reqs.originalId, reqs.followupId, Steps + 1 from requests reqs
inner join TransactionList tl on tl.Id = reqs.originalId --or tl.originalId = reqs.Id
)
SELECT Id from TransactionList
However, if I enable both join conditions, the recursion never terminates, hits the recursion limit, and the query bombs out. Even combining both result sets, I don't get the entire tree, just one branch of it.
I don't care about anything other than the list of Ids. They don't need to be sorted, or to display their relationship or anything. Doesn't hurt, but not necessary. But I need every Id in a given tree to pull back the same list when it's passed as @startId.
As an example of what I'd like to see, this is what the output should be when @startId is set to any value 1-4 or 8:
1
2
3
4
8
And for either 6 or 7, I get back both 6 and 7.
You can just create 2 CTEs.
The first CTE will get the Root of the hierarchy, and the second will use the Root ID to get the descendants of the Root.
;WITH cteRoot AS (
SELECT *, 0 [Level]
FROM Requests
WHERE Id = @startId
UNION ALL
SELECT r.*, [Level] + 1
FROM Requests r
JOIN cteRoot cte ON r.Id = cte.OriginalID
),
cteDesc AS (
SELECT *
FROM cteRoot
WHERE OriginalId IS NULL
UNION ALL
SELECT r.*, [Level] + 1
FROM Requests r
JOIN cteDesc cte ON r.OriginalId = cte.Id
)
SELECT * FROM cteDesc
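As a quick way to check the two-step idea (climb to the root, then descend from it), here is a sketch that runs it against the sample data with Python's sqlite3 module. It assumes SQLite's WITH RECURSIVE syntax rather than T-SQL, and the names up, down and related_ids are hypothetical.

```python
import sqlite3

SQL = """
WITH RECURSIVE
up(Id, OriginalId) AS (
    -- climb from the start row to its root
    SELECT Id, OriginalId FROM Requests WHERE Id = ?
    UNION ALL
    SELECT r.Id, r.OriginalId
    FROM Requests AS r
    JOIN up ON r.Id = up.OriginalId
),
down(Id) AS (
    -- descend from the root found above
    SELECT Id FROM up WHERE OriginalId IS NULL
    UNION ALL
    SELECT r.Id
    FROM Requests AS r
    JOIN down ON r.OriginalId = down.Id
)
SELECT Id FROM down ORDER BY Id
"""

def related_ids(conn, start_id):
    """Return every Id in the same tree as start_id."""
    return [row[0] for row in conn.execute(SQL, (start_id,))]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Requests (Id INTEGER, OriginalId INTEGER, FollowupId INTEGER)"
)
conn.executemany(
    "INSERT INTO Requests VALUES (?, ?, ?)",
    [
        (1, None, 3), (2, 1, 8), (3, 1, 4), (4, 3, None),
        (5, None, None), (6, None, 7), (7, 6, None), (8, 2, None),
    ],
)

for start in (1, 4, 8, 6, 5):
    print(start, related_ids(conn, start))
```

Every Id in the first tree produces the same list, which is the property the question asks for.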
I need to get an ordered hierarchy of a tree, in a specific way. The table in question looks a bit like this (all ID fields are uniqueidentifiers, I've simplified the data for sake of example):
EstimateItemID EstimateID ParentEstimateItemID ItemType
-------------- ---------- -------------------- --------
1 A NULL product
2 A 1 product
3 A 2 service
4 A NULL product
5 A 4 product
6 A 5 service
7 A 1 service
8 A 4 product
Graphical view of the tree structure (* denotes 'service'):
A
___/ \___
/ \
1 4
/ \ / \
2 7* 5 8
/ /
3* 6*
Using this query, I can get the hierarchy (just pretend 'A' is a uniqueidentifier, I know it isn't in real life):
DECLARE @EstimateID uniqueidentifier
SELECT @EstimateID = 'A'
;WITH temp as(
SELECT * FROM EstimateItem
WHERE EstimateID = @EstimateID
UNION ALL
SELECT ei.* FROM EstimateItem ei
INNER JOIN temp x ON ei.ParentEstimateItemID = x.EstimateItemID
)
SELECT * FROM temp
This gives me the children of EstimateID 'A', but in the order in which they appear in the table, i.e.:
EstimateItemID
--------------
1
2
3
4
5
6
7
8
Unfortunately, what I need is an ordered hierarchy with a result set that follows the following constraints:
1. each branch must be grouped
2. records with ItemType 'product' and NULL parent are the top node
3. records with ItemType 'product' and non-NULL parent grouped after top node
4. records with ItemType 'service' are bottom node of a branch
So, the order that I need the results, in this example, is:
EstimateItemID
--------------
1
2
3
7
4
5
8
6
What do I need to add to my query to accomplish this?
Try this:
;WITH items AS (
SELECT EstimateItemID, ItemType
, 0 AS Level
, CAST(EstimateItemID AS VARCHAR(255)) AS Path
FROM EstimateItem
WHERE ParentEstimateItemID IS NULL AND EstimateID = @EstimateID
UNION ALL
SELECT i.EstimateItemID, i.ItemType
, Level + 1
, CAST(Path + '.' + CAST(i.EstimateItemID AS VARCHAR(255)) AS VARCHAR(255))
FROM EstimateItem i
INNER JOIN items itms ON itms.EstimateItemID = i.ParentEstimateItemID
)
SELECT * FROM items ORDER BY Path
With Path, rows are sorted by their parent nodes.
If you want to sort child nodes by ItemType at each level, then you can play with Level and a SUBSTRING of the Path column.
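One thing to watch with a string path: it sorts character by character, so with integer ids a path like 1.10 lands before 1.2. A possible workaround is to zero-pad each segment to a fixed width. The sketch below shows the effect using Python's sqlite3 module; the node table, its four rows and the three-digit width are assumptions made for the example, and printf() is SQLite's formatting function, not T-SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER)")
conn.executemany(
    "INSERT INTO node VALUES (?, ?)",
    [(1, None), (2, 1), (10, 1), (3, 2)],  # hypothetical tree: 1 -> (2 -> 3), 10
)

def ids_by_path(fmt):
    """Build a dotted path per row with the given printf format, then sort by it."""
    sql = """
        WITH RECURSIVE items(id, path) AS (
            SELECT id, printf(:fmt, id)
            FROM node
            WHERE parent_id IS NULL
            UNION ALL
            SELECT n.id, items.path || '.' || printf(:fmt, n.id)
            FROM node AS n
            JOIN items ON items.id = n.parent_id
        )
        SELECT id FROM items ORDER BY path
    """
    return [row[0] for row in conn.execute(sql, {"fmt": fmt})]

plain = ids_by_path("%d")     # paths such as '1.10' and '1.2'
padded = ids_by_path("%03d")  # paths such as '001.010' and '001.002'
print(plain)
print(padded)
```

With padding, node 3 stays directly under its parent 2 and node 10 comes last; without it, 10 jumps ahead of 2.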
This is an add-on to Fabio's great idea from above. As I said in my reply to his original post, I have re-posted his idea using more common data, table names, and fields to make it easier for others to follow.
Thank you Fabio! Great name by the way.
First some data to work with:
CREATE TABLE tblLocations (ID INT IDENTITY(1,1), Code VARCHAR(1), ParentID INT, Name VARCHAR(20));
INSERT INTO tblLocations (Code, ParentID, Name) VALUES
('A', NULL, 'West'),
('A', 1, 'WA'),
('A', 2, 'Seattle'),
('A', NULL, 'East'),
('A', 4, 'NY'),
('A', 5, 'New York'),
('A', 1, 'NV'),
('A', 7, 'Las Vegas'),
('A', 2, 'Vancouver'),
('A', 4, 'FL'),
('A', 5, 'Buffalo'),
('A', 1, 'CA'),
('A', 10, 'Miami'),
('A', 12, 'Los Angeles'),
('A', 7, 'Reno'),
('A', 12, 'San Francisco'),
('A', 10, 'Orlando'),
('A', 12, 'Sacramento');
Now the recursive query:
-- Note: The 'Code' field isn't used, but you could add it to display more info.
;WITH MyCTE AS (
SELECT ID, Name, 0 AS TreeLevel, CAST(ID AS VARCHAR(255)) AS TreePath
FROM tblLocations T1
WHERE ParentID IS NULL
UNION ALL
SELECT T2.ID, T2.Name, TreeLevel + 1, CAST(TreePath + '.' + CAST(T2.ID AS VARCHAR(255)) AS VARCHAR(255)) AS TreePath
FROM tblLocations T2
INNER JOIN MyCTE itms ON itms.ID = T2.ParentID
)
-- Note: The 'replicate' function is not needed. Added it to give a visual of the results.
SELECT ID, Replicate('.', TreeLevel * 4)+Name 'Name', TreeLevel, TreePath
FROM MyCTE
ORDER BY TreePath;
I believe that you need to add the following to the results of your CTE...
BranchID = some kind of identifier that uniquely identifies the branch. Forgive me for not being more specific, but I'm not sure what identifies a branch for your needs. Your example shows a binary tree in which all branches flow back to the root.
ItemTypeID where (for example) 0 = Product and 1 = service.
Parent = identifies the parent.
If those exist in the output, I think you should be able to use the output from your query as either another CTE or as the FROM clause in a query. Order by BranchID, ItemTypeID, Parent.
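A rough sketch of that idea, run against the question's sample data with Python's sqlite3 module. It makes two assumptions that are not in the answer above: the branch is identified by the id of its top-level item, and EstimateItemID is used as the final tie-breaker instead of Parent. The syntax is SQLite's, not T-SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EstimateItem ("
    "EstimateItemID INT, EstimateID TEXT, ParentEstimateItemID INT, ItemType TEXT)"
)
conn.executemany(
    "INSERT INTO EstimateItem VALUES (?, ?, ?, ?)",
    [
        (1, "A", None, "product"), (2, "A", 1, "product"),
        (3, "A", 2, "service"), (4, "A", None, "product"),
        (5, "A", 4, "product"), (6, "A", 5, "service"),
        (7, "A", 1, "service"), (8, "A", 4, "product"),
    ],
)

sql = """
WITH RECURSIVE items(EstimateItemID, ItemType, BranchID) AS (
    -- anchor: each top-level item starts its own branch
    SELECT EstimateItemID, ItemType, EstimateItemID
    FROM EstimateItem
    WHERE ParentEstimateItemID IS NULL AND EstimateID = ?
    UNION ALL
    -- recursive member: children inherit the branch id of their parent
    SELECT i.EstimateItemID, i.ItemType, items.BranchID
    FROM EstimateItem AS i
    JOIN items ON items.EstimateItemID = i.ParentEstimateItemID
)
SELECT EstimateItemID
FROM items
ORDER BY BranchID,
         CASE ItemType WHEN 'product' THEN 0 ELSE 1 END,  -- ItemTypeID
         EstimateItemID
"""
order = [row[0] for row in conn.execute(sql, ("A",))]
print(order)
```

On this data the result matches the order requested in the question. Note that the top node only comes first because it has the lowest id among the products in its branch, so real data may need an explicit "is root" sort column.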
I'm using PostgreSQL's Ltree module for storing hierarchical data. I'm looking to retrieve the full hierarchy sorted by a particular column.
Consider the following table:
votes | path | ...
-------+-------+-----
1 | 1 | ...
2 | 1.1 | ...
4 | 1.2 | ...
1 | 1.2.1 | ...
3 | 2 | ...
1 | 2.1 | ...
2 | 2.1.1 | ...
4 | 2.1.2 | ...
... | ... | ...
In my current implementation, I'd query the database with SELECT * FROM comments ORDER BY path, which would return the whole tree:
Node 1
-- Node 1.1
-- Node 1.2
---- Node 1.2.1
Node 2
-- Node 2.1
---- Node 2.1.1
---- Node 2.1.2
However, I want to sort by votes (not by id, which is what sorting by path amounts to). Each depth level needs to be independently sorted, with the correct tree structure kept intact. Something that would return the following:
Node 2
-- Node 2.1
---- Node 2.1.2
---- Node 2.1.1
Node 1
-- Node 1.2
---- Node 1.2.1
-- Node 1.1
Postgres' WITH RECURSIVE might be appropriate, but I'm not sure. Any ideas?
You were on the right track with WITH RECURSIVE.
Solution with recursive CTE
WITH RECURSIVE t AS (
SELECT t.votes
, t.path
, 1::int AS lvl
, to_char(t2.votes, 'FM0000000') AS sort
FROM tbl t
JOIN tbl t2 ON t2.path = subltree(t.path, 0, 1)
UNION ALL
SELECT t.votes
, t.path
, t.lvl + 1
, t.sort || to_char(t2.votes, 'FM0000000')
FROM t
JOIN tbl t2 ON t2.path = subltree(t.path, 0, t.lvl + 1)
WHERE nlevel(t.path) > t.lvl
)
SELECT votes, path, max(sort) AS sort
FROM t
GROUP BY 1, 2
ORDER BY max(sort), path;
Major points
The crucial part is to replace every level of the path with the value of votes. Thereby we assemble one column we can ORDER BY at the end. This is necessary, because the path has an unknown depth and we cannot order by an unknown number of expressions in static SQL.
In order to get a stable sort, I convert votes to a string with leading zeros using to_char(). I use seven digits in the demo, which works for vote values below 10,000,000. Adjust according to your maximum vote count.
In the final SELECT I exclude all intermediary states to eliminate duplicates. Only the last step with max(sort) remains.
This works in standard SQL with a recursive CTE, but is not very efficient for large trees. A plpgsql function that recursively updates the sort path in a temporary table without creating temporary dupes might perform better.
Only works with the additional module ltree installed, which provides the functions subltree() and nlevel(), as well as the ltree data type.
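To illustrate the core idea of replacing each path level with that level's votes, here is a small plain-Python sketch on the rows from the question. It is a variation, not the query above: instead of zero-padded strings it builds a tuple of (negated votes, label) pairs, so higher votes sort first and siblings with equal votes stay apart. The function name vote_key is made up for the example.

```python
def vote_key(path, votes_by_path):
    """One (negated votes, label) pair per ancestor level of `path`, root first."""
    labels = path.split(".")
    key = []
    for depth in range(1, len(labels) + 1):
        prefix = ".".join(labels[:depth])
        key.append((-votes_by_path[prefix], labels[depth - 1]))
    return tuple(key)

votes_by_path = {
    "1": 1, "1.1": 2, "1.2": 4, "1.2.1": 1,
    "2": 3, "2.1": 1, "2.1.1": 2, "2.1.2": 4,
}

# A parent's key is a prefix of its children's keys, so it sorts before them.
ordered = sorted(votes_by_path, key=lambda p: vote_key(p, votes_by_path))
print(ordered)
```

The resulting order is the one shown as desired in the question: each level is sorted by votes while every node stays under its parent.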
My test setup, for review convenience:
CREATE TEMP TABLE tbl(votes int, path ltree);
INSERT INTO tbl VALUES
(1, '1')
, (2, '1.1')
, (4, '1.2')
, (1, '1.2.1')
, (3, '2')
, (1, '2.1')
, (2, '2.1.1')
, (4, '2.1.2')
, (1, '2.1.3')
, (2, '3')
, (17, '3.3')
, (99, '3.2')
, (10, '3.1.1')
, (2345, '3.1.2')
, (1, '3.1.3')
;
PL/pgSQL table function doing the same
Should be faster with huge trees.
CREATE OR REPLACE FUNCTION f_sorted_ltree()
RETURNS TABLE(votes int, path ltree)
LANGUAGE plpgsql VOLATILE AS
$func$
DECLARE
lvl integer := 0;
BEGIN
CREATE TEMP TABLE t ON COMMIT DROP AS
SELECT tbl.votes
, tbl.path
, ''::text AS sort
, nlevel(tbl.path) AS depth
FROM tbl;
-- CREATE INDEX t_path_idx ON t (path); -- beneficial for huge trees
-- CREATE INDEX t_depth_idx ON t (depth);
LOOP
lvl := lvl + 1;
UPDATE t SET sort = t.sort || to_char(v.votes, 'FM0000000')
FROM (
SELECT t2.votes, t2.path
FROM t t2
WHERE t2.depth = lvl
) v
WHERE v.path = subltree(t.path, 0 ,lvl);
EXIT WHEN NOT FOUND;
END LOOP;
-- Return sorted rows
RETURN QUERY
SELECT t.votes, t.path
FROM t
ORDER BY t.sort;
END
$func$;
Call:
SELECT * FROM f_sorted_ltree();
Read in the manual about setting temp_buffers.
I would be interested which performs faster with your real life data.
create table comments (
id serial,
parent_id int,
msg text,
primary key (id)
);
insert into comments (id, parent_id, msg) values (1, null, 'msg 1');
insert into comments (id, parent_id, msg) values (2, null, 'msg 2');
insert into comments (id, parent_id, msg) values (3, 1, 'msg 1 / ans 1');
insert into comments (id, parent_id, msg) values (4, null, 'msg 3');
insert into comments (id, parent_id, msg) values (5, 2, 'msg 2 / ans 1');
insert into comments (id, parent_id, msg) values (6, 2, 'msg 2 / ans 2');
insert into comments (id, parent_id, msg) values (7, 2, 'msg 2 / ans 3');
Descending order (desc):
WITH RECURSIVE q AS
(
SELECT id, msg, 1 as level, ARRAY[id] as path
FROM comments c
WHERE parent_id is null
UNION ALL
SELECT sub.id, sub.msg, level + 1, path || sub.id
FROM q
JOIN comments sub
ON sub.parent_id = q.id
)
SELECT id, msg, level
FROM q
order by path || array_fill(100500, ARRAY[8 - level]) desc;
results in
4,"msg 3",1
2,"msg 2",1
7,"msg 2 / ans 3",2
6,"msg 2 / ans 2",2
5,"msg 2 / ans 1",2
1,"msg 1",1
3,"msg 1 / ans 1",2
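The array_fill padding is what keeps each parent ahead of its replies under a descending sort: without it, a shorter array sorts before a longer one that extends it, so DESC would put the replies first. A minimal plain-Python sketch of the same trick, using the paths the query builds for the sample comments; SENTINEL and MAX_DEPTH mirror the 100500 and 8 in the query and are assumed to exceed any real id and depth.

```python
SENTINEL = 100500  # assumed larger than any comment id
MAX_DEPTH = 8      # assumed upper bound on nesting depth

# id -> path of ids from the root comment down to this comment
paths = {
    1: [1], 2: [2], 3: [1, 3], 4: [4],
    5: [2, 5], 6: [2, 6], 7: [2, 7],
}

def padded(path):
    """Pad the path to MAX_DEPTH entries so a parent outranks its own replies."""
    return path + [SENTINEL] * (MAX_DEPTH - len(path))

desc_order = sorted(paths, key=lambda i: padded(paths[i]), reverse=True)
naive_desc = sorted(paths, key=lambda i: paths[i], reverse=True)
print(desc_order)
print(naive_desc)
```

The padded version reproduces the order in the results above, while the unpadded one lists replies 7, 6 and 5 before their parent 2.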
Ascending order (asc):
WITH RECURSIVE q AS
(
SELECT id, msg, 1 as level, ARRAY[id] as path
FROM comments c
WHERE parent_id is null
UNION ALL
SELECT sub.id, sub.msg, level + 1, path || sub.id
FROM q
JOIN comments sub
ON sub.parent_id = q.id
)
SELECT id, msg, level
FROM q
--order by path || array_fill(100500, ARRAY[8 - level]) desc;
order by path;
results in
1,"msg 1",1
3,"msg 1 / ans 1",2
2,"msg 2",1
5,"msg 2 / ans 1",2
6,"msg 2 / ans 2",2
7,"msg 2 / ans 3",2
4,"msg 3",1