Common Table Expression basic example - sql-server

I have a recursive tree database table
DataItem
Id (uniqueidentifier)
Parent_Id? (uniqueidentifier)
PositionInParent (int)
I've read some articles about Common Table Expressions, which allow me to read the tree structure recursively, directly from the SQL database, but all of them are very complicated and I cannot make them work.
I am trying to read all the DataItems recursively, starting from the root ones (which have no parent) and adding the child items (ordered by PositionInParent).
Please help me create this simple example, and from there I will add more logic if necessary.

;WITH HierarchyCTE (ParentId, Id, Level)
AS
(
SELECT e.ParentId, e.Id, 0 AS Level
FROM Employees AS e
WHERE ParentId IS NULL
UNION ALL
SELECT e.ParentId, e.Id, Level + 1
FROM Employees AS e
INNER JOIN HierarchyCTE AS h
ON e.ParentId = h.Id
)
SELECT ParentId, Id, Level AS PositionInParent
FROM HierarchyCTE
You can use the condition WHERE ParentId = 0 instead if the ParentId of the top-level parent is 0 rather than NULL.
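The query above returns each row's depth but does not order siblings by PositionInParent, which is what the question asks for. One possible approach is to carry a zero-padded sort path through the recursion and order by it at the end. The following is a minimal sketch, run against SQLite through Python's sqlite3 module as a stand-in for SQL Server; the sample rows, the text ids, the SortPath column and the 8-digit padding are all assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the real SQL Server table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DataItem (Id TEXT PRIMARY KEY, Parent_Id TEXT, PositionInParent INTEGER);
-- Hypothetical sample rows, inserted out of order on purpose.
INSERT INTO DataItem VALUES
  ('root-b', NULL, 2),
  ('root-a', NULL, 1),
  ('a-child-2', 'root-a', 2),
  ('a-child-1', 'root-a', 1),
  ('a1-grandchild', 'a-child-1', 1);
""")

rows = conn.execute("""
WITH RECURSIVE Tree(Id, Parent_Id, Level, SortPath) AS (
    -- Anchor: root items, i.e. rows without a parent.
    SELECT Id, Parent_Id, 0, printf('%08d', PositionInParent)
    FROM DataItem
    WHERE Parent_Id IS NULL
    UNION ALL
    -- Recursive step: append the child's padded position to the parent's path.
    SELECT d.Id, d.Parent_Id, t.Level + 1,
           t.SortPath || '/' || printf('%08d', d.PositionInParent)
    FROM DataItem AS d
    JOIN Tree AS t ON d.Parent_Id = t.Id
)
SELECT Id, Level FROM Tree ORDER BY SortPath
""").fetchall()

ordered_ids = [row[0] for row in rows]
print(ordered_ids)
```

Each parent's path is a prefix of its children's paths, so ordering by SortPath lists every parent directly before its subtree. In SQL Server the same idea would need string padding with RIGHT/CAST instead of printf, and the path column has to be cast to the same type in the anchor and the recursive member.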

Related

Getting non-deterministic results from WITH RECURSIVE cte

I'm trying to create a recursive CTE that traverses all the records for a given ID, and does some operations between ordered records. Let's say I have customers at a bank who get charged a uniquely identifiable fee, and a customer can pay that fee in any number of installments:
WITH recursive payments (
id
, index
, fees_paid
, fees_owed
)
AS (
SELECT id
, index
, fees_paid
, fee_charged
FROM table
WHERE index = 1
UNION ALL
SELECT t.id
, t.index
, t.fees_paid
, p.fees_owed - p.fees_paid
FROM table t
JOIN payments p
ON t.id = p.id
AND t.index = p.index + 1
)
SELECT *
FROM payments
ORDER BY 1,2;
The join logic seems sound, but when I join the output of this query to the source table, I'm getting non-deterministic and incorrect results.
This is my first foray into Snowflake's recursive CTEs. What am I missing in the intermediate result logic that is leading to the non-determinism here?
I assume this is edited code, because in the anchor of your CTE you select a fourth column, fee_charged, which does not exist, and then in the recursion you don't sum the fees paid and so on; basically, your logic seems rather strange.
So creating some random data, that has two different id streams to recurse over:
create or replace table data (id number, index number, val text);
insert into data
select * from values (1,1,'a'),(2,1,'b')
,(1,2,'c'), (2,2,'d')
,(1,3,'e'), (2,3,'f')
v(id, index, val);
Now altering your CTE just a little bit to concatenate those strings together:
WITH RECURSIVE payments AS
(
SELECT id
, index
, val
FROM data
WHERE index = 1
UNION ALL
SELECT t.id
, t.index
, p.val || t.val as val
FROM data t
JOIN payments p
ON t.id = p.id
AND t.index = p.index + 1
)
SELECT *
FROM payments
ORDER BY 1,2;
we get:
ID INDEX VAL
1 1 a
1 2 ac
1 3 ace
2 1 b
2 2 bd
2 3 bdf
Which is exactly what I would expect. So how this relates to your "it gets strange when I join to other stuff" is one of three things: the output of your CTE is not what you expect it to be, or your join to other stuff is not working as you expect, or there is a bug in Snowflake.
Which all comes down to this: if the CTE results are exactly what you expect, create a table from them and join that to your other table, to rule out some form of CTE vs JOIN bug and to debug why your join is not working.
But if your CTE output is not what you expect, then let's help debug that.

SQL Server 2012 CTE Find Root or Top Parent of Hierarchical Data

I'm having an issue trying to recursively walk a hierarchy to find the top node of all descendent nodes in an organizational structure that may have multiple top-level nodes. I'm trying to use a SQL Server 2012 CTE to do so, but it won't recurse to reach the very top node of each branch. I've tried writing my query EXACTLY as shown in other posts relating to this, but still no dice. (At least I think I am.) I'm hoping someone can tell me what I'm doing wrong here? This post most closely relates to what I'm trying to do and I've followed the accepted answers, but I'm still just not "getting it" : Finding a Top Level Parent in SQL
As shown above, I have OrgGroups that reference direct parent groups, unless it's a top level and then it's NULL. For instance, (4) Finance (top-level) -> (5) HR -> (11) Benefits
I want to create a database view that lists each OrgGroup along with the ID of its TOP-MOST ancestor (not its direct parent).
So, for example, the DB view would have a record for the (11) Benefits OrgGroup and a corresponding column value for its top-most parent group ID of (4) Finance.
;WITH OrgStructureIndex AS
(
SELECT O.OrgGroupId, O.Name, O.OrgStructureId, O.ParentGroupId, 1 AS Lvl
FROM OrgGroups O
UNION ALL
SELECT OG.OrgGroupId, OG.Name, OG.OrgStructureId, OG.ParentGroupId, Lvl+1 AS Lvl
FROM OrgGroups OG INNER JOIN OrgStructureIndex OI
ON OI.OrgGroupId = OG.ParentGroupId
)
SELECT * FROM OrgStructureIndex
This results in the Benefits org group having a top-most ParentGroupId of (5) HR. Desired results would be (4) Finance. It also results in duplicate records.
To get rid of the duplicates at least, I've changed my SQL to:
;WITH OrgStructureIndex AS
(
SELECT O.OrgGroupId, O.Name, O.OrgStructureId, O.ParentGroupId, 1 AS Lvl
FROM OrgGroups O
UNION ALL
SELECT OG.OrgGroupId, OG.Name, OG.OrgStructureId, OG.ParentGroupId, Lvl+1 AS Lvl
FROM OrgGroups OG INNER JOIN OrgStructureIndex OI
ON OI.OrgGroupId = OG.ParentGroupId
)
,CTE_RN AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY oi.OrgGroupId ORDER BY oi.Lvl DESC) RN
FROM OrgStructureIndex oi
)
SELECT * FROM CTE_RN
WHERE RN = 1
Where am I falling short here?? TIA
Two shortcomings:
First, for some reason you decided to select all nodes in the anchor part of the CTE, not just the root ones. That's why you have a lot of duplicates.
Second, you don't pass along the only field you actually need - the Id of the actual root.
Here is how you can fix them:
;WITH OrgStructureIndex AS
(
SELECT O.OrgGroupId, O.Name, O.OrgStructureId, O.ParentGroupId, 1 AS Lvl,
-- #2
o.OrgGroupId as [RootGroupId]
FROM OrgGroups O
-- #1
where o.ParentGroupId is null
UNION ALL
SELECT OG.OrgGroupId, OG.Name, OG.OrgStructureId, OG.ParentGroupId, Lvl+1 AS Lvl,
-- #2
oi.RootGroupId
FROM OrgGroups OG INNER JOIN OrgStructureIndex OI
ON OI.OrgGroupId = OG.ParentGroupId
)
SELECT * FROM OrgStructureIndex;
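To see the two fixes working together, here is a minimal sketch run against SQLite through Python's sqlite3 module (a stand-in, not SQL Server). The OrgGroup rows are made-up sample data and the CTE is trimmed to the two columns that matter: the anchor selects only root rows, and every recursive row inherits RootGroupId from its parent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OrgGroup (OrgGroupId INTEGER, Name TEXT, ParentGroupId INTEGER);
-- Hypothetical sample data with two separate top-level groups.
INSERT INTO OrgGroup VALUES
  (1, 'Main', NULL), (2, 'IT', 1), (3, 'DotCom', 2),
  (4, 'Finance', NULL), (5, 'HR', 4), (6, 'Accounting', 4);
""")

pairs = conn.execute("""
WITH RECURSIVE OrgStructureIndex(OrgGroupId, RootGroupId) AS (
    -- Fix #1: anchor on root rows only. Fix #2: a root is its own root.
    SELECT OrgGroupId, OrgGroupId
    FROM OrgGroup
    WHERE ParentGroupId IS NULL
    UNION ALL
    -- Children inherit the root id from the row that reached them.
    SELECT og.OrgGroupId, oi.RootGroupId
    FROM OrgGroup AS og
    JOIN OrgStructureIndex AS oi ON oi.OrgGroupId = og.ParentGroupId
)
SELECT OrgGroupId, RootGroupId FROM OrgStructureIndex
""").fetchall()

# Map each group id to the id of its top-most ancestor.
root_of = dict(pairs)
print(root_of)
```

Because each group is reached exactly once from its own root, there are no duplicate rows and no ROW_NUMBER() filtering is needed.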
You can indeed walk up from the leaf node, as you are doing, to find the root of each original row. The thing you're missing is tracking the starting leaf as you recurse up. Stripped-down example:
CREATE TABLE OrgGroup (OrgGroupId INT, Name VARCHAR(10), ParentGroupId INT)
GO
INSERT INTO OrgGroup VALUES
(1,'Main', NULL),
(2,'IT',1),
(3,'DotCom',2),
(4,'Finance', NULL),
(5,'HR',4),
(6,'Accounting',4)
GO
;WITH cte AS
(
SELECT 1 AS Lvl
,OrgGroupId LeafId
,OrgGroupId
,ParentGroupId
,Name
,Name LeafName
FROM OrgGroup
UNION ALL
SELECT Lvl+1 AS Lvl
,OI.LeafId
,OG.OrgGroupId
,OG.ParentGroupId
,OG.Name
,OI.LeafName
FROM OrgGroup OG
INNER JOIN
cte OI ON OI.ParentGroupId = OG.OrgGroupId
)
,cte_rn AS (
SELECT *
,ROW_NUMBER() OVER (PARTITION BY LeafID ORDER BY Lvl DESC) rn
FROM cte
)
SELECT * FROM cte_rn WHERE rn = 1

TSQL self join to get results

I run the following query
Select * From
(
Select
GUID,
MFG_CODE,
STK_NAME,
parentid,
masteritem,
ROW_NUMBER() over(order by guid) r
From Fstock Where MasterItem=1 OR isNull(parentID, '')=''
) a
Where r between 4716 And 4716
And I get following results
GUID MFG_CODE parentid masteritem r
31955 369553 0 1 4717
As you can see, GUID 31955 is actually a parent item, and I need to bring in all the children of this parent item within the same query.
For example if I do:
Select * From Fstock where parentID = 31955
It returns 3 children of it
GUID
31956
31957
31958
So is there a way to combine these two queries? I only want to return a fixed number of rows using the row_number() function; however, those returned rows sometimes contain a parent item, and I would like to return the children of those parent items as well within the same query.
Performance is very important for me.
--- EDIT ----
I got it to work with following query, does anyone have other ideas?
With CTE
As
(
Select
GUID,
Manufacturer,
SELL_PRICE,
MFG_CODE,
parentid,
masteritem,
ROW_NUMBER() over(order by GUID) r
From Fstock Where MasterItem=1 OR isNull(parentID, '')=''
)
Select A.*,F.parentID From
(
Select * From CTE
Where r between 4717 And 6000
) A
Left join Fstock F on F.parentID = A.GUID
Order by A.r
This is crude and untested, but I believe you're looking for a recursive Common Table Expression (CTE) that will combine the parent-child relationships for you. Now, natively, this does not integrate any row limitations you mentioned in terms of returning a "fixed number of rows," which I was not precisely sure how to interpret, but the basic query below should be a start for you.
With Products(GUID, MFG_CODE,STK_NAME, parentid,masteritem)
as
(
Select GUID,MFG_CODE,STK_NAME,parentid,masteritem
from fstock
where masteritem=1 OR isNull(parentID, '')=''
Union all
Select f.GUID,f.MFG_CODE,f.STK_NAME,f.parentid,f.masteritem
from fstock f
inner join products g
on f.parentid=g.guid
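)
-- A CTE must be followed by a statement that uses it; this returns the parents and their descendants.
Select GUID, MFG_CODE, STK_NAME, parentid, masteritem
From Products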

Wrong order in Table valued Function(keep "order" of a recursive CTE)

A few minutes ago I asked here how to get parent records with a recursive CTE.
This works now, but I get the wrong order (backwards, ordered by the PK idData) when I create a table-valued function which returns all parents. I cannot order directly because I need the logical order provided by the CTE.
This gives the correct order (from the next parent to that parent's parent and so on):
declare @fiData int;
set @fiData=16177344;
WITH PreviousClaims(idData,fiData)
AS(
SELECT parent.idData,parent.fiData
FROM tabData parent
WHERE parent.idData = @fiData
UNION ALL
SELECT child.idData,child.fiData
FROM tabData child
INNER JOIN PreviousClaims parent ON parent.fiData = child.idData
)
select iddata from PreviousClaims
But the following function returns all records in backwards order (ordered by PK):
CREATE FUNCTION [dbo].[_previousClaimsByFiData] (
@fiData INT
)
RETURNS @retPreviousClaims TABLE
(
idData int PRIMARY KEY NOT NULL
)
AS
BEGIN
DECLARE @idData int;
WITH PreviousClaims(idData,fiData)
AS(
SELECT parent.idData,parent.fiData
FROM tabData parent
WHERE parent.idData = @fiData
UNION ALL
SELECT child.idData,child.fiData
FROM tabData child
INNER JOIN PreviousClaims parent ON parent.fiData = child.idData
)
INSERT INTO @retPreviousClaims
SELECT idData FROM PreviousClaims;
RETURN;
END;
select * from dbo._previousClaimsByFiData(16177344);
UPDATE:
Since everybody believes that the CTE does no ordering (any "ordering" will be totally arbitrary and coincidental), I'm wondering why the opposite seems to be true. I have queried a child claim with many parents, and the order in the CTE is exactly the logical order when I go from child to parent and so on. This would mean that the CTE iterates from record to record like a cursor and the following select returns the rows in exactly this order. But when I call the TVF, I get the order of the primary key idData instead.
The solution was simple. I only needed to remove the primary key from the return table of the TVF. So change...
RETURNS @retPreviousClaims TABLE
(
idData int PRIMARY KEY NOT NULL
)
to...
RETURNS @retPreviousClaims TABLE
(
idData int
)
.. and it keeps the right "order" (the same order in which the rows were inserted into the CTE's temporary result set).
UPDATE2:
Because Damien mentioned that the "CTE order" could change in certain circumstances, I will add a new column relationLevel to the CTE which describes the level of relationship of the parent records (which is, by the way, quite useful in general, e.g. for an SSAS cube).
So the final Inline-TVF(which returns all columns) is now:
CREATE FUNCTION [dbo].[_previousClaimsByFiData] (
@fiData INT
)
RETURNS TABLE AS
RETURN(
WITH PreviousClaims
AS(
SELECT 1 AS relationLevel, child.*
FROM tabData child
WHERE child.idData = @fiData
UNION ALL
SELECT relationLevel+1, child.*
FROM tabData child
INNER JOIN PreviousClaims parent ON parent.fiData = child.idData
)
SELECT TOP 100 PERCENT * FROM PreviousClaims order by relationLevel
)
This is an exemplary relationship:
select idData,fiData,relationLevel from dbo._previousClaimsByFiData(46600314);
Thank you.
The correct way to do your ORDERing is to add an ORDER BY clause to your outermost select. Anything else is relying on implementation details that may change at any time (including if the size of your database/tables goes up, which may allow more parallel processing to occur).
If you need something convenient to allow the ordering to take place, look at Example D in the examples from the MSDN page on WITH:
WITH DirectReports(ManagerID, EmployeeID, Title, EmployeeLevel) AS
(
SELECT ManagerID, EmployeeID, Title, 0 AS EmployeeLevel
FROM dbo.MyEmployees
WHERE ManagerID IS NULL
UNION ALL
SELECT e.ManagerID, e.EmployeeID, e.Title, EmployeeLevel + 1
FROM dbo.MyEmployees AS e
INNER JOIN DirectReports AS d
ON e.ManagerID = d.EmployeeID
)
Add something similar to the EmployeeLevel column to your CTE, and everything should work.
I think the impression that the CTE is creating an ordering is wrong. It's a coincidence that the rows are coming out in order (possibly due to how they were originally inserted into tabData). Regardless, the TVF is returning a table so you have to explicitly add an ORDER BY to the SELECT you're using to call it if you want to guarantee ordering:
select * from dbo._previousClaimsByFiData(16177344) order by idData
There is no ORDER BY anywhere in sight - neither in the table-valued function, nor in the SELECT from that TVF.
Any "ordering" will be totally arbitrary and coincidental.
If you want a specific order, you need to specify an ORDER BY.
So why can't you just add an ORDER BY to your SELECT:
SELECT * FROM dbo._previousClaimsByFiData(16177344)
ORDER BY (whatever you want to order by)....
or put your ORDER BY into the TVF:
INSERT INTO @retPreviousClaims
SELECT idData FROM PreviousClaims
ORDER BY idData DESC (or whatever it is you want to order by...)
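As a rough illustration of the advice in this thread, the sketch below adds a level column to the CTE and orders by it in the outermost SELECT, so the child-to-parent order no longer depends on how the engine happens to produce the rows. It runs against SQLite through Python's sqlite3 module as a stand-in for SQL Server; the four tabData rows are invented so that the logical chain (40, 30, 50, 10) differs from the primary-key order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tabData (idData INTEGER PRIMARY KEY, fiData INTEGER);
-- Hypothetical chain: claim 40 -> parent 30 -> parent 50 -> parent 10 (root).
INSERT INTO tabData VALUES (10, NULL), (50, 10), (30, 50), (40, 30);
""")

start_id = 40  # assumed starting claim for this example

ordered = [row[0] for row in conn.execute("""
WITH RECURSIVE PreviousClaims(idData, fiData, relationLevel) AS (
    -- Anchor: the claim we start from is level 1.
    SELECT idData, fiData, 1
    FROM tabData
    WHERE idData = ?
    UNION ALL
    -- Each step up to the parent claim increases the level by one.
    SELECT t.idData, t.fiData, p.relationLevel + 1
    FROM tabData AS t
    JOIN PreviousClaims AS p ON p.fiData = t.idData
)
-- The explicit ORDER BY on the outermost SELECT is what guarantees the order.
SELECT idData FROM PreviousClaims ORDER BY relationLevel
""", (start_id,))]

print(ordered)
```

With a table-valued function the same rule applies one level further out: the function can expose relationLevel, and the query that calls the function supplies the ORDER BY.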

getting nested results from linq to sql

My data looks like this in the table:
ID Name Parent ID
--- ---- ---------
1 Mike null
2 Steve 1
3 George null
4 Jim 1
I can't figure out how to write a linq to sql query that will return the results with the parent rows grouped with their child rows. So for example this is the result I want:
1 Mike (no parent)
2 Steve (Parent is 1)
4 Jim (Parent is 1)
3 George (no parent)
The way I'm doing it right now is to first grab a result set of all the parent rows. Then I loop through it and find the children for each parent and insert all this into a List<> as I loop. At the end the List<> has everything in the order I want it.
But is there a way to do this in just one linq query?
Assuming that you have a self-referential relationship for the table, you could do something like:
var q = db.People
.OrderBy( p => p.ParentID == null
? p.Name
: p.Parent.Name + ":" + p.ID + ":" + p.Name );
You need a Common Table Expression (CTE) to do recursive SQL. CTEs are not supported by Linq to Sql. You can execute a query directly though.
This is what the SQL might look like although it does not group the children with their parents. I don't think you can do the grouping using CTEs:
WITH DirectReports (ID, Name, ParentID, Level)
AS
(
SELECT e.ID, e.Name, e.ParentID, 0 AS Level
FROM Employee e
WHERE e.ParentID IS NULL
UNION ALL
SELECT e.ID, e.Name, e.ParentID, Level + 1
FROM Employee e
JOIN DirectReports AS d
ON e.ParentID = d.ID
)
SELECT *
FROM DirectReports
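For what it's worth, the grouping can probably be done in the CTE itself if each row carries a sort key built from its chain of ancestor IDs. A minimal sketch follows, using SQLite through Python's sqlite3 module rather than SQL Server or LINQ to SQL; the SortPath column and the 8-digit zero padding are assumptions added for the example, and the rows are the four from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (ID INTEGER PRIMARY KEY, Name TEXT, ParentID INTEGER);
INSERT INTO Employee VALUES
  (1, 'Mike', NULL), (2, 'Steve', 1), (3, 'George', NULL), (4, 'Jim', 1);
""")

grouped = conn.execute("""
WITH RECURSIVE DirectReports(ID, Name, ParentID, Level, SortPath) AS (
    -- Anchor: top-level rows start a path with their own padded ID.
    SELECT ID, Name, ParentID, 0, printf('%08d', ID)
    FROM Employee
    WHERE ParentID IS NULL
    UNION ALL
    -- Children extend the parent's path, so they sort right after it.
    SELECT e.ID, e.Name, e.ParentID, d.Level + 1,
           d.SortPath || '/' || printf('%08d', e.ID)
    FROM Employee AS e
    JOIN DirectReports AS d ON e.ParentID = d.ID
)
SELECT ID, Name, ParentID FROM DirectReports ORDER BY SortPath
""").fetchall()

for row in grouped:
    print(row)
```

The rows come back as Mike, Steve, Jim, George, which is the order asked for in the question. From LINQ to SQL, a query like this would still have to be sent as raw SQL, as noted above.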

Resources