Avoiding duplicate recursion with Common Table Expressions - sql-server

Let's say I have a table with 2 columns ID and ParentID. My data looks like this:
ID ParentID
1 Null
2 1
3 1
4 2
4 2
So to find all relationships based on a given ID my query simplified looks like this:
WITH links ([ID], [ParentID], Depth)
AS
(
--Get the starting link
SELECT
[ID],
[ParentID],
[Depth] = 1
FROM
[MyTable]
WHERE
[ID] = @StartID
UNION ALL
--Recursively get links that are parented to links already in the CTE
SELECT
mt.[ID],
mt.[ParentID],
[Depth] = l.[Depth] + 1
FROM
[MyTable] mt
JOIN
links l ON mt.ParentID = l.ID
WHERE
Depth < 99
)
SELECT
[Depth],
[ID],
[ParentID]
FROM
[links]
Now let's say the data in my table creates a cyclical relationship (4 is parented to 2 and 2 is parented to 4). Forgetting for a moment that there should likely be constraints on the database to prevent this, the above recursive CTE query produces duplicate records (99 of them) because it recursively evaluates that cyclical relationship between 2 and 4.
ID ParentID
1 Null
2 1
3 1
4 2
2 4
2 4
How can I alter my query to prevent that, assuming that I have no control over preventing the data from representing that cyclical relationship? Normally I would put a DISTINCT on the final select, but I want the Depth value, which makes every record distinct. I'm also hoping to account for it within the CTE, as a DISTINCT operates on the final select and is probably not as efficient.

You could build a tree path column in the CTE that records the entire path from the top of the recursive query, then check whether the ID in question is already in that path. If it is, stop recursing at that point.
USE Master;
GO
CREATE DATABASE [QueryTraining];
GO
USE [QueryTraining];
GO
CREATE TABLE [MyTable] (
ID int, --would normally be an INT IDENTITY
ParentID int
);
INSERT INTO [MyTable] (ID, ParentID)
VALUES (1, NULL),
(2, 1),
(3, 1),
(4, 2),
(2, 4),
(2, 4);
DECLARE @StartID AS INTEGER;
SET @StartID = 1;
;WITH links (ID, ParentID, Depth, treePath)
AS
(
--Get the starting link
SELECT [ID],
[ParentID],
[Depth] = 1,
--Delimit the ID on both sides so the CHARINDEX check below can match it
CAST(':' + CAST([ID] AS VARCHAR(MAX)) + ':' AS VARCHAR(MAX)) AS treePath
FROM [MyTable]
WHERE [ID] = @StartID
UNION ALL
--Recursively get links that are parented to links already in the CTE
SELECT mt.[ID],
mt.[ParentID],
[Depth] = l.[Depth] + 1,
CAST(l.treePath + CAST(mt.[ID] AS VARCHAR(MAX)) + ':' AS VARCHAR(MAX)) AS treePath
FROM [MyTable] mt
INNER JOIN links l ON mt.ParentID = l.ID
AND CHARINDEX(':' + CAST(mt.[ID] AS VARCHAR(MAX)) + ':', l.[treePath]) = 0
WHERE Depth < 10
)
SELECT
[Depth],
[ID],
[ParentID],
[treePath]
FROM
[links];
The line on the INNER JOIN that says
AND CHARINDEX(':' + CAST(mt.[ID] AS VARCHAR(MAX)) + ':', l.[treePath]) = 0
is where IDs that already appear in the path get filtered out.
Just copy and paste the example and give it a try.
One note: the way I am using CHARINDEX in the CTE may not scale well, but it does accomplish what I think you are looking for.
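The path-tracking idea can also be tried outside SQL Server. The following is a minimal sketch, assuming Python's built-in sqlite3 module and SQLite syntax (`||` for concatenation, `instr` in place of CHARINDEX); the in-memory table and the `find_links` helper are hypothetical stand-ins for the example above, not part of the original answer.

```python
import sqlite3

# Hypothetical in-memory copy of the cyclic sample data (2 -> 4 -> 2).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER, ParentID INTEGER)")
conn.executemany(
    "INSERT INTO MyTable (ID, ParentID) VALUES (?, ?)",
    [(1, None), (2, 1), (3, 1), (4, 2), (2, 4)],
)

# Each row carries the delimited path of IDs visited so far; a child is
# only followed when its ID does not already appear in that path.
CYCLE_SAFE_QUERY = """
WITH RECURSIVE links (ID, ParentID, Depth, treePath) AS (
    SELECT ID, ParentID, 1, ':' || ID || ':'
    FROM MyTable
    WHERE ID = ?
    UNION ALL
    SELECT mt.ID, mt.ParentID, l.Depth + 1, l.treePath || mt.ID || ':'
    FROM MyTable AS mt
    JOIN links AS l ON mt.ParentID = l.ID
    WHERE instr(l.treePath, ':' || mt.ID || ':') = 0
)
SELECT Depth, ID, ParentID, treePath
FROM links
ORDER BY Depth, ID
"""


def find_links(start_id):
    """Return (Depth, ID, ParentID, treePath) rows reachable from start_id."""
    return conn.execute(CYCLE_SAFE_QUERY, (start_id,)).fetchall()


rows = find_links(1)
for row in rows:
    print(row)
```

Because the path is delimited on both sides of every ID, the check for `:2:` cannot accidentally match an ID such as 12 or 21, and no Depth cap is needed to stop the recursion on this data.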

Related

Select Duplicate Records

I want to retrieve only the duplicated records, not the unique ones.
Suppose my data looks like this:
Ids Names
1 A
2 B
1 A
I want output like the following:
Sno Id Name
1 1 A
2 1 A
Try this:
DECLARE @DataSource TABLE
(
[ID] INT
,[name] CHAR(1)
,[value] CHAR(2)
);
INSERT INTO @DataSource ([ID], [name], [value])
VALUES (1, 'A', 'x1')
,(2, 'B', 'x2')
,(1, 'A', 'x3');
WITH DataSource AS
(
SELECT *
,COUNT(*) OVER (PARTITION BY [ID], [name]) AS [Count]
FROM @DataSource
)
SELECT *
FROM DataSource
WHERE [Count] > 1;
The grouping is done in the PARTITION BY part of the window function. So, basically, we are counting records for each unique ID-name pair. Of course, you can add more columns here.
SELECT Id, Names
FROM T
GROUP BY Id, Names
HAVING COUNT(*) >1
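The GROUP BY / HAVING form returns one row per duplicated pair, while the requested output lists every duplicated row with a running number. One possible way to get there is to join the grouped result back to the table. This is only a sketch, assuming Python's built-in sqlite3 module; the table name `T` follows the query above, and the numbering is done client-side with `enumerate` rather than in SQL.

```python
import sqlite3

# Hypothetical in-memory copy of the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (Id INTEGER, Names TEXT)")
conn.executemany(
    "INSERT INTO T (Id, Names) VALUES (?, ?)",
    [(1, "A"), (2, "B"), (1, "A")],
)

# The inner query finds the (Id, Names) pairs that occur more than once;
# joining back to T returns every row that belongs to such a pair.
ALL_DUPLICATE_ROWS = """
SELECT r.Id, r.Names
FROM T AS r
JOIN (
    SELECT Id, Names
    FROM T
    GROUP BY Id, Names
    HAVING COUNT(*) > 1
) AS d ON d.Id = r.Id AND d.Names = r.Names
ORDER BY r.Id, r.Names
"""

# Attach a running number (Sno) to each duplicated row.
duplicates = [
    (sno, row[0], row[1])
    for sno, row in enumerate(conn.execute(ALL_DUPLICATE_ROWS), start=1)
]
print(duplicates)
```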
As you requested, create a new column [SNo] using ROW_NUMBER() partitioned on the original columns (Names, Id). Rows with [SNo] > 1 are the repeat occurrences; to return every row that has a duplicate, filter on RCount > 1.
See a mockup below:
DECLARE @Records TABLE (Id int, Names VARCHAR(10))
INSERT INTO @Records
SELECT 1, 'A' UNION ALL
SELECT 2, 'B' UNION ALL
SELECT 1, 'A'
----To Get Duplicates -----
SELECT *
FROM
(
SELECT
SNo=ROW_NUMBER() OVER (PARTITION BY Names, Id ORDER BY Id),
RCount=COUNT(*) OVER (PARTITION BY [ID], Names),
*
FROM
@Records
)M
WHERE
RCount>1

Find Pairs in recursive table [duplicate]

I have the following table:
CREATE TABLE [dbo].[MyTable2](
[ID] [int] IDENTITY(1,1) NOT NULL,
[ParentID] [int] NOT NULL,
)
I am trying to create a query that returns a list of (ID, ParentID) pairs. For example, I have the following data:
ID ParentID
1 0
2 0
3 1
4 3
5 3
15 8
I want when I search by ID = 5 to have the following list:
ID ParentID
5 3
3 1
1 0
If I search by ID = 15, it should see that the sequence is broken, and I will get the following list:
ID ParentID
15 8
I used a temporary table in order to make it work and my code is the following:
if object_id('tempdb..#Pairs') is not null
DROP TABLE #Pairs
create table #Pairs
(
ID INT,
ParentID INT
)
Declare @ID integer = 5;
Declare @ParentID integer;
while (@ID > 0)
BEGIN
SET @ParentID = null; -- I set it to null so that I will be able to check in case the sequence is broken
select @ID=ID, @ParentID=ParentID
from MyTable
where ID = @ID;
if @ParentID IS NOT null
begin
Insert into #Pairs (ID, ParentID) Values (@ID, @ParentID)
SET @ID = @ParentID;
end
else
SET @ID = 0;
END
SELECT * from #Pairs
It works, but I am sure there is a better way to do it. I found some strange queries that were supposed to do something similar, but I was not able to adapt them to my needs.
For example, I found the following question, but I was not able to convert it to work with my table. All the queries that I found had similar answers.
You are looking for recursive queries. See following example:
SELECT * INTO tab FROM (VALUES
(1, 0),
(2, 0),
(3, 1),
(4, 3),
(5, 3),
(15, 8)) T(ID, ParentID);
DECLARE @whatAreYouLookingFor int = 5;
WITH Rec AS
(
SELECT * FROM tab WHERE ID=@whatAreYouLookingFor
UNION ALL
SELECT T.* FROM tab T JOIN Rec R ON R.ParentID=T.ID
)
SELECT * FROM Rec;
DROP TABLE tab
Output:
ID ParentID
-- --------
5 3
3 1
1 0
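To check both cases from the question (ID = 5 and the broken chain at ID = 15) without a SQL Server instance, the same upward walk can be sketched with Python's built-in sqlite3 module. The `Step` column and the `ancestor_pairs` helper are assumptions added here so the output order is explicit; they are not part of the answer above.

```python
import sqlite3

# Hypothetical in-memory copy of the sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (ID INTEGER, ParentID INTEGER)")
conn.executemany(
    "INSERT INTO tab (ID, ParentID) VALUES (?, ?)",
    [(1, 0), (2, 0), (3, 1), (4, 3), (5, 3), (15, 8)],
)

# Start at the requested row, then repeatedly join to the row whose ID
# equals the current ParentID. Step records how far up the chain we are.
ANCESTOR_PAIRS = """
WITH RECURSIVE Rec (ID, ParentID, Step) AS (
    SELECT ID, ParentID, 0 FROM tab WHERE ID = ?
    UNION ALL
    SELECT t.ID, t.ParentID, r.Step + 1
    FROM tab AS t
    JOIN Rec AS r ON r.ParentID = t.ID
)
SELECT ID, ParentID FROM Rec ORDER BY Step
"""


def ancestor_pairs(start_id):
    """Return (ID, ParentID) pairs from start_id up to the top of its chain."""
    return conn.execute(ANCESTOR_PAIRS, (start_id,)).fetchall()


print(ancestor_pairs(5))
print(ancestor_pairs(15))
```

The walk stops on its own when no row has an ID equal to the current ParentID, which is what produces the single-row result for the broken chain.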

How to get id's of parent ids for inserting children

I have Parent and Child table.
The goal is to duplicate the records, except with new primary keys.
Original Tables
Parent(id)
1
Child(id,parentId, data)
1,1
2,1
After insert:
Parent
1
2
Child
1,1
2,1
3,2
4,2
How do I do that? The part I am having trouble with is getting the new parent key for use with the child records.
This is what I have come up with so far.
--DECLARE VARS
declare @currentMetadataDocumentSetId int = 1, --Ohio
@newMetadataDocumentSetid int = 3; --PA
--CLEANUP
IF OBJECT_ID('tempdb..#tempFileRowMap') IS NOT NULL
/*Then it exists*/
DROP TABLE #tempFileRowMap
--Remove existing file row maps.
delete from file_row_map where metadata_document_set_id = @newMetadataDocumentSetid;
--Create a temptable to hold data to be copied.
Select [edi_document_code],
[functional_group],
[description],
3 as [metadata_document_set_id],
[document_name],
[incoming_file_row_subtype],
[metadata_document_id],
[document_subcode],
[outgoing_file_row_subtype],
[asi_type_code],
[asi_action_code],
[metadata_document_set],
file_row_map_id as orig_file_row_map_id
into #tempFileRowMap
from file_row_map fileRowMap
where metadata_document_set_id = @currentMetadataDocumentSetId;
--Select * from #tempFileRowMap;
Insert into file_row_map select
[edi_document_code],
[functional_group],
[description],
[metadata_document_set_id],
[document_name],
[incoming_file_row_subtype],
[metadata_document_id],
[document_subcode],
[outgoing_file_row_subtype],
[asi_type_code],
[asi_action_code],
[metadata_document_set]
from #tempFileRowMap
--Show Results
Select * from file_row_map fileRowMap where fileRowMap.metadata_document_set_id = @newMetadataDocumentSetid
--Update Detail
Select
[file_row_map_id],
[file_row_column],
[element_code],
[element_metadata_id],
[col_description],
[example],
[translate],
[is_used],
[is_mapped],
[page_num],
[subcode],
[qualifier],
[loop_code],
[loop_subcode],
[default_value],
[delete_flag]
into #tempFileRowMapDetail
from [dbo].[file_row_map_detail] d
left join #tempFileRowMap m
on m.orig_file_row_map_id = d.file_row_map_id
select * from #tempFileRowMapDetail
Simply use the OUTPUT clause to capture the exact primary key values generated in the parent table.
Let's build an example schema for your case:
--For Capturing inserted ID
CREATE TABLE #ID_CAPTURE (PARENT_ID INT,ORDER_NME VARCHAR(20));
--Your intermediate data to insert into the actual tables
CREATE TABLE #DUMMY_TABLE (ORDER_NME VARCHAR(20), ITEM_NME VARCHAR(20));
--Actual Tables
CREATE TABLE #ORDER_PARENT (ORDER_ID INT IDENTITY,ORDER_NME VARCHAR(20))
CREATE TABLE #ORDER_CHILD (CHILD_ID INT IDENTITY ,ORDER_ID INT, ORDER_NME VARCHAR(20))
INSERT INTO #DUMMY_TABLE
SELECT 'BILL1','Oil'
UNION ALL
SELECT 'BILL1', 'Gas'
UNION ALL
SELECT 'BILL2', 'Diesel'
Now do the inserts into the parent and child tables:
INSERT INTO #ORDER_PARENT
OUTPUT inserted.ORDER_ID, inserted.ORDER_NME into #ID_CAPTURE
SELECT DISTINCT ORDER_NME FROM #DUMMY_TABLE
INSERT INTO #ORDER_CHILD
SELECT C.PARENT_ID, ITEM_NME FROM #DUMMY_TABLE D
INNER JOIN #ID_CAPTURE C ON D.ORDER_NME = C.ORDER_NME
SELECT * FROM #ID_CAPTURE
SELECT * FROM #ORDER_CHILD
There are other ways to get inserted identity values.
See the documentation for @@IDENTITY (Transact-SQL) and SCOPE_IDENTITY.
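The underlying idea, capturing each newly generated parent key and using an old-to-new map when copying the children, can be sketched outside SQL Server as well. The following assumes Python's built-in sqlite3 module and uses `cursor.lastrowid` as a row-by-row stand-in for the set-based OUTPUT clause; the `Parent`/`Child` layout follows the question, and the `data` values and the `duplicate_parents` helper are hypothetical.

```python
import sqlite3

# Hypothetical schema and data modelled on the question's Parent/Child tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Parent (id INTEGER PRIMARY KEY AUTOINCREMENT);
CREATE TABLE Child (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    parentId INTEGER REFERENCES Parent(id),
    data TEXT
);
INSERT INTO Parent DEFAULT VALUES;
INSERT INTO Child (parentId, data) VALUES (1, 'a'), (1, 'b');
""")


def duplicate_parents(conn):
    """Copy every parent and its children; return the old-to-new id map."""
    old_ids = [row[0] for row in conn.execute("SELECT id FROM Parent ORDER BY id")]
    old_children = conn.execute(
        "SELECT parentId, data FROM Child ORDER BY id"
    ).fetchall()

    # Insert one new parent per old parent and remember the generated key.
    old_to_new = {}
    for old_id in old_ids:
        cur = conn.execute("INSERT INTO Parent DEFAULT VALUES")
        old_to_new[old_id] = cur.lastrowid

    # Re-insert the children, pointing them at the new parent keys.
    conn.executemany(
        "INSERT INTO Child (parentId, data) VALUES (?, ?)",
        [(old_to_new[parent_id], data) for parent_id, data in old_children],
    )
    return old_to_new


mapping = duplicate_parents(conn)
children = conn.execute("SELECT id, parentId, data FROM Child ORDER BY id").fetchall()
print(mapping)
print(children)
```

On SQL Server the OUTPUT approach shown above does the same capture in a single set-based statement, which is usually preferable to a per-row loop.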
Try following approach:
DECLARE @Table1 TABLE (
ID INT NOT NULL PRIMARY KEY,
ParentID INT NULL, -- FK
[Desc] VARCHAR(50) NOT NULL
);
INSERT @Table1 (ID, ParentID, [Desc])
VALUES
(1, NULL, 'A'),
(2, 1, 'AA.1'),
(3, 1, 'AA.2'),
(4, NULL, 'B'),
(5, 4, 'BB.1'),
(6, 4, 'BB.2'),
(7, 4, 'BB.3'),
(8, 7, 'BBB.1');
DECLARE @ParentID INT = 4;
DECLARE @LastID INT = (SELECT TOP(1) ID FROM @Table1 x ORDER BY x.ID DESC)
IF @LastID IS NULL
BEGIN
RAISERROR('Invalid call', 16, 1)
--RETURN ?
END
SELECT @LastID AS LastID;
/*
LastID
-----------
8
*/
DECLARE @RemapIDs TABLE (
OldID INT NOT NULL PRIMARY KEY,
[NewID] INT NOT NULL UNIQUE
);
WITH CteRecursion
AS (
SELECT 1 AS Lvl, crt.ID, crt.ParentID --, crt.[Desc]
FROM @Table1 crt
WHERE crt.ID = @ParentID
UNION ALL
SELECT cld.Lvl + 1 AS Lvl, crt.ID, crt.ParentID --, crt.[Desc]
FROM @Table1 crt
JOIN CteRecursion cld ON crt.ParentID = cld.ID
)
INSERT @RemapIDs (OldID, [NewID])
SELECT r.ID, @LastID + ROW_NUMBER() OVER(ORDER BY r.Lvl) AS [NewID]
FROM CteRecursion r;
--INSERT @Table1 (ID, ParentID, [Desc])
SELECT nc.[NewID] AS ID, np.[NewID] AS ParentID, o.[Desc]
FROM @Table1 o -- old
JOIN @RemapIDs nc /*new child ID*/ ON o.ID = nc.OldID
LEFT JOIN @RemapIDs np /*new parent ID*/ ON o.ParentID = np.OldID
/*
ID ParentID Desc
----------- ----------- --------------------------------------------------
9 NULL B
10 9 BB.1
11 9 BB.2
12 9 BB.3
13 12 BBB.1
*/
Note: with some minor changes, this should also work with multiple ParentID values.

Recursive SQL query with multiple columns

I have a table with the following columns
idRelationshipType int,
idPerson1 int,
idPerson2 int
This table allows me to indicate records in a database that should be linked together.
I need a query that returns all the unique ids where a person's id exists in the idPerson1 or idPerson2 column. Additionally, I need the query to be recursive, so that if a match is found in idPerson1, the value for idPerson2 is included in the result set and used to repeat the query recursively until no more matches are found.
Example data:
CREATE TABLE [dbo].[tbRelationships]
(
[idRelationshipType] [int],
[idPerson1] [int] ,
[idPerson2] [int]
)
INSERT INTO tbRelationships (idRelationshipType, idPerson1, idPerson2)
VALUES (1, 1, 2)
INSERT INTO tbRelationships (idRelationshipType, idPerson1, idPerson2)
VALUES (1, 2, 3)
INSERT INTO tbRelationships (idRelationshipType, idPerson1, idPerson2)
VALUES (1, 3, 4)
INSERT INTO tbRelationships (idRelationshipType, idPerson1, idPerson2)
VALUES (1, 5, 1)
Four 'Relationships' are defined here. For this query, I will only know one of the ids to begin with. I need a query that in concept works like
SELECT idPerson
FROM [some query]
WHERE [the id i have to start with] = @idPerson
AND idRelationshipType = @idRelationshipType
The returned result should be 5 rows with one column, 'idPerson', with 1, 2, 3, 4, and 5 as the row values.
I have tried various combinations of UNPIVOT and recursive CTEs but I am not making much progress.
Any help would be greatly appreciated.
Thanks,
Daniel
I think this is what you want:
DECLARE @RelationshipType int
DECLARE @PersonId int
SELECT @RelationshipType = 1, @PersonId = 1
;WITH Hierachy (idPerson1, IdPerson2)
AS
(
--root
SELECT R.idPerson1, R.idPerson2
FROM tbRelationships R
WHERE R.idRelationshipType = @RelationshipType
AND (R.idPerson1 = @PersonId OR R.idPerson2 = @PersonId)
--recurse
UNION ALL
SELECT R.idPerson1, R.idPerson2
FROM Hierachy H
JOIN tbRelationships R
ON (R.idPerson1 = H.idPerson2
OR R.idPerson2 = H.idPerson1)
AND R.idRelationshipType = @RelationshipType
)
SELECT DISTINCT idPerson
FROM
(
SELECT idPerson1 AS idPerson FROM Hierachy
UNION
SELECT idPerson2 AS idPerson FROM Hierachy
) H
Essentially, get the first rows where the required id is in either column, and then recurse to get all of the child ids based on idPerson2.
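One thing to watch with relationship data like this is that rows can point back at each other: in the sample data, (1, 2) and (5, 1) both involve person 1, so a UNION ALL recursion that follows links in both directions appears likely to revisit them. As a rough illustration of a traversal that cannot loop, here is a sketch assuming Python's built-in sqlite3 module. It relies on SQLite allowing UNION (which discards rows already seen) inside a recursive CTE; SQL Server requires UNION ALL there, so this is not a drop-in replacement for the T-SQL above. The `linked_persons` helper is hypothetical.

```python
import sqlite3

# Hypothetical in-memory copy of the sample relationships.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbRelationships "
    "(idRelationshipType INTEGER, idPerson1 INTEGER, idPerson2 INTEGER)"
)
conn.executemany(
    "INSERT INTO tbRelationships VALUES (?, ?, ?)",
    [(1, 1, 2), (1, 2, 3), (1, 3, 4), (1, 5, 1)],
)

# Start from the known person and repeatedly add whoever sits on the
# other side of any relationship row that mentions an already-found person.
# UNION keeps each person only once, which ends the recursion.
LINKED_PERSONS = """
WITH RECURSIVE linked (idPerson) AS (
    SELECT :person
    UNION
    SELECT CASE WHEN r.idPerson1 = l.idPerson
                THEN r.idPerson2
                ELSE r.idPerson1
           END
    FROM linked AS l
    JOIN tbRelationships AS r
      ON r.idRelationshipType = :rel_type
     AND (r.idPerson1 = l.idPerson OR r.idPerson2 = l.idPerson)
)
SELECT idPerson FROM linked ORDER BY idPerson
"""


def linked_persons(person_id, rel_type):
    """Return every person id connected to person_id, including itself."""
    rows = conn.execute(LINKED_PERSONS, {"person": person_id, "rel_type": rel_type})
    return [row[0] for row in rows]


print(linked_persons(1, 1))
```

Starting from any of the five sample ids gives the same set, since they all belong to one connected group.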

Is it possible to generate xml from SQL where I don't know the number of levels? [duplicate]

Is there any way to select a hierarchy in SQL Server 2005 and return it as XML?
I have a database with a lot of data (about 2000 to 3000 records), and I am currently using a function in SQL Server 2005 to retrieve the data as a hierarchy and return XML, but it is too slow when there is a lot of data.
Here is my function
Database
ID Name Parent Order
Function
CREATE FUNCTION [dbo].[GetXMLTree]
(
@PARENT bigint
)
RETURNS XML
AS
BEGIN
RETURN /* value */
(SELECT [ID] AS "@ID",
[Name] AS "@Name",
[Parent] AS "@Parent",
[Order] AS "@Order",
dbo.GetXMLTree(Parent).query('/xml/item')
FROM MyDatabaseTable
WHERE [Parent]=@PARENT
ORDER BY [Order]
FOR XML PATH('item'),ROOT('xml'),TYPE)
END
I would like to keep the hierarchy as XML because I have a lot to do with it :)
Any better solutions, please?
You can use a recursive CTE to build the hierarchy and loop over levels to build the XML.
-- Sample data
create table MyDatabaseTable(ID int, Name varchar(10), Parent int, [Order] int)
insert into MyDatabaseTable values
(1, 'N1', null, 1),
(2, 'N1_1', 1 , 1),
(3, 'N1_1_1', 2 , 1),
(4, 'N1_1_2', 2 , 2),
(5, 'N1_2', 1 , 2),
(6, 'N2', null, 1),
(7, 'N2_1', 6 , 1)
-- set @Root to whatever node should be root
declare @Root int = 1
-- Worktable that holds temp xml data and level
declare @Tree table(ID int, Parent int, [Order] int, [Level] int, XMLCol xml)
-- Recursive cte that builds @Tree
;with Tree as
(
select
M.ID,
M.Parent,
M.[Order],
1 as [Level]
from MyDatabaseTable as M
where M.ID = @Root
union all
select
M.ID,
M.Parent,
M.[Order],
Tree.[Level]+1 as [Level]
from MyDatabaseTable as M
inner join Tree
on Tree.ID = M.Parent
)
insert into @Tree(ID, Parent, [Order], [Level])
select *
from Tree
declare @Level int
select @Level = max([Level]) from @Tree
-- Loop for each level
while @Level > 0
begin
update Tree set
XMLCol = (select
M.ID as '@ID',
M.Name as '@Name',
M.Parent as '@Parent',
M.[Order] as '@Order',
(select XMLCol as '*'
from @Tree as Tree2
where Tree2.Parent = M.ID
order by Tree2.[Order]
for xml path(''), type)
from MyDatabaseTable as M
where M.ID = Tree.ID
order by M.[Order]
for xml path('item'))
from @Tree as Tree
where Tree.[Level] = @Level
set @Level = @Level - 1
end
select XMLCol
from @Tree
where ID = @Root
Result
<item ID="1" Name="N1" Order="1">
<item ID="2" Name="N1_1" Parent="1" Order="1">
<item ID="3" Name="N1_1_1" Parent="2" Order="1" />
<item ID="4" Name="N1_1_2" Parent="2" Order="2" />
</item>
<item ID="5" Name="N1_2" Parent="1" Order="2" />
</item>
What benefit do you expect from using XML? I don't have a perfect solution for the case when you need XML by all means - but maybe you could also investigate alternatives??
With a recursive CTE (Common Table Expression), you could easily get your entire hierarchy in a single result set, and performance should be noticeably better than doing a recursive XML building function.
Check this CTE out:
;WITH Hierarchy AS
(
SELECT
ID, [Name], Parent, [Order], 1 AS 'Level'
FROM
dbo.YourDatabaseTable
WHERE
Parent IS NULL
UNION ALL
SELECT
t.ID, t.[Name], t.Parent, t.[Order], Level + 1 AS 'Level'
FROM
dbo.YourDatabaseTable t
INNER JOIN
Hierarchy h ON t.Parent = h.ID
)
SELECT *
FROM Hierarchy
ORDER BY [Level], [Order]
This gives you a single result set, where all rows are returned, ordered by level (1 for the root level, increasing 1 for each down level) and their [Order] column.
Could that be an alternative for you? Does it perform better??
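If XML is still needed in the end, one option is to fetch the flat, level-ordered result set and assemble the nesting on the client. The following is a sketch under stated assumptions: it uses Python's built-in sqlite3 and xml.etree.ElementTree modules in place of SQL Server and FOR XML, reuses the sample rows from the earlier answer, and the `build_xml` helper is hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical in-memory copy of the sample hierarchy.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE MyDatabaseTable (ID INTEGER, Name TEXT, Parent INTEGER, "Order" INTEGER)'
)
conn.executemany(
    "INSERT INTO MyDatabaseTable VALUES (?, ?, ?, ?)",
    [
        (1, "N1", None, 1), (2, "N1_1", 1, 1), (3, "N1_1_1", 2, 1),
        (4, "N1_1_2", 2, 2), (5, "N1_2", 1, 2), (6, "N2", None, 1),
        (7, "N2_1", 6, 1),
    ],
)

# Flat hierarchy ordered by level, so every parent row precedes its children.
HIERARCHY = """
WITH RECURSIVE Hierarchy (ID, Name, Parent, "Order", Level) AS (
    SELECT ID, Name, Parent, "Order", 1 FROM MyDatabaseTable WHERE ID = ?
    UNION ALL
    SELECT t.ID, t.Name, t.Parent, t."Order", h.Level + 1
    FROM MyDatabaseTable AS t
    JOIN Hierarchy AS h ON t.Parent = h.ID
)
SELECT ID, Name, Parent, "Order" FROM Hierarchy ORDER BY Level, "Order"
"""


def build_xml(root_id):
    """Build nested <item> elements for the subtree rooted at root_id."""
    elements = {}
    root = None
    for node_id, name, parent, order in conn.execute(HIERARCHY, (root_id,)):
        attrs = {"ID": str(node_id), "Name": name}
        if parent is not None:
            attrs["Parent"] = str(parent)
        attrs["Order"] = str(order)
        if parent in elements:
            # Parent was already created, so attach this row beneath it.
            elements[node_id] = ET.SubElement(elements[parent], "item", attrs)
        else:
            # First row of the result set: the requested root node.
            root = elements[node_id] = ET.Element("item", attrs)
    return root


tree = build_xml(1)
print(ET.tostring(tree, encoding="unicode"))
```

This does one query and one pass over the rows, so the cost of nesting moves out of the database; whether that is faster in practice would need to be measured on the real data.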
I realise this answer is a bit late, but it might help some other unlucky person who is searching for answers to this problem. I have had similar performance problems using hierarchyid with XML:
It turned out for me that the simplest solution was actually just to call ToString() on the hierarchyid values before selecting as an XML column. In some cases this sped up my queries by a factor of ten!
Here's a snippet that exhibits the problem.
create table #X (id hierarchyid primary key, n int)
-- Generate 1000 random items
declare @n int = 1000
while @n > 0 begin
declare @parentID hierarchyID = null, @leftID hierarchyID = null, @rightID hierarchyID = null
select @parentID = id from #X order by newid()
if @parentID is not null select @leftID = id from #X where id.GetAncestor(1) = @parentID order by newid()
if @leftID is not null select @rightID = min(id) from #X where id.GetAncestor(1) = @parentID and id > @leftID
if @parentID is null set @parentID = '/'
declare @id hierarchyid = @parentID.GetDescendant(@leftID, @rightID)
insert #X (id, n) values (@id, @n)
set @n -= 1
end
-- select as XML using ToString() manually
select id.ToString() id, n from #X for xml path ('item'), root ('items')
-- select as XML without ToString() - about 10 times slower with SQL Server 2012
select id, n from #X for xml path ('item'), root ('items')
drop table #X

Resources