SQL Recursion with original table - sql-server

I have a question about leveled recursion. I have a table called Item with some values, as well as a many-to-many (n..n) table, ItemLinksToItem, that records which item extends which. The structure is as follows:
ITEM [id, Name]
ItemLinksToItem [ID, ItemChildID, ItemParentID]
Now I want to build a leveled list of items with a recursive CTE, but I also need the items that have no children (and therefore do not appear in ItemLinksToItem) in the result.
With a basic recursion, the query only links ItemChildID to ItemParentID within ItemLinksToItem:
with cte as(
select
a.ItemChildID, a.ItemParentId, 1 as level
from ItemLinksToItem a
union all
select
c.ItemChildID, p.ItemParentId, p.level +1
from ItemLinksToItem c
inner join cte p on p.ItemChildID = c.ItemParentId
)
select * from cte
resulting in the following information:
ItemChildID ItemParentID Level
D8B11945-DC40-4E95-925E-7B38EB795510 E40FB8AB-0A06-496C-B004-B5B7044564E5 1
E40FB8AB-0A06-496C-B004-B5B7044564E5 2399F47A-056C-ED11-9FC3-B2E1592D478C 1
D8B11945-DC40-4E95-925E-7B38EB795510 2399F47A-056C-ED11-9FC3-B2E1592D478C 2
But I also want every item that is not in ItemLinksToItem to appear, with its id as ItemParentID, no ItemChildID (NULL) and Level 0, like this:
ItemChildID ItemParentID Level
D8B11945-DC40-4E95-925E-7B38EB795510 E40FB8AB-0A06-496C-B004-B5B7044564E5 1
E40FB8AB-0A06-496C-B004-B5B7044564E5 2399F47A-056C-ED11-9FC3-B2E1592D478C 1
D8B11945-DC40-4E95-925E-7B38EB795510 2399F47A-056C-ED11-9FC3-B2E1592D478C 2
NULL 9859AFC6-D505-4199-9298-3CAD55B6EB67 0
Can anyone help me out?
Example data below
declare @Item table (id uniqueidentifier,
name nvarchar(255))
insert into @Item VALUES
('D8B11945-DC40-4E95-925E-7B38EB795510', 'Item Zero'),
('E40FB8AB-0A06-496C-B004-B5B7044564E5', 'Item One'),
('2399F47A-056C-ED11-9FC3-B2E1592D478C', 'Item Two'),
('C43CF1D6-50E9-4DFF-A28C-DF0CEAEDF405', 'Unrelated Item Three')
declare @ItemLinksToItem table (id int,
itemchildid uniqueidentifier ,
itemparentid uniqueidentifier)
INSERT INTO @ItemLinksToItem
VALUES
(5, 'D8B11945-DC40-4E95-925E-7B38EB795510', 'E40FB8AB-0A06-496C-B004-B5B7044564E5'),
(4, 'E40FB8AB-0A06-496C-B004-B5B7044564E5', '2399F47A-056C-ED11-9FC3-B2E1592D478C')
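One way to get the extra level-0 rows is to keep the recursive CTE as it is and append the unlinked items with a UNION ALL. The sketch below is an illustration under assumptions rather than a tested SQL Server answer: it runs the sample data through Python's sqlite3 module, stores the GUIDs as plain text, and uses SQLite's WITH RECURSIVE spelling (SQL Server writes plain WITH). The NOT EXISTS filter assumes that "not in ItemLinksToItem" means the item appears as neither child nor parent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Item (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE ItemLinksToItem (id INTEGER, itemchildid TEXT, itemparentid TEXT);
INSERT INTO Item VALUES
  ('D8B11945-DC40-4E95-925E-7B38EB795510', 'Item Zero'),
  ('E40FB8AB-0A06-496C-B004-B5B7044564E5', 'Item One'),
  ('2399F47A-056C-ED11-9FC3-B2E1592D478C', 'Item Two'),
  ('C43CF1D6-50E9-4DFF-A28C-DF0CEAEDF405', 'Unrelated Item Three');
INSERT INTO ItemLinksToItem VALUES
  (5, 'D8B11945-DC40-4E95-925E-7B38EB795510', 'E40FB8AB-0A06-496C-B004-B5B7044564E5'),
  (4, 'E40FB8AB-0A06-496C-B004-B5B7044564E5', '2399F47A-056C-ED11-9FC3-B2E1592D478C');
""")

rows = conn.execute("""
WITH RECURSIVE cte (ItemChildID, ItemParentID, level) AS (
  -- anchor: every direct link is level 1
  SELECT itemchildid, itemparentid, 1 FROM ItemLinksToItem
  UNION ALL
  -- recursive step: carry the ancestor down to deeper children
  SELECT c.itemchildid, p.ItemParentID, p.level + 1
  FROM ItemLinksToItem c
  JOIN cte p ON p.ItemChildID = c.itemparentid
)
SELECT ItemChildID, ItemParentID, level FROM cte
UNION ALL
-- items that never appear in the link table, reported at level 0
SELECT NULL, i.id, 0
FROM Item i
WHERE NOT EXISTS (
  SELECT 1 FROM ItemLinksToItem l
  WHERE l.itemchildid = i.id OR l.itemparentid = i.id
)
""").fetchall()

for row in rows:
    print(row)
```

With the sample data this yields the three linked rows plus one level-0 row for 'Unrelated Item Three'.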

Related

Recursive SQL query with multiple columns

I have a table with the following columns
idRelationshipType int,
idPerson1 int,
idPerson2 int
This table allows me to indicate records in a database that should be linked together.
I need a query that returns all the unique ids linked to a given person, where that person's id appears in either the idPerson1 or idPerson2 column. The query also needs to be recursive: if a match is found in idPerson1, the value of idPerson2 is included in the result set and used to repeat the search, until no more matches are found.
Example data:
CREATE TABLE [dbo].[tbRelationships]
(
[idRelationshipType] [int],
[idPerson1] [int] ,
[idPerson2] [int]
)
INSERT INTO tbRelationships (idRelationshipType, idPerson1, idPerson2)
VALUES (1, 1, 2)
INSERT INTO tbRelationships (idRelationshipType, idPerson1, idPerson2)
VALUES (1, 2, 3)
INSERT INTO tbRelationships (idRelationshipType, idPerson1, idPerson2)
VALUES (1, 3, 4)
INSERT INTO tbRelationships (idRelationshipType, idPerson1, idPerson2)
VALUES (1, 5, 1)
Four 'Relationships' are defined here. For this query, I will only know one of the ids to begin with. I need a query that in concept works like
SELECT idPerson
FROM [some query]
WHERE [the id i have to start with] = @idPerson
AND idRelationshipType = @idRelationshipType
The returned result should be 5 rows with one column 'idPerson', with 1, 2, 3, 4, and 5 as the row values.
I have tried various combinations of UNPIVOT and recursive CTEs but I am not making much progress.
Any help would be greatly appreciated.
Thanks,
Daniel
I think this is what you want:
DECLARE @RelationshipType int
DECLARE @PersonId int
SELECT @RelationshipType = 1, @PersonId = 1
;WITH Hierachy (idPerson1, IdPerson2)
AS
(
--root
SELECT R.idPerson1, R.idPerson2
FROM tbRelationships R
WHERE R.idRelationshipType = @RelationshipType
AND (R.idPerson1 = @PersonId OR R.idPerson2 = @PersonId)
--recurse
UNION ALL
SELECT R.idPerson1, R.idPerson2
FROM Hierachy H
JOIN tbRelationships R
ON (R.idPerson1 = H.idPerson2
OR R.idPerson2 = H.idPerson1)
AND R.idRelationshipType = @RelationshipType
)
SELECT DISTINCT idPerson
FROM
(
SELECT idPerson1 AS idPerson FROM Hierachy
UNION
SELECT idPerson2 AS idPerson FROM Hierachy
) H
Essentially, get the first rows where the required id is in either column, then recurse to pick up every relationship that connects to a row already found, and finally merge both id columns into a single idPerson list.
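With the sample rows, the relationships (1, 2) and (5, 1) reach each other through person 1, so a walk over this table has to avoid revisiting people it has already seen. The sketch below shows one way to express that, run through Python's sqlite3 module. It is an illustration under assumptions, not a drop-in replacement for the query above: it relies on SQLite dropping duplicate rows when a recursive CTE is built with UNION, which SQL Server does not permit in the recursive member, and the names reachable and linked_people are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbRelationships (idRelationshipType INTEGER, idPerson1 INTEGER, idPerson2 INTEGER);
INSERT INTO tbRelationships VALUES (1, 1, 2), (1, 2, 3), (1, 3, 4), (1, 5, 1);
""")

def linked_people(conn, person_id, relationship_type):
    """Return every person id reachable from person_id, in either direction."""
    sql = """
    WITH RECURSIVE reachable (idPerson) AS (
      SELECT :person
      UNION  -- UNION (not UNION ALL) discards people already seen
      SELECT CASE WHEN r.idPerson1 = h.idPerson THEN r.idPerson2 ELSE r.idPerson1 END
      FROM reachable h
      JOIN tbRelationships r
        ON r.idRelationshipType = :rtype
       AND (r.idPerson1 = h.idPerson OR r.idPerson2 = h.idPerson)
    )
    SELECT idPerson FROM reachable ORDER BY idPerson
    """
    params = {"person": person_id, "rtype": relationship_type}
    return [row[0] for row in conn.execute(sql, params)]

people = linked_people(conn, 1, 1)
print(people)
```

Starting from person 1 with relationship type 1, the walk collects the five ids the question asks for. In SQL Server the same effect needs an explicit visited-path column instead of UNION.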

SQLServer : Grouping and Replacing a COLUMN value with the DATA from other table, without UDF

I would like to replace the numbers in the "Comments" column of @CommentsTable with the equivalent text from @ModTable, in a single SELECT and without using a UDF. Maybe with a CTE. I tried STUFF with REPLACE, but no luck.
Any suggestions would be a great help!
Sample:
DECLARE @ModTable TABLE
(
ID INT,
ModName VARCHAR(10),
ModPos VARCHAR(10)
)
DECLARE @CommentsTable TABLE
(
ID INT,
Comments VARCHAR(100)
)
INSERT INTO @CommentsTable
VALUES (1, 'MyFirst 5 Comments with 6'),
(2, 'MySecond comments'),
(3, 'MyThird comments 5')
INSERT INTO @ModTable
VALUES (1, '[FIVE]', '5'),
(1, '[SIX]', '6'),
(1, '[ONE]', '1'),
(1, '[TWO]', '2')
SELECT T1.ID, <<REPLACED COMMENTS>>
FROM @CommentsTable T1
GROUP BY T1.ID, T1.Comments
**Expected Result:**
ID Comments
1 MyFirst [FIVE] Comments with [SIX]
2 MySecond comments
3 MyThird comments [FIVE]
Create a cursor over @ModTable and apply each replacement in turn:
DECLARE @modpos varchar(10), @modname varchar(10);
DECLARE replcursor CURSOR FOR SELECT ModPos, ModName FROM @ModTable;
OPEN replcursor;
FETCH NEXT FROM replcursor INTO @modpos, @modname;
WHILE @@FETCH_STATUS = 0
BEGIN
-- Apply this replacement to every comment, on top of the earlier ones
UPDATE @CommentsTable SET Comments = REPLACE(Comments, @modpos, @modname);
FETCH NEXT FROM replcursor INTO @modpos, @modname;
END
CLOSE replcursor;
DEALLOCATE replcursor;
SELECT ID, Comments FROM @CommentsTable;
This updates @CommentsTable in place. If the original comments have to be preserved, copy them into a temp table first, run the loop against the copy, and select the results from it at the end.
You can use a while loop to iterate over the records and the mods. I slightly modified your @ModTable to have unique values for ID. If this is not your data structure, then you can use a window function like ROW_NUMBER() to get a unique value over which you can iterate.
Revised script example:
DECLARE @ModTable TABLE
(
ID INT,
ModName VARCHAR(10),
ModPos VARCHAR(10)
)
DECLARE @CommentsTable TABLE
(
ID INT,
Comments VARCHAR(100)
)
INSERT INTO @CommentsTable
VALUES (1, 'MyFirst 5 Comments with 6'),
(2, 'MySecond comments'),
(3, 'MyThird comments 5')
INSERT INTO @ModTable
VALUES (1, '[FIVE]', '5'),
(2, '[SIX]', '6'),
(3, '[ONE]', '1'),
(4, '[TWO]', '2')
declare @revisedTable table (id int, comments varchar(100))
declare @modcount int = (select count(*) from @ModTable)
declare @commentcount int = (select count(*) from @CommentsTable)
declare @currentcomment varchar(100) = ''
while @commentcount > 0
begin
set @modcount = (select count(*) from @ModTable)
set @currentcomment = (select Comments from @CommentsTable where ID = @commentcount)
while @modcount > 0
begin
set @currentcomment = REPLACE( @currentcomment,
(SELECT TOP 1 ModPos FROM @ModTable WHERE ID = @modcount),
(SELECT TOP 1 ModName FROM @ModTable WHERE ID = @modcount))
set @modcount = @modcount - 1
end
INSERT INTO @revisedTable (id, comments)
SELECT @commentcount, @currentcomment
set @commentcount = @commentcount - 1
end
SELECT *
FROM @revisedTable
order by id
I think this will work, even though I generally avoid recursive queries. It assumes that @ModTable has consecutive ids starting at 1, though:
with Comments as
(
select ID, Comments, 0 as ConnectID
from @CommentsTable
union all
select c.ID, cast(replace(c.Comments, m.ModPos, m.ModName) as varchar(100)), m.ID
from Comments c inner join @ModTable m on m.ID = c.ConnectID + 1
)
select * from Comments
where ConnectID = (select max(ID) from @ModTable)
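To see how a recursive CTE can chain the replacements, here is a small sketch run through Python's sqlite3 module. It assumes the revised @ModTable with IDs 1 to 4, uses ordinary tables in place of table variables, and uses SQLite's WITH RECURSIVE spelling; the CTE name replaced is made up for this example. Each recursive step applies the next mod (by ID) to the text produced by the previous step, so only the rows that have passed through the last mod hold the finished comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CommentsTable (ID INTEGER, Comments TEXT);
CREATE TABLE ModTable (ID INTEGER, ModName TEXT, ModPos TEXT);
INSERT INTO CommentsTable VALUES
  (1, 'MyFirst 5 Comments with 6'),
  (2, 'MySecond comments'),
  (3, 'MyThird comments 5');
INSERT INTO ModTable VALUES
  (1, '[FIVE]', '5'), (2, '[SIX]', '6'), (3, '[ONE]', '1'), (4, '[TWO]', '2');
""")

rows = conn.execute("""
WITH RECURSIVE replaced (ID, Comments, ConnectID) AS (
  -- step 0: the untouched comments
  SELECT ID, Comments, 0 FROM CommentsTable
  UNION ALL
  -- step n: apply mod number n to the output of step n-1
  SELECT c.ID, replace(c.Comments, m.ModPos, m.ModName), m.ID
  FROM replaced c
  JOIN ModTable m ON m.ID = c.ConnectID + 1
)
SELECT ID, Comments FROM replaced
WHERE ConnectID = (SELECT max(ID) FROM ModTable)
ORDER BY ID
""").fetchall()

for row in rows:
    print(row)
```

The approach depends on the mod IDs being gap-free; a missing ID would end the chain early and the final filter would return nothing.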
In the end I went with a CLR function. CommentsTable has a lot of records, and ModTable can hold multiple ModName values for each comment. Thanks to all of you for the suggestions and pointers.

Avoiding duplicate recursion with Common Table Expressions

Let's say I have a table with 2 columns ID and ParentID. My data looks like this:
ID ParentID
1 Null
2 1
3 1
4 2
4 2
So to find all relationships based on a given ID my query simplified looks like this:
WITH links ([ID], [ParentID], Depth)
AS
(
--Get the starting link
SELECT
[ID],
[ParentID],
[Depth] = 1
FROM
[MyTable]
WHERE
[ID] = @StartID
UNION ALL
--Recursively get links that are parented to links already in the CTE
SELECT
mt.[ID],
mt.[ParentID],
[Depth] = l.[Depth] + 1
FROM
[MyTable] mt
JOIN
links l ON mt.ParentID = l.ID
WHERE
Depth < 99
)
SELECT
[Depth],
[ID],
[ParentID]
FROM
[links]
Now let's say the data in my table creates a cyclical relationship (4 is parented to 2 and 2 is parented to 4). Forgetting for a moment that there should likely be constraints on the database to prevent this, the above recursive CTE query produces duplicate records (99 of them), because it recursively evaluates the cyclical relationship between 2 and 4.
ID ParentID
1 Null
2 1
3 1
4 2
2 4
2 4
How can I alter my query to prevent that, assuming that I have no control over whether the data contains that cyclical relationship? Normally I would put a DISTINCT on the final select, but I want the Depth value, which makes every record distinct. I'm also hoping to handle it within the CTE, since a DISTINCT operates on the final select and is probably not as efficient.
You could build a tree path column in the CTE that records the entire path from the top of the recursive query, then check whether the ID in question is already in that path. If it is, stop recursing at that point.
USE Master;
GO
CREATE DATABASE [QueryTraining];
GO
USE [QueryTraining];
GO
CREATE TABLE [MyTable] (
ID int, --would normally be an INT IDENTITY
ParentID int
);
INSERT INTO [MyTable] (ID, ParentID)
VALUES (1, NULL),
(2, 1),
(3, 1),
(4, 2),
(2, 4),
(2, 4);
DECLARE @StartID AS INTEGER;
SET @StartID = 1;
;WITH links (ID, ParentID, Depth, treePath)
AS
(
--Get the starting link
SELECT [ID],
[ParentID],
[Depth] = 1,
--Wrap the ID in ':' on both sides so the CHARINDEX check below can find it
CAST(':' + CAST([ID] AS VARCHAR(MAX)) + ':' AS VARCHAR(MAX)) AS treePath
FROM [MyTable]
WHERE [ID] = @StartID
UNION ALL
--Recursively get links that are parented to links already in the CTE
SELECT mt.[ID],
mt.[ParentID],
[Depth] = l.[Depth] + 1,
CAST(l.treePath + CAST(mt.[ID] AS VARCHAR(MAX)) + ':' AS VARCHAR(MAX)) AS treePath
FROM [MyTable] mt
INNER JOIN links l ON mt.ParentID = l.ID
AND CHARINDEX(':' + CAST(mt.[ID] AS VARCHAR(MAX)) + ':', l.[treePath]) = 0
WHERE Depth < 10
)
SELECT
[Depth],
[ID],
[ParentID],
[treePath]
FROM
[links];
The condition on the INNER JOIN,
AND CHARINDEX(':' + CAST(mt.[ID] AS VARCHAR(MAX)) + ':', l.[treePath]) = 0
is where IDs that already appear in the path get filtered out.
Just copy and paste the example and give it a try.
One note, the way that I am using CHARINDEX on the CTE may not scale well, but it does accomplish what I think you are looking for.
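The path check can be tried out quickly with Python's sqlite3 module. This is a sketch under assumptions, not the SQL Server script itself: it uses SQLite's || concatenation and instr() in place of + and CHARINDEX, loads each sample row once, and delimits every ID in the path with ':' on both sides so that a search for ':2:' cannot match inside a longer number such as ':12:'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyTable (ID INTEGER, ParentID INTEGER);
-- 4 -> 2 and 2 -> 4 form the cycle from the question
INSERT INTO MyTable VALUES (1, NULL), (2, 1), (3, 1), (4, 2), (2, 4);
""")

rows = conn.execute("""
WITH RECURSIVE links (ID, ParentID, Depth, treePath) AS (
  -- starting link: the path begins as ':<ID>:'
  SELECT ID, ParentID, 1, ':' || ID || ':'
  FROM MyTable
  WHERE ID = ?
  UNION ALL
  -- follow children, but skip any ID that is already on this path
  SELECT mt.ID, mt.ParentID, l.Depth + 1, l.treePath || mt.ID || ':'
  FROM MyTable mt
  JOIN links l ON mt.ParentID = l.ID
  WHERE instr(l.treePath, ':' || mt.ID || ':') = 0
)
SELECT Depth, ID, ParentID, treePath FROM links ORDER BY treePath
""", (1,)).fetchall()

for row in rows:
    print(row)
```

Starting from ID 1, the recursion stops when it would step from 4 back to 2, so each ID is visited once on its path and no depth cap is needed to terminate.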

Inserting random number of rows in SQL Server via join to integer list is inconsistent

I am creating a database with sample data. Each time I run the stored procedure to generate some new data for my sample database, I would like to clear out and repopulate table B ("Item") based on all the rows in table A ("Product").
If table A contained the rows with primary key values 1, 2, 3, 4, and 5, I would want table B to have a foreign key for table A and insert a random number of rows into table B for each table A row. (We are essentially stocking the shelves with a random number of "item" for any given "product.")
I am using code from this answer to generate a list of numbers. I join to the results of this function to create the rows to insert:
WITH cte AS
(
SELECT
ROW_NUMBER() OVER (ORDER BY (select 0)) AS i
FROM
sys.columns c1 CROSS JOIN sys.columns c2 CROSS JOIN sys.columns c3
)
SELECT i
FROM cte
WHERE
i BETWEEN @p_Min AND @p_Max AND
i % @p_Increment = 0
Random numbers are generated in a view (to get around the limitations of functions) as follows:
-- Mock.NewGuid view
SELECT id = ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT))
And a function that returns the random numbers:
-- Mock.GetRandomInt(min, max) function definition
DECLARE @random int;
SELECT @random = Id % (@MaxValue - @MinValue + 1) FROM Mock.NewGuid;
RETURN @random + @MinValue;
However, when you look at this code and execute it...
WITH Products AS
(
SELECT ProductId, ItemCount = Mock.GetRandomInt(1,5)
FROM Product.Product
)
SELECT A = Products.ProductId, B = i
FROM Products
JOIN (SELECT i FROM Mock.GetIntList(1,5,1)) Temp ON
i < Products.ItemCount
ORDER BY ProductId, i
... this returns some inconsistent results!
A,B
1,1
1,2
1,3
2,1
2,2
3,2 <-- where is 1?
3,3
4,1
5,3 <-- where is 1, 2?
6,1
I would expect that, for every product id, the JOIN results in 1-5 rows. However, it seems like values get skipped! This is even more apparent with larger data sets. I was originally trying to generate 20-50 rows in Item for each Product row, but this resulted in only 30-40 rows for each product.
The question: Any idea why this is happening? Each product should have a random number of rows (between 1 and 5) inserted for it and the B value should be sequential! Instead, some numbers are missing!
This issue also happens if I store numbers in a table I created and then join to that, or if I use a recursive CTE.
I am using SQL Server 2008R2, but I believe I see the same issue on my 2012 database as well. Compatibility levels are 2008 and 2012 respectively.
This is a fun problem. I've dealt with this in a roundabout way a number of times. I am sure there is a way to do it without a cursor, but why not use one? This is a cheap problem memory-wise, so long as @RandomMaxRecords doesn't get huge and you don't have a significant number of product records. If the data in the Item table is meaningless, then I would suggest truncating it at the point where I define the temp table #Item. And obviously you will pull from your Product table, not the temp table I have created for testing.
This is a fantastic article and describes in detail how I arrive at my solution. Less Than Dot Blog
CODE
--This is your product table with 5 random products
IF OBJECT_ID('tempdb..#Product') IS NOT NULL DROP TABLE #Product
CREATE TABLE #Product
(
ProductID INT PRIMARY KEY IDENTITY(1,1),
ProductName VARCHAR(25),
ProductDescription VARCHAR(max)
)
INSERT INTO #Product (ProductName,ProductDescription) VALUES ('Product Name 1','Product Description 1'),
('Product Name 2','Product Description 2'),
('Product Name 3','Product Description 3'),
('Product Name 4','Product Description 4'),
('Product Name 5','Product Description 5')
--This is your item table. This would probably just be a truncate statement so that your table is reset for the new values to go in
IF OBJECT_ID ('tempdb..#Item') IS NOT NULL DROP TABLE #Item
CREATE TABLE #Item
(
ItemID INT PRIMARY KEY IDENTITY(1,1),
FK_ProductID INT NOT NULL,
ItemName VARCHAR(25),
ItemDescription VARCHAR(max)
)
--Declare a bunch of variables for the cursor and insert into the item table process
DECLARE @ProductID INT
DECLARE @ProductName VARCHAR(25)
DECLARE @ProductDescription VARCHAR(max)
DECLARE @RandomItemCount INT
DECLARE @RowEnumerator INT
DECLARE @RandomMaxRecords INT = 10
--We declare a cursor to iterate over the records in product and generate random amounts of items
DECLARE ItemCursor CURSOR
FOR SELECT * FROM #Product
OPEN ItemCursor
FETCH NEXT FROM ItemCursor INTO @ProductID, @ProductName, @ProductDescription
WHILE (@@FETCH_STATUS <> -1)
BEGIN
--Get the Random Number into the variable. And we only want 1 or more records. Mod division will produce a 0.
SELECT @RandomItemCount = ABS(CHECKSUM(NewID())) % @RandomMaxRecords
SELECT @RandomItemCount = CASE @RandomItemCount WHEN 0 THEN 1 ELSE @RandomItemCount END
--Iterate on the RowEnumerator to the RandomItemCount and insert item rows
SET @RowEnumerator = 1
WHILE (@RowEnumerator <= @RandomItemCount)
BEGIN
INSERT INTO #Item (FK_ProductID,ItemName,ItemDescription)
SELECT @ProductID, REPLACE(@ProductName,'Product','Item'),REPLACE(@ProductDescription,'Product','Item')
SELECT @RowEnumerator = @RowEnumerator + 1
END
FETCH NEXT FROM ItemCursor INTO @ProductID, @ProductName, @ProductDescription
END
CLOSE ItemCursor
DEALLOCATE ItemCursor
GO
--Look at the result
SELECT
*
FROM
#Product AS P
RIGHT JOIN #Item AS I ON (P.ProductID = I.FK_ProductID)
--Cleanup
DROP TABLE #Product
DROP TABLE #Item
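The cursor loop above boils down to two steps: pick the item count once per product, then emit that many rows. A minimal Python sketch of the same idea follows. The function name stock_items and its parameters are hypothetical, and it only models the row generation, not the database inserts. It also assumes, which the thread does not confirm, that the gaps in the question come from the random count being re-evaluated for each joined row rather than once per product.

```python
import random

def stock_items(product_ids, min_items, max_items, rng):
    """Return (product_id, item_number) pairs, numbered 1..count per product."""
    rows = []
    for product_id in product_ids:
        # Draw the count once per product, before any rows are generated,
        # so every item number from 1 to count is present.
        count = rng.randint(min_items, max_items)
        rows.extend((product_id, i) for i in range(1, count + 1))
    return rows

rng = random.Random(42)  # seeded only so the example is repeatable
for row in stock_items([1, 2, 3, 4, 5], 1, 5, rng):
    print(row)
```

Because the count is fixed before the inner loop runs, the item numbers for each product are always contiguous, whatever the random draw turns out to be.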
It looks like a LEFT OUTER JOIN to GetIntList (as opposed to INNER JOIN) fixes the problem I am having.

How to use CTE to map parent-child relationship?

Say I have a table of items representing tree-structured data, and I would like to keep tracing upward until I reach the top node, marked by a parent_id of NULL. What would my MS SQL CTE (common table expression) look like?
For example, if I were to get the path to get to the top from Bender, it would look like
Comedy
Futurama
Bender
Thanks, and here's the sample data:
DECLARE @t Table(id int, description varchar(50), parent_id int)
INSERT INTO @t
SELECT 1, 'Comedy', null UNION
SELECT 2, 'Futurama', 1 UNION
SELECT 3, 'Dr. Zoidberg', 2 UNION
SELECT 4, 'Bender', 2 UNION
SELECT 5, 'Stand-up', 1 UNION
SELECT 6, 'Unfunny', 5 UNION
SELECT 7, 'Dane Cook', 6
It should look like this:
declare @desc varchar(50)
set @desc = 'Bender'
;with Parentage as
(
select * from @t where description = @desc
union all
select t.*
from @t t
inner join Parentage p
on t.id = p.parent_id
)
select * from Parentage
order by id asc --sorts it root-first
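The same upward walk can be sketched with Python's sqlite3 module. This version is an illustration under assumptions: it uses an ordinary table named t in place of the table variable, SQLite's WITH RECURSIVE spelling, and an extra hops column (not part of the answer above) that counts the steps from the starting node. Ordering by hops gives a root-first path even when ids do not increase with depth; the helper name path_to_root is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER, description TEXT, parent_id INTEGER);
INSERT INTO t VALUES
  (1, 'Comedy', NULL), (2, 'Futurama', 1), (3, 'Dr. Zoidberg', 2),
  (4, 'Bender', 2), (5, 'Stand-up', 1), (6, 'Unfunny', 5), (7, 'Dane Cook', 6);
""")

def path_to_root(conn, description):
    """Return the descriptions from the root down to the named node."""
    sql = """
    WITH RECURSIVE Parentage (id, description, parent_id, hops) AS (
      -- start at the requested node
      SELECT id, description, parent_id, 0 FROM t WHERE description = ?
      UNION ALL
      -- climb one level: fetch the row whose id is the current parent_id
      SELECT t.id, t.description, t.parent_id, p.hops + 1
      FROM t
      JOIN Parentage p ON t.id = p.parent_id
    )
    SELECT description FROM Parentage ORDER BY hops DESC
    """
    return [row[0] for row in conn.execute(sql, (description,))]

print(path_to_root(conn, "Bender"))
```

The recursion ends on its own at the top node, because a NULL parent_id never matches any id in the join.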
