Common Table Expressions to retrieve path - sql-server

I'm trying to use a recursive query to find a path through a table structured like this:
RelatedEntities
FromKey TINYINT
ToKey TINYINT
...more....
I thought I could do something like this:
DECLARE @startKey UNIQUEIDENTIFIER, @endKey UNIQUEIDENTIFIER;
SET @startKey = 0;
SET @endKey = 3;
;With findPath
AS
(
SELECT FromKey, ToKey
FROM RelatedEntities
WHERE FromKey = @startKey
UNION ALL
SELECT FromKey, ToKey
FROM RelatedEntities r
JOIN findPath b
ON r.FromKey = b.ToKey
AND r.FromKey NOT IN (SELECT FromKey FROM b)
)
SELECT * FROM findPath;
This code fails because I cannot use a subquery within a CTE. It also seems to be a rule that the recursive query can only contain one reference to the CTE. (true?) Maybe this is a job for a cursor, or procedural code, but I thought I would put it out here in case I'm missing a way to find a path through a table with a CTE?
The parameters are:
Start with a beginning and ending key
Base query uses the beginning key
Recursive query should stop when it contains the ending key, (have not been able to figure that one out) and should not repeat start keys.
A MAXRECURSION option could be used to stop after a certain number of iterations.
Thanks to all of you CTE gurus out there. Set me straight.
Changing this from UNIQUEIDENTIFIERS to TINYINT for readability. The SQL constructs are the same. Here's some code to test it.
CREATE TABLE RelatedEntities(FromKey TINYINT, ToKey TINYINT);
INSERT RelatedEntities(FromKey, ToKey)
VALUES
(1, 0),
(0, 1),
(1, 7),
(7, 1),
(3, 4),
(4, 3)
;With FindPath
AS
(
SELECT FromKey, ToKey, 0 AS recursionLevel
FROM RelatedEntities
WHERE FromKey = 1
UNION ALL
SELECT r.FromKey, r.ToKey, recursionLevel = recursionLevel + 1
FROM RelatedEntities r
INNER JOIN FindPath b ON r.FromKey = b.ToKey
WHERE b.ToKey <> 3 AND RecursionLevel < 10
)
SELECT * FROM FindPath
ORDER BY recursionLevel
Note that this returns from 1, to 0, then from 0, to 1, and repeats until I run out of recursion levels.

You need to modify your query like this:
DECLARE @startKey UNIQUEIDENTIFIER, @endKey UNIQUEIDENTIFIER;
DECLARE @maxRecursion INT = 100;
SET @startKey = '00000000-0000-0000-0000-000000000000';
SET @endKey = 'F7801327-C037-AA93-67D1-B7892F6093A7';
;With FindPath
AS
(
SELECT FromKey, ToKey, 0 AS recursionLevel
FROM RelatedEntities
WHERE FromKey = @startKey
UNION ALL
SELECT r.FromKey, r.ToKey, recursionLevel = recursionLevel +1
FROM RelatedEntities r
INNER JOIN FindPath b ON r.FromKey = b.ToKey
WHERE b.ToKey <> @endKey AND recursionLevel < @maxRecursion
)
SELECT * FROM FindPath;
The anchor member of the above CTE:
SELECT FromKey, ToKey, 0 AS recursionLevel
FROM RelatedEntities
WHERE FromKey = @startKey
selects the starting record, T0, of the (From, To) chain of records.
The recursive member of the CTE:
SELECT r.FromKey, r.ToKey, recursionLevel = recursionLevel +1
FROM RelatedEntities r
INNER JOIN FindPath b ON r.FromKey = b.ToKey
WHERE b.ToKey <> @endKey AND recursionLevel < @maxRecursion
is executed with T0, T1, ... as input and produces T1, T2, ... respectively as output.
This process keeps adding records to the final result set until the recursive member returns an empty set, i.e. until a record with ToKey = @endKey has been added to the result set or the @maxRecursion level has been reached.
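To make the level-by-level evaluation described above easier to check outside SQL Server, here is a minimal Python sketch that models it. It is only a model of the result set, not of how the engine executes the query. The function name `find_path` and the list `EDGES` are made up for illustration, and the rows are the TINYINT test data from the question.

```python
# Hypothetical Python model of the recursive CTE's result set; not T-SQL.
EDGES = [(1, 0), (0, 1), (1, 7), (7, 1), (3, 4), (4, 3)]  # (FromKey, ToKey) test rows


def find_path(edges, start_key, end_key, max_recursion):
    """Return (from_key, to_key, level) rows the way the CTE would produce them."""
    # Anchor member: rows whose FromKey is the start key, at level 0.
    current = [(f, t, 0) for f, t in edges if f == start_key]
    result = list(current)
    while current:
        # Recursive member: join the previous level (b) to the table (r) on
        # r.FromKey = b.ToKey, keeping only rows allowed by the WHERE clause.
        nxt = [
            (f, t, level + 1)
            for (_, prev_to, level) in current
            for f, t in edges
            if f == prev_to and prev_to != end_key and level < max_recursion
        ]
        result.extend(nxt)
        current = nxt  # an empty level ends the recursion
    return result


rows = find_path(EDGES, start_key=1, end_key=3, max_recursion=3)
for row in rows:
    print(row)
```

With this data, key 3 is never reachable from key 1, so the walk bounces between 1, 0 and 7 until the recursion limit cuts it off. That is the repetition reported in the question.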
EDIT:
You can use the following query to handle circular paths effectively:
;With FindPath
AS
(
SELECT FromKey, ToKey,
0 AS recursionLevel,
CAST(FromKey AS VARCHAR(MAX)) AS FromKeys
FROM RelatedEntities
WHERE FromKey = 1
UNION ALL
SELECT r.FromKey, r.ToKey,
recursionLevel = recursionLevel + 1,
FromKeys = FromKeys + ',' + CAST(r.FromKey AS VARCHAR(MAX))
FROM RelatedEntities r
INNER JOIN FindPath b ON r.FromKey = b.ToKey
WHERE (b.ToKey <> 3)
AND (RecursionLevel < 10)
AND PATINDEX('%,' + CAST(r.ToKey AS VARCHAR(MAX)) + ',%', ',' + FromKeys + ',') = 0
)
SELECT * FROM FindPath
ORDER BY recursionLevel
The calculated column FromKeys carries the FromKey values forward to the next recursion level, so the keys from all previous levels accumulate through string concatenation. PATINDEX is then used to check whether the next row would close a circular path.
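The same pruning idea can be sketched in Python, under the assumption that a list of visited keys stands in for the comma-separated string and a membership test stands in for PATINDEX. The names `find_path_no_cycles`, `EDGES` and `CHAIN` are hypothetical.

```python
# Hypothetical sketch of the cycle-pruning walk; a list replaces the
# comma-separated FromKeys string, so no delimiter handling is needed.
EDGES = [(1, 0), (0, 1), (1, 7), (7, 1), (3, 4), (4, 3)]  # rows from the question


def find_path_no_cycles(edges, start_key, end_key, max_recursion=10):
    """Return (from_key, to_key, level, from_keys) rows, skipping circular steps."""
    # Anchor: every row carries the FromKeys seen so far on its own path.
    current = [(f, t, 0, [f]) for f, t in edges if f == start_key]
    result = list(current)
    while current:
        nxt = []
        for _, prev_to, level, from_keys in current:
            if prev_to == end_key or level >= max_recursion:
                continue  # same stop conditions as the WHERE clause
            for f, t in edges:
                # Counterpart of the PATINDEX test: drop a row whose ToKey is
                # already among the FromKeys accumulated by the parent row.
                if f == prev_to and t not in from_keys:
                    nxt.append((f, t, level + 1, from_keys + [f]))
        result.extend(nxt)
        current = nxt
    return result


# With the question's data, both second-level rows lead back to key 1,
# so only the two anchor rows survive.
print(find_path_no_cycles(EDGES, start_key=1, end_key=3))

# A made-up chain 1 -> 2 -> 3 with back edges shows a path being found.
CHAIN = [(1, 2), (2, 1), (2, 3), (3, 2)]
print(find_path_no_cycles(CHAIN, start_key=1, end_key=3))
```

Because each row keeps its own list, two different branches can visit the same key without blocking each other. Only a step that returns to a key on the same branch is dropped.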
SQL Fiddle Demo here

Related

Is it always possible to transform multiple spatial selects with while loop and variables into a single query without using temp tables in sql?

This problem can be solved with temp table, however, I don't want to use Temp table or var table, this question is mostly for my personal educational purposes.
I inherited the following SQL:
DECLARE @i int = 993
while @i <=1000
begin
declare @lat nvarchar(20)
select top 1 @lat = SUBSTRING(Address,0,CHARINDEX(',',Address,0)) from dbo.rent
where id = @i;
declare @lon nvarchar(20)
select top 1 @lon = SUBSTRING(Address,CHARINDEX(',',Address)+1,LEN(Address)) from dbo.rent
where id = @i
declare @p GEOGRAPHY = GEOGRAPHY::STGeomFromText('POINT('+ @lat +' '+@lon+')', 4326)
select price/LivingArea sq_m, (price/LivingArea)/avg_sq_m, * from
(select (sum(price)/sum(LivingArea)) avg_sq_m, count(1) cnt, @i id from
(select *, GEOGRAPHY::STGeomFromText('POINT('+
convert(nvarchar(20), SUBSTRING(Address,0,CHARINDEX(',',Address,0)))+' '+
convert( nvarchar(20), SUBSTRING(Address,CHARINDEX(',',Address)+1,LEN(Address)))+')', 4326)
.STBuffer(500).STIntersects(@p) as [Intersects]
from dbo.rent
where Address is not null
) s
where [Intersects] = 1) prox
inner join dbo.rent r on prox.id = r.id
set @i = @i+1
end
It is used to analyze prices per square meter of properties that are in proximity to each other and to compare them to see which ones are cheaper.
Problem: the calling mechanism has to be moved from C# to SQL, and all queries have to be combined into a single result (currently you get one row per loop iteration). In other words, @i and @p have to go and become a condition such as id < x and id > y, or somehow be joined in.
The procedure is a cut-down version of the actual thing, but with a solution to the above I will have no problem making the whole thing work.
I am of the opinion that any SQL mechanism with variables and loops can be transformed into a single SQL statement, hence the question.
SqlFiddle
If I understand your question properly (remove the need for loops and return one data set), then you can use CTEs (Common Table Expressions) in place of the lat, lon and geography variables.
Your SQL Fiddle was referencing a database called "webanalyser", so I removed that from the query below.
However, the query will not return anything against the sample data, because the Address column there contains incorrect data.
;WITH cteLatsLongs
AS(
SELECT
lat = SUBSTRING(Address, 0, CHARINDEX(',', Address, 0))
,lon = SUBSTRING(Address, CHARINDEX(',', Address) + 1, LEN(Address))
FROM dbo.rent
)
,cteGeogs
AS(
SELECT
Geog = GEOGRAPHY ::STGeomFromText('POINT(' + LL.lat + ' ' + LL.lon + ')', 4326)
FROM cteLatsLongs LL
),cteIntersects
AS(
SELECT *,
GEOGRAPHY::STGeomFromText('POINT(' + CONVERT(NVARCHAR(20), SUBSTRING(Address, 0, CHARINDEX(',', Address, 0))) + ' ' + CONVERT(NVARCHAR(20), SUBSTRING(Address, CHARINDEX(',', Address) + 1, LEN(Address))) + ')', 4326).STBuffer(500).STIntersects(G.Geog) AS [Intersects]
FROM dbo.rent
CROSS APPLY cteGeogs G
)
SELECT avg_sq_m = (SUM(price) / SUM(LivingArea)), COUNT(1) cnt
FROM
cteIntersects I
WHERE I.[Intersects] = 1
It can be done. In this specific case, the necessary 'discovery' was the ability to JOIN on a point, i.e. to join tables on proximity. (Another small cheat was to precompute actual points from the point strings, but that is just an optimization.) Once this is done, the query can be rewritten as follows:
SELECT adds.Url,
adds.Price/adds.LivingArea Sqm,
(adds.Price/adds.LivingArea)/k1.sale1Avg ratio,
*
FROM
(SELECT baseid,
count(k1Rent.rentid) rent1kCount,
sum(k1Rent.RperSqM)/(count(k1Rent.rentid)) AS rent1kAvgSqM,
count(around1k.SaleId) sale1kCount,
(sum(k1sale.price)/sum(k1Sale.LivingArea)) sale1Avg,
(sum(k1sale.price)/sum(k1Sale.LivingArea))/((sum(k1Rent.RperSqM)/(count(k1Rent.rentid)))*12) years --*
FROM
(SELECT sa.id baseid,
s.id saleid,
s.RoomCount,
POINT
FROM SpatialAnalysis sa
INNER JOIN Sale s ON s.Id = SaleId
WHERE sa.SalesIn1kRadiusCount IS NULL) AS base
JOIN SpatialAnalysis around1k ON base.Point.STBuffer(1000).STIntersects(around1k.Point) = 1
LEFT OUTER JOIN
(SELECT id rentid,
rc,
Price/avgRoomSize RperSqM
FROM
(SELECT *
FROM
(SELECT rc,
sum(avgArea*c)/sum(c) avgRoomSize
FROM
(SELECT roomcount rc,
avg(livingarea) avgArea,
count(1) c
FROM Rent
WHERE url LIKE '%systemname%'
AND LivingArea IS NOT NULL
GROUP BY RoomCount
UNION
(SELECT roomcount rc,
avg(livingarea) avgArea,
count(1) c
FROM sale
WHERE url LIKE '%systemname%'
AND LivingArea IS NOT NULL
GROUP BY RoomCount))uni
GROUP BY rc) avgRoom) avgrents
JOIN rent r ON r.RoomCount = avgrents.rc) k1Rent ON k1Rent.rentid =around1k.RentId
AND base.RoomCount = k1Rent.rc
LEFT OUTER JOIN Sale k1Sale ON k1Sale.Id = around1k.SaleId
AND base.RoomCount = k1Sale.RoomCount
GROUP BY baseid) k1
left outer join SpatialAnalysis sp on sp.Id = baseid
left outer join Sale adds on adds.Id = sp.SaleId
where adds.Price < 250000
order by years, ratio

Character-by-character comparison strings in sql

How can I compare two strings by their characters, i.e. check that both strings consist of the same set of symbols, using T-SQL?
For example:
'aaabbcd' vs 'ddbca' (TRUE): both strings consist of the same symbols
'abcddd' vs 'cda' (FALSE): both strings do not consist of the same symbols
If performance is important then I would suggest a purely set-based solution using Ngrams8k.
This will give you the correct answer:
SELECT AllSame = COALESCE(MAX(0),1)
FROM dbo.ngrams8k(@string1, 1) ng1
FULL JOIN dbo.ngrams8k(@string2, 1) ng2 ON ng1.token = ng2.token
WHERE ng1.token IS NULL OR ng2.token IS NULL;
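The logic of that query is a symmetric difference: split both strings into single characters, full-join them, and keep only the characters that have no partner on the other side. If nothing is left, the strings use the same symbols. A minimal Python sketch of the same check follows. The function name `same_symbols` is made up, and unlike SQL Server, where the comparison follows the column collation, Python compares characters case-sensitively.

```python
# Hypothetical sketch of the FULL JOIN logic as a set operation.
def same_symbols(string1: str, string2: str) -> bool:
    """True when both strings are built from exactly the same set of characters."""
    # Characters present in only one of the strings correspond to the rows
    # kept by "WHERE ng1.token IS NULL OR ng2.token IS NULL".
    unmatched = set(string1) ^ set(string2)
    return not unmatched


print(same_symbols("aaabbcd", "ddbca"))  # True
print(same_symbols("abcddd", "cda"))     # False
```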
To use this logic against a table you could use CROSS APPLY like so:
-- Sample data
DECLARE @table TABLE (string1 varchar(100), string2 varchar(100));
INSERT @table VALUES ('aaabbcd','ddbca'),('abcddd','cda');
-- Solution using CROSS APPLY
SELECT *
FROM @table t
CROSS APPLY
(
SELECT AllSame = COALESCE(MAX(0),1)
FROM dbo.ngrams8k(t.string1, 1) ng1
FULL JOIN dbo.ngrams8k(t.string2, 1) ng2 ON ng1.token = ng2.token
WHERE ng1.token IS NULL OR ng2.token IS NULL
) x;
Results:
string1 string2 AllSame
--------- --------- --------
aaabbcd ddbca 1
abcddd cda 0
Not only will this be the fastest solution presented thus far; notice also that we're getting the job done with as little code as possible.
UPDATE TO INCLUDE COMPARE PERFORMANCE TO MARTIN SMITH'S SOLUTION
-- sample data
IF OBJECT_ID('tempdb..#sample') IS NOT NULL DROP TABLE #sample;
SELECT TOP (10000)
string1 = replicate('a',abs(checksum(newid())%5))+replicate('b',abs(checksum(newid())%4))+
replicate('c',abs(checksum(newid())%5))+replicate('d',abs(checksum(newid())%4))+
replicate('e',abs(checksum(newid())%5))+replicate('f',abs(checksum(newid())%4)),
string2 = replicate('a',abs(checksum(newid())%5))+replicate('b',abs(checksum(newid())%4))+
replicate('c',abs(checksum(newid())%5))+replicate('d',abs(checksum(newid())%4))+
replicate('e',abs(checksum(newid())%5))+replicate('f',abs(checksum(newid())%4))
INTO #sample
FROM sys.all_columns a, sys.all_columns b;
SET NOCOUNT ON;
SET STATISTICS TIME ON;
PRINT 'ajb serial'+char(10)+replicate('-',50);
SELECT flag
FROM #sample t
CROSS APPLY
(
SELECT Flag = COALESCE(MAX(0),1)
FROM dbo.ngrams8k(t.string1, 1) ng1
FULL JOIN dbo.ngrams8k(t.string2, 1) ng2 ON ng1.token = ng2.token
WHERE ng1.token IS NULL OR ng2.token IS NULL
) x
OPTION (MAXDOP 1);
PRINT 'ajb parallel'+char(10)+replicate('-',50);
SELECT flag
FROM #sample t
CROSS APPLY
(
SELECT Flag = COALESCE(MAX(0),1)
FROM dbo.ngrams8k(t.string1, 1) ng1
FULL JOIN dbo.ngrams8k(t.string2, 1) ng2 ON ng1.token = ng2.token
WHERE ng1.token IS NULL OR ng2.token IS NULL
) x
OPTION (querytraceon 8649);
PRINT 'M Smith - serial'+char(10)+replicate('-',50);
WITH Nums AS
(
SELECT TOP (100) ROW_NUMBER() OVER ( ORDER BY (SELECT NULL)) number
FROM sys.all_columns
)
SELECT flag
FROM #sample T
CROSS APPLY (SELECT CASE WHEN Min(Cnt) = 2 THEN 1 ELSE 0 END AS Flag
FROM (SELECT Count(*) AS Cnt
FROM (SELECT 1 AS s,
Substring(t.string1, N1.number, 1) AS c
FROM Nums N1
WHERE N1.number <= Len(t.string1)
UNION
SELECT 2 AS s,
Substring(t.string2, N2.number, 1) AS c
FROM Nums N2
WHERE N2.number <= Len(t.string2)) D1
GROUP BY c) D2
) Ca
OPTION (MAXDOP 1);
SET STATISTICS TIME OFF;
Results:
ajb serial
--------------------------------------------------
SQL Server Execution Times:
CPU time = 656 ms, elapsed time = 660 ms.
ajb parallel
--------------------------------------------------
SQL Server Execution Times:
CPU time = 1281 ms, elapsed time = 204 ms.
M Smith serial
--------------------------------------------------
SQL Server Execution Times:
CPU time = 1390 ms, elapsed time = 1393 ms.
Note that I did not test Martin's solution with a parallel plan because, as is, that query cannot run in parallel.
An inline method.
This uses a numbers table
CREATE TABLE dbo.Numbers (number INT PRIMARY KEY);
INSERT INTO dbo.Numbers
SELECT TOP 8000 ROW_NUMBER() OVER (ORDER BY @@SPID)
FROM sys.all_columns c1,
sys.all_columns c2
A version without a numbers table, but with lower performance, is in the edit history if you would rather give up some performance than depend on one.
WITH T(S1, S2)
AS (SELECT 'aaabbcd',
'ddbca'
UNION ALL
SELECT 'abcddd',
'cda')
SELECT *
FROM T
CROSS APPLY (SELECT CASE WHEN Min(Cnt) = 2 THEN 1 ELSE 0 END AS Flag
FROM (SELECT Count(*) AS Cnt
FROM (SELECT 1 AS s,
Substring(S1, N1.number, 1) AS c
FROM dbo.Numbers N1
WHERE N1.number <= Len(S1)
UNION
SELECT 2 AS s,
Substring(S2, N2.number, 1) AS c
FROM dbo.Numbers N2
WHERE N2.number <= Len(S2)) D1
GROUP BY c) D2
) Ca
You can use LIKE with the pattern '%your-search-string%' to check whether a string contains a given substring.
SELECT * FROM TableName
WHERE Name LIKE '%searchText%'
You can use a stored procedure to check the characters of the string one by one.
CREATE PROCEDURE IsStringMatching
(
@originalString NVARCHAR(32) ,
@stringToBeChecked NVARCHAR(32),
@IsMatching BIT OUTPUT
)
AS
BEGIN
DECLARE @inputStringCount INT = LEN(@originalString);
DECLARE @loopCount INT = 0, @temp INT;
DECLARE @char NCHAR(1); -- one character, matching the NVARCHAR inputs
SET @IsMatching = 1
-- Note: this only checks that every character of @originalString occurs in
-- @stringToBeChecked; it does not check the reverse direction.
WHILE @loopCount < @inputStringCount
BEGIN
SET @char = SUBSTRING(@originalString,@loopCount+1,1);
SET @temp = CHARINDEX(@char, @stringToBeChecked,1);
IF(@temp = 0)
BEGIN
SET @IsMatching = 0;
BREAK;
END
SET @loopCount = @loopCount + 1;
END;
END
You can validate like this:
DECLARE @IsMatching BIT;
EXECUTE IsStringMatching N'aaabbcd', N'ABC', @IsMatching OUTPUT;
SELECT @IsMatching;

TSQL: How to output records even when the WHERE condition doesn't match

I have a temp table with three columns: "ID", "Cost" and "MaxCost". Below is my SELECT statement, which selects rows for a particular ID.
SELECT
t.Cost,
t.MaxCost
FROM #temp t
WHERE t.ID = @ID
How do I modify the above query so that, even if the given ID doesn't exist, it still outputs a row with Cost = 0 and MaxCost = 0?
Select both the actual and the default record, and select the first one ordering by their weight.
select top (1)
Cost,
MaxCost
from (
SELECT
t.Cost,
t.MaxCost,
1 as takeme
FROM #temp t
WHERE t.ID = @ID
union all
select 0, 0, 0
) foo
order by
foo.takeme desc
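-- Alternative: append a default row only when the filter matches nothing
declare @table table (cost int);
insert into @table values (2), (2), (3);
declare @findCost int = 1;
-- No row has cost = 1, so only the default row (0) is returned
select * from @table where cost = @findCost
union all
select 0 as cost from @table where cost = @findCost having count(*) = 0;
set @findCost = 2;
-- Two rows have cost = 2, so the HAVING clause suppresses the default row
select * from @table where cost = @findCost
union all
select 0 as cost from @table where cost = @findCost having count(*) = 0;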

Get total number of parents

I am upgrading a legacy application and I need to find the number of parents certain rows in a table have.
I was thinking of using a procedure for this, but I'm having trouble figuring out how to make it work.
Basically you have a table with id and parent
= id = parent =
= 0 = 0 =
= 1 = 0 =
= 2 = 1 =
= 3 = 1 =
= 4 = 2 =
ID is unique. Parent varies depending on what was used to create the entry, but it always matches the ID of an existing row.
What I'd like to achieve is calling one procedure that returns all matching parent numbers as a simple iterable result set.
So if I were to call getAllParents(4), it should return 2, 1, 0.
My failed attempts at looping have brought me this far:
CREATE PROCEDURE getNumberOfParents @start int @current int
as
begin
SELECT parent FROM test where id=@start
if(parent > 0)
begin
set @current = @current + 1;
set @current = @current + getNumberOfParents(parent,@current);
end
end
Due to restrictions I cannot use an extra table to achieve this; otherwise it would be easy. I can, however, create temp tables that are cleaned up after the method exits.
You can do it without a while loop by the use of a recursive CTE:
DECLARE @lookupId INT = 4
;WITH ParentsCTE AS (
SELECT id, parent
FROM #mytable
WHERE id = @lookupId
UNION ALL
SELECT m.id, m.parent
FROM #mytable AS m
INNER JOIN ParentsCTE AS p ON m.id = p.parent
WHERE m.id <> m.parent
)
SELECT parent
FROM ParentsCTE
The anchor member of the above CTE:
SELECT id, parent
FROM #mytable
WHERE id = @lookupId
returns the row for the 'lookup id', whose parent column holds its immediate parent.
The recursive member:
SELECT m.id, m.parent
FROM #mytable AS m
INNER JOIN ParentsCTE AS p ON m.id = p.parent
WHERE m.id <> m.parent
keeps adding parents up the hierarchy until a root node has been reached (the m.id <> m.parent predicate detects the root, which is the row that is its own parent).
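The walk up the hierarchy can also be written as a plain loop. The Python sketch below uses the sample rows from the question and assumes, as the data suggests, that the root is the only row that is its own parent and that there are no other cycles. The names `PARENT_OF` and `get_all_parents` are made up for illustration.

```python
# Hypothetical sketch of the parent walk; id -> parent, rows from the question.
PARENT_OF = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2}


def get_all_parents(parent_of, lookup_id):
    """Return the chain of parents from lookup_id up to the root."""
    parents = []
    current = lookup_id
    # The root is the row whose parent equals its own id, which is the case
    # the m.id <> m.parent predicate filters out in the CTE.
    while parent_of[current] != current:
        current = parent_of[current]
        parents.append(current)
    return parents


print(get_all_parents(PARENT_OF, 4))  # [2, 1, 0]
```

One difference from the CTE: when the lookup id is the root itself, this sketch returns an empty list, whereas the query returns the root's own parent value.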
SQL Fiddle Demo
Test Data
DECLARE @Table TABLE (id INT, parent INT)
INSERT INTO @Table VALUES
(0 , 0),
(1 , 0),
(2 , 1),
(3 , 1),
(4 , 2)
Query
-- Id you want all the parents for
DECLARE @ParentsFor INT = 4
;with parents as
(
select ID, parent
from @Table
where parent IS NOT NULL
union all
select p.ID, t.parent
from parents p
inner join @Table t on p.parent = t.ID
and t.ID <> t.parent
)
select Distinct
STUFF((SELECT ',' + Cast(parent AS VarChar(10))
FROM parents
WHERE ID = @ParentsFor
FOR XML PATH(''),TYPE).value('.','NVARCHAR(MAX)'),1,1,'')
FROM parents
Result
2,1,0
SQL FIDDLE
Fiddle Demo Here
try:
declare @id varchar(max)=''
declare @getid int=4
while @getid>0
begin
select @id=@id+cast(parent as varchar(10))+',' from tab_1
where id=@getid
select @getid=parent from tab_1 where id=@getid
end
select @id
Try to use while loop as presented in this example:
http://blog.sqlauthority.com/2007/10/24/sql-server-simple-example-of-while-loop-with-continue-and-break-keywords/

Infinite loop CTE with OPTION (maxrecursion 0)

I have a CTE query that runs over a large number of records. Previously it worked fine, but lately it throws an error for some members:
The statement terminated. The maximum recursion 100 has been exhausted before statement completion.
So I put OPTION (maxrecursion 0) or OPTION (maxrecursion 32767) on my query, because I don't want to limit the records. But as a result, the query takes forever to load. How do I solve this?
Here's my code:
with cte as(
-- Anchor member definition
SELECT e.SponsorMemberID , e.MemberID, 1 AS Level
FROM tblMember AS e
where e.memberid = @MemberID
union all
-- Recursive member definition
select child.SponsorMemberID , child.MemberID, Level + 1
from tblMember child
join cte parent
on parent.MemberID = child.SponsorMemberID
)
-- Select the CTE result
Select distinct a.*
from cte a
option (maxrecursion 0)
EDIT: removed unnecessary code to make this easier to understand.
SOLVED: The issue did not come from maxrecursion but from the CTE itself. I don't know why, but the data possibly contains sponsor cycles: A -> B -> C -> A -> ... (thanks to @HABO).
I tried the method from "Infinite loop in CTE when parsing self-referencing table" and it works.
If you are hitting the recursion limit, you either have considerable depth in sponsoring relationships or a loop in the data. A query like the following will detect loops and terminate the recursion:
declare @tblMember as Table ( MemberId Int, SponsorMemberId Int );
insert into @tblMember ( MemberId, SponsorMemberId ) values
( 1, 2 ), ( 2, 3 ), ( 3, 5 ), ( 4, 5 ), ( 5, 1 ), ( 3, 3 );
declare @MemberId as Int = 3;
declare @False as Bit = 0, @True as Bit = 1;
with Children as (
select MemberId, SponsorMemberId,
Convert( VarChar(4096), '>' + Convert( VarChar(10), MemberId ) + '>' ) as Path, @False as Loop
from @tblMember
where MemberId = @MemberId
union all
select Child.MemberId, Child.SponsorMemberId,
Convert( VarChar(4096), Path + Convert( VarChar(10), Child.MemberId ) + '>' ),
case when CharIndex( '>' + Convert( VarChar(10), Child.MemberId ) + '>', Path ) = 0 then @False else @True end
from @tblMember as Child inner join
Children as Parent on Parent.MemberId = Child.SponsorMemberId
where Parent.Loop = 0 )
select *
from Children
option ( MaxRecursion 0 );
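To see why this query terminates even with unlimited recursion, the following Python sketch models the same bookkeeping with the sample rows above: every row carries its path, a row that revisits a member already on its path is flagged as a loop, and flagged rows are never expanded further. The names `MEMBERS` and `children_with_loop_flag` are hypothetical, and the sketch models only the result set, not the engine's execution.

```python
# Hypothetical model of the loop-detecting CTE; (MemberId, SponsorMemberId) rows.
MEMBERS = [(1, 2), (2, 3), (3, 5), (4, 5), (5, 1), (3, 3)]


def children_with_loop_flag(members, member_id):
    """Return (member, sponsor, path, loop) rows for everyone below member_id."""
    # Anchor: the starting member's rows, path ">id>", not a loop.
    current = [(m, s, f">{m}>", False) for m, s in members if m == member_id]
    result = list(current)
    while current:
        nxt = []
        for parent_id, _, path, loop in current:
            if loop:
                continue  # counterpart of "where Parent.Loop = 0"
            for m, s in members:
                if s == parent_id:
                    # Wrapping the id in '>' on both sides keeps member 1
                    # from matching inside a longer id such as 11.
                    seen_before = f">{m}>" in path
                    nxt.append((m, s, f"{path}{m}>", seen_before))
        result.extend(nxt)
        current = nxt
    return result


for row in children_with_loop_flag(MEMBERS, 3):
    print(row)
```

The flagged rows stay in the output, which makes it possible to see where the cycle closes, for example the row whose path ends in ">5>3>".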
