CTE recursive query loops - sql-server

I have many to many table Dependency, that has two columns SourceId and DependsOnId. I would really like to get recursively all dependents for given Id.
For now I have next query:
with Rec(SourceId, DependsOnId)
as (
select SourceId, DependsOnId from [dbo].[Dependency]
union all
select Rec.SourceId, Rec.DependsOnId
from [dbo].[Dependency] d
join [dbo].[Dependency] dd
on d.SourceId = dd.DependsOnId and d.DependsOnId = dd.SourceId
join Rec
on Rec.DependsOnId = d.SourceId
)
SELECT * FROM Rec
OPTION (MAXRECURSION 30000);
But it prefers to loop infinitely. I understand why, it is because of mirror dependencies like 1->2 and 2->1. So I need such cases to be handled only once.
Appreciate your help.

DECLARE #ParentID INT = 2;
WITH CTE AS(
SELECT SourceId, DependsOnId
FROM [dbo].[Dependency]
WHERE SourceId = #ParentID --<-- Given Parent id
UNION ALL
SELECT t.SourceId, t.DependsOnId
FROM [dbo].[Dependency] t
INNER JOIN
CTE c ON t.DependsOnId = c.SourceId
)
SELECT *
FROM CTE

Hmmm. You have too many joins, I think. Try this:
with Rec(SourceId, DependsOnId) as (
select SourceId, DependsOnId
from [dbo].[Dependency]
union all
select Rec.SourceId, d.DependsOnId
from Rec join
[dbo].[Dependency] d
on Rec.DependsOnId = d.SourceId
)
SELECT *
FROM Rec
OPTION (MAXRECURSION 30000);
The subquery is now adding one dependencies at one additional depth when it goes through the loop.
Note: this can still have infinite recursion, if your data has cycles in it. I would recommend that you set up a SQL Fiddle, if this doesn't work for you.

Related

Clean up SQL Server script

SELECT pp.pat_key, MAX(pp.PROV_NPI) [Provider_ID], CONCAT(pp.LAST_NM,' ',pp.FIRST_NM) [Provider_Name]
INTO pat_primary_provider
FROM TRDW.dbo.PATIENT_PROVIDER pp
WHERE IS_PCP=1
AND pat_key IN (SELECT Consumer_ID FROM CareWire0521)
GROUP BY pp.PAT_KEY, pp.last_nm, pp.FIRST_NM;
SELECT ppp.*
INTO ppp1
FROM (SELECT PAT_KEY, MAX(provider_ID) AS maxprov FROM pat_primary_provider GROUP BY PAT_KEY) AS x
INNER JOIN pat_primary_provider AS ppp ON ppp.PAT_KEY = x.PAT_KEY AND ppp.Provider_ID = x.maxprov;
I need to get the results of ppp1 only using one query (no INTO statements) in SQL Server. Please help.
Simply put the first query into a CTE (without the INTO clause). Then select from that.
;WITH pat_primary_provider AS
(
-- The first query goes here
)
-- The second query goes here
But something like below might also return the PAT_KEY's with the maximum PROV_NPI:
SELECT TOP 1 WITH TIES
PAT_KEY,
MAX(PROV_NPI) AS [Max_Provider_ID],
CONCAT(LAST_NM,' ',FIRST_NM) AS [Patient_Provider_Full_Name]
FROM TRDW.dbo.PATIENT_PROVIDER pp
WHERE IS_PCP = 1
AND PAT_KEY IN (SELECT Consumer_ID FROM CareWire0521)
GROUP BY PAT_KEY, LAST_NM, FIRST_NM
ORDER BY row_number() over (order by MAX(PROV_NPI) desc);
Whats wrong with just inserting the first query as subqueries into the second?
SELECT ppp.*
FROM (SELECT PAT_KEY, MAX(provider_ID) AS maxprov FROM (SELECT pp.pat_key, MAX(pp.PROV_NPI) [Provider_ID], CONCAT(pp.LAST_NM,' ',pp.FIRST_NM) [Provider_Name]
FROM TRDW.dbo.PATIENT_PROVIDER pp
WHERE IS_PCP=1
AND pat_key IN (SELECT Consumer_ID FROM CareWire0521)
GROUP BY pp.PAT_KEY, pp.last_nm, pp.FIRST_NM) GROUP BY PAT_KEY) AS x
INNER JOIN (SELECT pp.pat_key, MAX(pp.PROV_NPI) [Provider_ID], CONCAT(pp.LAST_NM,' ',pp.FIRST_NM) [Provider_Name]
FROM TRDW.dbo.PATIENT_PROVIDER pp
WHERE IS_PCP=1
AND pat_key IN (SELECT Consumer_ID FROM CareWire0521)
GROUP BY pp.PAT_KEY, pp.last_nm, pp.FIRST_NM) AS ppp ON ppp.PAT_KEY = x.PAT_KEY AND ppp.Provider_ID = x.maxprov;

Updating DISTINCTROW in SQL Server [duplicate]

What would the syntax be to convert this MS Access query to run in SQL Server as it doesn't have a DistinctRow keyword
UPDATE DISTINCTROW [MyTable]
INNER JOIN [AnotherTable] ON ([MyTable].J5BINB = [AnotherTable].GKBINB)
AND ([MyTable].J5BHNB = [AnotherTable].GKBHNB)
AND ([MyTable].J5BDCD = [AnotherTable].GKBDCD)
SET [AnotherTable].TessereCorso = [MyTable].[J5F7NR];
DISTINCTROW [MyTable] removes duplicate MyTable entries from the results. Example:
select distinctrow items
items.item_number, items.name
from items
join orders on orders.item_id = items.id;
In spite of the join getting you the same item_number and name multiple times when there is more than one order for it, DISTINCTROW reduces this to one row per item. So the whole join is merely for assuring that you only select items for which exist at least one order. You don't find DISTINCTROW in any other DBMS as far as I know. Probably because it is not needed. When checking for existence, we use EXISTS of course (or IN for that matter).
You are joining MyTable and AnotherTable and expect for some reason to get the same MyTable record multifold for one AnotherTable record, so you use DISTINCTROW to only get it once. Your query would (hopefully) fail if you got two different MyTable records for one AnotherTable record.
What the update does is:
update anothertable
set tesserecorso = (select top 1 j5f7nr from mytable where mytable.j5binb = anothertable.gkbinb and ...)
where exists (select * from mytable where mytable.j5binb = anothertable.gkbinb and ...)
But this uses about the same subquery twice. So we'd want to update from a query instead.
The easiest way to get one result record per <some columns> in a standard SQL query is to aggregate data:
select *
from anothertable a
join
(
select j5binb, j5bhnb, j5bdcd, max(j5f7nr) as j5f7nr
from mytable
group by j5binb, j5bhnb, j5bdcd
) m on m.j5binb = a.gkbinb and m.j5bhnb = a.gkbhnb and m.j5bdcd = a.gkbdcd;
How to write an updateble query is different from one DBMS to another. Here is the final update statement for SQL-Server:
update a
set a.tesserecorso = m.j5f7nr
from anothertable a
join
(
select j5binb, j5bhnb, j5bdcd, max(j5f7nr) as j5f7nr
from mytable
group by j5binb, j5bhnb, j5bdcd
) m on m.j5binb = a.gkbinb and m.j5bhnb = a.gkbhnb and m.j5bdcd = a.gkbdcd;
The DISTINCTROW predicate in MS Access SQL removes duplicates across all fields of a table in join statements and not just the selected fields of query (which DISTINCT in practically all SQL dialects do). So consider selecting all fields in a derived table with DISTINCT predicate:
UPDATE [AnotherTable]
SET [AnotherTable].TessereCorso = main.[J5F7NR]
FROM
(SELECT DISTINCT m.* FROM [MyTable] m) As main
INNER JOIN [AnotherTable]
ON (main.J5BINB = [AnotherTable].GKBINB)
AND (main.J5BHNB = [AnotherTable].GKBHNB)
AND (main.J5BDCD = [AnotherTable].GKBDCD)
Another variant of the query.. (Too lazy to get the original tables).
But like the query above updates 35 rows =, so does this one
UPDATE [Albi-Anagrafe-Associati]
SET
[Albi-Anagrafe-Associati].CRegDitte = [055- Registri ditte].[CRegDitte],
[Albi-Anagrafe-Associati].NIscrTribunale = [055- Registri ditte].[NIscrTribunale],
[Albi-Anagrafe-Associati].NRegImprese = [055- Registri ditte].[NRegImprese]
FROM [055- Registri ditte]
WHERE EXISTS(
SELECT *
FROM [055- Registri ditte]-- [Albi-Anagrafe-Associati]
WHERE ([055- Registri ditte].GIBINB = [Albi-Anagrafe-Associati].GKBINB)
AND ([055- Registri ditte].GIBHNB = [Albi-Anagrafe-Associati].GKBHNB)
AND ([055- Registri ditte].GIBDCD = [Albi-Anagrafe-Associati].GKBDCD))
Update [AnotherTable]
Set [AnotherTable].TessereCorso = MyTable.[J5F7NR]
From [AnotherTable]
Inner Join
(
Select Distinct [J5BINB],[5BHNB],[J5BDCD]
,(Select Top 1 [J5F7NR] From MyTable) as [J5F7NR]
,[J5BHNB]
From MyTable
)as MyTable
On (MyTable.J5BINB = [AnotherTable].GKBINB)
AND (MyTable.J5BHNB = [AnotherTable].GKBHNB)
AND (MyTable.J5BDCD = [AnotherTable].GKBDCD)

Conditional JOIN Statement SQL Server

Is it possible to do the following:
IF [a] = 1234 THEN JOIN ON TableA
ELSE JOIN ON TableB
If so, what is the correct syntax?
I think what you are asking for will work by joining the Initial table to both Option_A and Option_B using LEFT JOIN, which will produce something like this:
Initial LEFT JOIN Option_A LEFT JOIN NULL
OR
Initial LEFT JOIN NULL LEFT JOIN Option_B
Example code:
SELECT i.*, COALESCE(a.id, b.id) as Option_Id, COALESCE(a.name, b.name) as Option_Name
FROM Initial_Table i
LEFT JOIN Option_A_Table a ON a.initial_id = i.id AND i.special_value = 1234
LEFT JOIN Option_B_Table b ON b.initial_id = i.id AND i.special_value <> 1234
Once you have done this, you 'ignore' the set of NULLS. The additional trick here is in the SELECT line, where you need to decide what to do with the NULL fields. If the Option_A and Option_B tables are similar, then you can use the COALESCE function to return the first NON NULL value (as per the example).
The other option is that you will simply have to list the Option_A fields and the Option_B fields, and let whatever is using the ResultSet to handle determining which fields to use.
This is just to add the point that query can be constructed dynamically based on conditions.
An example is given below.
DECLARE #a INT = 1235
DECLARE #sql VARCHAR(MAX) = 'SELECT * FROM [sourceTable] S JOIN ' + IIF(#a = 1234,'[TableA] A ON A.col = S.col','[TableB] B ON B.col = S.col')
EXEC(#sql)
--Query will be
/*
SELECT * FROM [sourceTable] S JOIN [TableB] B ON B.col = S.col
*/
You can solve this with union
select a, b
from tablea
join tableb on tablea.a = tableb.a
where b = 1234
union
select a, b
from tablea
join tablec on tablec.a = tableb.a
where b <> 1234
I disagree with the solution suggesting 2 left joins. I think a table-valued function is more appropriate so you don't have all the coalescing and additional joins for each condition you would have.
CREATE FUNCTION f_GetData (
#Logic VARCHAR(50)
) RETURNS #Results TABLE (
Content VARCHAR(100)
) AS
BEGIN
IF #Logic = '1234'
INSERT #Results
SELECT Content
FROM Table_1
ELSE
INSERT #Results
SELECT Content
FROM Table_2
RETURN
END
GO
SELECT *
FROM InputTable
CROSS APPLY f_GetData(InputTable.Logic) T
I think it will be better to think about your query in a different way and treat them more like sets.
I do believe if you make two separate queries then join them using UNION, It will be much better in performance and more readable.

TSQL optimizing code for NOT IN

I inherit an old SQL script that I want to optimize but after several tests, I must admit that all my tests only creates huge SQL with repetitive blocks. I would like to know if someone can propose a better code for the following pattern (see code below). I don't want to use temporary table (WITH). For simplicity, I only put 3 levels (table TMP_C, TMP_D and TMP_E) but the original SQL have 8 levels.
WITH
TMP_A AS (
SELECT
ID,
Field_X
FROM A
TMP_B AS(
SELECT DISTINCT
ID,
Field_Y,
CASE
WHEN Field_Z IN ('TEST_1','TEST_2') THEN 'CATEG_1'
WHEN Field_Z IN ('TEST_3','TEST_4') THEN 'CATEG_2'
WHEN Field_Z IN ('TEST_5','TEST_6') THEN 'CATEG_3'
ELSE 'CATEG_4'
END AS CATEG
FROM B
INNER JOIN TMP_A
ON TMP_A.ID=TMP_B.ID),
TMP_C AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_1'),
TMP_D AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_2' AND ID NOT IN (SELECT ID FROM TMP_C)),
TMP_E AS (
SELECT DISTINCT
ID,
CATEG
FROM TMP_B
WHERE CATEG='CATEG_3'
AND ID NOT IN (SELECT ID FROM TMP_C)
AND ID NOT IN (SELECT ID FROM TMP_D))
SELECT * FROM TMP_C
UNION
SELECT * FROM TMP_D
UNION
SELECT * FROM TMP_E
Many thanks in advance for your help.
First off, select DISTINCT will prevent duplicates from the result set, so you are overworking the condition. By adding the "WITH" definitions and trying to nest their use makes it more confusing to follow. The data is ultimately all coming from the "B" table where also has key match in "A". Lets start with just that... And since you are not using anything from the (B)Field_Y or (A)Field_X in your result set, don't add them to the mix of confusion.
SELECT DISTINCT
B.ID,
CASE WHEN B.Field_Z IN ('TEST_1','TEST_2') THEN 'CATEG_1'
WHEN B.Field_Z IN ('TEST_3','TEST_4') THEN 'CATEG_2'
WHEN B.Field_Z IN ('TEST_5','TEST_6') THEN 'CATEG_3'
ELSE 'CATEG_4'
END AS CATEG
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_1', 'TEST_2', 'TEST_3', 'TEST_4', 'TEST_5', 'TEST_6' )
The where clause will only include those category qualifying values you want and still have the results per each category.
Now, if you actually needed other values from your "Field_Y" or "Field_X", then that would generate a different query. However, your Tmp_C, Tmp_D and Tmp_E are only asking for the ID and CATEG columns anyhow.
This may perform better
SELECT DISTINCT B.ID, 'CATEG_1'
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_1', 'TEST_2')
UNION
SELECT DISTINCT B.ID, 'CATEG_2'
FROM
B JOIN A ON B.ID = A.ID
WHERE
B.Field_Z IN ( 'TEST_3', 'TEST_4')
...

TSQL - While loop within select?

In SQL server
Ok, so I'm working with a database table in which rows can have parent rows, which can then have parent rows of their own. I need to select the root 'row'. I don't know the best way to do this.
There is a field called ParentId, which links the row to the row with that ID. When the ParentId = 0, it is the root row.
This is my query now:
SELECT Releases.Name,WorkLog.WorkLogId
FROM WorkLog,Releases
WHERE
Releases.ReleaseId = WorkLog.ReleaseId
and WorkLogDateTime >= #StartDate
and WorkLogDateTime <= #end
I don't really need the Release Name of the child releases, I want only the root Release Name, so I want to select the result of a While loop like this:
WHILE (ParentReleaseId != 0)
BEGIN
#ReleaseId = ParentReleaseId
END
Select Release.Name
where Release.RealeaseId = #ReleaseId
I know that syntax is horrible, but hopefully I'm giving you an idea of what I'm trying to acheive.
Here is an example, which could be usefull:
This query is getting a lower element of a tree, and searching up to the parent of parents.
Like I have 4 level in my table -> category 7->5, 5->3, 3-> 1. If i give it to the 5 it will find the 1, because this is the top level of the three.
(Changing the last select you can have all of the parents up on the way.)
DECLARE #ID int
SET #ID = 5;
WITH CTE_Table_1
(
ID,
Name,
ParentID
)
AS(
SELECT
ID,
Name,
ParentID
FROM Table_1
WHERE ID = #ID
UNION ALL
SELECT
T.ID,
T.Name,
T.ParentID
FROM Table_1 T
INNER JOIN CTE_Table_1 ON CTE_Table_1.ParentID = T.ID
)
SELECT * FROM CTE_Table_1 WHERE ParentID = 0
something like this
with cte as
(
select id,parent_id from t where t.id=#myStartingValue
union all
select t.id,t.parent_id
from cte
join t on cte.parent_id = t.id where cte.parent_id<>0
)
select *
from cte
join t on cte.id=t.id where cte.parent_id = 0
and with fiddle : http://sqlfiddle.com/#!3/a5fa1/1/0
Using Andras approach, I edited the final select to directly give me the ID of the root release
WITH cte_Releases
(
ReleaseId,
ParentReleaseID
)
AS(
SELECT
ReleaseId,
ParentReleaseID
FROM Releases
Where ReleaseId = 905
UNION ALL
SELECT
R.ReleaseId,
R.ParentReleaseID
FROM Releases R
INNER JOIN cte_Releases ON cte_Releases.ParentReleaseID = R.ReleaseId
)
SELECT max(ReleaseId) as ReleaseId, min(ReleaseId) as RootReleaseId FROM cte_Releases
My problem now is I want to run through all #IDs (905 in that code) and join each record to a result

Resources