Let's say I have the table below:
ID | Name | Active | ParentID
1 | Foo1 | 1 | 0
2 | Foo2 | 1 | 1
3 | Foo3 | 1 | 2
4 | Foo4 | 1 | 3
5 | Foo5 | 1 | 3
6 | Foo6 | 0 | 5
7 | Foo7 | 1 | 2
7 | Foo7 | 1 | 6
8 | Foo8 | 1 | 7
9 | Foo9 | 1 | 5
(I do indeed have duplicate IDs. I raised my concerns about that, but to no avail.)
As you can see, one child can have multiple parents. IDs with ParentID 0 have no parent. I need to select all IDs that are active and do not have an inactive parent above them, however high in the tree that might be.
So with the data set above, my result would be:
ID | Name |
1 | Foo1 |
2 | Foo2 |
3 | Foo3 |
4 | Foo4 |
5 | Foo5 |
9 | Foo9 |
ID 6 is removed because it is inactive.
ID 7 is removed because one of its parents (6) is inactive.
ID 8 is removed because a parent (6) of its parent (7) is inactive.
ID 9 is fine because its parent (5) is active, and so are all of 5's ancestors.
I attempted this with a subquery in the WHERE clause:
SELECT *
FROM table
WHERE ID not in (SELECT ID FROM table where Active = 0)
But that only checks the current record.
I've also tried a typical self-join as used for employee/manager hierarchies, but that only goes one level deep, whereas here I also need to check the parent of the parent, and so on.
Any suggestions/ideas?
One method would be to use an rCTE to work through the hierarchy, with a column that retains the initial ID. Then you can use a NOT EXISTS to ensure there are no rows in the chain with a value of 0 for Active:
WITH rCTE AS(
    SELECT ID,
           Name,
           Active,
           ParentID,
           ID AS InitialID
    FROM dbo.YourTable YT
    UNION ALL
    SELECT YT.ID,
           YT.Name,
           YT.Active,
           YT.ParentID,
           r.InitialID
    FROM rCTE r
         JOIN dbo.YourTable YT ON r.ParentID = YT.ID)
SELECT *
FROM dbo.YourTable YT
WHERE NOT EXISTS (SELECT 1
                  FROM rCTE r
                  WHERE r.InitialID = YT.ID
                    AND r.Active = 0);
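To sanity-check the query above against the sample data from the question, here is a minimal sketch that runs it through Python's built-in sqlite3 module. SQLite is only a stand-in for SQL Server here: it needs WITH RECURSIVE where T-SQL uses plain WITH, and the dbo schema prefix is dropped. The DISTINCT is my addition, so that a duplicated ID that passes the check would be listed once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (ID INT, Name TEXT, Active INT, ParentID INT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?)",
    [(1, "Foo1", 1, 0), (2, "Foo2", 1, 1), (3, "Foo3", 1, 2), (4, "Foo4", 1, 3),
     (5, "Foo5", 1, 3), (6, "Foo6", 0, 5), (7, "Foo7", 1, 2), (7, "Foo7", 1, 6),
     (8, "Foo8", 1, 7), (9, "Foo9", 1, 5)],
)

# Every row seeds its own chain (InitialID); the recursive member then
# climbs to each parent, so one chain holds a row plus all its ancestors.
query = """
WITH RECURSIVE rCTE AS (
    SELECT ID, Active, ParentID, ID AS InitialID
    FROM YourTable
    UNION ALL
    SELECT YT.ID, YT.Active, YT.ParentID, r.InitialID
    FROM rCTE r
    JOIN YourTable YT ON r.ParentID = YT.ID
)
SELECT DISTINCT YT.ID, YT.Name
FROM YourTable YT
WHERE NOT EXISTS (SELECT 1
                  FROM rCTE r
                  WHERE r.InitialID = YT.ID
                    AND r.Active = 0)
ORDER BY YT.ID
"""
kept = conn.execute(query).fetchall()
print(kept)
```

Because ID 7 has two rows, both of its parents are followed, which is why 7 and 8 drop out even though one of their paths is fully active.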
I would use a recursive CTE to identify IDs where the chain is continuous, using both conditional and unconditional increment by 1 as follows:
With A As
(
    Select ID, [Name], Active, ParentID, 0 As NUM_1, 0 As NUM_2
    From Tbl
    Where ParentID = 0
    Union All
    Select Tbl.ID, Tbl.[Name], Tbl.Active, Tbl.ParentID,
           NUM_1 + 1 As NUM_1,
           NUM_2 + IIF(Tbl.Active = 1, 1, 0) As NUM_2
    From Tbl Inner Join A On (Tbl.ParentID = A.ID)
)
Select ID, [Name]
From A
Where ID Not In (Select ID From A Where NUM_1 <> NUM_2)
Order by ID
Result:
ID | Name
1 | Foo1
2 | Foo2
3 | Foo3
4 | Foo4
5 | Foo5
9 | Foo9
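To see why this answer filters with NOT IN rather than a plain NUM_1 = NUM_2 test, here is a small sketch using Python's sqlite3 as a stand-in for SQL Server (WITH RECURSIVE instead of WITH, and CASE instead of IIF are my substitutions). It lists the counter pairs for ID 7, which is reachable along two paths, and then runs the final filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tbl (ID INT, Name TEXT, Active INT, ParentID INT)")
conn.executemany(
    "INSERT INTO Tbl VALUES (?, ?, ?, ?)",
    [(1, "Foo1", 1, 0), (2, "Foo2", 1, 1), (3, "Foo3", 1, 2), (4, "Foo4", 1, 3),
     (5, "Foo5", 1, 3), (6, "Foo6", 0, 5), (7, "Foo7", 1, 2), (7, "Foo7", 1, 6),
     (8, "Foo8", 1, 7), (9, "Foo9", 1, 5)],
)

# NUM_1 counts every step down from the root; NUM_2 counts only steps
# that land on an active row. They diverge once a path crosses an
# inactive row.
cte = """
WITH RECURSIVE A AS (
    SELECT ID, Name, 0 AS NUM_1, 0 AS NUM_2
    FROM Tbl
    WHERE ParentID = 0
    UNION ALL
    SELECT Tbl.ID, Tbl.Name,
           A.NUM_1 + 1,
           A.NUM_2 + CASE WHEN Tbl.Active = 1 THEN 1 ELSE 0 END
    FROM Tbl JOIN A ON Tbl.ParentID = A.ID
)
"""

# ID 7 appears once per path from the root.
paths_to_7 = conn.execute(
    cte + "SELECT NUM_1, NUM_2 FROM A WHERE ID = 7 ORDER BY NUM_1"
).fetchall()
print(paths_to_7)

result = [row[0] for row in conn.execute(
    cte + "SELECT ID FROM A "
          "WHERE ID NOT IN (SELECT ID FROM A WHERE NUM_1 <> NUM_2) ORDER BY ID"
)]
print(result)
```

One path to ID 7 has matching counters and the other does not, so a simple WHERE NUM_1 = NUM_2 would still let 7 through; excluding every ID that has at least one mismatched path is what removes it.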
I have two tables in SQL Server. table1 has two columns, Key1Display and Key2Display, of datatype bit, which control whether to display the corresponding values from table2. table2 has two columns, Key1 and Key2.
What I am trying to achieve is a sort of cross join, say if table 1 has 3 rows:
| Key1Display | Key2Display |
+---------------------+------------------+
| 0 | 1 |
| 1 | 0 |
| 1 | 1 |
Say in table 2 there are 2 rows
| Key1 | Key2 |
+---------------------+------------------+
| Row1Key1value | Row1Key2value |
| Row2Key1value | Row2Key2value |
Then based on these two tables, I want to have a query to display 6 (2*3) rows and 1 column of results like this:
null:Row1Key2value
Row1Key1value:null
Row1Key1value:Row1Key2value
null:Row2Key2value
Row2Key1value:null
Row2Key1value:Row2Key2value
So something like:
select
    case when t1.Key1Display = 1 then coalesce(t2.Key1,'??') else 'null' end
    + ':' + case when t1.Key2Display = 1 then coalesce(t2.Key2,'??') else 'null' end
    -- And so on for as many keys as you have
from table1 t1
cross join table2 t2
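As a quick check of the cross join above, here is a minimal sketch using Python's sqlite3 with the sample rows from the question. SQLite stands in for SQL Server, so string concatenation uses || instead of +. The ORDER BY is my addition, only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (Key1Display INT, Key2Display INT)")
conn.execute("CREATE TABLE table2 (Key1 TEXT, Key2 TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(0, 1), (1, 0), (1, 1)])
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [("Row1Key1value", "Row1Key2value"), ("Row2Key1value", "Row2Key2value")],
)

# Each table1 row acts as a display mask applied to every table2 row.
query = """
SELECT CASE WHEN t1.Key1Display = 1 THEN COALESCE(t2.Key1, '??') ELSE 'null' END
       || ':' ||
       CASE WHEN t1.Key2Display = 1 THEN COALESCE(t2.Key2, '??') ELSE 'null' END
FROM table1 t1
CROSS JOIN table2 t2
ORDER BY t2.Key1, t1.Key1Display, t1.Key2Display
"""
lines = [row[0] for row in conn.execute(query)]
for line in lines:
    print(line)
```

The COALESCE to '??' only matters when a displayed key is itself NULL in table2; without it, the whole concatenated string would become NULL.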
I'm trying to get a "lineage" or similar, and also information about the first and last links (at least; all would be good), out of a table that has self-referential links between rows that have been "replaced" and rows that have replaced them. The table has a structure along these lines:
CREATE TABLE Thing (
Id INT PRIMARY KEY,
TStamp DATETIME,
Replaces INT NULL,
ReplacedBy INT NULL
);
I'm stuck with this structure. :-) It's sort of doubly-linked (yes, it's a bit silly): Each row has a unique Id, and then a row that has been "replaced" by another will have a non-NULL ReplacedBy giving the Id of the replacement row, and the replacement row will also have a link back to what it replaces in Replaces. So we can use either Replaces or ReplacedBy (or both) if we like.
Here's some sample data:
INSERT INTO Thing
(Id, TStamp, Replaces, ReplacedBy)
VALUES
(1, '2017-01-01', NULL, 11),
(2, '2017-01-02', NULL, 12),
(3, '2017-01-03', NULL, NULL),
(4, '2017-01-04', NULL, NULL),
(11, '2017-01-11', 1, NULL),
(12, '2017-01-12', 2, 22),
(22, '2017-01-22', 12, NULL);
So 1 was replaced by 11, 2 was replaced by 12, and 12 was replaced by 22.
I'd like to get the following information for each chain of links from this table in a reasonable way:
Details of the row that started the chain
Details of the final row in the chain
Details of the links in-between or at least how many links (total) there are in the chain
...filtered by a date range applied to the last row in the chain.
In an ideal universe, I'd get back something like this:
+−−−−−−−−−+−−−−−−−−+−−−−+−−−−−−−+−−−−−−−−−−−−+
| FirstId | LastId | Id | Links | TStamp |
+−−−−−−−−−+−−−−−−−−+−−−−+−−−−−−−+−−−−−−−−−−−−+
| 1 | 11 | 1 | 2 | 2017−01−01 |
| 1 | 11 | 11 | 2 | 2017−01−11 |
| 2 | 22 | 2 | 3 | 2017−01−02 |
| 2 | 22 | 12 | 3 | 2017−01−12 |
| 2 | 22 | 22 | 3 | 2017−01−22 |
+−−−−−−−−−+−−−−−−−−+−−−−+−−−−−−−+−−−−−−−−−−−−+
So far I have this query, which I could post-process to get the above:
WITH Data AS (
SELECT Id, TStamp, Replaces, ReplacedBy, 0 AS Depth
FROM Thing
UNION ALL
SELECT Thing.Id, Thing.TStamp, Thing.Replaces, Thing.ReplacedBy, Depth + 1
FROM Data
JOIN Thing
ON Thing.Replaces = Data.Id
)
SELECT *
FROM Data
WHERE ReplacedBy IS NOT NULL OR Depth > 0
ORDER BY
Id, Depth;
That gives me:
+−−−−+−−−−−−−−−−−−+−−−−−−−−−−+−−−−−−−−−−−−+−−−−−−−+
| Id | TStamp | Replaces | ReplacedBy | Depth |
+−−−−+−−−−−−−−−−−−+−−−−−−−−−−+−−−−−−−−−−−−+−−−−−−−+
| 1 | 2017−01−01 | NULL | 11 | 0 |
| 2 | 2017−01−02 | NULL | 12 | 0 |
| 11 | 2017−01−11 | 1 | NULL | 1 |
| 12 | 2017−01−12 | 2 | 22 | 0 |
| 12 | 2017−01−12 | 2 | 22 | 1 |
| 22 | 2017−01−22 | 12 | NULL | 1 |
| 22 | 2017−01−22 | 12 | NULL | 2 |
+−−−−+−−−−−−−−−−−−+−−−−−−−−−−+−−−−−−−−−−−−+−−−−−−−+
And I could use something like this to figure out (for instance) the final row of each chain:
WITH Data AS (
SELECT Id, Replaces, ReplacedBy, 0 AS Depth
FROM Thing
UNION ALL
SELECT Thing.Id, Thing.Replaces, Thing.ReplacedBy, Depth + 1
FROM Data
JOIN Thing
ON Thing.Replaces = Data.Id
),
MaxData AS (
SELECT Data.Id, Data.Depth
FROM Data
JOIN (
SELECT Id, MAX(Depth) AS MaxDepth
FROM Data
GROUP BY Id
) j ON Data.Id = j.Id AND Data.Depth = j.MaxDepth
WHERE Depth > 0
)
SELECT *
FROM MaxData
ORDER BY
Id;
...which gives me:
+−−−−+−−−−−−−+
| Id | Depth |
+−−−−+−−−−−−−+
| 11 | 1 |
| 12 | 1 |
| 22 | 2 |
+−−−−+−−−−−−−+
...but I've lost the starting point and the points along the way.
I have the strong feeling I'm missing something really straight-forward — but clever — that would let me get this largely with the query rather than post-processing, some kind of join with a "min" and "max" query (but not like my one above). What would it be?
The table doesn't have any indexes on Replaces or ReplacedBy, but we could add any needed. The table is only lightly used (roughly 300k rows and probably only a couple of hundred updates/inserts a day).
I'm limited to SQL Server 2008 features.
Inspired by Gordon Linoff's answer and HABO's comment which highlighted something Gordon was doing that was critical, I:
Removed the SQL Server 2012+ FIRST_VALUE function, replacing it with a CROSS APPLY on an "overview" query of the data
Included the Links count in the overview query
Removed the reliance on t in Gordon's WHERE NOT EXISTS (SELECT 1 FROM Thing t2 WHERE t2.ReplacedBy = t.id), which (at least on SQL Server 2008) wasn't bound to anything
Filtered out rows that weren't replaced
Below, I also add the date filtering mentioned in the question
...filtered by a date range applied to the last row in the chain.
...which Gordon didn't cover at all, and changes our approach, but only in terms of the arrow of time.
So, first, without the date criteria, sticking fairly close to Gordon's answer:
WITH Data AS (
SELECT Id AS FirstId, Id, TStamp, Replaces, ReplacedBy, 0 AS Depth
FROM Thing
WHERE Replaces IS NULL AND ReplacedBy IS NOT NULL
UNION ALL
SELECT d.FirstId, t.Id, t.TStamp, t.Replaces, t.ReplacedBy, d.Depth + 1
FROM Data d
JOIN Thing t ON t.Replaces = d.Id
),
Overview AS (
SELECT FirstId, MAX(Id) AS LastId, COUNT(*) AS Links
FROM Data
GROUP BY
FirstId
)
SELECT d.FirstId, o.LastId, d.Id, o.Links, d.Depth, d.TStamp
FROM Data d
CROSS APPLY (
SELECT LastId, Links
FROM Overview
WHERE FirstId = d.FirstId
) o
ORDER BY
d.FirstId, d.Depth
;
The critical parts of that are grabbing the seed Id as FirstId here:
SELECT Id AS FirstId, Id, TStamp, Replaces, ReplacedBy, 0 AS Depth
FROM Thing
WHERE Replaces IS NULL AND ReplacedBy IS NOT NULL
and then propagating it through the results of the recursive join:
SELECT d.FirstId, t.Id, t.TStamp, t.Replaces, t.ReplacedBy, d.Depth + 1
FROM Data d
JOIN Thing t ON t.Replaces = d.Id
Just adding that to my original query gives us most of what I wanted. Then we add a second query to get the LastId for each FirstId (Gordon did it as a FIRST_VALUE over a partition, but I can't do that in SQL Server 2008) and using an overview query also lets me grab the number of links. We cross-apply that on the basis of the FirstId value to get the overall results I wanted.
The query above returns the following for the sample data:
+−−−−−−−−−+−−−−−−−−+−−−−+−−−−−−−+−−−−−−−+−−−−−−−−−−−−+
| FirstId | LastId | Id | Links | Depth | TStamp |
+−−−−−−−−−+−−−−−−−−+−−−−+−−−−−−−+−−−−−−−+−−−−−−−−−−−−+
| 1 | 11 | 1 | 2 | 0 | 2017-01-01 |
| 1 | 11 | 11 | 2 | 1 | 2017-01-11 |
| 2 | 22 | 2 | 3 | 0 | 2017-01-02 |
| 2 | 22 | 12 | 3 | 1 | 2017-01-12 |
| 2 | 22 | 22 | 3 | 2 | 2017-01-22 |
+−−−−−−−−−+−−−−−−−−+−−−−+−−−−−−−+−−−−−−−+−−−−−−−−−−−−+
...i.e., exactly what I wanted, plus Depth if I want it (so I know what order the intermediary links were in).
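For anyone who wants to try the forward walk without a SQL Server instance, here is a rough sketch that runs the same logic on the question's sample data with Python's sqlite3. The substitutions are mine: SQLite needs WITH RECURSIVE, and since it has no CROSS APPLY, the overview is attached with an ordinary JOIN on FirstId, which is equivalent here because Overview has exactly one row per FirstId.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Thing (Id INT PRIMARY KEY, TStamp TEXT, Replaces INT, ReplacedBy INT)"
)
conn.executemany(
    "INSERT INTO Thing VALUES (?, ?, ?, ?)",
    [(1, "2017-01-01", None, 11), (2, "2017-01-02", None, 12),
     (3, "2017-01-03", None, None), (4, "2017-01-04", None, None),
     (11, "2017-01-11", 1, None), (12, "2017-01-12", 2, 22),
     (22, "2017-01-22", 12, None)],
)

# Seed with chain heads, carry the seed Id along as FirstId, then
# summarise each chain once in Overview and join it back.
query = """
WITH RECURSIVE Data AS (
    SELECT Id AS FirstId, Id, TStamp, 0 AS Depth
    FROM Thing
    WHERE Replaces IS NULL AND ReplacedBy IS NOT NULL
    UNION ALL
    SELECT d.FirstId, t.Id, t.TStamp, d.Depth + 1
    FROM Data d
    JOIN Thing t ON t.Replaces = d.Id
),
Overview AS (
    SELECT FirstId, MAX(Id) AS LastId, COUNT(*) AS Links
    FROM Data
    GROUP BY FirstId
)
SELECT d.FirstId, o.LastId, d.Id, o.Links, d.Depth, d.TStamp
FROM Data d
JOIN Overview o ON o.FirstId = d.FirstId
ORDER BY d.FirstId, d.Depth
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that MAX(Id) only identifies the last row because, in this data, a replacement always has a higher Id than the row it replaces. If that isn't guaranteed, picking the row with the greatest Depth per FirstId would be the safer choice.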
If we wanted to include rows that were never replaced, we'd just change
WHERE Replaces IS NULL AND ReplacedBy IS NOT NULL
to
WHERE Replaces IS NULL
Giving us:
+−−−−−−−−−+−−−−−−−−+−−−−+−−−−−−−+−−−−−−−+−−−−−−−−−−−−+
| FirstId | LastId | Id | Links | Depth | TStamp |
+−−−−−−−−−+−−−−−−−−+−−−−+−−−−−−−+−−−−−−−+−−−−−−−−−−−−+
| 1 | 11 | 1 | 2 | 0 | 2017-01-01 |
| 1 | 11 | 11 | 2 | 1 | 2017-01-11 |
| 2 | 22 | 2 | 3 | 0 | 2017-01-02 |
| 2 | 22 | 12 | 3 | 1 | 2017-01-12 |
| 2 | 22 | 22 | 3 | 2 | 2017-01-22 |
| 3 | 3 | 3 | 1 | 0 | 2017-01-03 |
| 4 | 4 | 4 | 1 | 0 | 2017-01-04 |
+−−−−−−−−−+−−−−−−−−+−−−−+−−−−−−−+−−−−−−−+−−−−−−−−−−−−+
But we've ignored the date criteria required by the question:
...filtered by a date range applied to the last row in the chain.
To do that without building a massive temporary result set, we have to work backward: Instead of selecting the starting point (the first entry in a chain, Replaces IS NULL), we need to select the ending point (the last entry in a chain, ReplacedBy IS NULL), and then invert our logic working back through the chain. It's largely a matter of:
Swapping FirstId with LastId
Swapping Replaces with ReplacedBy (convenient the table had both!)
Using MIN to get the first ID in the chain rather than MAX to get the last
Using d.Depth - 1 rather than d.Depth + 1
Then fixing-up Depth based on Links once we know it in our final select, to get those nice values where 0 = first link rather than some varying negative number: o.Links + d.Depth - 1 AS Depth
All of which gives us:
WITH Data AS (
SELECT Id AS LastId, Id, TStamp, Replaces, ReplacedBy, 0 AS Depth
FROM Thing
WHERE ReplacedBy IS NULL AND Replaces IS NOT NULL
-- Filtering by date of last entry would go here
UNION ALL
SELECT d.LastId, t.Id, t.TStamp, t.Replaces, t.ReplacedBy, d.Depth - 1
FROM Data d
JOIN Thing t ON t.ReplacedBy = d.Id
),
Overview AS (
SELECT LastId, MIN(Id) AS FirstId, COUNT(*) AS Links
FROM Data
GROUP BY
LastId
)
SELECT o.FirstId, d.LastId, d.Id, o.Links, o.Links + d.Depth - 1 AS Depth, d.TStamp
FROM Data d
CROSS APPLY (
SELECT FirstId, Links
FROM Overview
WHERE LastId = d.LastId
) o
ORDER BY
o.FirstId, d.Depth
;
So for instance, if we used
AND TStamp BETWEEN '2017-01-12' AND '2017-02-01'
where I have
-- Filtering by date of last entry would go here
above, with our sample data we'd get this result:
+−−−−−−−−−+−−−−−−−−+−−−−+−−−−−−−+−−−−−−−+−−−−−−−−−−−−+
| FirstId | LastId | Id | Links | Depth | TStamp |
+−−−−−−−−−+−−−−−−−−+−−−−+−−−−−−−+−−−−−−−+−−−−−−−−−−−−+
| 2 | 22 | 2 | 3 | 0 | 2017−01−02 |
| 2 | 22 | 12 | 3 | 1 | 2017−01−12 |
| 2 | 22 | 22 | 3 | 2 | 2017−01−22 |
+−−−−−−−−−+−−−−−−−−+−−−−+−−−−−−−+−−−−−−−+−−−−−−−−−−−−+
...because the last link in the Id = 1 chain is outside the date range, so we don't include it.
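Here is a minimal sketch of the backward walk with the date filter in place, again using Python's sqlite3 as a stand-in for SQL Server 2008. The assumptions are mine: WITH RECURSIVE and a plain JOIN replace WITH and CROSS APPLY, dates are stored as ISO-8601 text so BETWEEN compares them correctly as strings, and the parameter names range_start and range_end are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Thing (Id INT PRIMARY KEY, TStamp TEXT, Replaces INT, ReplacedBy INT)"
)
conn.executemany(
    "INSERT INTO Thing VALUES (?, ?, ?, ?)",
    [(1, "2017-01-01", None, 11), (2, "2017-01-02", None, 12),
     (3, "2017-01-03", None, None), (4, "2017-01-04", None, None),
     (11, "2017-01-11", 1, None), (12, "2017-01-12", 2, 22),
     (22, "2017-01-22", 12, None)],
)

# Seed with chain tails inside the date range, then walk back through
# ReplacedBy, counting Depth downwards (0, -1, -2, ...).
query = """
WITH RECURSIVE Data AS (
    SELECT Id AS LastId, Id, TStamp, 0 AS Depth
    FROM Thing
    WHERE ReplacedBy IS NULL AND Replaces IS NOT NULL
      AND TStamp BETWEEN :range_start AND :range_end
    UNION ALL
    SELECT d.LastId, t.Id, t.TStamp, d.Depth - 1
    FROM Data d
    JOIN Thing t ON t.ReplacedBy = d.Id
),
Overview AS (
    SELECT LastId, MIN(Id) AS FirstId, COUNT(*) AS Links
    FROM Data
    GROUP BY LastId
)
SELECT o.FirstId, d.LastId, d.Id, o.Links,
       o.Links + d.Depth - 1 AS Depth   -- shift so the first link is 0
FROM Data d
JOIN Overview o ON o.LastId = d.LastId
ORDER BY o.FirstId, d.Depth
"""
rows = conn.execute(
    query, {"range_start": "2017-01-12", "range_end": "2017-02-01"}
).fetchall()
for row in rows:
    print(row)
```

Because the filter sits in the anchor member, chains whose last row falls outside the range are never expanded at all, which is the point of walking backward.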
This is a little tricky. Arrange the CTE to start at the beginning of each list. That makes the subsequent processing easier:
WITH Data AS (
SELECT Id as FirstId, Id, TStamp, Replaces, ReplacedBy, 0 AS Depth
FROM Thing t
WHERE NOT EXISTS (SELECT 1 FROM Thing t2 WHERE t2.ReplacedBy = t.id)
UNION ALL
SELECT d.FirstId, t.Id, t.TStamp, t.Replaces, t.ReplacedBy, d.Depth + 1
FROM Data d JOIN
Thing t
ON t.Replaces = d.Id
)
SELECT d.*,
FIRST_VALUE(id) OVER (PARTITION BY FirstId ORDER BY Depth DESC) as LastId
FROM Data d;
Then, you can use FIRST_VALUE() with a reverse sort to get the last value in the chain.
This returns chains that have no links. You can add a filter to remove these.
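As an illustration of both points, here is a sketch run through Python's sqlite3, which supports FIRST_VALUE from SQLite 3.25 onwards. SQLite is only a stand-in for SQL Server 2012+, and the extra ReplacedBy IS NOT NULL condition is one assumed way to filter out the chains that have no links.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Thing (Id INT PRIMARY KEY, TStamp TEXT, Replaces INT, ReplacedBy INT)"
)
conn.executemany(
    "INSERT INTO Thing VALUES (?, ?, ?, ?)",
    [(1, "2017-01-01", None, 11), (2, "2017-01-02", None, 12),
     (3, "2017-01-03", None, None), (4, "2017-01-04", None, None),
     (11, "2017-01-11", 1, None), (12, "2017-01-12", 2, 22),
     (22, "2017-01-22", 12, None)],
)

query = """
WITH RECURSIVE Data AS (
    SELECT Id AS FirstId, Id, 0 AS Depth
    FROM Thing t
    WHERE NOT EXISTS (SELECT 1 FROM Thing t2 WHERE t2.ReplacedBy = t.Id)
      AND t.ReplacedBy IS NOT NULL   -- assumed filter: skip never-replaced rows
    UNION ALL
    SELECT d.FirstId, t.Id, d.Depth + 1
    FROM Data d
    JOIN Thing t ON t.Replaces = d.Id
)
SELECT d.FirstId, d.Id, d.Depth,
       -- deepest row of each chain comes first under ORDER BY Depth DESC
       FIRST_VALUE(Id) OVER (PARTITION BY FirstId ORDER BY Depth DESC) AS LastId
FROM Data d
ORDER BY d.FirstId, d.Depth
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Sorting the window by Depth DESC means LastId does not depend on Ids increasing along the chain.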
Hi, I've got a small issue with sorting items in our database.
The table data looks like this:
id | description | parent | rowlevel
100 | item 12222 | -none- | 0
SET | item 12345 | -none- | 0
201 | item 22345 | -SET - | 1
I'd like the output sorted on "id", but also sorted in such a way that the "children" come after their "parents".
I know this would be easier with a different table layout, but changing that is not an option.
The planned result would be:
id | description | parent | rowlevel
100 | item 12222 | -none- | 0
SET | item 12345 | -none- | 0
201 | item 22345 | -SET - | 1
301 | item 22345 | -SET - | 1
401 | item 22345 | -SET - | 1
ST2 | item 12345 | -none- | 0
211 | item 22345 | -ST2 - | 1
321 | item 22345 | -ST2 - | 1
101 | item 22345 | -ST2 - | 1
I've tried using ORDER BY, but although the result has the rowlevel 1 items together, their parents end up at the bottom, together with the other parents.
I am not experienced with joins and have no clue whether it's possible to join the table to itself to resolve this.
The only thing I can think of is some sort of nested SQL query,
but I am not sure that will work. I am also concerned that it will eat resources and have a great impact on performance.
As long as you have only 1 level, just add the following order criterion to your SQL:
ORDER BY COALESCE(parent, id), rowlevel
(Assuming that -none- is actually null and SET/-SET - is actually some kind of numeric id.)
If you have more than 1 level of nesting, you will require some kind of recursive CTE.
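A minimal sketch of the single-level case, using Python's sqlite3 and hypothetical rows modelled on the question (the table name items is made up, and -none- is stored as NULL, as assumed above). The trailing id in the ORDER BY is my addition, as a tiebreaker so that siblings come out in a stable order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT, description TEXT, parent TEXT, rowlevel INT)")
# Inserted in scrambled order on purpose.
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [("211", "item 22345", "ST2", 1), ("100", "item 12222", None, 0),
     ("401", "item 22345", "SET", 1), ("ST2", "item 12345", None, 0),
     ("201", "item 22345", "SET", 1), ("SET", "item 12345", None, 0),
     ("101", "item 22345", "ST2", 1), ("301", "item 22345", "SET", 1),
     ("321", "item 22345", "ST2", 1)],
)

# COALESCE(parent, id) gives parent and children the same group key;
# rowlevel then puts the parent (0) ahead of its children (1).
query = """
SELECT id, parent, rowlevel
FROM items
ORDER BY COALESCE(parent, id), rowlevel, id
"""
ordered_ids = [row[0] for row in conn.execute(query)]
print(ordered_ids)
```

Here the group key is the parent's own id, so groups appear in the order of the top-level ids.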
You can use a recursive CTE:
WITH CTE AS
(
    SELECT t1.[id], t1.[description], t1.[parent],
           [rowlevel] = 0
    FROM dbo.Table1 t1
    WHERE t1.parent = '-none-'
    UNION ALL
    SELECT t2.[id], t2.[description], t2.[parent],
           [rowlevel] = CTE.[rowlevel] + 1
    FROM dbo.Table1 t2 INNER JOIN CTE
        ON t2.parent = CTE.id
)
SELECT * FROM CTE
ORDER BY rowlevel, id
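One caveat: ordering by rowlevel, id lists every top-level row before any child. If each parent should be followed directly by its descendants, one possible variation (my assumption, not part of the answer above) is to build a sort path in the recursive CTE and order by that. A sketch with Python's sqlite3, where sort_path, the '/' separator and the grandchild row 999 are all made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (id TEXT, description TEXT, parent TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [("100", "item 12222", "-none-"), ("SET", "item 12345", "-none-"),
     ("201", "item 22345", "SET"), ("301", "item 22345", "SET"),
     ("ST2", "item 12345", "-none-"), ("211", "item 22345", "ST2"),
     ("999", "hypothetical grandchild", "201")],
)

# Each row's sort_path is its parent's path plus its own id, so a
# subtree always sorts directly after its root.
query = """
WITH RECURSIVE CTE AS (
    SELECT id, parent, 0 AS rowlevel, id AS sort_path
    FROM Table1
    WHERE parent = '-none-'
    UNION ALL
    SELECT t.id, t.parent, CTE.rowlevel + 1, CTE.sort_path || '/' || t.id
    FROM Table1 t
    JOIN CTE ON t.parent = CTE.id
)
SELECT id, rowlevel
FROM CTE
ORDER BY sort_path
"""
ordered = conn.execute(query).fetchall()
print(ordered)
```

This relies on plain string comparison, so it assumes ids never contain the separator or characters that sort below it. On SQL Server the concatenation would use + and the path column would need a CAST to a wide enough varchar in the anchor member.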
This question is similar to "How to write a MySQL query that returns a temporary column containing flags for whether or not an item related to that row exists in another table",
except that I need to be more specific about which rows exist.
I have two tables: 'competitions' and 'competition_entries'
Competitions:
ID | NAME | TYPE
--------------------------------
1 | Example | example type
2 | Another | example type
Competition Entries
ID | USERID | COMPETITIONID
---------------------------------
1 | 100 | 1
2 | 110 | 1
3 | 110 | 2
4 | 120 | 1
I want to select the competitions but add an additional column which specifies whether the user has entered the competition or not. This is my current SELECT statement
SELECT
c.[ID],
c.[NAME],
c.[TYPE],
(CASE
WHEN e.ID IS NOT NULL AND e.USERID = #userid THEN 1
ELSE 0
END
) AS 'ENTERED'
FROM competitions AS c
LEFT OUTER JOIN competition_entries AS e
ON e.COMPETITIONID = c.ID
My desired result set from setting the #userid parameter to 110 is this
ID | NAME | TYPE | ENTERED
-------------------------------------
1 | Example | example type | 1
2 | Another | example type | 1
But instead I get this
ID | NAME | TYPE | ENTERED
-------------------------------------
1 | Example | example type | 0
1 | Example | example type | 1
1 | Example | example type | 0
2 | Another | example type | 1
Because it's counting the entries for all user IDs.
Fixing your query:
SELECT
c.[ID],
c.[NAME],
c.[TYPE],
MAX(CASE
WHEN e.ID IS NOT NULL AND e.USERID = #userid THEN 1
ELSE 0
END
) AS 'ENTERED'
FROM competitions AS c
LEFT OUTER JOIN competition_entries AS e ON e.COMPETITIONID = c.ID
GROUP BY
c.[ID],
c.[NAME],
c.[TYPE]
An alternative is to rewrite it using EXISTS which is pretty much the same but may be easier to understand.
BTW, using single quotes on the column name is deprecated. Use square brackets.
SELECT
c.[ID],
c.[NAME],
c.[TYPE],
CASE WHEN EXISTS (
SELECT *
FROM competition_entries AS e
WHERE e.COMPETITIONID = c.ID
AND e.USERID = #userid) THEN 1 ELSE 0 END [ENTERED]
FROM competitions AS c
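To compare the two variants, here is a small sketch using Python's sqlite3 with the sample data from the question. SQLite is a stand-in for SQL Server, and the named parameter :userid replaces the #userid placeholder, whose exact syntax depends on the calling application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE competitions (ID INT, NAME TEXT, TYPE TEXT)")
conn.execute("CREATE TABLE competition_entries (ID INT, USERID INT, COMPETITIONID INT)")
conn.executemany(
    "INSERT INTO competitions VALUES (?, ?, ?)",
    [(1, "Example", "example type"), (2, "Another", "example type")],
)
conn.executemany(
    "INSERT INTO competition_entries VALUES (?, ?, ?)",
    [(1, 100, 1), (2, 110, 1), (3, 110, 2), (4, 120, 1)],
)

# Variant 1: join every entry, then collapse to one row per competition.
grouped = """
SELECT c.ID, c.NAME,
       MAX(CASE WHEN e.USERID = :userid THEN 1 ELSE 0 END) AS ENTERED
FROM competitions c
LEFT JOIN competition_entries e ON e.COMPETITIONID = c.ID
GROUP BY c.ID, c.NAME
ORDER BY c.ID
"""

# Variant 2: no join, so the row count never multiplies.
exists = """
SELECT c.ID, c.NAME,
       CASE WHEN EXISTS (SELECT 1
                         FROM competition_entries e
                         WHERE e.COMPETITIONID = c.ID
                           AND e.USERID = :userid) THEN 1 ELSE 0 END AS ENTERED
FROM competitions c
ORDER BY c.ID
"""

results = {}
for userid in (110, 120):
    by_group = conn.execute(grouped, {"userid": userid}).fetchall()
    by_exists = conn.execute(exists, {"userid": userid}).fetchall()
    results[userid] = (by_group, by_exists)
    print(userid, by_exists)
```

Both variants should agree for every user. The EXISTS form keeps the filter on the user next to the lookup it belongs to, which is why it tends to be easier to read.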