Need guidance in joining SQL queries - snowflake-cloud-data-platform

The column PCT_COL_IND is a BOOLEAN column.
SELECT ID,
MAX(CASE WHEN PCT_COL LIKE '%no-response%%' OR PCT_COL LIKE '%ind-not-found%'
OR PCT_COL LIKE '%search_res_not_found%' OR PCT_COL LIKE '%empty%' THEN 'TRUE' ELSE 'FALSE' END) AS PCT_COL_IND
FROM table1 A
LEFT JOIN table2 B
ON B.p_id = A.p_id
WHERE date = '2022-02-02'
GROUP BY ID
ID PCT_COL_IND
519181 FALSE
694562 FALSE
694892 FALSE
When I run the above query on its own, PCT_COL_IND comes back as FALSE.
When I join it with two other queries, PCT_COL_IND comes back as NULL.
I'm trying to insert the source records into a target table using a stored procedure in Snowflake.
But when I join the above query with the two other queries, I get NULL in the PCT_COL_IND column for most of the records. The result should not contain NULLs.
Can someone suggest logic that avoids this?
The entire query is below:
SELECT distinct
a1.id,
a1.date,
a1.begin_time,
a1.Pno,
b1.PCT_COL_TXT,
b1.PCT_COL_IND,
CASE WHEN final.id is NULL then 'I' ELSE 'U' END as DML_Type
FROM(
SELECT
distinct id,
min(date) over (partition by id order by date) AS date ,
first_value(timestamp_col) over (partition by id order by cast(pno as int) nulls last) as begin_time,
first_value(pg_col) over (partition by id order by cast(pno as int) nulls last) as Pno,
row_number() over(partition by id order by date) as RNK
FROM table1
WHERE date = '2021-11-02'
QUALIFY RNK=1
)a1
left join
(
SELECT id,
CASE
WHEN PCT_COL LIKE '%no-response%' then true
WHEN PCT_COL LIKE '%no-response%' then true
WHEN PCT_COL LIKE '%no-response%' then true
WHEN PCT_COL LIKE '%no-response%' THEN 'TRUE'
ELSE 'FALSE' END AS PCT_COL_IND,
CASE WHEN listagg(DISTINCT PCT_COL ,',')WITHIN GROUP(ORDER BY PCT_COL )='' THEN NULL
ELSE REPLACE(lower(LISTAGG(DISTINCT PCT_COL , ',') WITHIN GROUP(ORDER BY PCT_COL )),'+','') END AS PCT_COL_TXT
FROM table1 B
LEFT JOIN table2
ON B.s_id = A.s_id
WHERE date = '2021-11-02'
GROUP BY 1,2
)b1
ON b1.id=a1.id
LEFT JOIN
(select distinct
id ,
cr_date,
begin_time,
Pno,
fin_pno,
PCT_COL_TXT,
PCT_COL_IND,
row_number() over(partition by id order by cr_date DESC) as RNK
FROM table4
QUALIFY RNK=1
)final on a1.id = final.id
WHERE (final.id is NULL)
or
(
nvl(a1.id,'0') <> nvl(final.id,'0' )
OR nvl(a1.date,'0')<> nvl(final.cr_date,'0')
OR nvl( a1.Pno,'NA') <> nvl(final.Pno,'NA')
OR nvl( a1.fin_pno,'NA') <> nvl(final.fin_pno,'NA')
OR nvl( b1.PCT_COL_TXT,'NA') <> nvl(final.PCT_COL_TXT,'NA')
OR nvl( b1.PCT_COL_IND,'0') <> nvl(final.PCT_COL_IND,'0')
)
I have tried the logic above but am still getting NULL values.

First, that CASE expression can be written in several ways, and as you currently have it, it is rather hard to read, in my opinion.
SELECT
column1 as PCT_COL
,CASE WHEN PCT_COL LIKE '%no-response%%' OR PCT_COL LIKE '%ind-not-found%'
OR PCT_COL LIKE '%search_res_not_found%' OR PCT_COL LIKE '%empty%' THEN 'TRUE' ELSE 'FALSE' END as logic_1
,CASE
WHEN PCT_COL LIKE '%no-response%%' then true
WHEN PCT_COL LIKE '%ind-not-found%' then true
WHEN PCT_COL LIKE '%search_res_not_found%' then true
WHEN PCT_COL LIKE '%empty%' THEN 'TRUE'
ELSE 'FALSE'
END as logic_2
,PCT_COL LIKE ANY ('%no-response%%', '%ind-not-found%', '%search_res_not_found%', '%empty%') as logic_3
,NVL(PCT_COL LIKE ANY ('%no-response%%', '%ind-not-found%', '%search_res_not_found%', '%empty%'), FALSE) as logic_4
FROM VALUES
('not like any of these'),
('BLAHBLAHBLAH ind-not-found BLAH BLAH'),
(null);
The second CASE, logic_2, takes more space, but if you have to use a CASE it flows better to my eye. We can do better, though, and just use LIKE ANY as shown in logic_3. Its downside is that a NULL input gives a NULL result, so an NVL cleans that up, as seen in logic_4.
PCT_COL                                  LOGIC_1  LOGIC_2  LOGIC_3  LOGIC_4
'not like any of these'                  FALSE    FALSE    FALSE    FALSE
'BLAHBLAHBLAH ind-not-found BLAH BLAH'   TRUE     TRUE     TRUE     TRUE
NULL                                     FALSE    FALSE    NULL     FALSE
So that rules out one possible source of your NULLs. If the LEFT JOIN leaves you with only NULLs for PCT_COL, your current CASE still returns 'FALSE' rather than NULL, so this is not your problem.
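To see the NULL behaviour outside Snowflake, here is a minimal sketch in Python using the standard-library sqlite3 module. It is only a stand-in: SQLite has no LIKE ANY, so an OR chain over two of the patterns replaces it, COALESCE plays the role of NVL, and booleans come back as 0/1. The table name t and the sample rows are made up for illustration.

```python
import sqlite3

# SQLite stands in for Snowflake here; the table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pct_col TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?)",
    [("not like any of these",), ("BLAHBLAHBLAH ind-not-found BLAH BLAH",), (None,)],
)

# raw_flag mirrors logic_3 (a NULL input gives a NULL result);
# safe_flag mirrors logic_4 (the NULL is replaced by false, here 0).
rows = conn.execute(
    """
    SELECT
        pct_col LIKE '%no-response%' OR pct_col LIKE '%ind-not-found%' AS raw_flag,
        COALESCE(pct_col LIKE '%no-response%' OR pct_col LIKE '%ind-not-found%', 0) AS safe_flag
    FROM t
    ORDER BY rowid
    """
).fetchall()

print(rows)  # [(0, 0), (1, 1), (None, 0)]
```

Only the NULL input row differs between the two columns, which is the same pattern as logic_3 versus logic_4.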
The next thing your code does is use MAX to aggregate these TEXT strings. As an aside, using TEXT strings can be very inefficient, but let's ignore that for now and look at how MAX behaves over different inputs.
SELECT column1 as ID, MAX(column2)
FROM VALUES
(1, 'TRUE'), (1, 'TRUE'),
(2, 'TRUE'), (2, 'FALSE'),
(3, 'TRUE'), (3, null),
(4, 'FALSE'), (4, 'FALSE'),
(5, 'FALSE'), (5, null),
(6, null), (6, null)
GROUP BY 1
ORDER BY 1;
ID  MAX(COLUMN2)
1   TRUE
2   TRUE
3   TRUE
4   FALSE
5   FALSE
6   NULL
So the only case where MAX gives the result you don't want (NULL) is when all the inputs given to it are NULL. In every other case you get a 'TRUE' or 'FALSE' string.
This points to your LEFT JOIN not matching for some IDs and thus producing only NULL values.
You then need to decide whether you want those NULL values for some reason. If not, you need to mask them; I would suggest wrapping your expression in NVL(<expression>,FALSE) to cover this edge case.
At the same time, given that you are getting LEFT JOIN results you seem not to want, perhaps you don't want the LEFT JOIN at all. But since this is example code, you might need those LEFT JOINs for some other reason.
So I would use:
SELECT
ID,
NVL(MAX(PCT_COL LIKE ANY ('%no-response%', '%ind-not-found%', '%search_res_not_found%', '%empty%')),false) AS PCT_COL_IND
FROM table1 A
LEFT JOIN table2 B
ON B.p_id = A.p_id
WHERE date = '2022-02-02'
GROUP BY ID
But that does not output PCT_COL_IND as an uppercase string. If you really need that, I would switch to IFF and be explicit about what you are doing:
SELECT
ID,
MAX(IFF(PCT_COL LIKE ANY ('%no-response%', '%ind-not-found%', '%search_res_not_found%', '%empty%'),'TRUE','FALSE')) AS PCT_COL_IND
FROM table1 A
LEFT JOIN table2 B
ON B.p_id = A.p_id
WHERE date = '2022-02-02'
GROUP BY ID
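As a rough illustration of why the outer NVL matters, here is a small Python sketch using sqlite3 as a stand-in for Snowflake. The rows are invented, a single LIKE replaces LIKE ANY, and COALESCE replaces NVL; ID 3 deliberately has no match in table2, so MAX only ever sees NULLs for it.

```python
import sqlite3

# Hypothetical data: table1 row 3 has no partner in table2.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (id INTEGER, p_id INTEGER);
    CREATE TABLE table2 (p_id INTEGER, pct_col TEXT);
    INSERT INTO table1 VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO table2 VALUES (10, 'ind-not-found'), (20, 'all good');
    """
)

# raw_ind is the aggregate on its own; pct_col_ind wraps it so that
# an all-NULL group becomes false (0) instead of NULL.
rows = conn.execute(
    """
    SELECT a.id,
           MAX(b.pct_col LIKE '%ind-not-found%') AS raw_ind,
           COALESCE(MAX(b.pct_col LIKE '%ind-not-found%'), 0) AS pct_col_ind
    FROM table1 a
    LEFT JOIN table2 b ON b.p_id = a.p_id
    GROUP BY a.id
    ORDER BY a.id
    """
).fetchall()

print(rows)  # [(1, 1, 1), (2, 0, 0), (3, None, 0)]
```

The unmatched ID is the only one where the wrapped and unwrapped columns disagree.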

So to review your SQL it can be cut down to this:
SELECT
a1.id
,b1.pct_col_ind
,CASE WHEN final.id is NULL then 'I' ELSE 'U' END as DML_Type
FROM (
SELECT
column1 as id
FROM VALUES
(1),
(2),
(3)
) AS a1
LEFT JOIN (
SELECT *
FROM VALUES
(1, true),
(2, true),
(3, false)
v(id, PCT_COL_IND)
) AS b1 ON a1.id = b1.id
LEFT JOIN (
SELECT *
FROM VALUES
--(1, true),
(2, true),
(3, false)
v(id, PCT_COL_IND)
) AS final ON a1.id = final.id
WHERE final.id is NULL
OR nvl(a1.id,'0') <> nvl(final.id,'0' )
OR nvl( b1.PCT_COL_IND,'0') <> nvl(final.PCT_COL_IND,'0')
;
We only get output under three conditions. The first is where final has no match for the a1 row, which can be triggered by IDs missing from final. So let's have three IDs in a1 that are not in final, and in b1 have one of those be true, one be false, and one not exist.
The next reason we get output is a1.id != final.id. But given we join on them being equal, we can only get here if final.id is NULL, in which case we already triggered the first OR, or if a1.id is NULL, in which case both a1.id and final.id are NULL, both are converted to '0', and so they are equal. It's a pointless clause and not the problem we are looking for.
The third reason to get rows into the output is nvl(b1.PCT_COL_IND,'0') <> nvl(final.PCT_COL_IND,'0'). Again, if the right-hand side is NULL because final did not match, the row is already in, and if both sides are NULL the row does not get in. So this clause matters when final has a value and the left join on b1 failed, in which case NVL turns b1.PCT_COL_IND into '0'.
All that talking transforms the code into:
SELECT
a1.id
,b1.pct_col_ind
,CASE WHEN final.id is NULL then 'I' ELSE 'U' END as DML_Type
FROM (
SELECT
column1 as id
FROM VALUES
(1),
(2),
(3),
(4),
(5)
) AS a1
LEFT JOIN (
SELECT *
FROM VALUES
(1, true)
,(2, false)
--,(3, false)
--,(4, true) /* we want to fail to match */
--,(5, true) /* we want to fail to match again */
v(id, PCT_COL_IND)
) AS b1 ON a1.id = b1.id
LEFT JOIN (
SELECT *
FROM VALUES
--(1, true),
--(2, true),
--(3, false)
(4, true),
(5, false)
v(id, PCT_COL_IND)
) AS final ON a1.id = final.id
WHERE final.id is NULL
--OR nvl(a1.id,'0') <> nvl(final.id,'0' )
OR nvl( b1.PCT_COL_IND,'0') <> nvl(final.PCT_COL_IND,'0')
;
ID  PCT_COL_IND  DML_TYPE
1   TRUE         I
2   FALSE        I
3   NULL         I
4   NULL         U
So row 5 is not present because final.PCT_COL_IND is false, and the NULL of b1.PCT_COL_IND is replaced with the text string '0', which is implicitly converted to a false boolean value, so the two sides are equal and the row is dropped. Here we are starting to see the impact of using strings versus booleans; perhaps you want strings after all.
Row 4 is present because final.PCT_COL_IND is true, which is different from the '0' substituted for the NULL b1 value.
Rows 1, 2 and 3 are present because they are not in final; row 3 also has no result in b1, thus its PCT_COL_IND is NULL.
But your b1 comes from the same data as a1, so that case should not happen, and yet that is how I triggered case 4. So now I am perplexed; I might have to think this over more.
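The same filter can be reproduced outside Snowflake. The sketch below uses Python's sqlite3 with integers 1/0 in place of booleans and COALESCE(..., 0) in place of nvl(..., '0'), so it sidesteps the string-to-boolean conversion; the table name target is a stand-in for final, and the rows mirror the VALUES lists above.

```python
import sqlite3

# a1 has ids 1-5, b1 only knows ids 1 and 2, target (standing in for final)
# only knows ids 4 and 5.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a1 (id INTEGER);
    CREATE TABLE b1 (id INTEGER, pct_col_ind INTEGER);
    CREATE TABLE target (id INTEGER, pct_col_ind INTEGER);
    INSERT INTO a1 VALUES (1), (2), (3), (4), (5);
    INSERT INTO b1 VALUES (1, 1), (2, 0);
    INSERT INTO target VALUES (4, 1), (5, 0);
    """
)

rows = conn.execute(
    """
    SELECT a1.id,
           b1.pct_col_ind,
           CASE WHEN target.id IS NULL THEN 'I' ELSE 'U' END AS dml_type
    FROM a1
    LEFT JOIN b1 ON b1.id = a1.id
    LEFT JOIN target ON target.id = a1.id
    WHERE target.id IS NULL
       OR COALESCE(b1.pct_col_ind, 0) <> COALESCE(target.pct_col_ind, 0)
    ORDER BY a1.id
    """
).fetchall()

print(rows)  # [(1, 1, 'I'), (2, 0, 'I'), (3, None, 'I'), (4, None, 'U')]
```

ID 5 is filtered out because the missing b1 value defaults to 0, which equals the stored 0 in target, while IDs 3 and 4 still carry a NULL indicator in the output because the default is applied only inside the WHERE clause, not in the SELECT list.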

Related

Check if a column value is updated to NULL

I have to determine if a column has been updated from a NOT NULL value to NULL in sql server.
Example -
UpdateDate Value Individual
2020-09-02 10:39:03.530 NULL 105292933
2020-08-31 11:05:06.053 Y 105292933
2020-08-31 11:04:32.720 N 105292931
In above example, for Individual 105292933, Value has been updated to NULL from Y. So the result should be the first row. I am new to sql server. Here is what I tried to get the result -
SELECT a.*
FROM tableX AS a
WHERE a.Value <>
( SELECT TOP 1 b.Value
FROM tableX AS b
WHERE a.Individual = b.Individual
AND a.UpdateDate > b.UpdateDate
ORDER BY b.UpdateDate DESC
)
But it is not picking the changes from Y to NULL or N to NULL. Any help will be appreciated.
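You can't compare a value with NULL using = or <>; those operators only give TRUE or FALSE when both sides are actual values (strings, numbers, etc.).
In SQL Server, with the default ANSI_NULLS ON setting, any such comparison with NULL, including NULL = NULL, evaluates to UNKNOWN, so the row is filtered out. To check whether a value is NULL, use WHERE a.Value IS NULL.
You can change your code to compare the column with some special string (or an empty string if you like) when its value is NULL, using the ISNULL() function.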
SELECT a.*
FROM tableX AS a
WHERE ISNULL(a.Value, '*NULL*') <>
( SELECT TOP 1 ISNULL(b.Value, '*NULL*')
FROM tableX AS b
WHERE a.Individual = b.Individual
AND a.UpdateDate > b.UpdateDate
ORDER BY b.UpdateDate DESC
)
A little more about NULL values here: https://www.w3schools.com/sql/sql_null_values.asp
Window functions are usually more efficient than subqueries, when applicable. I would recommend lag():
select *
from (
select t.*, lag(value) over(partition by individual order by updatedate) lag_value
from mytable t
) t
where value is null and lag_value is not null
This query identifies rows where Value is NULL and the prior value (based on UpdateDate) was 'Y' or 'N'. Something like this:
with lag_cte as (
select *, lag([value]) over(partition by individual order by updatedate) lag_value
from tableX)
select *
from lag_cte
where value is null
and lag_value in('N', 'Y');
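For a quick check of the lag() approach against the three sample rows from the question, here is a sketch in Python with sqlite3 standing in for SQL Server (window functions need SQLite 3.25 or later). The column types are assumptions; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableX (UpdateDate TEXT, Value TEXT, Individual INTEGER)")
conn.executemany(
    "INSERT INTO tableX VALUES (?, ?, ?)",
    [
        ("2020-09-02 10:39:03.530", None, 105292933),
        ("2020-08-31 11:05:06.053", "Y", 105292933),
        ("2020-08-31 11:04:32.720", "N", 105292931),
    ],
)

# Keep rows whose current Value is NULL while the previous Value
# for the same Individual (by UpdateDate) was not NULL.
rows = conn.execute(
    """
    SELECT UpdateDate, Value, Individual
    FROM (
        SELECT t.*,
               LAG(Value) OVER (PARTITION BY Individual ORDER BY UpdateDate) AS lag_value
        FROM tableX t
    )
    WHERE Value IS NULL AND lag_value IS NOT NULL
    """
).fetchall()

print(rows)  # [('2020-09-02 10:39:03.530', None, 105292933)]
```

Because the filter uses IS NULL and IS NOT NULL rather than <>, no sentinel string such as '*NULL*' is needed.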

Count Case with 2 columns with the same values (Clarified)

Basically I want COUNT a CASE when values are present in 2 columns.
For example:
SELECT
COUNT
(CASE WHEN 1.sample AND 2.sample IN ('a','b','c')
THEN 1
ELSE NULL
END
) AS CASE
FROM table1 AS 1
INNER JOIN table2 AS 2
...
Message:
Conversion failed when converting the varchar value '08:12.06' to data
type int. Warning: Null value is eliminated by an aggregate or other
SET operation.
I get what's triggering the error, I just don't know a solution to count the case when values are present in both columns.
Can you try this and see if it works? I think this is what you are looking for.
SELECT
SUM
(CASE WHEN 1.sample IN ('a','b','c') AND 2.sample IN ('a','b','c')
THEN 1
ELSE 0
END
) AS CASE
FROM table1 AS 1
INNER JOIN table2 AS 2
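You need to list the columns separately for comparison. Usually I specify a column to count, and you do not need an ELSE NULL branch: when either column is NULL or not in the list, the CASE returns NULL and COUNT skips the row. Avoid returning 0 from the CASE, because COUNT counts every non-NULL value, and mixing an integer with the varchar column can trigger a conversion error like the one reported.
SELECT
COUNT
(CASE WHEN ( 1.sample IN ('a','b','c')
AND 2.sample IN ('a','b','c')
)
THEN 1.sample
END
) AS CASE
FROM table1 AS 1
INNER JOIN table2 AS 2 ON....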
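To compare the SUM and COUNT variants side by side, here is a small sketch in Python with sqlite3 standing in for SQL Server. The aliases t1 and t2, the id join key, and the sample rows are all assumptions, since the original join condition is elided.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (id INTEGER, sample TEXT);
    CREATE TABLE table2 (id INTEGER, sample TEXT);
    INSERT INTO table1 VALUES (1, 'a'), (2, 'b'), (3, 'x'), (4, NULL);
    INSERT INTO table2 VALUES (1, 'c'), (2, 'z'), (3, 'a'), (4, 'a');
    """
)

# Each column gets its own IN test. SUM adds 1/0; COUNT relies on the CASE
# returning NULL (no ELSE branch) for rows that should not be counted.
sum_count, case_count = conn.execute(
    """
    SELECT
        SUM(CASE WHEN t1.sample IN ('a','b','c') AND t2.sample IN ('a','b','c') THEN 1 ELSE 0 END),
        COUNT(CASE WHEN t1.sample IN ('a','b','c') AND t2.sample IN ('a','b','c') THEN 1 END)
    FROM table1 AS t1
    INNER JOIN table2 AS t2 ON t2.id = t1.id
    """
).fetchone()

print(sum_count, case_count)  # 1 1
```

Only the pair with id 1 has both values in the list; the pair with a NULL on one side is not counted by either form.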

Ignore condition in WHERE clause when column is NULL

I have a table where one row (with Type = 'E') is related to another row.
I have written a query to return the COUNT of those related rows. The problem is that there is no explicit relationship (such as an ID column that would clearly say which row is related to which). Therefore I am trying to find the relationship based on multiple conditions in the WHERE clause.
In a few cases, columns A and B can be NULL (for records where Type = 'M'). In such cases I would like to ignore that condition, so that only the first 3 conditions are used to determine the relationship.
I have tried a CASE expression, but it is not working as expected:
SELECT [T1].[ID],[T1].[AlphaId],[T1].[Type],[T1].[A],[T1].[B],[T1].[Date],[T1].[ServiceID]
,( SELECT COUNT(*)
FROM MyTable T2
WHERE [T1].[AlphaId]=[T2].[AlphaId] AND
[T1].[Date]=[T2].[Date] AND
[T1].[ServiceID]=[T2].[ServiceID] AND
[T2].[A]=CASE WHEN [T2].[A] IS NULL THEN NULL ELSE [T1].[A] END AND
[T2].[B]=CASE WHEN [T2].[B] IS NULL THEN NULL ELSE [T1].[B] END AND
[T2].[Type]='M'
) as TotalCount
FROM MyTable T1
WHERE [T1].[Type] = 'E'
I can't ignore that condition entirely, because in some cases the Date and ServiceID are the same and only A and B distinguish the records. Luckily, where A and B are NULL, the Date and ServiceID distinguish those two records.
http://sqlfiddle.com/#!3/c98db/1
Many thanks in advance.
You could join the table to itself and use COUNT and GROUP BY to get the counts. Then you can match [A] and [B] if they are equal or NULL.
SELECT [T1].[ID],[T1].[AlphaId],[T1].[Type],[T1].[A],[T1].[B],[T1].[Date],[T1].[ServiceID], count([T2].[ID])
FROM MyTable T1
INNER JOIN MyTable T2 ON [T1].[AlphaId]=[T2].[AlphaId] AND
[T1].[Date]=[T2].[Date] AND
[T1].[ServiceID]=[T2].[ServiceID] AND
([T2].[A]= [T1].[A] OR [T2].[A] IS NULL )AND
([T2].[B]= [T1].[B] OR [T2].[B] IS NULL )AND
[T2].[Type] <> [T1].[Type]
WHERE [T1].[Type] = 'E'
GROUP BY [T1].[ID],[T1].[AlphaId],[T1].[Type],[T1].[A],[T1].[B],[T1].[Date],[T1].[ServiceID]
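The "equal or NULL" predicate can also be dropped into the correlated subquery from the question, which keeps a count of 0 for 'E' rows with no related rows. A minimal sketch in Python, with sqlite3 standing in for SQL Server and invented sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE MyTable (ID INTEGER, AlphaId INTEGER, Type TEXT, A TEXT, B TEXT,
                          Date TEXT, ServiceID INTEGER);
    INSERT INTO MyTable VALUES
        (1, 100, 'E', 'a1', 'b1', '2014-01-01', 7),
        (2, 100, 'M', 'a1', 'b1', '2014-01-01', 7),
        (3, 100, 'M', 'a2', 'b2', '2014-01-01', 7),
        (4, 200, 'E', 'a3', 'b3', '2014-01-02', 8),
        (5, 200, 'M', NULL, NULL, '2014-01-02', 8);
    """
)

# A and B must match when the 'M' row has them, and are ignored when NULL.
rows = conn.execute(
    """
    SELECT T1.ID,
           (SELECT COUNT(*)
            FROM MyTable T2
            WHERE T2.AlphaId = T1.AlphaId
              AND T2.Date = T1.Date
              AND T2.ServiceID = T1.ServiceID
              AND (T2.A = T1.A OR T2.A IS NULL)
              AND (T2.B = T1.B OR T2.B IS NULL)
              AND T2.Type = 'M') AS TotalCount
    FROM MyTable T1
    WHERE T1.Type = 'E'
    ORDER BY T1.ID
    """
).fetchall()

print(rows)  # [(1, 1), (4, 1)]
```

Row 3 is excluded for ID 1 because its A and B are present but different, while row 5 is counted for ID 4 because its A and B are NULL.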

Arrange rows in T-SQL

How to arrange rows manually in T-SQL?
I have a table result in order like this:
Unknown
Charlie
Dave
Lisa
Mary
but the expected result is supposed to be:
Charlie
Dave
Lisa
Mary
Unknown
edited:
My whole query is:
select (case when s.StudentID is null then 'Unknown' else s.StudentName end) as StudentName from Period pd full join Course c on pd.TimeID = c.TimeID full join Student s on c.StudentID = s.StudentID
group by s.StudentName, s.StudentID
order by case s.StudentName
when 'Charlie' then 1
when 'Dave' then 2
when 'Lisa' then 3
when 'Mary' then 4
when 'Unknown' then 5
end
but it didn't work. I think the root of the problem is that Unknown comes from a NULL value: as written in the query, when StudentID is NULL, "NULL" is changed to "Unknown". Is this what causes the "stubborn" order of the result? By the way, I have also tried order by s.StudentName asc, but that didn't work either.
Thank you.
Try the following...
SELECT os.StudentName
FROM ( SELECT CASE WHEN s.StudentID IS NULL THEN 'Unknown'
ELSE s.StudentName
END AS StudentName
FROM Period pd
FULL JOIN Course c ON pd.TimeID = c.TimeID
FULL JOIN Student s ON c.StudentID = s.StudentID
GROUP BY s.StudentName ,
s.StudentID
) AS os
ORDER BY os.StudentName
Edit: based on comment...
When I use this, it works fine. Notice that the ORDER BY has no table identifier:
declare @tblStudent TABLE (StudentID int, StudentName varchar(30));
insert into @tblStudent values (null, '');
insert into @tblStudent values (1, 'Charlie');
insert into @tblStudent values (2, 'Dave');
insert into @tblStudent values (3, 'Lisa');
insert into @tblStudent values (4, 'Mary');
SELECT CASE WHEN s.StudentID IS NULL THEN 'Unknown'
ELSE s.StudentName
END AS StudentName
FROM @tblStudent s
GROUP BY s.StudentName ,
s.StudentID
ORDER BY StudentName
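The difference between sorting on the underlying column and sorting on the computed label can be seen in a small sketch. This uses Python's sqlite3 as a stand-in for SQL Server (both sort NULL first in ascending order), a plain Student table instead of the full joins, and a NULL name for the unknown row, all of which are simplifying assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (StudentID INTEGER, StudentName TEXT)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?)",
    [(None, None), (1, "Charlie"), (2, "Dave"), (3, "Lisa"), (4, "Mary")],
)

label = "CASE WHEN s.StudentID IS NULL THEN 'Unknown' ELSE s.StudentName END"

# Sorting on s.StudentName sorts the raw column, where the unknown row is NULL.
by_column = [r[0] for r in conn.execute(
    f"SELECT {label} AS StudentName FROM Student s ORDER BY s.StudentName")]

# Sorting in an outer query sorts the computed label, so 'Unknown' lands last.
by_label = [r[0] for r in conn.execute(
    f"SELECT os.StudentName FROM (SELECT {label} AS StudentName FROM Student s) AS os "
    "ORDER BY os.StudentName")]

print(by_column)  # ['Unknown', 'Charlie', 'Dave', 'Lisa', 'Mary']
print(by_label)   # ['Charlie', 'Dave', 'Lisa', 'Mary', 'Unknown']
```

The first list reproduces the "stubborn" order from the question; the second is the expected one.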
As I see it, your rows must be ordered alphabetically, so just add this at the end of the query: ORDER BY p.StudentName.
If this does not help, please add the whole query so we can find the problem.
Now that I see the query, I can explain. You are sorting by the column p.StudentName, which contains NULL. Try sorting by StudentName without the p in front; that is the alias of the expression that contains Unknown.
Just put the following clause in your SQL statement:
order by p.StudentName
SQL Server will order the column alphabetically.

SQL Server 2005: Update rows in a specified order (like ORDER BY)?

I want to update rows of a table in a specific order, like one would expect if including an ORDER BY clause, but SQL Server does not support the ORDER BY clause in UPDATE queries.
I have checked out this question which supplied a nice solution, but my query is a bit more complicated than the one specified there.
UPDATE TableA AS Parent
SET Parent.ColA = Parent.ColA + (SELECT TOP 1 Child.ColA
FROM TableA AS Child
WHERE Child.ParentColB = Parent.ColB
ORDER BY Child.Priority)
ORDER BY Parent.Depth DESC;
So, what I'm hoping that you'll notice is that a single table (TableA) contains a hierarchy of rows, wherein one row can be the parent or child of any other row. The rows need to be updated in order from the deepest child up to the root parent. This is because TableA.ColA must contain an up-to-date concatenation of its own current value with the values of its children (I realize this query only concats with one child, but that is for the sake of simplicity - the purpose of the example in this question does not necessitate any more verbosity), therefore the query must update from the bottom up.
The solution suggested in the question I noted above is as follows:
UPDATE messages
SET status=10
WHERE ID in (SELECT TOP (10) Id
FROM Table
WHERE status=0
ORDER BY priority DESC
);
The reason that I don't think I can use this solution is because I am referencing column values from the parent table inside my subquery (see WHERE Child.ParentColB = Parent.ColB), and I don't think two sibling subqueries would have access to each others' data.
So far I have only determined one way to merge that suggested solution with my current problem, and I don't think it works.
UPDATE TableA AS Parent
SET Parent.ColA = Parent.ColA + (SELECT TOP 1 Child.ColA
FROM TableA AS Child
WHERE Child.ParentColB = Parent.ColB
ORDER BY Child.Priority)
WHERE Parent.Id IN (SELECT Id
FROM TableA
ORDER BY Parent.Depth DESC);
The WHERE..IN subquery will not actually return a subset of the rows, it will just return the full list of IDs in the order that I want. However (I don't know for sure - please tell me if I'm wrong) I think that the WHERE..IN clause will not care about the order of IDs within the parentheses - it will just check the ID of the row it currently wants to update to see if it's in that list (which, they all are) in whatever order it is already trying to update... Which would just be a total waste of cycles, because it wouldn't change anything.
So, in conclusion, I have looked around and can't seem to figure out a way to update in a specified order (and included the reason I need to update in that order, because I am sure I would otherwise get the ever-so-useful "why?" answers) and I am now hitting up Stack Overflow to see if any of you gurus out there who know more about SQL than I do (which isn't saying much) know of an efficient way to do this. It's particularly important that I only use a single query to complete this action.
A long question, but I wanted to cover my bases and give you guys as much info to feed off of as possible. :)
Any thoughts?
You cannot do this in one query, because your updates are correlated (i.e. level N depends on the updated value of level N+1). Relational engines frown on this very explicitly because of the Halloween Problem. The query plan will go out of its way to ensure that the updates occur as if they had two stages: one in which the current state is read, and one in which the updated state is applied. If necessary, the engine will spool intermediate tables just to preserve this apparent execution order (read all, then write all). Since your query, if I understand correctly, tries to break this very premise, I don't see any way you'll succeed.
An UPDATE statement is executed as a single query, not step by step.
You need to either use a while loop/cursor (ugh) or maybe make use of a CTE (common table expression), which gives you the possibility of recursion, to achieve what you are trying to do.
Have a look at:
Using Common Table Expressions
Recursive Queries Using Common Table Expressions
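For what it's worth, the loop variant can be sketched as one UPDATE per depth level, deepest first, so each level reads children that are already up to date. The example below is in Python with sqlite3 standing in for SQL Server; it uses a ParentID column, a numeric Priority, and a three-row hierarchy, all of which are assumptions, and it deliberately gives up the single-query requirement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE TableA (ID INTEGER, ParentID INTEGER, ColA TEXT,
                         Priority INTEGER, Depth INTEGER);
    INSERT INTO TableA VALUES
        (1, NULL, 'p', 1, 0),
        (2, 1,    'm', 1, 1),
        (3, 2,    'v', 1, 2);
    """
)

max_depth = conn.execute("SELECT MAX(Depth) FROM TableA").fetchone()[0]

# Walk from the level just above the leaves up to the root. Each statement
# only writes rows at `depth` and only reads rows one level deeper.
for depth in range(max_depth - 1, -1, -1):
    conn.execute(
        """
        UPDATE TableA
        SET ColA = ColA || COALESCE(
            (SELECT c.ColA FROM TableA AS c
             WHERE c.ParentID = TableA.ID
             ORDER BY c.Priority LIMIT 1), '')
        WHERE Depth = ?
        """,
        (depth,),
    )

result = dict(conn.execute("SELECT ID, ColA FROM TableA"))
print(result)  # {1: 'pmv', 2: 'mv', 3: 'v'}
```

A single UPDATE over all rows would give the root 'pm' instead of 'pmv', because it would read the child's value from before the update.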
Here is a single-statement SQL solution. If you ever relax the requirement that it be one UPDATE statement, you can factor out some of the complexity.
CREATE TABLE [TableA](
[ID] [int] NOT NULL,
[ParentID] [int] NULL,
[ColA] [varchar](max) NOT NULL,
[Priority] [varchar](50) NOT NULL,
[Depth] [int] NOT NULL)
go
INSERT TableA
SELECT 1, NULL, 'p', 'Favorite', 0 UNION ALL
SELECT 2, 1, 'm', 'Favorite', 1 UNION ALL
SELECT 3, 1, 'o', 'Likeable', 1 UNION ALL
SELECT 4, 2, 'v', 'Favorite', 2 UNION ALL
SELECT 5, 2, 'v', 'Likeable', 2 UNION ALL
SELECT 6, 2, 'd', 'Likeable', 2 UNION ALL
SELECT 7, 6, 'c', 'Red-headed Stepchild', 3 UNION ALL
SELECT 8, 6, 's', 'Likeable', 3 UNION ALL
SELECT 9, 8, 'n', 'Favorite', 4 UNION ALL
SELECT 10, 6, 'c', 'Favorite', 3 UNION ALL
SELECT 11, 5, 'c', 'Favorite', 3 UNION ALL
SELECT 12, NULL, 'z', 'Favorite', 0 UNION ALL
SELECT 13, 3, 'e', 'Favorite', 2 UNION ALL
SELECT 14, 8, 'k', 'Likeable', 4 UNION ALL
SELECT 15,4, 'd', 'Favorite', 3
;WITH cte AS (
SELECT a.i, a.Depth, a.maxd, a.mind, a.maxc, a.di, a.ci, a.cdi, a.ID, a.y, CAST('' AS varchar(max))z
FROM(
SELECT DISTINCT i = 1
,p.Depth
,maxd = (SELECT MAX(Depth) FROM TableA)
,mind = (SELECT MIN(Depth) FROM TableA)
,maxc = (SELECT MAX(c) FROM (SELECT COUNT(*) OVER(PARTITION BY ParentID) FROM TableA)f(c))
,di = (SELECT MIN(Depth) FROM TableA)
,ci = 1
,cdi = (SELECT MIN(Depth) FROM TableA)
,p.ID
,CAST(p.ID AS varchar(max)) + p.ColA + SPACE(1) + CASE WHEN g IS NULL THEN '' ELSE '(' END
+ ISNULL(g,'') + CASE WHEN g IS NULL THEN '' ELSE ')' END y
FROM TableA p
LEFT JOIN TableA c ON (c.ParentID = p.ID)
CROSS APPLY (SELECT SPACE(1) + CAST(c2.ID AS varchar(max)) + ColA + SPACE(1)
FROM TableA c2 WHERE ParentID = p.ID
ORDER BY Priority
FOR XML PATH(''))f(g)
)a
UNION ALL
SELECT r.i, r.Depth, r.maxd, r.mind, r.maxc, r.di, r.ci, r.cdi, r.ID
,CASE WHEN di = cdi
THEN REPLACE(r.y,LEFT(r.z,CHARINDEX(SPACE(1),r.z,2)), r.z)
ELSE r.y END [y]
,r.z
FROM(
SELECT i = i + 1
,Depth
,[maxd]
,[mind]
,[maxc]
,CASE WHEN ci = maxc AND cdi = maxd
THEN di + 1
ELSE di
END [di]
,CASE WHEN cdi = [maxd]
THEN CASE WHEN ci + 1 > maxc
THEN 1
ELSE ci + 1
END
ELSE ci
END [ci]
,CASE WHEN cdi + 1 > maxd
THEN mind
ELSE cdi + 1
END [cdi]
,id,y
,CAST(ISNULL((SELECT y FROM(
SELECT p.Depth,p.ID
,SPACE(1) + CAST(p.ID AS varchar(max)) + p.ColA + SPACE(1) +
CASE WHEN g IS NULL THEN '' ELSE '(' END + ISNULL(g,'')
+ CASE WHEN g IS NULL THEN '' ELSE ')' END y
,r1 = DENSE_RANK() OVER(ORDER BY p.ID) --child number
,r2 = ROW_NUMBER() OVER(PARTITION BY p.ID ORDER BY p.ID) --DISTINCT not allowed in recursive section
FROM TableA p
JOIN TableA c ON (c.ParentID = p.ID)
CROSS APPLY (SELECT SPACE(1)+CAST(c2.ID AS varchar(max))+ColA+SPACE(1)
FROM TableA c2
WHERE ParentID = p.ID
ORDER BY Priority
FOR XML PATH(''))f(g)
WHERE p.Depth = cdi AND cdi < di AND p.ID <> cte.ID
)v
WHERE r1 = ci
AND r2 = 1
AND cte.y LIKE '%' + LEFT(v.y,CHARINDEX(SPACE(1),v.y,2) ) + '%'),'') AS varchar(max)) z
FROM cte
WHERE [di]<[maxd] or [ci]<[maxc] or [cdi]<[maxd]
)r
)--cte
UPDATE t
SET ColA = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE
(y,SPACE(1),''),'1',''),'2',''),'3',''),'4',''),'5',''),'6',''),'7',''),'8',''),'9',''),'0','')
FROM cte
JOIN TableA t ON (t.ID = cte.ID)
WHERE di = (SELECT MAX(Depth) FROM TableA)
AND cdi = (SELECT MAX(Depth) FROM TableA)
AND ci = (SELECT MAX(c) FROM (SELECT COUNT(*) OVER(PARTITION BY ParentID) FROM TableA)f(c))
OPTION(maxrecursion 0)
SELECT * FROM TableA
DROP TABLE TableA
JMTyler-
1 What kind of data is in ColA? What does it look like?
2 How is/should that column be originally populated? I ask because you would only be able to run the update once, since the value in that column would be modified by a previous run. Any additional runs would just concatenate more data. This makes me believe there is another column, ColC, with the original value for ColA (a person's name?).
3 Will a row ever be deleted, orphaning its children? If yes, what should their ParentColB then point to? NULL? Does their depth then get set to 0 so they are now at the top of the hierarchy?
If you can answer these, I can give you a solution.
Thanks

Resources