Get data and using subselect on the same table

I have a table like this:
+--------------+--------------+--------------+
| userid | username | proxy |
+--------------+--------------+--------------+
| 1 | j.doe | |
| 2 | lechnerio | 1,4 |
| 3 | janedoe | 1 |
| 4 | mustermann | 2 |
+--------------+--------------+--------------+
The proxy column can be NULL or hold one or more IDs of other users, separated by commas.
I'd like to build a view that helps to visualize the user. I thought of a result like this:
+--------------+--------------+--------------+-----------------------------+
| userid | username | proxy | proxy_info |
+--------------+--------------+--------------+-----------------------------+
| 1 | j.doe | | |
| 2 | lechnerio | 1,4 | j.doe (1), mustermann (4) |
| 3 | janedoe | 1 | j.doe (1) |
| 4 | mustermann | 2 | lechnerio (2) |
+--------------+--------------+--------------+-----------------------------+
I can't wrap my head around the sub-select I need for proxy_info. The table itself has more than these three columns, but that shouldn't matter in this example. Combining the username with the ID in brackets isn't the issue either. What I can't work out is how to sub-select the rows whose userid matches an entry in proxy.
I'd be grateful for any tips or hints on how to achieve the result listed above.
I thought about either joining the table with itself or using a union, but both options seem over-complicated for the desired result. I'm working with SQL Server here.
Here is my idea so far:
SELECT a.agentcode
,a.username
,a.proxy
,(
SELECT b.agentcode
FROM app_users b
WHERE a.agentcode = b.proxy
) AS proxy_info
FROM app_users a

Try this:
DROP TABLE IF EXISTS [dbo].[test_Users];
GO
CREATE TABLE [dbo].[test_Users]
(
[userid] INT
,[username] VARCHAR(18)
,[proxy] VARCHAR(12)
);
GO
INSERT INTO [dbo].[test_Users] ([userid], [username], [proxy])
VALUES (1, 'j.doe', NULL)
,(2, 'lechnerio', '1,4')
,(3, 'janedoe', '1')
,(4, 'mustermann', '2');
GO
SELECT U.[userid]
,U.[username]
,U.[proxy]
,NULLIF(STRING_AGG(CONCAT(P.[username], '(', P.[userid] ,')'), ', '), '()') AS [proxy_info]
FROM [dbo].[test_Users] U
OUTER APPLY STRING_SPLIT (U.[proxy], ',') S
LEFT JOIN [dbo].[test_Users] P
ON S.[value] = P.[userid]
GROUP BY U.[userid]
,U.[username]
,U.[proxy];
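The split, join and aggregate steps of the query above can be traced outside the database. The following is only a Python sketch of that logic, not a replacement for the SQL; the function name `build_proxy_info` and the tuple layout are assumptions made for illustration.

```python
def build_proxy_info(users):
    """users: list of (userid, username, proxy); proxy is None or a comma-separated id string."""
    # Lookup table that plays the role of the self-join on userid.
    names = {userid: username for userid, username, _ in users}
    rows = []
    for userid, username, proxy in users:
        if proxy is None:
            info = None  # no proxies, so proxy_info stays NULL
        else:
            # Equivalent of STRING_SPLIT: one id per comma-separated part.
            ids = [int(part) for part in proxy.split(",") if part.strip()]
            # Equivalent of STRING_AGG over the joined rows.
            info = ", ".join(f"{names[i]} ({i})" for i in ids if i in names)
        rows.append((userid, username, proxy, info))
    return rows


users = [
    (1, "j.doe", None),
    (2, "lechnerio", "1,4"),
    (3, "janedoe", "1"),
    (4, "mustermann", "2"),
]
result = build_proxy_info(users)
for row in result:
    print(row)
```

Each input row produces exactly one output row, which is what the GROUP BY on userid, username and proxy guarantees in the SQL version.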

Edit: this is a solution for MySQL. SQL Server has no equivalent of the FIND_IN_SET function, but there are workarounds; see "FIND_IN_SET() equivalent in SQL Server".
You can use FIND_IN_SET to join the table to itself and GROUP_CONCAT to concatenate the values in the form you want.
SELECT a.id, a.username, a.proxy, GROUP_CONCAT(CONCAT(b.username,'(',b.id,')')) as `proxy_name`
FROM users a
INNER JOIN users b ON(FIND_IN_SET(b.id,a.proxy))
GROUP BY a.id
ORDER BY a.id;
Output:
| id | username | proxy | proxy_name |
|----|------------|-------|------------------------|
| 2 | lechnerio | 1,4 | j.doe(1),mustermann(4) |
| 3 | janedoe | 1 | j.doe(1) |
| 4 | mustermann | 2 | lechnerio(2) |

The query below works in SQL Server 2017 and later.
Test setup (I left out one record to keep it short):
CREATE TABLE Tablename(userid int, username varchar(20), proxy varchar(10))
INSERT INTO Tablename
VALUES (1, 'j.doe',null), (2,'lechnerio','1,4'),(4, 'mustermann','2');
Query to Execute
;WITH CTE_ExpandedTable AS
(SELECT userid, username, proxy, t.val as proxyid
FROM Tablename
OUTER APPLY (SELECT value from string_split(proxy, ',')) as t(val)
)
SELECT c.userid, c.username,c.proxy, CASE WHEN c.proxy IS NULL THEN NULL
ELSE
STRING_AGG(CONCAT(t1.username,' (',t1.userid,')'),',') END AS proxies
FROM CTE_ExpandedTable AS c
LEFT OUTER JOIN TableName as t1
ON t1.userid = c.proxyid
GROUP BY c.userid, c.username,c.proxy
Resultset
+--------+------------+--------+--------------------------+
| userid | username | proxy | proxies |
+--------+------------+--------+--------------------------+
| 1 | j.doe | (null) | (null) |
| 2 | lechnerio | 1,4 | j.doe (1),mustermann (4) |
| 4 | mustermann | 2 | lechnerio (2) |
+--------+------------+--------+--------------------------+

You can use a mixture of FOR XML PATH and recursive CTEs.
First, split the proxy string. Then join the names per proxy ID. Last but not least, concatenate the string again.
Here is an example. I used several CTEs so that the process can be followed step by step:
DECLARE @T1 TABLE(
userid int,
username nvarchar(100),
proxy nvarchar(100)
)
INSERT INTO @T1 VALUES(1, 'j.doe', NULL)
INSERT INTO @T1 VALUES(2, 'lechnerio', '1,4')
INSERT INTO @T1 VALUES(3, 'janedoe', '1')
INSERT INTO @T1 VALUES(4, 'mustermann', '2')
;WITH cte1 AS(
SELECT *, username + ' (' + CAST(userid AS nvarchar(2)) + ')' AS DisplayUserName
FROM @T1
),
cte2 AS(
SELECT
userid,
username,
DisplayUserName,
proxy,
CAST(LEFT(proxy, CHARINDEX(',', proxy + ',') - 1) AS NVARCHAR(100)) d1,
STUFF(proxy, 1, CHARINDEX(',', proxy + ','), '') pString
FROM cte1
UNION all
SELECT
userid,
username,
DisplayUserName,
proxy,
CAST(LEFT(pString, CHARINDEX(',', pString + ',') - 1) AS NVARCHAR(100)) d1,
STUFF(pString, 1, CHARINDEX(',', pString + ','), '')
FROM cte2
WHERE
pString > ''
),
cte3 AS(
SELECT c.*, t.DisplayUserName ProxyUserName
FROM cte2 c
LEFT JOIN cte1 t ON t.userid = c.d1
)
SELECT DISTINCT x2.userid, x2.username, x2.proxy
,SUBSTRING(
(
SELECT ',' + x1.ProxyUserName AS [text()]
FROM cte3 x1
WHERE x1.userid = x2.userid
ORDER BY x1.userid, x1.ProxyUserName, x1.proxy
FOR XML PATH ('')
),2,4000) DisplayUserName
FROM cte3 x2
LEFT JOIN cte1 t ON t.userid = x2.d1

Here is another solution for you using STRING_SPLIT() and STUFF() function of SQL Server.
create table MyTable(userid int
, username varchar(50)
, proxy varchar(20))
insert into MyTable values
(1, 'j.doe', null),
(2, 'lechnerio', '1,4'),
(3, 'janedoe', '1'),
(4, 'mustermann', '2')
; with cte as (SELECT
t1.userid
, t1.username
, t1.proxy
, t2.username + '(' + value + ')' as proxyinfo
FROM MyTable as t1
outer apply STRING_SPLIT(t1.[proxy], ',') p
left join dbo.MyTable as t2 on t2.userid = p.value
)
SELECT a.userid
, username
, proxy
, STUFF((
select ', '+ cast(proxyinfo as nvarchar(150))
from cte b
WHERE a.userid = b.userid
FOR XML PATH(''), TYPE
).value('.', 'varchar(max)')
,1,2,'') AS proxyinfo -- strip the leading ', ' (two characters)
FROM cte a
GROUP BY a.userid
, username
, proxy
Here is the working db<>fiddle demo.

Related

Parse XML into a (XPath, Value) pair

Working with XML in SQL Server, given this XML:
<A>
<B>123</B>
<C>
<Cs>234</Cs>
<Cs>345</Cs>
<Cs>12</Cs>
<Cs>2346</Cs>
</C>
</A>
I'd like to produce a result set that looks like this:
| xpath | value |
|--------------|-------|
| (/A/B)[1] | 123 |
| (/A/C/Cs)[1] | 234 |
| (/A/C/Cs)[2] | 345 |
| (/A/C/Cs)[3] | 12 |
| (/A/C/Cs)[4] | 2346 |
Is there a trick that can do this without walking through the XML? Added bonus would include the ability to start somewhere other than the document root. You could pass /A/C to this routine and it would only give the paths under that element.

This is one of the rare cases when the archaic OPENXML() is handy.
XQuery 3.0 introduced a real solution for this task back in 2014: the fn:path() function. Unfortunately, MS SQL Server supports only a subset of XQuery 1.0.
Back to mundane earth.
SQL
DECLARE @xml XML =
N'<A>
<B>123</B>
<C>
<Cs>234</Cs>
<Cs>345</Cs>
<Cs>12</Cs>
<Cs>2346</Cs>
</C>
</A>';
DECLARE @DocHandle INT;
EXEC sp_xml_preparedocument @DocHandle OUTPUT, @xml;
;WITH rs AS
(
SELECT * FROM OPENXML(@DocHandle,'/*')
), cte AS
(
-- anchor
SELECT id
,ParentID
--, nodetype
, [text]
,CAST(id AS VARCHAR(100)) AS [Path]
,CAST('/' + rs.localname AS VARCHAR(1000))
+ N'['
+ CAST(ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS NVARCHAR)
+ N']' AS [XPath]
FROM rs
WHERE ParentID IS NULL
UNION ALL
--recursive member
SELECT t.id
,t.ParentID
--, nodetype = (SELECT nodetype FROM rs WHERE id = t.ParentID)
, t.[text]
, CAST(a.[Path] + ',' + CAST( t.ID AS VARCHAR(100)) AS VARCHAR(100)) AS [Path]
, CAST(a.[XPath] + '/' + IIF(t.nodetype = 2, '@', '')
+ t.localname AS VARCHAR(1000))
+ N'['
+ TRY_CAST(ROW_NUMBER() OVER(PARTITION BY t.localname ORDER BY (SELECT 1)) AS NVARCHAR)
+ N']' AS [XPath]
FROM rs AS t
INNER JOIN cte AS a ON t.ParentId = a.id
)
SELECT ID, ParentID, /*nodetype,*/ [Path]
, REPLACE([XPath],'#text','text()') AS XPath
, [text] AS [Value]
FROM cte
WHERE [text] IS NOT NULL
--AND CAST([text] AS VARCHAR(30)) = '12345'
ORDER BY [Path];
EXEC sp_xml_removedocument @DocHandle;
Output
+----+----------+----------+----------------------------+-------+
| ID | ParentID | Path | XPath | Value |
+----+----------+----------+----------------------------+-------+
| 8 | 2 | 0,2,8 | /A[1]/B[1]/text()[1] | 123 |
| 9 | 4 | 0,3,4,9 | /A[1]/C[1]/Cs[1]/text()[1] | 234 |
| 10 | 5 | 0,3,5,10 | /A[1]/C[1]/Cs[2]/text()[1] | 345 |
| 11 | 6 | 0,3,6,11 | /A[1]/C[1]/Cs[3]/text()[1] | 12 |
| 12 | 7 | 0,3,7,12 | /A[1]/C[1]/Cs[4]/text()[1] | 2346 |
+----+----------+----------+----------------------------+-------+

You can use a recursive CTE. You pass in the XML document in @xml. If the XML comes from a table column, you can use CROSS APPLY YourXml.nodes instead of FROM @xml.nodes.
WITH cte AS (
SELECT
xpath = CONCAT(v.name, '[', ROW_NUMBER() OVER (PARTITION BY v.name ORDER BY (SELECT 1)), ']'),
value = x.nd.value('text()[1]','nvarchar(100)'),
child = x.nd.query('*')
FROM @xml.nodes('*') x(nd)
CROSS APPLY (VALUES (x.nd.value('local-name(.)[1]','nvarchar(max)'))) v(name)
UNION ALL
SELECT
xpath = CONCAT(cte.xpath, '/', v.name, '[', ROW_NUMBER() OVER (PARTITION BY xpath, v.name ORDER BY (SELECT 1)), ']'),
value = x.nd.value('text()[1]','nvarchar(100)'),
child = x.nd.query('*')
FROM cte
CROSS APPLY cte.child.nodes('*') x(nd)
CROSS APPLY (VALUES (x.nd.value('local-name(.)[1]','nvarchar(max)'))) v(name)
)
SELECT
xpath = CONCAT('/', xpath, '/text()[1]'),
value
FROM cte
WHERE value IS NOT NULL;
db<>fiddle
Unfortunately, you cannot use the ancestor:: axis, which would have made this much easier.
If SQL Server supported ancestor:: you could do something like this
SELECT
xpath = '(' + x.nd.query('for $n in ancestor::* return concat("/", local-name($n))') + '/text())[1]',
value = x.nd.value('text()[1]','nvarchar(100)')
FROM @xml.nodes('//*[text()]') x(nd)
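To see what the recursive approach computes, here is a rough Python sketch using the standard library's xml.etree.ElementTree. It is not how SQL Server evaluates the query; the function name `xpath_value_pairs` and its `prefix` parameter are hypothetical. It walks the tree once, numbers siblings per element name, and emits a path for every element that carries text.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def xpath_value_pairs(element, prefix=""):
    """Return (xpath, text) for each element at or below `element` with non-blank text."""
    pairs = []

    def walk(node, path):
        text = (node.text or "").strip()
        if text:
            pairs.append((path + "/text()[1]", text))
        seen = Counter()  # position of each child among siblings of the same name
        for child in node:
            seen[child.tag] += 1
            walk(child, f"{path}/{child.tag}[{seen[child.tag]}]")

    # Assumption: the starting element is the first of its name under `prefix`.
    walk(element, f"{prefix}/{element.tag}[1]")
    return pairs


doc = ET.fromstring(
    "<A><B>123</B><C><Cs>234</Cs><Cs>345</Cs><Cs>12</Cs><Cs>2346</Cs></C></A>"
)
pairs = xpath_value_pairs(doc)
# Starting somewhere other than the document root, as asked in the question:
sub_pairs = xpath_value_pairs(doc.find("C"), prefix="/A[1]")
for xpath, value in pairs:
    print(xpath, value)
```

The per-parent counter corresponds to the ROW_NUMBER() OVER (PARTITION BY ...) expression in the recursive member of the CTE.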

Copying a branch of tree-like structured table

I have the following table, where ID is the pk of the table and is IDENTITY
+----+----------+-----------+-------------+
| ID | ParentID | SomeValue | FullPath |
+----+----------+-----------+-------------+
| 1 | NULL | A | (1) |
| 2 | 1 | A.1 | (1)/(2) |
| 3 | 2 | A.1.1 | (1)/(2)/(3) |
| 4 | NULL | B | (4) |
| 5 | 4 | B.1 | (4)/(5) |
| 6 | 4 | B.2 | (4)/(6) |
| 7 | 6 | B.2.1 | (4)/(6)/(7) |
+----+----------+-----------+-------------+
This table represents data stored in a hierarchical way. I am creating a procedure that will take as input an ID and new_ParentID as parameters; ID (and its children and children's children, etc) will be the branch to copy into new_ParentID.
I started the procedure, but I cannot figure out how to get the new ID of the parent I created in order to add its children. For example, if I want to copy A.1 (and A.1.1) into B.2, once A.1-Copied has been created, I do not know its ID to use as the ParentID of A.1.1-Copied. I'm aware of the function SCOPE_IDENTITY, but I don't know how to use it in a CTE. Here is what I have at the moment:
;WITH Branch
AS
(
SELECT ID,
ParentGroupID,
SomeValue
FROM
#Table1 A
WHERE
ID = @ID
UNION ALL
SELECT E.ID,
E.ParentGroupID,
E.SomeValue
FROM
#Table1 E
INNER JOIN Branch T
ON T.ID = E.ParentGroupID
)
INSERT INTO #Table1
SELECT
CASE WHEN ParentGroupID IS NULL
THEN @new_ParentID
ELSE ???,
SomeValue + '-Copied'
FROM
Branch
How can I use SCOPE_IDENTITY to correctly set the new parent for the children of my copied branch?
EDITS:
Suppose I want to copy branch with ID 4 (so the whole B branch) into ID 2 (so A.1 branch), we should have data as follows:
+----+----------+------------+-----------------------+
| ID | ParentID | SomeValue | FullPath |
+----+----------+------------+-----------------------+
| 1 | NULL | A | (1) |
| 2 | 1 | A.1 | (1)/(2) |
| 3 | 2 | A.1.1 | (1)/(2)/(3) |
| 4 | NULL | B | (4) |
| 5 | 4 | B.1 | (4)/(5) |
| 6 | 4 | B.2 | (4)/(6) |
| 7 | 6 | B.2.1 | (4)/(6)/(7) |
| 8 | 2 | B-Copy | (1)/(2)/(8) |
| 9 | 8 | B.1-Copy | (1)/(2)/(8)/(9) |
| 10 | 8 | B.2-Copy | (1)/(2)/(8)/(10) |
| 11 | 10 | B.2.1-Copy | (1)/(2)/(8)/(10)/(11) |
+----+----------+------------+-----------------------+
I have procedures that update the SomeValue and FullPath values after, so don't worry about those! I'm interested in how to reproduce the hierarchy
Here is the code to insert sample data:
CREATE TABLE #Data
(
ID INT IDENTITY(1,1),
ParentID INT,
SomeValue VARCHAR(30),
FullPath VARCHAR(255)
)
INSERT INTO #Data VALUES(NULL,'A','(1)')
INSERT INTO #Data VALUES('1','A.1','(1)/(2)')
INSERT INTO #Data VALUES('2','A.1.1','(1)/(2)/(3)')
INSERT INTO #Data VALUES(NULL,'B','(4)')
INSERT INTO #Data VALUES('4','B.1','(4)/(5)')
INSERT INTO #Data VALUES('4','B.2','(4)/(6)')
INSERT INTO #Data VALUES('6','B.2.1','(4)/(6)/(7)')

OK, let's not beat around the bush, this is pretty messy, and takes a couple of sweeps.
We need to first use a MERGE here (with no UPDATE clause) so that we can OUTPUT the new and old ID values into a table variable. Then, afterwards we need to use an UPDATE to update all the paths for the new path.
You could likely UPDATE the prior level in the MERGE and at the same time INSERT the current level within the MERGE; however, I didn't go down that path, as it was potentially messier. Therefore, after inserting the rows, I use a further rCTE to create the new paths and UPDATE them.
This gives you the below (annotated) SQL:
USE Sandbox;
GO
CREATE TABLE dbo.Data
(
ID INT IDENTITY(1,1),
ParentID INT,
SomeValue VARCHAR(30),
FullPath VARCHAR(255)
)
INSERT INTO dbo.Data
--VALUES has supported multiple rows since 2008; you should be making use of it.
VALUES(NULL,'A','(1)')
,('1','A.1','(1)/(2)')
,('2','A.1.1','(1)/(2)/(3)')
,(NULL,'B','(4)')
,('4','B.1','(4)/(5)')
,('4','B.2','(4)/(6)')
,('6','B.2.1','(4)/(6)/(7)')
GO
--These are your parameters
DECLARE @BranchToCopy int,
@CopysParent int;
SET @BranchToCopy = 4;
SET @CopysParent = 2;
--Table which will have the data to INSERT in
DECLARE @NewData table (ID int,
ParentID int,
SomeValue varchar(30),
FullPath varchar(255),
Level int);
--Will be used in the MERGE's OUTPUT clause to link the new and old IDs
DECLARE @Keys table (OldID int,
NewID int,
Level int);
--Get the hierarchical data and INSERT into the @NewData variable
WITH rCTE AS(
SELECT D.ID,
D.ParentID,
D.SomeValue,
D.FullPath,
1 AS Level
FROM dbo.Data D
WHERE ID = @BranchToCopy
UNION ALL
SELECT D.ID,
D.ParentID,
D.SomeValue,
D.FullPath,
r.[Level] + 1
FROM dbo.Data D
JOIN rCTE r ON D.ParentID = r.ID)
INSERT INTO @NewData (ID,ParentID,SomeValue,FullPath,Level)
SELECT r.ID,
r.ParentID,
CONCAT(r.SomeValue,'-Copy'),
r.FullPath,
r.[Level]
FROM rCTE r;
--Uncomment to see results
--SELECT *
--FROM @NewData;
--Yes, we're using a WHILE!
--This, however, is what is known as a "set based loop"
DECLARE @i int = 1;
WHILE @i <= (SELECT MAX(Level) FROM @NewData) BEGIN
--We use MERGE here as it allows us to OUTPUT columns that weren't inserted into the table
MERGE INTO dbo.Data USING (SELECT ND.ID,
CASE ND.ID WHEN @BranchToCopy THEN @CopysParent ELSE K.NewID END AS Parent,
ND.SomeValue,
ND.Level
FROM @NewData ND
LEFT JOIN @Keys K ON ND.ParentID = K.OldID
WHERE ND.Level = @i) U ON 0=1
WHEN NOT MATCHED THEN
INSERT (ParentID, SomeValue)
VALUES (U.Parent, U.SomeValue)
OUTPUT U.ID, inserted.ID, U.Level
INTO @Keys (OldID, NewID, Level);
--Increment
SET @i = @i + 1;
END;
--Uncomment to see results
--SELECT *
--FROM dbo.[Data];
--Now we need to do the FullPath, as that would be a pain to do on the fly
DECLARE @Paths table (ID int, NewPath varchar(255));
--Work out the new paths
WITH rCTE AS(
SELECT D.ID,
D.ParentID,
D.SomeValue,
D.FullPath,
CONVERT(varchar(255),NULL) AS NewPath
FROM dbo.Data D
WHERE D.ID = @CopysParent
UNION ALL
SELECT D.ID,
D.ParentID,
D.SomeValue,
D.FullPath,
CONVERT(varchar(255),CONCAT(ISNULL(r.FullPath,r.NewPath),'/(',D.ID,')'))
FROM dbo.Data D
JOIN rCTE r ON D.ParentID = r.ID
JOIN @Keys K ON D.ID = K.NewID) --As we want only the new rows
INSERT INTO @Paths (ID, NewPath)
SELECT ID, NewPath
FROM rCTE
WHERE FullPath IS NULL;
--Update the table
UPDATE D
SET FullPath = P.NewPath
FROM dbo.Data D
JOIN @Paths P ON D.ID = P.ID;
SELECT *
FROM dbo.Data;
GO
--Clean up
DROP TABLE dbo.Data;
DB<>Fiddle
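The level-by-level idea behind the MERGE loop (insert one level, remember which old ID became which new ID, then use that mapping for the next level) can be sketched in Python. This is only an illustration under assumptions: `copy_branch` is a hypothetical name, a dict stands in for the table, and `next_id` imitates the IDENTITY column.

```python
def copy_branch(rows, branch_id, new_parent_id):
    """rows: dict id -> (parent_id, value). Returns a new dict with the copied branch added."""
    original = dict(rows)          # snapshot, so copies are never copied again
    result = dict(rows)
    next_id = max(original) + 1    # stands in for the IDENTITY column
    key_map = {}                   # old id -> new id, like the keys table variable
    level = [branch_id]
    while level:                   # one pass per tree level, like the WHILE loop
        next_level = []
        for old_id in level:
            parent_id, value = original[old_id]
            # The branch root goes under the requested parent; every other row
            # goes under the copy of its original parent.
            new_parent = new_parent_id if old_id == branch_id else key_map[parent_id]
            result[next_id] = (new_parent, value + "-Copy")
            key_map[old_id] = next_id
            next_id += 1
            next_level.extend(i for i in sorted(original) if original[i][0] == old_id)
        level = next_level
    return result


data = {
    1: (None, "A"), 2: (1, "A.1"), 3: (2, "A.1.1"),
    4: (None, "B"), 5: (4, "B.1"), 6: (4, "B.2"), 7: (6, "B.2.1"),
}
copied = copy_branch(data, branch_id=4, new_parent_id=2)
for row_id in sorted(copied):
    print(row_id, copied[row_id])
```

Processing a whole level before moving on matters because a child can only be inserted once the new ID of its parent is known.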

Here's a solution, using only CTE's:
The part config as ( ... ) defines the from and to IDs used for the computation. This could all be done in a TVF.
WITH T AS (
select 1 id, null parentid, 'A' somevalue, '(1)' fullpath union all
select 2 id, 1 parentid, 'A.1' somevalue, '(1)/(2)' fullpath union all
select 3 id, 2 parentid, 'A.1.1' somevalue, '(1)/(2)/(3)' fullpath union all
select 4 id, NULL parentid, 'B' somevalue, '(4)' fullpath union all
select 5 id, 4 parentid, 'B.1' somevalue, '(4)/(5)' fullpath union all
select 6 id, 4 parentid, 'B.2' somevalue, '(4)/(6)' fullpath union all
select 7 id, 6 parentid, 'B.2.1' somevalue, '(4)/(6)/(7)' fullpath
)
, config as (
select 4 from_id, 2 to_id
)
, maxid as (
select max(id) maxid from t
)
, initpath as (
select fullpath from t cross join config where id = to_id
)
, subset_from as (
select t.*, maxid + ROW_NUMBER() over (order by id) new_id, ROW_NUMBER() over (order by id) rn from t cross join config cross join maxid where fullpath like '(' + cast(from_id as varchar) + ')%'
)
, subset_count as (
select count(*) subset_count from subset_from
)
, fullpath_replacements (id, parentid, somevalue, new_id, fullpath, new_fullpath, lvl) as (
select id, parentid, somevalue, new_id, fullpath, replace(fullpath, '(' + cast((select sf.id from subset_from sf where rn = 1) as varchar) + ')', '(' + cast((select sf.new_id from subset_from sf where rn = 1) as varchar) + ')'), 1
from subset_from
union all
select id, parentid, somevalue, new_id, fullpath, replace(new_fullpath, '(' + cast((select sf.id from subset_from sf where sf.rn = fr.lvl + 1) as varchar) + ')', '(' + cast((select sf.new_id from subset_from sf where sf.rn = fr.lvl + 1) as varchar) + ')'), fr.lvl + 1
from fullpath_replacements fr where fr.lvl < (select subset_count from subset_count)
)
, final_replacement as (
select id, parentid, somevalue, new_id, fullpath, (select fullpath from t where t.id = (select to_id from config)) + '/' + new_fullpath new_fullpath, isnull((select sf.new_id from subset_from sf where sf.id = fr.parentid), (select to_id from config)) new_parentid
from fullpath_replacements fr where fr.lvl = (select subset_count from subset_count)
)
select id, parentid, somevalue, fullpath
from (
select * from t
union all
select new_id, new_parentid, somevalue, new_fullpath from final_replacement
) t order by id
The idea is to create new ids with the row_number window function (see subset_from part).
Then make the replacements in the fullpath id by id. That is done using a recursive CTE fullpath_replacements to simulate a loop.
This works because in the fullpath I can always use the brackets to identify which part of the fullpath needs to be exchanged.

Concatenate Multiple rows of a table in SQL Server 2014 / SQL Server 2016 [duplicate]

I have a table like this :
id | movie | actorid | actor | roleid | rolename
----+---------+---------+---------+--------+------------------
1 | mi3 | 121 | tom | 6 | actor
2 | avenger | 104 | scarlett| 4 | actress
2 | avenger | 3 | russo | 2 | action director
I'm expecting the output like :
id | movie | actorid | actor | roleid | rolename
----+---------+---------+----------------+--------+--------------------------
1 | mi3 | 121 | tom | 6 | actor
2 | avenger | 104,3 | scarlett,russo | 4,2 | actress, action director
For the latest SQL Server versions, I saw the STRING_AGG function for concatenating column or row data. But how can I achieve the expected output in SQL Server 2014 using STUFF?

Try this:
DECLARE @DataSource TABLE
(
[id] INT
,[movie] VARCHAR(12)
,[actiorid] INT
,[actor] VARCHAR(12)
,[roleid] INT
,[rolename] VARCHAR(36)
);
INSERT INTO @DataSource ([id], [movie], [actiorid], [actor], [roleid], [rolename])
VALUES (1, 'mi3 ', 121, 'tom ', 6, 'actor')
,(2, 'avenger', 104, 'scarlett', 4, 'actress')
,(2, 'avenger', 3, 'russo', 2, 'action director');
-- SQL Server 2017
SELECT [id]
,[movie]
,STRING_AGG([actiorid], ',') AS [actorid]
,STRING_AGG([actor], ',') AS [actor]
,STRING_AGG([roleid], ',') AS [roleid]
,STRING_AGG([rolename], ',') AS [rolename]
FROM @DataSource
GROUP BY [id]
,[movie];
-- Earlier SQL Server versions (FOR XML PATH + STUFF)
WITH DataSoruce AS
(
SELECT DISTINCT [id]
,[movie]
FROM @DataSource
)
SELECT *
FROM DataSoruce A
CROSS APPLY
(
SELECT STUFF
(
(
SELECT DISTINCT ',' + CAST([actiorid] AS VARCHAR(12))
FROM @DataSource S
WHERE A.[id] = S.[id]
AND A.[movie] = S.[movie]
FOR XML PATH, TYPE
).value('.', 'VARCHAR(MAX)')
,1
,1
,''
)
) R1 ([actiorid])
CROSS APPLY
(
SELECT STUFF
(
(
SELECT DISTINCT ',' + CAST([actor] AS VARCHAR(12))
FROM @DataSource S
WHERE A.[id] = S.[id]
AND A.[movie] = S.[movie]
FOR XML PATH, TYPE
).value('.', 'VARCHAR(MAX)')
,1
,1
,''
)
) R2 ([actor])
CROSS APPLY
(
SELECT STUFF
(
(
SELECT DISTINCT ',' + CAST([roleid] AS VARCHAR(12))
FROM @DataSource S
WHERE A.[id] = S.[id]
AND A.[movie] = S.[movie]
FOR XML PATH, TYPE
).value('.', 'VARCHAR(MAX)')
,1
,1
,''
)
) R3 ([roleid])
CROSS APPLY
(
SELECT STUFF
(
(
SELECT DISTINCT ',' + CAST([rolename] AS VARCHAR(36)) -- match the column width so 'action director' is not truncated
FROM @DataSource S
WHERE A.[id] = S.[id]
AND A.[movie] = S.[movie]
FOR XML PATH, TYPE
).value('.', 'VARCHAR(MAX)')
,1
,1
,''
)
) R4 ([rolename]);

How to retrieve a different attribute from a row with multiple different and repeated values into a new column?

I have a table that looks like this:
att1 att2
| a | 1 |
| a | 2 |
| b | 2 |
| b | 3 |
| c | 1 |
| c | 2 |
| c | 2 |
And I need the different record of att2 for the duplicate value on att1 to be grouped into a new column like this
att1 att2 att3
| a | 1 | 2 |
| b | 2 | 3 |
| c | 1 | 2 |
I tried to pivot, I tried to self join, but I can't seem to find the query to separate the values like this. Can someone please help me? Thanks

You can use a dynamic pivot query like the one below (see demo link):
create table tt (att1 varchar(10), att2 int)
insert into tt values
('a',1)
,('a',2)
,('b',2)
,('b',3)
,('c',1)
,('c',2)
,('c',2)
go
declare @q varchar(max), @cols varchar(max)
set @cols
= STUFF((
SELECT distinct ',' +
QUOTENAME('att '+
cast(1+ row_number() over (partition by att1 order by att2 ) as varchar(max))
)
FROM (select distinct att1,att2 from tt)tt --note this distinct
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @q=
'select
att1,'+ @cols +
' from
(
select
att1,att2,
''att ''+
cast(1+row_number() over (partition by att1 order by att2 ) as varchar(max)) as r
from
(select distinct att1,att2 from tt)tt
)I
pivot
(
max(att2)
for r in ('+@cols+')
)piv'
exec(@q)

Any query like this always smells like report formatting, rather than genuine data requirement, which should probably be done in a reporting tool rather than a database. But as with all things it is possible with enough code.
This should work for you.
create table #t (att1 nvarchar(max) ,att2 int);
insert #t select 'a', 1 union all select 'a', 2;
insert #t select 'b', 2 union all select 'b', 3;
insert #t select 'c', 1 union all select 'c', 2 union all select 'c', 2;
select att1, [1] as att2, [2] as att3 from
(
select att1, att2, row_number() over (partition by att1 order by att1, att2) as r
from (select distinct att1, att2 from #t) as x
) src
pivot ( avg(att2) for r in ([1],[2])) p;
drop table #t;
The first step is to get the distinct values in your table, and then sort and group them by att1. I'm doing this with a row_number() command, which looks like this:
select att1, att2, row_number() over (partition by att1 order by att1, att2) as r
from (select distinct att1, att2 from #t) as x ;
att1 att2 r
a 1 1
a 2 2
b 2 1
b 3 2
c 1 1
c 2 2
From there the pivot command transforms rows into columns. The catch with the pivot command is that the names of those new columns need to be data driven; during your row_number command you could provide better names, or you can alias them as I have done here.
Finally, this only works when there are only two values to pivot. To add more, modify the for r in ([1], [2]) line to include e.g. 3, 4, etc.
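The three steps described above (distinct rows, numbering within each att1 group, then turning the numbers into columns) can be followed in a short Python sketch. It only mirrors the logic of the static pivot; `pivot_distinct` and its `width` parameter are names made up for this illustration.

```python
def pivot_distinct(rows, width=2):
    """rows: list of (att1, att2). Returns one tuple per att1 with `width` value columns."""
    grouped = {}
    # set() plays the role of SELECT DISTINCT; sorting matches ORDER BY att1, att2,
    # so the position in each list is the row_number() within the att1 partition.
    for att1, att2 in sorted(set(rows)):
        grouped.setdefault(att1, []).append(att2)
    result = []
    for att1, values in grouped.items():
        # Pad with None (NULL) or cut off, like a fixed IN ([1],[2]) column list.
        padded = (values + [None] * width)[:width]
        result.append((att1, *padded))
    return result


rows = [("a", 1), ("a", 2), ("b", 2), ("b", 3), ("c", 1), ("c", 2), ("c", 2)]
pivoted = pivot_distinct(rows)
for row in pivoted:
    print(row)
```

Raising `width` corresponds to adding more entries to the IN list of the PIVOT clause.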

Recursive sum in tree structure

I have a tree structure in a single table. The table is a tree of categories that can be nested endlessly. Each category has a ProductCount column that tells how many products are directly in the category (not summing child categories).
Id | ParentId | Name | ProductCount
------------------------------------
1 | -1 | Cars | 0
2 | -1 | Bikes | 1
3 | 1 | Ford | 10
4 | 3 | Mustang | 7
5 | 3 | Focus | 4
I would like to make a sql query that for each row/category gives me the number of products including the ones in the child categories.
The output for the table above should be
Id | ParentId | Name | ProductCount | ProductCountIncludingChildren
--------------------------------------------------------------------------
1 | -1 | Cars | 0 | 21
2 | -1 | Bikes | 1 | 1
3 | 1 | Ford | 10 | 21
4 | 3 | Mustang | 7 | 7
5 | 3 | Focus | 4 | 4
I know I probably should use a CTE, but I can't quite get it working the way it should.
Any help is appreciated!

You can use a recursive CTE: the anchor part gets all rows, and the recursive part joins to get the child rows. Carry the original Id through from the anchor part, aliased as RootID, and aggregate with SUM in the main query, grouped by RootID.
SQL Fiddle
MS SQL Server 2012 Schema Setup:
create table T
(
Id int primary key,
ParentId int,
Name varchar(10),
ProductCount int
);
insert into T values
(1, -1, 'Cars', 0),
(2, -1, 'Bikes', 1),
(3, 1, 'Ford', 10),
(4, 3, 'Mustang', 7),
(5, 3, 'Focus', 4);
create index IX_T_ParentID on T(ParentID) include(ProductCount, Id);
Query 1:
with C as
(
select T.Id,
T.ProductCount,
T.Id as RootID
from T
union all
select T.Id,
T.ProductCount,
C.RootID
from T
inner join C
on T.ParentId = C.Id
)
select T.Id,
T.ParentId,
T.Name,
T.ProductCount,
S.ProductCountIncludingChildren
from T
inner join (
select RootID,
sum(ProductCount) as ProductCountIncludingChildren
from C
group by RootID
) as S
on T.Id = S.RootID
order by T.Id
option (maxrecursion 0)
Results:
| ID | PARENTID | NAME | PRODUCTCOUNT | PRODUCTCOUNTINCLUDINGCHILDREN |
|----|----------|---------|--------------|-------------------------------|
| 1 | -1 | Cars | 0 | 21 |
| 2 | -1 | Bikes | 1 | 1 |
| 3 | 1 | Ford | 10 | 21 |
| 4 | 3 | Mustang | 7 | 7 |
| 5 | 3 | Focus | 4 | 4 |
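The same idea, treating every row as a root, collecting everything reachable below it, and summing per root, can be written as a small Python sketch. It is an illustration of the logic only; `counts_including_children` is a hypothetical name and the tuples mirror the sample table.

```python
def counts_including_children(categories):
    """categories: list of (id, parent_id, name, product_count). Returns id -> rolled-up count."""
    children = {}
    for cid, parent_id, _, _ in categories:
        children.setdefault(parent_id, []).append(cid)
    own = {cid: count for cid, _, _, count in categories}

    totals = {}
    for root_id in own:                 # anchor part: every row is its own root
        total, stack = 0, [root_id]
        while stack:                    # recursive part: follow ParentId = Id downwards
            current = stack.pop()
            total += own[current]
            stack.extend(children.get(current, []))
        totals[root_id] = total         # SUM(ProductCount) ... GROUP BY RootID
    return totals


categories = [
    (1, -1, "Cars", 0),
    (2, -1, "Bikes", 1),
    (3, 1, "Ford", 10),
    (4, 3, "Mustang", 7),
    (5, 3, "Focus", 4),
]
totals = counts_including_children(categories)
print(totals)
```

Like the CTE, this visits each descendant once per ancestor, so the work grows with the depth of the tree as well as its size.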

This is the same concept as Tom's answer, but less code (and way faster).
with cte as
(
select v.Id, v.ParentId, v.Name, v.ProductCount,
cast('/' + cast(v.Id as varchar) + '/' as varchar) Node
from Vehicle v
where ParentId = -1
union all
select v.Id, v.ParentId, v.Name, v.ProductCount,
cast(c.Node + CAST(v.Id as varchar) + '/' as varchar)
from Vehicle v
join cte c on v.ParentId = c.Id
)
select c1.Id, c1.ParentId, c1.Name, c1.ProductCount,
c1.ProductCount + SUM(isnull(c2.ProductCount, 0)) ProductCountIncludingChildren
from cte c1
left outer join cte c2 on c1.Node <> c2.Node and left(c2.Node, LEN(c1.Node)) = c1.Node
group by c1.Id, c1.ParentId, c1.Name, c1.ProductCount
order by c1.Id
SQL Fiddle (I added some extra data rows for testing)

Actually, this could be a good use of HIERARCHYID in SQL Server.
CREATE TABLE [dbo].[CategoryTree]
(
[Id] INT,
[ParentId] INT,
[Name] VARCHAR(100),
[ProductCount] INT
)
GO
INSERT [dbo].[CategoryTree]
VALUES
(1, -1, 'Cars', 0),
(2, -1, 'Bikes', 1),
(3, 1, 'Ford', 10),
(4, 3, 'Mustang', 7),
(5, 3, 'Focus', 4)
--,(6, 1, 'BMW', 100)
GO
Query
WITH [cteRN] AS (
SELECT *,
ROW_NUMBER() OVER (
PARTITION BY [ParentId] ORDER BY [ParentId]) AS [ROW_NUMBER]
FROM [dbo].[CategoryTree]
),
[cteHierarchy] AS (
SELECT CAST(
CAST(hierarchyid::GetRoot() AS VARCHAR(100))
+ CAST([ROW_NUMBER] AS VARCHAR(100))
+ '/' AS HIERARCHYID
) AS [Node],
*
FROM [cteRN]
WHERE [ParentId] = -1
UNION ALL
SELECT CAST(
hierarchy.Node.ToString()
+ CAST(RN.[ROW_NUMBER] AS VARCHAR(100)
) + '/' AS HIERARCHYID),
rn.*
FROM [cteRN] rn
INNER JOIN [cteHierarchy] hierarchy
ON rn.[ParentId] = hierarchy.[Id]
)
SELECT x.[Node].ToString() AS [Node],
x.[Id], x.[ParentId], x.[Name], x.[ProductCount],
x.[ProductCount] + SUM(ISNULL(child.[ProductCount],0))
AS [ProductCountIncludingChildren]
FROM [cteHierarchy] x
LEFT JOIN [cteHierarchy] child
ON child.[Node].IsDescendantOf(x.[Node]) = 1
AND child.[Node] <> x.[Node]
GROUP BY x.[Node], x.[Id], x.[ParentId], x.[Name], x.[ProductCount]
ORDER BY x.[Id]

This won't be optimal, but it works. It involves two CTEs: one main CTE and one inside a table-valued function that sums the values for each subtree.
The first CTE:
;WITH cte
AS
(
SELECT
anchor.Id,
anchor.ParentId,
anchor.Name,
anchor.ProductCount,
s.Total AS ProductCountIncludingChildren
FROM
testTable anchor
CROSS APPLY SumChild(anchor.id) s
WHERE anchor.parentid = -1
UNION ALL
SELECT
child.Id,
child.ParentId,
child.Name,
child.ProductCount,
s.Total AS ProductCountIncludingChildren
FROM
cte
INNER JOIN testTable child on child.parentid = cte.id
CROSS APPLY SumChild(child.id) s
)
SELECT * from cte
And the function:
CREATE FUNCTION SumChild
(
@id int
)
RETURNS TABLE
AS
RETURN
(
WITH cte
AS
(
SELECT
anchor.Id,
anchor.ParentId,
anchor.ProductCount
FROM
testTable anchor
WHERE anchor.id = @id
UNION ALL
SELECT
child.Id,
child.ParentId,
child.ProductCount
FROM
cte
INNER JOIN testTable child on child.parentid = cte.id
)
SELECT SUM(ProductCount) AS Total from CTE
)
GO
Which results in:
from the source table
Apologies about formatting.

I couldn't come up with a good T-SQL, set based answer, but I did come up with an answer:
The temp table mimics your table structure. The table variable is a work table.
--Initial table
CREATE TABLE #products (Id INT, ParentId INT, NAME VARCHAR(255), ProductCount INT)
INSERT INTO #products
( ID,ParentId, NAME, ProductCount )
VALUES ( 1,-1,'Cars',0),(2,-1,'Bikes',1),(3,1,'Ford',10),(4,3,'Mustang',7),(5,3,'Focus',4)
--Work table
DECLARE @products TABLE (ID INT, ParentId INT, NAME VARCHAR(255), ProductCount INT, ProductCountIncludingChildren INT)
INSERT INTO @products
( ID ,
ParentId ,
NAME ,
ProductCount ,
ProductCountIncludingChildren
)
SELECT Id ,
ParentId ,
NAME ,
ProductCount,
0
FROM #products
DECLARE @i INT
SELECT @i = MAX(id) FROM @products
--Stupid loop - loops suck
WHILE @i > 0
BEGIN
WITH cte AS (SELECT ParentId, SUM(ProductCountIncludingChildren) AS ProductCountIncludingChildren FROM @products GROUP BY ParentId)
UPDATE p1
SET p1.ProductCountIncludingChildren = p1.ProductCount + isnull(p2.ProductCountIncludingChildren,0)
FROM @products p1
LEFT OUTER JOIN cte p2 ON p1.ID = p2.ParentId
WHERE p1.ID = @i
SELECT @i = @i - 1
END
SELECT *
FROM @products
DROP TABLE #products
I'd be very interested to see a better, set-based approach. The problem I ran into is that recursive CTEs start with the parent and work toward the children, which doesn't really work for getting a sum at the parent levels. You'd have to do some kind of backward recursive CTE.
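The "backward" direction mentioned above, working from the leaves up to the roots, is easy to express outside SQL. The following Python sketch is only meant to show that direction of travel; `rollup_bottom_up` is a hypothetical name. Unlike the loop above, it orders rows by depth and does not assume that children have larger IDs than their parents.

```python
def rollup_bottom_up(categories):
    """categories: list of (id, parent_id, name, product_count). Returns id -> rolled-up count."""
    parent = {cid: pid for cid, pid, _, _ in categories}
    totals = {cid: count for cid, _, _, count in categories}

    def depth(cid):
        # Number of ancestors that exist in the table (roots have depth 0).
        d = 0
        while parent[cid] in parent:
            cid = parent[cid]
            d += 1
        return d

    # Deepest rows first, so a child's total is final before it is added to its parent.
    for cid in sorted(totals, key=depth, reverse=True):
        pid = parent[cid]
        if pid in totals:
            totals[pid] += totals[cid]
    return totals


categories = [
    (1, -1, "Cars", 0),
    (2, -1, "Bikes", 1),
    (3, 1, "Ford", 10),
    (4, 3, "Mustang", 7),
    (5, 3, "Focus", 4),
]
bottom_up = rollup_bottom_up(categories)
print(bottom_up)
```

Each row is added to its parent exactly once, so the rollup itself is a single pass after the ordering step.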
