Grouping data in comma separated format - sql-server

I have a table called SampleData which looks like this:
col1 col2
1 a
1 b
1 c
2 d
2 e
3 f
I need the data in the below format:
col1 col2
1 a,b,c
2 d,e
3 f
Is there a way of doing this using CTE as well?

You can use STUFF with FOR XML PATH if you are using SQL Server 2005 or later.
SELECT
[col1],
STUFF(
(SELECT ',' + [col2]
FROM Table1
WHERE [col1] = a.[col1]
FOR XML PATH ('')) , 1, 1, '') AS col2
FROM Table1 AS a
GROUP BY [col1]
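Since the question also asks about a CTE: on SQL Server 2017 or later, the built-in STRING_AGG aggregate may be a simpler alternative to STUFF with FOR XML PATH. The sketch below is only an illustration; the CTE builds the sample rows inline from the question's data, so the name SampleData refers to the CTE here and not to a real table.

```sql
-- Assumes SQL Server 2017 or later, where STRING_AGG is available.
-- The CTE stands in for the SampleData table from the question.
WITH SampleData (col1, col2) AS (
    SELECT v.col1, v.col2
    FROM (VALUES (1, 'a'), (1, 'b'), (1, 'c'),
                 (2, 'd'), (2, 'e'),
                 (3, 'f')) AS v(col1, col2)
)
SELECT col1,
       -- WITHIN GROUP fixes the order of the values inside each list
       STRING_AGG(col2, ',') WITHIN GROUP (ORDER BY col2) AS col2
FROM SampleData
GROUP BY col1;
```

On older versions the STUFF approach above remains the usual choice.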

I think this is also useful to you:
Comma Separated Value

Related

SQL Server how to turn number of rows to column value?

I have three rows returned from a table, as below:
select ID
from service
Results:
ID
--
1
2
3
How can I return output like below:
count | IDs
-------+----------
3 | 1,2,3
Hope this helps:
SELECT CAST((SELECT COUNT(*) FROM service) AS varchar(10)) + ' | ' +
STUFF(
(
-- cast the numeric ID so it can be concatenated with ','
SELECT ',' + CAST(s.ID AS varchar(10))
FROM service s
ORDER BY s.ID FOR XML PATH('')
),
1, 1, ''
) AS Result
If you use the STUFF() function, you do not need a subquery for the count of IDs:
select count(1) as [count],
stuff(
(select ',' + cast(id as varchar(10)) from service for xml path('')),
1, 1, '') as Ids
from service

How to create a compound field in group by in sqlserver

I have a table that contains two fields (for simplicity). The first is the one I want to group by, and the second is the one I want to show as a comma-separated text field. How can I do this?
So my data is like this:
col1 col2
------ ------
Ashkan s1
Ashkan s2
Ashkan s3
Hasan k1
Hasan k2
Hasan k3
Hasan kachal
I want this
col1 count combination
------ ------ -------
Ashkan 3 s1, s2,s3
Hasan 4 k1, k2,k3,kachal
I can do the group by like below, but how to do the combination?
select [col1],count(*)
FROM mytable
group by [col1]
order by count(*)
You can use FOR XML PATH for this:
select col1, count(*) ,
STUFF((SELECT ',' + col2
FROM mytable AS t2
WHERE t2.col1 = t1.col1
FOR XML PATH('')), 1, 1, '')
FROM mytable AS t1
group by col1
order by count(*)
You can use FOR XML PATH('') to concatenate the strings:
WITH Tbl(col1, col2) AS(
SELECT * FROM(VALUES
('Ashkan', 's1'),
('Ashkan', 's2'),
('Ashkan', 's3'),
('Hasan', 'k1'),
('Hasan', 'k2'),
('Hasan', 'k3'),
('Hasan', 'kachal')
) t(a,b)
)
SELECT
col1,
[count] = COUNT(*),
x.combination
FROM Tbl t
CROSS APPLY(
SELECT STUFF((
SELECT ', ' + col2
FROM Tbl
WHERE col1 = t.col1
FOR XML PATH('')
), 1, 2, '') AS combination
) x
GROUP BY t.col1, x.combination;
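One caveat with the plain FOR XML PATH('') form: characters such as & and < come back XML-encoded. A common workaround is to add the TYPE directive and read the result back with .value(). The sketch below compares the two forms; the sample values 'a&b' and 'c<d' are made up for illustration and are not from the question.

```sql
-- Hypothetical sample values chosen to contain XML special characters.
WITH Tbl(col1, col2) AS (
    SELECT * FROM (VALUES
        ('Ashkan', 'a&b'),
        ('Ashkan', 'c<d')
    ) t(a, b)
)
SELECT t.col1,
       -- plain form: special characters are returned as XML entities
       STUFF((SELECT ',' + t2.col2
              FROM Tbl t2
              WHERE t2.col1 = t.col1
              ORDER BY t2.col2
              FOR XML PATH('')), 1, 1, '') AS plain_xml,
       -- TYPE + .value(): the text is decoded back to the original characters
       STUFF((SELECT ',' + t2.col2
              FROM Tbl t2
              WHERE t2.col1 = t.col1
              ORDER BY t2.col2
              FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 1, '') AS typed_xml
FROM Tbl t
GROUP BY t.col1;
```

Here plain_xml contains &amp; and &lt;, while typed_xml keeps the original & and < characters.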

Concatenate column values against another column group

This may be odd, but I need to concatenate the ActionId values for each RoleId group, and ordering by ActionId is a must. Something like:
ActionID RoleId
"1357" 1
"2468" 2
Here is what i have currently, I am looking for GROUP_CONCAT equivalent in MS SQL.
select av.ActionId, ra.RoleId from RoleAction ra join ActionValue av
on ra.ActionId = av.ActionId order by av.ActionId
ActionID RoleId
1 1
3 1
5 1
7 1
4 2
2 2
6 2
8 2
Is there a way to do that? Thanks in advance.
You can make it work using FOR XML PATH('') and an inner query:
SELECT DISTINCT T1.RoleID,
(SELECT '' + ActionID
FROM RoleAction T2
WHERE T1.RoleID = T2.RoleID
ORDER BY ActionID
FOR XML PATH(''))
FROM RoleAction T1
This should work:
WITH CTE_A AS
(
select av.ActionId, ra.RoleId from RoleAction ra join ActionValue av
on ra.ActionId = av.ActionId
)
SELECT A.RoleId,
(SELECT '' +
CAST(B.ActionId AS varchar(10))
FROM CTE_A B
WHERE B.RoleID = A.RoleID
ORDER BY B.ActionId -- the question requires the values in ActionId order
FOR XML PATH('')) AS ActionID
FROM CTE_A A
GROUP BY A.RoleID

SQL Server value update from another table only if not null

I am trying to update a column of table A with the values from a column in table B wherever TableA.col1 = TableB.col1.
Problem: the TableA column value is overwritten with NULL if its col1 value is not found in TableB.col1.
My current query is
UPDATE [tableA]
SET col2 = (SELECT col2 FROM [tableB] WHERE [TableB].col1 = [TableA].col1)
How can I avoid this?
Ex: TableA
Col1 Col2
1 100
2 200
3 300
TableB
Col1 Col2
1 1000
3 3000
Resulting table should be:
Table A
Col1 Col2
1 1000
2 200
3 3000
But I get:
Col1 Col2
1 1000
2 null
3 3000
Any ideas?
You could do:
UPDATE [tableA]
SET col2 = COALESCE(
(SELECT col2 FROM [tableB] WHERE [TableB].col1 = [TableA].col1),
col2)
COALESCE returns the first non-NULL expression among its arguments.
Or, you could do:
UPDATE a
SET col2 = b.col2
FROM TableA a
INNER JOIN
TableB b
ON
a.col1 = b.col1
but you should be aware that this second form is SQL Server dialect, not standard SQL.
You don't want to update the whole table, so your query needs a WHERE clause. In this case:
WHERE exists (select 1
from [tableB]
where [TableB].col1=[TableA].col1
and [TableB].col2 is not NULL -- that condition may or may not be needed
)
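Putting the WHERE EXISTS filter together with the original UPDATE might look like the sketch below. It uses temporary tables #TableA and #TableB (names chosen here for illustration) loaded with the sample data from the question, so it can be run without touching real tables.

```sql
-- Temp tables stand in for TableA and TableB from the question.
CREATE TABLE #TableA (Col1 INT, Col2 INT);
CREATE TABLE #TableB (Col1 INT, Col2 INT);
INSERT #TableA VALUES (1, 100), (2, 200), (3, 300);
INSERT #TableB VALUES (1, 1000), (3, 3000);

-- Only rows with a match in #TableB are touched, so row 2 keeps its value.
UPDATE a
SET Col2 = (SELECT b.Col2 FROM #TableB b WHERE b.Col1 = a.Col1)
FROM #TableA a
WHERE EXISTS (SELECT 1 FROM #TableB b WHERE b.Col1 = a.Col1);

SELECT Col1, Col2 FROM #TableA ORDER BY Col1;
-- Col1 Col2
-- 1    1000
-- 2    200
-- 3    3000
```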
This should do it, no?
UPDATE [tableA]
SET col2 = (select col2 from [tableB] where [TableB].col1=[TableA].col1)
WHERE EXISTS (select 1 from [tableB] where [TableB].col1=[TableA].col1 and [TableB].col2 IS NOT NULL)

Combine multiple rows into list for multiple columns

I'm aware that the "combine multiple rows into list" question has been answered a million times, and here's a reference to an awesome article: Concatenating row values in transact sql
I have a need to combine multiple rows into lists for multiple columns at the same time
ID | Col1 | Col2 ID | Col1 | Col2
------------------ => ------------------
1 A X 1 A X
2 B Y 2 B,C Y,Z
2 C Z
I tried the XML method, but it has proven to be very slow over large tables:
SELECT DISTINCT
[ID],
[Col1] = STUFF((SELECT ',' + t2.[Col1]
FROM #Table t2
WHERE t2.ID = t.ID
FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'),1,1,''),
[Col2] = STUFF((SELECT ',' + t2.[Col2]
FROM #Table t2
WHERE t2.ID = t.ID
FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'),1,1,'')
FROM #Table t
My current solution is to use a stored procedure that builds each ID row separately. I'm wondering if there's another approach I could use (other than a loop).
For each column, rank the rows to combine (partition by the key column)
End up with a table like
ID | Col1 | Col2 | Col1Rank | Col2Rank
1 A X 1 1
2 B Y 1 1
2 C Z 2 2
Create a new table containing top rank columns for each ID
ID | Col1Comb | Col2Comb
1 A X
2 B Y
Loop through each remaining rank in increasing order (in this case 1 iteration)
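DECLARE @irank INT = 2; -- rank 1 is already in #newtable
DECLARE @maxrank INT = (SELECT MAX(Col1Rank) FROM #oldtable); -- assumes both rank columns share the same maximum
WHILE @irank <= @maxrank
BEGIN
update n set
n.Col1Comb = n.Col1Comb + ',' + o.Col1, -- so append the rank 2 items
n.Col2Comb = n.Col2Comb + ',' + o.Col2 -- if they are not null
from #newtable n
join #oldtable o
on o.ID = n.ID
where o.Col1Rank = @irank or o.Col2Rank = @irank;
SET @irank += 1;
END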
A CTE trick can be used where you update the CTE.
Method 1: a new parallel table to which the data is copied and then concatenated:
CREATE TABLE #Table1(ID INT, Col1 VARCHAR(1), Col2 VARCHAR(1), RowID INT IDENTITY(1,1));
CREATE TABLE #Table1Concat(ID INT, Col3 VARCHAR(MAX), Col4 VARCHAR(MAX), RowID INT);
GO
INSERT #Table1 VALUES(1,'A','X'), (2,'B','Y'), (2,'C','Z');
GO
INSERT #Table1Concat
SELECT * FROM #Table1;
GO
DECLARE @Cat1 VARCHAR(MAX) = '';
DECLARE @Cat2 VARCHAR(MAX) = '';
-- Note: this relies on rows being updated in the CTE's ORDER BY order, which SQL Server does not guarantee.
; WITH CTE AS (
SELECT TOP 2147483647 t1.*, t2.Col3, t2.Col4, r = ROW_NUMBER()OVER(PARTITION BY t1.ID ORDER BY t1.Col1, t1.Col2)
FROM #Table1 t1
JOIN #Table1Concat t2 ON t1.RowID = t2.RowID
ORDER BY t1.ID, t1.Col1, t1.Col2
)
UPDATE CTE
SET @Cat1 = Col3 = CASE r WHEN 1 THEN ISNULL(Col1,'') ELSE @Cat1 + ',' + Col1 END
, @Cat2 = Col4 = CASE r WHEN 1 THEN ISNULL(Col2,'') ELSE @Cat2 + ',' + Col2 END;
GO
SELECT ID, Col3 = MAX(Col3)
, Col4 = MAX(Col4)
FROM #Table1Concat
GROUP BY ID
Method 2: Add the concatenation columns directly to the original table and concatenate the new columns:
CREATE TABLE #Table1(ID INT, Col1 VARCHAR(1), Col2 VARCHAR(1), Col1Cat VARCHAR(MAX), Col2Cat VARCHAR(MAX));
GO
INSERT #Table1(ID,Col1,Col2) VALUES(1,'A','X'), (2,'B','Y'), (2,'C','Z');
GO
DECLARE @Cat1 VARCHAR(MAX) = '';
DECLARE @Cat2 VARCHAR(MAX) = '';
; WITH CTE AS (
SELECT TOP 2147483647 t1.*, r = ROW_NUMBER()OVER(PARTITION BY t1.ID ORDER BY t1.Col1, t1.Col2)
FROM #Table1 t1
ORDER BY t1.ID, t1.Col1, t1.Col2
)
UPDATE CTE
SET @Cat1 = Col1Cat = CASE r WHEN 1 THEN ISNULL(Col1,'') ELSE @Cat1 + ',' + Col1 END
, @Cat2 = Col2Cat = CASE r WHEN 1 THEN ISNULL(Col2,'') ELSE @Cat2 + ',' + Col2 END;
GO
SELECT ID, Col1Cat = MAX(Col1Cat)
, Col2Cat = MAX(Col2Cat)
FROM #Table1
GROUP BY ID;
GO
Try this one -
Query 1:
CREATE TABLE #temp
(
ID INT
, Col1 VARCHAR(30)
, Col2 VARCHAR(30)
)
INSERT INTO #temp (ID, Col1, Col2)
VALUES
(1, 'A', 'X'),
(2, 'B', 'Y'),
(2, 'C', 'Z')
SELECT
r.ID
, Col1 = STUFF(REPLACE(REPLACE(CAST(d.x.query('/t1/a') AS VARCHAR(MAX)), '<a>', ','), '</a>', ''), 1, 1, '')
, Col2 = STUFF(REPLACE(REPLACE(CAST(d.x.query('/t2/a') AS VARCHAR(MAX)), '<a>', ','), '</a>', ''), 1, 1, '')
FROM (
SELECT DISTINCT ID
FROM #temp
) r
OUTER APPLY (
SELECT x = CAST((
SELECT
[t1/a] = t2.Col1
, [t2/a] = t2.Col2
FROM #temp t2
WHERE r.ID = t2.ID
FOR XML PATH('')
) AS XML)
) d
Query 2:
SELECT
r.ID
, Col1 = STUFF(REPLACE(CAST(d.x.query('for $a in /a return xs:string($a)') AS VARCHAR(MAX)), ' ,', ','), 1, 1, '')
, Col2 = STUFF(REPLACE(CAST(d.x.query('for $b in /b return xs:string($b)') AS VARCHAR(MAX)), ' ,', ','), 1, 1, '')
FROM (
SELECT DISTINCT ID
FROM #temp
) r
OUTER APPLY (
SELECT x = CAST((
SELECT
[a] = ',' + t2.Col1
, [b] = ',' + t2.Col2
FROM #temp t2
WHERE r.ID = t2.ID
FOR XML PATH('')
) AS XML)
) d
Output:
ID Col1 Col2
----------- ---------- ----------
1 A X
2 B,C Y,Z
One solution, and one that is at least syntactically straightforward, is to use a User-Defined Aggregate to "Join" the values together. This requires SQLCLR, and while some people are reluctant to enable it, it provides a set-based approach that does not need to re-query the base table for each column. Joining is the opposite of Splitting: it creates a comma-separated list from what were individual rows.
Below is a simple example that uses the SQL# (SQLsharp) library which comes with a User-Defined Aggregate named Agg_Join() that does exactly what is being asked for here. You can download the Free version of SQL# from http://www.SQLsharp.com/ and the example SELECTs from a standard system view. (And to be fair, I am the author of SQL# but this function is available for free).
SELECT sc.[object_id],
OBJECT_NAME(sc.[object_id]) AS [ObjectName],
SQL#.Agg_Join(sc.name) AS [ColumnNames],
SQL#.Agg_Join(DISTINCT sc.system_type_id) AS [DataTypes]
FROM sys.columns sc
GROUP BY sc.[object_id]
I recommend testing this against your current solution(s) to see which is the fastest for the volume of data you expect to have in at least the next year or two.
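If SQLCLR is not an option and you are on SQL Server 2017 or later, the built-in STRING_AGG aggregate might be another set-based candidate to include in that comparison. A minimal sketch using the sample data from the question (whether it is faster on your data is something only testing can tell):

```sql
-- Assumes SQL Server 2017 or later, where STRING_AGG is available.
CREATE TABLE #Table (ID INT, Col1 VARCHAR(30), Col2 VARCHAR(30));
INSERT #Table (ID, Col1, Col2) VALUES (1, 'A', 'X'), (2, 'B', 'Y'), (2, 'C', 'Z');

-- Both lists are built in a single pass over the table.
-- The same ORDER BY is used for both so the positions line up.
SELECT ID,
       STRING_AGG(Col1, ',') WITHIN GROUP (ORDER BY Col1) AS Col1,
       STRING_AGG(Col2, ',') WITHIN GROUP (ORDER BY Col1) AS Col2
FROM #Table
GROUP BY ID;
-- ID Col1 Col2
-- 1  A    X
-- 2  B,C  Y,Z
```

For long lists, casting the input to VARCHAR(MAX) or NVARCHAR(MAX) avoids the 8000-byte limit on the result.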
