Issue converting SQL column to CSV via XML

I'm replacing a comma-delimited field in a SQL Server reporting application with a many-to-many relationship. The application code can't be replaced just yet, so I need to derive the same CSV column it currently expects. I've used the FOR XML PATH trick in the past, and it seems like a fast set-based solution that should be easy to implement.
My current query looks like this:
SELECT
Report = rr.Report_ID,
RoleList = STUFF((SELECT ', ' + r.[Name] AS [text()]
FROM [dbo].[Role] r
WHERE r.Role_ID = rr.Role_ID
FOR XML PATH ('')), 1, 1, '')
FROM
[dbo].ReportRole rr
ORDER BY
rr.Report_ID;
What I expect is this:
Report RoleList
--------------------------------------------------------------------
2 Senior Application Developer
3 Senior Application Developer, Manager Information Systems
But what I get is this instead:
Report RoleList
--------------------------------------
2 Senior Application Developer
3 Senior Application Developer
3 Manager Information Systems
I'm using SQL Server 2017. Does this version not support the XML-to-CSV hack from previous versions?

You've tagged your question with SQL-Server 2017 and you ask:
Does this version not support the XML-to-CSV hack from previous versions?
Yes, it does (as Paurian's answer told you), but it is even better: This version supports STRING_AGG():
Not knowing your tables, I set up a mini mockup to simulate your issue (going by your statement that "the tables are normalized and the join table is a simple many-to-many ID map"):
--Two tables with an m:n mapping table in between
DECLARE @mockupA TABLE(ID INT,SomeValue VARCHAR(10));
INSERT INTO @mockupA VALUES(1,'A1'),(2,'A2');
DECLARE @mockupB TABLE(ID INT,SomeValue VARCHAR(10));
INSERT INTO @mockupB VALUES(1,'B1'),(2,'B2');
DECLARE @mockupMapping TABLE(ID_A INT,ID_B INT);
INSERT INTO @mockupMapping VALUES(1,1),(1,2),(2,2);
--The query will simply join these tables, then use GROUP BY together with STRING_AGG(). The WITHIN GROUP clause allows you to determine the sort order of the concatenated string.
SELECT a.ID,a.SomeValue
,STRING_AGG(b.SomeValue,', ') WITHIN GROUP(ORDER BY b.ID) AS B_Values
FROM @mockupA a
INNER JOIN @mockupMapping m ON a.ID=m.ID_A
INNER JOIN @mockupB b ON b.ID=m.ID_B
GROUP BY a.ID,a.SomeValue;
The result:
ID  SomeValue  B_Values
1   A1         B1, B2
2   A2         B2
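Applied to the tables from the question, the same pattern might look like the sketch below. The table and column names come from the question; the table variables, the Role_ID values and the choice to order the list by Role_ID are assumptions made only so the example runs on its own.

```sql
-- Stand-ins for [dbo].[Role] and [dbo].ReportRole; the ID values are made up.
DECLARE @Role TABLE (Role_ID INT, [Name] VARCHAR(50));
INSERT INTO @Role VALUES
    (1, 'Senior Application Developer'),
    (2, 'Manager Information Systems');

DECLARE @ReportRole TABLE (Report_ID INT, Role_ID INT);
INSERT INTO @ReportRole VALUES (2, 1), (3, 1), (3, 2);

-- One row per report; the role names are concatenated per group.
-- STRING_AGG requires SQL Server 2017 or later.
SELECT
    Report   = rr.Report_ID,
    RoleList = STRING_AGG(r.[Name], ', ') WITHIN GROUP (ORDER BY r.Role_ID)
FROM @ReportRole AS rr
INNER JOIN @Role AS r
    ON r.Role_ID = rr.Role_ID
GROUP BY rr.Report_ID
ORDER BY rr.Report_ID;
```

With these sample rows, report 3 should come back as a single row listing both roles, which is the shape the question expects.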

Having a table named "ReportRole" indicates a linking table between Report and Role in your case. If that's true, you could make the ReportRole table a part of your inner query and keep your Report IDs in the outer query, so that each report produces exactly one row:
SELECT
Report = rpt.Report_ID,
RoleList = STUFF((SELECT ', ' + r.[Name] AS [text()]
FROM [dbo].[Role] r
JOIN [dbo].ReportRole rr
ON r.Role_ID = rr.Role_ID
WHERE rr.Report_ID = rpt.Report_ID
FOR XML PATH ('')), 1, 2, '')
FROM
[dbo].Report rpt
ORDER BY
rpt.Report_ID;
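Two details of the FOR XML PATH approach are easy to miss. The third argument of STUFF is the number of characters to delete, so it should match the separator length (2 for ', '). Also, FOR XML entity-encodes characters such as &, < and >; adding TYPE and reading the result back with .value() returns the plain text. A small sketch with a made-up role name (the table variable and its rows are illustrative only):

```sql
-- Illustrative data; 'Research & Development' is chosen to show the encoding.
DECLARE @Role TABLE (Role_ID INT, [Name] VARCHAR(50));
INSERT INTO @Role VALUES (1, 'Research & Development'), (2, 'Manager');

SELECT
    -- Plain FOR XML PATH: yields 'Research &amp; Development, Manager'
    Escaped = STUFF((SELECT ', ' + r.[Name]
                     FROM @Role AS r
                     ORDER BY r.Role_ID
                     FOR XML PATH('')), 1, 2, ''),
    -- TYPE + .value() decodes the entities: 'Research & Development, Manager'
    Decoded = STUFF((SELECT ', ' + r.[Name]
                     FROM @Role AS r
                     ORDER BY r.Role_ID
                     FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '');
```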

Related

Concatenate text from multiple rows into a single text string in SQL Server

The script shown here works in SQL Server but NOT in Snowflake SQL. What is the equivalent in Snowflake SQL?
SELECT DISTINCT
ST2.SubjectID,
SUBSTRING((SELECT ',' + ST1.StudentName AS [text()]
FROM dbo.Students ST1
WHERE ST1.SubjectID = ST2.SubjectID
ORDER BY ST1.SubjectID
FOR XML PATH (''), TYPE).value('text()[1]', 'nvarchar(max)'), 2, 1000) [Students]
FROM
dbo.Students ST2
Results from the sample: it concatenates the text from all the rows into a single text string per ID.
I tried the above in SQL Server and it worked. However, I need to use a data warehouse in Snowflake, and Snowflake doesn't support FOR XML PATH. It has XMLGET, but I can't figure out how to use it.

You seem to want LISTAGG. An implementation should look like this:
select SubjectId, listagg(distinct StudentName,',') as Students
from your_table
group by SubjectId;

As Lukasz mentions, the FOR XML PATH ('') syntax in SQL Server was a common way to implement string aggregation before later SQL Server versions added an explicit operator for it. This answer describes how it works in SQL Server.
If you are on a version of SQL Server that supports the operator, you could change your code to use STRING_AGG and test that it gives the correct results on SQL Server. Then, to migrate to Snowflake, you can simply change the STRING_AGG keyword to LISTAGG.
If you have a lot of such SQL to convert, you might consider using tooling that will recognize such specialized syntax and convert it to the simpler form.
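As a rough illustration of that intermediate step, the query from the question could be rewritten for SQL Server 2017 or later along these lines. The table variable and its rows are made up for the example. One assumption worth checking during a migration: STRING_AGG in SQL Server does not accept DISTINCT, so duplicates are removed in a derived table first, whereas Snowflake's LISTAGG can take DISTINCT directly.

```sql
-- Illustrative stand-in for dbo.Students; the rows are invented.
DECLARE @Students TABLE (SubjectID INT, StudentName NVARCHAR(100));
INSERT INTO @Students VALUES
    (1, N'Mary'), (1, N'John'), (1, N'John'), (2, N'Sam');

-- De-duplicate first, then aggregate per subject in a defined order.
SELECT d.SubjectID,
       STRING_AGG(d.StudentName, ',') WITHIN GROUP (ORDER BY d.StudentName) AS Students
FROM (SELECT DISTINCT SubjectID, StudentName FROM @Students) AS d
GROUP BY d.SubjectID;
-- Subject 1 yields 'John,Mary'; subject 2 yields 'Sam'.
```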

So if your source data looks like:
select * from values
(1, 'student_a'),
(1, 'student_b'),
(1, 'student_c'),
(2, 'student_z'),
(2, 'student_a')
both the old code and Phil's code will return the names in an arbitrary order. The original code did an ORDER BY SubjectID, but that is the value being grouped on, so it does not order the names within each group.
In Snowflake, ordering within the list can be done with within group (order by studentname),
so Phil's answer becomes:
select
subjectid,
listagg(distinct studentname,',') within group (order by studentname) as students
from students
group by subjectid;
which then gives the results:
SUBJECTID  STUDENTS
2          student_a,student_z
1          student_a,student_b,student_c

Test if a SQL Server column exists without using the schema?

A 3rd party DB we read in ADO.Net recently added a column in a new version of their code. It's a fkey to a new table.
We have read-only access to the data tables, so in theory cannot rely on the schema to do this. So...
1) Is INFORMATION_SCHEMA always available for items you can access, or is it possible we will not have rights even to tables we can read?
2) If (1) is "bad", what would be the canonical solution? In SQL itself I would do a SELECT * FROM x WHERE 1=0 and then test the headers. Is there an equivalent test in .NET?

You can list the column names (values are optional) from any Table or Query via a little XML.
Example
Declare @AnyTableOrQuery Table (EmpID int,EmpName varchar(50),Salary int,Location varchar(100))
Insert Into @AnyTableOrQuery Values
 (1,'Arul',100,null)
,(2,'Jane',120,'New York')
Select B.*
 From ( values (cast((Select Top 1 * From @AnyTableOrQuery for XML RAW,ELEMENTS XSINIL) as xml))) A(XMLData)
 Cross Apply (
    Select Column_Name = a.value('local-name(.)','varchar(100)')
          ,Column_Value = a.value('.','varchar(max)')
     From A.XMLData.nodes('/row') as C1(n)
     Cross Apply C1.n.nodes('./*') as C2(a)
 ) B
Returns
Column_Name Column_Value
EmpID 1
EmpName Arul
Salary 100
Location
EDIT
@MauryMarkowitz provided a much better solution (see comment below):
EXEC sp_describe_first_result_set @tsql = N'Select * from YourTable'
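The same metadata is also exposed as a table-valued function, sys.dm_exec_describe_first_result_set, which can be filtered like any other rowset. A minimal sketch of an existence check, assuming SQL Server 2012 or later; dbo.YourTable and NewColumn are placeholder names, not taken from the question:

```sql
-- Placeholder names: replace dbo.YourTable and NewColumn with the real ones.
-- Returns 1 if the first result set of the query contains the column, else 0.
SELECT HasColumn =
    CASE WHEN EXISTS (
        SELECT 1
        FROM sys.dm_exec_describe_first_result_set(
                 N'SELECT * FROM dbo.YourTable', NULL, 0) AS rs
        WHERE rs.name = N'NewColumn'
    ) THEN 1 ELSE 0 END;
```

From ADO.NET this can be run as an ordinary scalar query, so the check does not depend on reading INFORMATION_SCHEMA directly.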

Converting n columns to delimited string from text only subquery

This is for Microsoft SQL Server 2016 (SP1) - 13.0.4001.0
I have a rather annoying bit of data that needs to be converted into a comma separated string. My options are limited due to it being a 3rd party software that replaces text in my query and runs it. For example, I will write the following:
SELECT %myvalues% for xml path('')
Which then gets turned into:
SELECT 'test1','test2','test3'...'testn' for xml path('')
Which returns
test1test2test3...testn
This works, but it doesn't separate the text with commas or spaces. Here's the result I want:
test1, test2, test3, ... testn
The problem is, I can't control how it inserts the text. I did find the STUFF function among a bunch of other solutions but none seem to work when I don't know the column names.
For example, I get:
Only one expression can be specified in the select list when the subquery is not introduced with EXISTS.

I have no idea if this is the best way to do it, but I have had success with the following method:
IF OBJECT_ID('mytable', 'U') IS NOT NULL DROP TABLE mytable;
CREATE TABLE mytable ("#remove#'mylist', 'Ofannoying', 'comma', 'seperated values'" INT NULL);
SELECT REPLACE(REPLACE((SELECT c.Name FROM sys.columns c INNER JOIN sys.objects o ON o.object_id = c.object_id WHERE o.name = 'mytable'), '''', ''), '#remove#', '')
Where my actual code looks something like this:
CREATE TABLE mytable ("#remove##myvalues#" INT NULL);

SQL Server: Self Join query; select only records with matching first name WITHOUT a where statement

I am completely new to SQL Server. I'm stuck on a lab question. I cannot use a WHERE statement to limit the results. I attached the directions I was given below. The expected result should return 6 rows. I am currently returning 122 rows. We are using Microsoft SQL Server Management Studio. We are pulling from a large, pre-configured database with thousands of records.
This is the quoted text from the lab.
Write a SELECT statement that returns three columns:
VendorID From the Vendors table
VendorName From the Vendors table
Contact Name Alias for VendorContactFName and VendorContactLName, with a space in between.
Write a SELECT statement which compares each vendor whose VendorContactFName has the same first name as another VendorContactFName. In other words, find all the different Vendors whose VendorContactFName have the same first name.
Compound JOIN Condition.
No WHERE condition. Sort the final result set by Contact Name (6 rows returned)
Hint: Use a self-join & correlation names; Ex: The Vendor table is both V1 & V2. QUALIFY ALL COLUMN NAMES in the query including those in the SELECT statement
This is what I have come up with so far, but can't figure out how to limit the records without a WHERE statement. I may have excess code that I don't need here, or missing code that I do need.
Here's the code I came up with to start.
SELECT
V1.VendorID AS VendorID, V1.VendorName AS VendorName,
V1.VendorContactFName + ' ' + V1.VendorContactLName AS [Contact Name]
FROM
Vendors AS V1
JOIN
Vendors AS V2 ON (V1.VendorContactFName = V2.VendorContactFName)
AND (V1.VendorID = V2.VendorID)
ORDER BY
[Contact Name];

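You just need to update the JOIN condition: the first name should match between V1 and V2, but the VendorID should be different, so that a vendor is never paired with itself. DISTINCT removes the duplicates that appear when a first name is shared by more than two vendors. Also use the CONCAT function for the contact name: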
SELECT DISTINCT V1.VendorID AS VendorID,
V1.VendorName AS VendorName,
CONCAT(V1.VendorContactFName, ' ', V1.VendorContactLName)
AS [Contact Name]
FROM Vendors AS V1 JOIN Vendors AS V2
ON (V1.VendorContactFName = V2.VendorContactFName) AND
(V1.VendorID <> V2.VendorID)
ORDER BY [Contact Name]
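To see why the compound condition and DISTINCT matter, here is a small self-contained sketch. The table variable and its four rows are invented for illustration and are not from the lab database.

```sql
-- Invented sample data: three vendors share the first name 'Ann', one does not.
DECLARE @Vendors TABLE (
    VendorID INT,
    VendorName VARCHAR(50),
    VendorContactFName VARCHAR(50),
    VendorContactLName VARCHAR(50)
);
INSERT INTO @Vendors VALUES
    (1, 'Acme', 'Ann', 'Lee'),
    (2, 'Bolt', 'Ann', 'Ray'),
    (3, 'Core', 'Ann', 'Kim'),
    (4, 'Dyno', 'Bob', 'Fox');

-- Each 'Ann' matches the two other 'Ann' rows, so without DISTINCT
-- every one of them would be listed twice. 'Bob' has no partner row
-- with a different VendorID and drops out of the inner join.
SELECT DISTINCT
       V1.VendorID,
       V1.VendorName,
       CONCAT(V1.VendorContactFName, ' ', V1.VendorContactLName) AS [Contact Name]
FROM @Vendors AS V1
JOIN @Vendors AS V2
    ON V1.VendorContactFName = V2.VendorContactFName
   AND V1.VendorID <> V2.VendorID
ORDER BY [Contact Name];
-- Returns three rows: Ann Kim, Ann Lee, Ann Ray.
```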

Concat multiple rows in SSIS in one column

I have two tables:
Table 1 (Name, (...), ProductID)
Table 2 (Customer Name, ProductID)
Sample Table 1
Test1 | 1
Sample Table 2:
Customer1 | 1
Customer2 | 1
The table schema is fixed and I cannot change it and I cannot create additional views, etc.
With SQL Server Integration Services I have to create a new table. Within this new table I need to have one column with the customers like Customer1; Customer2 for Test1.
I know that I could do it with COALESCE, but I have no idea how to do that within SSIS. Which transformation should I use, and how?
Update
Here is the sample for COALESCE:
DECLARE @names VARCHAR(150)
SELECT @names = COALESCE(@names + '; ', '') + [CustomerName]
FROM Table2
SELECT @names
How do I insert this snippet into a new SELECT * FROM Table1?

Take your OLEDB source and use a SQL query for that source.
SELECT T1.Name, T2.Customer
FROM [TABLE 1] AS T1
INNER JOIN [Table 2] as T2
ON T1.ProductID = T2.ProductID
Then use a data flow to move that to your OLEDB destination. No need to get fancy with SSIS on this if you can easily make the database engine handle it.

Your best, and most performant, solution would be to use a SQL Query as your Source, instead of the raw tables. Then you can do the COALESCE (or Concatenation) in the SQL query and it will be passed through the SSIS data pipe.

Ok, I've found the solution:
SELECT [All_My_Other_Fields], STUFF(
(SELECT '; ' + [CustomerName]
FROM Table2
WHERE Table2.ProductID=Table1.ProductID
FOR XML PATH (''))
, 1, 2, '') AS Customers
FROM Table1
Thank you all for your help!