Concat multiple rows in SSIS in one column - sql-server

I have two tables:
Table 1 (Name, (...), ProductID)
Table 2 (Customer Name, ProductID)
Sample Table 1
Test1 | 1
Sample Table 2:
Customer1 | 1
Customer2 | 1
The table schema is fixed and I cannot change it and I cannot create additional views, etc.
With SQL Server Integration Services I have to create a new table. Within this new table I need to have one column with the customers like Customer1; Customer2 for Test1.
I know that I could do it with COALESCE, but I have no idea how to do that within SSIS. Which transformation should I use, and how?
Update
Here is the sample for COALESCE:
DECLARE @names VARCHAR(150)
SELECT @names = COALESCE(@names + '; ', '') + [CustomerName]
FROM Table2
SELECT @names
How can I embed this snippet in a new SELECT * FROM Table1?

Take your OLEDB source and use a SQL query for that source.
SELECT T1.Name, T2.Customer
FROM [TABLE 1] AS T1
INNER JOIN [Table 2] as T2
ON T1.ProductID = T2.ProductID
Then use a data flow to move that to your OLEDB destination. No need to get fancy with SSIS on this if you can easily make the database engine handle it.

Your best, and most performant, solution would be to use a SQL Query as your Source, instead of the raw tables. Then you can do the COALESCE (or Concatenation) in the SQL query and it will be passed through the SSIS data pipe.

Ok, I've found the solution:
SELECT [All_My_Other_Fields], STUFF(
(SELECT '; ' + [CustomerName]
FROM Table2
WHERE Table2.ProductID=Table1.ProductID
FOR XML PATH (''))
, 1, 2, '') AS Customers
FROM Table1
Thank you all for your help!
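For later readers, here is a hedged sketch of two variants of the same correlated subquery. It assumes the column names [Name], [CustomerName] and ProductID from the question, and the aliases t1 and t2 are only illustrative. The first variant adds TYPE and .value() so that characters such as & or < in a customer name are not returned as XML entities. The second uses STRING_AGG, which is only available on SQL Server 2017 or later.

```sql
-- Variant 1: FOR XML PATH with TYPE/.value(), so special characters in
-- CustomerName are not returned as entities such as &amp; or &lt;.
SELECT t1.[Name],
       STUFF((SELECT '; ' + t2.[CustomerName]
              FROM Table2 AS t2
              WHERE t2.ProductID = t1.ProductID
              FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'),
             1, 2, '') AS Customers   -- 1, 2 strips the two-character prefix '; '
FROM Table1 AS t1;

-- Variant 2 (assumes SQL Server 2017 or later): STRING_AGG needs no STUFF.
SELECT t1.[Name],
       (SELECT STRING_AGG(t2.[CustomerName], '; ')
        FROM Table2 AS t2
        WHERE t2.ProductID = t1.ProductID) AS Customers
FROM Table1 AS t1;
```

Either statement can be pasted into the OLE DB source as its SQL command, so the concatenation still happens in the database engine rather than in an SSIS transformation.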

Related

Issue converting SQL column to CSV via XML

I'm replacing a comma-delimited field in a SQL Server reporting application with a many-to-many relationship. The application code can't be replaced just yet, so I need to derive the same CSV column it currently expects. I've used the FOR XML PATH trick in the past, and it seems like a fast set-based solution that should be easy to implement.
My current query looks like this:
SELECT
Report = rr.Report_ID,
RoleList = STUFF((SELECT ', ' + r.[Name] AS [text()]
FROM [dbo].[Role] r
WHERE r.Role_ID = rr.Role_ID
FOR XML PATH ('')), 1, 1, '')
FROM
[dbo].ReportRole rr
ORDER BY
rr.Report_ID;
What I expect is this:
Report RoleList
--------------------------------------------------------------------
2 Senior Application Developer
3 Senior Application Developer, Manager Information Systems
But what I get is this instead:
Report RoleList
--------------------------------------
2 Senior Application Developer
3 Senior Application Developer
3 Manager Information Systems
I'm using SQL Server 2017. Does this version not support the XML-to-CSV hack from previous versions?
You've tagged your question with SQL-Server 2017 and you ask:
Does this version not support the XML-to-CSV hack from previous versions?
Yes, it does (as Paurian's answer told you), but it is even better: This version supports STRING_AGG():
Not knowing your tables, I set up a mini mockup to simulate your issue (based on your statement that "the tables are normalized and the join table is a simple many-to-many ID map"):
Two tables with a m:n-mapping table in between
DECLARE @mockupA TABLE(ID INT,SomeValue VARCHAR(10));
INSERT INTO @mockupA VALUES(1,'A1'),(2,'A2');
DECLARE @mockupB TABLE(ID INT,SomeValue VARCHAR(10));
INSERT INTO @mockupB VALUES(1,'B1'),(2,'B2');
DECLARE @mockupMapping TABLE(ID_A INT,ID_B INT);
INSERT INTO @mockupMapping VALUES(1,1),(1,2),(2,2);
--The query will simply join these tables, then use GROUP BY together with STRING_AGG(). The WITHIN GROUP clause allows you to determine the sort order of the concatenated string.
SELECT a.ID,a.SomeValue
,STRING_AGG(b.SomeValue,', ') WITHIN GROUP(ORDER BY b.ID) AS B_Values
FROM @mockupA a
INNER JOIN @mockupMapping m ON a.ID=m.ID_A
INNER JOIN @mockupB b ON b.ID=m.ID_B
GROUP BY a.ID,a.SomeValue;
The result
ID  SomeValue  B_Values
1   A1         B1, B2
2   A2         B2
Having a table named "ReportRole" indicates a linking table between Report and Role in your case. If that's true, you could make the ReportRole table a part of your inner query and keep your Report IDs in the external query:
SELECT
Report = rpt.Report_ID,
RoleList = STUFF((SELECT ', ' + r.[Name] AS [text()]
FROM [dbo].[Role] r
JOIN [dbo].ReportRole rr
ON r.Role_ID = rr.Role_ID
WHERE rr.Report_ID = rpt.Report_ID
FOR XML PATH ('')), 1, 2, '')
FROM
[dbo].Report rpt
ORDER BY
rpt.Report_ID;
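Since the question mentions SQL Server 2017, the same result can probably be produced with STRING_AGG directly against the linking table. This is only a sketch: it assumes dbo.ReportRole holds (Report_ID, Role_ID) pairs and dbo.[Role] holds (Role_ID, [Name]), as the queries above suggest, and the ordering by role name is an arbitrary choice.

```sql
-- One row per report, with its role names collapsed into a single CSV string.
SELECT rr.Report_ID AS Report,
       STRING_AGG(r.[Name], ', ') WITHIN GROUP (ORDER BY r.[Name]) AS RoleList
FROM dbo.ReportRole AS rr
JOIN dbo.[Role] AS r
  ON r.Role_ID = rr.Role_ID
GROUP BY rr.Report_ID
ORDER BY rr.Report_ID;
```

Unlike the correlated subquery against dbo.Report, this form only returns reports that have at least one role, because it starts from the linking table.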

Test if a SQL Server column exists without using the schema?

A 3rd party DB we read in ADO.Net recently added a column in a new version of their code. It's a fkey to a new table.
We have read-only access to the data tables, so in theory cannot rely on the schema to do this. So...
1) is INFORMATION_SCHEMA always available for items you can access, or is it possible we will not have rights even to tables we can read?
2) if (1) is "bad", what would be the canonical solution? In SQL itself I would do a SELECT * FROM x WHERE 1=0 and then test the headers, is there an equivalent test in .Net?
You can list the column names (values are optional) from any Table or Query via a little XML.
Example
Declare @AnyTableOrQuery Table (EmpID int,EmpName varchar(50),Salary int,Location varchar(100))
Insert Into @AnyTableOrQuery Values
(1,'Arul',100,null)
,(2,'Jane',120,'New York')
Select B.*
From ( values (cast((Select Top 1 * From @AnyTableOrQuery for XML RAW,ELEMENTS XSINIL) as xml))) A(XMLData)
Cross Apply (
Select Column_Name = a.value('local-name(.)','varchar(100)')
,Column_Value = a.value('.','varchar(max)')
From A.XMLData.nodes('/row') as C1(n)
Cross Apply C1.n.nodes('./*') as C2(a)
) B
Returns
Column_Name Column_Value
EmpID 1
EmpName Arul
Salary 100
Location
EDIT
@MauryMarkowitz provided a much better solution (see comment below)
sp_describe_first_result_set @tsql = N'Select * from YourTable'
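To test for one specific column rather than list them all, something like the following may be enough. It is a sketch under assumptions: dbo.YourTable and NewColumn are placeholder names, and sys.dm_exec_describe_first_result_set requires SQL Server 2012 or later. Both statements are ordinary T-SQL, so they can be sent from ADO.NET like any other command.

```sql
-- COL_LENGTH returns NULL when the column does not exist
-- (or when the caller is not permitted to see the object).
IF COL_LENGTH(N'dbo.YourTable', N'NewColumn') IS NOT NULL
    PRINT 'NewColumn exists';
ELSE
    PRINT 'NewColumn not found';

-- Function form of sp_describe_first_result_set: it can be filtered like a table.
-- An empty result means the query does not expose a column with that name.
SELECT name, system_type_name, is_nullable
FROM sys.dm_exec_describe_first_result_set(N'SELECT * FROM dbo.YourTable', NULL, 0)
WHERE name = N'NewColumn';
```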

How to retrieve all record from every table where id = 1 in MS SQL Server 2008 R2

How do I retrieve all records from every table (e.g. table1, table2, table3, ... tableN) where id = 1 from a single database (e.g. database1) in SQL Server 2008 R2?
Let's suppose I have one database containing an arbitrary number of tables (e.g. table1, table2, ..., tableN). Is it possible to get all the records from the entire database where id = 1 in each table? I think it is possible with INFORMATION_SCHEMA.TABLES or INFORMATION_SCHEMA.COLUMNS, but I don't know how to use them.
Any help appreciated
Thanks in advance!
You can use the undocumented sp_msforeachtable
sp_msforeachtable
@command1 = 'SELECT * FROM ? WHERE id=1',
@whereand = ' And Object_id In (Select Object_id From sys.columns Where name=''id'')'
@command1 is your query. The question mark is a placeholder the stored procedure uses to insert the table name.
@whereand limits the search to just tables that have the column named id
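If relying on an undocumented procedure is a concern, roughly the same thing can be done with documented catalog views and sp_executesql. This is a sketch, not a drop-in replacement: the variable name @sql and the TableName output column are illustrative, and it assumes every column named id can be compared with the integer 1.

```sql
DECLARE @sql NVARCHAR(MAX) = N'';

-- Build one SELECT per table that has a column named id.
-- QUOTENAME protects against unusual schema or table names.
SELECT @sql = @sql
    + N'SELECT ''' + REPLACE(s.name + N'.' + t.name, '''', '''''')
    + N''' AS TableName, * FROM '
    + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name)
    + N' WHERE id = 1;' + NCHAR(13) + NCHAR(10)
FROM sys.tables AS t
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
JOIN sys.columns AS c ON c.object_id = t.object_id
WHERE c.name = N'id';

PRINT @sql;                   -- inspect the generated batch first
EXEC sys.sp_executesql @sql;  -- returns one result set per table
```

Each table comes back as its own result set, so the tables do not need matching column lists the way a UNION ALL would require.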
I don't know much about MySQL; I use this in DB2:
Select C.NAME,C.ROLLNUMBER from TABLE1 C,TABLE2 A where C.ROLLNUMBER=A.ROLLNUMBER and id = 1 order by C.CIRCLENAME
Hope this works too.
If you want to mention the tables manually try this.
SELECT COL1,COL2,COL3....COLN FROM TABLE1 WHERE ID=1
UNION ALL
SELECT COL1,COL2,COL3....COLN FROM TABLE2 WHERE ID=1
UNION ALL
SELECT COL1,COL2,COL3....COLN FROM TABLE3 WHERE ID=1
:
:
:
UNION ALL
SELECT COL1,COL2,COL3....COLN FROM TABLEN WHERE ID=1
Note: col1, col2, col3 ... coln must have the same data types in all the listed tables.
This builds the UNION ALL query dynamically for every table, filtering on id = 1:
SELECT STUFF((SELECT '
UNION ALL
SELECT COL1,COL2,COL3..COLN FROM '+TABLE_NAME + ' WHERE ID=1 '
FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_TYPE='BASE TABLE' FOR XML PATH(''),type).value('.', 'NVARCHAR(MAX)'),1,11,'')
Note: all of your tables must contain the common column ID. Change the column names as needed, but every column in the SELECT list must exist in all of your tables.
The best approach would be to generate a dynamic query and execute it to get the required information.
To generate the query, you can use the schema-related system tables and feed the data into a table with a fixed format, i.e. a table with a predefined column structure. That makes it easier to load the data.
For example:
CREATE TABLE AllTableData
(
TableId int,
TableName nvarchar(250),
TableData NVARCHAR(max),
SelectedId int
)
Where TableId is the id of table from system table and TableData will contain the concatenated value string of all columns of a table with some separator identifier.
;WITH T AS
(
SELECT
T1.*
FROM
INFORMATION_SCHEMA.TABLES T1
INNER JOIN INFORMATION_SCHEMA.COLUMNS T2 ON T1.TABLE_NAME = T2.TABLE_NAME
WHERE
T2.COLUMN_NAME = 'Id'
AND T1.TABLE_TYPE='BASE TABLE'
),
DynamicQuery AS (
SELECT
1 AS Id,
CONCAT(
'SELECT ', QUOTENAME(T.TABLE_NAME,''''),' AS [TableName],',
CONCAT(' CONCAT(',STUFF((SELECT CONCAT(', [' , C.COLUMN_NAME,']' )
FROM INFORMATION_SCHEMA.COLUMNS C
WHERE C.TABLE_NAME = T.TABLE_NAME
FOR XML PATH('')), 1, 1, ''),') AS [TableData]'
)
,', 1 AS [SelectedId] FROM ', T.TABLE_NAME,' WHERE Id = 1'
) [FinalString]
FROM T
)
SELECT DISTINCT
STUFF((SELECT ' UNION ALL ' + DQ2.FinalString
FROM DynamicQuery DQ2
WHERE DQ2.Id = DQ1.Id
FOR XML PATH('')), 1, 10, '') [FinalString]
FROM DynamicQuery DQ1
GROUP BY DQ1.Id, DQ1.FinalString
I think this is what you are looking for.

How to frame insert statements from select statement data in SQL Server?

I've selected some data from a table, which returns some rows. Now I want to generate INSERT statements from the result data in SQL Server.
Please suggest a solution.
If the destination table is a new table, you can use the SELECT INTO statement.
We can copy all columns into the new table:
SELECT *
INTO newtable
FROM table1;
Or we can copy only the columns we want into the new table:
SELECT column_name(s)
INTO newtable
FROM table1;
The new table will be created with the column-names and types as defined in the SELECT statement. You can apply new names using the AS clause.
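To generate INSERT statements from the selected rows instead, build them with string concatenation, like this: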
SELECT 'insert into tabledestination (col1destination,col2destination)
values (' + col1source + ',' + col2source + ')'
FROM tablesource;
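The concatenation above only works as written when both source columns are character data, and the generated values are not quoted. A slightly more defensive sketch follows. It assumes, purely for illustration, that col1source is a character column and col2source is an integer column; adjust the CAST and quoting to the real types.

```sql
SELECT 'INSERT INTO tabledestination (col1destination, col2destination) VALUES ('
       -- character column: wrap in quotes, double any embedded quote, keep NULLs
       + ISNULL('''' + REPLACE(col1source, '''', '''''') + '''', 'NULL')
       + ', '
       -- integer column: convert to text, keep NULLs
       + ISNULL(CAST(col2source AS VARCHAR(20)), 'NULL')
       + ');'
FROM tablesource;
```

Each output row is one complete INSERT statement that can be copied into a script and run against the destination table.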

Performance issue when trying to query based on concatenated columns in SQL

I have two tables
Table1:
-------------------------------
id | pid | name | place | num |
-------------------------------
Table2:
------------------
pid | name | key |
------------------
Now I am writing a query that filters on a concatenation of two columns, one from table1 and the other from table2.
select *
from table1 join table2
on table1.pid = table2.pid
and table2.key + '-' + table1.num = 'ABC-123'
Because the concatenation spans two tables, the query has to scan most of the rows, so fetching the result is very slow rather than near-instantaneous as expected.
What would be advisable in such a case? Can anyone help me with this?
Initially I thought of creating a function-based index for a performance gain, but I am not sure whether it would help. Moreover, I could not find a way to create a function-based index on two columns from different tables.
New Addition:
The answers given were legitimate, but I feel they are moving away from my actual requirement. Suppose I have a requirement like this:
select table2.key + '-' + table1.num identity
from table1 join table2
on table1.pid = table2.pid
The actual requirement is that I have to concatenate the values from both tables and expose it in a view. Then anyone can query on that column identity from the view. So basically the concatenation will be a must.
Try like this,
SELECT *
FROM table1
JOIN table2
ON table1.pid = table2.pid
AND table2.KEY = 'ABC'
AND table1.num = '123'
If you're only searching on one value at a time, you can split the values up programmatically. This is similar to the other answers, but does the splitting for you.
CREATE PROCEDURE usp_Get_Concat_ID
@ConcatID AS varchar(20)
AS
BEGIN
-- The concatenated value is key + '-' + num, so the key is the part before the dash.
declare @Table2Key as varchar(20) = SUBSTRING(@ConcatID, 0, CHARINDEX('-', @ConcatID))
declare @Table1Num as varchar(20) = SUBSTRING(@ConcatID, CHARINDEX('-', @ConcatID)+1, LEN(@ConcatID))
select *
from table1 join table2
on table1.pid = table2.pid
and table1.num = @Table1Num
and table2.[key] = @Table2Key
END
GO
Then, you can just call usp_Get_Concat_ID 'ABC-123'.
If you want users to be able to query on the concatenated value efficiently, you can create an indexed view. Additional view columns can be added as needed. Be aware of the SET option requirements for updating the tables referenced by the indexed view (https://msdn.microsoft.com/en-us/library/ms191432.aspx). If you are not using SQL Server Enterprise or Developer edition, you'll need to add the NOEXPAND hint to queries in order for the view index to be considered by the optimizer.
Also, I strongly suggest you avoid using reserved keywords as column names (key and identity):
CREATE VIEW dbo.IndexedView
WITH SCHEMABINDING
AS
SELECT table2.[key] + '-' + table1.num [identity]
FROM dbo.table1
JOIN dbo.table2
ON table1.pid = table2.pid;
GO
CREATE UNIQUE CLUSTERED INDEX cdx_IndexedView ON dbo.IndexedView([identity]);
GO
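Querying the view might then look like the sketch below. It assumes the dbo.IndexedView definition above and a search value of 'ABC-123'. The NOEXPAND hint is what lets the optimizer use the view's index on editions that do not match indexed views automatically.

```sql
-- Seek on the clustered index of the view instead of re-running the join.
SELECT [identity]
FROM dbo.IndexedView WITH (NOEXPAND)
WHERE [identity] = 'ABC-123';
```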
Instead of using the single string 'ABC-123', try splitting it into 'ABC' and '123':
SELECT *
FROM table1
JOIN table2
ON table1.pid = table2.pid
AND
table2.key IN ('ABC')
AND
table1.num IN ('123')
