Unpivot dynamic table columns into key-value rows - sql-server

The problem I need to solve is transferring data from one table with many dynamic fields into another, structured key-value table.
The first table comes from a data export from another system, and has the following structure (it can have any column names and data):
[UserID],[FirstName],[LastName],[Email],[How was your day],[Would you like to receive weekly newsletter],[Confirm that you are 18+] ...
The second table is where I want to put the data, and it has the following structure:
[UserID uniqueidentifier],[QuestionText nvarchar(500)],[Question Answer nvarchar(max)]
I have seen many examples showing how to unpivot a table, but my problem is that I don't know what columns the first table will have. Can I somehow dynamically unpivot the first table, so that no matter what columns it has, it is converted into a key-value structure, and then import the data into the second table?
I would really appreciate your help with this.

You can't pivot or unpivot in one query without knowing the columns.
What you can do, assuming you have the privileges, is query sys.columns to get the field names of your source table, then build an unpivot query dynamically.
--Source table
create table MyTable (
id int,
Field1 nvarchar(10),
Field2 nvarchar(10),
Field3 nvarchar(10)
);
insert into MyTable (id, Field1, Field2, Field3) values ( 1, 'aaa', 'bbb', 'ccc' );
insert into MyTable (id, Field1, Field2, Field3) values ( 2, 'eee', 'fff', 'ggg' );
insert into MyTable (id, Field1, Field2, Field3) values ( 3, 'hhh', 'iii', 'jjj' );
--key/value table
create table MyValuesTable (
id int,
[field] sysname,
[value] nvarchar(10)
);
declare @columnString nvarchar(max)
--This recursive CTE examines the source table's columns excluding
--the 'id' column explicitly and builds a string of column names
--like so: '[Field1], [Field2], [Field3]'.
;with columnNames as (
select column_id, name
from sys.columns
where object_id = object_id('MyTable','U')
and name <> 'id'
),
columnString (id, string) as (
select
2, cast('' as nvarchar(max))
union all
select
b.id + 1, b.string + case when b.string = '' then '' else ', ' end + '[' + a.name + ']'
from
columnNames a
join columnString b on b.id = a.column_id
)
select top 1 @columnString = string from columnString order by id desc
--Now I build a query around the column names which unpivots the source and inserts into the key/value table.
declare @sql nvarchar(max)
set @sql = '
insert MyValuestable
select id, field, value
from
(select * from MyTable) b
unpivot
(value for field in (' + @columnString + ')) as unpvt'
--Query's ready to run.
exec (@sql)
select * from MyValuesTable
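With the sample data above, MyValuesTable ends up with one row per id/field pair:
id  field   value
1   Field1  aaa
1   Field2  bbb
1   Field3  ccc
2   Field1  eee
2   Field2  fff
2   Field3  ggg
3   Field1  hhh
3   Field2  iii
3   Field3  jjj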
In case you're getting your source data from a stored procedure, you can use OPENROWSET to get the data into a table, then examine that table's column names. This link shows how to do that part.
https://stackoverflow.com/a/1228165/300242
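For example, a minimal sketch, assuming 'Ad Hoc Distributed Queries' is enabled and a hypothetical procedure dbo.MyExport; adjust the provider and connection string to your environment:
-- Materialize a procedure's result set into a temp table so its
-- columns can be examined. dbo.MyExport is a hypothetical procedure.
SELECT *
INTO #MyTable
FROM OPENROWSET('SQLNCLI',
    'Server=(local);Trusted_Connection=yes;',
    'EXEC dbo.MyExport');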
Final note: If you use a temporary table, remember that you get the column names from tempdb.sys.columns like so:
select column_id, name
from tempdb.sys.columns
where object_id = object_id('tempdb..#MyTable','U')

Related

SQL Server extract data from XML column without tag names

I have an XML string:
<XML>
<xml_line>
<col1>1</col1>
<col2>foo 1</col2>
</xml_line>
<xml_line>
<col1>2</col1>
<col2>foo 2</col2>
</xml_line>
</XML>
I am extracting data from that string (stored in @data_xml) by storing it in a SQL Server table and parsing it:
-- create temp table, insert XML string
CREATE TABLE table1 (data_xml XML)
INSERT table1
SELECT @data_xml
-- parse XML string into temp table
SELECT
N.C.value('col1[1]', 'int') col1_name,
N.C.value('col2[1]', 'varchar(31)') col2_name
FROM
table1
CROSS APPLY
data_xml.nodes('//xml_line') N(C)
I would like to know if there is a generic way to accomplish the same without specifying column names (i.e. col1[1], col2[1])
You can use something like this (the let $i expression numbers each line by counting the xml_line nodes that precede it in document order):
SELECT
N.C.value('let $i := . return count(//xml_line[. << $i]) + 1', 'int') as LineNumber,
Item.Node.value('local-name(.)', 'varchar(max)') name,
Item.Node.value('.', 'varchar(max)') value
FROM
table1
CROSS APPLY
data_xml.nodes('//xml_line') N(C)
CROSS APPLY
N.C.nodes('*') Item(Node)
To get:
LineNumber  name  value
1           col1  1
1           col2  foo 1
2           col1  2
2           col2  foo 2
See this db<>fiddle.
However, to spread columns horizontally, you will need to generate dynamic SQL after querying for distinct element names.
ADDENDUM: Here is an updated db<>fiddle that also shows a dynamic SQL example.
The above maps all values as VARCHAR(MAX). If you have NVARCHAR data you can make the appropriate changes. If you have a need to map specific columns to specific types, you will need to explicitly define and populate a name-to-type mapping table and incorporate that into the dynamic SQL logic. The same may be necessary if you prefer that the result columns be in a specific order.
ADDENDUM 2: This updated db<>fiddle now includes column type and ordering logic.
--------------------------------------------------
-- Extract column names
--------------------------------------------------
DECLARE @Names TABLE (name VARCHAR(100))
INSERT @Names
SELECT DISTINCT Item.Node.value('local-name(.)', 'varchar(max)')
FROM table1
CROSS APPLY data_xml.nodes('//xml_line/*') Item(Node)
--SELECT * FROM @Names
--------------------------------------------------
-- Define column-to-type mapping
--------------------------------------------------
DECLARE @ColumnTypeMap TABLE ( ColumnName SYSNAME, ColumnType SYSNAME, ColumnOrder INT)
INSERT @ColumnTypeMap
VALUES
('col1', 'int', 1),
('col2', 'varchar(10)', 2)
DECLARE @ColumnTypeDefault SYSNAME = 'varchar(max)'
--------------------------------------------------
-- Define SQL Templates
--------------------------------------------------
DECLARE @SelectItemTemplate VARCHAR(MAX) =
' , N.C.value(<colpath>, <coltype>) <colname>
'
DECLARE @SqlTemplate VARCHAR(MAX) =
'SELECT
N.C.value(''let $i := . return count(//xml_line[. << $i]) + 1'', ''int'') as LineNumber
<SelectItems>
FROM
table1
CROSS APPLY
data_xml.nodes(''//xml_line'') N(C)
'
--------------------------------------------------
-- Expand SQL templates into SQL
--------------------------------------------------
DECLARE @SelectItems VARCHAR(MAX) = (
SELECT STRING_AGG(SI.SelectItem, '')
WITHIN GROUP(ORDER BY ISNULL(T.ColumnOrder, 999), N.Name)
FROM @Names N
LEFT JOIN @ColumnTypeMap T ON T.ColumnName = N.name
CROSS APPLY (
SELECT SelectItem = REPLACE(REPLACE(REPLACE(
@SelectItemTemplate
, '<colpath>', QUOTENAME(N.name + '[1]', ''''))
, '<colname>', QUOTENAME(N.name))
, '<coltype>', QUOTENAME(ISNULL(T.ColumnType, @ColumnTypeDefault), ''''))
) SI(SelectItem)
)
DECLARE @Sql VARCHAR(MAX) = REPLACE(@SqlTemplate, '<SelectItems>', @SelectItems)
--------------------------------------------------
-- Execute
--------------------------------------------------
SELECT DynamicSql = @Sql
EXEC (@Sql)
Result (with some additional data):
LineNumber  col1  col2   bar    foo
1           1     foo 1  null   More
2           2     foo 2  Stuff  null

Insert multiple rows of data without looping the table data

I have a table that holds some duplicate entries, and I would like to copy the distinct entries to another table without looping over the data. I need to check whether the distinct data exists in the other table and insert whatever is missing. Here is the query I am writing; I feel like it can be implemented better:
CREATE TABLE ForgeRock
([productName] varchar(13));
INSERT INTO ForgeRock
([productName])
VALUES
('OpenIDM'), ('OpenAM'), ('OpenDJ'), ('OpenDJ'),('OpenDJ1');
CREATE TABLE ForgeRock1
([productName] varchar(13));
DECLARE @prodName NVARCHAR(MAX)
SELECT DISTINCT @prodName = STUFF((SELECT ',' + productName
FROM ForgeRock
FOR XML PATH('')) ,1,1,'')
set @prodName = ''''+replace(@prodName,',',''',''')+''''
INSERT INTO ForgeRock1 (productName)
SELECT DISTINCT productName FROM ForgeRock WHERE
productName NOT IN (SELECT productName FROM ForgeRock1
where productName NOT IN (@prodName))
Here is the sample fiddle I tried: http://sqlfiddle.com/#!18/9dbe8f/1/0. Is this query efficient, or can it be done better?
This query should do what you want :)
INSERT INTO ForgeRock1 (productName)
SELECT DISTINCT productName FROM ForgeRock fr
WHERE NOT EXISTS ( SELECT 1 FROM ForgeRock1 fr1 WHERE fr1.productName = fr.productName )
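Alternatively, a sketch using EXCEPT, which de-duplicates and filters out the already-present rows in one step (equivalent to the NOT EXISTS version for non-null values):
INSERT INTO ForgeRock1 (productName)
SELECT productName FROM ForgeRock
EXCEPT -- returns distinct rows from the first query that are absent from the second
SELECT productName FROM ForgeRock1;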

How do I place the values in one column into a comma-delimited variable in SQL Server?

I have a table with values that I would like to use as column names in a dynamic PIVOT query. As such, I need to put the values into a comma-delimited string (and then into a variable).
In other words, I have a table #ColumnData like this:
ID Title
1 Income
2 Rent
3 Utilities
4 Childcare
And I need the column "Title" in this form:
@Variable = [Income],[Rent],[Utilities],[Childcare]
You can use FOR XML PATH('') to concatenate:
CREATE TABLE #Tbl(ID INT IDENTITY(1, 1), Title VARCHAR(50));
INSERT INTO #Tbl(Title) VALUES ('Income'), ('Rent'), ('Utilities'), ('Childcare');
DECLARE @ColumnData VARCHAR(MAX) = '';
SELECT @ColumnData =
STUFF((
SELECT ',' + QUOTENAME(Title)
FROM #Tbl
ORDER BY ID
FOR XML PATH(''), TYPE).value('.[1]','nvarchar(max)')
, 1, 1, '');
SELECT @ColumnData;
DROP TABLE #Tbl;
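On SQL Server 2017 and later, STRING_AGG avoids the FOR XML workaround entirely; a sketch against the same #Tbl (run it before the DROP TABLE above):
DECLARE @Columns VARCHAR(MAX);
-- Concatenate the bracketed titles in ID order, comma-separated.
SELECT @Columns = STRING_AGG(QUOTENAME(Title), ',') WITHIN GROUP (ORDER BY ID)
FROM #Tbl;
SELECT @Columns;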

Get multiple rows using FOR JSON clause

Using PostgreSQL I can have multiple rows of json objects.
select (select ROW_TO_JSON(_) from (select c.name, c.age) as _) as jsonresult from employee as c
This gives me this result:
{"age":65,"name":"NAME"}
{"age":21,"name":"SURNAME"}
But in SqlServer when I use the FOR JSON AUTO clause it gives me an array of json objects instead of multiple rows.
select c.name, c.age from customer c FOR JSON AUTO
[{"age":65,"name":"NAME"},{"age":21,"name":"SURNAME"}]
How can I get the same result format in SQL Server?
By constructing separate JSON in each individual row:
SELECT (SELECT [age], [name] FOR JSON PATH, WITHOUT_ARRAY_WRAPPER)
FROM customer
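This returns one JSON object per row, matching the PostgreSQL output above:
{"age":65,"name":"NAME"}
{"age":21,"name":"SURNAME"}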
There is an alternative form that doesn't require you to know the table structure (but likely has worse performance because it may generate a large intermediate JSON):
SELECT [value] FROM OPENJSON(
(SELECT * FROM customer FOR JSON PATH)
)
No knowledge of the table structure needed, and better performance:
SELECT c.id, jdata.*
FROM customer c
cross apply
(SELECT * FROM customer jc where jc.id = c.id FOR JSON PATH , WITHOUT_ARRAY_WRAPPER) jdata (jdata)
Same as Barak Yellin's answer, but lazier:
1-Create this proc
CREATE PROC PRC_SELECT_JSON(@TBL VARCHAR(100), @COLS VARCHAR(1000)='D.*') AS BEGIN
EXEC('
SELECT X.O FROM ' + @TBL + ' D
CROSS APPLY (
SELECT ' + @COLS + '
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
) X (O)
')
END
2-Can use either all columns or specific columns:
CREATE TABLE #TEST ( X INT, Y VARCHAR(10), Z DATE )
INSERT #TEST VALUES (123, 'TEST1', GETDATE())
INSERT #TEST VALUES (124, 'TEST2', GETDATE())
EXEC PRC_SELECT_JSON '#TEST'
EXEC PRC_SELECT_JSON '#TEST', 'X, Y'
If you're using PHP, add SET NOCOUNT ON; as the first line (why?).

Insert from single table into multiple tables, invalid column name error

I am trying to do the following but getting an "Invalid Column Name {column}" error. Can someone please help me see the error of my ways? We recently split a transaction table into 2 tables, one containing the often-updated report column names and the other containing the unchanging transactions. This leaves me trying to change what was a simple insert into 1 table into a complex insert into 2 tables with unique columns. I attempted to do that like so:
INSERT INTO dbo.ReportColumns
(
FullName
,Type
,Classification
)
OUTPUT INSERTED.Date, INSERTED.Amount, INSERTED.Id INTO dbo.Transactions
SELECT
[Date]
,Amount
,FullName
,Type
,Classification
FROM {multiple tables}
The "INSERTED.Date, INSERTED.Amount" are the source of the errors, with or without the "INSERTED." in front.
-----------------UPDATE------------------
Aaron was correct: it was impossible to manage with an insert, but I was able to vastly improve the functionality of the insert and add some other business rules with the MERGE functionality. My final solution resembles the following:
DECLARE @TransactionsTemp TABLE
(
[Date] DATE NOT NULL,
Amount MONEY NOT NULL,
ReportColumnsId INT NOT NULL
)
MERGE INTO dbo.ReportColumns AS Trgt
USING ( SELECT
{FK}
,[Date]
,Amount
,FullName
,Type
,Classification
FROM {multiple tables}) AS Src
ON Src.{FK} = Trgt.{FK}
WHEN MATCHED THEN
UPDATE SET
Trgt.FullName = Src.FullName,
Trgt.Type= Src.Type,
Trgt.Classification = Src.Classification
WHEN NOT MATCHED BY TARGET THEN
INSERT
(
FullName,
Type,
Classification
)
VALUES
(
Src.FullName,
Src.Type,
Src.Classification
)
OUTPUT Src.[Date], Src.Amount, INSERTED.Id INTO @TransactionsTemp;
MERGE INTO dbo.FinancialReport AS Trgt
USING (SELECT
[Date] ,
Amount ,
ReportColumnsId
FROM @TransactionsTemp) AS Src
ON Src.[Date] = Trgt.[Date] AND Src.ReportColumnsId = Trgt.ReportColumnsId
WHEN NOT MATCHED BY TARGET And Src.Amount <> 0 THEN
INSERT
(
[Date],
Amount,
ReportColumnsId
)
VALUES
(
Src.[Date],
Src.Amount,
Src.ReportColumnsId
)
WHEN MATCHED And Src.Amount <> 0 THEN
UPDATE SET Trgt.Amount = Src.Amount
WHEN MATCHED And Src.Amount = 0 THEN
DELETE;
Note that this works because, unlike a plain INSERT, the OUTPUT clause of a MERGE statement may reference source columns such as Src.[Date] and Src.Amount. Hope that helps someone else in the future. :)
The OUTPUT clause will only return values you are inserting into the target table, so you need multiple inserts. You can try something like the following:
declare @staging table (datecolumn date, amount decimal(18,2),
fullname varchar(50), type varchar(10),
Classification varchar(255));
INSERT INTO @staging
SELECT
[Date]
,Amount
,FullName
,Type
,Classification
FROM {multiple tables}
Declare @temp table (id int, fullname varchar(50), type varchar(10));
INSERT INTO dbo.ReportColumns
(
FullName
,Type
,Classification
)
OUTPUT INSERTED.id, INSERTED.fullname, INSERTED.type INTO @temp
SELECT
FullName
,Type
,Classification
FROM @staging
INSERT INTO dbo.Transactions (id, [date], amount)
select t.id, s.datecolumn, s.amount from @temp t
inner join @staging s on t.fullname = s.fullname and t.type = s.type
I am fairly certain you will need to have two inserts (or create a view and use an INSTEAD OF INSERT trigger). You can only use the OUTPUT clause to send variables or actual inserted values to another table; you can't use it to split up a select into two destination tables during an insert.
If you provide more information (like how the table has been split up and how the rows are related) we can probably provide a more specific answer.
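For illustration, here is a minimal sketch of the view-plus-INSTEAD OF trigger route mentioned above. The column lists and the ReportColumnsId key are assumptions based on the question, and the join-back on FullName/Type assumes that pair is unique, just like the staging-table version:
-- A view spanning both halves of the split table.
CREATE VIEW dbo.TransactionEntry AS
SELECT r.FullName, r.Type, r.Classification, t.[Date], t.Amount
FROM dbo.ReportColumns r
JOIN dbo.Transactions t ON t.ReportColumnsId = r.Id;
GO
CREATE TRIGGER dbo.TransactionEntry_InsteadOfInsert
ON dbo.TransactionEntry
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @new TABLE (Id INT, FullName VARCHAR(50), Type VARCHAR(10));
    -- Insert the parent rows and capture the generated ids.
    INSERT INTO dbo.ReportColumns (FullName, Type, Classification)
    OUTPUT INSERTED.Id, INSERTED.FullName, INSERTED.Type INTO @new
    SELECT FullName, Type, Classification FROM inserted;
    -- Join back to inserted (on the assumed-unique FullName/Type pair)
    -- to attach each Date/Amount to its new parent id.
    INSERT INTO dbo.Transactions (ReportColumnsId, [Date], Amount)
    SELECT n.Id, i.[Date], i.Amount
    FROM @new n
    JOIN inserted i ON i.FullName = n.FullName AND i.Type = n.Type;
END
With that in place, the original single INSERT ... SELECT can simply target dbo.TransactionEntry and the trigger routes each row into both tables.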
