Parse XML and generate new rows through SQL Query - sql-server

I have the input data in a SQL table in the format below:
ID Text
1 <Key><Name>Adobe</Name><Display>Ado</Display></Key><Key>.....</Key>
2 <Key><Name></Name><Display>Microsoft</Display><Version>1.1</Version></Key>
There can be multiple keys for each ID, and there could be several thousand rows in the table in the above format. I have to generate the final SQL output in the format below:
ID Name Display Version
1 Adobe Ado
1 xyz yz 1.2
2 Microsoft 1.1
I am using the query below to parse the Text column, but I get all the data in one row. How can I split that data into multiple rows as indicated above?
SELECT
CAST(CAST(Text AS XML).query('data(/Key/Name)') AS VARCHAR(MAX)) AS Name,
CAST(CAST(Text AS XML).query('data(/Key/Display)') as VARCHAR(MAX)) AS DisplayName,
CAST(CAST(Text AS XML).query('data(/Key/Version)') AS VARCHAR(MAX)) AS Version
FROM
ABC where ID = 1
Currently I run this query for one ID at a time. Is there a way to run it for all IDs together? Also, is there a more efficient way to get the desired output?

Here is the example:
-- Sample demonstrational schema
declare @t table (
    Id int primary key,
    TextData nvarchar(max) not null
);
insert into @t
values
(1, N'<Key><Name>Adobe</Name><Display>Ado</Display></Key><Key><Name>xyz</Name><Display>yz</Display><Version>1.2</Version></Key>'),
(2, N'<Key><Name></Name><Display>Microsoft</Display><Version>1.1</Version></Key>');
-- The actual query
with cte as (
    select t.Id, cast(t.TextData as xml) as [XMLData]
    from @t t
)
select c.Id,
    k.c.value('./Name[1]', 'varchar(max)') as [Name],
    k.c.value('./Display[1]', 'varchar(max)') as [DisplayName],
    k.c.value('./Version[1]', 'varchar(max)') as [Version]
from cte c
cross apply c.XMLData.nodes('/Key') k(c);
If the column has a different data type, it can be corrected with the in-place cast/convert done in the CTE (or an equivalent subquery).
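One detail worth checking in the query above: an empty element such as <Name></Name> comes back as an empty string, not NULL. If NULL is preferred for empty elements, addressing the text() node is one way to get it. The following is a minimal sketch under that assumption; the table variable @t mirrors the sample schema above, and the varchar(100) lengths are arbitrary choices, not something from the question.

```sql
-- Sketch: same shredding as above, but empty elements become NULL.
-- @t and the varchar(100) lengths are illustrative assumptions.
DECLARE @t TABLE (
    Id int PRIMARY KEY,
    TextData nvarchar(max) NOT NULL
);
INSERT INTO @t
VALUES
(1, N'<Key><Name>Adobe</Name><Display>Ado</Display></Key><Key><Name>xyz</Name><Display>yz</Display><Version>1.2</Version></Key>'),
(2, N'<Key><Name></Name><Display>Microsoft</Display><Version>1.1</Version></Key>');

SELECT t.Id,
       -- (X/text())[1] is NULL when the element is empty or missing
       k.c.value('(Name/text())[1]',    'varchar(100)') AS [Name],
       k.c.value('(Display/text())[1]', 'varchar(100)') AS [DisplayName],
       k.c.value('(Version/text())[1]', 'varchar(100)') AS [Version]
FROM @t AS t
-- Cast once per row, then produce one output row per <Key> element
CROSS APPLY (SELECT CAST(t.TextData AS xml) AS XMLData) AS x
CROSS APPLY x.XMLData.nodes('/Key') AS k(c);
```

Because there is no WHERE clause on Id, this runs for all IDs in a single pass.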

Related

SQL Server parse XML column to get a column value if other column value equals certain value

In SQL Server 2014 I have a table with a CustomColumns column that contains XML data with the following structure:
<CustomColumnsCollection>
<CustomColumn>
<Name>Brand</Name>
<DataType>0</DataType>
<Value>Duprim</Value>
</CustomColumn>
<CustomColumn>
<Name>LabelGroup</Name>
<DataType>0</DataType>
<Value />
</CustomColumn>
...
</CustomColumnsCollection>
I want to get the value of the Value element where the Name element equals a given name, e.g. 'Brand' (the following code is part of a bigger query, which I saved as a VIEW):
MAX(DISTINCT PR.CustomColumns.value('(/CustomColumnsCollection/CustomColumn/Name="Brand"/Value)[0]', 'varchar(max)')) AS Brand
In this case I would like it to return 'Duprim'. How is this achieved?
Here is another method by using XPath predicate.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, CustomColumns XML);
INSERT INTO @tbl (CustomColumns)
VALUES
(N'<CustomColumnsCollection>
<CustomColumn>
<Name>Brand</Name>
<DataType>0</DataType>
<Value>Duprim</Value>
</CustomColumn>
<CustomColumn>
<Name>LabelGroup</Name>
<DataType>0</DataType>
<Value/>
</CustomColumn>
</CustomColumnsCollection>');
-- DDL and sample data population, end
DECLARE @param VARCHAR(30) = 'Brand';
SELECT ID
, c.value('(Value/text())[1]', 'VARCHAR(50)') AS [Value]
FROM @tbl
CROSS APPLY CustomColumns.nodes('/CustomColumnsCollection/CustomColumn[(Name/text())[1] eq sql:variable("@param")]') AS t(c);
-- hard-coded value
SELECT ID
, c.value('(Value/text())[1]', 'VARCHAR(50)') AS [Value]
FROM @tbl
CROSS APPLY CustomColumns.nodes('/CustomColumnsCollection/CustomColumn[(Name/text())[1] eq "Brand"]') AS t(c);
Output
+----+--------+
| ID | Value |
+----+--------+
| 1 | Duprim |
+----+--------+
To help with the view that is consumed by MS Excel, it would be great if you could provide a minimal reproducible example:
(1) DDL and sample data population, i.e. CREATE table(s) plus INSERT T-SQL statements.
(2) What you need to do, i.e. logic.
(3) Desired output based on the sample data in #1 above.
SQL for Excel
SELECT ID
, CustomColumns.value('(/CustomColumnsCollection/CustomColumn[(Name/text())[1] eq "Brand"]/Value/text())[1]', 'VARCHAR(50)') AS [Value]
FROM @tbl;
Try something like this:
SELECT
xc.value('(Value)[1]', 'varchar(50)')
FROM
PR
CROSS APPLY
PR.CustomColumns.nodes('/CustomColumnsCollection/CustomColumn') AS XT(XC)
WHERE
xc.value('(Name)[1]', 'varchar(50)') = 'Brand'
The .nodes() call returns a list of XML fragments, each representing a <CustomColumn> node. The WHERE clause selects the one whose Name is Brand, and the SELECT returns the Value of that XML node.

How to write the COLUMNS to ROWS in SQL server [duplicate]

Looking for elegant (or any) solution to convert columns to rows.
Here is an example: I have a table with the following schema:
[ID] [EntityID] [Indicator1] [Indicator2] [Indicator3] ... [Indicator150]
Here is what I want to get as the result:
[ID] [EntityId] [IndicatorName] [IndicatorValue]
And the result values will be:
1 1 'Indicator1' 'Value of Indicator 1 for entity 1'
2 1 'Indicator2' 'Value of Indicator 2 for entity 1'
3 1 'Indicator3' 'Value of Indicator 3 for entity 1'
4 2 'Indicator1' 'Value of Indicator 1 for entity 2'
And so on..
Does this make sense? Do you have any suggestions on where to look and how to get it done in T-SQL?
You can use the UNPIVOT function to convert the columns into rows:
select id, entityId,
indicatorname,
indicatorvalue
from yourtable
unpivot
(
indicatorvalue
for indicatorname in (Indicator1, Indicator2, Indicator3)
) unpiv;
Note, the datatypes of the columns you are unpivoting must be the same so you might have to convert the datatypes prior to applying the unpivot.
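To illustrate the note about data types, here is a minimal sketch that casts every column to one common type (and length) in a subquery before applying UNPIVOT. The table variable @yourtable, its column types, and the choice of varchar(50) are assumptions made up for this illustration.

```sql
-- Sketch: align data types before UNPIVOT.
-- @yourtable and its column types are hypothetical stand-ins.
DECLARE @yourtable TABLE (
    id int,
    entityId int,
    Indicator1 int,
    Indicator2 decimal(10, 2),
    Indicator3 varchar(20)
);
INSERT INTO @yourtable VALUES (1, 1, 10, 2.50, 'abc');

SELECT id, entityId, indicatorname, indicatorvalue
FROM (
    -- UNPIVOT needs identical type AND length for every listed column
    SELECT id, entityId,
           CAST(Indicator1 AS varchar(50)) AS Indicator1,
           CAST(Indicator2 AS varchar(50)) AS Indicator2,
           CAST(Indicator3 AS varchar(50)) AS Indicator3
    FROM @yourtable
) AS src
UNPIVOT
(
    indicatorvalue
    FOR indicatorname IN (Indicator1, Indicator2, Indicator3)
) AS unpiv;
```

Keep in mind that UNPIVOT drops rows whose value is NULL, so a NULL indicator will not appear in the result.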
You could also use CROSS APPLY with UNION ALL to convert the columns:
select id, entityid,
indicatorname,
indicatorvalue
from yourtable
cross apply
(
select 'Indicator1', Indicator1 union all
select 'Indicator2', Indicator2 union all
select 'Indicator3', Indicator3 union all
select 'Indicator4', Indicator4
) c (indicatorname, indicatorvalue);
Depending on your version of SQL Server you could even use CROSS APPLY with the VALUES clause:
select id, entityid,
indicatorname,
indicatorvalue
from yourtable
cross apply
(
values
('Indicator1', Indicator1),
('Indicator2', Indicator2),
('Indicator3', Indicator3),
('Indicator4', Indicator4)
) c (indicatorname, indicatorvalue);
Finally, if you have 150 columns to unpivot and you don't want to hard-code the entire query, then you could generate the sql statement using dynamic SQL:
DECLARE @colsUnpivot AS NVARCHAR(MAX),
    @query AS NVARCHAR(MAX)
select @colsUnpivot
    = stuff((select ','+quotename(C.column_name)
        from information_schema.columns as C
        where C.table_name = 'yourtable' and
            C.column_name like 'Indicator%'
        for xml path('')), 1, 1, '')
set @query
    = 'select id, entityId,
        indicatorname,
        indicatorvalue
    from yourtable
    unpivot
    (
        indicatorvalue
        for indicatorname in ('+ @colsUnpivot +')
    ) u'
exec sp_executesql @query;
Well, if you have 150 columns then I think that UNPIVOT is not an option. So you could use an XML trick:
;with CTE1 as (
select ID, EntityID, (select t.* for xml raw('row'), type) as Data
from temp1 as t
), CTE2 as (
select
C.id, C.EntityID,
F.C.value('local-name(.)', 'nvarchar(128)') as IndicatorName,
F.C.value('.', 'nvarchar(max)') as IndicatorValue
from CTE1 as c
outer apply c.Data.nodes('row/@*') as F(C)
)
select * from CTE2 where IndicatorName like 'Indicator%'
You could also write dynamic SQL, but I like xml more - for dynamic SQL you have to have permissions to select data directly from table and that's not always an option.
UPDATE: As there was a big flame war in the comments, I think I'll add some pros and cons of XML versus dynamic SQL. I'll try to be as objective as I can and not mention elegance or ugliness. If you have any other pros and cons, edit the answer or write in the comments.
Cons:
it's not as fast as dynamic SQL; rough tests showed that XML is about 2.5 times slower than dynamic SQL (it was one query on a table of ~250000 rows, so this estimate is in no way exact). You can compare it yourself if you want; in my sqlfiddle test, on 100000 rows it was 29s (XML) vs 14s (dynamic);
it may be harder to understand for people not familiar with XPath.
Pros:
it runs in the same scope as your other queries, which can be very handy. A few examples come to mind:
you can query the inserted and deleted tables inside a trigger (not possible with dynamic SQL at all);
the user doesn't need permission to select directly from the table. What I mean is, if you have a stored procedure layer and the user has permission to run the procedure but not to query tables directly, you can still use this query inside the stored procedure;
you can query a table variable you have populated in your scope (to pass it into dynamic SQL you have to either make it a temporary table instead, or create a type and pass it as a parameter into the dynamic SQL);
you can run this query inside a function (scalar or table-valued). It's not possible to use dynamic SQL inside functions.
Just to help new readers, I've created an example to better understand @bluefeet's answer about UNPIVOT.
SELECT id
,entityId
,indicatorname
,indicatorvalue
FROM (VALUES
(1, 1, 'Value of Indicator 1 for entity 1', 'Value of Indicator 2 for entity 1', 'Value of Indicator 3 for entity 1'),
(2, 1, 'Value of Indicator 1 for entity 2', 'Value of Indicator 2 for entity 2', 'Value of Indicator 3 for entity 2'),
(3, 1, 'Value of Indicator 1 for entity 3', 'Value of Indicator 2 for entity 3', 'Value of Indicator 3 for entity 3'),
(4, 2, 'Value of Indicator 1 for entity 4', 'Value of Indicator 2 for entity 4', 'Value of Indicator 3 for entity 4')
) AS Category(ID, EntityId, Indicator1, Indicator2, Indicator3)
UNPIVOT
(
indicatorvalue
FOR indicatorname IN (Indicator1, Indicator2, Indicator3)
) UNPIV;
Just because I did not see it mentioned.
If 2016+, here is yet another option to dynamically unpivot data without actually using Dynamic SQL.
Example
Declare @YourTable Table ([ID] varchar(50),[Col1] varchar(50),[Col2] varchar(50))
Insert Into @YourTable Values
(1,'A','B')
,(2,'R','C')
,(3,'X','D')
Select A.[ID]
    ,Item = B.[Key]
    ,Value = B.[Value]
From @YourTable A
Cross Apply ( Select *
    From OpenJson((Select A.* For JSON Path,Without_Array_Wrapper ))
    Where [Key] not in ('ID','Other','Columns','ToExclude')
) B
Returns
ID Item Value
1 Col1 A
1 Col2 B
2 Col1 R
2 Col2 C
3 Col1 X
3 Col2 D
I needed a solution to convert columns to rows in Microsoft SQL Server without knowing the column names (used in a trigger) and without dynamic SQL (dynamic SQL is too slow for use in a trigger).
I finally found this solution, which works fine:
SELECT
insRowTbl.PK,
insRowTbl.Username,
attr.insRow.value('local-name(.)', 'nvarchar(128)') as FieldName,
attr.insRow.value('.', 'nvarchar(max)') as FieldValue
FROM ( Select
i.ID as PK,
i.LastModifiedBy as Username,
convert(xml, (select i.* for xml raw)) as insRowCol
FROM inserted as i
) as insRowTbl
CROSS APPLY insRowTbl.insRowCol.nodes('/row/@*') as attr(insRow)
As you can see, I convert the row into XML (the subquery select i.* for xml raw converts all columns into one XML column).
Then I CROSS APPLY the nodes() function to each XML attribute of this column, so that I get one row per attribute.
Overall, this converts columns into rows without knowing the column names and without using dynamic SQL. It is fast enough for my purpose.
(Edit: I just saw Roman Pekar's answer above, which does the same.
I used a dynamic SQL trigger with cursors first, which was 10 to 100 times slower than this solution, but maybe that was caused by the cursor, not by the dynamic SQL. Anyway, this solution is very simple and universal, so it's definitely an option.)
I am leaving this comment at this place, because I want to reference this explanation in my post about the full audit trigger, that you can find here: https://stackoverflow.com/a/43800286/4160788
DECLARE @TableName varchar(max)=NULL
SELECT @TableName=COALESCE(@TableName+',','')+t.TABLE_CATALOG+'.'+ t.TABLE_SCHEMA+'.'+o.Name
FROM sysindexes AS i
INNER JOIN sysobjects AS o ON i.id = o.id
INNER JOIN INFORMATION_SCHEMA.TABLES T ON T.TABLE_NAME=o.name
WHERE i.indid < 2
AND OBJECTPROPERTY(o.id,'IsMSShipped') = 0
AND i.rowcnt >350
AND o.xtype !='TF'
ORDER BY o.name ASC
print @TableName
This gives you the list of tables that have more than 350 rows, printed as a single comma-separated string.
The opposite of this is to flatten a column into a CSV, e.g.:
SELECT STRING_AGG ([value],',') FROM STRING_SPLIT('Akio,Hiraku,Kazuo', ',')

SQL Server XML parse issue

I need to parse XML into a SQL Server 2012 database. However, I cannot find a good guide for parsing this kind of XML (here is the output of SELECT TOP 2 from the table):
<ns2:SoftWare xmlns:ns2="http://www.example.com" xmlns:ns3="http://www.example2.com"><keyc>123-ABC</keyc><statusc>Y</statusc></ns2:SoftWare>
<ns2:custom-data xmlns:ns2="http://www.example.com/2"><timec>2016.01.02</timec><customer>8R</customer><keyc>8R</keyc><statusc>N</statusc></ns2:custom-data>
How can I parse the "keyc" value from the XML, so that I can use it in a SELECT clause or insert it into the database?
You can use the nodes and value to get that entity:
DECLARE @Data TABLE (XmlText XML)
INSERT @Data VALUES
('<ns2:SoftWare xmlns:ns2="http://www.example.com" xmlns:ns3="http://www.example2.com"><keyc>123-ABC</keyc><statusc>Y</statusc></ns2:SoftWare>'),
('<ns2:custom-data xmlns:ns2="http://www.example.com/2"><timec>2016.01.02</timec><customer>8R</customer><keyc>8R</keyc><statusc>N</statusc></ns2:custom-data>')
SELECT
    Nodes.KeyC.value('.', 'VARCHAR(50)') AS KeyC
FROM @Data D
CROSS APPLY XmlText.nodes('//keyc') AS Nodes(KeyC)
This outputs the following:
KeyC
-----------
123-ABC
8R
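The //keyc pattern works for both rows because keyc itself carries no namespace prefix; only the root elements are namespaced. If the value is wanted from one document type only, the namespaced root can be addressed explicitly. This is a minimal sketch, assuming WITH XMLNAMESPACES and the prefix-to-URI mapping taken from the first sample row; the filter on ns2:SoftWare is an illustrative choice, not part of the original answer.

```sql
-- Sketch: read keyc only from documents whose root is ns2:SoftWare.
-- The namespace URI is copied from the sample data; the filter is illustrative.
DECLARE @Data TABLE (XmlText XML);
INSERT INTO @Data VALUES
('<ns2:SoftWare xmlns:ns2="http://www.example.com" xmlns:ns3="http://www.example2.com"><keyc>123-ABC</keyc><statusc>Y</statusc></ns2:SoftWare>'),
('<ns2:custom-data xmlns:ns2="http://www.example.com/2"><timec>2016.01.02</timec><customer>8R</customer><keyc>8R</keyc><statusc>N</statusc></ns2:custom-data>');

-- The preceding statement must end with a semicolon before WITH
WITH XMLNAMESPACES ('http://www.example.com' AS ns2)
SELECT D.XmlText.value('(/ns2:SoftWare/keyc/text())[1]', 'VARCHAR(50)') AS KeyC
FROM @Data AS D
WHERE D.XmlText.exist('/ns2:SoftWare') = 1;
```

The prefix in the query is matched by namespace URI, not by prefix text, so the second row (same ns2 prefix, different URI) is filtered out.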

SQL Server table to xml

This time I have a question about how to convert an MSSQL table to XML.
My source SQL table:
+-----------+-----------------+
|atributname|atributvalue |
+-----------+-----------------+
|phone |222 |
|param4 |bbbbcdsfceecc |
|param3 |bbbbcdsfceecc |
|param2 |bbbbcdsfccc |
+-----------+-----------------+
Expected result sample:
<items>
<phone>222</phone>
<param4>bbbbcdsfceecc</param4>
<param3>bbbbcdsfceecc</param3>
<param2>bbbbcdsfccc</param2>
</items>
I tried a lot of variations of the following query:
SELECT atributname,atributvalue
FROM sampletable FOR XML PATH (''), ROOT ('items');
but the results are not right; the output should be exactly like the "Expected result sample" above.
Any help is appreciated.
PS: Script to create sampletable:
create table sampletable
(atributname varchar(20),
atributvalue varchar(20))
insert into sampletable (atributname,atributvalue)
values ('phone','222');
insert into sampletable (atributname,atributvalue)
values ('param4','bbbbcdsfceecc');
insert into sampletable (atributname,atributvalue)
values ('param3','bbbbcdsfceecc');
insert into sampletable (atributname,atributvalue)
values ('param2','bbbbcdsfccc');
That's not how FOR XML works. It's columns that get turned into XML elements, not rows. In order to obtain the expected result, you would need to have columns named phone, param4, and so on - not rows with these values in atributname.
If there are specific elements you want in the XML, you could perform a pivot on the data first, then use FOR XML.
Example of a pivot would be:
SELECT [phone], [param2], [param3], [param4]
FROM
(
SELECT atributname, atributvalue
FROM sampletable
) a
PIVOT
(
MAX(atributvalue)
FOR atributname IN ([phone], [param2], [param3], [param4])
) AS pvt
FOR XML PATH(''), ROOT('items')
The aggregate is there only because PIVOT requires one. MAX works on character columns as well as numeric ones, so this runs as-is against the varchar atributvalue column; with one row per atributname it simply returns that row's value.
OK, I finally solved this in several ways, but this is the simplest version, suitable for a medium-sized dataset:
declare @item nvarchar(max)
set @item= (SELECT '<' + atributname +'>' +
cast(atributvalue as nvarchar(max)) +'</' + atributname +'>'
FROM sampletable FOR XML PATH (''), ROOT ('items'));
-- FOR XML entity-encodes the hand-built tags, so decode them back
select replace(replace(@item,'&lt;','<'),'&gt;','>')

SQL Server: Output an XML field as tabular data using a stored procedure

I am using a table with an XML data field to store the audit trails of all other tables in the database.
That means the same XML field has various XML information. For example my table has two records with XML data like this:
1st record:
<client>
<name>xyz</name>
<ssn>432-54-4231</ssn>
</client>
2nd record:
<emp>
<name>abc</name>
<sal>5000</sal>
</emp>
These are the two sample formats and just two records. The table actually has many more XML formats in the same field and many records in each format.
Now my problem is that upon query I need these XML formats to be converted into tabular result sets.
What are the options for me? It would be a regular task to query this table and generate reports from it. I want to create a stored procedure to which I can pass that I need to query "<emp>" or "<client>", then my stored procedure should return tabular data.
Does this help?
-- table variable holding the sample audit rows
DECLARE @t TABLE (data XML)
INSERT INTO @t (data) SELECT '
<client>
<name>xyz</name>
<ssn>432-54-4231</ssn>
</client>'
INSERT INTO @t (data) SELECT '
<emp>
<name>abc</name>
<sal>5000</sal>
</emp>'
DECLARE @el VARCHAR(20)
SELECT @el = 'client'
SELECT
x.value('local-name(.)', 'VARCHAR(20)') AS ColumnName,
x.value('.','VARCHAR(20)') AS ColumnValue
FROM @t
CROSS APPLY data.nodes('/*[local-name(.)=sql:variable("@el")]') a (x)
/*
ColumnName ColumnValue
-------------------- --------------------
client xyz432-54-4231
*/
SELECT @el = 'emp'
SELECT
x.value('local-name(.)', 'VARCHAR(20)') AS ColumnName,
x.value('.','VARCHAR(20)') AS ColumnValue
FROM @t
CROSS APPLY data.nodes('/*[local-name(.)=sql:variable("@el")]') a (x)
/*
ColumnName ColumnValue
-------------------- --------------------
emp abc5000
*/
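The queries above return one row per matching root element, with the text of all its children concatenated. Stepping one level deeper in the XPath (a trailing /*) should instead return one row per child element, which is closer to a name/value listing. This is a minimal sketch under that assumption; the id column and the one-line sample XML are illustrative additions.

```sql
-- Sketch: one row per child element of the requested root element.
-- The id column is a hypothetical addition to tell records apart.
DECLARE @t TABLE (id int IDENTITY PRIMARY KEY, data xml);
INSERT INTO @t (data) VALUES
(N'<client><name>xyz</name><ssn>432-54-4231</ssn></client>'),
(N'<emp><name>abc</name><sal>5000</sal></emp>');

DECLARE @el varchar(20) = 'emp';

SELECT t.id,
       x.value('local-name(.)', 'varchar(20)') AS ColumnName,
       x.value('(./text())[1]', 'varchar(20)') AS ColumnValue
FROM @t AS t
-- Match the root by name, then step into its children with /*
CROSS APPLY t.data.nodes('/*[local-name(.)=sql:variable("@el")]/*') AS a(x);
```

The same SELECT could be wrapped in a stored procedure that takes @el as a parameter; turning the name/value rows into real columns would still need a pivot or a per-type view.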
Neither xyz432-54-4231 nor abc5000 is valid XML.
You can try to select only one particular format with a LIKE condition, for example:
select *
from YourTable
where YourColumn like '[a-z][a-z][a-z][0-9][0-9][0-9][0-9]'
This would match 3 letters followed by 4 numbers.
A better option is probably to add an extra column to the table, where you save the type of the logging. Then you can use that column to select all "emp" or "client" rows.
An option would be to create a series of views that present the audit table per type, in the relational shape that you're expecting,
for example:
select
-- value() requires a singleton, hence the (…)[1] wrapper
c.value('(name)[1]','nvarchar(50)') as name,
c.value('(ssn)[1]', 'nvarchar(20)') as ssn
from yourtable
cross apply yourxmlcolumn.nodes('/client') as t(c)
You could then follow the same pattern for emp.
You could also create a view (or computed column) to identify each XML type like this:
select yourxmlcolumn.value('local-name(/*[1])', 'varchar(100)') as objectType
from yourtable
Use the OPENXML method:
DECLARE @idoc int
-- @xmldoc is assumed to hold the XML text to shred
EXEC sp_xml_preparedocument @idoc OUTPUT, @xmldoc
SELECT * into #test
-- replace 'xmlfilepath' with the XPath row pattern for your document
FROM OPENXML (@idoc, 'xmlfilepath',2)
WITH (name varchar(50),ssn varchar(20)
)
EXEC sp_xml_removedocument @idoc
Once the data is in #test, you can manipulate it.
You may want to put the different kinds of data into different XML files.
