Query XML field with T-SQL - sql-server

How can I query multiple nodes in XML data with T-SQL and have the result output to a single comma separated string?
For example, I'd like to get a list of all the destination names in the following XML to look like "Germany, France, UK, Italy, Spain, Portugal"
<Holidays>
<Summer>
<Regions>
<Destinations>
<Destination Name="Germany" />
<Destination Name="France" />
<Destination Name="UK" />
<Destination Name="Italy" />
<Destination Name="Spain" />
<Destination Name="Portugal" />
</Destinations>
</Regions>
</Summer>
</Holidays>
I was trying something like:
Countries = [xmlstring].value('/Holidays/Summer/Regions/Destinations/@Name', 'varchar')

First, to get a list of records from a source XML table, you need to use the .nodes function:
select Destination.value('data(@Name)', 'varchar(50)') as name
from [xmlstring].nodes('/Holidays/Summer/Regions/Destinations/Destination')
D(Destination)
Sample output:
| NAME |
-------------
| Germany |
| France |
| UK |
| Italy |
| Spain |
| Portugal |
From here, you want to concatenate the destination values into a comma-separated list. Unfortunately, T-SQL does not support this directly (before STRING_AGG arrived in SQL Server 2017), so you'll have to use a workaround. If you're working with a source table containing multiple rows, the simplest method is the FOR XML PATH('') trick. In this query I use a source table called Data and split the XML out into separate records, which I then CROSS APPLY with FOR XML PATH('') to generate one comma-separated list per row. Finally, the trailing comma is stripped from the result to create the list:
;with Destinations as (
select id, name
from Data
cross apply (
select Destination.value('data(@Name)', 'varchar(50)') as name
from [xmlstring].nodes('/Holidays/Summer/Regions/Destinations/Destination') D(Destination)
) Destinations(Name)
)
select id, substring(NameList, 1, len(namelist) - 1)
from Destinations as parent
cross apply (
select name + ','
from Destinations as child
where parent.id = child.id
for xml path ('')
) DestList(NameList)
group by id, NameList
Sample Output (Note that I've added another XML fragment to the test data to make a more complex example):
| ID | COLUMN_1 |
-----------------------------------------------
| 1 | Germany,France,UK,Italy,Spain,Portugal |
| 2 | USA,Australia,Brazil |
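If the XML sits in a single variable rather than in a table, the same FOR XML PATH('') idea can be written more compactly. The following is only a sketch under that assumption: the variable name @xmlstring is hypothetical, and STUFF is used here to strip the leading separator instead of trimming a trailing comma.

```sql
-- Assumption: the XML is held in one variable (@xmlstring is a made-up name).
DECLARE @xmlstring XML = N'
<Holidays>
  <Summer>
    <Regions>
      <Destinations>
        <Destination Name="Germany" />
        <Destination Name="France" />
        <Destination Name="UK" />
        <Destination Name="Italy" />
        <Destination Name="Spain" />
        <Destination Name="Portugal" />
      </Destinations>
    </Regions>
  </Summer>
</Holidays>';

SELECT STUFF((
        -- one ', name' fragment per Destination element
        SELECT ', ' + D.Destination.value('@Name', 'varchar(50)')
        FROM @xmlstring.nodes('/Holidays/Summer/Regions/Destinations/Destination') AS D(Destination)
        FOR XML PATH(''), TYPE
    ).value('.', 'varchar(max)')   -- read the text back so characters like & are not left entity-encoded
    , 1, 2, '') AS Countries;      -- remove the leading ', '
```

The ', ' separator matches the format asked for in the question. Shredding with .nodes() normally yields the rows in document order, but without an ORDER BY that order is not formally guaranteed.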

Related

Converting XML to SQL - Multiple elements in the same node problem with repeating element names

I got a really simple XML file that I want to convert to a table.
XML structure:
<ROOT>
<ID>ID-20</ID> (ONLY 1 ID per file, this will be the first column)
<ProductList>
<ProductID>A-1235</ProductID>
<Quantity>100</Quantity>
<Price>300</Price>
<ProductID>A-12356</ProductID>
<Quantity>110</Quantity>
<Price>310</Price>
<ProductID>A-123567</ProductID>
<Quantity>120</Quantity>
<Price>320</Price>
...
</ProductList>
</ROOT>
The second column would be ProductID, the 3rd Quantity, the 4th Price.
I could make each ProductID appear in separate rows with the first column but I can't make the respective Quantity and Price show next to the ProductID.
My code so far:
SELECT T.C.value('../../../ID[1]', 'nvarchar(20)') AS ID,
C.value('.', 'nvarchar(20)') AS ProductID,
C2.value('(text())[1]', 'nvarchar(20)') AS Quantity--,COMMENTED PRICE OUT FOR NOW
--C2.value('(../Price/text())[1]', 'nvarchar(20)') AS Price
FROM @Xml.nodes('/ROOT/ProductList/ProductID') AS T(C)
cross apply C.nodes('../Quantity') AS T2(C2)
The Cross Apply part causes every Quantity to appear next to every ProductID.
I can't figure out the correct way to align these columns.
I found some similar questions here but I just couldn't figure this out for my case as the XML structure is a bit different.
Could someone please help me with this?
I'd appreciate it very much :)
Problem SOLVED!
Many thanks to all who contributed!
I completely agree with @marc_s, the XML structure is very fragile.
In any case, here is a solution for the current scenario.
@Shnugo recently came up with this approach here: How to extract value form XML?
All credit goes to him.
SQL
DECLARE @xml XML =
N'<ROOT>
<ID>ID-20</ID>
<ProductList>
<ProductID>A-1235</ProductID>
<Quantity>100</Quantity>
<Price>300</Price>
<ProductID>A-12356</ProductID>
<Quantity>110</Quantity>
<Price>310</Price>
<ProductID>A-123567</ProductID>
<Quantity>120</Quantity>
<Price>320</Price>...</ProductList>
</ROOT>';
-- Tally: one running number per ProductID element
WITH tally(Nmbr) AS
(
SELECT TOP(@xml.value('count(/ROOT/ProductList/ProductID)','INT'))
ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM master..spt_values
)
-- Pick the n-th ProductID, Quantity and Price by position
SELECT tally.Nmbr
,@xml.value('(/ROOT/ID/text())[1]','NVARCHAR(20)') AS ID
,@xml.value('(/ROOT/ProductList/ProductID[sql:column("tally.Nmbr")]/text())[1]','NVARCHAR(200)') AS ProductID
,@xml.value('(/ROOT/ProductList/Quantity[sql:column("tally.Nmbr")]/text())[1]','INT') AS Quantity
,@xml.value('(/ROOT/ProductList/Price[sql:column("tally.Nmbr")]/text())[1]','INT') AS Price
FROM tally;
Output
+------+-------+-----------+----------+-------+
| Nmbr | ID | ProductID | Quantity | Price |
+------+-------+-----------+----------+-------+
| 1 | ID-20 | A-1235 | 100 | 300 |
| 2 | ID-20 | A-12356 | 110 | 310 |
| 3 | ID-20 | A-123567 | 120 | 320 |
+------+-------+-----------+----------+-------+
Your current XML structure is rather flawed...
What you should have (and what would make it easy to know which bits of information belong together) is one element per product, something like:
<ProductList>
<Product>
<ID>A-1235</ID>
<Quantity>100</Quantity>
<Price>300</Price>
</Product>
<Product>
<ID>A-12356</ID>
<Quantity>110</Quantity>
<Price>310</Price>
</Product>
</ProductList>
With your current structure, there is no proper and reliable way to know which ProductID, Quantity and Price belong together; the only link between them is their position in the document.
With this structure, your query would be:
SELECT
C.value('(ID)[1]', 'nvarchar(20)') AS ID,
C.value('(Quantity)[1]', 'int') AS Quantity,
C.value('(Price)[1]', 'decimal(16,2)') AS Price
FROM
@Xml.nodes('/ROOT/ProductList/Product') AS T(C)
..jff..
declare @x xml = N'
<ROOT>
<ID>ID-20</ID> (ONLY 1 ID per file, this will be the first column)
<ProductList>
<ProductID>A-1235</ProductID>
<Quantity>100</Quantity>
<Price>300</Price>
<ProductID>A-12356</ProductID>
<Quantity>110</Quantity>
<Price>310</Price>
<ProductID>A-123567</ProductID>
<Quantity>120</Quantity>
<Price>320</Price>
...........................
</ProductList>
</ROOT>';
select
[1],[2],[0],
cast([2] as int) as Quantity
from
(
select
x.n.value('.', 'varchar(20)') as val,
-- every three consecutive elements form one product group
(row_number() over(order by x.n)-1) / 3 as grpid,
-- position within the group: 1 = ProductID, 2 = Quantity, 0 = Price
row_number() over(order by x.n) % 3 as rowid
from @x.nodes('/ROOT/ProductList/*') as x(n)
) as src
pivot
(
max(val) for rowid in ([1],[2],[0])
) as pvt;

Search with LIKE in PostgreSQL array

I have this table:
id | name | tags
----+----------+-------------------------
1 | test.jpg | {sometags,other_things}
I need to get rows that contain specific tags by searching in array with regular expression or LIKE, like this:
SELECT * FROM images WHERE 'some%' LIKE any(tags);
But this query returns nothing.
with images (id, name, tags) as (values
(1, 'test.jpg', '{sometags, other_things}'::text[]),
(2, 'test2.jpg', '{othertags, other_things}'::text[])
)
select *
from images
where (
select bool_or(tag like 'some%')
from unnest(tags) t (tag)
);
id | name | tags
----+----------+-------------------------
1 | test.jpg | {sometags,other_things}
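unnest returns a set, which you aggregate with the convenient bool_or function. The original query returns nothing because LIKE expects the pattern on its right-hand side: 'some%' LIKE any(tags) treats each tag as the pattern and 'some%' as the string to match.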

Best way to concat 1 to n values into single field from two tables

T-SQL
Imagine two tables looking like this:
Table: students
==============================
| TeacherID | SName |
| 1 | Thompson |
| 1 | Nickles |
| 2 | Cree |
==============================
Table: teacher
====================================================
| TeacherID | TName | + many other fields |
| 1 | Pipers | |
| 2 | Slinger | |
====================================================
The field names are completely arbitrary.
I want to create a query with the following output:
================================================================
| TeacherName | many other fields | Students |
| Pipers | | Thompson,Nickles |
================================================================
Currently I have something like this:
SELECT *
FROM teacher
LEFT JOIN (
SELECT DISTINCT
EL2.teacherID,
STUFF(( SELECT ',' + SName
FROM students
WHERE EL2.teacherID = students.teacherID
FOR XML PATH('')
),1,1,'') AS "Students"
FROM students, teacher EL2) t1
ON t1.teacherID = teacher.teacherID
WHERE t1.Students LIKE '%Thompson%'
This works and gives me what I need. The WHERE clause is there to illustrate that I
also absolutely need to be able to filter on whether a teacher has that student, but then put all of that teacher's students into the concatenated field.
My question now is whether there is a better way to do this.
I already looked at this:
Concatenate many rows into a single text string?
But it didn't help me much because, first, I couldn't get it to work with two separate tables and, second, I couldn't filter the way I needed.
The SQL Management Studio execution plan indicates that the SELECT DISTINCT is
very expensive, and others have said that relying on XML PATH is not optimal because its behaviour can change.
Be careful with a DISTINCT on names, as you might have two students with the same name! And by the way: GROUP BY is in most cases a better-performing approach for getting a distinct list.
You might try something like this:
SELECT t.*
,STUFF(( SELECT ',' + s.SName
FROM students AS s
WHERE t.teacherID = s.teacherID
FOR XML PATH('')
),1,1,'') AS Students
FROM teacher AS t
WHERE EXISTS(SELECT 1 FROM students AS x WHERE x.teacherID=t.teacherID /*AND [PUT YOUR FILTER HERE]*/)
If I understand this correctly you want to find only teachers where one given student is connected to the teacher. And in this case you want to find all students bound to all teachers connected to the given student, correct?
At the end you will find the placeholder /*AND [PUT YOUR FILTER HERE]*/. Replace it with something like AND x.StudentId=123 (or, with the sample columns shown, AND x.SName='Thompson'). This restricts the teachers to those connected with this student; for these teachers, all students are concatenated.
Use FOR XML PATH (see: How FOR XML PATH works):
select
TeacherID,
Tname,
stuff((select ','+s.sname from students s where s.teacherid=t.teacherid
for xml path('')),1,1,'')as students
from
teacher t
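On SQL Server 2017 or later, STRING_AGG offers an alternative that avoids both the SELECT DISTINCT and the FOR XML PATH dependency. This is a sketch rather than a drop-in replacement: the table variables @teacher and @students are hypothetical stand-ins for the two tables in the question, and the HAVING clause is just one way to express "teachers who have this student".

```sql
-- Assumption: SQL Server 2017+ (STRING_AGG is not available in earlier versions).
-- @teacher and @students are made-up table variables mirroring the question's tables.
DECLARE @teacher  TABLE (TeacherID INT, TName VARCHAR(50));
DECLARE @students TABLE (TeacherID INT, SName VARCHAR(50));

INSERT INTO @teacher  (TeacherID, TName) VALUES (1, 'Pipers'), (2, 'Slinger');
INSERT INTO @students (TeacherID, SName) VALUES (1, 'Thompson'), (1, 'Nickles'), (2, 'Cree');

SELECT t.TeacherID
      ,t.TName
      -- concatenate all students of the teacher, in alphabetical order
      ,STRING_AGG(s.SName, ',') WITHIN GROUP (ORDER BY s.SName) AS Students
FROM @teacher AS t
JOIN @students AS s ON s.TeacherID = t.TeacherID
GROUP BY t.TeacherID, t.TName
-- keep only teachers that have at least one student named Thompson
HAVING MAX(CASE WHEN s.SName = 'Thompson' THEN 1 ELSE 0 END) = 1;
```

Because the filter sits in HAVING rather than WHERE, it selects the teacher without removing that teacher's other students from the list, which is the behaviour the question asks for.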

How to do xml parsing in store procedure

I have a table which contains requests in XML format.
For example:
Api_id xmlRequest Sent_Time
1 ........ 07-04-2016 10:07:12:345
1 ........ 08-04-2016 10:03:12:345
2 ........ 09-04-2016 10:08:12:345
2 ........ 09-04-2016 10:09:12:345
There can be multiple requests per Api_id.
The XML request schema is the same, but the values differ.
The XML request looks like this:
<?xml version="1.0"?>
<!DOCTYPE PARTS SYSTEM "parts.dtd">
<PARTS>
<TITLE>Computer Parts</TITLE>
<PART>
<ITEM>Motherboard</ITEM>
<MANUFACTURER>ASUS</MANUFACTURER>
<MODEL>P3B-F</MODEL>
<COST> 123.00</COST>
</PART>
</PARTS>
I need a stored procedure to which I can pass an Api_id and a value (to search for in the XML request) and which returns the XML requests based on the Item value.
CREATE PROCEDURE getxmlRequest(
@Api_Id INT
@value
,@xmlRequest VARCHAR(max) out)
AS
BEGIN
Set @xmlRequest = SELECT xmlRequest FROM Api_request
WHERE Api_id = @Api_id
/* here need to iterate over @xmlRequest */
Set @Xmlvalue = SELECT X.R.value ('.','nvarchar (150)')
FROM @xmlRequest.nodes(XPATH) X(R)
if(@XmlValue = @value)
/*Add to result so i can return*/
/*I want to return all @xmlRequest if we has value from xpath*/
END;
So my question is: if
Set @xmlRequest = SELECT xmlRequest FROM Api_request
WHERE Api_id = @Api_id
returns multiple results, is it possible to iterate over them? If yes, how can I do that efficiently?
How do I return multiple @xmlRequest values when the Api_id is the same?
Has anyone worked on this kind of scenario? Please help me.
Try this query
SELECT *,CONVERT(XML,xmlRequest,2)
FROM Api_request
WHERE Api_id = @Api_id
AND CONVERT(XML,xmlRequest,2).value('(/PARTS/PART/ITEM)[1]','nvarchar(max)')
LIKE '%'+@value+'%'
It will return every xmlRequest whose first ITEM contains your value.
Your question is quite unclear, but please have a look at this:
CREATE TABLE #testTbl(Api_id INT, xmlRequest VARCHAR(MAX), SentTime DATETIME);
INSERT INTO #testTbl VALUES
(1,'<?xml version="1.0"?>
<!DOCTYPE PARTS SYSTEM "parts.dtd">
<PARTS>
<TITLE>Computer Parts</TITLE>
<PART>
<ITEM>Motherboard</ITEM>
<MANUFACTURER>ASUS</MANUFACTURER>
<MODEL>P3B-F</MODEL>
<COST> 123.00</COST>
</PART>
</PARTS>',GETDATE())
,(1,'<?xml version="1.0"?>
<!DOCTYPE PARTS SYSTEM "parts.dtd">
<PARTS>
<TITLE>Computer Parts</TITLE>
<PART>
<ITEM>CPU</ITEM>
<MANUFACTURER>INTEL</MANUFACTURER>
<MODEL>CPUModelXY</MODEL>
<COST>345.00</COST>
</PART>
</PARTS>',GETDATE())
,(2,'<?xml version="1.0"?>
<!DOCTYPE PARTS SYSTEM "parts.dtd">
<PARTS>
<TITLE>Car Parts</TITLE>
<PART>
<ITEM>Wheel</ITEM>
<MANUFACTURER>Pirelli</MANUFACTURER>
<MODEL>WheelModelXY</MODEL>
<COST>100.00</COST>
</PART>
</PARTS>',GETDATE());
This will return all rows where Api_id=1:
SELECT Api_id
,CONVERT(XML,xmlRequest,2) AS xmlRequest
,SentTime
FROM #testTbl
WHERE Api_id=1;
This will return table-like data. You can use "normal" SQL (WHERE, GROUP BY, ...) to continue:
DECLARE @Api_Id INT=NULL;
WITH MyRequests AS
(
SELECT Api_id
,RealXML.value('(/PARTS/TITLE)[1]','varchar(max)') AS Title
,part.value('ITEM[1]','varchar(max)') AS Item
,part.value('MANUFACTURER[1]','varchar(max)') AS Manufacturer
,part.value('MODEL[1]','varchar(max)') AS Model
,part.value('COST[1]','decimal(12,4)') AS Cost
,SentTime
FROM #testTbl
CROSS APPLY(SELECT CONVERT(XML,xmlRequest,2) AS RealXML) AS ConvertedToXML
CROSS APPLY RealXML.nodes('/PARTS/PART') AS A(part)
WHERE @Api_Id IS NULL OR Api_id=@Api_Id
)
SELECT *
FROM MyRequests
--WHERE ...
--GROUP BY ...
--ORDER ...
;
The result
+---+----------------+-------------+---------+--------------+----------+-------------------------+
| 1 | Computer Parts | Motherboard | ASUS | P3B-F | 123.0000 | 2016-04-07 11:54:08.980 |
+---+----------------+-------------+---------+--------------+----------+-------------------------+
| 1 | Computer Parts | CPU | INTEL | CPUModelXY | 345.0000 | 2016-04-07 11:54:08.980 |
+---+----------------+-------------+---------+--------------+----------+-------------------------+
| 2 | Car Parts | Wheel | Pirelli | WheelModelXY | 100.0000 | 2016-04-07 11:54:08.980 |
+---+----------------+-------------+---------+--------------+----------+-------------------------+
Clean Up
GO
DROP TABLE #testTbl;
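Since the question asks for a stored procedure, the filtering can also be wrapped in one. The following is only a sketch under stated assumptions: the table dbo.Api_request_demo and the procedure name dbo.GetXmlRequestsByItem are hypothetical, the sample XML is shortened, and the match is an exact comparison on ITEM rather than a LIKE search. No loop is needed; the procedure returns all matching rows as one result set.

```sql
-- Hypothetical demo table mirroring the Api_request table from the question.
CREATE TABLE dbo.Api_request_demo (Api_id INT, xmlRequest VARCHAR(MAX), Sent_Time DATETIME);
INSERT INTO dbo.Api_request_demo (Api_id, xmlRequest, Sent_Time) VALUES
 (1, '<PARTS><TITLE>Computer Parts</TITLE><PART><ITEM>Motherboard</ITEM></PART></PARTS>', GETDATE())
,(1, '<PARTS><TITLE>Computer Parts</TITLE><PART><ITEM>CPU</ITEM></PART></PARTS>', GETDATE())
,(2, '<PARTS><TITLE>Car Parts</TITLE><PART><ITEM>Wheel</ITEM></PART></PARTS>', GETDATE());
GO
-- Hypothetical procedure: returns every request of one Api_id that has an ITEM equal to @Value.
CREATE PROCEDURE dbo.GetXmlRequestsByItem
     @Api_Id INT
    ,@Value  NVARCHAR(150)
AS
BEGIN
    SET NOCOUNT ON;
    SELECT r.Api_id
          ,c.RealXML AS xmlRequest
          ,r.Sent_Time
    FROM dbo.Api_request_demo AS r
    -- style 2 mirrors the answers above, which need it for the DOCTYPE in the real data
    CROSS APPLY (SELECT CONVERT(XML, r.xmlRequest, 2) AS RealXML) AS c
    WHERE r.Api_id = @Api_Id
      AND c.RealXML.exist('/PARTS/PART/ITEM[. = sql:variable("@Value")]') = 1;
END;
GO
-- Usage: all requests for Api_id 1 that contain the item 'CPU'
EXEC dbo.GetXmlRequestsByItem @Api_Id = 1, @Value = N'CPU';
GO
-- Clean up the demo objects
DROP PROCEDURE dbo.GetXmlRequestsByItem;
DROP TABLE dbo.Api_request_demo;
```

Using exist() with sql:variable() keeps the comparison inside the XML engine and checks every ITEM element, not only the first one.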

TSQL: Extract XML Nested Tags Into Columns

Using SQL Server 2008 R2
I currently have XML tags with data inside the XML tags (not between them), such as:
<zooid="1"><animals key="all" zebras="22" dogs="0" birds="4" /><animals key="all" workers="yes" vacation="occasion" /> ... *(more)*</zooid>
<zooid="2"><animals key="house" zebras="0" dogs="1" birds="2" /><animals key="house" workers="no" vacation="no" /> ... *(more)*</zoodid>
If I query the XML or use the value function against it, it returns blank values because it tries to read between the tags, where no value exists. I need it to read inside the tags, turning the names before each equals sign into columns and the quoted values into the values of those columns (granted, I could write a function to do this, but that would be quite painstaking, and I'm curious whether something like this already exists). This is what it should look like in columns:
Key | Zebras | Dogs | Birds | Key | Workers | Vacation | ... *(more)*
... and this in rows of data:
all | 22 | 0 | 4 | all | yes | occasion | ... *(more)*
house | 0 | 1 | 2 | house | no | no | ... *(more)*
So the final output (just using the two XML rows from the beginning for now), would look like the below data in table form:
Key | Zebras | Dogs | Birds | Key | Workers | Vacation | ... *(more)*
================================================================
all | 22 | 0 | 4 | all | yes | occasion | ... *(more)*
house | 0 | 1 | 2 | house | no | no | ... *(more)*
Other than querying against the XML, using the .query method and even trying the .nodes method (using CROSS APPLY, see this thread), I haven't been able to generate this.
Try this one -
DECLARE @YourXML NVARCHAR(MAX)
SELECT @YourXML = '
<zooid="1">
<animals key="all" zebras="22" dogs="0" birds="4" />
<animals key="all" workers="yes" vacation="occasion" />
</zooid>
<zooid="2">
<animals key="house" zebras="0" dogs="1" birds="2" />
<animals key="house" workers="no" vacation="no" />
</zoodid>'
-- Repair the text so it can be cast to XML: give the attribute a name and fix the closing tag
DECLARE @XML XML
SELECT @XML =
REPLACE(
REPLACE(@YourXML, 'zooid=', 'zooid id=')
, '</zoodid>'
, '</zooid>')
SELECT
d.[Key]
, Dogs = MAX(d.Dogs)
, Zebras = MAX(d.Zebras)
, Birds = MAX(d.Birds)
, Workers = MAX(d.Workers)
, Vacation = MAX(d.Vacation)
FROM (
SELECT
[Key] = t.p.value('./@key', 'NVARCHAR(50)')
, Zebras = t.p.value('./@zebras', 'INT')
, Dogs = t.p.value('./@dogs', 'INT')
, Birds = t.p.value('./@birds', 'INT')
, Workers = t.p.value('./@workers', 'NVARCHAR(20)')
, Vacation = t.p.value('./@vacation', 'NVARCHAR(20)')
FROM @XML.nodes('/zooid/animals') t(p)
) d
GROUP BY d.[Key]
Your XML appears invalid. How are you able to specify an element like this: <zooid="1">? Generally, XML structure is <(elementName) (Attribute)="(Value)"/>. Unless I am mistaken, casting that text to XML as it stands will fail. That said, here is a working example for proper XML, self-contained so that it will run in SQL Server Management Studio as is.
declare @text1 varchar(max) = '<zooid="1"><animals="all" zebras="22" dogs="0" birds="4" /><animals="all" workers="yes" vacation="occasion" /></zooid>'
, @text2 varchar(max) = '<a zooid="1"><b animals="all" zebras="22" dogs="0" birds="4" /><b animals="all" workers="yes" vacation="occasion" /></a>'
, @xml xml
;
begin try
set @xml = cast(@text1 as xml)
end try
begin catch
set @xml = '<ElementName Attribute="BadData Elements are not named" />'
end catch
select @xml
begin try
set @xml = cast(@text2 as xml)
end try
begin catch
set @xml = '<ElementName Attribute="BadData" />'
end catch
select
@xml.value('(/a/b/@animals)[1]', 'varchar(20)') as AnimalsValue
, @xml.value('(/a/b/@zebras)[1]', 'int') as ZebrasValue
, @xml.value('(/a/b/@dogs)[1]', 'int') as DogsValue
, @xml.value('(/a/b/@birds)[1]', 'int') as BirdsValue
, @xml.value('(/a/b/@workers)[1]', 'varchar(16)') as Workers
, @xml.value('(/a/b/@vacation)[1]', 'varchar(16)') as Vacation
The '.value' method is used to query XML in SQL Server. I locate the elements (using a generic a that contains b). Once at that level, '@animals' stands for 'the attribute named animals'. The [1] is a position: .value can only return a single value at a time, so I chose the first match. It then needs a data type to return: text is varchar and numbers are int.
XML query methods: http://msdn.microsoft.com/en-us/library/ms190798.aspx
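Because .value returns only one value per call, reading every b element takes a combination with .nodes, which produces one row per matching element. A minimal sketch, reusing the well-formed sample from this answer (the aliases T and b are arbitrary):

```sql
DECLARE @xml XML = N'<a zooid="1"><b animals="all" zebras="22" dogs="0" birds="4" /><b animals="all" workers="yes" vacation="occasion" /></a>';

-- One row per <b> element; an attribute that is missing on an element comes back as NULL.
SELECT T.b.value('@animals',  'varchar(20)') AS Animals
      ,T.b.value('@zebras',   'int')         AS Zebras
      ,T.b.value('@dogs',     'int')         AS Dogs
      ,T.b.value('@birds',    'int')         AS Birds
      ,T.b.value('@workers',  'varchar(16)') AS Workers
      ,T.b.value('@vacation', 'varchar(16)') AS Vacation
FROM @xml.nodes('/a/b') AS T(b);
```

Inside .nodes the path is relative to each b element, so '@animals' needs no [1]: an element can carry at most one attribute of a given name.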
