Modify XML in SQL server to add a root node - sql-server

To give some background to this problem first, I am rewriting some code that currently loops through some xml, doing an insert to a table at the end of each loop - replacing with a single sp that takes an xml parameter and does the insert in one go, 'shredding' the xml into a table.
The main shred has been done successfully, but currently one of the columns is used to store the entire node. I have been able to work out the query necessary for this (almost), but it misses out the root part of the node. I have come to the conclusion that my query is as good as I can get it, and I am looking at a way to then do an update statement to get the root node back in there.
So my xml is of the form;
<xml>
<Items>
<Item>
<node1>...</node1><node2>...</node2>.....<noden>...</noden>
</Item>
<Item>
<node1>...</node1><node2>...</node2>.....<noden>...</noden>
</Item>
<Item>
<node1>...</node1><node2>...</node2>.....<noden>...</noden>
</Item>
......
</Items>
</xml>
So the basic shredding puts the value from node1 into column1, node2 into column2 etc. The insert statement looks something like;
INSERT INTO mytable (col1, col2,...etc.....,wholenodecolumn)
Select
doc.col.value('node1[1]', 'int') column1,
doc.col.value('node2[1]', 'varchar(50)') column2,
....etc......,
doc.col.query('*')--this is the query for getting the whole node
FROM @xml.nodes('//Items/Item') doc(col)
The XML that ends up in wholenodecolumn is of the form;
<node1>...</node1><node2>...</node2>.....<noden>...</noden>
but I need it to be of the form
<Item><node1>...</node1><node2>...</node2>.....<noden>...</noden></Item>
There is existing code (a lot of it) that depends on the xml in this column being of the correct form.
So can someone maybe see how to modify the doc.col.query('*') to get the desired result?
Anyway, I gave up on modifying the query, and tried to think of other ways to accomplish the end result. What I am now looking at is an Update after the insert- something like;
update mytable set wholenodecolumn.modify('insert <Item> as first before * ')
If I could do this along with
.modify('insert </Item> as last after * ')
that would be fine, but doing 1 at a time isn't an option as the XML is then invalid
XQuery [mytable.wholenodecolumn.modify()]: Expected end tag 'Item'
I don't know whether doing both together is possible; I've tried various syntaxes and can't get it to work.
Any other approaches to the problem also gratefully received

I believe you can specify the root node name by using the FOR XML clause.
For example:
select top 1 *
from HumanResources.Department
for XML AUTO, ROOT('RootNodeName')
Take a look at Books Online for more details:
http://msdn.microsoft.com/en-us/library/ms190922.aspx

Answering my own question here! This follows on from the comments on one of the other attempted answers, where I said:
I am currently looking into FLWOR XQuery constructs in the query. col.query('for $item in * return <Item> {$item} </Item>') is almost there, but puts <Item> around each node, rather than around all the nodes.
I was almost there with the syntax; a small tweak has given me what I needed:
doc.col.query('<Item> { for $item in * return $item } </Item>')
Thank you to everyone that helped. I have further related issues now, but I'll post them as separate questions.
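As a minimal, self-contained sketch of the wrapping technique described above: the sample data and the column aliases below are invented for illustration and are not from the original post. The last column also shows query('.'), which returns the context node itself (including its <Item> tag) and may be a simpler alternative if nothing needs to be filtered out.

```sql
-- Hypothetical sample data standing in for the real input
DECLARE @xml XML = N'<xml><Items>
  <Item><node1>1</node1><node2>a</node2></Item>
  <Item><node1>2</node1><node2>b</node2></Item>
</Items></xml>';

SELECT
    doc.col.value('node1[1]', 'int')         AS column1,
    doc.col.value('node2[1]', 'varchar(50)') AS column2,
    -- child elements re-wrapped in a constructed <Item> element
    doc.col.query('<Item> { for $item in * return $item } </Item>') AS wrapped_node,
    -- the <Item> element itself, taken as-is from the source document
    doc.col.query('.')                       AS self_node
FROM @xml.nodes('/xml/Items/Item') AS doc(col);
```

Note that the constructed version copies only child elements, while query('.') also keeps any attributes on <Item>; which one fits depends on what the downstream code expects.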

Couldn't you just add the '<Item>' / '</Item>' as fixed texts in your select? Something like:
Select
'<Item>',
doc.col.value('node1[1]', 'int') column1,
doc.col.value('node2[1]', 'varchar(50)') column2,
....etc......,
doc.col.query('*'), --this is the query for getting the whole node
'</Item>'
FROM @xml.nodes('//Items/Item') doc(col)
Marc

Related

Is that valid XML and how to replicate with SQL Server

I do have to replicate an XML file with SQL Server and I am now stumbling over the following structure inside the XML file and I don't know how to replicate that.
The structure looks like this at the moment for certain tags:
<ART_TAG1>
<UNLIMITED/>
</ART_TAG1>
<ART_TAG2>
<ART_TAG3>
<Data_Entry/>
</ART_TAG3>
</ART_TAG2>
I am wondering whether this is proper XML, given that the data inside (UNLIMITED and Data_Entry) is written as a self-closing XML tag. The XML validator https://www.w3schools.com/xml/xml_validator.asp is telling me this is correct. But now I am struggling with replicating that with Transact-SQL.
If I try to replicate that I can only come up with the following TSQL script, which obviously does not fully look like the original.
SELECT 'UNLIMITED' as 'ART_TAG1'
, 'Data_Entry' as 'ART_TAG2/ART_TAG3'
FOR XML PATH(''), ROOT('root')
<root>
<ART_TAG1>UNLIMITED</ART_TAG1>
<ART_TAG2>
<ART_TAG3>Data_Entry</ART_TAG3>
</ART_TAG2>
</root>
If I get this correctly, your question is:
How can I put my query to create those <SomeElement /> tags?
Look at this:
--This will create filled nodes
SELECT 'outer' AS [OuterNode/@attr]
,'inner' AS [OuterNode/InnerNode]
FOR XML PATH('row');
--The empty string is some kind of content
SELECT 'outer' AS [OuterNode/@attr]
,'' AS [OuterNode/InnerNode]
FOR XML PATH('row');
--the missing value (NULL) is omitted by default
SELECT 'outer' AS [OuterNode/@attr]
,NULL AS [OuterNode/InnerNode]
FOR XML PATH('row');
--Now check what happens here:
--First XML has an empty element, while the second uses the self-closing element
DECLARE @xml1 XML=
N'<row>
<OuterNode attr="outer">
<InnerNode></InnerNode>
</OuterNode>
</row>';
DECLARE @xml2 XML=
N'<row>
<OuterNode attr="outer">
<InnerNode/>
</OuterNode>
</row>';
SELECT @xml1,@xml2;
The result is the same for both...
Some background: Semantically the empty element <element></element> is exactly the same as the self-closing element <element />. It should not make any difference, whether you use the one or the other. If your consumer cannot deal with this, it is a problem in the reading part.
Yes, you can force any content into XML on string level, but - as the example shows above - this is just a (dangerous) hack.
XML within T-SQL returns - by default - a missing node as NULL and an empty element as empty (depending on the datatype, and beware of the difference between an element and its text() node).
In short: This is nothing you should have to think about...
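Applied to the structure from the question, the empty-string approach might look like the sketch below. It assumes the marker names UNLIMITED and Data_Entry are fixed and simply become element names in the column aliases; the TYPE directive is added so the result is returned as the xml data type, which typically serializes empty elements in the self-closing form.

```sql
-- Empty strings create the elements; the names come from the column aliases
SELECT '' AS [ART_TAG1/UNLIMITED],
       '' AS [ART_TAG2/ART_TAG3/Data_Entry]
FOR XML PATH(''), ROOT('root'), TYPE;
```

Whether the output shows <UNLIMITED /> or <UNLIMITED></UNLIMITED> should not matter to a conforming XML consumer, as explained above.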

SQL: Using XML as input to do an inner join

I have XML coming in as the input, but I'm unclear on how I need to setup the data and statement to get the values from it. My XML is as follows:
<Keys>
<key>246</key>
<key>247</key>
<key>248</key>
</Keys>
And I want to do the following (is simplified to get my point across)
Select *
From Transaction as t
Inner Join @InputXml.nodes('Keys') as K(X)
on K.X.value('@Key', 'INT') = t.financial_transaction_grp_key
Can anyone provide how I would do that? What would my 3rd/4th line in the SQL look like?
Thanks!
From your code I assume this is SQL-Server but you added the tag [mysql]...
For your next question please keep in mind, that it is very important to know your tools (vendor and version).
Assuming T-SQL and [sql-server] (according to the provided sample code) you were close:
DECLARE @InputXml XML=
N'<Keys>
<key>246</key>
<key>247</key>
<key>248</key>
</Keys>';
DECLARE @YourTransactionTable TABLE(ID INT IDENTITY,financial_transaction_grp_key INT);
INSERT INTO @YourTransactionTable VALUES (200),(246),(247),(300);
Select t.*
From @YourTransactionTable as t
Inner Join @InputXml.nodes('/Keys/key') as K(X)
on K.X.value('text()[1]', 'INT') = t.financial_transaction_grp_key;
What was wrong:
.nodes() must go down to the repeating element, which is <key>
In .value() you are using the path @Key, which is wrong in two ways: 1) <key> is an element and not an attribute and 2) XML is strictly case-sensitive, so Key!=key.
An alternative might be this:
WHERE @InputXml.exist('/Keys/key[. cast as xs:int? = sql:column("financial_transaction_grp_key")]')=1;
Which one is faster depends on the count of rows in your source table as well as the count of keys in your XML. Just try it out.
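A third variant worth measuring, sketched here under the assumption that the key list is small: shred the XML once into a table variable and join against that. The @Keys table and its KeyValue column are hypothetical names introduced only for this illustration.

```sql
DECLARE @InputXml XML =
N'<Keys><key>246</key><key>247</key><key>248</key></Keys>';

DECLARE @YourTransactionTable TABLE (ID INT IDENTITY, financial_transaction_grp_key INT);
INSERT INTO @YourTransactionTable (financial_transaction_grp_key)
VALUES (200),(246),(247),(300);

-- Shred the keys once, so the XML is not re-evaluated per row of the join
DECLARE @Keys TABLE (KeyValue INT);
INSERT INTO @Keys (KeyValue)
SELECT K.X.value('text()[1]', 'INT')
FROM @InputXml.nodes('/Keys/key') AS K(X);

-- Plain relational join; with this sample data it matches 246 and 247
SELECT t.*
FROM @YourTransactionTable AS t
INNER JOIN @Keys AS k
    ON k.KeyValue = t.financial_transaction_grp_key;
```

For larger key lists, a temp table with an index on KeyValue may give the optimizer better estimates than a table variable.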
You probably need to parse the XML to a readable format with regex.
I wrote a similar event to parse the active DB from an xmlpayload that was saved on a table. This may or may not work for you, but you should be able to at least get started.
SELECT SUBSTRING(column FROM IF(locate('<key>',column)=0,0,0+LOCATE('<key>',column))) as KEY FROM table LIMIT 1\G

Select XML multiple only a few nodes with the same name

I'm trying to construct a soap message, and I was able to construct the entire message using a single select. Except the problem is, on only a few occasions the same node name is repeated twice.
So for example the required output result should be like so, with two separate id root nodes:
<SoapDocument>
<recordTarget>
<patientRole>
<id root="1.2.3.4" extension="1234567" />
<id root="1.2.3.5.6" extension="0123456789" />
</patientRole>
</recordTarget>
</SoapDocument>
I tried to use my sparse knowledge of xpath to construct the node names like so:
select
'1.2.3.4' AS 'recordTarget/patientRole/id[1]/@root',
'1234567' AS 'recordTarget/patientRole/id[1]/@extension',
'1.2.3.5.6' AS 'recordTarget/patientRole/id[2]/@root',
'0123456789' AS 'recordTarget/patientRole/id[2]/@extension'
FOR XML PATH('SoapDocument'),TYPE
Apparently xpath naming can't be applied to column names id[1] and id[2] like that? Am I missing something here or should the notation be different? What would be the easiest way to construct the desired result?
From your question I assume this is not tabular data but fixed values, and that you are creating a medical document, presumably a CDA.
Try this:
SELECT
(
SELECT
'1.2.3.4' AS 'id/@root',
'1234567' AS 'id/@extension',
'',
'1.2.3.5.6' AS 'id/@root',
'0123456789' AS 'id/@extension'
FOR XML PATH('patientRole'),TYPE
) AS [SoapDocument/recordTarget]
FOR XML PATH('')
The result:
<SoapDocument>
<recordTarget>
<patientRole>
<id root="1.2.3.4" extension="1234567" />
<id root="1.2.3.5.6" extension="0123456789" />
</patientRole>
</recordTarget>
</SoapDocument>
Some explanation: The empty, unnamed column in the middle allows you to place two elements with the same name in one query, because it makes the engine close the first <id> before starting the next one. There are various approaches to get this into your surrounding tags. This is just one possibility.
UPDATE
I'd like to point to BdR's own answer! Great finding and worth an up-vote!
A little more elaboration on the answer from Shnugo, as it got me trying out some things using an "empty column".
If you do not give the empty column a name, it will reset to the XML root node. So the following columns will start from the XML root of the selection you are in at that point. However, if you explicitly name the empty separator column, then the following columns will continue in the hierarchy as set by that column name.
So the selection below will also result in the desired result. It's subtly different, but in my case it allows me to avoid using subselections.
select
'1.2.3.4' AS 'recordTarget/patientRole/id/@root',
'1234567' AS 'recordTarget/patientRole/id/@extension',
'' AS 'recordTarget/patientRole',
'1.2.3.5.6' AS 'recordTarget/patientRole/id/@root',
'0123456789' AS 'recordTarget/patientRole/id/@extension'
FOR XML PATH('SoapDocument'),TYPE
This should do the job:
WITH CTE AS (
SELECT *
FROM (VALUES('1.2.3.4','1234567'),
('1.2.3.5.6','0123456789')) V ([root], [extension]))
SELECT (SELECT (SELECT (SELECT [root] AS [@root],
[extension] AS [@extension]
FROM CTE
FOR XML PATH('id'), TYPE)
FOR XML PATH('patientRole'), TYPE)
FOR XML PATH ('recordTarget'), TYPE)
FOR XML PATH ('SoapDocument');

Extracting XML in a column from a SQL Server database

I have read dozens of posts and have tried numerous SQL queries to try and get this figured out. Sadly, I'm not a SQL expert (not even a novice) nor am I an XML expert. I understand basic queries from SQL, and understand XML tags, mostly.
I'm trying to query a database table, and have the data show a list of values from a column that contains XML. I'll give you an example of the data. I won't burden you with everything I have tried.
Here is an example of field inside of the column I need. So this is just one row, I would need to query the whole table to get all of the data I need.
When I select * from [table name] it returns hundreds of rows and when I double click in the column name of 'Document' on one row, I get the information I need.
It looks like this:
<code_set xmlns="">
<name>ExampleCodeTable</name>
<last_updated>2010-08-30T17:49:58.7919453Z</last_updated>
<code id="1" last_updated="2010-01-20T17:46:35.1658253-07:00"
start_date="1998-12-31T17:00:00-07:00"
end_date="9999-12-31T16:59:59.9999999-07:00">
<entry locale="en-US" name="T" description="Test1" />
</code>
<code id="2" last_updated="2010-01-20T17:46:35.1658253-07:00"
start_date="1998-12-31T17:00:00-07:00"
end_date="9999-12-31T16:59:59.9999999-07:00">
<entry locale="en-US" name="Z" description="Test2" />
</code>
<displayExpression>[Code] + ' - ' + [Description]</displayExpression>
<sortColumn>[Description]</sortColumn>
</code_set>
Ideally I would write it so it runs the query on the table and produces results like this:
Code Description
--------------------
(Data) (Data)
Any ideas? Is it even possible? The dozens of things I have tried that are always posted in stack, either return Nulls or fail.
Thanks for your help
Try something like this:
SELECT
CodeSetId = xc.value('@id', 'int'),
Description = xc.value('(entry/@description)[1]', 'varchar(50)')
FROM
dbo.YourTableNameHere
CROSS APPLY
YourXmlColumn.nodes('/code_set/code') AS XT(XC)
This basically uses the built-in XQuery to get an "in-memory" table (XT) with a single column (XC), each containing an XML fragment that represents each <code> node inside your <code_set> root node.
Once you have each of these XML fragments, you can use the .value() XQuery method to "reach in" and grab some pieces of information from it, e.g. its @id (attribute by the name of id), or the @description attribute on the contained <entry> subelement.
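To try this against the sample document from the question, a self-contained sketch could look like the following. The table variable @Docs is a stand-in for the real table, and mapping the requested "Code" column to the name attribute of <entry> is an assumption, since the question does not say which value it wants there.

```sql
-- Hypothetical stand-in for the real table and its Document column
DECLARE @Docs TABLE (Document XML);
INSERT INTO @Docs (Document) VALUES (N'<code_set>
  <name>ExampleCodeTable</name>
  <code id="1"><entry locale="en-US" name="T" description="Test1" /></code>
  <code id="2"><entry locale="en-US" name="Z" description="Test2" /></code>
</code_set>');

-- One output row per <code> element in every stored document
SELECT
    Code          = xc.value('(entry/@name)[1]', 'varchar(50)'),
    [Description] = xc.value('(entry/@description)[1]', 'varchar(50)')
FROM @Docs AS d
CROSS APPLY d.Document.nodes('/code_set/code') AS xt(xc);
```

If the real documents declare a non-empty default namespace, the paths would also need a WITH XMLNAMESPACES clause; the sample in the question uses xmlns="", so none is needed here.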
The following query will read the xml field in every row, then shred certain values into a tabular result set.
SELECT
-- get attribute [attribute name] from the parent node
parent.value('./@attribute name','varchar(max)') as ParentAttributeValue,
-- get the text value of the first child node (.value() requires a singleton, hence the [1])
child.value('(./text())[1]', 'varchar(max)') as ChildNodeValueFromFirstChild,
-- get attribute [attribute name] from the first child node
child.value('./@attribute name', 'varchar(max)') as ChildAttributeValueFromFirstChild
FROM
[table name]
CROSS APPLY
-- create a handle named parent that references that <parent node> in each row
[xml field name].nodes('//xpath to parent name') AS ParentName(parent)
CROSS APPLY
-- create a handle named child that references first <child node> in each row (XPath positions are 1-based)
parent.nodes('(xpath from parent/to child)[1]') AS FirstChildNode(child)
GO
Please provide the exact values you want to shred from the XML for a more precise answer.

How to get a particular attribute from XML element in SQL Server

I have something like the following XML in a column of a table:
<?xml version="1.0" encoding="utf-8"?>
<container>
<param name="paramA" value="valueA" />
<param name="paramB" value="valueB" />
...
</container>
I am trying to get the valueB part out of the XML via TSQL
So far I am getting the right node, but now I can not figure out how to get the attribute.
select xmlCol.query('/container/param[@name="paramB"]') from LogTable
I figure I could just add /@value to the end, but then SQL tells me attributes have to be part of a node. I can find a lot of examples for selecting the child nodes' attributes, but nothing on the sibling attributes (if that is the right term).
Any help would be appreciated.
Try using the .value function instead of .query:
SELECT
xmlCol.value('(/container/param[@name="paramB"]/@value)[1]', 'varchar(50)')
FROM
LogTable
The XPath expression could potentially return a list of nodes, therefore you need to add a [1] to that potential list to tell SQL Server to use the first of those entries (and yes - that list is 1-based - not 0-based). As second parameter, you need to specify what type the value should be converted to - just guessing here.
Marc
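If the parameter name is not fixed, the same .value() call can take it from a T-SQL variable through sql:variable(). This is a minimal sketch: the table variable @LogTable and the variable @paramName are illustrative names, and the XML declaration from the question is left out of the literal to keep the example simple.

```sql
-- Hypothetical stand-in for LogTable
DECLARE @LogTable TABLE (xmlCol XML);
INSERT INTO @LogTable (xmlCol) VALUES (N'<container>
  <param name="paramA" value="valueA" />
  <param name="paramB" value="valueB" />
</container>');

DECLARE @paramName VARCHAR(50) = 'paramB';

-- Pick the value attribute of the <param> whose name matches the variable
SELECT xmlCol.value(
         '(/container/param[@name = sql:variable("@paramName")]/@value)[1]',
         'varchar(50)') AS ParamValue
FROM @LogTable;
```

With the sample row above this selects the value attribute of the paramB element; rows without a matching <param> return NULL.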
Depending on the actual structure of your xml, it may be useful to put a view over it to make it easier to consume using 'regular' sql, e.g.
CREATE VIEW vwLogTable
AS
SELECT
c.p.value('@name', 'varchar(10)') name,
c.p.value('@value', 'varchar(10)') value
FROM
LogTable
CROSS APPLY xmlCol.nodes('/container/param') c(p)
GO
-- now you can get all values for paramB as...
SELECT value FROM vwLogTable WHERE name = 'paramB'
