XQuery Delete all unnecessary nodes T-SQL - sql-server

Could you please help me with the following problem? I have XML like this:
<Root attr1="val1">
<El1><Child1/></El1>
<El2><Child2/></El2>
...
<ElN><ChildN/></ElN>
</Root>
I need to delete, with T-SQL, all nodes except the El2 node. I don't know all the nodes of the XML document in advance, but if the document has an El2 node, I need to delete all other nodes except El2. So the result must be:
<Root attr1="val1">
<El2><Child2/></El2>
</Root>
I thought about extracting xml.query('(/Root/El2)[1]') into a new XML value and then somehow wrapping it with the root element from the original XML. But is there a way to modify the original XML directly?

When dealing with XML where all of the elements are properly closed you could use the query() XML function to perform an XQuery like the following:
declare @xml xml = '<Root attr1="val1">
<El1><Child1a/></El1>
<El1><Child1b/></El1>
<El2><Child2a/></El2>
<El2><Child2b/></El2>
<ElN><ChildNa/></ElN>
<ElN><ChildNb/></ElN>
</Root>';
select @xml.query('
for $root in /Root return
<Root>
{$root/@*}
{$root/El2[1]}
</Root>
');
This returns the XML output:
<Root attr1="val1"><El2><Child2a /></El2></Root>
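The question also asks whether the original XML can be modified in place. A minimal sketch of that, assuming the element to keep is called El2 and sits directly under Root (the sample data and the exist() guard are illustrative, not from the original answer), uses modify() with an XML DML delete:

```sql
DECLARE @xml xml = N'<Root attr1="val1">
<El1><Child1/></El1>
<El2><Child2/></El2>
<ElN><ChildN/></ElN>
</Root>';

-- Only touch the document if it actually contains an El2 element.
IF @xml.exist('/Root/El2') = 1
BEGIN
    -- Delete every child element of Root whose name is not El2.
    -- Root itself and its attributes are left untouched.
    SET @xml.modify('delete /Root/*[local-name(.) != "El2"]');
END;

SELECT @xml;
```

Unlike the query() version, this keeps every El2 element rather than only the first one; change the predicate if only the first should survive.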

Related

Trying to query XML Data - node has a space in it

I am trying to learn how to work with xml files and data in SQL Server and I'm trying to query an xml file but nothing is returned.
Here is the xml data:
<?xml version="1.0" encoding="UTF-8"?>
<Report xmlns="AdmissionsByPCP" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" Name="AdmissionsByPCP" xsi:schemaLocation="AdmissionsByPCP http://10.xxx.x.xx/ReportServer_NameofReportServer?%2FHl%20C%20Syst%20Reports%2health%2FAdmissBy&rs%3ACommand=Render&rs%3AFormat=XML&rs%3ASessionID=h0iz5ijxgt2vdl45g3pjfs45&rc%3ASchema=True">
<Tablix2>
<Details_Collection>
<Details PCPCarrier="DoctorsName">
<Subreport1>
<Report Name="PCPAdmitSubReport">
<Tablix5 Textbox5="79">
<Details_Collection>
<Details Textbox37="Discharge Dx Code: ICDCode" Textbox89="Admit Dx Code: ICDCode" LOS="4" DischargeDate="07/10/2017" AdmitDate="07/06/2017" Hospital="Hospital Name" MemberName="Name" DOB="1/1/2019" AdmissionType="Inpatient" MemberNo="12345" Auth="321*I" Status="Close" AdmissionID="00001" LobName="Medicare" CarrierName="CarrierName"/>
</Details_Collection></Tablix5></Report></Subreport1></Details></Details_Collection></Tablix2></Report>
Here is the query I'm using:
Declare @XMLData as XML
Set @XMLData=(
Select bulkcolumn
FROM OPENROWSET (Bulk '\Directory\AdmissionsByPCP.xml',
Single_Blob) a)
Select
@XMLData.value('(/Root/Report/Tablix2/Detail_Collections/DetailsPCPCarrier) [1]', 'varchar(max)') PCP
The query returns null and I don't know why. Is it because there is a space in the node (<Details PCPCarrier>) and if so how do I work around that?
You have misunderstood how XML works. This is the node you are looking for:
<Details PCPCarrier="DoctorsName">
This is not a node called Details PCPCarrier; it is a node called Details with an attribute called PCPCarrier.
So the XPath to select it would be:
/Report/Tablix2/Details_Collection/Details
Or, if you want to specifically filter by the PCPCarrier attribute existing:
/Report/Tablix2/Details_Collection/Details[@PCPCarrier]
Or, to get the value of the attribute itself:
/Report/Tablix2/Details_Collection/Details/@PCPCarrier
IMSoP pointed me in the right direction and I figured out the rest myself.
I also needed to add this:
With XMLNAMESPACES (Default 'AdmissionsByPCP')
So the query looks like this:
Declare @XMLData as XML
Set @XMLData=(
Select *
FROM OPENROWSET (Bulk '\\Directory\AdmissionsByPCP.xml',
Single_Clob) a );
With XMLNAMESPACES (Default 'AdmissionsByPCP')
Select
@XMLData.value('(/Report/Tablix2/Details_Collection/Details/@PCPCarrier)[1]', 'varchar(max)')

Update xml column in a table

I need to remove an XML element that has the attribute A with the value xxxx from the XML values in a column.
Method 1:
update t set x = x.query('//E[@A != "xxxx"]')
Method 2:
update t set x.modify('delete /E[@A = "xxxx"]')
Which one is better?
The two calls do not do the same thing:
DECLARE @xml XML=
N'<root>
<test pos="1" a="xxx">test 1</test>
<test pos="2" a="SomeOther">test 2</test>
<test pos="3" a="xxx">test 3</test>
<OtherElement>This is another element</OtherElement>
</root>';
--Try this with either this approach
SET @xml=@xml.query(N'//test[@a!="xxx"]')
--Or try it with this
SET @xml.modify(N'delete //test[@a="xxx"]')
SELECT @xml;
The result of the first is
<test pos="2" a="SomeOther">test 2</test>
While the second returns
<root>
<test pos="2" a="SomeOther">test 2</test>
<OtherElement>This is another element</OtherElement>
</root>
XML is not stored as the text you see. It is stored as a tree structure representing a complex document. Modifying this tree is fairly easy: you just remove some elements. The query() approach has to rebuild the XML and replace the original with a new one. So my clear advice is: use the modify() approach! If you are really good with XQuery and FLWOR, the query() approach is much more powerful, but that is another story...
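Applied to a table column, the modify() approach could look like the following sketch. The table variable @t, its sample rows, and the exist() filter are assumptions added for illustration; the filter simply avoids touching rows that contain no matching element.

```sql
-- Hypothetical stand-in for the real table t with an xml column x.
DECLARE @t TABLE (id int IDENTITY PRIMARY KEY, x xml);

INSERT INTO @t (x) VALUES
    (N'<root><E A="xxxx">remove me</E><E A="keep">keep me</E></root>'),
    (N'<root><E A="keep">nothing to remove here</E></root>');

-- Delete matching E elements in place, only in rows that contain one.
UPDATE @t
SET x.modify('delete //E[@A = "xxxx"]')
WHERE x.exist('//E[@A = "xxxx"]') = 1;

SELECT id, x FROM @t;
```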

SQL Server Stored Procedure XML can't get node right

I have this query here:
SELECT [Job_No] as '@Key',
(
)
FOR XML PATH('Job_No'), ROOT('Root')
and it returns like so:
<Root>
<Job_No Key="ORC0023">
</Job_No>
</Root>
How do I get it like so:
<Root>
<Key>ORC0023</Key>
</Root>
Try this:
SELECT [Job_No] as 'Key' FROM Jobs
FOR XML PATH(''), root ('Root');

T-SQL XML: <eof> encountered when trying to query child nodes by wildcard

We have an in-house piece of software that works with loosely-defined XML files. I'm trying to extract the child nodes from this step in T-SQL. I'm able to extract the parent node, but I keep getting <eof> syntax errors whenever I query the children.
The XML file looks roughly like this:
<?xml version="1.0"?>
<root>
<steps>
<step>
<steptypeX attribute="somevalue">
<child1>Value</child1>
<child2>Value</child2>
</steptypeX>
</step>
</steps>
</root>
I'm using the following T-SQL:
select
doc.col.query('/child*') --If I use '.' or '*' here I can get the children as XML, but I want the values contained within the nodes on separate rows
from @xmldoc.nodes('/root/steps/step/steptypeX') doc(col)
where doc.col.value('@attribute', 'nvarchar(max)') = 'somevalue'
The error message I'm getting is not clear:
XQuery [query()]: Syntax error near '<eof>'
As far as I can tell, the nodes do exist and I haven't left any XQuery instructions with trailing slashes. I can't really tell what I'm doing wrong here.
If I understand your intention correctly you can use child::*:
DECLARE @xmldoc XML =
N'<?xml version="1.0"?>
<root>
<steps>
<step>
<steptypeX attribute="somevalue">
<child1>Value</child1>
<child2>Value</child2>
</steptypeX>
</step>
</steps>
</root>';
SELECT
doc.col.value('text()[1]', 'nvarchar(max)')
FROM @xmldoc.nodes('/root/steps/step/steptypeX/child::*') doc(col)
WHERE doc.col.value('../@attribute', 'nvarchar(max)') = 'somevalue';
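The original error most likely arises because XPath name tests do not support partial wildcards: in '/child*' the asterisk is read as an operator that still expects an operand, so the parser hits the end of the expression. If only children whose names start with "child" are wanted, one possible sketch (the prefix filter and the column aliases are assumptions, not part of the answer above) compares a prefix of local-name() and moves the attribute test into the path:

```sql
DECLARE @xmldoc xml = N'<root>
<steps>
<step>
<steptypeX attribute="somevalue">
<child1>Value</child1>
<child2>Value</child2>
<other>Ignored</other>
</steptypeX>
</step>
</steps>
</root>';

SELECT
    doc.col.value('local-name(.)', 'nvarchar(128)') AS node_name,   -- element name
    doc.col.value('text()[1]', 'nvarchar(max)')     AS node_value   -- element text
FROM @xmldoc.nodes('/root/steps/step/steptypeX[@attribute = "somevalue"]
                    /*[substring(local-name(.), 1, 5) = "child"]') AS doc(col);
```

This should return one row per child1 and child2 and skip the other element.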

Get following sibling in SQL Server XPath

Since SQL Server does not support following-sibling axis - what is the best way to get it? Let's say I have XML like this and I would like to get the first 'b' node after a node matching the value 'dog':
<root>
<a>cat</a>
<b>Cats don't like milk</b>
<a>dog</a>
<b>Dogs like everything</b>
</root>
You could try something like this.
declare @X xml = '
<root>
<a>cat</a>
<b>Cats don''t like milk</b>
<a>dog</a>
<c>not this</c>
<b>Dogs like everything</b>
<b>and not this</b>
</root>'
select @X.query('(/root/b[. >> (/root/a[. = "dog"])[1]])[1]')
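The key here is the node order comparison operator >>, which is true when the left-hand node comes after the right-hand node in document order. The predicate therefore keeps only the b elements located after the first a whose value is "dog", and the outer [1] picks the first of them. As a small variation (the column alias is illustrative), value() can return just the text instead of an XML fragment:

```sql
DECLARE @X xml = '
<root>
<a>cat</a>
<b>Cats don''t like milk</b>
<a>dog</a>
<c>not this</c>
<b>Dogs like everything</b>
<b>and not this</b>
</root>';

-- Same path expression as above, but extracted as a scalar string.
SELECT @X.value('(/root/b[. >> (/root/a[. = "dog"])[1]])[1]', 'nvarchar(100)') AS first_b_after_dog;
-- returns: Dogs like everything
```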
