Is that valid XML and how to replicate with SQL Server

Is that valid XML and how to replicate with SQL Server - sql-server

I do have to replicate an XML file with SQL Server and I am now stumbling over the following structure inside the XML file and I don't know how to replicate that.
The structure looks like this at the moment for certain tags:
<ART_TAG1>
<UNMLIMITED/>
</ART_TAG1>
<ART_TAG2>
<ART_TAG3>
<Data_Entry/>
</ART_TAG3>
</ART_TAG2>
I am wondering if this is proper XML that the data inside (unlimited and Data_Entry) is enclosed with a closing XML tag. The XML validator https://www.w3schools.com/xml/xml_validator.asp is telling me this is correct. But now I am struggling with replicating that with Transact-SQL.
If I try to replicate that I can only come up with the following TSQL script, which obviously does not fully look like the original.
SELECT 'UNLIMITED' as 'ART_TAG1'
, 'Data_Entry' as 'ART_TAG2/ART_TAG3'
FOR XML PATH(''), ROOT('root')
<root>
<ART_TAG1>UNLIMITED</ART_TAG1>
<ART_TAG2>
<ART_TAG3>Data_Entry</ART_TAG3>
</ART_TAG2>
</root>

If I get this correctly, your question is:
How can I put my query to create those <SomeElement /> tags?
Look at this:
--This will create filled nodes
SELECT 'outer' AS [OuterNode/#attr]
,'inner' AS [OuterNode/InnerNode]
FOR XML PATH('row');
--The empty string is some kind of content
SELECT 'outer' AS [OuterNode/#attr]
,'' AS [OuterNode/InnerNode]
FOR XML PATH('row');
--the missing value (NULL) is omited by default
SELECT 'outer' AS [OuterNode/#attr]
,NULL AS [OuterNode/InnerNode]
FOR XML PATH('row');
--Now check what happens here:
--First XML has an empty element, while the second uses the self-closing element
DECLARE #xml1 XML=
N'<row>
<OuterNode attr="outer">
<InnerNode></InnerNode>
</OuterNode>
</row>';
DECLARE #xml2 XML=
N'<row>
<OuterNode attr="outer">
<InnerNode/>
</OuterNode>
</row>';
SELECT #xml1,#xml2;
The result is the same for both...
Some background: Semantically the empty element <element></element> is exactly the same as the self-closing element <element />. It should not make any difference, whether you use the one or the other. If your consumer cannot deal with this, it is a problem in the reading part.
Yes, you can force any content into XML on string level, but - as the example shows above - this is just a (dangerous) hack.
XML within T-SQL returns - by default - a missing node as NULL and an empty element as empty (depending on the datatype, and beware of the difference between an element and its text() node).
In short: This is nothing you should have to think about...

Related

Retrieve an element value from an XML string during an SQL select request

I'm querying a table T which has a string column StrXML that has XML text stored in it. Here's an example of the XML stored:
<Sequence mc:Ignorable="sap sads" DisplayName="Post Processing"
sap:VirtualizedContainerService.HintSize="424,318"
mva:VisualBasic.Settings="Assembly references and imported namespaces serialized as XML namespaces"
xmlns="http://schemas.microsoft.com/netfx/2009/xaml/activities"
xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006
xmlns:mee="clr-namespace:MatX.eRP.Entities;assembly=eRP.Entities"
xmlns:mepa="clr-namespace:MatX.eRP.PostProcessing.Activities;assembly=PostProcessing.Activities"
xmlns:mva="clr-namespace:Microsoft.VisualBasic.Activities;assembly=System.Activities"
xmlns:sads="http://schemas.microsoft.com/netfx/2010/xaml/activities/debugger"
xmlns:sap="http://schemas.microsoft.com/netfx/2009/xaml/activities/presentation"
xmlns:scg="clr-namespace:System.Collections.Generic;assembly=mscorlib"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml">
<sap:WorkflowViewStateService.ViewState>
<scg:Dictionary x:TypeArguments="x:String, x:Object">
<x:Boolean x:Key="IsExpanded">True</x:Boolean>
</scg:Dictionary>
</sap:WorkflowViewStateService.ViewState>
<mepa:BasicOperation Description="Traitement Thermique" DisplayName="HeatTreatment" Guid="82800b59-e181-4a93-b483-7e2cd9b14827" sap:VirtualizedContainerService.HintSize="402,154" Scope="Build">
<mepa:BasicOperation.MeasurementDescriptions>
<scg:List x:TypeArguments="mee:MeasurementDescription" Capacity="0" />
</mepa:BasicOperation.MeasurementDescriptions>
</mepa:BasicOperation>
<mepa:BasicOperation Description="Finition manuelle" DisplayName="Manual Finishing" Guid="cd64be75-6968-47fe-8aac-93a4fdf37892">
<mepa:BasicOperation.MeasurementDescriptions>
<scg:List x:TypeArguments="mee:MeasurementDescription" Capacity="4">
<mee:MeasurementDescription Max="{x:Null}" Min="{x:Null}" Guid="7c1a37f1-f39d-4ed3-8048-6b0a266c70b9" IsRequired="False" Name="MesureMF1" Type="Double" />
<mee:MeasurementDescription Max="{x:Null}" Min="{x:Null}" Guid="a21b0c0d-dfff-4237-9975-4179bcefe7c2" IsRequired="False" Name="MesureMF2" Type="Double" />
</scg:List>
</mepa:BasicOperation.MeasurementDescriptions>
</mepa:BasicOperation>
</Sequence>
In my select request on table T, I want to only show the Description value for which the Guid="82800b59-e181-4a93-b483-7e2cd9b14827".
How can I do that?

In a comment I mentioned already, that one of your namespaces is missing the final ". This is a big problem, if it's not just a copy-and-paste issue... (not well formed)
XML should not be stored in a string column (slow and dangerous!). If you database does not support XML natively the XML should at least be checked.
You did not mention the actual RDBMS, but the XQuery-principles should be the same (however your RDBMS deals with XQuery actually).
The simple approach is this XQuery (fetch any <BasicOperation>, wherever it is placed, and filter for the given GUID)
//*:BasicOperation[#Guid="82800b59-e181-4a93-b483-7e2cd9b14827"]/#Description
With SQL-Server you can try this
SELECT CAST(T.StrXML AS XML).value(N'(//*:BasicOperation[#Guid="82800b59-e181-4a93-b483-7e2cd9b14827"]/#Description)[1]',N'nvarchar(max)')
The more specific (and recommended) approach is this:
declare namespace dflt="http://schemas.microsoft.com/netfx/2009/xaml/activities";
declare namespace mepa="clr-namespace:MatX.eRP.PostProcessing.Activities;assembly=PostProcessing.Activities";
dflt:Sequence/mepa:BasicOperation[#Guid="82800b59-e181-4a93-b483-7e2cd9b14827"]/#Description
Again - with SQL-Server - you might try this:
SELECT CAST(T.StrXML AS XML).value(N'declare namespace dflt="http://schemas.microsoft.com/netfx/2009/xaml/activities";
declare namespace mepa="clr-namespace:MatX.eRP.PostProcessing.Activities;assembly=PostProcessing.Activities";
(dflt:Sequence/mepa:BasicOperation[#Guid="82800b59-e181-4a93-b483-7e2cd9b14827"]/#Description)[1]',N'nvarchar(max)')
If the GUID-value is variable SQL-Server would allow you to pass the value in from a variable declared outside. Read about sql:variable() and sql:column().
UPDATE
You can use lower-case() to get a secure comparison:
DECLARE #xml XML=
'<root>
<a guid="82800b59-e181-4a93-b483-7e2cd9b14827" />
<a guid="82800B59-E181-4A93-B483-7E2CD9B14827" />
</root>';
DECLARE #guid UNIQUEIDENTIFIER='82800B59-E181-4A93-B483-7E2CD9B14827';
SELECT #xml.query(N'/root/a[lower-case(#guid)=lower-case(sql:variable("#guid"))]')

Try something like this, assuming this is for SQL Server:
;WITH XMLNAMESPACES(DEFAULT 'http://schemas.microsoft.com/netfx/2009/xaml/activities',
'clr-namespace:MatX.eRP.PostProcessing.Activities;assembly=PostProcessing.Activities' AS mepa)
SELECT
T.X.value('#Description', 'varchar(100)') AS JobTitle
FROM
#XTable
CROSS APPLY
XmlData.nodes('/Sequence/mepa:BasicOperation') AS T(X)
WHERE
T.X.value('#Guid','varchar(50)') = '82800b59-e181-4a93-b483-7e2cd9b14827'

Extracting XML in a column from a SQL Server database

I have read dozens of posts and have tried numerous SQL queries to try and get this figured out. Sadly, I'm not a SQL expert (not even a novice) nor am I an XML expert. I understand basic queries from SQL, and understand XML tags, mostly.
I'm trying to query a database table, and have the data show a list of values from a column that contains XML. I'll give you an example of the data. I won't burden you with everything I have tried.
Here is an example of field inside of the column I need. So this is just one row, I would need to query the whole table to get all of the data I need.
When I select * from [table name] it returns hundreds of rows and when I double click in the column name of 'Document' on one row, I get the information I need.
It looks like this:
<code_set xmlns="">
<name>ExampleCodeTable</name>
<last_updated>2010-08-30T17:49:58.7919453Z</last_updated>
<code id="1" last_updated="2010-01-20T17:46:35.1658253-07:00"
start_date="1998-12-31T17:00:00-07:00"
end_date="9999-12-31T16:59:59.9999999-07:00">
<entry locale="en-US" name="T" description="Test1" />
</code>
<code id="2" last_updated="2010-01-20T17:46:35.1658253-07:00"
start_date="1998-12-31T17:00:00-07:00"
end_date="9999-12-31T16:59:59.9999999-07:00">
<entry locale="en-US" name="Z" description="Test2" />
</code>
<displayExpression>[Code] + ' - ' + [Description]</displayExpression>
<sortColumn>[Description]</sortColumn>
</code_set>
Ideally I would write it so it runs the query on the table and produces results like this:
Code Description
--------------------
(Data) (Data)
Any ideas? Is it even possible? The dozens of things I have tried that are always posted in stack, either return Nulls or fail.
Thanks for your help

Try something like this:
SELECT
CodeSetId = xc.value('#id', 'int'),
Description = xc.value('(entry/#description)[1]', 'varchar(50)')
FROM
dbo.YourTableNameHere
CROSS APPLY
YourXmlColumn.nodes('/code_set/code') AS XT(XC)
This basically uses the built-in XQuery to get an "in-memory" table (XT) with a single column (XC), each containing an XML fragment that represents each <code> node inside your <code_set> root node.
Once you have each of these XML fragments, you can use the .value() XQuery operator to "reach in" and grab some pieces of information from it, e.g. it's #id (attribute by the name of id), or the #description attribute on the contained <entry> subelement.

The following query will read the xml field in every row, then shred certain values into a tabular result set.
SELECT
-- get attribute [attribute name] from the parent node
parent.value('./#attribute name','varchar(max)') as ParentAttributeValue,
-- get the text value of the first child node
child.value('./text()', 'varchar(max)') as ChildNodeValueFromFirstChild,
-- get attribute attribute [attribute name] from the first child node
child.value('./#attribute name', 'varchar(max)') as ChildAttributeValueFromFirstChild
FROM
[table name]
CROSS APPLY
-- create a handle named parent that references that <parent node> in each row
[xml field name].nodes('//xpath to parent name') AS ParentName(parent)
CROSS APPLY
-- create a handle named child that references first <child node> in each row
parent.nodes('(xpath from parent/to child)[0]') AS FirstChildNode(child)
GO
Please provide the exact values you want to shred from the XML for a more precise answer.

SQL Server : read XML data

I have this query taken from the site www.SQLauthority.com:
DECLARE #MyXML XML
SET #MyXML = '<SampleXML>
<Colors>
<Color1>White</Color1>
<Color2>Blue</Color2>
<Color3>Black</Color3>
<Color4 Special="Light">Green</Color4>
<Color5>Red</Color5>
</Colors>
<Fruits>
<Fruits1>Apple</Fruits1>
<Fruits2>Pineapple</Fruits2>
<Fruits3>Grapes</Fruits3>
<Fruits4>Melon</Fruits4>
</Fruits>
</SampleXML>'
SELECT
a.b.value('Colors[1]/Color1[1]','varchar(10)') AS Color1,
a.b.value('Colors[1]/Color2[1]','varchar(10)') AS Color2,
a.b.value('Colors[1]/Color3[1]','varchar(10)') AS Color3,
a.b.value('Colors[1]/Color4[1]/#Special','varchar(10)')+' '+
+a.b.value('Colors[1]/Color4[1]','varchar(10)') AS Color4,
a.b.value('Colors[1]/Color5[1]','varchar(10)') AS Color5,
a.b.value('Fruits[1]/Fruits1[1]','varchar(10)') AS Fruits1,
a.b.value('Fruits[1]/Fruits2[1]','varchar(10)') AS Fruits2,
a.b.value('Fruits[1]/Fruits3[1]','varchar(10)') AS Fruits3,
a.b.value('Fruits[1]/Fruits4[1]','varchar(10)') AS Fruits4
FROM #MyXML.nodes('SampleXML') a(b)
I am not getting a better picture of how the nodes fetching from the xml data.
I have few queries regarding this.
what is a(b) in this?
how the structure will change if i have another node inside colors and all the existing child nodes appended to that?
ie:
<Colorss>
<Colors>
<Color1>White</Color1>
<Color2>Blue</Color2>
<Color3>Black</Color3>
<Color4 Special="Light">Green</Color4>
<Color5>Red</Color5>
</Colors>
<Colorss>
<Fruits>
<Fruits1>Apple</Fruits1>
<Fruits2>Pineapple</Fruits2>
<Fruits3>Grapes</Fruits3>
<Fruits4>Melon</Fruits4>
</Fruits>
what does it mean by a.b.value? When I mouse over it shows a is derived table. Can I check value of the table a?
Any help in this will be appreciated.

what is a(b) in this?
The call to .nodes('SampleXML') is a XQuery function which returns a pseudo table which contains one column of an XML fragment for each of the elements that this XPath expression matches - and the a(b) is the table alias (a) for that column, and b is the name of the column in that pseudo table containing the XML fragments.
what does it mean by a.b.value?
This is based on the above - a is the table alias for that temporary, inline pseudo table, b is the column name for the column in that table, and .value() is another XQuery function that will extract a single value from XML, based on the XPath expression (first argument) and it will return it as the datatype specified in the second argument.
You should check out those introductions to XQuery support in SQL Server to understand better:
Introduction to XQuery in SQL Server 2005
XQuery basics
and there are numerous other introductions and tutorials on XQuery - just search with your favorite search engine and you'll get tons of hits!

here's my stab # it:
a-refers to root;b-refers to root and child node
DECLARE #MyXML XML
SET #MyXML = '<SampleXML>
<Colors>
<Color1>White</Color1>
<Color2>Blue</Color2>
<Color3>Black</Color3>
<Color4 Special="Light">Green</Color4>
<Color5>Red
<Color6>Black44</Color6>
<Color7>Black445</Color7>
</Color5>
</Colors>
<Fruits>
<Fruits1>Apple</Fruits1>
<Fruits2>Pineapple</Fruits2>
<Fruits3>Grapes</Fruits3>
<Fruits4>Melon</Fruits4>
</Fruits>
</SampleXML>'
to get an inner child
SELECT
a.c.value('Colors1/Color11','varchar(10)') AS Color1,
a.c.value('Colors1/Color21','varchar(10)') AS Color2,
a.c.value('Colors1/Color31','varchar(10)') AS Color3,
a.c.value('Colors1/Color41/#Special','varchar(10)') AS Color4,
a.c.value('Colors1/Color51','varchar(10)') AS Color5,
a.c.value('Colors1/Color51/Color71','varchar(50)') AS Color6a,
a.c.value('Colors1/Color51/Color61','varchar(50)') AS Color6b, a.c.value('Fruits1/Fruits11','varchar(10)') AS Fruits1,
a.c.value('Fruits1/Fruits21','varchar(10)') AS Fruits2,
a.c.value('Fruits1/Fruits31','varchar(10)') AS Fruits3,
a.c.value('Fruits1/Fruits41','varchar(10)') AS Fruits4
FROM #MyXML.nodes('SampleXML') a(c)
A nodes() method invocation with the query expression /root/Color(n) would return a rowset with three rows, each containing a logical copy of the original XML document, and with the context item set to one of the nodes
see here

Generate XML in proper syntax from SQL Server table

How to write a SQL statement to generate XML like this
<ROOT>
<Production.Product>
<ProductID>1 </ProductID>
<Name>Adjustable Race</Name>
........
</Production.Product>
</ROOT>
Currently I am getting this with
SELECT * FROM Production.Product
FOR XML auto
Result is:
<ROOT>
<Production.Product ProductID="1" Name="Adjustable Race"
ProductNumber="AR-5381" MakeFlag="0" FinishedGoodsFlag="0"
SafetyStockLevel="1000" ReorderPoint="750" StandardCost="0.0000"
ListPrice="0.0000" DaysToManufacture="0" SellStartDate="1998-06-01T00:00:00"
rowguid="694215B7-08F7-4C0D-ACB1-D734BA44C0C8"
ModifiedDate="2004-03-11T10:01:36.827" />

One simple way would be to use:
SELECT *
FROM Production.Product
FOR XML AUTO, ELEMENTS
Then, your data should be stored in XML elements inside the <Production.Product> node.
If you need even more control, then you should look at the FOR XML PATH syntax - check out this MSDN article on What's new in FOR XML in SQL Server 2005 which explains the FOR XML PATH (among other new features).
Basically, with FOR XML PATH, you can control very easily how things are rendered - as elements or as attributes - something like:
SELECT
ProductID AS '#ProductID', -- rendered as attribute on XML node
Name, ProductNumber, -- all rendered as elements inside XML node
.....
FROM Production.Product
FOR XML PATH('NewProductNode') -- define a new name for the XML node
This would give you something like:
<NewProductNode ProductID="1">
<Name>Adjustabel Race</Name>
<ProductNumber>AR-5381</ProductNumber>
.....
</NewProductNode>

Using SQL Server 2005's XQuery select all nodes with a specific attribute value, or with that attribute missing

Update: giving a much more thorough example.
The first two solutions offered were right along the lines of what I was trying to say not to do. I can't know location, it needs to be able to look at the whole document tree. So a solution along these lines, with /Books/ specified as the context will not work:
SELECT x.query('.') FROM #xml.nodes('/Books/*[not(#ID) or #ID = 5]') x1(x)
Original question with better example:
Using SQL Server 2005's XQuery implementation I need to select all nodes in an XML document, just once each and keeping their original structure, but only if they are missing a particular attribute, or that attribute has a specific value (passed in by parameter). The query also has to work on the whole XML document (descendant-or-self axis) rather than selecting at a predefined depth.
That is to say, each individual node will appear in the resultant document only if it and every one of its ancestors are missing the attribute, or have the attribute with a single specific value.
For example:
If this were the XML:
DECLARE #Xml XML
SET #Xml =
N'
<Library>
<Novels>
<Novel category="1">Novel1</Novel>
<Novel category="2">Novel2</Novel>
<Novel>Novel3</Novel>
<Novel category="4">Novel4</Novel>
</Novels>
<Encyclopedias>
<Encyclopedia>
<Volume>A-F</Volume>
<Volume category="2">G-L</Volume>
<Volume category="3">M-S</Volume>
<Volume category="4">T-Z</Volume>
</Encyclopedia>
</Encyclopedias>
<Dictionaries category="1">
<Dictionary>Webster</Dictionary>
<Dictionary>Oxford</Dictionary>
</Dictionaries>
</Library>
'
A parameter of 1 for category would result in this:
<Library>
<Novels>
<Novel category="1">Novel1</Novel>
<Novel>Novel3</Novel>
</Novels>
<Encyclopedias>
<Encyclopedia>
<Volume>A-F</Volume>
</Encyclopedia>
</Encyclopedias>
<Dictionaries category="1">
<Dictionary>Webster</Dictionary>
<Dictionary>Oxford</Dictionary>
</Dictionaries>
</Library>
A parameter of 2 for category would result in this:
<Library>
<Novels>
<Novel category="2">Novel2</Novel>
<Novel>Novel3</Novel>
</Novels>
<Encyclopedias>
<Encyclopedia>
<Volume>A-F</Volume>
<Volume category="2">G-L</Volume>
</Encyclopedia>
</Encyclopedias>
</Library>
I know XSLT is perfectly suited for this job, but it's not an option. We have to accomplish this entirely in SQL Server 2005. Any implementations not using XQuery are fine too, as long as it can be done entirely in T-SQL.

It's not clear for me from your example what you're actually trying to achieve. Do you want to return a new XML with all the nodes stripped out except those that fulfill the condition? If yes, then this looks like the job for an XSLT transform which I don't think it's built-in in MSSQL 2005 (can be added as a UDF: http://www.topxml.com/rbnews/SQLXML/re-23872_Performing-XSLT-Transforms-on-XML-Data-Stored-in-SQL-Server-2005.aspx).
If you just need to return the list of nodes then you can use this expression:
//Book[not(#ID) or #ID = 5]
but I get the impression that it's not what you need. It would help if you can provide a clearer example.
Edit: This example is indeed more clear. The best that I could find is this:
SET #Xml.modify('delete(//*[#category!=1])')
SELECT #Xml
The idea is to delete from the XML all the nodes that you don't need, so you remain with the original structure and the needed nodes. I tested with your two examples and it produced the wanted result.
However modify has some restrictions - it seems you can't use it in a select statement, it has to modify data in place. If you need to return such data with a select you could use a temporary table in which to copy the original data and then update that table. Something like this:
INSERT INTO #temp VALUES(#Xml)
UPDATE #temp SET data.modify('delete(//*[#category!=2])')
Hope that helps.

The question is not really clear, but is this what you're looking for?
DECLARE #Xml AS XML
SET #Xml =
N'
<Books>
<Book ID="1">Book1</Book>
<Book ID="2">Book2</Book>
<Book ID="3">Book3</Book>
<Book>Book4</Book>
<Book ID="5">Book5</Book>
<Book ID="6">Book6</Book>
<Book>Book7</Book>
<Book ID="8">Book8</Book>
</Books>
'
DECLARE #BookID AS INT
SET #BookID = 5
DECLARE #Result AS XML
SET #result = (SELECT #xml.query('//Book[not(#ID) or #ID = sql:variable("#BookID")]'))
SELECT #result