SQL Server XML Data with CDATA Block - sql-server

I am storing an XML payload in a SQL Server table that has a datatype of 'XML'. The data I am receiving has a section that is enclosed in a CDATA block. Here is an example:
<event>
<projectId>123456</projectId>
<eventTs>2018-01-04T13:07:23</eventTs>
<eventData>
<![CDATA[
<company>
<companyId>849</companyId>
<companyName>My Company Name</companyName>
<activeFlag>Y</activeFlag>
<timestamp>27-JUL-17</timestamp>
</company>
]]>
</eventData>
</event>
When this data lands in my table, in the field that has a data type of 'XML', the 'CDATA' block is stripped out and all of the "<" and ">" characters inside it are escaped. Since those characters are escaped, XPath queries on that data no longer work. Is there any way around this behavior, short of stripping out the CDATA block before the data is inserted/converted to an XML data type?
This is what the data looks like after being inserted into the XML datatype field:
<event>
<projectId>123456</projectId>
<eventTs>2018-01-04T13:07:23</eventTs>
<eventData>
&lt;company&gt;
&lt;companyId&gt;849&lt;/companyId&gt;
&lt;companyName&gt;My Company Name&lt;/companyName&gt;
&lt;activeFlag&gt;Y&lt;/activeFlag&gt;
&lt;timestamp&gt;27-JUL-17&lt;/timestamp&gt;
&lt;/company&gt;
</eventData>
</event>

Storing XML within a CDATA section is a very bad approach. Everything within this block is just text. This text looks like XML, but you cannot query it to return the <companyName>.
Try this:
DECLARE @xml XML=
N'<event>
<projectId>123456</projectId>
<eventTs>2018-01-04T13:07:23</eventTs>
<eventData>
<![CDATA[
<company>
<companyId>849</companyId>
<companyName>My Company Name</companyName>
<activeFlag>Y</activeFlag>
<timestamp>27-JUL-17</timestamp>
</company>
]]>
</eventData>
</event>';
SQL Server's developers decided not to preserve CDATA sections in the XML data type. The markers are removed implicitly, while the content is kept as properly escaped text. But you can read the content without problems:
SELECT @xml.value('(/event/eventData)[1]','nvarchar(max)');
The point is: this result looks like XML, but in order to use it as XML it must be cast.
You could solve it like this:
DECLARE @innerXML XML=(SELECT CAST('<eventData>' + @xml.value('(/event/eventData)[1]','nvarchar(max)') + '</eventData>' AS XML));
SET @xml.modify('delete /event/eventData[1]');
SET @xml.modify('insert sql:variable("@innerXML") as last into /event[1]');
SELECT @xml;
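A minimal check of the point above, using a shortened, made-up payload (the variable names @x and @inner are only for illustration). Before the cast, no <company> element exists below <eventData>; after casting the text to XML, it can be queried normally.

```sql
-- Shortened sample payload; @x and @inner are illustrative names.
DECLARE @x XML =
N'<event><eventData><![CDATA[<company><companyId>849</companyId></company>]]></eventData></event>';

-- The former CDATA content is a single text node, so there is no <company> element.
SELECT @x.exist('/event/eventData/company') AS HasCompanyElement; -- 0

-- Read the text, cast it to XML, then query it like any other XML.
DECLARE @inner XML = CAST(@x.value('(/event/eventData)[1]', 'nvarchar(max)') AS XML);
SELECT @inner.value('(/company/companyId)[1]', 'int') AS CompanyId; -- 849
```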
Easy going:
As long as the incoming XML is a string (before you try to cast it to XML) you can just throw away the CDATA markers:
DECLARE @xmlString NVARCHAR(MAX)=
N'<event>
<projectId>123456</projectId>
<eventTs>2018-01-04T13:07:23</eventTs>
<eventData>
<![CDATA[
<company>
<companyId>849</companyId>
<companyName>My Company Name</companyName>
<activeFlag>Y</activeFlag>
<timestamp>27-JUL-17</timestamp>
</company>
]]>
</eventData>
</event>';
SELECT CAST(REPLACE(REPLACE(@xmlString,'<![CDATA[',''),']]>','') AS XML)
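The string-level replacement assumes that the CDATA content is itself well-formed XML. A hedged sketch, assuming SQL Server 2012 or later, where TRY_CAST is available: TRY_CAST returns NULL instead of raising an error when the conversion fails, so broken payloads can be detected before they are stored. The sample strings and the names @good and @bad are made up for illustration.

```sql
-- @good holds well-formed inner XML; @bad has a mismatched closing tag inside the CDATA block.
DECLARE @good NVARCHAR(MAX) = N'<event><eventData><![CDATA[<company><companyId>849</companyId></company>]]></eventData></event>';
DECLARE @bad  NVARCHAR(MAX) = N'<event><eventData><![CDATA[<company><companyId>849</company>]]></eventData></event>';

-- Remove the CDATA markers on string level, then attempt the conversion.
SELECT TRY_CAST(REPLACE(REPLACE(@good, N'<![CDATA[', N''), N']]>', N'') AS XML) AS GoodPayload,
       TRY_CAST(REPLACE(REPLACE(@bad,  N'<![CDATA[', N''), N']]>', N'') AS XML) AS BadPayload;
```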

Related

Convert VARCHAR(max) column to uppercase then to XML in a single query

I tried searching in StackOverflow with no success.
I would like to perform following two conversions on a VARCHAR(max) column, but in one query. This is T-SQL.
Convert all text in column to uppercase with UPPER() function
Convert column to XML datatype by using CAST(column AS XML) function
I tried the query below, but it returns an error.
SELECT CAST(UPPER(inputText) AS XML) AS ConvertedText
FROM SampleTable
Error returned by SSMS. (When I remove UPPER() the query runs without error.)
namespaces beginning with "xml" are reserved
XML is strictly case-sensitive. It is always dangerous to handle XML with string methods, because XML is not just a string with some fancy extras...
Your XML, judging from the posted error message, includes an XML declaration. I'll also cover namespaces. XML expects the XML declaration and the namespace declarations in lower case. They cannot be upper-cased.
Check this out: I define an XML with a declaration, a default namespace and one more prefixed namespace.
DECLARE @testXML NVARCHAR(MAX)=
N'<?xml version="1.0" encoding="UTF-16"?>
<root xmlns="dummy.default" xmlns:blah="Some.blah.namespace">
<test a="attribute value">element value</test>
<blah:NamespacedElement>value in a namespaced element</blah:NamespacedElement>
</root>';
SELECT UPPER(@testXML);
/*
<?XML VERSION="1.0" ENCODING="UTF-16"?>
<ROOT XMLNS="DUMMY.DEFAULT" XMLNS:BLAH="SOME.BLAH.NAMESPACE">
<TEST A="ATTRIBUTE VALUE">ELEMENT VALUE</TEST>
<BLAH:NAMESPACEDELEMENT>VALUE IN A NAMESPACED ELEMENT</BLAH:NAMESPACEDELEMENT>
</ROOT>
*/
--The declaration is broken, as all of its content is expected to be lower-case. But this is easy: we can cut it away entirely. Within SQL Server this declaration has no meaning. It will be omitted in any case...
--Secondly we have to deal with the xmlns:
SELECT SUBSTRING(REPLACE(UPPER(@testXML),'XMLNS','xmlns'),PATINDEX('%?>%',@testXML)+2,1000000);
/*
<ROOT xmlns="DUMMY.DEFAULT" xmlns:BLAH="SOME.BLAH.NAMESPACE">
<TEST A="ATTRIBUTE VALUE">ELEMENT VALUE</TEST>
<BLAH:NAMESPACEDELEMENT>VALUE IN A NAMESPACED ELEMENT</BLAH:NAMESPACEDELEMENT>
</ROOT>
*/
--You can see that the declaration is gone and the xmlns is lower-case now. And this can be cast to XML:
SELECT CAST(SUBSTRING(REPLACE(UPPER(@testXML),'XMLNS','xmlns'),PATINDEX('%?>%',@testXML)+2,1000000) AS XML)
But - to be honest - if this is not just a weird homework thing, you should never change the casing of XML for the whole thing (including the mark-up).
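If only the values need to be upper-case, one alternative is to leave the markup alone, cast the original string to XML, and apply UPPER() to the extracted values. A minimal sketch with a made-up sample document (the name @doc is only for illustration):

```sql
-- Small sample document without declaration or namespaces.
DECLARE @doc XML = N'<root><test a="attribute value">element value</test></root>';

-- The markup keeps its casing; only the extracted values are upper-cased.
SELECT UPPER(@doc.value('(/root/test/text())[1]', 'nvarchar(100)')) AS ElementValue,
       UPPER(@doc.value('(/root/test/@a)[1]', 'nvarchar(100)'))     AS AttributeValue;
-- ELEMENT VALUE | ATTRIBUTE VALUE
```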

Retrieve an element value from an XML string during an SQL select request

I'm querying a table T which has a string column StrXML that has XML text stored in it. Here's an example of the XML stored:
<Sequence mc:Ignorable="sap sads" DisplayName="Post Processing"
sap:VirtualizedContainerService.HintSize="424,318"
mva:VisualBasic.Settings="Assembly references and imported namespaces serialized as XML namespaces"
xmlns="http://schemas.microsoft.com/netfx/2009/xaml/activities"
xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006
xmlns:mee="clr-namespace:MatX.eRP.Entities;assembly=eRP.Entities"
xmlns:mepa="clr-namespace:MatX.eRP.PostProcessing.Activities;assembly=PostProcessing.Activities"
xmlns:mva="clr-namespace:Microsoft.VisualBasic.Activities;assembly=System.Activities"
xmlns:sads="http://schemas.microsoft.com/netfx/2010/xaml/activities/debugger"
xmlns:sap="http://schemas.microsoft.com/netfx/2009/xaml/activities/presentation"
xmlns:scg="clr-namespace:System.Collections.Generic;assembly=mscorlib"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml">
<sap:WorkflowViewStateService.ViewState>
<scg:Dictionary x:TypeArguments="x:String, x:Object">
<x:Boolean x:Key="IsExpanded">True</x:Boolean>
</scg:Dictionary>
</sap:WorkflowViewStateService.ViewState>
<mepa:BasicOperation Description="Traitement Thermique" DisplayName="HeatTreatment" Guid="82800b59-e181-4a93-b483-7e2cd9b14827" sap:VirtualizedContainerService.HintSize="402,154" Scope="Build">
<mepa:BasicOperation.MeasurementDescriptions>
<scg:List x:TypeArguments="mee:MeasurementDescription" Capacity="0" />
</mepa:BasicOperation.MeasurementDescriptions>
</mepa:BasicOperation>
<mepa:BasicOperation Description="Finition manuelle" DisplayName="Manual Finishing" Guid="cd64be75-6968-47fe-8aac-93a4fdf37892">
<mepa:BasicOperation.MeasurementDescriptions>
<scg:List x:TypeArguments="mee:MeasurementDescription" Capacity="4">
<mee:MeasurementDescription Max="{x:Null}" Min="{x:Null}" Guid="7c1a37f1-f39d-4ed3-8048-6b0a266c70b9" IsRequired="False" Name="MesureMF1" Type="Double" />
<mee:MeasurementDescription Max="{x:Null}" Min="{x:Null}" Guid="a21b0c0d-dfff-4237-9975-4179bcefe7c2" IsRequired="False" Name="MesureMF2" Type="Double" />
</scg:List>
</mepa:BasicOperation.MeasurementDescriptions>
</mepa:BasicOperation>
</Sequence>
In my select request on table T, I want to only show the Description value for which the Guid="82800b59-e181-4a93-b483-7e2cd9b14827".
How can I do that?
As I already mentioned in a comment, one of your namespace declarations is missing its closing ". This is a big problem if it's not just a copy-and-paste issue (the XML is not well-formed).
XML should not be stored in a string column (slow and dangerous!). If your database does not support XML natively, the XML should at least be validated.
You did not mention the actual RDBMS, but the XQuery principles should be the same (however your RDBMS actually deals with XQuery).
The simple approach is this XQuery (fetch any <BasicOperation>, wherever it is placed, and filter for the given GUID)
//*:BasicOperation[@Guid="82800b59-e181-4a93-b483-7e2cd9b14827"]/@Description
With SQL Server you can try this:
SELECT CAST(T.StrXML AS XML).value(N'(//*:BasicOperation[@Guid="82800b59-e181-4a93-b483-7e2cd9b14827"]/@Description)[1]',N'nvarchar(max)')
FROM T;
The more specific (and recommended) approach is this:
declare namespace dflt="http://schemas.microsoft.com/netfx/2009/xaml/activities";
declare namespace mepa="clr-namespace:MatX.eRP.PostProcessing.Activities;assembly=PostProcessing.Activities";
dflt:Sequence/mepa:BasicOperation[@Guid="82800b59-e181-4a93-b483-7e2cd9b14827"]/@Description
Again, with SQL Server you might try this:
SELECT CAST(T.StrXML AS XML).value(N'declare namespace dflt="http://schemas.microsoft.com/netfx/2009/xaml/activities";
declare namespace mepa="clr-namespace:MatX.eRP.PostProcessing.Activities;assembly=PostProcessing.Activities";
(dflt:Sequence/mepa:BasicOperation[@Guid="82800b59-e181-4a93-b483-7e2cd9b14827"]/@Description)[1]',N'nvarchar(max)')
FROM T;
If the GUID-value is variable SQL-Server would allow you to pass the value in from a variable declared outside. Read about sql:variable() and sql:column().
UPDATE
You can use lower-case() to get a case-insensitive comparison:
DECLARE @xml XML=
'<root>
<a guid="82800b59-e181-4a93-b483-7e2cd9b14827" />
<a guid="82800B59-E181-4A93-B483-7E2CD9B14827" />
</root>';
DECLARE @guid UNIQUEIDENTIFIER='82800B59-E181-4A93-B483-7E2CD9B14827';
SELECT @xml.query(N'/root/a[lower-case(@guid)=lower-case(sql:variable("@guid"))]')
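Combining the namespace declarations with sql:variable() might look like the following sketch. It uses a cut-down, well-formed version of the question's XML held in a variable; the names @doc and @searchGuid are only for illustration.

```sql
-- Value to search for, passed in from outside the XQuery expression.
DECLARE @searchGuid NVARCHAR(36) = N'82800b59-e181-4a93-b483-7e2cd9b14827';

-- Cut-down sample with the same default namespace and mepa prefix as in the question.
DECLARE @doc XML =
N'<Sequence xmlns="http://schemas.microsoft.com/netfx/2009/xaml/activities"
            xmlns:mepa="clr-namespace:MatX.eRP.PostProcessing.Activities;assembly=PostProcessing.Activities">
  <mepa:BasicOperation Description="Traitement Thermique" Guid="82800b59-e181-4a93-b483-7e2cd9b14827" />
  <mepa:BasicOperation Description="Finition manuelle" Guid="cd64be75-6968-47fe-8aac-93a4fdf37892" />
</Sequence>';

-- The GUID is no longer hard-coded inside the XQuery string.
SELECT @doc.value(N'declare namespace dflt="http://schemas.microsoft.com/netfx/2009/xaml/activities";
declare namespace mepa="clr-namespace:MatX.eRP.PostProcessing.Activities;assembly=PostProcessing.Activities";
(/dflt:Sequence/mepa:BasicOperation[@Guid=sql:variable("@searchGuid")]/@Description)[1]', N'nvarchar(max)') AS Description;
-- Traitement Thermique
```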
Try something like this, assuming this is for SQL Server:
;WITH XMLNAMESPACES(DEFAULT 'http://schemas.microsoft.com/netfx/2009/xaml/activities',
'clr-namespace:MatX.eRP.PostProcessing.Activities;assembly=PostProcessing.Activities' AS mepa)
SELECT
T.X.value('@Description', 'varchar(100)') AS JobTitle
FROM
#XTable
CROSS APPLY
XmlData.nodes('/Sequence/mepa:BasicOperation') AS T(X)
WHERE
T.X.value('@Guid','varchar(50)') = '82800b59-e181-4a93-b483-7e2cd9b14827'

How do I convert SQL to XML?

I am trying to output SQL as XML to match the exact format as the following
<?xml version="1.0" encoding="utf-8"?>
<ProrateImport xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/20151009">
<Prorates>
<Prorate>
<OrderTypeId>1</OrderTypeId>
<DeliveryDate>2015-10-12T00:00:00+02:00</DeliveryDate>
<DivNo>632</DivNo>
<ProrateUnit>1</ProrateUnit>
<ProrateProducts>
<ProrateProduct ProductCode="8467">
<ProrateItems>
<ProrateItem StoreNo="1">
<Quantity>5</Quantity>
</ProrateItem>
<ProrateItem StoreNo="2">
<Quantity>5</Quantity>
</ProrateItem>
<ProrateItem StoreNo="3">
<Quantity>5</Quantity>
</ProrateItem>
</ProrateItems>
</ProrateProduct>
</ProrateProducts>
</Prorate>
</Prorates>
</ProrateImport>
Here is my query:
SELECT
OrderTypeID,
DeliveryDate, DivNo,
ProrateUnit,
(SELECT
ProductOrder [@ProductCode],
(SELECT
ProrateItem [@StoreNo],
CAST(Quantity AS INT) [Quantity]
FROM
##Result2 T3
WHERE
T3.DivNo = T2.DivNo
AND T3.DivNo = T1.DivNo
AND T3.DeliveryDate = T2.DeliveryDate
AND T3.DeliveryDate = T1.DeliveryDate
AND T3.ProductOrder = t2.ProductOrder
FOR XML PATH('ProrateItem'), TYPE, ROOT('ProrateItems')
)
FROM
##Result2 T2
WHERE
T2.DivNo = T1.DivNo
AND T2.DeliveryDate = T1.DeliveryDate
FOR XML PATH('ProrateProduct'), TYPE, ROOT('ProrateProducts')
)
FROM
##Result2 T1
GROUP BY
OrderTypeID, DeliveryDate, DivNo, ProrateUnit
FOR XML PATH('Prorate'), TYPE, ROOT('Prorates')
How do I add the following header, and have the date in ProrateImport/20151009 change to the current date?
<?xml version="1.0" encoding="utf-8"?>
<ProrateImport xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/20151009">
This is the first time I have used XML.
I'm not sure I understand. Did you create the first XML yourself, and do you just need to add the last part?
DECLARE @XMLHEADER nvarchar(max)
SET @XMLHEADER = '<?xml version="1.0" encoding="utf-8"?>
<ProrateImport xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/'+convert(varchar(8),getdate(),112)+'"
>'
select @XMLHEADER
And then you just need to add the rest of your output from your select statement.
There are several problems:
How to introduce namespaces?
How to introduce namespaces dynamically
How to add a <?xml ?> directive
two-leveled root (<ProrateImport><Prorate>)
namespaces
You have to use WITH XMLNAMESPACES to introduce a namespace into your query.
Hint: the naked xmlns is introduced by DEFAULT, the xsi namespace will be introduced automatically by using ELEMENTS XSINIL:
WITH XMLNAMESPACES('http://www.w3.org/2001/XMLSchema' AS xsd
,DEFAULT 'http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/20151009')
SELECT 1 AS Dummy
FOR XML PATH('rowElement'), ELEMENTS XSINIL, ROOT('root')
The result
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/20151009"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<rowElement>
<Dummy>1</Dummy>
</rowElement>
</root>
Note: The namespaces must be stated literally. No computations, no variables!
dynamic namespaces
This is - out of the box - impossible. But you might use dynamically created SQL and use EXEC to get your result. Just create exactly the statement as above
DECLARE @cmd VARCHAR(MAX)=
'
WITH XMLNAMESPACES(''http://www.w3.org/2001/XMLSchema'' AS xsd
,DEFAULT ''http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/' + CONVERT(VARCHAR(8),GETDATE(),112) + ''')
SELECT 1 AS Dummy
FOR XML PATH(''rowElement''), ELEMENTS XSINIL, ROOT(''root'')';
PRINT @cmd
EXEC(@cmd);
the result
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/20171019"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<rowElement>
<Dummy>1</Dummy>
</rowElement>
</root>
directive
The directive cannot be introduced into XML. SQL-Server will omit any <?xml ?> directive! This can only be done on string level:
DECLARE @cmd VARCHAR(MAX)=
'
WITH XMLNAMESPACES(''http://www.w3.org/2001/XMLSchema'' AS xsd
,DEFAULT ''http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/' + CONVERT(VARCHAR(8),GETDATE(),112) + ''')
SELECT(
SELECT 1 AS Dummy
FOR XML PATH(''rowElement''), ELEMENTS XSINIL, ROOT(''root'')) AS MyResult';
CREATE TABLE #resultTable(MyXmlAsString VARCHAR(MAX))
INSERT INTO #resultTable(MyXmlAsString)
EXEC(@cmd);
SELECT '<?xml version="1.0" encoding="utf-8"?>' + MyXmlAsString
FROM #resultTable;
The result
<?xml version="1.0" encoding="utf-8"?>
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/20171019"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<rowElement>
<Dummy>1</Dummy>
</rowElement>
</root>
two-leveled root
You can nest two FOR XML statements to achieve this:
WITH XMLNAMESPACES(DEFAULT 'blah')
SELECT
(
SELECT 1 AS Dummy
FOR XML PATH('rowElement'),ROOT('innerRoot'),TYPE
)
FOR XML PATH('outerRoot');
But the annoying part is, that namespaces are introduced by each sub-select over and over. Not wrong but very annoying! A well known Microsoft connect issue. Please sign in and vote for it! The result:
<outerRoot xmlns="blah">
<innerRoot xmlns="blah"> <!--Here's the second xmlns! -->
<rowElement>
<Dummy>1</Dummy>
</rowElement>
</innerRoot>
</outerRoot>
Your solution
Having explained all this, I'd suggest creating the XML without any namespace or declaration (which you are doing already!), then converting the result to NVARCHAR(MAX) and adding the header and the closing footer on string level. This is ugly, but in your case the only way.
Hint: You will not be able to store the final result in a native XML type in SQL Server without losing the directive.
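The string-level assembly suggested above could be sketched as follows. The inner SELECT is only a stand-in for the real nested FOR XML query, and the names @body and @result are made up for illustration.

```sql
-- Without TYPE, a FOR XML subquery returns its result as a string.
DECLARE @body NVARCHAR(MAX) =
(
    SELECT 1 AS OrderTypeId
    FOR XML PATH('Prorate'), ROOT('Prorates')
);

-- Add the declaration, the dynamically dated root element and the closing tag on string level.
DECLARE @result NVARCHAR(MAX) =
      N'<?xml version="1.0" encoding="utf-8"?>'
    + N'<ProrateImport xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'
    + N' xmlns:xsd="http://www.w3.org/2001/XMLSchema"'
    + N' xmlns="http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/'
    + CONVERT(NVARCHAR(8), GETDATE(), 112) + N'">'
    + @body
    + N'</ProrateImport>';

SELECT @result AS FinalXml;
```

The result stays a string on purpose, in line with the hint about the native XML type.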

Parsing XML with XMLNS in SQL Server 2008 R2

I'm fairly new to querying XML datatypes. We receive XMLs from partners and one such partner sends us XMLs like this:
DECLARE @ResultData XML = '<outGoing xmlns="urn:testsystems-com:HH.2015.Services.Telephony.OutGoing">
<customer>
<ID>158</ID>
</customer>
</outGoing>'
In this example, I would like to pull only the ID out of the XML, but it seems the xmlns is preventing me from getting anything inside the XML:
SELECT cust.value('(ID)[1]', 'VARCHAR(40)') as 'CustomerID'
FROM @ResultData.nodes('/outGoing/customer') as t(cust)
This returns NULL, but if I manually remove the xmlns from the XML I get 158.
I've experimented with WITH XMLNAMESPACES to see if I could use that, but I'm obviously missing something. Since these XMLs will be coming in automatically, I would like to be able to parse the XML, but right now I'm stuck.
That should work:
DECLARE @ResultData XML = '<outGoing xmlns="urn:testsystems-com:HH.2015.Services.Telephony.OutGoing">
<customer>
<ID>158</ID>
</customer>
</outGoing>'
;WITH XMLNAMESPACES(DEFAULT 'urn:testsystems-com:HH.2015.Services.Telephony.OutGoing')
SELECT
@ResultData.value('(/outGoing/customer/ID)[1]', 'int')
or to use your approach:
;WITH XMLNAMESPACES(DEFAULT 'urn:testsystems-com:HH.2015.Services.Telephony.OutGoing')
SELECT
CustomerID = cust.value('(ID)[1]', 'INT')
FROM
@ResultData.nodes('/outGoing/customer') as t(cust)
This will return 158 as its value.
I've used WITH XMLNAMESPACES(DEFAULT .....) since this is the only XML namespace in play, and it's defined at the top-level node - so it applies to every node in the XML structure.
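Two alternatives to the DEFAULT declaration, as a minimal sketch on the same sample document: binding the namespace to a prefix (the prefix ns is an arbitrary choice), or using the *: wildcard, which matches the element name in any namespace.

```sql
DECLARE @ResultData XML = N'<outGoing xmlns="urn:testsystems-com:HH.2015.Services.Telephony.OutGoing">
<customer>
<ID>158</ID>
</customer>
</outGoing>';

-- Variant 1: bind the namespace to a prefix and use it on every step.
WITH XMLNAMESPACES('urn:testsystems-com:HH.2015.Services.Telephony.OutGoing' AS ns)
SELECT @ResultData.value('(/ns:outGoing/ns:customer/ns:ID)[1]', 'int') AS CustomerID; -- 158

-- Variant 2: namespace wildcard; no declaration needed, but less specific.
SELECT @ResultData.value('(/*:outGoing/*:customer/*:ID)[1]', 'int') AS CustomerID; -- 158
```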

SQL - Read an XML node from a table field

I am using SQL Server 2008. I have a field called RequestParameters with XML data in one of my SQL tables, called Requests. An example would be:
<RequestParameters xmlns="http://schemas.datacontract.org/2004/07/My.Name.Space" xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns:z="http://schemas.microsoft.com/2003/10/Serialization/" z:Id="1">
<Data z:Id="2" i:type="CheckoutRequest">
<UserGuid>7ec38c44-5aa6-49e6-9fc7-25e9028f2148</UserGuid>
<DefaultData i:nil="true" />
</Data>
</RequestParameters>
I ultimately want to retrieve the value of UserGuid. For that, I am doing this:
SELECT RequestParameters.value('(/RequestParameters/Data/UserGuid)[0]', 'uniqueidentifier') as UserGuid
FROM Requests
However, the results I am seeing are all NULL. What am I doing wrong?
You have to specify the default namespace and use [1] instead of [0].
WITH XMLNAMESPACES(default 'http://schemas.datacontract.org/2004/07/My.Name.Space')
SELECT RequestParameters.value('(/RequestParameters/Data/UserGuid)[1]', 'uniqueidentifier') as UserGuid
FROM Requests;
SQL Fiddle
declare @XML xml
set @XML = '<RequestParameters xmlns="http://schemas.datacontract.org/2004/07/My.Name.Space" xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns:z="http://schemas.microsoft.com/2003/10/Serialization/" z:Id="1">
<Data z:Id="2" i:type="CheckoutRequest">
<UserGuid>7ec38c44-5aa6-49e6-9fc7-25e9028f2148</UserGuid>
<DefaultData i:nil="true" />
</Data>
</RequestParameters>'
;with xmlnamespaces(default 'http://schemas.datacontract.org/2004/07/My.Name.Space')
select @XML.value('(/RequestParameters/Data/UserGuid)[1]', 'varchar(36)')

Resources