Delete empty XML nodes using T-SQL FOR XML PATH - sql-server

I'm using FOR XML PATH to construct XML out of a table in SQL Server 2008R2. The XML has to be constructed as follows:
<Root>
<OuterElement>
<NumberNode>1</NumberNode>
<FormattedNumberNode>0001</FormattedNumberNode>
<InnerContainerElement>
<InnerNodeOne>0240</InnerNodeOne>
<InnerNodeStartDate>201201</InnerNodeStartDate>
</InnerContainerElement>
</OuterElement>
</Root>
According to the schema files, the InnerContainerElement is optional, while the InnerNodeOne inside it is required. I did not write the schema files; they are quite complex, refer to each other, and have no explicit XSD namespaces, so I can't easily load them into the database.
The XML has to be created from a table, which is filled using the following query:
SELECT
1 AS NumberNode
, '0001' AS [FormattedNumberNode]
, '0240' AS [InnerNodeOne]
, '201201' AS [InnerNodeStartDate]
INTO #temporaryXMLStore
UNION
SELECT
2 AS NumberNode
, '0001' AS [FormattedNumberNode]
, NULL AS [InnerNodeOne]
, NULL AS [InnerNodeStartDate]
I can think of two ways to construct this XML with FOR XML PATH.
1) Using 'InnerContainerElement' as named result from an XML subquery:
SELECT
NumberNode
, [FormattedNumberNode]
, (
SELECT
[InnerNodeOne]
, [InnerNodeStartDate]
FOR XML PATH(''), TYPE
) AS [InnerContainerElement]
FROM #temporaryXMLStore
FOR XML PATH('OuterElement'), ROOT('Root'), TYPE
2) Using 'InnerContainerElement' as an output element from an XML subquery, but without naming it:
SELECT
NumberNode
, [FormattedNumberNode]
, (
SELECT
[InnerNodeOne]
, [InnerNodeStartDate]
FOR XML PATH('InnerContainerElement'), TYPE
)
FROM #temporaryXMLStore
FOR XML PATH('OuterElement'), ROOT('Root'), TYPE
However, neither of them gives the desired result: in both cases, the result looks like
<Root>
<OuterElement>
<NumberNode>1</NumberNode>
<FormattedNumberNode>0001</FormattedNumberNode>
<InnerContainerElement>
<InnerNodeOne>0240</InnerNodeOne>
<InnerNodeStartDate>201201</InnerNodeStartDate>
</InnerContainerElement>
</OuterElement>
<OuterElement>
<NumberNode>2</NumberNode>
<FormattedNumberNode>0001</FormattedNumberNode>
<InnerContainerElement></InnerContainerElement>
<!-- Or, when using the second codeblock: <InnerContainerElement /> -->
</OuterElement>
</Root>
Whenever InnerContainerElement is empty, it is still displayed as an empty element. This is invalid according to the schema: whenever the element InnerContainerElement is in the XML, InnerNodeOne is required too.
How do I construct my FOR XML PATH query in such a way that the InnerContainerElement is left out whenever it's empty?

You need to make sure that the InnerContainerElement has zero rows for the case when there is no content.
select T.NumberNode,
T.FormattedNumberNode,
(
select T.InnerNodeOne,
T.InnerNodeStartDate
where T.InnerNodeOne is not null or
T.InnerNodeStartDate is not null
for xml path('InnerContainerElement'), type
)
from #temporaryXMLStore as T
for xml path('OuterElement'), root('Root')
Or you could specify the element InnerContainerElement as a part of a column alias.
select T.NumberNode,
T.FormattedNumberNode,
T.InnerNodeOne as 'InnerContainerElement/InnerNodeOne',
T.InnerNodeStartDate as 'InnerContainerElement/InnerNodeStartDate'
from #temporaryXMLStore as T
for xml path('OuterElement'), root('Root')
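Both variants rely on the same behavior: FOR XML PATH leaves out a column whose value is NULL, and a container built only from NULL columns (or from a subquery returning zero rows) is not generated at all. A minimal self-contained sketch of the column-alias variant follows; the table variable @t and its column types are assumptions made for illustration and merely stand in for #temporaryXMLStore.

```sql
-- Hypothetical stand-in for #temporaryXMLStore (column types are assumed).
DECLARE @t TABLE (
    NumberNode          int,
    FormattedNumberNode char(4),
    InnerNodeOne        char(4) NULL,
    InnerNodeStartDate  char(6) NULL
);

INSERT INTO @t VALUES
    (1, '0001', '0240', '201201'),
    (2, '0001', NULL,   NULL);

-- Build the XML with the container named in the column aliases.
DECLARE @x xml = (
    SELECT T.NumberNode,
           T.FormattedNumberNode,
           T.InnerNodeOne       AS 'InnerContainerElement/InnerNodeOne',
           T.InnerNodeStartDate AS 'InnerContainerElement/InnerNodeStartDate'
    FROM @t AS T
    ORDER BY T.NumberNode
    FOR XML PATH('OuterElement'), ROOT('Root'), TYPE
);

-- Count the elements: two OuterElement nodes are expected, but only one
-- InnerContainerElement, because row 2 has nothing to put inside it.
SELECT @x.value('count(/Root/OuterElement)', 'int')                       AS OuterElements,
       @x.value('count(/Root/OuterElement/InnerContainerElement)', 'int') AS InnerContainers;
```

Note that with the alias variant the container still appears when only one of the two inner columns is NULL; if the schema requires InnerNodeOne whenever the container exists, the WHERE clause of the subquery variant can be tightened to test InnerNodeOne alone.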

Related

Can't figure out how to search XML column in my table

I have a table called v_EpisodeAudit, with a column called EventData that contains XML data. The XML data differs from row to row, so one record could have XML data in this column that looks like this:
<AddMDMDocument>
<EpisodeMDMId>282521</EpisodeMDMId>
<OncologyReferral>0</OncologyReferral>
<SpecialPalliativeReferral>0</SpecialPalliativeReferral>
<SurgeonReferral>0</SurgeonReferral>
<MDMReport>0</MDMReport>
<GPReferral>0</GPReferral>
<GPReferralApproval>0</GPReferralApproval>
<GeneralPalliativeCare>0</GeneralPalliativeCare>
<AuditLogin>mkell010</AuditLogin>
<AuditTrust>4</AuditTrust>
<Error />
</AddMDMDocument>
while another row might contain the following XML data:
<CloseEpisode>
<EpisodeId>652503</EpisodeId>
<TrackingStatusId>9</TrackingStatusId>
<TrackingClosureReason>100</TrackingClosureReason>
<DateOfTrackingClosure>Sep 25 2017 12:37PM</DateOfTrackingClosure>
<AuditLogin>ccass001</AuditLogin>
<AuditTrust>1</AuditTrust>
<Error />
</CloseEpisode>
And there are further differing types/configurations of XML data. I've read about 20 different sources this morning trying to work out how to search against the XML data in this column to get a specific EpisodeId in the CloseEpisode XMLs, and I can't for the life of me figure it out. Can anyone help me with a query that will find a specified EpisodeId in this column?
XML can be queried very generically. Some approaches:
DECLARE @v_EpisodeAudit TABLE(ID INT IDENTITY, [EventData] XML);
INSERT INTO @v_EpisodeAudit VALUES
(N'<AddMDMDocument>
<EpisodeMDMId>282521</EpisodeMDMId>
<OncologyReferral>0</OncologyReferral>
<SpecialPalliativeReferral>0</SpecialPalliativeReferral>
<SurgeonReferral>0</SurgeonReferral>
<MDMReport>0</MDMReport>
<GPReferral>0</GPReferral>
<GPReferralApproval>0</GPReferralApproval>
<GeneralPalliativeCare>0</GeneralPalliativeCare>
<AuditLogin>mkell010</AuditLogin>
<AuditTrust>4</AuditTrust>
<Error />
</AddMDMDocument>')
,(N'<CloseEpisode>
<EpisodeId>652503</EpisodeId>
<TrackingStatusId>9</TrackingStatusId>
<TrackingClosureReason>100</TrackingClosureReason>
<DateOfTrackingClosure>Sep 25 2017 12:37PM</DateOfTrackingClosure>
<AuditLogin>ccass001</AuditLogin>
<AuditTrust>1</AuditTrust>
<Error />
</CloseEpisode>');
--This will return the very first node on the second level
SELECT ID
,vEA.[EventData].value(N'local-name(/*[1]/*[1])',N'nvarchar(max)') AS NodeName
,vEA.[EventData].value(N'/*[1]/*[1]/text()[1]',N'nvarchar(max)') AS NodeValue
FROM @v_EpisodeAudit AS vEA
--This will return all nodes of the second level and use WHERE with LIKE to find the Episode..Id elements
SELECT ID
,SecondLevelNode.Nd.value(N'local-name(.)',N'nvarchar(max)') AS NodeName
,SecondLevelNode.Nd.value(N'text()[1]',N'nvarchar(max)') AS NodeValue
FROM @v_EpisodeAudit AS vEA
OUTER APPLY vEA.[EventData].nodes(N'/*/*') AS SecondLevelNode(Nd)
WHERE SecondLevelNode.Nd.value(N'local-name(.)',N'nvarchar(max)') LIKE 'Episode%' --or LIKE 'Episode%Id'
--Similar but filtering on XQuery level
SELECT ID
,SecondLevelNode.Nd.value(N'local-name(.)',N'nvarchar(max)') AS NodeName
,SecondLevelNode.Nd.value(N'text()[1]',N'nvarchar(max)') AS NodeValue
FROM @v_EpisodeAudit AS vEA
OUTER APPLY vEA.[EventData].nodes(N'/*/*[substring(local-name(),1,7)="Episode"]') AS SecondLevelNode(Nd)
Use the xml querying functions
select EventData.value('(/CloseEpisode/EpisodeId)[1]','int')
from v_EpisodeAudit
where EventData.value('local-name(/*[1])','varchar(100)')='CloseEpisode'
or perhaps
select EventData
from v_EpisodeAudit
where EventData.value('(/CloseEpisode/EpisodeId)[1]','int')=652503
depending on what you're trying to do.
If you don't know the root node name, you could use
select EventData.value('(//EpisodeId)[1]','int')
from v_EpisodeAudit
where EventData.exist('//EpisodeId')=1
See https://learn.microsoft.com/en-us/sql/t-sql/xml/value-method-xml-data-type
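When the EpisodeId to look for arrives as a parameter, the comparison can also be pushed into the XQuery itself with exist() and sql:variable(). The following is only a sketch under assumptions: the table variable mirrors the sample data in shortened form, and @EpisodeId is a hypothetical parameter name.

```sql
-- Shortened sample data; the real table is v_EpisodeAudit.
DECLARE @v_EpisodeAudit TABLE (ID int IDENTITY, EventData xml);

INSERT INTO @v_EpisodeAudit (EventData) VALUES
    (N'<AddMDMDocument><EpisodeMDMId>282521</EpisodeMDMId></AddMDMDocument>'),
    (N'<CloseEpisode><EpisodeId>652503</EpisodeId><TrackingStatusId>9</TrackingStatusId></CloseEpisode>');

-- Hypothetical parameter holding the id to search for.
DECLARE @EpisodeId int = 652503;

-- exist() returns 1 only for rows whose XML has a CloseEpisode root
-- with an EpisodeId child equal to the variable.
SELECT ID, EventData
FROM @v_EpisodeAudit
WHERE EventData.exist(N'/CloseEpisode/EpisodeId[. = sql:variable("@EpisodeId")]') = 1;
```

Using exist() in the WHERE clause avoids extracting a value from every row first, and rows with a different root element simply do not match.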

Root element with header and detail rows

I use a for xml select to produce an xml file.
I want to have two root nodes that contain some header tags and then detail rows that are a result from a query on a table.
Example:
<Root>
<FileHeader>
HEADER ROWS
</FileHeader>
<Jobs>
<Message xmlns="url">
<Header Destination="1" xmlns="url"/>
<Body>
<ListItem xmlns="url">
DETAIL ROWS FROM SELECT
</ListItem>
</Body>
</Message>
</Jobs>
</Root>
The query I am using to try to produce this is:
WITH XMLNAMESPACES('url')
SELECT(
SELECT
HEADER ROWS
FOR XML PATH('FileHeader'),
TYPE),
(SELECT
'1' AS 'Message/Header/@Destination',
'url' AS 'Message/Header/@xmlns'
FOR XML PATH(''),
TYPE),
(SELECT
DETAIL ROWS FROM SELECT
FROM MY_TABLE
FOR XML PATH('Jobs'),ROOT('Body'),
TYPE )
FOR XML PATH ('Root')
MY_TABLE and its data are irrelevant, as all tags inside the final select are correct and validate against the XSD schema.
The FileHeader and Header tags are populated with values given from variables, so no tables are used there.
I am missing something on the middle part of the query (the second select). With my way, I can't have the Header tag inside the Jobs/Body path.
What is more, I cannot fill in the xmlns value. I even tried the following, which I found on some forums, and still can't manage to produce a well-formed tag with the xmlns attribute.
;WITH XMLNAMESPACES ('url' as xmlns)
Thank you!
Some things to state first:
It is not allowed to add the namespace declaration xmlns like any other attribute.
It is possible to add this default namespace with WITH XMLNAMESPACES. But, in the case of sub-queries, this namespace will be inserted repeatedly. This is not wrong, but it is annoying: it can bloat your XML and make it fairly hard to read...
You can create the nested XML ad hoc (inlined), or prepare it first and insert it into the final query. This allows you to add a default namespace at a deeper level only.
It's not absolutely clear to me, what you are really looking for, but this might point you in the right direction:
DECLARE @HeaderData TABLE(SomeValue INT,SomeText VARCHAR(100));
DECLARE @DetailData TABLE(DetailsID INT,DetailText VARCHAR(100));
INSERT INTO @HeaderData VALUES
(100,'Value 100')
,(200,'Value 200');
INSERT INTO @DetailData VALUES
(1,'Detail 1')
,(2,'Detail 2');
DECLARE @BodyWithNamespace XML;
WITH XMLNAMESPACES(DEFAULT 'SomeURL_for_Body_default_ns')
SELECT @BodyWithNamespace=
(
SELECT *
FROM @DetailData AS dd
FOR XML PATH('DetailRow'),ROOT('ListItem'),TYPE
);
SELECT(
SELECT *
FROM @HeaderData AS hd
FOR XML PATH('HeaderRow'),ROOT('FileHeader'),TYPE
)
,
(
SELECT 1 AS [Header/@Destination]
,@BodyWithNamespace
FOR XML PATH('Message'),ROOT('Jobs'),TYPE
)
FOR XML PATH ('Root')
The result
<Root>
<FileHeader>
<HeaderRow>
<SomeValue>100</SomeValue>
<SomeText>Value 100</SomeText>
</HeaderRow>
<HeaderRow>
<SomeValue>200</SomeValue>
<SomeText>Value 200</SomeText>
</HeaderRow>
</FileHeader>
<Jobs>
<Message>
<Header Destination="1" />
<ListItem xmlns="SomeURL_for_Body_default_ns">
<DetailRow>
<DetailsID>1</DetailsID>
<DetailText>Detail 1</DetailText>
</DetailRow>
<DetailRow>
<DetailsID>2</DetailsID>
<DetailText>Detail 2</DetailText>
</DetailRow>
</ListItem>
</Message>
</Jobs>
</Root>
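The repeated-namespace effect mentioned in the answer can be observed with a very small query. This is a sketch only; the namespace URI 'urn:example' and the element names are made up for illustration.

```sql
-- A default namespace declared once for the whole statement...
WITH XMLNAMESPACES (DEFAULT 'urn:example')
SELECT 1 AS A,
       (
           -- ...also applies to this nested FOR XML, so the inner element
           -- is serialized with its own xmlns declaration as well.
           SELECT 2 AS B
           FOR XML PATH('Inner'), TYPE
       )
FOR XML PATH('Outer');
```

As far as I know, the xmlns="urn:example" declaration then shows up on both Outer and Inner. Preparing the namespaced fragment in a separate statement, as done with the XML variable in the answer, keeps the declaration on that fragment only.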

How to retrieve XML that contains CDATA, from Database

I need to retrieve XML in the following format
<mv>
<v>!CDATA[[some_inner_xml_1]]</v>
<v>!CDATA[[some_inner_xml_2]]</v>
</mv>
I just learned that data in <v /> will be some other XML. When I thought that data will be an integer, I wrote this and it worked
select IdentifierText as 'v' from ipmruntime.RecordsToExport where BatchID = 5 for xml path(''), Root('mv')
I was trying to use syntax 'v!cdata' - it doesn't like it. I don't know where to stick CDATA in it
I tried another syntax
SELECT
1 AS Tag,
null AS Parent,
IdentifierText as 'mv!1!v!cdata'
FROM ipmruntime.RecordsToExport
where BatchID = 5
FOR XML EXPLICIT, root('mv')
It results in almost what I need
<mv><mv><v><![CDATA[47f81be4-b54f-4703-840b-62b306c40842]]></v></mv><mv><v><![CDATA[3ba36a1f-bf75-4ed9-911e-26f10fba5587]]></v></mv></mv>
Or, if I use 'v!1' in the same query, it will give me <mv><v></v><v></v></mv>, but then where does the CDATA go?
But this has each <v> wrapped into <mv>. Obviously, I am not great with XML/SqlServer combo...
You can do it this way:
select
1 as Tag,
null as Parent,
IdentifierText as [v!1!!CDATA] --[tag name!tag type!tag attribute!other optional setting]
from ipmruntime.RecordsToExport
where BatchID = 5
for xml explicit, root('mv')
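To try this without the real table, here is a self-contained sketch. The table variable and its sample rows are assumptions standing in for ipmruntime.RecordsToExport; the inner XML strings are invented.

```sql
-- Hypothetical stand-in for ipmruntime.RecordsToExport.
DECLARE @RecordsToExport TABLE (BatchID int, IdentifierText nvarchar(200));

INSERT INTO @RecordsToExport VALUES
    (5, N'<inner id="1"/>'),
    (5, N'<inner id="2"/>'),
    (6, N'<inner id="3"/>');

-- Column name pattern: ElementName!TagNumber!AttributeName!Directive.
-- The attribute name is left empty so the value becomes the content
-- of <v>, and the cdata directive wraps it in a CDATA section.
SELECT 1    AS Tag,
       NULL AS Parent,
       IdentifierText AS [v!1!!cdata]
FROM @RecordsToExport
WHERE BatchID = 5
FOR XML EXPLICIT, ROOT('mv');
```

One design point worth keeping in mind: the xml data type does not preserve CDATA sections, so the result has to stay a string. Adding the TYPE directive or assigning the result to an xml variable turns the CDATA content into ordinary escaped text.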

Using attribute more than once in FOR XML Path T-SQL query with same element name

I am trying to create an xml output in SQL 2008 using FOR XML Path. This is working fine:
<Taxonomy>
<Category Level="1">Clothing</Category>
<SubCategory Level="2">Jeans</SubCategory>
</Taxonomy>
But I would like the output to be:
<Taxonomy>
<Category Level="1">Clothing</Category>
<Category Level="2">Jeans</Category>
</Taxonomy>
Of course you can write the following:
1 as 'Taxonomy/Category/@Level',
2 as 'Taxonomy/Category/@Level',
t.MainCat as 'Taxonomy/Category',
t.SubCat as 'Taxonomy/Category',
But this gives an error message:
Attribute-centric column 'Column name is repeated. The same attribute cannot be generated more than once on the same XML tag.
What can be done to get the desired output?
Would a subselect work or some kind of cross apply? Or perhaps a union? But how?
---- EDIT - after several answers came up with following solution:
SELECT
1 as 'Category/@Level',
t.Cat as 'Category'
FROM table t
UNION
SELECT
2 as 'Category/@Level',
t.SubCat as 'Category'
FROM table t
FOR XML PATH (''), ROOT('Taxonomy')
gives this output:
<Taxonomy>
<Category Level="1">Clothing</Category>
<Category Level="2">Jeans</Category>
</Taxonomy>
Still have to figure out how to put this partial coding in a much larger code with several 'nested' FOR XML's already
The shortcut methods may not cut it for this. AUTO and PATH don't like multiple elements with the same name. Looks like you would have to use the FOR XML EXPLICIT command.
It works, but is cumbersome.
Sample:
--Generate Sample Data
--FOR XML EXPLICIT requires two meta fields: Tag and Parent
--Tag is the ID of the current element.
--Parent is the ID of the parent element, or NULL for root element.
DECLARE @DataTable as table
(Tag int NOT NULL
, Parent int
, TaxonomyValue nvarchar(max)
, CategoryValue nvarchar(max)
, CategoryLevel int)
--Fill with sample data: Category Element (2), under Taxonomy(1), with no Taxonomy value.
--Column order is Tag, Parent, TaxonomyValue, CategoryValue, CategoryLevel.
INSERT INTO @DataTable
VALUES (2, 1, NULL, 'Clothing', 1)
, (2, 1, NULL, 'Jeans', 2)
--First part of query: Define the XML structure
SELECT
1 as Tag --root element
, NULL as Parent
, NULL as [Taxonomy!1] --Assign Taxonomy Element to the first element, aka root.
, NULL as [Category!2] --Assign Category Element as a child to Taxonomy.
, NULL as [Category!2!Level] --Give Category an Attribute 'Level'
--The actual data to fill the XML
UNION
SELECT
Data.Tag
, Data.Parent
, Data.TaxonomyValue
, Data.CategoryValue
, Data.CategoryLevel
FROM
@DataTable as Data
FOR XML EXPLICIT
Generates XML
<Taxonomy>
<Category Level="1">Clothing</Category>
<Category Level="2">Jeans</Category>
</Taxonomy>
Edit: Had columns reversed. No more Jeans level.
I know this is an old post, but I want to share one solution that avoids that FOR XML EXPLICIT command complexity for big xmls.
It's enough to add null as a child of Taxonomy, and error will disappear:
select 1 as 'Taxonomy/Category/@Level',
t.MainCat as 'Taxonomy/Category',
NULL AS 'Taxonomy/*',
2 as 'Taxonomy/Category/@Level',
t.SubCat as 'Taxonomy/Category'
from t
for xml path('')
I hope it helps.
I have another way. It seemed a bit easier to me.
Say, for example, I have XML like this:
DECLARE @xml xml='<parameters></parameters>'
DECLARE @multiplenodes XML = '<test1><test2 Level="1">This is a test node 2</test2><test2 Level="2">This is another test node</test2></test1>'
SET @xml.modify('insert sql:variable("@multiplenodes") into (/parameters)[1]')
SELECT @xml
Do tell me if this helps.
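For the open point in the question (fitting the repeated Category elements into a larger query that already nests FOR XML), one possible approach is a correlated subquery that unpivots the two columns into rows. This is a sketch, not taken from the answers above: the table variable, the ProductID column and the Product/Products element names are all assumptions.

```sql
-- Hypothetical source table with one row per product.
DECLARE @t TABLE (ProductID int, MainCat nvarchar(50), SubCat nvarchar(50));

INSERT INTO @t VALUES (1, N'Clothing', N'Jeans');

SELECT t.ProductID AS '@ID',
       (
           -- Turn the two category columns into two rows, each of which
           -- becomes its own <Category Level="..."> element.
           SELECT c.Lvl  AS '@Level',
                  c.Name AS 'text()'
           FROM (VALUES (1, t.MainCat), (2, t.SubCat)) AS c(Lvl, Name)
           ORDER BY c.Lvl
           FOR XML PATH('Category'), TYPE
       ) AS Taxonomy
FROM @t AS t
FOR XML PATH('Product'), ROOT('Products');
```

Because each Category comes from its own row of the subquery, the attribute name is never repeated within one row, so the "same attribute cannot be generated more than once" error does not arise. The TYPE directive keeps the nested result as XML, so the outer query wraps it in Taxonomy rather than escaping it.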

Generate XML in proper syntax from SQL Server table

How to write a SQL statement to generate XML like this
<ROOT>
<Production.Product>
<ProductID>1 </ProductID>
<Name>Adjustable Race</Name>
........
</Production.Product>
</ROOT>
Currently I am getting this with
SELECT * FROM Production.Product
FOR XML auto
Result is:
<ROOT>
<Production.Product ProductID="1" Name="Adjustable Race"
ProductNumber="AR-5381" MakeFlag="0" FinishedGoodsFlag="0"
SafetyStockLevel="1000" ReorderPoint="750" StandardCost="0.0000"
ListPrice="0.0000" DaysToManufacture="0" SellStartDate="1998-06-01T00:00:00"
rowguid="694215B7-08F7-4C0D-ACB1-D734BA44C0C8"
ModifiedDate="2004-03-11T10:01:36.827" />
One simple way would be to use:
SELECT *
FROM Production.Product
FOR XML AUTO, ELEMENTS
Then, your data should be stored in XML elements inside the <Production.Product> node.
If you need even more control, then you should look at the FOR XML PATH syntax - check out this MSDN article on What's new in FOR XML in SQL Server 2005 which explains the FOR XML PATH (among other new features).
Basically, with FOR XML PATH, you can control very easily how things are rendered - as elements or as attributes - something like:
SELECT
ProductID AS '@ProductID', -- rendered as attribute on XML node
Name, ProductNumber, -- all rendered as elements inside XML node
.....
FROM Production.Product
FOR XML PATH('NewProductNode') -- define a new name for the XML node
This would give you something like:
<NewProductNode ProductID="1">
<Name>Adjustable Race</Name>
<ProductNumber>AR-5381</ProductNumber>
.....
</NewProductNode>
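To get the exact shape asked for in the question, including the ROOT wrapper, FOR XML PATH can be combined with the ROOT directive. The following is a sketch with a made-up table variable in place of Production.Product; only three columns are included and the sample rows are assumptions.

```sql
-- Hypothetical, reduced stand-in for Production.Product.
DECLARE @Product TABLE (ProductID int, Name nvarchar(50), Color nvarchar(15) NULL);

INSERT INTO @Product VALUES
    (1, N'Adjustable Race', NULL),
    (2, N'Bearing Ball',    N'Silver');

-- PATH names the row element, ROOT adds the single wrapper element.
-- ELEMENTS XSINIL is optional: it emits NULL columns as empty elements
-- marked xsi:nil="true" instead of leaving them out.
SELECT ProductID, Name, Color
FROM @Product
ORDER BY ProductID
FOR XML PATH('Production.Product'), ROOT('ROOT'), ELEMENTS XSINIL;
```

Without ELEMENTS XSINIL the Color element is simply omitted for the first row, which is often preferable when the consuming schema marks the element as optional.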
