SQL Server 2014 - FOR XML AUTO avoid automatic node nesting

I'm trying to build a query to export data as XML, and I wrote this:
select
[invoice].*,
[rows].*,
[payments].payerID,
[items].picture
from InvoicesHeader [invoice]
join InvoicesRows [rows] on [rows].invoiceID=[invoice].invoiceID
join Payments [payments] on [payments].paymentID=[invoice].paymentID
join Items [items] on [items].itemID=[rows].itemID
FOR XML Auto, ROOT ('invoices'), ELEMENTS
and I get something like this as the result:
<invoices>
<invoice>
<ID>82</ID>
<DocType>R</DocType>
<DocYear>2017</DocYear>
<DocNumber>71</DocNumber>
<IssueDate>2017-07-17T15:17:30.237</IssueDate>
<OrderID>235489738019</OrderID>
...
<payments>
<payerID>3234423f33</payerID>
<rows>
<ID>163</ID>
<ItemID>235489738019</ItemID>
<Quantity>2</Quantity>
<Price>1</Price>
<VATCode>22</VATCode>
<Color>-</Color>
<Size></Size>
<SerialNumber></SerialNumber>
<items>
<picture>http://nl.imgbb.com/AAOSwOdpXyB4I.JPG</picture>
</items>
</rows>
....
</payments>
</invoice>
</invoices>
while I would like to get something like this, where [rows] is a child node of invoice and not of payments:
<invoices>
<invoice>
<ID>82</ID>
<DocType>R</DocType>
<DocYear>2017</DocYear>
<DocNumber>71</DocNumber>
<IssueDate>2017-07-17T15:17:30.237</IssueDate>
<OrderID>235489738019</OrderID>
...
<payments>
<payerID>3234423f33</payerID>
</payments>
<rows>
<ID>163</ID>
<ItemID>235489738019</ItemID>
<Quantity>2</Quantity>
<Price>1</Price>
<VATCode>22</VATCode>
<Color>-</Color>
<Size></Size>
<SerialNumber></SerialNumber>
<items>
<picture>http://nl.imgbb.com/AAOSwOdpXyB4I.JPG</picture>
</items>
</rows>
....
</invoice>
</invoices>
I have seen some solutions that combine several FOR XML AUTO queries, but the data here comes from joined tables, and it would be a pity to query the same values two or three times.
How can I achieve this?
Thanks

Try changing the select order around to this:
select
[invoice].*,
[payments].payerID,
[items].picture,
[rows].*
from InvoicesHeader [invoice]
join InvoicesRows [rows] on [rows].invoiceID=[invoice].invoiceID
join Payments [payments] on [payments].paymentID=[invoice].paymentID
join Items [items] on [items].itemID=[rows].itemID
FOR XML Auto, ROOT ('invoices'), ELEMENTS

In the end I found that I had to use FOR XML PATH instead and add the child table as a correlated subquery with its own FOR XML PATH, as follows:
select
i.*,
p.payerID,
(select r.* from InvoicesRows r where r.invoiceID=i.invoiceID for XML PATH ('rows'), type)
from InvoicesHeader i
join Payments p on i.paymentID=p.paymentID
FOR XML PATH('invoice'), ROOT ('invoices'), ELEMENTS
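A minimal sketch of how the nested FOR XML PATH approach can produce the exact shape asked for, with payments and rows as siblings under invoice. The table variables, the column lists (rowID, Quantity, DocType and so on) and the sample values are assumptions made for illustration, since the original tables are only shown through `*`. Column aliases such as [payments/payerID] are what create the wrapper elements.

```sql
-- Hypothetical stand-ins for the real tables; only a few columns are modelled.
DECLARE @InvoicesHeader TABLE (invoiceID INT, DocType CHAR(1), paymentID INT);
DECLARE @Payments       TABLE (paymentID INT, payerID VARCHAR(20));
DECLARE @InvoicesRows   TABLE (rowID INT, invoiceID INT, itemID BIGINT, Quantity INT);
DECLARE @Items          TABLE (itemID BIGINT, picture VARCHAR(200));

INSERT INTO @InvoicesHeader VALUES (82, 'R', 1);
INSERT INTO @Payments       VALUES (1, '3234423f33');
INSERT INTO @InvoicesRows   VALUES (163, 82, 235489738019, 2);
INSERT INTO @Items          VALUES (235489738019, 'http://nl.imgbb.com/AAOSwOdpXyB4I.JPG');

SELECT
    i.invoiceID,
    i.DocType,
    -- The alias path wraps payerID in a <payments> element.
    p.payerID AS [payments/payerID],
    -- Correlated subquery: one <rows> element per invoice row,
    -- emitted as a sibling of <payments> rather than a child of it.
    (
        SELECT r.rowID,
               r.itemID,
               r.Quantity,
               it.picture AS [items/picture]
        FROM @InvoicesRows AS r
        JOIN @Items AS it ON it.itemID = r.itemID
        WHERE r.invoiceID = i.invoiceID
        FOR XML PATH('rows'), TYPE
    )
FROM @InvoicesHeader AS i
JOIN @Payments AS p ON p.paymentID = i.paymentID
FOR XML PATH('invoice'), ROOT('invoices');
```

The TYPE directive matters here: without it the inner result would be returned as an escaped string instead of nested XML.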

Related

T-SQL XPath query including Parent

This problem keeps messing around with my Friday afternoon:
I have this XML:
declare @xml as XML
set @xml =
'<fields>
<field>
<id>1</id>
<items>
<item>
<name>name1_1</name>
<value>value1_1</value>
</item>
<item>
<name>name1_2</name>
<value>value1_2</value>
</item>
</items>
</field>
<field>
<id>2</id>
<items>
<item>
<name>name2_1</name>
<value>value2_1</value>
</item>
<item>
<name>name2_2</name>
<value>value2_2</value>
</item>
</items>
</field>
</fields>'
Using T-SQL and XPath, I need a query to get this result:
id name value
1 name1_1 value1_1
1 name1_2 value1_2
2 name2_1 value2_1
2 name2_2 value2_2
I'm getting name and value with:
SELECT c.value('name[1]', 'nvarchar(255)') name,
c.value('value[1]', 'nvarchar(255)') value
FROM @xml.nodes('fields/field/items/item') t(c)
...but how to insert the parent column "id"?
Your own code uses .nodes() to get a derived table from repeating elements. In your case there are two levels of repeating elements:
many fields and within each field
many items
You have to use .nodes() twice:
SELECT fld.value(N'(id/text())[1]',N'int') AS FieldID
,itm.value(N'(name/text())[1]',N'nvarchar(max)') AS ItemName
,itm.value(N'(value/text())[1]',N'nvarchar(max)') AS ItemValue
FROM @xml.nodes(N'/fields/field') AS A(fld)
OUTER APPLY A.fld.nodes(N'items/item') AS B(itm);
The first .nodes() returns XML fragments, one for each field; the second .nodes() is called for each of these field fragments to pick out its items.
Use OUTER APPLY if there might be fields without <item> nodes, and CROSS APPLY when you do not want to see fields without <item> nodes (similar to LEFT JOIN vs INNER JOIN).
Assumption: there is only one id element per field.
SELECT c.value('../../id[1]', 'int') id,
c.value('name[1]', 'nvarchar(255)') name,
c.value('value[1]', 'nvarchar(255)') value
FROM @xml.nodes('fields/field/items/item') t(c)
The .. operator means "select the parent of the node" in XPath. So the query selects the parent of item (items), then the parent of items (field), and then the first child node named id.

Search XML files stored in a MS SQL database

I have over 500,000 XML files stored in a MS SQL database, such as the one below (which has been edited to save space in the question).
<?xml version="1.0"?>
<PROJECTS xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<row>
<APPLICATION_ID>7000518</APPLICATION_ID>
<ACTIVITY>C06</ACTIVITY>
<ADMINISTERING_IC>RR</ADMINISTERING_IC>
<APPLICATION_TYPE>1</APPLICATION_TYPE>
<BUDGET_START>09/01/2009</BUDGET_START>
<BUDGET_END>09/30/2013</BUDGET_END>
<FULL_PROJECT_NUM>1C06RR020539-01A1</FULL_PROJECT_NUM>
<FY>2009</FY>
<ORG_STATE>CA</ORG_STATE>
<ORG_ZIPCODE>900952000</ORG_ZIPCODE>
<PIS>
<PI>
<PI_NAME>JONES,MARY</PI_NAME>
<PI_ID>9876543</PI_ID>
</PI>
<PI>
<PI_NAME>DOE, JOHN</PI_NAME>
<PI_ID>1234567</PI_ID>
</PI>
</PIS>
<PROJECT_TERMSX>
<TERM>Extramural Activities</TERM>
<TERM>Extramural Research Facilities Construction Project</TERM>
</PROJECT_TERMSX>
<PROJECT_TITLE>The Center for Oral/Research</PROJECT_TITLE>
<SUPPORT_YEAR>1</SUPPORT_YEAR>
</row>
</PROJECTS>
I can search for any of the single nodes using something like:
SELECT nref.value('(APPLICATION_ID)[1]', 'Int') APPLICATION_ID,
nref.value('(ACTIVITY)[1]', 'varchar(3)') ACTIVITY
FROM [XML_2010] cross apply XMLData.nodes('//PROJECTS/row') as R(nref)
WHERE nref.value('(CORE_PROJECT_NUM)[1]', 'varchar(25)') LIKE '%CA187342%'
But how can I find the data associated with all XML files that have DOE, JOHN as a PI which is a sub node to PIS? Such as the APPLICATION_ID and BUDGET_START etc?
Thanks for the help
XML is great for archives and data exchange, but it is the wrong container for data that is actively used, filtered, or searched. Therefore I'd strongly suggest transferring all your data into classical, indexed tables like this:
Note: I have reduced your XML to a few examples per level; the rest follows the same approach and is up to you. The declared table variable is there to mock up a test scenario:
DECLARE @YourTable TABLE(ID INT IDENTITY,YourXml XML);
INSERT INTO @YourTable VALUES
('<PROJECTS xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<row>
<APPLICATION_ID>7000518</APPLICATION_ID>
<ACTIVITY>C06</ACTIVITY>
<!-- more first level elements like above -->
<!-- Here there are multiple PIs -->
<PIS>
<PI>
<PI_NAME>JONES,MARY</PI_NAME>
<PI_ID>9876543</PI_ID>
</PI>
<PI>
<PI_NAME>DOE, JOHN</PI_NAME>
<PI_ID>1234567</PI_ID>
</PI>
</PIS>
<!-- Here there are multiple PROJECT_TERMS -->
<PROJECT_TERMSX>
<TERM>Extramural Activities</TERM>
<TERM>Extramural Research Facilities Construction Project</TERM>
</PROJECT_TERMSX>
<!-- These are normal first level elements again -->
<PROJECT_TITLE>The Center for Oral/Research</PROJECT_TITLE>
<SUPPORT_YEAR>1</SUPPORT_YEAR>
</row>
</PROJECTS>');
--This SELECT reads all first-level-data together with the partial XMLs into a temp table #Projects:
SELECT r.value('(APPLICATION_ID/text())[1]','bigint') AS APPLICATION_ID
,r.value('(ACTIVITY/text())[1]','nvarchar(max)') AS ACTIVITY
--more columns like above
,r.query('PIS') AS AllPis
,r.query('PROJECT_TERMSX') AS AllProjectTerms
--more first level columns
INTO #Projects
FROM @YourTable AS t
OUTER APPLY t.YourXml.nodes('/PROJECTS/row') AS A(r);
--This SELECT reads from #Projects and stores all related PI-data in another temp table #PIs
SELECT pr.APPLICATION_ID
,pinode.value('(PI_ID/text())[1]','bigint') AS PI_ID
,pinode.value('(PI_NAME/text())[1]','nvarchar(max)') AS PI_NAME
INTO #PIs
FROM #Projects AS pr
OUTER APPLY pr.AllPis.nodes('PIS/PI') AS A(pinode);
--Same with #Terms
SELECT APPLICATION_ID
,t.value('(./text())[1]','nvarchar(max)') AS TERM
INTO #Terms
FROM #Projects AS p
OUTER APPLY p.AllProjectTerms.nodes('PROJECT_TERMSX/TERM') AS A(t);
--This is now the content of your temp tables
SELECT * FROM #Projects;
SELECT * FROM #PIs;
SELECT * FROM #Terms;
--Clean up
GO
DROP TABLE #Projects;
DROP TABLE #PIs;
DROP TABLE #Terms;
Before the clean-up you would add some code that writes the data from these staging tables into real tables. The IDs that define the relations are stored together with the data, so this should be easy. You will need INSERT INTO or MERGE, depending on whether you have to deal with already existing data.
Hint
You might think about an m:n relation between projects and PIs, and between projects and terms. For this you'd create a separate PI table and a separate Term table, each with a mapping table in between (holding the application_id and the second id, both as foreign keys).
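If the data has to stay in the XML column for the time being, the question as asked can also be answered in place by putting the PI name into the XPath predicate. The following is only a sketch: the table variable stands in for the [XML_2010] table from the question, the sample document is shortened, and @piName is a hypothetical variable introduced for illustration.

```sql
-- Hypothetical mock-up of the [XML_2010] table with a shortened document.
DECLARE @XML_2010 TABLE (ID INT, XMLData XML);
INSERT INTO @XML_2010 (ID, XMLData) VALUES
(1, N'<PROJECTS>
  <row>
    <APPLICATION_ID>7000518</APPLICATION_ID>
    <BUDGET_START>09/01/2009</BUDGET_START>
    <PIS>
      <PI><PI_NAME>JONES,MARY</PI_NAME><PI_ID>9876543</PI_ID></PI>
      <PI><PI_NAME>DOE, JOHN</PI_NAME><PI_ID>1234567</PI_ID></PI>
    </PIS>
  </row>
</PROJECTS>');

DECLARE @piName VARCHAR(100) = 'DOE, JOHN';

-- The predicate keeps only <row> elements that contain at least one
-- PI whose PI_NAME equals the variable; first-level values are then
-- read from the matching row as usual.
SELECT nref.value('(APPLICATION_ID/text())[1]', 'int')        AS APPLICATION_ID,
       nref.value('(BUDGET_START/text())[1]', 'varchar(10)')  AS BUDGET_START
FROM @XML_2010 AS x
CROSS APPLY x.XMLData.nodes('/PROJECTS/row[PIS/PI/PI_NAME = sql:variable("@piName")]') AS R(nref);
```

On 500,000 documents this still shreds XML at query time, so the staging-table approach above is likely to perform better for repeated searches.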

SQL Server query xml column

I need to pull values from an XML column. The table contains 3 fields with one being an XML column like below:
TransID int,
Place varchar(20),
Custom XML
The XML column is structured as following:
<Fields>
<Field>
<Id>9346-00155D1C204E</Id>
<TransactionCode>0710</TransactionCode>
<Amount>5.0000</Amount>
</Field>
<Field>
<Id>A6F0-BA07EF3A7D43</Id>
<TransactionCode>0885</TransactionCode>
<Amount>57.9000</Amount>
</Field>
<Field>
<Id>9BDA-7858FD182Z3C</Id>
<TransactionCode>0935</TransactionCode>
<Amount>25.85000</Amount>
</Field>
</Fields>
I need to query the XML column and return only the value of <Amount> where <TransactionCode> = 0935. Note: some records don't contain this transaction code, but it never occurs twice in the same record.
This is probably simple, but I'm having trouble returning just the <Amount> value where <TransactionCode> = 0935.
You can try it this way:
DECLARE @transCode VARCHAR(10) = '0935'
SELECT field.value('Amount[1]', 'decimal(18,5)') as Amount
FROM yourTable t
OUTER APPLY t.Custom.nodes('/Fields/Field[TransactionCode=sql:variable("@transCode")]') as x(field)
Alternatively, you can filter Field by TransactionCode in the SQL WHERE clause instead of in the XPath expression, like so:
DECLARE @transCode VARCHAR(10) = '0935'
SELECT field.value('Amount[1]', 'decimal(18,5)') as Amount
FROM yourTable t
OUTER APPLY t.Custom.nodes('/Fields/Field') as x(field)
WHERE field.value('TransactionCode[1]', 'varchar(10)') = @transCode
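Because the code occurs at most once per record, the amount can also be read with a single .value() call and the records without the code filtered out with .exist(). This is a sketch under assumptions: the table variable mimics the three-column table from the question, and the second sample record is made up to show a row without code 0935.

```sql
-- Hypothetical mock-up of the table described in the question.
DECLARE @yourTable TABLE (TransID INT, Place VARCHAR(20), Custom XML);
INSERT INTO @yourTable (TransID, Place, Custom) VALUES
(1, 'A', N'<Fields>
  <Field><Id>9346-00155D1C204E</Id><TransactionCode>0710</TransactionCode><Amount>5.0000</Amount></Field>
  <Field><Id>9BDA-7858FD182Z3C</Id><TransactionCode>0935</TransactionCode><Amount>25.85000</Amount></Field>
</Fields>'),
(2, 'B', N'<Fields>
  <Field><Id>A6F0-BA07EF3A7D43</Id><TransactionCode>0885</TransactionCode><Amount>57.9000</Amount></Field>
</Fields>');

DECLARE @transCode VARCHAR(10) = '0935';

-- .value() picks the Amount of the matching Field;
-- .exist() drops records that have no Field with this code.
SELECT t.TransID,
       t.Custom.value('(/Fields/Field[TransactionCode = sql:variable("@transCode")]/Amount/text())[1]',
                      'decimal(18,5)') AS Amount
FROM @yourTable AS t
WHERE t.Custom.exist('/Fields/Field[TransactionCode = sql:variable("@transCode")]') = 1;
```

With the sample data only TransID 1 qualifies. Dropping the WHERE clause would instead return every record, with NULL as the Amount where the code is absent.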
You can use an XPath like this in your T-SQL:
SELECT
*,
Custom.value('(/Fields/Field[@Name="Id"]/@Value)[1]', 'varchar(50)')
FROM YourTable
WHERE Custom.value('(/Fields/Field[@Name="Id"]/@Value)[1]', 'varchar(50)') = '0655'

Finding Duplicate Values in XML Column

In my SQL Server DB I have table with an XML column. The XML that goes in it is like the sample below:
<Rows>
<Row>
<Name>John</Name>
</Row>
<Row>
<Name>Debbie</Name>
</Row>
<Row>
<Name>Annie</Name>
</Row>
<Row>
<Name>John</Name>
</Row>
</Rows>
I have a requirement that I need to find the occurrence of all rows where the XML data has duplicate entries of <Name>. For example, above we have 'John' twice in the XML.
I can use the exist XML statement to find 1 occurrence, but how can I find if it's more than 1? Thanks.
To identify any table row that has duplicate <Name> values in its XML, you can use exist as well:
exist('//Name[. = preceding::Name]')
To identify which names are the duplicates, you need nodes and CROSS APPLY:
SELECT
t.id,
x.Name.value('.', 'varchar(100)') AS DuplicateName
FROM
MyTable t
CROSS APPLY t.MyXmlColumn.nodes('//Name[. = preceding::Name]') AS x(Name)
WHERE
t.MyXmlColumn.exist('//Name[. = preceding::Name]')
Try this:
;with cte as
(SELECT tbl.col.value('.[1]', 'varchar(100)') as name
FROM yourtable
CROSS APPLY xmlcol.nodes('/Rows/Row/Name') as tbl(col))
select name
from cte
group by name
having count(name) > 1
We first use the nodes function to convert the XML to relational data, then use value to get the text inside each Name node. We put that result into a CTE and use a simple GROUP BY to get the values with multiple occurrences.
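Note that the query above groups names across the whole table. If the goal is to find duplicates within each row's XML, the row key can be carried through the CTE and added to the GROUP BY. The sketch below assumes a table with an id column and an XML column named MyXmlColumn; both names and the sample rows are made up for illustration. It also avoids the preceding axis, which, as far as I know, is not among the axes SQL Server's XQuery implementation supports.

```sql
-- Hypothetical mock-up: row 1 contains John twice, row 2 has no duplicates.
DECLARE @MyTable TABLE (id INT, MyXmlColumn XML);
INSERT INTO @MyTable (id, MyXmlColumn) VALUES
(1, N'<Rows><Row><Name>John</Name></Row><Row><Name>Debbie</Name></Row><Row><Name>John</Name></Row></Rows>'),
(2, N'<Rows><Row><Name>Annie</Name></Row><Row><Name>John</Name></Row></Rows>');

-- Shred the names in a CTE first: XML methods cannot be used
-- directly inside a GROUP BY clause.
WITH names AS (
    SELECT t.id,
           x.n.value('(./text())[1]', 'varchar(100)') AS Name
    FROM @MyTable AS t
    CROSS APPLY t.MyXmlColumn.nodes('/Rows/Row/Name') AS x(n)
)
SELECT id,
       Name     AS DuplicateName,
       COUNT(*) AS Occurrences
FROM names
GROUP BY id, Name
HAVING COUNT(*) > 1;
```

With the sample data this reports John for id 1 only, because the single John in id 2 is not a duplicate within its own row.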

Select XML from varchar(max) column

I have some XML data stored in a varchar(max) column on SQL Server 2005. The data is in the form (FQTN = fully qualified type name):
<?xml version="1.0" encoding="utf-16"?>
<History xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<EntityViews>
<EntityProxy Type="FQTN" Key="386876" />
<EntityProxy Type="FQTN" Key="387981" />
<!-- etc. -->
</EntityViews>
</History>
How can I select Type, Key so that I get a tabular result from the XML data in this column for a single row? The table has an identity primary key named HistoryId.
;with cteCastToXML as (
select CAST(YourColumn as xml) as x
from YourTable
)
select h.ep.value('@Type','varchar(10)') as [Type],
h.ep.value('@Key', 'varchar(10)') as [Key]
from cteCastToXML
cross apply x.nodes('/History/EntityViews/EntityProxy') as h(ep)
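One caveat with the sample data: the documents declare encoding="utf-16", and casting a varchar value with that declaration straight to xml typically fails with an "unable to switch the encoding" error. A common workaround is to cast to nvarchar(max) first. The sketch below shows this for a single row; the table variable, the Data column name and the widened varchar(200) for Type are assumptions for illustration (a real fully qualified type name would be truncated by varchar(10)).

```sql
-- Hypothetical mock-up of the history table; Data holds XML as varchar(max).
DECLARE @History TABLE (HistoryId INT, Data VARCHAR(MAX));
INSERT INTO @History (HistoryId, Data) VALUES
(1, '<?xml version="1.0" encoding="utf-16"?>
<History>
  <EntityViews>
    <EntityProxy Type="FQTN" Key="386876" />
    <EntityProxy Type="FQTN" Key="387981" />
  </EntityViews>
</History>');

WITH cteCastToXML AS (
    -- Cast to nvarchar first so the utf-16 declaration matches the
    -- actual encoding of the string handed to the XML parser.
    SELECT HistoryId,
           CAST(CAST(Data AS NVARCHAR(MAX)) AS XML) AS x
    FROM @History
    WHERE HistoryId = 1   -- restrict to the single row of interest
)
SELECT c.HistoryId,
       h.ep.value('@Type', 'varchar(200)') AS [Type],
       h.ep.value('@Key',  'int')          AS [Key]
FROM cteCastToXML AS c
CROSS APPLY c.x.nodes('/History/EntityViews/EntityProxy') AS h(ep);
```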
My recommendation is twofold:
If this is what you will regularly be doing with the column, change the column to be an XML column.
If you only need to do this once, convert the value to XML, and then you can operate on it as you normally would.
