XQuery to combine node values with Group By logic - sql-server

I’m looking for an XQuery that will take:
<root>
<entity>
<entityid>1</entityid>
<sometext>this is some text</sometext>
</entity>
<entity>
<entityid>1</entityid>
<sometext>this is some more text</sometext>
</entity>
</root>
And produce a recordset like:
Entityid sometext
1 this is some textthis is some more text
Essentially, I want to combine the values in the 'sometext' nodes while grouping by the entityid. I figured I might be able to accomplish this with loops, but wasn't sure if there was a better way, possibly with a join/group by.

declare @XML xml =
'<root>
<entity>
<entityid>1</entityid>
<sometext>this is some text</sometext>
</entity>
<entity>
<entityid>1</entityid>
<sometext>this is some more text</sometext>
</entity>
<entity>
<entityid>2</entityid>
<sometext>Another entity</sometext>
</entity>
</root>';
select T.entityid,
       @XML.query('/root/entity[entityid = sql:column("T.entityid")]/sometext').value('.', 'nvarchar(max)') as sometext
from (
    select distinct T.N.value('entityid[1]', 'int') as entityid
    from @XML.nodes('/root/entity') as T(N)
) as T;
Result:
entityid sometext
----------- -----------------------------------------
1 this is some textthis is some more text
2 Another entity

You could also do that using a more XQuery-based solution, e.g.:
DECLARE @xml XML = '<root>
<entity>
<entityid>1</entityid>
<sometext>this is some text</sometext>
</entity>
<entity>
<entityid>1</entityid>
<sometext>this is some more text</sometext>
</entity>
<entity>
<entityid>2</entityid>
<sometext>Another entity</sometext>
</entity>
</root>'
select
    x.c.value('@entityId', 'int') entityId,
    x.c.value('.', 'varchar(max)') someText
from
(
    select @xml.query('for $e in distinct-values(root/entity/entityid)
        return <m entityId = "{$e}">{data(root/entity[entityid = $e]/sometext)}</m>')
) r(c)
cross apply r.c.nodes('m') x(c)
Thanks to Mikael for xml / extra scenario.
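For comparison, on SQL Server 2017 or later the same grouping could also be sketched by shredding the XML into rows with nodes() and concatenating with STRING_AGG. This is only a sketch under that version assumption and is not part of the answers above. Without a WITHIN GROUP (ORDER BY ...) clause, the concatenation order is not guaranteed to follow document order.

```sql
-- Sketch: assumes SQL Server 2017+ (STRING_AGG is not available earlier).
DECLARE @XML xml = N'<root>
<entity><entityid>1</entityid><sometext>this is some text</sometext></entity>
<entity><entityid>1</entityid><sometext>this is some more text</sometext></entity>
<entity><entityid>2</entityid><sometext>Another entity</sometext></entity>
</root>';

SELECT e.entityid,
       -- Empty separator: values are joined back to back, as in the question.
       STRING_AGG(e.sometext, N'') AS sometext
FROM (
    -- One row per <entity> element.
    SELECT T.N.value('(entityid/text())[1]', 'int')           AS entityid,
           T.N.value('(sometext/text())[1]', 'nvarchar(max)') AS sometext
    FROM @XML.nodes('/root/entity') AS T(N)
) AS e
GROUP BY e.entityid;
```

The trade-off is that the grouping happens in SQL rather than in XQuery, which may be easier to extend with further relational logic.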

Related

mssql parse nested xml

I have an xml message that I need to get the test information out of and into a table using a stored procedure.
I've been using this query:
select distinct
    'N' as ORIGSTS,
    doc1.Samples.value('(ID)[1]', 'nvarchar(20)') as 'SAMPLE_ID',
    doc2.Tests.value('(Name)[1]', 'nvarchar(20)') as 'TEST_NAME'
from
    @messageXml.nodes('/CDFAOrderMsg/Samples/Sample') as doc1(Samples),
    @messageXml.nodes('/CDFAOrderMsg/Samples/Sample/Tests/Test') as doc2(Tests)
where doc1.Samples.value('(ID)[1]', 'nvarchar(20)') = 456
order by 2, 3
The problem is that it returns the sample ID 456 along with all tests listed in the message. I need to be able to extract the test names along with their associated sample Id to insert into a table. Currently, with two samples and three tests each it returns 12 rows when it should only return 6.
How can I make it return a list of all samples along with their respective test names?
Thanks,
Scott
<OrderMsg xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Samples>
<SourceType>Non-Animal</SourceType>
<Sample>
<ID>456</ID>
<Tests>
<Test>
<Name>SPC</Name>
</Test>
<Test>
<Name>COL</Name>
</Test>
<Test>
<Name>ANTI</Name>
</Test>
</Tests>
</Sample>
<Sample>
<ID>457</ID>
<Tests>
<Test>
<Name>HPC</Name>
</Test>
<Test>
<Name>DEL</Name>
</Test>
<Test>
<Name>NVT</Name>
</Test>
</Tests>
</Sample>
</Samples>
</OrderMsg>
Here is a query that gives the expected result, using OUTER APPLY to get the collection of child nodes.
DECLARE @x xml
SET @x = '<OrderMsg xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Samples>
<SourceType>Non-Animal</SourceType>
<Sample>
<ID>456</ID>
<Tests>
<Test>
<Name>SPC</Name>
</Test>
<Test>
<Name>COL</Name>
</Test>
<Test>
<Name>ANTI</Name>
</Test>
</Tests>
</Sample>
<Sample>
<ID>457</ID>
<Tests>
<Test>
<Name>HPC</Name>
</Test>
<Test>
<Name>DEL</Name>
</Test>
<Test>
<Name>NVT</Name>
</Test>
</Tests>
</Sample>
</Samples>
</OrderMsg>'
SELECT DISTINCT
    'N' AS ORIGSTS,
    s.sampleNode.query('ID').value('.', 'nvarchar(20)') AS 'SAMPLE_ID',
    t.testNode.query('Test/Name').value('.', 'nvarchar(20)') AS 'TEST_NAME'
FROM @x.nodes('//Samples/Sample') s (sampleNode)
OUTER APPLY (SELECT
        x.testNode.query('.') testNode
    FROM s.sampleNode.nodes('Tests/Test') x (testNode)) t
WHERE s.sampleNode.value('(ID)[1]', 'nvarchar(20)') = 456
ORDER BY 2, 3
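Since the question asks for every sample with its own tests, the same idea can be sketched without the WHERE filter by applying nodes() to each Sample node directly. This is a minimal sketch, assuming the root element is OrderMsg as in the sample document (the question's own query uses /CDFAOrderMsg) and using a shortened copy of the sample data.

```sql
-- Sketch: shortened copy of the sample message from the question.
DECLARE @x xml = N'<OrderMsg>
<Samples>
<Sample><ID>456</ID><Tests>
  <Test><Name>SPC</Name></Test><Test><Name>COL</Name></Test><Test><Name>ANTI</Name></Test>
</Tests></Sample>
<Sample><ID>457</ID><Tests>
  <Test><Name>HPC</Name></Test><Test><Name>DEL</Name></Test><Test><Name>NVT</Name></Test>
</Tests></Sample>
</Samples>
</OrderMsg>';

SELECT 'N' AS ORIGSTS,
       s.sampleNode.value('(ID/text())[1]', 'nvarchar(20)')   AS SAMPLE_ID,
       t.testNode.value('(Name/text())[1]', 'nvarchar(20)')   AS TEST_NAME
FROM @x.nodes('/OrderMsg/Samples/Sample') AS s(sampleNode)
-- Each Sample is paired only with the Test nodes beneath it,
-- so there is no cross product between samples and tests.
CROSS APPLY s.sampleNode.nodes('Tests/Test') AS t(testNode)
ORDER BY SAMPLE_ID, TEST_NAME;
```

CROSS APPLY drops samples that have no tests; OUTER APPLY would keep them with a NULL TEST_NAME.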

TSQL CTE hierarchy

I am trying to create a CTE on my table to pull in a hierarchy of employees.
I have a starting point which is a "Director" and I want to find everyone that reports to each person under them.
Here is what I have so far:
;WITH EmpTable_CTE (FirstName, LastName, QID, Email) AS
(
SELECT FirstName,
LastName,
QID,
Email
FROM EmployeeTable E
WHERE QID = '12345'
UNION ALL
SELECT E.FirstName,
E.LastName,
E.QID,
E.Email
FROM EmployeeTable E
INNER JOIN EmpTable_CTE AS E2 ON E.MgrQID = E2.QID
)
SELECT * FROM EmpTable_CTE
This seems to work providing me a list of employees but there is no "hierarchy" to it.
How can I go about using FOR XML to create the hierarchy that I am looking for?
<Director>Bob Smith</Director>
<Direct>Jim Smith</Direct>
<Direct>Employee 1</Direct>
<Direct>Employee 2</Direct>
<Direct>Employee 3</Direct>
<Direct>Bob Jones</Direct>
<Direct>Employee 1</Direct>
<Direct>Employee 2</Direct>
<Direct>Employee 3</Direct>
<Direct>Employee A</Direct>
I'm sure it's just a matter of placing the FOR XML line somewhere, but I can't quite figure it out.
Update: Here is a SQL Fiddle of sample data:
http://sqlfiddle.com/#!6/a48f6/1
This is how I would expect the data to be from the fiddle:
<Director>Jim Jones</Director>
<Direct>Bob Jones</Direct>
<Direct>Jake Jones</Direct>
<Direct>Smith Jones</Direct>
<Direct>Carl Jones</Direct>
<Direct>Bobby Jones</Direct>
<Direct>Danny Jones</Direct>
<Direct>Billy Jones</Direct>
Part of the difficulty is the XML structure you presented: passed to a parser, it would all be flat. Also, when I ran my process below without putting the first and last name into an attribute, the nodes came out as mixed content (text and child nodes on the same level).
So, I went searching and found this little gem here on SE. Adapting it to your needs, and throwing in a few fields as attributes from your table, I came up with this:
CREATE FUNCTION dbo.EmpHierarchyNode(@QID int)
RETURNS XML
WITH RETURNS NULL ON NULL INPUT
BEGIN RETURN
    (SELECT QID AS "@ID", Email AS "@Email",
        FirstName + ' ' + LastName AS "@Name",
        CASE WHEN MgrQID = @QID
            THEN dbo.EmpHierarchyNode(QID)
        END
    FROM dbo.EmployeeTable
    WHERE MgrQID = @QID
    FOR XML PATH('Direct'), TYPE)
END
GO
SELECT QID AS "@ID", Email AS "@Email",
    FirstName + ' ' + LastName AS "@Name",
    dbo.EmpHierarchyNode(QID)
FROM dbo.EmployeeTable
WHERE MgrQID IS NULL
FOR XML PATH('Director'), TYPE
Essentially, this traverses down in the hierarchy, recursively calling itself. The CTE isn't sufficient if your output is targeted for XML. Using this, and what I could glean of your sample data, I got this as a result:
<Director ID="1" Email="bsmith@someCompany.com" Name="Bob Smith">
  <Direct ID="2" Email="jsmith@someCompany.com" Name="Jim Smith">
    <Direct ID="4" Email="e1@someCompany.com" Name="Employee 1" />
    <Direct ID="5" Email="e2@someCompany.com" Name="Employee 2" />
    <Direct ID="7" Email="e4@someCompany.com" Name="Employee 4" />
  </Direct>
  <Direct ID="3" Email="bjones@someCompany.com" Name="Bob Jones">
    <Direct ID="6" Email="e3@someCompany.com" Name="Employee 3" />
    <Direct ID="8" Email="e5@someCompany.com" Name="Employee 5" />
    <Direct ID="9" Email="e6@someCompany.com" Name="Employee 6" />
  </Direct>
</Director>
Hope this helps.
EDIT: Last Minute SQLFiddle Example.
See if it meets your requirement:
;WITH EmpTable_CTE (FirstName, LastName, QID, Email) AS
(
SELECT FirstName,
LastName,
QID,
Email
FROM EmployeeTable E
WHERE QID = 1
UNION ALL
SELECT E.FirstName,
E.LastName,
E.QID,
E.Email
FROM EmployeeTable E
INNER JOIN EmpTable_CTE AS E2
ON E.MgrQID = E2.QID
)
SELECT LastName + ', ' + FirstName FROM EmpTable_CTE FOR XML PATH('Direct'), ROOT('Director'), TYPE
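If a flat result with a depth indicator is enough, the recursive CTE can carry a level counter. The following is a minimal sketch, not the method from either answer: the Lvl column and the @EmployeeTable table variable with its sample rows are hypothetical stand-ins for the real EmployeeTable.

```sql
-- Hypothetical sample data standing in for EmployeeTable.
DECLARE @EmployeeTable TABLE (
    QID       int PRIMARY KEY,
    MgrQID    int NULL,
    FirstName varchar(50),
    LastName  varchar(50)
);
INSERT INTO @EmployeeTable (QID, MgrQID, FirstName, LastName) VALUES
    (1, NULL, 'Bob', 'Smith'),
    (2, 1,    'Jim', 'Smith'),
    (3, 1,    'Bob', 'Jones'),
    (4, 2,    'Employee', '1');

WITH EmpTable_CTE (QID, MgrQID, FirstName, LastName, Lvl) AS
(
    -- Anchor member: the starting director is level 0.
    SELECT E.QID, E.MgrQID, E.FirstName, E.LastName, 0
    FROM @EmployeeTable AS E
    WHERE E.QID = 1
    UNION ALL
    -- Recursive member: each report is one level below its manager.
    SELECT E.QID, E.MgrQID, E.FirstName, E.LastName, E2.Lvl + 1
    FROM @EmployeeTable AS E
    INNER JOIN EmpTable_CTE AS E2 ON E.MgrQID = E2.QID
)
SELECT QID, MgrQID, Lvl, FirstName + ' ' + LastName AS Name
FROM EmpTable_CTE
ORDER BY Lvl, MgrQID, QID;
```

This still returns rows rather than nested XML; producing nested elements requires something like the recursive function shown earlier.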

XQuery parent and child

Example:
DECLARE @XML XML = '
<Items>
<document id="doc1" value="100">
<details>
<detail detailID="1" detailValue="20"/>
<detail detailID="2" detailValue="80"/>
</details>
</document>
<document id="doc2" value="0">
<details>
</details>
</document>
</Items>
'
I want results like this:
id value detailID detailValue
doc1 100 1 20
doc1 100 2 80
doc2 0 NULL NULL
Tried:
SELECT document.value('../../@docID', 'VARCHAR(10)') AS 'docID',
    document.value('../../@value', 'INT') AS 'value',
    document.value('@detailID', 'VARCHAR(10)') AS 'detailID',
    document.value('@detailValue', 'INT') AS 'detailValue'
FROM @XML.nodes('Items/document/details/detail') AS Documents(document)
But doc2 is not listed. I also tried CROSS JOIN and INNER JOIN, but performance was very bad.
Try this:
SELECT document.value('@id', 'VARCHAR(10)') AS docID,
    document.value('@value', 'INT') AS value,
    Detail.value('@detailID', 'INT') as DetailId,
    Detail.value('@detailValue', 'INT') as DetailValue
FROM @XML.nodes('Items/document') AS Documents(document)
outer apply Documents.document.nodes('details/detail') as Details(Detail);
One additional detail:
@XML.nodes('//whatever_depth') AS Documents(document)
A path starting with '//' matches nodes at any depth, so the query does not have to start at the root.
Regards,
Dennes

TSQL FOR XML EXPLICIT

I am not able to get the desired XML output.
The following:
SELECT 1 as Tag,
0 as Parent,
sID as [Document!1!sID],
docID as [Document!1!docID],
null as [To!2!value]
FROM docSVsys with (nolock)
where docSVsys.sID = '57'
UNION ALL
SELECT 2 as Tag,
1 as Parent,
sID,
NULL,
value
FROM docMVtext
WHERE docMVtext.sID = '57'
ORDER BY [Document!1!sID],[To!2!value]
FOR XML EXPLICIT;
Produces:
<Document sID="57" docID="3.818919.C41P3UKK00BRICLAY0AR1ET2EBPYSU4SA">
<To value="Frank Ermis" />
<To value="Keith Holst" />
<To value="Mike Grigsby" />
</Document>
What I want is:
<Document sID="57">
<docID>3.818919.C41P3UKK00BRICLAY0AR1ET2EBPYSU4SA</docID>
<To>
<Value>Frank Ermis</Value>
<Value>Keith Holst</Value>
<Value>Mike Grigsby</Value>
</To>
</Document>
Can I get that output with FOR XML?
OK, I understand that the two may be technically equivalent.
But what I want and what I need are not the same.
Using XDocument for this is SLOW.
There are millions of documents, and I need to convert up to 1 million at a time to XML.
The TSQL FOR XML is super fast.
I just need to get FOR XML to produce this format.
The solution (based on accepted answer):
SELECT top 4
[sv].[sID] AS '@sID'
,[sv].[sParID] AS '@sParID'
,[sv].[docID] AS 'docID'
,[sv].addDate as 'addDate'
,(SELECT [value] AS 'value'
FROM [docMVtext] as [mv]
WHERE [mv].[sID] = [sv].[sID]
AND [mv].[fieldID] = '113'
ORDER BY [mv].[value]
FOR XML PATH (''), type
) AS "To"
,(SELECT [value] AS 'value'
FROM [docMVtext] as [mv]
WHERE [mv].[sID] = [sv].[sID]
AND [mv].[fieldID] = '130'
ORDER BY [mv].[value]
FOR XML PATH (''), type
) AS "MVtest"
FROM [docSVsys] as [sv]
WHERE [sv].[sID] >= '57'
ORDER BY
[sv].[sParID], [sv].[sID]
FOR XML PATH('Document'), root('Documents')
Produces:
<Documents>
<Document sID="57" sParID="57">
<docID>3.818919.C41P3UKK00BRICLAY0AR1ET2EBPYSU4SA</docID>
<addDate>2011-10-28T12:26:00</addDate>
<To>
<value>Frank Ermis</value>
<value>Keith Holst</value>
<value>Mike Grigsby</value>
</To>
<MVtest>
<value>MV test 01</value>
<value>MV test 02</value>
<value>MV test 03</value>
<value>MV test 04</value>
</MVtest>
</Document>
<Document sID="58" sParID="57">
<docID>3.818919.C41P3UKK00BRICLAY0AR1ET2EBPYSU4SA.1</docID>
<addDate>2011-10-28T12:26:00</addDate>
</Document>
<Document sID="59" sParID="59">
<docID>3.818920.KJKP5LYKTNIODOEI4JDOKJ2BXJI5P0BIA</docID>
<addDate>2011-10-28T12:26:00</addDate>
<To>
<value>Vladimir Gorny</value>
</To>
</Document>
<Document sID="60" sParID="59">
<docID>3.818920.KJKP5LYKTNIODOEI4JDOKJ2BXJI5P0BIA.1</docID>
<addDate>2011-10-28T12:26:00</addDate>
</Document>
</Documents>
Now what I need to do is to add a DispName attribute to the element MVtext. Attribute cannot have any spaces and I would like to include the friendly name e.g. Multi Value Text.
Try something like this (untested, since I don't have your database tables to test against...):
SELECT
sv.sID AS '@sID',
sv.docID AS 'docID',
(SELECT
value AS 'value'
FROM
dbo.docMVtext mv
WHERE
mv.sID = sv.sID
ORDER BY mv.value
FOR XML PATH (''), TYPE) AS 'To'
FROM
dbo.docSVsys sv
WHERE
sv.sID = '57'
ORDER BY
sv.sID
FOR XML PATH('Document')
Does that give you what you're looking for? And don't you agree with John and me that this is much simpler than FOR XML EXPLICIT?
From Examples: Using PATH Mode:
USE AdventureWorks2008R2;
GO
SELECT ProductModelID AS "@ProductModelID",
Name AS "@ProductModelName",
(SELECT ProductID AS "data()"
FROM Production.Product
WHERE Production.Product.ProductModelID =
Production.ProductModel.ProductModelID
FOR XML PATH ('')
) AS "@ProductIDs",
(
SELECT Name AS "ProductName"
FROM Production.Product
WHERE Production.Product.ProductModelID =
Production.ProductModel.ProductModelID
FOR XML PATH (''), type
) AS "ProductNames"
FROM Production.ProductModel
WHERE ProductModelID= 7 OR ProductModelID=9
FOR XML PATH('ProductModelData');
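For the follow-up request in the question (a DispName attribute on the multi-value element), one possible approach in PATH mode is to name a column 'MVtest/@DispName' immediately before the subquery column named 'MVtest', so that both should land on the same element. The sketch below is untested against the real schema: the @docMVtext table variable and its rows are hypothetical stand-ins for docMVtext, and the friendly name is taken from the question.

```sql
-- Hypothetical stand-in for the docMVtext table.
DECLARE @docMVtext TABLE (sID int, fieldID int, [value] nvarchar(100));
INSERT INTO @docMVtext (sID, fieldID, [value]) VALUES
    (57, 130, N'MV test 01'),
    (57, 130, N'MV test 02');

SELECT 57 AS '@sID',
       -- Attribute column must come before the element content it belongs to.
       N'Multi Value Text' AS 'MVtest/@DispName',
       (SELECT mv.[value] AS 'value'
        FROM @docMVtext AS mv
        WHERE mv.sID = 57
          AND mv.fieldID = 130
        ORDER BY mv.[value]
        FOR XML PATH(''), TYPE) AS 'MVtest'
FOR XML PATH('Document'), ROOT('Documents');
```

One caveat of this design: because the attribute value is never NULL, the MVtest element is emitted even for documents that have no values for that field.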

TSQL - XML query help

I have an XML in this format
<tests>
<test>
<testid>1</testid>
<testval>8</testval>
<testname>
<testid>1</testid>
<testname>test 1</testname>
</testname>
</test>
<test>
<testid>2</testid>
<testval>5</testval>
<testname>
<testid>2</testid>
<testname>test 2</testname>
</testname>
</test>
</tests>
Using a TSQL/XML query, how do I achieve this result?
[Testid][TestVal][TestName]
1 8 Test 1
2 5 Test 2
Try this:
declare @input XML = '<tests>
<test>
<testid>1</testid>
<testval>8</testval>
<testname>
<testid>1</testid>
<testname>test 1</testname>
</testname>
</test>
<test>
<testid>2</testid>
<testval>5</testval>
<testname>
<testid>2</testid>
<testname>test 2</testname>
</testname>
</test>
</tests>'
select
    Tests.value('(testid)[1]', 'int') as 'TestID',
    Tests.value('(testval)[1]', 'int') as 'TestVal',
    Tests.value('(testname/testname)[1]', 'varchar(20)') as 'TestName'
FROM
    @input.nodes('/tests/test') as List(Tests)
This gives you the desired output.
If the XML is stored in a column of a table instead, you might need to use a slightly different approach (using CROSS APPLY):
select
tbl.SomeValue, tbl.SomeOtherValue,
Tests.value('(testid)[1]', 'int') as 'TestID',
Tests.value('(testval)[1]', 'int') as 'TestVal',
Tests.value('(testname/testname)[1]', 'varchar(20)') as 'TestName'
FROM
dbo.YourTable tbl
CROSS APPLY
tbl.XmlColumn.nodes('/tests/test') as List(Tests)
