Getting multiple values from same xml column in SQL Server - sql-server

I want to get the values from same xml node under same element.
Sample data:
I have to select all <award_number> values.
This is my SQL code:
DECLARE @xml XML;
DECLARE @filePath varchar(max);
SET @filePath = '<workFlowMeta><fundgroup><funder><award_number>0710564</award_number><award_number>1106058</award_number><award_number>1304977</award_number><award_number>1407404</award_number></funder></fundgroup></workFlowMeta>'
SET @xml = CAST(@filePath AS XML);
SELECT
REPLACE(Element.value('award_number','NVARCHAR(255)'), CHAR(10), '') AS award_num
FROM
@xml.nodes('workFlowMeta/fundgroup/funder') Datalist(Element);
I can't change the @xml.nodes('workFlowMeta/fundgroup/funder') call, because I'm getting multiple node values from inside the <funder> node.
Can anyone please help me?

Since those <award_number> nodes are inside the <funder> nodes, and there could be several <funder> nodes (if I understood your question correctly), you need to use two .nodes() calls like this:
SELECT
XC.value('.', 'int')
FROM
@xml.nodes('/workFlowMeta/fundgroup/funder') Datalist(Element)
CROSS APPLY
Element.nodes('award_number') AS XT(XC)
The first .nodes() call gets all <funder> elements. The second call then goes into each <funder> element, gets all <award_number> nodes inside it, and outputs the value of each <award_number> element as an INT. (I couldn't quite understand what you're trying to do to the <award_number> value in your code sample.)
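As a self-contained sketch of the two-step approach, the following uses made-up sample XML with two <funder> elements; the second funder and the aliases F and A are assumptions for illustration only. Variables are written with the @ prefix that T-SQL requires, and the values are read as text so that leading zeros survive.

```sql
-- Hypothetical sample: two <funder> elements, each with its own award numbers
DECLARE @xml XML;
SET @xml = N'<workFlowMeta><fundgroup>
  <funder><award_number>0710564</award_number><award_number>1106058</award_number></funder>
  <funder><award_number>1304977</award_number></funder>
</fundgroup></workFlowMeta>';

SELECT
    A.Award.value('text()[1]', 'NVARCHAR(255)') AS award_num
FROM @xml.nodes('/workFlowMeta/fundgroup/funder') AS F(Funder)  -- one row per <funder>
CROSS APPLY F.Funder.nodes('award_number') AS A(Award);         -- one row per <award_number> in that funder
```

Other values that live directly under <funder> can be read from F.Funder in the same SELECT, which is the reason to keep the outer .nodes() call at the <funder> level.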

Your own code was very close, but
You are diving one level too low
You need to set a singleton XPath for .value(). In most cases this means a [1] at the end
As you want to read many <award_number> elements, this is the level you have to step down to in .nodes(). Reading these elements' values is easy once you have your hands on them:
SELECT
REPLACE(Element.value('text()[1]','NVARCHAR(255)'), CHAR(10), '') AS award_num
FROM
@xml.nodes('/workFlowMeta/fundgroup/funder/award_number') Datalist(Element);
What are you trying to do with the REPLACE()?
If all <award_number> elements contain valid numbers, you should use int or bigint as the target type, and there shouldn't be any need to replace non-numeric characters. Try it like this:
SELECT Element.value('text()[1]','int') AS award_num
FROM @xml.nodes('/workFlowMeta/fundgroup/funder/award_number') Datalist(Element);
If marc_s is correct...
... and you have to deal with several <funder> groups, each of which contains several <award_number> nodes, go with his approach (two calls to .nodes())

Related

Parse XML using SQL

I'm using MS SQL Server 2016 and I have an XML file that I need to parse to put various data elements into separate fields. For the most part everything works fine, except I need a little help to identify a particular node value. If I have (I put only a snippet of the XML here, but it does show the problem)
DECLARE @xmlString xml
SET @xmlString ='<PubmedArticle>
<MedlineCitation Status="PubMed-not-MEDLINE" Owner="NLM">
<PMID Version="1">25685064</PMID>
<Article PubModel="Electronic-eCollection">
<Journal>
<ISSN IssnType="Electronic">1234-5678</ISSN>
<ISSN IssnType="Print">1475-2867</ISSN>
<JournalIssue CitedMedium="Print">
<Volume>15</Volume>
<Issue>1</Issue>
<PubDate>
<Year>2015</Year>
</PubDate>
</JournalIssue>
</Journal>
</Article>
</MedlineCitation>
</PubmedArticle>'
select
nref.value('Article[1]/Journal[1]/ISSN[1]','varchar(max)') ISSN
from @xmlString.nodes ('//MedlineCitation[1]') as R(nref)
This skips the second ISSN (the other IssnType) and reads only the first value available. I need to pull both values. What do I need to change? Thanks
You can read it as a second column:
SELECT
nref.value('Article[1]/Journal[1]/ISSN[1]','varchar(max)') ISSN,
nref.value('Article[1]/Journal[1]/ISSN[2]','varchar(max)') ISSN2
FROM @xmlString.nodes('//MedlineCitation[1]') as R(nref)
Or
SELECT
nref.value('ISSN[1]','varchar(max)') ISSN,
nref.value('ISSN[2]','varchar(max)') ISSN2
FROM @xmlString.nodes('//MedlineCitation[1]/Article[1]/Journal[1]') as R(nref)
Or as a separate row:
SELECT nref.value('.','varchar(MAX)') ISSN
from @xmlString.nodes('//MedlineCitation[1]/Article[1]/Journal[1]/ISSN') as R(nref)
Update
If the number of ISSNs may vary, I recommend normalizing your result set:
SELECT
nref.value('.','varchar(MAX)') Issn,
nref.value('@IssnType','varchar(MAX)') IssnType
FROM @xmlString.nodes('//MedlineCitation[1]/Article[1]/Journal[1]/ISSN') as R(nref)
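If the two ISSNs should land in fixed columns regardless of their order in the document, one option is to select each by its IssnType attribute rather than by position. This is a minimal sketch against a trimmed copy of the question's XML; the column names ElectronicIssn and PrintIssn and the alias J are made up for illustration, and variables are written with the @ prefix that T-SQL requires.

```sql
-- Trimmed copy of the question's document (attributes not needed here are omitted)
DECLARE @xmlString XML;
SET @xmlString = N'<PubmedArticle>
  <MedlineCitation>
    <Article>
      <Journal>
        <ISSN IssnType="Electronic">1234-5678</ISSN>
        <ISSN IssnType="Print">1475-2867</ISSN>
      </Journal>
    </Article>
  </MedlineCitation>
</PubmedArticle>';

SELECT
    -- Pick each ISSN by its attribute value; [1] makes the path a singleton for .value()
    J.nref.value('(ISSN[@IssnType="Electronic"]/text())[1]', 'varchar(20)') AS ElectronicIssn,
    J.nref.value('(ISSN[@IssnType="Print"]/text())[1]', 'varchar(20)')      AS PrintIssn
FROM @xmlString.nodes('/PubmedArticle/MedlineCitation/Article/Journal') AS J(nref);
```

A column comes back as NULL when the journal has no ISSN of that type.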

SQL XML modify replace

I need help in replacing the XML tag value. Sample code is as follows:
declare @l_runtime_xml XML
declare @l_n_DrillRepID numeric(10)
declare @griddrillparam nvarchar(30)
declare @l_s_DrillBtColumnTag nvarchar(256)
declare @l_s_BTNameSecond nvarchar(30)
set @l_n_DrillRepID =1538
set @griddrillparam = 'userID'
set @l_s_DrillBtColumnTag = 'V_userID'
set @l_s_BTNameSecond = 'l_s_userID'
declare @l_runtime_xmlAAA nvarchar(max)
set @l_runtime_xmlAAA = N'<REPORT_RUNTIME_XML><USER_ID>AISHU</USER_ID><SHEET><SHEET_NO>1</SHEET_NO><DRILLTHRU_PARAM><ENT_RPT><ENT_RPT_ID>1537</ENT_RPT_ID><ENT_RPT_NAME>Reddy111</ENT_RPT_NAME><DEFAULT>1</DEFAULT><CRITERIA><DISPLAY/><HIDDEN/></CRITERIA><COLUMN_HEADER>N</COLUMN_HEADER><DISPLAY_BUTTON>N</DISPLAY_BUTTON><PARAMLIST><PARAM><COLTYPE/><POSITION>-999</POSITION><BT_COLUMN_NAME/><NAME>userID</NAME><BT_NAME/><V_userID>(none)</V_userID></PARAM><PARAM><COLTYPE/><POSITION>-999</POSITION><BT_COLUMN_NAME/><NAME>langID</NAME><BT_NAME/><V_langID>(none)</V_langID></PARAM><PARAM><COLTYPE/><POSITION>-999</POSITION><BT_COLUMN_NAME/><NAME>l_s_userID</NAME><BT_NAME/><V_l_s_userID>(none)</V_l_s_userID></PARAM><PARAM><COLTYPE/><POSITION>-999</POSITION><BT_COLUMN_NAME/><NAME>a_i_langID</NAME><BT_NAME/><V_a_i_langID>(none)</V_a_i_langID></PARAM></PARAMLIST></ENT_RPT><ENT_RPT><ENT_RPT_ID>1538</ENT_RPT_ID><ENT_RPT_NAME>Reddy333</ENT_RPT_NAME><DEFAULT>0</DEFAULT><CRITERIA><DISPLAY/><HIDDEN/></CRITERIA><COLUMN_HEADER>N</COLUMN_HEADER><DISPLAY_BUTTON>N</DISPLAY_BUTTON><PARAMLIST><PARAM><COLTYPE/><POSITION>-999</POSITION><BT_COLUMN_NAME/><NAME>userID</NAME><BT_NAME/><V_userID>(none)</V_userID></PARAM><PARAM><COLTYPE/><POSITION>-999</POSITION><BT_COLUMN_NAME/><NAME>langID</NAME><BT_NAME/><V_langID>(none)</V_langID></PARAM><PARAM><COLTYPE/><POSITION>-999</POSITION><BT_COLUMN_NAME/><NAME>l_s_userID</NAME><BT_NAME/><V_l_s_userID>(none)</V_l_s_userID></PARAM><PARAM><COLTYPE/><POSITION>-999</POSITION><BT_COLUMN_NAME/><NAME>a_i_langID</NAME><BT_NAME/><V_a_i_langID>(none)</V_a_i_langID></PARAM></PARAMLIST></ENT_RPT></DRILLTHRU_PARAM></SHEET></REPORT_RUNTIME_XML>'
select @l_runtime_xml = cast(@l_runtime_xmlAAA as XML)
SET @l_runtime_xml.modify('replace value of ((//SHEET/DRILLTHRU_PARAM/ENT_RPT[(ENT_RPT_ID/text())[1] eq sql:variable("@l_n_DrillRepID")]/PARAMLIST/PARAM[(NAME/text())[1] eq sql:variable("@griddrillparam")]/@l_s_DrillBtColumnTag))[1] with sql:variable("@l_s_BTNameSecond")')
set @l_runtime_xmlAAA = cast(@l_runtime_xml as nvarchar(max))
select @l_runtime_xmlAAA
I think your issue is here:
.../@l_s_DrillBtColumnTag))[1]
If I understand this correctly, you are trying to access an element named "V_userID" by putting a variable in that place. But this will not work: inside the XPath, @l_s_DrillBtColumnTag is read as an attribute name, not as a T-SQL variable.
Your question is not quite clear to me (and admittedly your structure isn't either; it seems too complicated). And what do you mean by deleting the last child of a node?
Your query would be the following:
Find the "ENT_RPT" with the given number; within it, find the "PARAM" whose name matches the given one; and on the same level find the element with the given name, which is the one to be replaced (see local-name()). Replace its value with the given value:
SET @l_runtime_xml.modify('replace value of (//SHEET/DRILLTHRU_PARAM/ENT_RPT[ENT_RPT_ID/text()=sql:variable("@l_n_DrillRepID")]/PARAMLIST/PARAM[NAME/text()=sql:variable("@griddrillparam")]/*[local-name(.)=sql:variable("@l_s_DrillBtColumnTag")]/text())[1] with sql:variable("@l_s_BTNameSecond")')
At first sight I assume that you are dealing with named parameters. It would be much better to put the value of these parameters in an element with the same name for all of them (e.g. <VALUE>something</VALUE>). They are specified via "NAME" anyway.
Even better would be a structure with attributes, something like
...<PARAM Name="userName" position="-999" moreAttributes...>SomeValue</PARAM>
This would make your navigation much easier and less error-prone.
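The local-name() technique is easier to see on a small document. The sketch below is an illustration only: the reduced <PARAMLIST> XML and the variable names @doc, @paramName, @tagName and @newValue are made up, and variables use the @ prefix that T-SQL requires.

```sql
-- Reduced, hypothetical version of the PARAMLIST structure from the question
DECLARE @doc XML;
SET @doc = N'<PARAMLIST>
  <PARAM><NAME>userID</NAME><V_userID>(none)</V_userID></PARAM>
  <PARAM><NAME>langID</NAME><V_langID>(none)</V_langID></PARAM>
</PARAMLIST>';

DECLARE @paramName NVARCHAR(30);
DECLARE @tagName   NVARCHAR(256);
DECLARE @newValue  NVARCHAR(30);
SET @paramName = N'userID';     -- which <PARAM> to touch (matched on <NAME>)
SET @tagName   = N'V_userID';   -- element name, known only at run time
SET @newValue  = N'l_s_userID'; -- replacement text

-- An element name cannot be supplied as a variable in the path itself,
-- so match any child (*) and filter on local-name() instead.
SET @doc.modify('
  replace value of
    (/PARAMLIST/PARAM[NAME/text() = sql:variable("@paramName")]
      /*[local-name(.) = sql:variable("@tagName")]/text())[1]
  with sql:variable("@newValue")');

SELECT @doc;
```

Only the text of <V_userID> inside the userID parameter should change; the langID parameter is left alone because its <NAME> does not match.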

SQL Server XQuery - Selecting a Subset

Take for example the following XML:
Initial Data
<computer_book>
<title>Selecting XML Nodes the Fun and Easy Way</title>
<isbn>9999999999999</isbn>
<pages>500</pages>
<backing>paperback</backing>
</computer_book>
and:
<cooking_book>
<title>50 Quick and Easy XML Dishes</title>
<isbn>5555555555555</isbn>
<pages>275</pages>
<backing>paperback</backing>
</cooking_book>
I have something similar in a single xml-typed column of a SQL Server 2008 database. Using SQL Server XQuery, would it be possible to get results such as this:
Resulting Data
<computer_book>
<title>Selecting XML Nodes the Fun and Easy Way</title>
<pages>500</pages>
</computer_book>
and:
<cooking_book>
<title>50 Quick and Easy XML Dishes</title>
<isbn>5555555555555</isbn>
</cooking_book>
Please note that I am not referring to selecting both examples in one query; rather I am selecting each via its primary key (which is in another column). In each case, I am essentially trying to select the root and an arbitrary subset of children. The roots can be different, as seen above, so I do not believe I can hard-code the root node name into a "for xml" clause.
I have a feeling SQL Server's XQuery capabilities will not allow this, and that is fine if it is the case. If I can accomplish this, however, I would greatly appreciate an example.
Here is the test data I used in the queries below:
declare @T table (XMLCol xml)
insert into @T values
('<computer_book>
<title>Selecting XML Nodes the Fun and Easy Way</title>
<isbn>9999999999999</isbn>
<pages>500</pages>
<backing>paperback</backing>
</computer_book>'),
('<cooking_book>
<title>50 Quick and Easy XML Dishes</title>
<isbn>5555555555555</isbn>
<pages>275</pages>
<backing>paperback</backing>
</cooking_book>')
You can filter the nodes under the root node like this, using local-name() and a list of the node names you want:
select XMLCol.query('/*/*[local-name()=("isbn","pages")]')
from @T
Result:
<isbn>9999999999999</isbn><pages>500</pages>
<isbn>5555555555555</isbn><pages>275</pages>
If I understand you correctly the problem with this is that you don't get the root node back.
This query will give you an empty root node:
select cast('<'+XMLCol.value('local-name(/*[1])', 'varchar(100)')+'/>' as xml)
from @T
Result:
<computer_book />
<cooking_book />
From this I have found two solutions for you.
Solution 1
Get the nodes from your table into a table variable and then modify the XML to look the way you want.
-- Table variable to hold the node(s) you want
declare @T2 table (RootNode xml, ChildNodes xml)
-- Fetch the xml from your table
insert into @T2
select cast('<'+XMLCol.value('local-name(/*[1])', 'varchar(100)')+'/>' as xml),
XMLCol.query('/*/*[local-name()=("isbn","pages")]')
from @T
-- Add the child nodes to the root node
update @T2 set
RootNode.modify('insert sql:column("ChildNodes") into (/*)[1]')
-- Fetch the modified XML
select RootNode
from @T2
Result:
RootNode
<computer_book><isbn>9999999999999</isbn><pages>500</pages></computer_book>
<cooking_book><isbn>5555555555555</isbn><pages>275</pages></cooking_book>
The sad part with this solution is that it does not work with SQL Server 2005.
Solution 2
Get the parts, build the XML as a string and cast it back to XML.
select cast('<'+XMLCol.value('local-name(/*[1])', 'varchar(100)')+'>'+
cast(XMLCol.query('/*/*[local-name()=("isbn","pages")]') as varchar(max))+
'</'+XMLCol.value('local-name(/*[1])', 'varchar(100)')+'>' as xml)
from @T
Result:
<computer_book><isbn>9999999999999</isbn><pages>500</pages></computer_book>
<cooking_book><isbn>5555555555555</isbn><pages>275</pages></cooking_book>
Making the nodes parameterized
In the queries above, the child nodes you get are hard-coded in the query. You can use sql:variable() to do this instead. I have not found a way of making the number of nodes dynamic, but you can add as many variables as you think you need and use null as the value for the ones you don't need.
declare @N1 varchar(10)
declare @N2 varchar(10)
declare @N3 varchar(10)
declare @N4 varchar(10)
set @N1 = 'isbn'
set @N2 = 'pages'
set @N3 = 'backing'
set @N4 = null
select cast('<'+XMLCol.value('local-name(/*[1])', 'varchar(100)')+'>'+
cast(XMLCol.query('/*/*[local-name()=(sql:variable("@N1"),
sql:variable("@N2"),
sql:variable("@N3"),
sql:variable("@N4"))]') as varchar(max))+
'</'+XMLCol.value('local-name(/*[1])', 'varchar(100)')+'>' as xml)
from @T
Result:
<computer_book><isbn>9999999999999</isbn><pages>500</pages><backing>paperback</backing></computer_book>
<cooking_book><isbn>5555555555555</isbn><pages>275</pages><backing>paperback</backing></cooking_book>

In SQL Server can I insert multiple nodes into XML from a table?

I want to generate some XML in a stored procedure based on data in a table.
The following insert allows me to add many nodes but they have to be hard-coded or use variables (sql:variable):
SET @MyXml.modify('
insert
<myNode>
{sql:variable("@MyVariable")}
</myNode>
into (/root[1]) ')
So I could loop through each record in my table, put the values I need into variables and execute the above statement.
But is there a way I can do this by just combining with a select statement and avoiding the loop?
Edit I have used SELECT FOR XML to do similar stuff before but I always find it hard to read when working with a hierarchy of data from multiple tables. I was hoping there would be something using the modify where the XML generated is more explicit and more controllable.
Have you tried nesting FOR XML PATH scalar valued functions?
With the nesting technique, you can break your SQL into very manageable, readable elemental pieces
Disclaimer: the following, while adapted from a working example, has not itself been literally tested
Some reference links for the general audience
http://msdn2.microsoft.com/en-us/library/ms178107(SQL.90).aspx
http://msdn2.microsoft.com/en-us/library/ms189885(SQL.90).aspx
The simplest, lowest level nested node example
Consider the following invocation
DECLARE @NestedInput_SpecificDogNameId int
SET @NestedInput_SpecificDogNameId = 99
SELECT [dbo].[udfGetLowestLevelNestedNode_SpecificDogName]
(@NestedInput_SpecificDogNameId)
Let's say udfGetLowestLevelNestedNode_SpecificDogName had been written without the FOR XML PATH clause, and for @NestedInput_SpecificDogNameId = 99 it returns the single-row result set:
@SpecificDogNameId DogName
99 Astro
But with the FOR XML PATH clause,
CREATE FUNCTION dbo.udfGetLowestLevelNestedNode_SpecificDogName
(
@NestedInput_SpecificDogNameId int
)
RETURNS XML
AS
BEGIN
-- Declare the return variable here
DECLARE @ResultVar XML
-- Add the T-SQL statements to compute the return value here
SET @ResultVar =
(
SELECT
t.SpecificDogNameId AS "@SpecificDogNameId", -- the "@" alias emits the column as an attribute
t.DogName AS "text()" -- emit the name as the element's text content
FROM tblDogs t
WHERE t.SpecificDogNameId = @NestedInput_SpecificDogNameId -- only the requested dog
FOR XML PATH('Dog'), TYPE
)
-- Return the result of the function
RETURN @ResultVar
END
the user-defined function produces the following XML (the @ sign causes the SpecificDogNameId field to be returned as an attribute)
<Dog SpecificDogNameId="99">Astro</Dog>
Nesting User-defined Functions of XML Type
User-defined functions such as the above udfGetLowestLevelNestedNode_SpecificDogName can be nested to provide a powerful method to produce complex XML.
For example, the function
CREATE FUNCTION [dbo].[udfGetDogCollectionNode]()
RETURNS XML
AS
BEGIN
-- Declare the return variable here
DECLARE @ResultVar XML
-- Add the T-SQL statements to compute the return value here
SET @ResultVar =
(
SELECT
[dbo].[udfGetLowestLevelNestedNode_SpecificDogName]
(t.SpecificDogNameId)
FROM tblDogs t
-- PATH('') adds no per-row wrapper; ROOT wraps all <Dog> rows in a single <DogCollection>
FOR XML PATH(''), ROOT('DogCollection'), TYPE
)
-- Return the result of the function
RETURN @ResultVar
END
when invoked as
SELECT [dbo].[udfGetDogCollectionNode]()
might produce the complex XML node (given the appropriate underlying data)
<DogCollection>
<Dog SpecificDogNameId="88">Dino</Dog>
<Dog SpecificDogNameId="99">Astro</Dog>
</DogCollection>
From here, you could keep working upwards in the nested tree to build as complex an XML structure as you please
CREATE FUNCTION [dbo].[udfGetAnimalCollectionNode]()
RETURNS XML
AS
BEGIN
DECLARE @ResultVar XML
SET @ResultVar =
(
SELECT
dbo.udfGetDogCollectionNode(),
dbo.udfGetCatCollectionNode()
FOR XML PATH('AnimalCollection'), ELEMENTS XSINIL
)
RETURN @ResultVar
END
when invoked as
SELECT [dbo].[udfGetAnimalCollectionNode]()
the udf might produce the more complex XML node (given the appropriate underlying data)
<AnimalCollection>
<DogCollection>
<Dog SpecificDogNameId="88">Dino</Dog>
<Dog SpecificDogNameId="99">Astro</Dog>
</DogCollection>
<CatCollection>
<Cat SpecificCatNameId="11">Sylvester</Cat>
<Cat SpecificCatNameId="22">Tom</Cat>
<Cat SpecificCatNameId="33">Felix</Cat>
</CatCollection>
</AnimalCollection>
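The same nesting idea also works without user-defined functions, by nesting FOR XML PATH subqueries directly and marking them with TYPE so the inner result is embedded as XML rather than as escaped text. The following is a minimal sketch, not the answer's tested code: the table variables @Owners and @Dogs and all of their columns are hypothetical, and the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample tables: one owner row per parent node, dogs as child nodes
DECLARE @Owners TABLE (OwnerId int, OwnerName nvarchar(50));
DECLARE @Dogs   TABLE (DogId int, OwnerId int, DogName nvarchar(50));

INSERT INTO @Owners VALUES (1, N'George'), (2, N'Fred');
INSERT INTO @Dogs   VALUES (99, 1, N'Astro'), (88, 2, N'Dino');

SELECT
    o.OwnerId   AS "@OwnerId",   -- "@" alias: emitted as an attribute of <Owner>
    o.OwnerName AS "Name",       -- emitted as a child element <Name>
    (
        -- Correlated subquery: one <Dog> element per dog of this owner
        SELECT d.DogId   AS "@DogId",
               d.DogName AS "text()"
        FROM @Dogs d
        WHERE d.OwnerId = o.OwnerId
        FOR XML PATH('Dog'), TYPE
    ) AS "Dogs"                  -- wraps the nested <Dog> elements in <Dogs>
FROM @Owners o
FOR XML PATH('Owner'), ROOT('Owners'), TYPE;
```

Each level of the hierarchy is one SELECT, so the whole document is produced in a single set-based statement without looping over rows and calling modify().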
Use sql:column instead of sql:variable. You can find detailed info here: http://msdn.microsoft.com/en-us/library/ms191214.aspx
Can you tell us a bit more about what exactly you are planning to do?
Is it simply generating XML data based on the content of the table,
or adding some data from the table to an existing XML structure?
There is a great series of articles on XML in SQL Server written by Jacob Sebastian; it starts with the basics of generating XML from the data in a table.

Using SQL Server 2005's XQuery select all nodes with a specific attribute value, or with that attribute missing

Update: giving a much more thorough example.
The first two solutions offered were right along the lines of what I was trying to say not to do. I can't know location, it needs to be able to look at the whole document tree. So a solution along these lines, with /Books/ specified as the context will not work:
SELECT x.query('.') FROM @xml.nodes('/Books/*[not(@ID) or @ID = 5]') x1(x)
Original question with better example:
Using SQL Server 2005's XQuery implementation I need to select all nodes in an XML document, just once each and keeping their original structure, but only if they are missing a particular attribute, or that attribute has a specific value (passed in by parameter). The query also has to work on the whole XML document (descendant-or-self axis) rather than selecting at a predefined depth.
That is to say, each individual node will appear in the resultant document only if it and every one of its ancestors are missing the attribute, or have the attribute with a single specific value.
For example:
If this were the XML:
DECLARE @Xml XML
SET @Xml =
N'
<Library>
<Novels>
<Novel category="1">Novel1</Novel>
<Novel category="2">Novel2</Novel>
<Novel>Novel3</Novel>
<Novel category="4">Novel4</Novel>
</Novels>
<Encyclopedias>
<Encyclopedia>
<Volume>A-F</Volume>
<Volume category="2">G-L</Volume>
<Volume category="3">M-S</Volume>
<Volume category="4">T-Z</Volume>
</Encyclopedia>
</Encyclopedias>
<Dictionaries category="1">
<Dictionary>Webster</Dictionary>
<Dictionary>Oxford</Dictionary>
</Dictionaries>
</Library>
'
A parameter of 1 for category would result in this:
<Library>
<Novels>
<Novel category="1">Novel1</Novel>
<Novel>Novel3</Novel>
</Novels>
<Encyclopedias>
<Encyclopedia>
<Volume>A-F</Volume>
</Encyclopedia>
</Encyclopedias>
<Dictionaries category="1">
<Dictionary>Webster</Dictionary>
<Dictionary>Oxford</Dictionary>
</Dictionaries>
</Library>
A parameter of 2 for category would result in this:
<Library>
<Novels>
<Novel category="2">Novel2</Novel>
<Novel>Novel3</Novel>
</Novels>
<Encyclopedias>
<Encyclopedia>
<Volume>A-F</Volume>
<Volume category="2">G-L</Volume>
</Encyclopedia>
</Encyclopedias>
</Library>
I know XSLT is perfectly suited for this job, but it's not an option. We have to accomplish this entirely in SQL Server 2005. Any implementations not using XQuery are fine too, as long as it can be done entirely in T-SQL.
It's not clear to me from your example what you're actually trying to achieve. Do you want to return a new XML document with all the nodes stripped out except those that fulfill the condition? If so, this looks like a job for an XSLT transform, which I don't think is built into MSSQL 2005 (it can be added as a UDF: http://www.topxml.com/rbnews/SQLXML/re-23872_Performing-XSLT-Transforms-on-XML-Data-Stored-in-SQL-Server-2005.aspx).
If you just need to return the list of nodes then you can use this expression:
//Book[not(@ID) or @ID = 5]
but I get the impression that it's not what you need. It would help if you can provide a clearer example.
Edit: This example is indeed clearer. The best that I could find is this:
SET @Xml.modify('delete(//*[@category!=1])')
SELECT @Xml
The idea is to delete from the XML all the nodes that you don't need, so that you are left with the original structure and the needed nodes. I tested it with your two examples and it produced the wanted result.
However, modify has some restrictions: it seems you can't use it in a SELECT statement, because it has to modify data in place. If you need to return such data with a SELECT, you could copy the original data into a temporary table and then update that table. Something like this:
INSERT INTO #temp VALUES(@Xml)
UPDATE #temp SET data.modify('delete(//*[@category!=2])')
Hope that helps.
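Since the question asks for the category to be passed in as a parameter, the delete approach can be sketched with sql:variable() in place of the literal. This is an untested illustration on a shortened copy of the question's document; the variable name @category is an assumption, and variables use the @ prefix that T-SQL requires.

```sql
-- Shortened copy of the <Library> document from the question
DECLARE @Xml XML;
SET @Xml = N'<Library>
  <Novels>
    <Novel category="1">Novel1</Novel>
    <Novel category="2">Novel2</Novel>
    <Novel>Novel3</Novel>
  </Novels>
  <Dictionaries category="1">
    <Dictionary>Webster</Dictionary>
  </Dictionaries>
</Library>';

DECLARE @category int;
SET @category = 2;  -- the category to keep

-- Delete every element that HAS a category attribute with a different value.
-- Elements without the attribute never satisfy the predicate, so they stay;
-- deleting an element also removes all of its descendants.
SET @Xml.modify('delete //*[@category != sql:variable("@category")]');

SELECT @Xml;
```

With @category set to 2, the intent is that Novel1 and the whole <Dictionaries> subtree are removed, while Novel2 and the attribute-less Novel3 remain.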
The question is not really clear, but is this what you're looking for?
DECLARE @Xml AS XML
SET @Xml =
N'
<Books>
<Book ID="1">Book1</Book>
<Book ID="2">Book2</Book>
<Book ID="3">Book3</Book>
<Book>Book4</Book>
<Book ID="5">Book5</Book>
<Book ID="6">Book6</Book>
<Book>Book7</Book>
<Book ID="8">Book8</Book>
</Books>
'
DECLARE @BookID AS INT
SET @BookID = 5
DECLARE @Result AS XML
SET @Result = (SELECT @Xml.query('//Book[not(@ID) or @ID = sql:variable("@BookID")]'))
SELECT @Result
