Parsing xml in sql server - sql-server

I have a table with an ntext type column that holds xml. I have tried to apply many examples of how to pull the value for the company's name from the xml for a particular node, but continue to get a syntax error. Below is what I've done, except substituted my select statement for the actual xml output
DECLARE @companyxml xml
SET @companyxml =
'<Home>
<slideshowImage1>1105</slideshowImage1>
<slideshowImage2>1106</slideshowImage2>
<slideshowImage3>1107</slideshowImage3>
<slideshowImage4>1108</slideshowImage4>
<slideshowImage5>1109</slideshowImage5>
<bottomNavImg1>1155</bottomNavImg1>
<bottomNavImg2>1156</bottomNavImg2>
<bottomNavImg3>1157</bottomNavImg3>
<pageTitle>Acme Capital Management |Homepage</pageTitle>
<metaKeywords><![CDATA[]]></metaKeywords>
<metaDescription><![CDATA[]]></metaDescription>
<companyName>Acme Capital Management</companyName>
<logoImg>1110</logoImg>
<pageHeader></pageHeader>
</Home>'
SELECT c.value ('companyName','varchar(1000)') AS companyname
FROM @companyxml.nodes('/Home') AS c
For some reason, the select c.value statement has a syntax problem that I can't figure out. On hover in SSMS, it says 'cannot find either column "c" or the user-defined function or aggregate "c.value", or the name is ambiguous.'
Any help on the syntax would be greatly appreciated.

Try this:
DECLARE @companyxml xml
SET @companyxml =
'<Home>
<slideshowImage1>1105</slideshowImage1>
<slideshowImage2>1106</slideshowImage2>
<slideshowImage3>1107</slideshowImage3>
<slideshowImage4>1108</slideshowImage4>
<slideshowImage5>1109</slideshowImage5>
<bottomNavImg1>1155</bottomNavImg1>
<bottomNavImg2>1156</bottomNavImg2>
<bottomNavImg3>1157</bottomNavImg3>
<pageTitle>Acme Capital Management Homepage</pageTitle>
<metaKeywords>CDATA</metaKeywords>
<metaDescription>CDATA</metaDescription>
<companyName>Acme Capital Management</companyName>
<logoImg>1110</logoImg>
<pageHeader></pageHeader>
</Home>'
DECLARE @Result AS varchar(50)
SET @Result = @companyxml.value('(/Home/companyName/text())[1]','varchar(50)')
SELECT @Result
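For reference, the error in the question comes from the alias: nodes() has to be given both a table alias and a column alias (for example AS T(c)), and the path passed to value() has to be guaranteed to return a single node. Below is a minimal sketch of the original query with those two changes, followed by a variant that reads from the ntext column. The table and column names dbo.Companies and CompanyData are hypothetical, since the question does not give them, and the XML is shortened for brevity.

```sql
DECLARE @companyxml xml;
SET @companyxml = N'<Home><companyName>Acme Capital Management</companyName></Home>';

-- nodes() needs a table alias and a column alias: T(c)
-- value() needs a path that yields exactly one node: (...)[1]
SELECT T.c.value('(companyName/text())[1]', 'varchar(1000)') AS companyname
FROM @companyxml.nodes('/Home') AS T(c);

-- Hypothetical table and column names (not from the question).
-- The ntext value is cast through nvarchar(max) before the cast to xml.
SELECT CAST(CAST(t.CompanyData AS nvarchar(max)) AS xml).value('(/Home/companyName/text())[1]', 'varchar(1000)') AS companyname
FROM dbo.Companies AS t;
```

The nodes() form is mostly useful when one document can contain several matching elements; for a single value, calling value() directly on the xml, as in the answer above, is enough.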

Related

Sql Server - How do I get JSON nested value in my SQL Select statement

Environment: SQL Server 2014 and above
How do I access the email value in my JSON value with my SELECT statement?
select JSON_VALUE('[{"data":{"email":"test@email.com"}}]', '$.email') as test
JSON support was only introduced in SQL Server 2016, so with any earlier version you would need to either use string manipulation code or parse the JSON outside of SQL Server (maybe using a CLR function).
For version 2016 or higher, you can use JSON_VALUE like this:
declare @json as varchar(100) = '[{"data":{"email":"test@email.com"}}]';
select JSON_VALUE(@json, '$[0].data.email') as test
For older versions - you might be able to get away with this, but if your json value does not contain an email property, you will get unexpected results:
select substring(string, start, charindex('"', string, start+1) - start) as test
from (
select @json as string, charindex('"email":"', @json) + 9 as start
) s
You can see a live demo on db<>fiddle
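Another way. PatternSplitCM (a user-defined splitter function that has to be created in your database first; it is not built into SQL Server) is great for stuff like this.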
Extract a single Email value:
DECLARE @json as varchar(200) = '[{"data":{"email":"test@email.com"}}]';
SELECT f.Item
FROM dbo.patternsplitCM(@json,'[a-z0-9@.]') AS f
WHERE f.item LIKE '%[a-z]%@%.%[a-z]%'; -- Simple Email Check Pattern
Extracting all email addresses (if/when there are more):
DECLARE @json VARCHAR(200) = '[{"data":{"email":"test@email.com"},{"email2":"test2@email.net"}},{"data":{"MoreEmail":"test3@email.555whatever"}}]';
SELECT f.Item
FROM dbo.patternsplitCM(@json,'[a-z0-9@.]') AS f
WHERE f.item LIKE '%[a-z]%@%.%[a-z]%'; -- Simple Email Check Pattern
Returns:
Item
--------------------------
test@email.com
test2@email.net
test3@email.555whatever
Or, to get only the first email address that appears:
SELECT TOP (1) f.Item
FROM dbo.patternsplitCM(@json,'[a-z0-9@.]') AS f
WHERE f.item LIKE '%[a-z]%@%.%[a-z]%' -- Simple Email Check Pattern
ORDER BY ROW_NUMBER() OVER (ORDER BY f.ItemNumber)
Nasty fast, super-simple. No cursors, loops or other bad stuff.
With v2014 there is no JSON support, but - if your real JSON is that simple - it is sometimes a good idea to use some replacements in order to transform the JSON to XML like here, which allows for the native XML methods:
DECLARE @YourJSON NVARCHAR(MAX)=N'[{"data":{"email":"test@email.com"}}]';
SELECT CAST(REPLACE(REPLACE(REPLACE(REPLACE(@YourJSON,'[{"','<'),'":{"',' '),'":"','="'),'}}]',' />') AS XML).value('(/data/@email)[1]','nvarchar(max)');
It can be done in two ways:
First, if your JSON data is between [ ] like in your question:
select JSON_VALUE('[{"data":{"email":"test@email.com"}}]','$[0].data.email' ) as test
And if your JSON data is not between [ ]:
select JSON_VALUE('{"data":{"email":"test@email.com"}}','$.data.email' ) as test
You can test the code above here
Your query should be like this (SQL Server 2016):
DECLARE @json_string NVARCHAR(MAX) = 'your_json_value'
SELECT [key],value
FROM OPENJSON(@json_string, '$.email')
UPDATE:
select JSON_VALUE(@json_string, '$[0].data.email') as test
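If the array can hold more than one object, OPENJSON with a WITH clause may be a cleaner fit than indexing with $[0], since it returns one row per array element. This is only a sketch: it assumes SQL Server 2016 or later with database compatibility level 130 or higher, and the second array element is made-up sample data, not part of the question.

```sql
-- Sample JSON; the second element is illustrative only.
DECLARE @json nvarchar(max) =
    N'[{"data":{"email":"test@email.com"}},{"data":{"email":"test2@email.net"}}]';

-- One row per array element; the path in the WITH clause is
-- evaluated relative to each element.
SELECT j.email
FROM OPENJSON(@json)
     WITH (email nvarchar(256) '$.data.email') AS j;
```

Elements that have no data.email property come back as NULL rather than raising an error, because the path is in the default lax mode.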

SQL Server 2012 - How to extract a value from an XML string?

This seems very basic, but I haven't been able to find an example that works for me, so I'd appreciate any advice.
I have a SQL Server function that determines various dates based on our fiscal year and today's date, and returns one row which looks like...
<row LastDayPrevMonth="2015-04-30T00:00:00" LastDayPrevMonthLY="2014-04-30T00:00:00" ... />
In the stored proc which calls that function, I've done...
DECLARE @X XML
SET @X = dbo.GetFiscalYearDates()
...but then I can't seem to extract the value of LastDayPrevMonth.
I've tried dozens of variations of this:
SELECT ROW.ITEM.VALUE('LastDayPrevMonth', 'VARCHAR(30)')[1] AS Foo FROM @x.nodes('row/item') ... sometimes with an "AS Bar" at the end...
That particular syntax gives the error "Incorrect syntax near the keyword 'as'", but none of the tweaks I try help.
Thanks for your assistance, dudes!
declare @doc xml
select @doc= '
<root>
<row LastDayPrevMonth="2015-04-30T00:00:00" LastDayPrevMonthLY="2014-04-30T00:00:00" />
</root>
'
SELECT
LastDayPrevMonth = Y.i.value('(@LastDayPrevMonth)[1]', 'datetime')
, LastDayPrevMonthLY = Y.i.value('(@LastDayPrevMonthLY)[1]', 'datetime')
FROM
@doc.nodes('root/row') AS Y(i)
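Since the function in the question returns a single row element with no wrapping root, the same values can probably be read without nodes() at all. A minimal sketch, using a literal as a stand-in for dbo.GetFiscalYearDates() and assuming its output looks like the sample in the question:

```sql
DECLARE @X xml;
-- Stand-in for dbo.GetFiscalYearDates(); assumed to return one <row .../> element.
SET @X = N'<row LastDayPrevMonth="2015-04-30T00:00:00" LastDayPrevMonthLY="2014-04-30T00:00:00" />';

-- Attributes are addressed with @name; (...)[1] guarantees a single value.
SELECT @X.value('(/row/@LastDayPrevMonth)[1]', 'datetime')   AS LastDayPrevMonth,
       @X.value('(/row/@LastDayPrevMonthLY)[1]', 'datetime') AS LastDayPrevMonthLY;
```

Note that the XML method names (value, nodes, query) are case-sensitive, so VALUE in upper case, as in the question, will not work.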

Scalar function fn_cdc_get_min_lsn() constantly returns '0x00000000000000000000' for valid table names?

I have Change Data Capture (CDC) activated on my MS SQL 2008 database and use the following code to add a new table to the data capture:
EXEC sys.sp_cdc_enable_table
@source_schema ='ordering',
@source_name ='Fields',
@role_name = NULL,
@supports_net_changes = 0;
However, whenever I try to select the changes from the tracking tables using the sys.fn_cdc_get_min_lsn(@TableName) function
SET @Begin_LSN = sys.fn_cdc_get_min_lsn('Fields')
I always get the zero value.
I tried adding the schema name using the following spelling:
SET @Begin_LSN = sys.fn_cdc_get_min_lsn('ordering.Fields')
but this didn't help.
My mistake was to assume that sys.fn_cdc_get_min_lsn() accepts the table name. I was probably misled by the examples in the MSDN documentation and didn't check the exact meaning of the parameters.
It turns out that sys.fn_cdc_get_min_lsn() accepts the capture instance name, not the table name!
A cursory glance at my current capture instances:
SELECT capture_instance FROM cdc.change_tables
returns the correct parameter name:
ordering_Fields
So, with the default capture instance name, one should use an underscore as the schema separator, and not the dot notation that is common in SQL Server.
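To avoid hard-coding the capture instance name, it can be looked up from the source table. A minimal sketch, assuming CDC is already enabled for ordering.Fields and that the table has a single capture instance (a table can have up to two, in which case the query below picks one arbitrarily):

```sql
DECLARE @capture_instance sysname;
DECLARE @Begin_LSN binary(10);

-- cdc.change_tables maps each capture instance to its source table.
SELECT @capture_instance = ct.capture_instance
FROM cdc.change_tables AS ct
WHERE ct.source_object_id = OBJECT_ID(N'ordering.Fields');

-- Pass the capture instance name, not the table name.
SET @Begin_LSN = sys.fn_cdc_get_min_lsn(@capture_instance);

SELECT @capture_instance AS capture_instance, @Begin_LSN AS min_lsn;
```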
I know this is mostly already explained in this post, but I thought I would put together my evening's journey through CDC.
This error:
"An insufficient number of arguments were supplied for the procedure or function cdc..."
is probably caused by your low LSN being 0x00.
This in turn might be because you passed the wrong capture instance name to fn_cdc_get_min_lsn.
Use SELECT * FROM cdc.change_tables to find it.
Lastly, make sure you use binary(10) to store your LSN. If you use just varbinary or binary, you will again get 0x00. This is clearly payback for me scoffing at all those noobs using varchar and wondering why their strings are truncated to one character.
Sample script:
declare @S binary(10)
declare @E binary(10)
-- the capture instance name must match the suffix of the net-changes function below
SET @S = sys.fn_cdc_get_min_lsn('dbo_YourTable')
SET @E = sys.fn_cdc_get_max_lsn()
SELECT @S, @E
SELECT *
FROM [cdc].[fn_cdc_get_net_changes_dbo_YourTable]
(
@S,@E,'all'
)
The above answer is correct. Alternatively, you can pass an additional parameter, @capture_instance, when enabling CDC on the table:
EXEC sys.sp_cdc_enable_table
@source_schema ='ordering',
@source_name ='Fields',
@capture_instance = 'dbo_Fields',
@role_name = NULL,
@supports_net_changes = 0;
Then use the capture_instance string in the min_lsn function:
SET @Begin_LSN = sys.fn_cdc_get_min_lsn('dbo_Fields')
This will return the first LSN, and not 0x00000000000000000000.
This is particularly useful when trying to solve the error
"An insufficient number of arguments were supplied for the procedure or function cdc..." from SQL Server when calling
cdc_get_net_changes_Fields(@Begin_LSN, sys.fn_cdc_get_max_lsn(), 'all')
which simply means "LSN out of expected range".

In SQL Server can I insert multiple nodes into XML from a table?

I want to generate some XML in a stored procedure based on data in a table.
The following insert allows me to add many nodes but they have to be hard-coded or use variables (sql:variable):
SET @MyXml.modify('
insert
<myNode>
{sql:variable("@MyVariable")}
</myNode>
into (/root[1]) ')
So I could loop through each record in my table, put the values I need into variables and execute the above statement.
But is there a way I can do this by just combining with a select statement and avoiding the loop?
Edit I have used SELECT FOR XML to do similar stuff before but I always find it hard to read when working with a hierarchy of data from multiple tables. I was hoping there would be something using the modify where the XML generated is more explicit and more controllable.
Have you tried nesting FOR XML PATH scalar valued functions?
With the nesting technique, you can break your SQL into very manageable, readable elemental pieces.
Disclaimer: the following, while adapted from a working example, has not itself been literally tested
Some reference links for the general audience
http://msdn2.microsoft.com/en-us/library/ms178107(SQL.90).aspx
http://msdn2.microsoft.com/en-us/library/ms189885(SQL.90).aspx
The simplest, lowest level nested node example
Consider the following invocation
DECLARE @NestedInput_SpecificDogNameId int
SET @NestedInput_SpecificDogNameId = 99
SELECT [dbo].[udfGetLowestLevelNestedNode_SpecificDogName]
(@NestedInput_SpecificDogNameId)
Let's say udfGetLowestLevelNestedNode_SpecificDogName had been written without the FOR XML PATH clause, and for @NestedInput_SpecificDogNameId = 99 it returns the single rowset record:
@SpecificDogNameId DogName
99 Astro
But with the FOR XML PATH clause,
CREATE FUNCTION dbo.udfGetLowestLevelNestedNode_SpecificDogName
(
@NestedInput_SpecificDogNameId int
)
RETURNS XML
AS
BEGIN
-- Declare the return variable here
DECLARE @ResultVar XML
-- Add the T-SQL statements to compute the return value here
SET @ResultVar =
(
SELECT
-- the "@" prefix in the alias turns the column into an attribute of <Dog>
t.SpecificDogNameId as "@SpecificDogNameId",
-- "text()" puts the name directly inside <Dog> rather than in a <DogName> child
t.DogName as "text()"
FROM tblDogs t
WHERE t.SpecificDogNameId = @NestedInput_SpecificDogNameId
FOR XML PATH('Dog')
)
-- Return the result of the function
RETURN @ResultVar
END
the user-defined function produces the following XML (the @ sign causes the SpecificDogNameId field to be returned as an attribute)
<Dog SpecificDogNameId="99">Astro</Dog>
Nesting User-defined Functions of XML Type
User-defined functions such as the above udfGetLowestLevelNestedNode_SpecificDogName can be nested to provide a powerful method to produce complex XML.
For example, the function
CREATE FUNCTION [dbo].[udfGetDogCollectionNode]()
RETURNS XML
AS
BEGIN
-- Declare the return variable here
DECLARE @ResultVar XML
-- Add the T-SQL statements to compute the return value here
SET @ResultVar =
(
SELECT
[dbo].[udfGetLowestLevelNestedNode_SpecificDogName]
(t.SpecificDogNameId)
FROM tblDogs t
-- PATH('') adds no per-row wrapper; ROOT wraps all rows in one <DogCollection>
FOR XML PATH(''), ROOT('DogCollection')
)
-- Return the result of the function
RETURN @ResultVar
END
when invoked as
SELECT [dbo].[udfGetDogCollectionNode]()
might produce the complex XML node (given the appropriate underlying data)
<DogCollection>
<Dog SpecificDogNameId="88">Dino</Dog>
<Dog SpecificDogNameId="99">Astro</Dog>
</DogCollection>
From here, you could keep working upwards in the nested tree to build as complex an XML structure as you please
CREATE FUNCTION [dbo].[udfGetAnimalCollectionNode]()
RETURNS XML
AS
BEGIN
DECLARE @ResultVar XML
SET @ResultVar =
(
SELECT
dbo.udfGetDogCollectionNode(),
dbo.udfGetCatCollectionNode()
FOR XML PATH('AnimalCollection'), ELEMENTS XSINIL
)
RETURN @ResultVar
END
when invoked as
SELECT [dbo].[udfGetAnimalCollectionNode]()
the udf might produce the more complex XML node (given the appropriate underlying data)
<AnimalCollection>
<DogCollection>
<Dog SpecificDogNameId="88">Dino</Dog>
<Dog SpecificDogNameId="99">Astro</Dog>
</DogCollection>
<CatCollection>
<Cat SpecificCatNameId="11">Sylvester</Cat>
<Cat SpecificCatNameId="22">Tom</Cat>
<Cat SpecificCatNameId="33">Felix</Cat>
</CatCollection>
</AnimalCollection>
Use sql:column instead of sql:variable. You can find detailed info here: http://msdn.microsoft.com/en-us/library/ms191214.aspx
Can you tell us a bit more about what exactly you are planning to do?
Is it simply generating XML data based on the content of the table,
or adding some data from the table to an existing XML structure?
There is a great series of articles on XML in SQL Server written by Jacob Sebastian; it starts with the basics of generating XML from the data in a table.
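One way to combine the two ideas from this thread, and avoid the loop while still using modify(), might be to build the node fragment with FOR XML and insert it in a single statement. This is only a sketch: the table variable @Source and its column Val are hypothetical stand-ins for the real table, and inserting an xml-typed sql:variable requires SQL Server 2008 or later.

```sql
-- Hypothetical source data; replace with the real table.
DECLARE @Source table (Val nvarchar(50));
INSERT INTO @Source (Val) VALUES (N'one'), (N'two'), (N'three');

DECLARE @MyXml xml;
SET @MyXml = N'<root />';

-- Build one <myNode> element per row in a single statement.
DECLARE @Nodes xml;
SET @Nodes = (
    SELECT s.Val AS [text()]
    FROM @Source AS s
    FOR XML PATH('myNode'), TYPE
);

-- Insert the whole fragment at once (xml-typed sql:variable: SQL Server 2008+).
SET @MyXml.modify('insert sql:variable("@Nodes") into (/root)[1]');

SELECT @MyXml;
```

Add an ORDER BY to the inner SELECT if the order of the inserted nodes matters.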

Using SQL Server 2005's XQuery select all nodes with a specific attribute value, or with that attribute missing

Update: giving a much more thorough example.
The first two solutions offered were right along the lines of what I was trying to say not to do. I can't know location, it needs to be able to look at the whole document tree. So a solution along these lines, with /Books/ specified as the context will not work:
SELECT x.query('.') FROM @xml.nodes('/Books/*[not(@ID) or @ID = 5]') x1(x)
Original question with better example:
Using SQL Server 2005's XQuery implementation I need to select all nodes in an XML document, just once each and keeping their original structure, but only if they are missing a particular attribute, or that attribute has a specific value (passed in by parameter). The query also has to work on the whole XML document (descendant-or-self axis) rather than selecting at a predefined depth.
That is to say, each individual node will appear in the resultant document only if it and every one of its ancestors are missing the attribute, or have the attribute with a single specific value.
For example:
If this were the XML:
DECLARE @Xml XML
SET @Xml =
N'
<Library>
<Novels>
<Novel category="1">Novel1</Novel>
<Novel category="2">Novel2</Novel>
<Novel>Novel3</Novel>
<Novel category="4">Novel4</Novel>
</Novels>
<Encyclopedias>
<Encyclopedia>
<Volume>A-F</Volume>
<Volume category="2">G-L</Volume>
<Volume category="3">M-S</Volume>
<Volume category="4">T-Z</Volume>
</Encyclopedia>
</Encyclopedias>
<Dictionaries category="1">
<Dictionary>Webster</Dictionary>
<Dictionary>Oxford</Dictionary>
</Dictionaries>
</Library>
'
A parameter of 1 for category would result in this:
<Library>
<Novels>
<Novel category="1">Novel1</Novel>
<Novel>Novel3</Novel>
</Novels>
<Encyclopedias>
<Encyclopedia>
<Volume>A-F</Volume>
</Encyclopedia>
</Encyclopedias>
<Dictionaries category="1">
<Dictionary>Webster</Dictionary>
<Dictionary>Oxford</Dictionary>
</Dictionaries>
</Library>
A parameter of 2 for category would result in this:
<Library>
<Novels>
<Novel category="2">Novel2</Novel>
<Novel>Novel3</Novel>
</Novels>
<Encyclopedias>
<Encyclopedia>
<Volume>A-F</Volume>
<Volume category="2">G-L</Volume>
</Encyclopedia>
</Encyclopedias>
</Library>
I know XSLT is perfectly suited for this job, but it's not an option. We have to accomplish this entirely in SQL Server 2005. Any implementations not using XQuery are fine too, as long as it can be done entirely in T-SQL.
It's not clear to me from your example what you're actually trying to achieve. Do you want to return new XML with all the nodes stripped out except those that fulfill the condition? If so, this looks like a job for an XSLT transform, which I don't think is built into MSSQL 2005 (it can be added as a UDF: http://www.topxml.com/rbnews/SQLXML/re-23872_Performing-XSLT-Transforms-on-XML-Data-Stored-in-SQL-Server-2005.aspx).
If you just need to return the list of nodes then you can use this expression:
//Book[not(@ID) or @ID = 5]
but I get the impression that it's not what you need. It would help if you can provide a clearer example.
Edit: This example is indeed clearer. The best that I could find is this:
SET @Xml.modify('delete(//*[@category!=1])')
SELECT @Xml
The idea is to delete from the XML all the nodes that you don't need, so you are left with the original structure and the needed nodes. I tested with your two examples and it produced the wanted result.
However, modify has some restrictions: it seems you can't use it in a SELECT statement, because it has to modify data in place. If you need to return such data with a SELECT, you could copy the original data into a temporary table and then update that table. Something like this:
INSERT INTO #temp VALUES(@Xml)
UPDATE #temp SET data.modify('delete(//*[@category!=2])')
Hope that helps.
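Since the question asks for the category to be passed in as a parameter, the literal in the delete expression can presumably be swapped for sql:variable. A minimal sketch on a shortened version of the question's document, assuming the category attribute always holds a number and the parameter is an int named @Category (a name chosen here for illustration):

```sql
DECLARE @Category int;
SET @Category = 2;

DECLARE @Xml xml;
SET @Xml = N'<Library>
  <Novels>
    <Novel category="1">Novel1</Novel>
    <Novel category="2">Novel2</Novel>
    <Novel>Novel3</Novel>
  </Novels>
  <Dictionaries category="1">
    <Dictionary>Webster</Dictionary>
  </Dictionaries>
</Library>';

-- Delete every element whose category attribute is present and differs from
-- the parameter. Elements without the attribute are kept, and deleting an
-- element removes its descendants along with it.
SET @Xml.modify('delete //*[@category != sql:variable("@Category")]');

SELECT @Xml;
```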
The question is not really clear, but is this what you're looking for?
DECLARE @Xml AS XML
SET @Xml =
N'
<Books>
<Book ID="1">Book1</Book>
<Book ID="2">Book2</Book>
<Book ID="3">Book3</Book>
<Book>Book4</Book>
<Book ID="5">Book5</Book>
<Book ID="6">Book6</Book>
<Book>Book7</Book>
<Book ID="8">Book8</Book>
</Books>
'
DECLARE @BookID AS INT
SET @BookID = 5
DECLARE @Result AS XML
SET @Result = (SELECT @Xml.query('//Book[not(@ID) or @ID = sql:variable("@BookID")]'))
SELECT @Result
