How do I perform a string function whilst using OPENJSON WITH? - sql-server

In this example, DT_RowId is a concatenated string. I need to extract out its values, and make them available in a WHERE clause (not shown).
Is there a way to perform string functions on a value as part of a FROM OPENJSON WITH?
Is there a proper way to extract concatenated strings from a value without using a clunky SELECT statement?
Side note: This example is REALLY part of an UPDATE statement, so I'd be using the extracted values in the WHERE clause (not shown here). Also, also: Split is a custom string function we have.
BTW: I have full control of that DT_RowId, and I could make it an array, for example, [42, 1, 1]
declare @jsonRequest nvarchar(max) = '{"DT_RowId":"42_1_14","Action":"edit","Schedule":"1","Slot":"1","Period":"9:00 to 9:30 UPDATED","AMOnly":"0","PMOnly":"0","AllDay":"1"}'
select
(select Item from master.dbo.Split(source.DT_RowId, '_', 0) where ItemIndex = 0) as ID
,source.Schedule
,source.Slot
,source.[Period]
,source.AllDay
,source.PMOnly
,source.AMOnly
from openjson(@jsonRequest, '$')
with
(
DT_RowId varchar(255) '$.DT_RowId' /*concatenated string of row being edited */
,Schedule tinyint '$.Schedule'
,Slot tinyint '$.Slot'
,[Period] varchar(20) '$.Period'
,AllDay bit '$.AllDay'
,PMOnly bit '$.PMOnly'
,AMOnly bit '$.AMOnly'
) as source

SQL Server 2016+ offers a nice trick to split a string quickly and in a position-aware way:
select
DTRow.AsJson as DTRow_All_Content
,JSON_VALUE(DTRow.AsJson,'$[0]') AS DTRow_FirstValue
,source.Schedule
,source.Slot
,source.[Period]
,source.AllDay
,source.PMOnly
,source.AMOnly
from openjson(@jsonRequest, '$')
with
(
DT_RowId varchar(255) '$.DT_RowId' /*concatenated string of row being edited */
,Schedule tinyint '$.Schedule'
,Slot tinyint '$.Slot'
,[Period] varchar(20) '$.Period'
,AllDay bit '$.AllDay'
,PMOnly bit '$.PMOnly'
,AMOnly bit '$.AMOnly'
) as source
OUTER APPLY(SELECT CONCAT('["',REPLACE([source].DT_RowId,'_','","'),'"]')) DTRow(AsJson);
The magic is the transformation of 42_1_14 to ["42","1","14"] with some simple string methods. With this you can use JSON_VALUE() to fetch an item by its position.
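To look at the transformation in isolation, here is a minimal sketch with standalone variables (@DT_RowId and @AsJson are names assumed for illustration only). As written, it assumes the parts contain no quotes or backslashes; otherwise they would need escaping first, for example with STRING_ESCAPE.

```sql
-- Hypothetical standalone variables, just to show the string-to-JSON-array trick
DECLARE @DT_RowId varchar(255) = '42_1_14';
DECLARE @AsJson nvarchar(max) = CONCAT('["', REPLACE(@DT_RowId, '_', '","'), '"]');

SELECT @AsJson                     AS AsJson  -- ["42","1","14"]
      ,JSON_VALUE(@AsJson, '$[0]') AS Part0   -- 42
      ,JSON_VALUE(@AsJson, '$[1]') AS Part1   -- 1
      ,JSON_VALUE(@AsJson, '$[2]') AS Part2;  -- 14
```

JSON_VALUE returns nvarchar, so wrap each part in CAST or TRY_CAST if a numeric type is needed in a WHERE clause.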
General hint: If you have full control of DT_RowId you should rather create this JSON array right from the start and avoid hacks while reading this...
update
Just to demonstrate how this would run, if the value was a JSON-array, check this out:
declare @jsonRequest nvarchar(max) = '{"DT_RowId":["42","1","14"]}'
select
source.DT_RowId as DTRow_All_Content
,JSON_VALUE(source.DT_RowId,'$[0]') AS DTRow_FirstValue
from openjson(@jsonRequest, '$')
with
(
DT_RowId NVARCHAR(MAX) AS JSON
) as source;
update 2
Just to add a little to your self-answer:
We must think of JSON as a special kind of string. As there is no native JSON data type, the engine does not know when a string is just a string and when it is JSON.
Using NVARCHAR(MAX) AS JSON in the WITH clause lets you work on the returned value with JSON methods again. For example, we could use CROSS APPLY OPENJSON(UseTheValueHere) to dive into nested lists and objects.
Actually, there's no need to use this at all. If there are no repeating elements, one could just parse all the values directly:
SELECT JSON_VALUE(@jsonRequest,'$.DT_RowId[0]') AS DTRowId_1
,JSON_VALUE(@jsonRequest,'$.Action') AS [Action]
--and so on...
But this would mean parsing the JSON over and over, which is very expensive.
Using OPENJSON means reading the whole JSON in one single pass (on the current level) and returning the elements found (with or without a JSON path) as a derived set (one row for each element).
The WITH clause is meant to perform a kind of PIVOT action and returns the elements as a multi-column set. The additional advantage is that you can specify the data type and, if necessary, a differing JSON path and the column's alias.
You can use any valid JSON path (in the WITH clause as well as in JSON_VALUE() and many other places). That means there are several ways to get the same result. Understanding how the engine works will enable you to find the most performant approach.
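The difference between the two OPENJSON shapes can be seen in a small sketch (the @json variable and its content are assumed for illustration): without WITH you get one row per element in key/value/type columns, with WITH you get one typed, pivoted row.

```sql
-- Hypothetical sample document
DECLARE @json nvarchar(max) = N'{"Action":"edit","Schedule":"1"}';

-- Default schema: one row per element on the current level
SELECT [key], [value], [type]
FROM OPENJSON(@json);

-- Explicit schema: the same elements pivoted into typed columns
SELECT *
FROM OPENJSON(@json)
WITH
(
    [Action] varchar(10) '$.Action'
   ,Schedule tinyint     '$.Schedule'
);
```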

OP here. Just expanding on the answer I accepted by Shnugo, with some details and notes... Hopefully all this might help somebody out there.
I am going to make DT_RowId an array
I will use AS JSON for DT_RowId in the OPENJSON WITH statement
I can then treat it as a json structure, and use JSON_VALUE to extract a value at a specific index
declare @jsonRequest nvarchar(max) = '{"DT_RowId":["42", "1", "14"],"Action":"edit","Schedule":"1","Slot":"1","Period":"9:00 to 9:30 UPDATED","AMOnly":"0","PMOnly":"0","AllDay":"1"}'
select
source.DT_RowId as DTRowId_FULL_JSON_Struct /*the full array*/
,JSON_VALUE(source.DT_RowId,'$[0]') AS JSON_VAL_0 /*extract value at index 0 from json structure*/
,JSON_VALUE(source.DT_RowId,'$[1]') AS JSON_VAL_1 /*extract value at index 1 from json structure*/
,JSON_VALUE(source.DT_RowId,'$[2]') AS JSON_VAL_2 /*extract value at index 2 from json structure*/
,source.DT_RowId_Index0 /*already extracted*/
,source.DT_RowId_Index1 /*already extracted*/
,source.DT_RowId_Index2 /*already extracted*/
,source.Schedule
,source.Slot
,source.Period
,source.AllDay
,source.PMOnly
,source.AMOnly
from openjson(@jsonRequest, '$')
with
(
DT_RowId nvarchar(max) as json /*format as json; do the rest in the SELECT statement*/
,DT_RowId_Index0 varchar(2) '$.DT_RowId[0]' /*When OPENJSON parses a JSON array, the function returns the indexes of the elements in the JSON text as keys.*/
,DT_RowId_Index1 varchar(2) '$.DT_RowId[1]' /*When OPENJSON parses a JSON array, the function returns the indexes of the elements in the JSON text as keys.*/
,DT_RowId_Index2 varchar(2) '$.DT_RowId[2]' /*When OPENJSON parses a JSON array, the function returns the indexes of the elements in the JSON text as keys.*/
,Schedule tinyint '$.Schedule'
,Slot tinyint '$.Slot'
,[Period] varchar(20) '$.Period'
,AllDay bit '$.AllDay'
,PMOnly bit '$.PMOnly'
,AMOnly bit '$.AMOnly'
) as source
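Since the question was really about an UPDATE, here is a hedged sketch of how the extracted index could feed the join condition. The @Slots table variable and its columns are invented for illustration; the real target table is not shown in the question.

```sql
-- Hypothetical target table, standing in for the real one
DECLARE @Slots TABLE (ID int, Schedule tinyint, Slot tinyint, [Period] varchar(20));
INSERT INTO @Slots (ID, Schedule, Slot, [Period]) VALUES (42, 1, 1, '9:00 to 9:30');

DECLARE @jsonRequest nvarchar(max) =
    N'{"DT_RowId":["42","1","14"],"Period":"9:00 to 9:30 UPDATED"}';

-- Read the first array element as a typed column and join on it
UPDATE s
SET s.[Period] = source.[Period]
FROM @Slots AS s
JOIN OPENJSON(@jsonRequest)
     WITH
     (
         ID       int         '$.DT_RowId[0]'
        ,[Period] varchar(20) '$.Period'
     ) AS source
  ON source.ID = s.ID;

SELECT ID, [Period] FROM @Slots;
```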

Related

Parsing a xml string without the root elements using T-SQL

Below is the XML string stored in a column in my database table and I need to parse it using T-SQL
Installation type:Interior
Wiring Length (ft):4
Connector type:RJ45
Location description:basement
WirelessID:1
DevID:1234567
wontTurnOff:true
How do I parse this since they're missing the root and child tags?
Thanks.
It appears that the data format in your table column is not an XML format. It normally looks like the following and the column type will be set as "XML". Note that there is a hierarchy and each item associated with the book has a beginning and end tag. There can be multiple books under catalog.
<catalog>
<book id="bk101">
<author>Gambardella, Matthew</author>
<title>XML Developer's Guide</title>
<genre>Computer</genre>
<price>44.95</price>
<publish_date>2000-10-01</publish_date>
<description>An in-depth look at creating applications with XML.</description>
</book>
</catalog>
However, having pointed out the above, I'm assuming that you are probably dealing with something somewhat different that may initially have been extracted from an XML source and placed into the format that you've provided. If the format and the seven items to parse out are consistent (with different values from one record to the next in the table), then here is an alternative that may assist. This is an example that you can copy into a query, test, and use as a reference for the final SQL, which will probably need to include a conversion to integer for some of the numeric values parsed out. This example will accommodate the differing length of the values associated with each item. However, in order for this to work, the items must follow in the same order as you provided in your example.
DECLARE @string varchar(max)
SET @string = 'Installation type:Interior
Wiring Length (ft):4
Connector type:RJ45
Location description:basement
WirelessID:1
DevID:1234567
wontTurnOff:true'
DECLARE @Installation_type varchar(50)
,@Wiring_Length varchar(10)
,@Connector_type varchar(10)
,@Location_Description varchar(100)
,@WirelessID varchar(10)
,@DevID varchar(10)
,@wontTurnOff varchar(5)
-- Each value runs from the end of its label to the line break before the next label; the -2 assumes CR+LF line breaks
SELECT SUBSTRING(@string, charindex('Installation type:', @string)+18, (charindex('Wiring Length (ft):', @string)-2) - (charindex('Installation type:', @string)+18))
SELECT SUBSTRING(@string, charindex('Wiring Length (ft):', @string)+19, (charindex('Connector type:', @string)-2) - (charindex('Wiring Length (ft):', @string)+19))
SELECT SUBSTRING(@string, charindex('Connector type:', @string)+15, (charindex('Location description:', @string)-2) - (charindex('Connector type:', @string)+15))
SELECT SUBSTRING(@string, charindex('Location description:', @string)+21, (charindex('WirelessID:', @string)-2) - (charindex('Location description:', @string)+21))
SELECT SUBSTRING(@string, charindex('WirelessID:', @string)+11, (charindex('DevID:', @string)-2) - (charindex('WirelessID:', @string)+11))
SELECT SUBSTRING(@string, charindex('DevID:', @string)+6, (charindex('wontTurnOff:', @string)-2) - (charindex('DevID:', @string)+6))
SELECT SUBSTRING(@string, charindex('wontTurnOff:', @string)+12, (len(@string)+1) - (charindex('wontTurnOff:', @string)+12))
It is common for data to arrive in all kinds of strange formats, and sometimes it is fragmented. If your data is truly XML and needs to be parsed as such, you can assign the value to a temporary varchar(max) variable, find and replace each item within it (for example, convert "WirelessID:1" to "<WirelessID>1</WirelessID>"), wrap the entire value in a topmost XML tag like the "<catalog>" used in the initial XML example, convert the entire varchar(max) value to XML, and then apply the XML parsing functions included with SQL Server.
Hope this helps. Without seeing your data I can only suggest the two options. I've been doing ETL development for many years and it is common to be required to use every trick in the book to resolve data formatting issues when there is no control over that source.
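A rough sketch of the second option, in a slightly varied form: because labels such as "Installation type" contain spaces and cannot be element names, this version puts each label into a name attribute of a generic item element. The element names root and item are assumptions, and it only works if labels and values contain no ':', '<', '&' or '"' and the lines are separated by CR+LF.

```sql
DECLARE @crlf varchar(2) = CHAR(13) + CHAR(10);
DECLARE @string varchar(max) =
    'Installation type:Interior' + @crlf +
    'Wiring Length (ft):4' + @crlf +
    'DevID:1234567';

-- Turn "label:value" lines into <item name="label">value</item> elements
DECLARE @xml xml = CAST(
      '<root><item name="'
    + REPLACE(REPLACE(@string, ':', '">'), @crlf, '</item><item name="')
    + '</item></root>' AS xml);

-- Shred the XML back into one row per label/value pair
SELECT i.value('@name', 'varchar(100)')     AS ItemName
      ,i.value('text()[1]', 'varchar(100)') AS ItemValue
FROM @xml.nodes('/root/item') AS t(i);
```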

Is there a way to return either a string or embedded JSON using FOR JSON?

I have an nvarchar column that I would like to return embedded in my JSON results if its content is valid JSON, or as a string otherwise.
Here is what I've tried:
select
(
case when IsJson(Arguments) = 1 then
Json_Query(Arguments)
else
Arguments
end
) Results
from Unit
for json path
This always puts Results into a string.
The following works, but only if the attribute contains valid JSON:
select
(
Json_Query(
case when IsJson(Arguments) = 1 then
Arguments
else
'"' + String_escape(IsNull(Arguments, ''), 'json') + '"' end
)
) Results
from Unit
for json path
If Arguments does not contain a JSON object a runtime error occurs.
Update: Sample data:
Arguments
---------
{ "a": "b" }
Some text
Update: any version of SQL Server will do. I'd even be happy to know that it's coming in a beta or something.
I did not find a good solution and would be happy, if someone comes around with a better one than this hack:
DECLARE @tbl TABLE(ID INT IDENTITY,Arguments NVARCHAR(MAX));
INSERT INTO @tbl VALUES
(NULL)
,('plain text')
,('[{"id":"1"},{"id":"2"}]');
SELECT t1.ID
,(SELECT Arguments FROM @tbl t2 WHERE t2.ID=t1.ID AND ISJSON(Arguments)=0) Arguments
,(SELECT JSON_QUERY(Arguments) FROM @tbl t2 WHERE t2.ID=t1.ID AND ISJSON(Arguments)=1) ArgumentsJSON
FROM @tbl t1
FOR JSON PATH;
As NULL values are omitted, you will always find either Arguments or ArgumentsJSON in your final result. Treating this JSON as NVARCHAR(MAX), you can use REPLACE to rename both to the same Arguments.
The problem seems to be that you cannot include two columns with the same name within your SELECT, and each column must have a predictable type. This depends on the order you use in CASE (or COALESCE). If the engine thinks "Okay, here's text", everything will be treated as text and your JSON is escaped. But if the engine thinks "Okay, some JSON", everything is handled as JSON and will break if this JSON is not valid.
With FOR XML PATH there are some tricks with column naming (such as [*], [node()] or even the same name twice within one query), but FOR JSON PATH is not that powerful...
When you say that your statement "... always puts Results into a string.", you probably mean that when JSON is stored in a text column, FOR JSON escapes this text. Of course, if you want to return an unescaped JSON text, you need to use JSON_QUERY function only for your valid JSON text.
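The escaping behaviour described above can be reproduced with a minimal sketch (the @Arguments and @Result variables are assumed for illustration): the same value is emitted once as plain text and once through JSON_QUERY.

```sql
DECLARE @Arguments nvarchar(max) = N'{"a":"b"}';

DECLARE @Result nvarchar(max) =
(
    SELECT @Arguments             AS AsText  -- plain text: FOR JSON escapes it
          ,JSON_QUERY(@Arguments) AS AsJson  -- marked as JSON: embedded as-is
    FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
);

SELECT @Result;
-- {"AsText":"{\"a\":\"b\"}","AsJson":{"a":"b"}}
```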
Next is a small workaround (based on FOR JSON and string manipulation), that may help to solve your problem.
Table:
CREATE TABLE #Data (
Arguments nvarchar(max)
)
INSERT INTO #Data
(Arguments)
VALUES
('{"a": "b"}'),
('Some text'),
('{"c": "d"}'),
('{"e": "f"}'),
('More[]text')
Statement:
SELECT CONCAT(N'[', j1.JsonOutput, N',', j2.JsonOutput, N']')
FROM
(
SELECT JSON_QUERY(Arguments) AS Results
FROM #Data
WHERE ISJSON(Arguments) = 1
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
) j1 (JsonOutput),
(
-- FOR JSON escapes string values itself, so no extra STRING_ESCAPE is needed here
SELECT Arguments AS Results
FROM #Data
WHERE ISJSON(Arguments) = 0
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
) j2 (JsonOutput)
Output:
[{"Results":{"a": "b"}},{"Results":{"c": "d"}},{"Results":{"e": "f"}},{"Results":"Some text"},{"Results":"More[]text"}]
Notes:
One disadvantage here is that the order of the items in the generated output is not the same as in the table.

Delete an object from nested array in openjson SQL Server 2016

I want to delete the "AttributeName" : "Manufacturer" from the below json in SQL Server 2016:
declare @json nvarchar(max) = '[{"Type":"G","GroupBy":[],
"Attributes":[{"AttributeName":"Class Designation / Compressive Strength"},{"AttributeName":"Size"},{"AttributeName":"Manufacturer"}]}]'
This is the query I tried which is not working
select JSON_MODIFY((
select JSON_Query(@json, '$[0].Attributes') as res),'$.AttributeName.Manufacturer', null)
Here is a working solution using FOR JSON and OPENJSON. The point is to:
Identify the item you wish to delete and replace it with NULL. This is done by JSON_MODIFY(@json,'$[0].Attributes[2]', null). We're simply saying: take the element at index 2 in Attributes (the third element, since JSON arrays are zero-based) and replace it with null
Convert this array to a row set. We need to somehow get rid of this null element and that's something we can filter easily in SQL by where [value] is not null
Assemble it all back to original JSON. That's done by FOR JSON AUTO
Please bear in mind one important aspect of such JSON data transformations:
JSON is designed for information exchange or eventually to store the information. But you should avoid more complicated data manipulation on SQL level.
Anyway, solution here:
declare @json nvarchar(max) = '[{"Type": "G","GroupBy": [],"Attributes": [{"AttributeName": "Class Designation / Compressive Strength"}, {"AttributeName": "Size"}, {"AttributeName": "Manufacturer"}]}]';
with src as
(
SELECT * FROM OPENJSON(
JSON_Query(
JSON_MODIFY(@json,'$[0].Attributes[2]', null) , '$[0].Attributes'))
)
select JSON_MODIFY(@json,'$[0].Attributes', (
select JSON_VALUE([value], '$.AttributeName') as [AttributeName] from src
where [value] is not null
FOR JSON AUTO
))
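The solution above relies on knowing the position of the element. A variation on the same three steps, sketched here under the assumption that every object in Attributes has an AttributeName, filters by value instead, so the index does not need to be known. This is not the answerer's method, just one possible alternative.

```sql
DECLARE @json nvarchar(max) =
    N'[{"Type":"G","GroupBy":[],"Attributes":[{"AttributeName":"Size"},{"AttributeName":"Manufacturer"}]}]';

-- Rebuild the Attributes array without the unwanted entry and write it back
DECLARE @result nvarchar(max) = JSON_MODIFY(@json, '$[0].Attributes',
    JSON_QUERY((
        SELECT JSON_VALUE(a.[value], '$.AttributeName') AS AttributeName
        FROM OPENJSON(@json, '$[0].Attributes') AS a
        WHERE JSON_VALUE(a.[value], '$.AttributeName') <> N'Manufacturer'
        FOR JSON PATH
    )));

SELECT @result;
```

Note that if the filter removes every element, the subquery yields NULL and JSON_MODIFY then removes the Attributes key altogether.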

Getting multiple values from same xml column in SQL Server

I want to get the values from same xml node under same element.
Sample data:
I have to select all <award_number> values.
This is my SQL code:
DECLARE @xml XML;
DECLARE @filePath varchar(max);
SET @filePath = '<workFlowMeta><fundgroup><funder><award_number>0710564</award_number><award_number>1106058</award_number><award_number>1304977</award_number><award_number>1407404</award_number></funder></fundgroup></workFlowMeta>'
SET @xml = CAST(@filePath AS XML);
SELECT
REPLACE(Element.value('award_number','NVARCHAR(255)'), CHAR(10), '') AS award_num
FROM
@xml.nodes('workFlowMeta/fundgroup/funder') Datalist(Element);
Can't change this @xml.nodes('workFlowMeta/fundgroup/funder'), because I'm getting multiple node values inside funder node.
Can anyone please help me?
Since those <award_number> nodes are inside the <funder> nodes, and there could be several <funder> nodes (if I understood your question correctly), you need to use two .nodes() calls like this:
SELECT
XC.value('.', 'int')
FROM
@xml.nodes('/workFlowMeta/fundgroup/funder') Datalist(Element)
CROSS APPLY
Element.nodes('award_number') AS XT(XC)
The first .nodes() call gets all <funder> elements, and then the second call goes into each <funder> element to get all <award_number> nodes inside of that element and outputs the value of the <award_number> element as an INT (I couldn't quite understand what you're trying to do to the <award_number> value in your code sample...)
Your own code was very close, but
You are not diving deep enough: .nodes() stops one level too high
You need to use a singleton XPath for .value() (in most cases this means a [1] at the end)
As you want to read many <award_number> elements, this is the level you have to step down to in .nodes(). Reading these elements' values is easy once you have your hands on them:
SELECT
REPLACE(Element.value('text()[1]','NVARCHAR(255)'), CHAR(10), '') AS award_num
FROM
@xml.nodes('/workFlowMeta/fundgroup/funder/award_number') Datalist(Element);
What are you trying to do with the REPLACE()?
If all <award_number> elements contain valid numbers, you should use int or bigint as the target type, and there shouldn't be any need to replace non-numeric characters. Try it like this:
SELECT Element.value('text()[1]','int') AS award_num
FROM @xml.nodes('/workFlowMeta/fundgroup/funder/award_number') Datalist(Element);
If marc_s is correct...
... and you have to deal with several <funder> groups, each of which contains several <award_number> nodes, go with his approach (two calls to .nodes())

In SQL Server can I insert multiple nodes into XML from a table?

I want to generate some XML in a stored procedure based on data in a table.
The following insert allows me to add many nodes but they have to be hard-coded or use variables (sql:variable):
SET @MyXml.modify('
insert
<myNode>
{sql:variable("@MyVariable")}
</myNode>
into (/root[1]) ')
So I could loop through each record in my table, put the values I need into variables and execute the above statement.
But is there a way I can do this by just combining with a select statement and avoiding the loop?
Edit I have used SELECT FOR XML to do similar stuff before but I always find it hard to read when working with a hierarchy of data from multiple tables. I was hoping there would be something using the modify where the XML generated is more explicit and more controllable.
Have you tried nesting FOR XML PATH scalar valued functions?
With the nesting technique, you can break your SQL into very manageable/readable elemental pieces
Disclaimer: the following, while adapted from a working example, has not itself been literally tested
Some reference links for the general audience
http://msdn2.microsoft.com/en-us/library/ms178107(SQL.90).aspx
http://msdn2.microsoft.com/en-us/library/ms189885(SQL.90).aspx
The simplest, lowest level nested node example
Consider the following invocation
DECLARE @NestedInput_SpecificDogNameId int
SET @NestedInput_SpecificDogNameId = 99
SELECT [dbo].[udfGetLowestLevelNestedNode_SpecificDogName]
(@NestedInput_SpecificDogNameId)
Let's say udfGetLowestLevelNestedNode_SpecificDogName had been written without the FOR XML PATH clause, and for @NestedInput_SpecificDogNameId = 99 it returns the single rowset record:
@SpecificDogNameId DogName
99 Astro
But with the FOR XML PATH clause,
CREATE FUNCTION dbo.udfGetLowestLevelNestedNode_SpecificDogName
(
@NestedInput_SpecificDogNameId int
)
RETURNS XML
AS
BEGIN
-- Declare the return variable here
DECLARE @ResultVar XML
-- Add the T-SQL statements to compute the return value here
SET @ResultVar =
(
SELECT
t.SpecificDogNameId as "@SpecificDogNameId", -- the leading @ in the alias makes this an attribute
t.DogName as "text()" -- emit the name as the element's text content
FROM tblDogs t
WHERE t.SpecificDogNameId = @NestedInput_SpecificDogNameId
FOR XML PATH('Dog'), TYPE
)
-- Return the result of the function
RETURN @ResultVar
END
the user-defined function produces the following XML (the @ sign causes the SpecificDogNameId field to be returned as an attribute)
<Dog SpecificDogNameId="99">Astro</Dog>
Nesting User-defined Functions of XML Type
User-defined functions such as the above udfGetLowestLevelNestedNode_SpecificDogName can be nested to provide a powerful method to produce complex XML.
For example, the function
CREATE FUNCTION [dbo].[udfGetDogCollectionNode]()
RETURNS XML
AS
BEGIN
-- Declare the return variable here
DECLARE @ResultVar XML
-- Add the T-SQL statements to compute the return value here
SET @ResultVar =
(
SELECT
[dbo].[udfGetLowestLevelNestedNode_SpecificDogName]
(t.SpecificDogNameId)
FROM tblDogs t
-- PATH('') plus ROOT wraps all rows in one <DogCollection> instead of one per row
FOR XML PATH(''), ROOT('DogCollection'), TYPE
)
-- Return the result of the function
RETURN @ResultVar
END
when invoked as
SELECT [dbo].[udfGetDogCollectionNode]()
might produce the complex XML node (given the appropriate underlying data)
<DogCollection>
<Dog SpecificDogNameId="88">Dino</Dog>
<Dog SpecificDogNameId="99">Astro</Dog>
</DogCollection>
From here, you could keep working upwards in the nested tree to build as complex an XML structure as you please
CREATE FUNCTION [dbo].[udfGetAnimalCollectionNode]()
RETURNS XML
AS
BEGIN
DECLARE @ResultVar XML
SET @ResultVar =
(
SELECT
dbo.udfGetDogCollectionNode(),
dbo.udfGetCatCollectionNode()
FOR XML PATH('AnimalCollection'), ELEMENTS XSINIL
)
RETURN @ResultVar
END
when invoked as
SELECT [dbo].[udfGetAnimalCollectionNode]()
the udf might produce the more complex XML node (given the appropriate underlying data)
<AnimalCollection>
<DogCollection>
<Dog SpecificDogNameId="88">Dino</Dog>
<Dog SpecificDogNameId="99">Astro</Dog>
</DogCollection>
<CatCollection>
<Cat SpecificCatNameId="11">Sylvester</Cat>
<Cat SpecificCatNameId="22">Tom</Cat>
<Cat SpecificCatNameId="33">Felix</Cat>
</CatCollection>
</AnimalCollection>
Use sql:column instead of sql:variable. You can find detailed info here: http://msdn.microsoft.com/en-us/library/ms191214.aspx
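Another loop-free possibility, sketched here under assumptions (the @src table variable, its Val column, and the @nodes variable are invented for illustration), is to build all the nodes with one SELECT ... FOR XML and then insert the resulting fragment in a single modify() call. Inserting an xml-typed variable through sql:variable requires SQL Server 2008 or later.

```sql
DECLARE @MyXml xml = '<root/>';

-- Hypothetical source rows standing in for the real table
DECLARE @src TABLE (Val varchar(50));
INSERT INTO @src (Val) VALUES ('first'), ('second');

-- Build one <myNode> per row as an XML fragment
DECLARE @nodes xml =
(
    SELECT Val AS [text()]
    FROM @src
    ORDER BY Val
    FOR XML PATH('myNode'), TYPE
);

-- Insert the whole fragment at once, no loop needed
SET @MyXml.modify('insert sql:variable("@nodes") into (/root)[1]');

SELECT @MyXml;
```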
Can you tell a bit more about what exactly you are planning to do?
Is it simply generating XML data based on the content of the table,
or adding some data from the table to an existing XML structure?
There is a great series of articles on XML in SQL Server written by Jacob Sebastian; it starts with the basics of generating XML from the data in a table
