Delete an object from a nested array in OPENJSON (SQL Server 2016)

I want to delete the "AttributeName" : "Manufacturer" from the below json in SQL Server 2016:
declare @json nvarchar(max) = '[{"Type":"G","GroupBy":[],
"Attributes":[{"AttributeName":"Class Designation / Compressive Strength"},{"AttributeName":"Size"},{"AttributeName":"Manufacturer"}]}]'
This is the query I tried, which is not working:
select JSON_MODIFY((
select JSON_Query(@json, '$[0].Attributes') as res),'$.AttributeName.Manufacturer', null)

Here is the working solution using FOR JSON and OPENJSON. The point is to:
Identify the item you wish to delete and replace it with NULL. This is done by JSON_MODIFY(@json,'$[0].Attributes[2]', null). We're simply saying: take the element at index 2 of Attributes (the third element) and replace it with null.
Convert this array to a row set. We need to somehow get rid of this null element, and that's something we can filter easily in SQL with where [value] is not null.
Assemble it all back into the original JSON. That's done by FOR JSON AUTO.
Please bear in mind one important aspect of such JSON data transformations:
JSON is designed for information exchange, or possibly for storing information, but you should avoid more complicated data manipulation at the SQL level.
Anyway, here is the solution:
declare @json nvarchar(max) = '[{"Type": "G","GroupBy": [],"Attributes": [{"AttributeName": "Class Designation / Compressive Strength"}, {"AttributeName": "Size"}, {"AttributeName": "Manufacturer"}]}]';
with src as
(
SELECT * FROM OPENJSON(
JSON_Query(
JSON_MODIFY(@json,'$[0].Attributes[2]', null) , '$[0].Attributes'))
)
select JSON_MODIFY(@json,'$[0].Attributes', (
select JSON_VALUE([value], '$.AttributeName') as [AttributeName] from src
where [value] is not null
FOR JSON AUTO
))
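For completeness, here is a minimal sketch (untested, same idea as above) that filters the element out by name instead of relying on a hard-coded array index, so it keeps working if the position of "Manufacturer" changes:
with src as
(
SELECT * FROM OPENJSON(JSON_Query(@json, '$[0].Attributes'))
)
select JSON_MODIFY(@json,'$[0].Attributes', (
select JSON_VALUE([value], '$.AttributeName') as [AttributeName] from src
where JSON_VALUE([value], '$.AttributeName') <> 'Manufacturer'
FOR JSON AUTO
))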

Related

SQL Server - How do I get a nested JSON value in my SQL SELECT statement

Environment: SQL Server 2014 and above
How do I access the email value in my JSON value with my SELECT statement?
select JSON_VALUE('[{"data":{"email":"test@email.com"}}]', '$.email') as test
JSON support was only introduced in SQL Server 2016 - so with any prior version you would need to either use string-manipulation code or simply parse the JSON outside of SQL Server (maybe using a CLR function).
For 2016 version or higher, you can use JSON_VALUE like this:
declare @json as varchar(100) = '[{"data":{"email":"test@email.com"}}]';
select JSON_VALUE(@json, '$[0].data.email') as test
For older versions - you might be able to get away with this, but if your JSON value does not contain an email property, you will get unexpected results:
select substring(string, start, charindex('"', string, start+1) - start) as test
from (
select @json as string, charindex('"email":"', @json) + 9 as start
) s
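As a hedged variant of the same string hack (a sketch only, still assuming the @json variable declared above), you could guard against a missing "email" property and return NULL instead of garbage:
select case
when charindex('"email":"', string) = 0 then null
else substring(string, start, charindex('"', string, start+1) - start)
end as test
from (
select @json as string, charindex('"email":"', @json) + 9 as start
) s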
You can see a live demo on db<>fiddle
Another way. PatternSplitCM is great for stuff like this.
Extract a single Email value:
DECLARE @json as varchar(200) = '[{"data":{"email":"test@email.com"}}]';
SELECT f.Item
FROM dbo.patternsplitCM(@json,'[a-z0-9@.]') AS f
WHERE f.item LIKE '%[a-z]%@%.%[a-z]%'; -- Simple Email Check Pattern
Extracting all Email Addresses (if/when there are more):
DECLARE @json VARCHAR(200) = '[{"data":{"email":"test@email.com"},{"email2":"test2@email.net"}},{"data":{"MoreEmail":"test3@email.555whatever"}}]';
SELECT f.Item
FROM dbo.patternsplitCM(@json,'[a-z0-9@.]') AS f
WHERE f.item LIKE '%[a-z]%@%.%[a-z]%'; -- Simple Email Check Pattern
Returns:
Item
--------------------------
test@email.com
test2@email.net
test3@email.555whatever
Or... to get only the first Email address that appears:
SELECT TOP (1) f.Item
FROM dbo.patternsplitCM(@json,'[a-z0-9@.]') AS f
WHERE f.item LIKE '%[a-z]%@%.%[a-z]%' -- Simple Email Check Pattern
ORDER BY ROW_NUMBER() OVER (ORDER BY f.ItemNumber)
Nasty fast, super-simple. No cursors, loops or other bad stuff.
With v2014 there is no JSON support, but - if your real JSON is that simple - it is sometimes a good idea to use some replacements to transform the JSON into XML, like here, which allows for the native XML methods:
DECLARE @YourJSON NVARCHAR(MAX)=N'[{"data":{"email":"test@email.com"}}]';
SELECT CAST(REPLACE(REPLACE(REPLACE(REPLACE(@YourJSON,'[{"','<'),'":{"',' '),'":"','="'),'}}]',' />') AS XML).value('(/data/@email)[1]','nvarchar(max)');
It can be done in two ways:
First, if your JSON data is between [ ] like in your question:
select JSON_VALUE('[{"data":{"email":"test@email.com"}}]','$[0].data.email' ) as test
And if your JSON data is not between [ ]:
select JSON_VALUE('{"data":{"email":"test@email.com"}}','$.data.email' ) as test
You can test the code above here
Your query should be like this (SQL Server 2016):
DECLARE @json_string NVARCHAR(MAX) = 'your_json_value'
SELECT [key], value
FROM OPENJSON(@json_string, '$.email')
UPDATE:
select JSON_VALUE(@json_string, '$[0].data.email') as test
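As a small aside (my sketch, not from the original answers): OPENJSON with a WITH clause can also shred the array and reach the nested email in one pass on SQL Server 2016+:
DECLARE @json_string NVARCHAR(MAX) = '[{"data":{"email":"test@email.com"}}]';
SELECT email
FROM OPENJSON(@json_string)
WITH (email NVARCHAR(256) '$.data.email');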

How do I perform a string function whilst using OPENJSON WITH?

In this example, DT_RowId is a concatenated string. I need to extract out its values, and make them available in a WHERE clause (not shown).
Is there a way to perform string functions on a value as part of a FROM OPENJSON WITH?
Is there a proper way to extract concatenated strings from a value without using a clunky SELECT statement?
Side note: This example is REALLY part of an UPDATE statement, so I'd be using the extracted values in the WHERE clause (not shown here). Also, also: Split is a custom string function we have.
BTW: I have full control of that DT_RowId, and I could make it an array, for example, [42, 1, 1]
declare @jsonRequest nvarchar(max) = '{"DT_RowId":"42_1_14","Action":"edit","Schedule":"1","Slot":"1","Period":"9:00 to 9:30 UPDATED","AMOnly":"0","PMOnly":"0","AllDay":"1"}'
select
(select Item from master.dbo.Split(source.DT_RowId, '_', 0) where ItemIndex = 0) as ID
,source.Schedule
,source.Slot
,source.[Period]
,source.AllDay
,source.PMOnly
,source.AMOnly
from openjson(@jsonRequest, '$')
with
(
DT_RowId varchar(255) '$.DT_RowId' /*concatenated string of row being edited */
,Schedule tinyint '$.Schedule'
,Slot tinyint '$.Slot'
,[Period] varchar(20) '$.Period'
,AllDay bit '$.AllDay'
,PMOnly bit '$.PMOnly'
,AMOnly bit '$.AMOnly'
) as source
SQL Server 2016+ offers a nice trick to split a string fast and position-aware:
select
DTRow.AsJson as DTRow_All_Content
,JSON_VALUE(DTRow.AsJson,'$[0]') AS DTRow_FirstValue
,source.Schedule
,source.Slot
,source.[Period]
,source.AllDay
,source.PMOnly
,source.AMOnly
from openjson(@jsonRequest, '$')
with
(
DT_RowId varchar(255) '$.DT_RowId' /*concatenated string of row being edited */
,Schedule tinyint '$.Schedule'
,Slot tinyint '$.Slot'
,[Period] varchar(20) '$.Period'
,AllDay bit '$.AllDay'
,PMOnly bit '$.PMOnly'
,AMOnly bit '$.AMOnly'
) as source
OUTER APPLY(SELECT CONCAT('["',REPLACE([source].DT_RowId,'_','","'),'"]')) DTRow(AsJson);
The magic is the transformation of 42_1_14 to ["42","1","14"] with some simple string methods. With this you can use JSON_VALUE() to fetch an item by its position.
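To see the string trick in isolation (a minimal sketch of my own, not part of the original answer), the transformation and a positional read look like this:
DECLARE @id varchar(20) = '42_1_14';
SELECT CONCAT('["', REPLACE(@id, '_', '","'), '"]') AS AsJsonArray /* ["42","1","14"] */
      ,JSON_VALUE(CONCAT('["', REPLACE(@id, '_', '","'), '"]'), '$[1]') AS SecondValue; /* 1 */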
General hint: If you have full control of DT_RowId, you should rather create this JSON array right from the start and avoid hacks when reading it...
update
Just to demonstrate how this would run, if the value was a JSON-array, check this out:
declare @jsonRequest nvarchar(max) = '{"DT_RowId":["42","1","14"]}'
select
source.DT_RowId as DTRow_All_Content
,JSON_VALUE(source.DT_RowId,'$[0]') AS DTRow_FirstValue
from openjson(@jsonRequest, '$')
with
(
DT_RowId NVARCHAR(MAX) AS JSON
) as source;
update 2
Just to add a little to your self-answer:
We must think of JSON as a special kind of string. As there is no native JSON data type, the engine does not know when the string is just a string and when it is JSON.
Using NVARCHAR(MAX) AS JSON in the WITH clause allows you to deal with the return value again with JSON methods. For example, we could use CROSS APPLY OPENJSON(UseTheValueHere) to dive into nested lists and objects.
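A minimal sketch of that CROSS APPLY pattern (my illustration, assuming the array-style @jsonRequest from the example above):
declare @jsonRequest nvarchar(max) = '{"DT_RowId":["42","1","14"]}';
select parts.[key] as ItemIndex, parts.[value] as ItemValue
from openjson(@jsonRequest, '$')
with (DT_RowId nvarchar(max) as json) as source
cross apply openjson(source.DT_RowId) as parts;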
Actually there's no need to use this at all. If there are no repeating elements, one could just parse all the values directly:
SELECT JSON_VALUE(@jsonRequest,'$.DT_RowId[0]') AS DTRowId_1
,JSON_VALUE(@jsonRequest,'$.Action') AS [Action]
--and so on...
But this would mean parsing the JSON over and over, which is very expensive.
Using OPENJSON means reading the whole JSON in one single pass (on the current level) and returning the elements found (with or without a JSON path) in a derived set (one row for each element).
The WITH clause performs a kind of PIVOT action and returns the elements as a multi-column set. The additional advantage is that you can specify the data type and, if necessary, a differing JSON path and the column's alias.
You can use any valid JSON path (in the WITH clause as well as in JSON_VALUE() and in many other places). That means there are several ways to get the same result. Understanding how the engine works will enable you to find the most performant approach.
OP here. Just expanding on the answer I accepted by Shnugo, with some details and notes... Hopefully all this might help somebody out there.
I am going to make DT_RowId an array
I will use AS JSON for DT_RowId in the OPENJSON WITH statement
I can then treat it as a json structure, and use JSON_VALUE to extract a value at a specific index
declare @jsonRequest nvarchar(max) = '{"DT_RowId":["42", "1", "14"],"Action":"edit","Schedule":"1","Slot":"1","Period":"9:00 to 9:30 UPDATED","AMOnly":"0","PMOnly":"0","AllDay":"1"}'
select
source.DT_RowId as DTRowId_FULL_JSON_Struct /*the full array*/
,JSON_VALUE(source.DT_RowId,'$[0]') AS JSON_VAL_0 /*extract value at index 0 from json structure*/
,JSON_VALUE(source.DT_RowId,'$[1]') AS JSON_VAL_1 /*extract value at index 1 from json structure*/
,JSON_VALUE(source.DT_RowId,'$[2]') AS JSON_VAL_2 /*extract value at index 2 from json structure*/
,source.DT_RowId_Index0 /*already extracted*/
,source.DT_RowId_Index1 /*already extracted*/
,source.DT_RowId_Index2 /*already extracted*/
,source.Schedule
,source.Slot
,source.Period
,source.AllDay
,source.PMOnly
,source.AMOnly
from openjson(@jsonRequest, '$')
with
(
DT_RowId nvarchar(max) as json /*format as json; do the rest in the SELECT statement*/
,DT_RowId_Index0 varchar(2) '$.DT_RowId[0]' /*When OPENJSON parses a JSON array, the function returns the indexes of the elements in the JSON text as keys.*/
,DT_RowId_Index1 varchar(2) '$.DT_RowId[1]' /*When OPENJSON parses a JSON array, the function returns the indexes of the elements in the JSON text as keys.*/
,DT_RowId_Index2 varchar(2) '$.DT_RowId[2]' /*When OPENJSON parses a JSON array, the function returns the indexes of the elements in the JSON text as keys.*/
,Schedule tinyint '$.Schedule'
,Slot tinyint '$.Slot'
,[Period] varchar(20) '$.Period'
,AllDay bit '$.AllDay'
,PMOnly bit '$.PMOnly'
,AMOnly bit '$.AMOnly'
) as source
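Since the real statement is an UPDATE, here is a rough sketch of how the shredded values could feed it. The table dbo.ScheduleSlot and its columns are hypothetical placeholders (my assumption, not from the original posts):
update s
set s.[Period] = source.[Period]
   ,s.AllDay   = source.AllDay
   ,s.PMOnly   = source.PMOnly
   ,s.AMOnly   = source.AMOnly
from dbo.ScheduleSlot as s /* hypothetical target table */
join openjson(@jsonRequest, '$')
with
(
DT_RowId nvarchar(max) as json
,Schedule tinyint '$.Schedule'
,Slot tinyint '$.Slot'
,[Period] varchar(20) '$.Period'
,AllDay bit '$.AllDay'
,PMOnly bit '$.PMOnly'
,AMOnly bit '$.AMOnly'
) as source
  on s.ID = JSON_VALUE(source.DT_RowId, '$[0]')
 and s.Schedule = source.Schedule
 and s.Slot = source.Slot;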

Is there a way to return either a string or embedded JSON using FOR JSON?

I have an nvarchar column that I would like to return embedded in my JSON results if its contents are valid JSON, or as a string otherwise.
Here is what I've tried:
select
(
case when IsJson(Arguments) = 1 then
Json_Query(Arguments)
else
Arguments
end
) Results
from Unit
for json path
This always puts Results into a string.
The following works, but only if the attribute contains valid JSON:
select
(
Json_Query(
case when IsJson(Arguments) = 1 then
Arguments
else
'"' + String_escape(IsNull(Arguments, ''), 'json') + '"' end
)
) Results
from Unit
for json path
If Arguments does not contain a JSON object, a runtime error occurs.
Update: Sample data:
Arguments
---------
{ "a": "b" }
Some text
Update: any version of SQL Server will do. I'd even be happy to know that it's coming in a beta or something.
I did not find a good solution and would be happy if someone comes up with a better one than this hack:
DECLARE @tbl TABLE(ID INT IDENTITY,Arguments NVARCHAR(MAX));
INSERT INTO @tbl VALUES
(NULL)
,('plain text')
,('[{"id":"1"},{"id":"2"}]');
SELECT t1.ID
,(SELECT Arguments FROM @tbl t2 WHERE t2.ID=t1.ID AND ISJSON(Arguments)=0) Arguments
,(SELECT JSON_QUERY(Arguments) FROM @tbl t2 WHERE t2.ID=t1.ID AND ISJSON(Arguments)=1) ArgumentsJSON
FROM @tbl t1
FOR JSON PATH;
As NULL values are omitted, you will always find either Arguments or ArgumentsJSON in your final result. Treating this JSON as NVARCHAR(MAX), you can use REPLACE to rename them all to the same Arguments.
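As a sketch of that REPLACE step (my addition, assuming the property name "ArgumentsJSON" never occurs inside the stored data), the rename could look like this:
DECLARE @result NVARCHAR(MAX) =
(
SELECT t1.ID
,(SELECT Arguments FROM @tbl t2 WHERE t2.ID=t1.ID AND ISJSON(Arguments)=0) Arguments
,(SELECT JSON_QUERY(Arguments) FROM @tbl t2 WHERE t2.ID=t1.ID AND ISJSON(Arguments)=1) ArgumentsJSON
FROM @tbl t1
FOR JSON PATH
);
SELECT REPLACE(@result, '"ArgumentsJSON":', '"Arguments":');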
The problem seems to be that you cannot include two columns with the same name within your SELECT, yet each column must have a predictable type. This depends on the order you use in CASE (or COALESCE). If the engine thinks "Okay, here's text", everything will be treated as text and your JSON is escaped. But if the engine thinks "Okay, some JSON", everything is handled as JSON and will break if this JSON is not valid.
With FOR XML PATH there are some tricks with column naming (such as [*], [node()], or even the same name twice within one query), but FOR JSON PATH is not that powerful...
When you say that your statement "... always puts Results into a string.", you probably mean that when JSON is stored in a text column, FOR JSON escapes this text. Of course, if you want to return unescaped JSON text, you need to use the JSON_QUERY function only for your valid JSON text.
Next is a small workaround (based on FOR JSON and string manipulation) that may help to solve your problem.
Table:
CREATE TABLE #Data (
Arguments nvarchar(max)
)
INSERT INTO #Data
(Arguments)
VALUES
('{"a": "b"}'),
('Some text'),
('{"c": "d"}'),
('{"e": "f"}'),
('More[]text')
Statement:
SELECT CONCAT(N'[', j1.JsonOutput, N',', j2.JsonOutput, N']')
FROM
(
SELECT JSON_QUERY(Arguments) AS Results
FROM #Data
WHERE ISJSON(Arguments) = 1
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
) j1 (JsonOutput),
(
SELECT STRING_ESCAPE(ISNULL(Arguments, ''), 'json') AS Results
FROM #Data
WHERE ISJSON(Arguments) = 0
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
) j2 (JsonOutput)
Output:
[{"Results":{"a": "b"}},{"Results":{"c": "d"}},{"Results":{"e": "f"}},{"Results":"Some text"},{"Results":"More[]text"}]
Notes:
One disadvantage here is that the order of the items in the generated output is not the same as in the table.

BigQuery or SQL Server SPLIT query

I have searched around and cannot find much on this topic. I have a table that gets logging information. As a result, the column I am interested in contains multiple values that I need to search against. The column is formatted in a PHP URL style, i.e.
/test/test.aspx?DS_Vendor=55039&DS_ProdVer=7.90.100.0&DS_ProdLang=EN&DS_Product=MTT&DS_OfficeBits=32
This makes all searches end up with really long regexes to get data. Then join statements to combine data.
Is there a way in BigQuery, or SQL Server that I can pull the information from that column and put it into new columns?
Example:
The information I would like extracted begins after the ? and ends at &. The string can sometimes be longer, and contains additional headers.
Thanks,
Below is for BigQuery Standard SQL and addresses this aspect of your question:
Is there a way in BigQuery, ... that I can pull the information from that column and put it into new columns?
#standardSQL
CREATE TEMP FUNCTION parseColumn(kv STRING, column_name STRING) AS (
IF(SPLIT(kv, '=')[OFFSET(0)]= column_name, SPLIT(kv, '=')[OFFSET(1)], NULL)
);
WITH `project.dataset.table` AS (
SELECT '/test/test.aspx?extra=abc&DS_Vendor=55039&DS_ProdVer=7.90.100.0&DS_ProdLang=EN&DS_Product=MTT&DS_OfficeBits=32' AS url UNION ALL
SELECT '/test/test.aspx?DS_Vendor=55192&DS_ProdVer=4.30.100.0&more=123&DS_ProdLang=DE&DS_Product=MTE&DS_OfficeBits=64'
)
SELECT
MIN(parseColumn(kv, 'DS_Vendor')) AS DS_Vendor,
MIN(parseColumn(kv, 'DS_ProdVer')) AS DS_ProdVer,
MIN(parseColumn(kv, 'DS_ProdLang')) AS DS_ProdLang,
MIN(parseColumn(kv, 'DS_Product')) AS DS_Product,
MIN(parseColumn(kv, 'DS_OfficeBits')) AS DS_OfficeBits
FROM `project.dataset.table`,
UNNEST(REGEXP_EXTRACT_ALL(url, r'[?&]([^?&]+)')) AS kv
GROUP BY url
with the result as below
Row  DS_Vendor  DS_ProdVer  DS_ProdLang  DS_Product  DS_OfficeBits
1    55039      7.90.100.0  EN           MTT         32
2    55192      4.30.100.0  DE           MTE         64
The following is also addressed:
The string can sometimes be longer, and contains additional headers.
One example using BigQuery (with standard SQL):
SELECT REGEXP_EXTRACT_ALL(url, r'[?&]([^?&]+)')
FROM (
SELECT '/test/test.aspx?DS_Vendor=55039&DS_ProdVer=7.90.100.0&DS_ProdLang=EN&DS_Product=MTT&DS_OfficeBits=32' AS url
)
This returns the parts of the URL as an ARRAY<STRING>. To go one step further, you can get back an ARRAY<STRUCT<key STRING, value STRING>> with a query of this form:
SELECT
ARRAY(
SELECT AS STRUCT
SPLIT(part, '=')[OFFSET(0)] AS key,
SPLIT(part, '=')[OFFSET(1)] AS value
FROM UNNEST(REGEXP_EXTRACT_ALL(url, r'[?&]([^?&]+)')) AS part
) AS keys_and_values
FROM (
SELECT '/test/test.aspx?DS_Vendor=55039&DS_ProdVer=7.90.100.0&DS_ProdLang=EN&DS_Product=MTT&DS_OfficeBits=32' AS url
)
...or with the keys and values as top-level columns:
SELECT
SPLIT(part, '=')[OFFSET(0)] AS key,
SPLIT(part, '=')[OFFSET(1)] AS value
FROM (
SELECT '/test/test.aspx?DS_Vendor=55039&DS_ProdVer=7.90.100.0&DS_ProdLang=EN&DS_Product=MTT&DS_OfficeBits=32' AS url
)
CROSS JOIN UNNEST(REGEXP_EXTRACT_ALL(url, r'[?&]([^?&]+)')) AS part

How to force SQL Server to return empty JSON array

I'm using SQL Server 2016, which supports JSON PATH to return JSON string.
I wonder how to get just a simple empty JSON array, I mean [], when my query or sub-query returns null. I've tried this query:
SELECT '' AS TEST
FOR JSON PATH,ROOT('arr')
which returns:
{"arr":[{"test":""}]}
and also this one:
SELECT NULL AS TEST
FOR JSON PATH,ROOT('arr')
which returns:
{"arr":[{}]}
it's better but still not correct, I need this:
{"arr":[]}
You can always check this with ISNULL, e.g.:
select ISNULL( (SELECT * FROM sys.tables where 1=2 FOR JSON PATH), '[]')
If you need this in the app layer, maybe it would be better to check whether there is a result set in the data access code, and if not, just return [] or {}.
This works, and can be composed within another for json query:
select json_query('[]') arr
for json path, without_array_wrapper
When nesting such subqueries, I've found that combining what others have said works best, i.e.:
Using COALESCE((SELECT .. FOR JSON), '[]') to prevent the null value from the subquery
Using JSON_QUERY() to prevent the escaping / quoting.
For example:
select
json_query(coalesce((select 1 as b where 1 = 0 for json path), '[]')) as a
for json path;
Produces:
|JSON |
|----------|
|[{"a":[]}]|
Without JSON_QUERY
Now the nested JSON array gets quoted:
select
coalesce((select 1 as b where 1 = 0 for json path), '[]') as a
for json path;
Results in
|JSON |
|------------|
|[{"a":"[]"}]|
Without COALESCE
Now the nested JSON is null:
select
json_query((select 1 as b where 1 = 0 for json path)) as a
for json path;
Results in
|JSON|
|----|
|[{}]|
A little manual, but if you need a quick hack, here you go:
DECLARE @JSON NVARCHAR(MAX) = (SELECT NULL AS test
FOR JSON PATH,ROOT('arr'))
SELECT REPLACE(@json, '{}', '')
By itself, JSON_QUERY('[]') AS [features] did not work for me. I found that the results were formatted as follows:
"features":"[]"
which was not desirable.
To get the desired result, I needed to store the JSON in a variable, then perform a REPLACE on the result, as follows:
DECLARE @json VARCHAR(MAX) = (SELECT JSON_QUERY('[]') AS [features],
-- Other selected fields elided for brevity
FROM MyTable
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER, INCLUDE_NULL_VALUES);
SET @json = REPLACE(@json, '"features":"[]"', '"features":[]');
SELECT @json;
Yes, it's a terrible hack. But it works, and returns the result I want. Our client absolutely must have an empty array returned, and this was the best way I could find to ensure that it was present.
Right now I had exactly the same problem; I think this is the right way to handle it, according to the Microsoft documentation:
DECLARE @Array TABLE(TEST VARCHAR(100));
SELECT
arr = ISNULL((SELECT TEST FROM @Array FOR JSON PATH), '[]')
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
Result:
{"arr":[]}
Using the JSON_OBJECT and JSON_ARRAY functions (SQL Server 2022):
SELECT JSON_OBJECT('arr':JSON_ARRAY())
