SQL Server - OPENJSON output with an explicit schema - arrays

I have the below array, returned by an HTTP request, in JSON format:
[
  {
    "Code":"856956645",
    "Type":"Colet",
    "MeasuredWeight":0.0,
    "VolumetricWeight":0.0,
    "ConfirmationName":null,
    "Observation":" 100 DE SFATURI OASELE",
    "ResponseCode":null,
    "Event":
    [
      {
        "Date":"2018-11-16T16:22:29.397",
        "EventId":73,
        "Description":"Ridicare din comanda client",
        "LocalityName":"BUCURESTI"
      },
      {
        "Date":"2018-11-17T08:55:06.14",
        "EventId":5,
        "Description":"Spre destinatar ",
        "LocalityName":"BUCURESTI"
      }
    ]
  }
]
How can I extract the value of the Description element within the second set of values? I tried with OPENJSON, but I couldn't get it to work:
SELECT *
FROM OPENJSON(@json)
WITH (
    Description nvarchar(100) '$.Event.Description'
);

Try nesting instead. Not sure why your attempt didn't work; I've not had much use for OPENJSON as yet, apart from when playing around. However, this works:
SELECT J.Code, J.[Type], E.[Description]
FROM OPENJSON(@json)
     WITH (Code bigint '$.Code',
           [Type] varchar(10) '$.Type',
           [Event] nvarchar(MAX) AS JSON) J
CROSS APPLY OPENJSON([Event])
     WITH ([Description] varchar(100) '$.Description',
           EventID int '$.EventId') E
WHERE E.EventID = 5;
Edit: Worked out why your attempt wasn't working. The JSON you have has a nested JSON object inside the Event node; these aren't simply properties, like in the documentation's second example here. The entities are wrapped in further brackets ([]), not just braces ({}), which is why you have to parse the next layer again as a separate JSON object.
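If you only need that one value and its position is known, a direct path lookup also works. A minimal sketch against the same @json variable (not from the original answer); JSON path array indexes are zero-based, so [1] addresses the second event:
-- Second Event entry of the first (and only) element of the outer array
SELECT JSON_VALUE(@json, '$[0].Event[1].Description') AS [Description];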

Related

Parsing string with multiple delimiters into columns

I want to split strings into columns.
My columns should be:
account_id, resource_type, resource_name
I have a JSON file source that I have been trying to parse via ADF data flow. That hasn't worked for me, hence I flattened the data and brought it into SQL Server (I am open to parsing values via ADF or SQL if anyone can show me how). Please check the JSON file at the bottom.
Use this code to set up the data I am working with:
CREATE TABLE test.test2
(
    resource_type nvarchar(max) NULL
);

INSERT INTO test.test2 ([resource_type])
VALUES
    ('account_id:224526257458,resource_type:buckets,resource_name:camp-stage-artifactory'),
    ('account_id:535533456241,resource_type:buckets,resource_name:tni-prod-diva-backups'),
    ('account_id:369798452057,resource_type:buckets,resource_name:369798452057-s3-manifests'),
    ('account_id:460085747812,resource_type:buckets,resource_name:vessel-incident-report-nonprod-accesslogs');
The output that I should be able to query in SQL Server should look like this:
account_id     resource_type  resource_name
224526257458   buckets        camp-stage-artifactory
535533456241   buckets        tni-prod-diva-backups
and so forth.
Please help me out and ask for clarification if needed. Thanks in advance.
EDIT:
Source JSON Format:
{
  "start_date": "2021-12-01 00:00:00+00:00",
  "end_date": "2021-12-31 23:59:59+00:00",
  "resource_type": "all",
  "records": [
    {
      "directconnect_connections": [
        "account_id:227148359287,resource_type:directconnect_connections,resource_name:'dxcon-fh40evn5'",
        "account_id:401311080156,resource_type:directconnect_connections,resource_name:'dxcon-ffxgf6kh'",
        "account_id:401311080156,resource_type:directconnect_connections,resource_name:'dxcon-fg5j5v6o'",
        "account_id:227148359287,resource_type:directconnect_connections,resource_name:'dxcon-fgvfo1ej'"
      ]
    },
    {
      "virtual_interfaces": [
        "account_id:227148359287,resource_type:virtual_interfaces,resource_name:'dxvif-fgvj25vt'",
        "account_id:227148359287,resource_type:virtual_interfaces,resource_name:'dxvif-fgbw5gs0'",
        "account_id:401311080156,resource_type:virtual_interfaces,resource_name:'dxvif-ffnosohr'",
        "account_id:227148359287,resource_type:virtual_interfaces,resource_name:'dxvif-fg18bdhl'",
        "account_id:227148359287,resource_type:virtual_interfaces,resource_name:'dxvif-ffmf6h64'",
        "account_id:390251991779,resource_type:virtual_interfaces,resource_name:'dxvif-fgkxjhcj'",
        "account_id:227148359287,resource_type:virtual_interfaces,resource_name:'dxvif-ffp6kl3f'"
      ]
    }
  ]
}
Since you don't have a valid JSON string, and not wanting to get into the business of string manipulation... perhaps this will help.
Select B.*
 From test.test2 A
 Cross Apply ( Select account_id    = max(case when value like 'account_id:%'    then stuff(value,1,11,'') end)
                     ,resource_type = max(case when value like 'resource_type:%' then stuff(value,1,14,'') end)
                     ,resource_name = max(case when value like 'resource_name:%' then stuff(value,1,14,'') end)
                 from string_split(resource_type,',')
             ) B
Results
account_id resource_type resource_name
224526257458 buckets camp-stage-artifactory
535533456241 buckets tni-prod-diva-backups
369798452057 buckets 369798452057-s3-manifests
460085747812 buckets vessel-incident-report-nonprod-accesslogs
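To see why the conditional aggregation works, run string_split() on one of the literals by itself: every comma-delimited piece comes back as its own row, and the max(case ...) expressions fold those rows back into a single row. A minimal sketch (using a value from the INSERT above):
-- Returns three rows: 'account_id:224526257458', 'resource_type:buckets', 'resource_name:camp-stage-artifactory'
select value
  from string_split('account_id:224526257458,resource_type:buckets,resource_name:camp-stage-artifactory', ',');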
Unfortunately, the values inside the arrays are not valid JSON. You can patch them up by adding {} to the beginning/end, and adding " on either side of : and ,.
DECLARE @json nvarchar(max) = N'{
  "start_date": "2021-12-01 00:00:00+00:00",
  "end_date": "2021-12-31 23:59:59+00:00",
  "resource_type": "all",
  "records": [
    {
      "directconnect_connections": [
        "account_id:227148359287,resource_type:directconnect_connections,resource_name:''dxcon-fh40evn5''",
        "account_id:401311080156,resource_type:directconnect_connections,resource_name:''dxcon-ffxgf6kh''",
        "account_id:401311080156,resource_type:directconnect_connections,resource_name:''dxcon-fg5j5v6o''",
        "account_id:227148359287,resource_type:directconnect_connections,resource_name:''dxcon-fgvfo1ej''"
      ]
    },
    {
      "virtual_interfaces": [
        "account_id:227148359287,resource_type:virtual_interfaces,resource_name:''dxvif-fgvj25vt''",
        "account_id:227148359287,resource_type:virtual_interfaces,resource_name:''dxvif-fgbw5gs0''",
        "account_id:401311080156,resource_type:virtual_interfaces,resource_name:''dxvif-ffnosohr''",
        "account_id:227148359287,resource_type:virtual_interfaces,resource_name:''dxvif-fg18bdhl''",
        "account_id:227148359287,resource_type:virtual_interfaces,resource_name:''dxvif-ffmf6h64''",
        "account_id:390251991779,resource_type:virtual_interfaces,resource_name:''dxvif-fgkxjhcj''",
        "account_id:227148359287,resource_type:virtual_interfaces,resource_name:''dxvif-ffp6kl3f''"
      ]
    }
  ]
}';
SELECT
    j4.account_id,
    j4.resource_type,
    TRIM('''' FROM j4.resource_name) resource_name
FROM OPENJSON(@json, '$.records') j1
CROSS APPLY OPENJSON(j1.value) j2
CROSS APPLY OPENJSON(j2.value) j3
CROSS APPLY OPENJSON('{"' + REPLACE(REPLACE(j3.value, ':', '":"'), ',', '","') + '"}')
    WITH (
        account_id bigint,
        resource_type varchar(20),
        resource_name varchar(100)
    ) j4;
db<>fiddle
The first three calls to OPENJSON have no schema, so the result set has three columns: key, value, and type. In the case of arrays (j1 and j3), key is the index into the array. In the case of single objects (j2), key is each property name.
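As a quick illustration of that default shape (a sketch against the same @json variable), the first level alone returns one row per record, with the whole object in value and type 5, meaning a JSON object:
-- key = 0, 1 (array index); value = each {...} object; type = 5 (object)
SELECT [key], [value], [type]
FROM OPENJSON(@json, '$.records');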

T-SQL Json_modify to append a property to each object

I have a stored procedure that accepts a JSON string as input parameter. The input JSON string is like this:
[
  {
    "name":"Jhon",
    "surname":"Smith",
    "skills":["C#","VB.NET"]
  },
  {
    "name":"Robert",
    "surname":"Jhonson",
    "skills":["T-SQL","Pascal"]
  }
]
How can I add a unique GUID property to each principal object automatically?
Looking at your example data, you have already discovered the page of the documentation that tells you how to insert values with the json_modify() function. The examples on that page are written for a single "principal object".
If I interpret this correctly, your sample has 2 principal objects. Another page of the documentation shows how you can split that sample into rows with the openjson() function. You can then apply json_modify() from the first documentation page to each row.
declare @var nvarchar(max) =
'[
{
"name":"Jhon",
"surname":"Smith",
"skills":["C#","VB.NET"]
},
{
"name":"Robert",
"surname":"Jhonson",
"skills":["T-SQL","Pascal"]
}
]';
select row_number() over(order by (select null)) as ObjNumber,
       json_modify(j.value, '$.guid', convert(nvarchar(100), newid())) as ObjValue
  from openjson(@var, '$') j
The result looks like this:
ObjNumber ObjValue
----------- ----------------------------------------------------
1 {
"name":"Jhon",
"surname":"Smith",
"skills":["C#","VB.NET"]
,"guid":"154C5581-588C-41AA-B292-BB6459F8F4DC"}
2 {
"name":"Robert",
"surname":"Jhonson",
"skills":["T-SQL","Pascal"]
,"guid":"46ACFDD6-58DE-4DB0-8D7A-9B1CCABFF8D8"}
Fiddle
To add the rows back together, just add for json path. This does, however, require a field alias (here MyObjects) that ends up in the output.
select json_modify(j.value, '$.guid', convert(nvarchar(100), newid())) as MyObjects
  from openjson(@var, '$') j
   for json path;
Output:
[{"MyObjects":{
"name":"Jhon",
"surname":"Smith",
"skills":["C#","VB.NET"]
,"guid":"FCED4D30-B2B0-460B-97FA-EDA820039572"}},{"MyObjects":{
"name":"Robert",
"surname":"Jhonson",
"skills":["T-SQL","Pascal"]
,"guid":"9FF02A70-0455-4E5C-8C11-27BB2688929D"}}]
Fiddle
To update the variable, use the following code. Bonus: replace() removes the previously added field alias.
set @var = replace(
    ( select json_modify(j.value, '$.guid', convert(nvarchar(100), newid())) as MyObjects
        from openjson(@var, '$') j
         for json path ),
    '"MyObjects":', '');
Final output for select @var:
[{{
"name":"Jhon",
"surname":"Smith",
"skills":["C#","VB.NET"]
,"guid":"66CB37D3-FAEF-4186-94D8-8AC0CF6EB1AC"}},{{
"name":"Robert",
"surname":"Jhonson",
"skills":["T-SQL","Pascal"]
,"guid":"564D6904-D981-40AC-BA9C-8B06015ACE50"}}]
Fiddle
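One caveat: after the replace() the string is no longer strictly valid JSON, because removing the alias leaves doubled braces ({{ ... }}), as you can see in the final output above. If the result needs to stay well formed, a string_agg() rebuild of the array avoids the alias altogether. A minimal sketch, not from the original answer, assuming SQL Server 2017+ for string_agg():
-- Aggregate the modified objects straight back into a JSON array
set @var = (
    select concat('[', string_agg(json_modify(j.value, '$.guid', convert(nvarchar(100), newid())), ','), ']')
      from openjson(@var, '$') j
);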

Update/Add attribute to json array in T-SQL

All of the examples I see dealing with JSON arrays have the array nested under a top level object. I have a JSON array in a column:
[{"key": "value1"}, {"key": "value2"}]
I would like to run a SQL script to add/update a key for each element in the array, resulting in:
[{"key": "value1", "otherKey": "otherValue"}, {"key": "value2", "otherKey": "otherValue"}]
Yes, in my case I want the same value set for each array member. I've tried:
declare @info nvarchar(max)
SET @info = '[{"key": "value1"}, {"key": "value2"}]'
print JSON_MODIFY(@info, '[0].otherKey', '""')
and fails with "JSON path is not properly formatted. Unexpected character '[' is found at position 0."
This is in MSSQL 2017.
The approach that can be used depends on the JSON structure (I assume that the count of items in the JSON array is not fixed):
If the input JSON array has items (JSON objects) with fixed key/keys, you may use a combination of OPENJSON() with explicit schema (to parse this JSON as a table) and FOR JSON (to modify and return the rows as JSON).
If the input JSON has items with different structure, you may use a combination of OPENJSON() with default schema, JSON_MODIFY() and STRING_AGG().
JSON:
declare @info nvarchar(max)
SET @info = '[{"key": "value1"}, {"key": "value2"}]'
Statement for fixed structure:
SELECT @info = (
    SELECT [key], 'OtherValue' AS OtherKey
    FROM OPENJSON(@info) WITH ([key] varchar(100) '$.key')
    FOR JSON PATH
)
Statement for variable structure:
SELECT @info = CONCAT('[', STRING_AGG(JSON_MODIFY([value], '$.OtherKey', 'OtherValue'), ','), ']')
FROM OPENJSON(@info)
SELECT @info
Result:
[{"key":"value1","OtherKey":"OtherValue"},{"key":"value2","OtherKey":"OtherValue"}]
Note that the reason for the error is that the value of the path parameter ([0].otherKey) is wrong. The correct path expression is $[0].otherKey for the first item in the JSON array.
Have you tried adding $ before the indexer sign? Like:
declare @info nvarchar(max)
SET @info = '[{"key": "value1"}, {"key": "value2"}]'
SELECT JSON_MODIFY(@info, '$[0].otherKey', '""')
This gives the following output:
[{"key": "value1","otherKey":"\"\""}, {"key": "value2"}]

SQL Server: How to remove a key from a Json object

I have a query like (simplified):
SELECT
JSON_QUERY(r.SerializedData, '$.Values') AS [Values]
FROM
<TABLE> r
WHERE ...
The result is like this:
{ "2019":120, "20191":120, "201902":121, "201903":134, "201904":513 }
How can I remove the entries with a key length less than 6?
Result:
{ "201902":121, "201903":134, "201904":513 }
One possible solution is to parse the JSON and generate it again using string manipulation, keeping only the keys with the desired length:
Table:
CREATE TABLE Data (SerializedData nvarchar(max))
INSERT INTO Data (SerializedData)
VALUES (N'{"Values": { "2019":120, "20191":120, "201902":121, "201903":134, "201904":513 }}')
Statement (for SQL Server 2017+):
UPDATE Data
SET SerializedData = JSON_MODIFY(
    SerializedData,
    '$.Values',
    JSON_QUERY((
        SELECT CONCAT('{', STRING_AGG(CONCAT('"', [key], '":', [value]), ','), '}')
        FROM OPENJSON(SerializedData, '$.Values') j
        WHERE LEN([key]) >= 6
    ))
)
SELECT JSON_QUERY(d.SerializedData, '$.Values') AS [Values]
FROM Data d
Result:
Values
{"201902":121,"201903":134,"201904":513}
Notes:
It's important to note that JSON_MODIFY() in lax mode deletes the specified key if the new value is NULL and the path points to a JSON object. But in this specific case (a JSON object with variable key names), I prefer the above solution.
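As a quick, standalone illustration of that lax-mode behaviour (not from the original answer):
-- Lax mode (the default): passing NULL removes the key; result is {"b":2}
SELECT JSON_MODIFY(N'{"a":1,"b":2}', '$.a', NULL)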

MSSQL JSON_VALUE to match ANY Object in Array

I have a table with a JSON text field:
create table breaches(breach_id int, detail text);
insert into breaches values
( 1,'[{"breachedState": null},
{"breachedState": "PROCESS_APPLICATION",}]')
I'm trying to use MSSQL's built-in JSON parsing functions to test whether ANY object in a JSON array has a matching member value.
If the detail field was a single JSON object, I could use:
select * from breaches
where JSON_VALUE(detail,'$.breachedState') = 'PROCESS_APPLICATION'
but it's an Array, and I want to know if ANY Object has breachedState = 'PROCESS_APPLICATION'
Is this possible using MSSQL's JSON functions?
You can use the OPENJSON function to check each object; try this query:
select * from breaches
where exists
      (
        select *
        from OPENJSON(detail) d
        where JSON_VALUE(value, '$.breachedState') = 'PROCESS_APPLICATION'
      )
Btw, there is an extra "," in your insert query, it should be:
insert into breaches values
( 1,'[{"breachedState": null},
{"breachedState": "PROCESS_APPLICATION"}]')
