I'm required to supply a JSON object like this:
[
{
id: '59E59BC82852A1A5881C082D5FFCAC10',
user: {
...users[1],
last_message: "16-06-2022",
topic: "Shipment"
},
unread: 2,
},
{
id: '521A754B2BD028B13950CB08CDA49075',
user: {
...users[2],
last_message: "15-06-2022",
topic: "Settings"
},
unread: 0,
}
]
It is not difficult for me to build JSON like this
(with this fiddle: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=bf62626de20d3ca7191aa9c1ef0cd39b):
[
{
"id": "59E59BC82852A1A5881C082D5FFCAC10",
"user": {
"id": 1,
"last_message": "16-06-2022",
"topic": "Shipment"
},
"unread": 2
},
{
"id": "521A754B2BD028B13950CB08CDA49075",
"user": {
"id": 2,
"last_message": "15-06-2022",
"topic": "Settings"
},
"unread": 1
},
{
"id": "898BB874D0CBBB1EFBBE56D2626DC847",
"user": {
"id": 3,
"last_message": "18-06-2022",
"topic": "Account"
},
"unread": 1
}
]
but I have no idea how to put the ...users[1] into the user node instead of "id": 1.
Is there a way?
This is not actually valid JSON, but you can create it yourself using STRING_AGG and CONCAT
SELECT
    '[' + STRING_AGG(u.spread, ',') + ']'
FROM (
    SELECT
        spread = CONCAT(
            '{id:''',
            u.userId,
            ''',user:{...users[',
            ROW_NUMBER() OVER (ORDER BY u.id),
            '],last_message: "',
            t.created_at,
            '",topic:"',
            t.topic,
            '"},unread:',
            (SELECT COUNT(*) FROM #Tickets t3 WHERE t3.userId = u.userId AND t3.read_at IS NULL),
            '}'
        )
    FROM #Users u
    CROSS APPLY (
        SELECT TOP (1)
            t.ticketId,
            t.created_at,
            t.topic
        FROM #Tickets t
        WHERE t.userId = u.userId
        ORDER BY
            t.created_at DESC
    ) t
) u
Note that you may need to escape values, but I don't know how this not-JSON format works, so I couldn't say.
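If the topic or date values can ever contain quotes or other special characters, STRING_ESCAPE (SQL Server 2016+) can do the escaping before the values are concatenated in; a minimal sketch, reusing the #Tickets table from the query above:
SELECT CONCAT('"', STRING_ESCAPE(t.topic, 'json'), '"') AS escaped_topic
FROM #Tickets t;
-- STRING_ESCAPE escapes quotes, backslashes and control characters,
-- so the concatenated string stays well-formed.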
With standard fields, like id, this works perfectly, but I am not finding a way to parse the custom fields, where the structure is:
"custom_fields": [
{
"id": 57852188,
"value": ""
},
{
"id": 57522467,
"value": ""
},
{
"id": 57522487,
"value": ""
}
]
The general format that I have been using is:
select v:id, v:updatedat
from zd_tickets
Updated data:
{
"id":151693,
"brand_id": 36000,
"created_at": "2022-0523T19:26:35Z",
"custom_fields": [
{ "id": 57866008, "value": false },
{ "id": 360022282754, "value": "" },
{ "id": 80814087, "value": "NC" } ],
"group_id": 36000770
}
If you want to select all repeating elements, you will need to use FLATTEN; otherwise you can use standard notation. This is all documented here: https://docs.snowflake.com/en/user-guide/querying-semistructured.html#retrieving-a-single-instance-of-a-repeating-element
So, using this CTE to access the data in a way that looks like a table:
with data(json) as (
select parse_json(column1) from values
('{
"id":151693,
"brand_id": 36000,
"created_at": "2022-0523T19:26:35Z",
"custom_fields": [
{ "id": 57866008, "value": false },
{ "id": 360022282754, "value": "" },
{ "id": 80814087, "value": "NC" } ],
"group_id": 36000770
} ')
)
SQL to unpack the top-level items, which you have shown you already have working:
select
json:id::number as id
,json:brand_id::number as brand_id
,try_to_timestamp(json:created_at::text, 'yyyy-mmddThh:mi:ssZ') as created_at
,json:custom_fields as custom_fields
from data;
gives:
ID     | BRAND_ID | CREATED_AT              | CUSTOM_FIELDS
151693 | 36000    | 2022-05-23 19:26:35.000 | [ { "id": 57866008, "value": false }, { "id": 360022282754, "value": "" }, { "id": 80814087, "value": "NC" } ]
So now, how to tackle that JSON array of custom_fields?
Well, if you only ever have 3 values, and the order is always the same...
select
to_array(json:custom_fields) as custom_fields_a
,custom_fields_a[0] as field_0
,custom_fields_a[1] as field_1
,custom_fields_a[2] as field_2
from data;
gives:
CUSTOM_FIELDS_A | FIELD_0 | FIELD_1 | FIELD_2
[ { "id": 57866008, "value": false }, { "id": 360022282754, "value": "" }, { "id": 80814087, "value": "NC" } ] | { "id": 57866008, "value": false } | { "id": 360022282754, "value": "" } | { "id": 80814087, "value": "NC" }
So we can use FLATTEN to access those objects, which produces "more rows":
select
d.json:id::number as id
,d.json:brand_id::number as brand_id
,try_to_timestamp(d.json:created_at::text, 'yyyy-mmddThh:mi:ssZ') as created_at
,f.*
from data as d
,table(flatten(input=>json:custom_fields)) f
ID     | BRAND_ID | CREATED_AT              | SEQ | KEY  | PATH | INDEX | VALUE                               | THIS
151693 | 36000    | 2022-05-23 19:26:35.000 | 1   | NULL | [0]  | 0     | { "id": 57866008, "value": false }  | [ { "id": 57866008, "value": false }, { "id": 360022282754, "value": "" }, { "id": 80814087, "value": "NC" } ]
151693 | 36000    | 2022-05-23 19:26:35.000 | 1   | NULL | [1]  | 1     | { "id": 360022282754, "value": "" } | [ { "id": 57866008, "value": false }, { "id": 360022282754, "value": "" }, { "id": 80814087, "value": "NC" } ]
151693 | 36000    | 2022-05-23 19:26:35.000 | 1   | NULL | [2]  | 2     | { "id": 80814087, "value": "NC" }   | [ { "id": 57866008, "value": false }, { "id": 360022282754, "value": "" }, { "id": 80814087, "value": "NC" } ]
So we can pull out known values (a manual PIVOT):
select
d.json:id::number as id
,d.json:brand_id::number as brand_id
,try_to_timestamp(d.json:created_at::text, 'yyyy-mmddThh:mi:ssZ') as created_at
,max(iff(f.value:id=80814087, f.value:value::text, null)) as v80814087
,max(iff(f.value:id=360022282754, f.value:value::text, null)) as v360022282754
,max(iff(f.value:id=57866008, f.value:value::text, null)) as v57866008
from data as d
,table(flatten(input=>json:custom_fields)) f
group by 1,2,3, f.seq
Grouping by f.seq means that if you have many "rows" of input, these will be kept apart, even if they share common values for columns 1, 2 and 3.
gives:
ID     | BRAND_ID | CREATED_AT              | V80814087 | V360022282754  | V57866008
151693 | 36000    | 2022-05-23 19:26:35.000 | NC        | <empty string> | false
Now, if you do not know the names of the values, there is no way short of dynamic SQL and double parsing to turn rows into columns.
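One workaround, if you only need to look values up by id rather than turn them into real columns, is to collapse the flattened pairs back into a single object with OBJECT_AGG; a sketch, reusing the data CTE from above:
select
     d.json:id::number as id
    ,object_agg(f.value:id::string, f.value:value) as custom_fields_obj
from data as d
    ,table(flatten(input => d.json:custom_fields)) f
group by 1;
-- custom_fields_obj is one OBJECT keyed by the custom field id,
-- so a value can be read later as custom_fields_obj['80814087']
-- without knowing the full set of ids up front.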
I ended up doing the following, with 2 different CTEs (CTE and UCF):
Used to_array to gather my custom fields
Unioned the custom fields together, once per field, extracting the id of the field and its value (using combinations of SUBSTRING, POSITION and REPLACE to clean up the data as needed; same setup for all fields)
Joined the resulting data to a custom fields table (which contains the id and a name) to include the name of the custom field in my result set.
WITH UCF AS ( -- Union gathered array into 2 fields (an id field and a value field)
    WITH CTE AS ( -- Gather array of custom fields
        SELECT v:id as id,
               to_array(v:custom_fields) as cf,
               cf[0] as f0, cf[1] as f1, cf[2] as f2
        FROM ZD_TICKETS)
    SELECT id,
           substring(f0,7,position(',',f0)-7) AS cf_id,
           REPLACE(substring(f0,position('value":',f0)+8,position('"',f0,position('value":',f0)+8)),'"}') AS cf_value
    FROM CTE c
    WHERE f0 not like '%null%'
    UNION
    SELECT id,
           substring(f1,7,position(',',f1)-7) AS cf_id,
           REPLACE(substring(f1,position('value":',f1)+8,position('"',f1,position('value":',f1)+8)),'"}') AS cf_value
    FROM CTE c
    WHERE f1 not like '%null%'
    -- field 3
    UNION
    SELECT id,
           substring(f2,7,position(',',f2)-7) AS cf_id,
           REPLACE(substring(f2,position('value":',f2)+8,position('"',f2,position('value":',f2)+8)),'"}') AS cf_value
    FROM CTE c
    WHERE f2 not like '%null%' -- this removes records where the value is null
)
SELECT UCF.*, CFD.name
FROM UCF
LEFT OUTER JOIN "FLBUSINESS_DB"."STAGING"."FILE_ZD_CUSTOM_FIELD_IDS" CFD
    ON CFD.id = UCF.cf_id
WHERE cf_value <> '' -- this removes records where the value is blank
The result set looks like:
I need to reach my final result from this JSON file (just an example):
{
"Id": "101",
"name": "C01",
"testparameters": {
"room": [
{
"Floor": "First_Floor",
"Rooms": ["Room1", "Room2", "Room3"]
},
{
"Floor": "Second_Floor",
"Rooms": ["Room1", "Room2", "Room3"]
}
],
"area": [
{
"Name": "Area1",
"Subarea": ["Subarea1", "Subarea2", "Subarea3"]
},
{
"Name": "Area2",
"Subarea": ["Subarea4", "Subarea5"]
}
],
"requirements": [{
"condition": "",
"type": "type1",
"field1": "",
"field2": "aaaaa",
"operator": "",
"value2": ""
},
{
"condition": "AND",
"type": "type2",
"field1": "",
"field2": "numPersons",
"operator": ">",
"value2": "20"
},
{
"condition": "OR",
"type": "type2",
"field1": "",
"field2": "specification",
"operator": "=",
"value2": "wifi"
}
]
}
}
In one record I need to have all the information that is requested.
This is the first time that I have needed to parse a JSON file. After asking (a lot) I managed to reach the expected result by doing this:
Parsing JSON Example
However, I had to open the JSON file several times and process each section separately. I'm wondering how I can improve the code by reducing the number of times that I need to use the OPENJSON function, and in particular, how to rewrite the code snippet that handles the requirements section.
I must say, your desired result looks pretty denormalized; you may want to rethink it.
Be that as it may, you can combine these quite easily by using nested subqueries:
SELECT
    ID = JSON_VALUE(j.json, '$.Id'),
    name = JSON_VALUE(j.json, '$.name'),
    area = (
        SELECT STRING_AGG(concat(d.a, ':', b.value), ' - ')
        from openjson(j.json, '$.testparameters.area')
        with
        (
            a nvarchar(250) '$.Name',
            s nvarchar(max) '$.Subarea' as json
        ) as d
        cross apply openjson(d.s) as b
    ),
    room = (
        SELECT STRING_AGG(concat(c.f, ':', d.value), ' - ')
        from openjson(j.json, '$.testparameters.room')
        with(
            f nvarchar(50) '$.Floor',
            r nvarchar(max) '$.Rooms' as json
        ) as c
        cross apply openjson(c.r) as d
    ),
    requirements = (
        SELECT IIF(
            SUBSTRING(requirements,1,3) = 'AND' or SUBSTRING(requirements,1,3) = 'OR',
            SUBSTRING(requirements,5,LEN(requirements)),
            requirements
        )
        from
        (
            select
                STRING_AGG(CONCAT_WS(' ',
                    a.condition,
                    a.field2,
                    operator,
                    IIF(ISNUMERIC(a.value2) = 1,
                        a.value2,
                        CONCAT('''',a.value2,'''')
                    )
                ),
                ' ') as requirements
            from openjson(j.json, '$.testparameters.requirements')
            with
            (
                condition nvarchar(255) '$.condition',
                type nvarchar(255) '$.type',
                field2 nvarchar(255) '$.field2',
                operator nvarchar(255) '$.operator',
                value2 nvarchar(255) '$.value2'
            ) a
            where a.type = 'type2'
        ) a
    )
FROM (VALUES(@json)) AS j(json) -- or you can reference a table
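If the JSON lives in a table rather than a variable, the same pattern should carry over by swapping the final FROM; a sketch, assuming a hypothetical table dbo.Configs with an nvarchar(max) column JsonData:
SELECT
    ID = JSON_VALUE(j.JsonData, '$.Id'),
    name = JSON_VALUE(j.JsonData, '$.name')
    -- the area, room and requirements subqueries stay exactly the same,
    -- with j.json replaced by j.JsonData
FROM dbo.Configs AS j;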
I have the below JSON file, which is in an external stage, and I'm trying to write a COPY query into a table with the query below. But it's fetching a single record from the "values" node, whereas I need to insert all child elements of the values node. I have loaded this file into a table with the VARIANT datatype.
The query I'm using:
select record:batchId batchId, record:results[0].pageInfo.numberOfPages NoofPages, record:results[0].pageInfo.pageNumber pageNo,
record:results[0].pageInfo.pageSize PgSz, record:results[0].requestId requestId,record:results[0].showPopup showPopup,
record:results[0].values[0][0].columnId columnId,record:results[0].values[0][0].value val
from lease;
{
"batchId": "",
"results": [
{
"pageInfo": {
"numberOfPages": ,
"pageNumber": ,
"pageSize":
},
"requestId": "",
"showPopup": false,
"values": [
[
{
"columnId": ,
"value": ""
},
{
"columnId": ,
"value":
}
]
]
}
]
}
You need to use the LATERAL FLATTEN function, something like this:
I created this table:
create table json_test (seq_no integer, json_text variant);
and then populated it with this JSON string:
insert into json_test(seq_no, json_text)
select 1, parse_json($${
"batchId": "a",
"results": [
{
"pageInfo": {
"numberOfPages": "1",
"pageNumber": "1",
"pageSize": "100000"
},
"requestId": "a",
"showPopup": false,
"values": [
[
{
"columnId": "4567",
"value": "2020-10-09T07:24:29.000Z"
},
{
"columnId": "4568",
"value": "2020-10-10T10:24:29.000Z"
}
]
]
}
]
}$$);
Then the following query:
select
json_text:batchId batchId
,json_text:results[0].pageInfo.numberOfPages numberOfPages
,json_text:results[0].pageInfo.pageNumber pageNumber
,json_text:results[0].pageInfo.pageSize pageSize
,json_text:results[0].requestId requestId
,json_text:results[0].showPopup showPopup
,f.value:columnId columnId
,f.value:value value
from json_test t
,lateral flatten(input => t.json_text:results[0]:values[0]) f;
gives these results - which I think is roughly what you are looking for:
BATCHID | NUMBEROFPAGES | PAGENUMBER | PAGESIZE | REQUESTID | SHOWPOPUP | COLUMNID | VALUE
"a"     | "1"           | "1"        | "100000" | "a"       | false     | "4567"   | "2020-10-09T07:24:29.000Z"
"a"     | "1"           | "1"        | "100000" | "a"       | false     | "4568"   | "2020-10-10T10:24:29.000Z"
The JSON object contains this:
"entities": {
"hashtags": [
{
"indices": [
17,
29
],
"text": "HBOMAX4ZACK"
},
{
"indices": [
38,
51
],
"text": "TheSnyderCut"
}
],
I want to select only those rows that contain 'XYZ' in ANY ONE of the entries in 'hashtags'. I know I can do this:
select record_content:text, * from tweets where record_content:entities:hashtags[0].text = 'HBOMAX4ZACK';
But as you can see, I have hard-coded 'hashtags[0]' in this case. I want to check if 'HBOMAX4ZACK' exists in any element.
Use LATERAL and FLATTEN to explode the JSON. With FLATTEN you are exploding the JSON up to the level you need (given by path):
select t.json,
t2.VALUE value_matching
from (select parse_json('{
"entities": {
"hashtags": [{
"indices": [
17,
29
],
"text": "HBOMAX4ZACK"
},
{
"indices": [
38,
51
],
"text": "TheSnyderCut"
}
]
}
}') as json) t,
lateral flatten(input => parse_json(t.json), path => 'entities.hashtags') t2
WHERE t2.VALUE:text = 'HBOMAX4ZACK';
You can find a tutorial on the topic here: https://community.snowflake.com/s/article/How-To-Lateral-Join-Tutorial
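Applied to the tweets table from your question, the same pattern might look like this (record_content is the column name used in your query):
select distinct t.record_content:text
from tweets t
    ,lateral flatten(input => t.record_content:entities.hashtags) h
where h.value:text = 'HBOMAX4ZACK';
-- h.value is one hashtag object per row; DISTINCT collapses duplicates
-- in case the same hashtag appears more than once in a single tweet.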
I have a table [JsonTable], and the column [JsonData] stores the JSON string.
JsonData looks like:
{
"Content": [
{
"ContentId": "123",
"Type": 1
},
{
"ContentId": "456",
"Type": 2
}
]
}
or
[
{
"ContentId": "123",
"Type": 1
},
{
"ContentId": "456",
"Type": 2
}
]
How can I inner join like this?
SELECT * FROM [Content] AS C
INNER JOIN [JsonTable] AS J ON C.[Id] IN (SELECT value FROM OPENJSON(J.[JsonData],'$.Content.ContentId'))
You can use this query.
SELECT * FROM [Content] AS C
INNER JOIN [JsonTable] AS J ON C.[Id] IN (SELECT JSON_VALUE(value,'$.ContentId')
FROM OPENJSON(J.[JsonData],'$.Content'))
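If JsonData can also be the bare-array shape from your second example, one way to cover both might be to fall back to the document root when $.Content is missing, e.g. with JSON_QUERY and COALESCE; a sketch:
SELECT * FROM [Content] AS C
INNER JOIN [JsonTable] AS J ON C.[Id] IN (SELECT JSON_VALUE(value,'$.ContentId')
    FROM OPENJSON(COALESCE(JSON_QUERY(J.[JsonData],'$.Content'), J.[JsonData])))
-- JSON_QUERY returns the Content array when it exists and NULL otherwise,
-- so COALESCE falls back to the whole document when the root is already an array.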