In Postgres 11.x I am trying to aggregate elements in a nested jsonb object which has an array field into a single row per device_id. Here's example data for a table called configurations.
 id | device_id | data
----+-----------+--------------------------------------------------------------
  1 |         1 | {"sensors": [{"other_data": {}, "sensor_type": 1}], "other_data": {}}
  2 |         1 | {"sensors": [{"other_data": {}, "sensor_type": 1}, {"other_data": {}, "sensor_type": 2}], "other_data": {}}
  3 |         1 | {"sensors": [{"other_data": {}, "sensor_type": 3}], "other_data": {}}
  4 |         2 | {"sensors": [{"other_data": {}, "sensor_type": 4}], "other_data": {}}
  5 |         2 | {"sensors": null, "other_data": {}}
  6 |         3 | {"sensors": [], "other_data": {}}
My goal output would have a single row per device_id with an array of distinct sensor_types, example:
 device_id | sensor_types
-----------+--------------
         1 | [1,2,3]
         2 | [4]
         3 | []            (null would also be fine here)
I've tried a bunch of things but keep running into various problems. Here's some SQL to set up a test environment:
CREATE TEMPORARY TABLE configurations(
    id SERIAL PRIMARY KEY,
    device_id INT,
    data JSONB
);
INSERT INTO configurations(device_id, data) VALUES
(1, '{ "other_data": {}, "sensors": [ { "sensor_type": 1, "other_data": {} } ] }'),
(1, '{ "other_data": {}, "sensors": [ { "sensor_type": 1, "other_data": {} }, { "sensor_type": 2, "other_data": {} }] }'),
(1, '{ "other_data": {}, "sensors": [ { "sensor_type": 3, "other_data": {} }] }'),
(2, '{ "other_data": {}, "sensors": [ { "sensor_type": 4, "other_data": {} }] }'),
(2, '{ "other_data": {}, "sensors": null }'),
(3, '{ "other_data": {}, "sensors": [] }');
Quick note: my real table has about 100,000 rows, and the JSONB data is much more complicated, but it follows this general structure.
A JSONB null value causes problems in Postgres and is best avoided when possible. You can convert it to an empty array with the expression
coalesce(nullif(data->'sensors', 'null'), '[]')
The first attempt:
select device_id, array_agg(distinct value->'sensor_type') as sensor_types
from configurations
left join jsonb_array_elements(coalesce(nullif(data->'sensors', 'null'), '[]')) on true
group by device_id;
device_id | sensor_types
-----------+--------------
1 | {1,2,3}
2 | {4,NULL}
3 | {NULL}
(3 rows)
This may be unsatisfactory because of the NULLs in the result. When we try to remove them:
select device_id, array_agg(distinct value->'sensor_type') as sensor_types
from configurations
left join jsonb_array_elements(coalesce(nullif(data->'sensors', 'null'), '[]')) on true
where value is not null
group by device_id;
device_id | sensor_types
-----------+--------------
1 | {1,2,3}
2 | {4}
(2 rows)
device_id = 3 disappears. Well, we can get all device_ids from the table:
select distinct device_id, sensor_types
from configurations
left join (
select device_id, array_agg(distinct value->'sensor_type') as sensor_types
from configurations
left join jsonb_array_elements(coalesce(nullif(data->'sensors', 'null'), '[]')) on true
where value is not null
group by device_id
) s
using(device_id);
device_id | sensor_types
-----------+--------------
1 | {1,2,3}
2 | {4}
3 |
(3 rows)
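The transformation the final query performs can also be sketched in plain Python (hypothetical data mirroring the configurations table above), which makes the role of the null-to-empty-array coalescing easy to see:

```python
import json

# Rows as (device_id, data) pairs, mirroring the configurations table.
rows = [
    (1, '{"other_data": {}, "sensors": [{"sensor_type": 1, "other_data": {}}]}'),
    (1, '{"other_data": {}, "sensors": [{"sensor_type": 1, "other_data": {}},'
        ' {"sensor_type": 2, "other_data": {}}]}'),
    (1, '{"other_data": {}, "sensors": [{"sensor_type": 3, "other_data": {}}]}'),
    (2, '{"other_data": {}, "sensors": [{"sensor_type": 4, "other_data": {}}]}'),
    (2, '{"other_data": {}, "sensors": null}'),
    (3, '{"other_data": {}, "sensors": []}'),
]

result = {}
for device_id, data in rows:
    # "or []" plays the role of coalesce(nullif(data->'sensors', 'null'), '[]'):
    # a JSON null becomes an empty list, so every device keeps a row.
    sensors = json.loads(data).get("sensors") or []
    bucket = result.setdefault(device_id, set())
    bucket.update(s["sensor_type"] for s in sensors)

sensor_types = {d: sorted(t) for d, t in result.items()}
print(sensor_types)  # {1: [1, 2, 3], 2: [4], 3: []}
```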
Is there a way in SQL to return an array of values without the column name included? I've searched for a solution but have not found anything.
CREATE TABLE #data (
    id int
);

INSERT INTO #data VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);

SELECT id FROM #data FOR JSON PATH;

DROP TABLE #data;
The above T-SQL returns the JSON below, but I want it without the column name: just an array of values.
[
{
"id": 1
},
{
"id": 2
},
{
"id": 3
},
{
"id": 4
},
{
"id": 5
},
{
"id": 6
},
{
"id": 7
},
{
"id": 8
},
{
"id": 9
},
{
"id": 10
}
]
I would like the resulting output to be:
[
1,
2,
3,
4,
5,
6,
7,
8,
9,
10
]
I would like to know: is it possible to select data from the JSON structure below?
[
{
"A": 6,
"Status": 1
},
{
"A": 3,
"Status": 0
},
{
"A": 6,
"Status": 1
},
{
"A": 7,
"Status": 0
}
]
According to this link, there is a property before the array/object:
"EmployeeInfo": {
"FirstName":"Jignesh",
"LastName":"Trivedi",
"Code":"CCEEDD",
"Addresses": [
{ "Address":"Test 0", "City":"Gandhinagar", "State":"Gujarat"},
{ "Address":"Test 1", "City":"Gandhinagar", "State":"Gujarat"}
]
}
For example (taking the sample from the link above), the path starts with the property EmployeeInfo, which makes a query like this possible:
SELECT JSON_VALUE(@JSONData, '$.EmployeeInfo.FirstName')
I just can't figure out how this could be achieved with the structure provided above. Could anyone point me to some sample code? Thanks.
You have two options to parse this JSON array:
Using OPENJSON() with explicit schema once - to get the content of each item
Using OPENJSON() twice - to get the index and the content of each item
JSON:
DECLARE @json nvarchar(max) = N'
[
{
"A": 6,
"Status": 1
},
{
"A": 3,
"Status": 0
},
{
"A": 6,
"Status": 1
},
{
"A": 7,
"Status": 0
}
]'
Using OPENJSON() with explicit schema once:
SELECT A, Status
FROM OPENJSON(@json) WITH (
A int,
Status int
)
Result:
A Status
6 1
3 0
6 1
7 0
Using OPENJSON() twice:
SELECT
    j1.[key] AS [Index],
    j2.A, j2.Status
FROM OPENJSON(@json) j1
CROSS APPLY OPENJSON(j1.[value]) WITH (
    A int,
    Status int
) j2
Result:
Index A Status
0 6 1
1 3 0
2 6 1
3 7 0
Of course, you can always access an array item by index:
SELECT
    JSON_QUERY(@json, '$[0]') AS Item,
    JSON_VALUE(@json, '$[0].A') AS A,
    JSON_VALUE(@json, '$[0].Status') AS Status
Result:
Item A Status
{"A": 6, "Status": 1} 6 1
Something like this:
declare @json nvarchar(max) = N'
[
{
"A": 6,
"Status": 1
},
{
"A": 3,
"Status": 0
},
{
"A": 6,
"Status": 1
},
{
"A": 7,
"Status": 0
}
]'
select * from openjson(@json) with (
    A int,
    Status int
);
Output
A Status
6 1
3 0
6 1
7 0
I am trying to update a log with JSON in SQL Server 2017. I can update a single data point with json_value, which covers a few cases, but I would ultimately like to join in the incoming JSON.
Sample table:
key | col_1                        | col_2         | col_3
----+------------------------------+---------------+-----------------
1   | json.lines[0].data.meta.data | json.lines[0] | json.header.note
2   | json.lines[1].data.meta.data | json.lines[1] | json.header.note
3   | json.lines[2].data.meta.data | json.lines[2] | json.header.note
I'd like to update a single property in col_1 and update col_2 with an object serialized as a string.
Sample JSON:
declare @json nvarchar(max) = N'[{
    "header": {
        "note": "some note"
    },
    "lines": [{
        "data": {
            "id": {
                "key": 0,
                "name": "item_1"
            },
            "meta": {
                "data": "item_1_data"
            }
        }
    }, {...}, {...}]
}]'
Query:
update logTable set
    col_1 = json_value(@json, '$.lines[__index__].data.meta.data'), -- what would the syntax for __index__ be?
    col_2 = j.lines[key], -- pseudo code
    col_3 = json_value(@json, '$.header.note')
inner join openjson(@json) j
    on json_value(@json, '$.line[?].id.key') = logTable.[key] -- ? denotes indices that I'd like to iterate = join over
Expected Output:
key | col_1 | col_2 | col_3
----+---------------+----------------------------|---------
1 | 'item_1_data' | 'data: { id: { key: 0...}' | '{header: { note: ...} }'
2 | 'item_2_data' | 'data: { id: { key: 1...}' | '{header: { note: ...} }'
3 | 'item_3_data' | 'data: { id: { key: 2...}' | '{header: { note: ...} }'
I'm not sure how to handle iterating over the $.line indices, but think a join would solve this if properly implemented.
How can I join to arrays of objects to update SQL rows by primary key?
Original answer:
You may try to parse your JSON using OPENJSON with an explicit schema (note that your JSON is not valid):
Table and JSON:
CREATE TABLE #Data (
[key] int,
col_1 nvarchar(100),
col_2 nvarchar(max)
)
INSERT INTO #Data
([key], [col_1], [col_2])
VALUES
(1, N'', N''),
(2, N'', N''),
(3, N'', N'')
DECLARE @json nvarchar(max) = N'[{
"lines": [
{
"data": {
"id": {
"key": 1,
"name": "item_1"
},
"meta": {
"data": "item_1_data"
}
}
},
{
"data": {
"id": {
"key": 2,
"name": "item_2"
},
"meta": {
"data": "item_2_data"
}
}
},
{
"data": {
"id": {
"key": 3,
"name": "item_3"
},
"meta": {
"data": "item_3_data"
}
}
}
]
}]'
Statement:
UPDATE #Data
SET
col_1 = j.metadata,
col_2 = j.data
FROM #Data
INNER JOIN (
SELECT *
    FROM OPENJSON(@json, '$[0].lines') WITH (
[key] int '$.data.id.key',
metadata nvarchar(100) '$.data.meta.data',
data nvarchar(max) '$' AS JSON
)
) j ON #Data.[key] = j.[key]
Update:
The header is common to all rows, so use JSON_QUERY() to update the table:
Table and JSON:
CREATE TABLE #Data (
[key] int,
col_1 nvarchar(100),
col_2 nvarchar(max),
col_3 nvarchar(max)
)
INSERT INTO #Data
([key], col_1, col_2, col_3)
VALUES
(1, N'', N'', N''),
(2, N'', N'', N''),
(3, N'', N'', N'')
DECLARE @json nvarchar(max) = N'[{
"header": {
"note": "some note"
},
"lines": [
{
"data": {
"id": {
"key": 1,
"name": "item_1"
},
"meta": {
"data": "item_1_data"
}
}
},
{
"data": {
"id": {
"key": 2,
"name": "item_2"
},
"meta": {
"data": "item_2_data"
}
}
},
{
"data": {
"id": {
"key": 3,
"name": "item_3"
},
"meta": {
"data": "item_3_data"
}
}
}
]
}]'
Statement:
UPDATE #Data
SET
col_1 = j.metadata,
col_2 = j.data,
    col_3 = JSON_QUERY(@json, '$[0].header')
FROM #Data
INNER JOIN (
SELECT *
    FROM OPENJSON(@json, '$[0].lines') WITH (
[key] int '$.data.id.key',
metadata nvarchar(100) '$.data.meta.data',
data nvarchar(max) '$' AS JSON
)
) j ON #Data.[key] = j.[key]
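The join-and-update logic of that statement, sketched in plain Python (hypothetical in-memory rows standing in for the #Data table): iterate the lines array, join on the embedded key, and write the shared header into every matched row.

```python
import json

# Hypothetical rows keyed like the #Data table: key -> [col_1, col_2, col_3]
table = {1: ["", "", ""], 2: ["", "", ""], 3: ["", "", ""]}

doc = json.loads('''[{
  "header": {"note": "some note"},
  "lines": [
    {"data": {"id": {"key": 1, "name": "item_1"}, "meta": {"data": "item_1_data"}}},
    {"data": {"id": {"key": 2, "name": "item_2"}, "meta": {"data": "item_2_data"}}},
    {"data": {"id": {"key": 3, "name": "item_3"}, "meta": {"data": "item_3_data"}}}
  ]
}]''')[0]

header = json.dumps(doc["header"])   # shared by every row, like JSON_QUERY('$[0].header')
for line in doc["lines"]:            # like OPENJSON over '$[0].lines'
    key = line["data"]["id"]["key"]  # join condition: embedded key = table key
    if key in table:
        table[key] = [line["data"]["meta"]["data"], json.dumps(line), header]

print(table[1][0])  # item_1_data
```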
I have a Postgres table with two columns, "nodes" and "timestamp". The "nodes" column is of type jsonb and holds an array of objects of the following format:
[
{
"addr": {},
"node_number": "1",
"primary": false
},
{
"addr": {},
"node_number": "2",
"primary": true
  }
]
I want to find the object in this array that has "primary":true in the most recent row. If the above was the latest row, the result should be:
{
"addr": { },
"node_number": "2",
"primary": true
}
I have tried:
SELECT (nodes -> 0) FROM my_table WHERE nodes @> '[{"primary": true}]'
order by timestamp desc
limit 1;
which gives the object at index 0 in the array not the desired object that has "primary": true.
How can I implement the query ?
Use jsonb_array_elements() in a lateral join:
select elem
from my_table
cross join jsonb_array_elements(nodes) as elem
where (elem->>'primary')::boolean
order by timestamp desc
limit 1;
elem
---------------------------------------------------
{"addr": {}, "primary": true, "node_number": "2"}
(1 row)
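The unnest-then-filter approach, sketched in Python over hypothetical (timestamp, nodes) rows: take the most recent row, then pick the element whose "primary" flag is true.

```python
import json

# Hypothetical (timestamp, nodes) rows; only the latest row matters.
rows = [
    (1, '[{"addr": {}, "node_number": "1", "primary": false},'
        ' {"addr": {}, "node_number": "2", "primary": true}]'),
]

latest = max(rows, key=lambda r: r[0])                        # order by timestamp desc limit 1
primary = next(n for n in json.loads(latest[1]) if n["primary"])  # filter on the flag
print(primary["node_number"])  # 2
```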
I need to output JSON from the query.
Input data:
Documents:
==========
id | name | team
------------------
1 | doc1 | {"authors": [1, 2, 3], "editors": [3, 4, 5]}
Persons:
========
id | name |
--------------
1 | Person1 |
2 | Person2 |
3 | Person3 |
4 | Person4 |
5 | Person5 |
Query:
select d.id, d.name,
(select jsonb_build_object(composed)
from
(
select teamGrp.key,
(
select json_build_array(persAgg) from
(
select
(
select jsonb_agg(pers) from
(
select person.id, person.name
from
persons
where (persList.value)::int=person.id
) pers
)
from
json_array_elements_text(teamGrp.value::json) persList
) persAgg
)
from
jsonb_each_text(d.team) teamGrp
) teamed
) as teams
from
documents d;
and I expect the following output:
{"id": 1, "name": "doc1", "teams":
  {"authors": [{"id": 1, "name": "Person1"}, {"id": 2, "name": "Person2"}, {"id": 3, "name": "Person3"}],
   "editors": [{"id": 3, "name": "Person3"}, {"id": 4, "name": "Person4"}, {"id": 5, "name": "Person5"}]}}
But I received an error:
ERROR: more than one row returned by a subquery used as an expression
Where is the problem and how to fix it?
PostgreSQL 9.5
I think the following (super complicated) query should do it:
SELECT
json_build_object(
'id',id,
'name',name,
'teams',(
SELECT json_object_agg(team_name,
(SELECT
json_agg(json_build_object('id',value,'name',Persons.name))
FROM json_array_elements(team_members)
INNER JOIN Persons ON (value#>>'{}')::integer=Persons.id
)
)
FROM json_each(team) t(team_name,team_members)
)
)
FROM Documents;
I am using subqueries where I run json aggregates.
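The nesting that json_object_agg and json_agg build here can be sketched in Python (hypothetical dicts standing in for the Documents and Persons tables): one object per team name, each holding an array of resolved person objects.

```python
import json

# Hypothetical lookup standing in for the Persons table.
persons = {1: "Person1", 2: "Person2", 3: "Person3", 4: "Person4", 5: "Person5"}

# One Documents row; "team" is the jsonb column.
doc = {"id": 1, "name": "doc1",
       "team": {"authors": [1, 2, 3], "editors": [3, 4, 5]}}

# json_object_agg over the team names, json_agg over each member list,
# joining each member id against Persons.
teams = {
    team_name: [{"id": pid, "name": persons[pid]} for pid in members]
    for team_name, members in doc["team"].items()
}
out = {"id": doc["id"], "name": doc["name"], "teams": teams}
print(json.dumps(out))
```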