Create Postgres JSONB Index on Array Sub-Object - arrays

I have table myTable with a JSONB column myJsonb with a data structure that I want to index like:
{
"myArray": [
{
"subItem": {
"email": "bar#bar.com"
}
},
{
"subItem": {
"email": "foo#foo.com"
}
}
]
}
I want to run indexed queries on email like:
SELECT *
FROM mytable
WHERE 'foo#foo.com' IN (
SELECT lower(
jsonb_array_elements(myjsonb -> 'myArray')
-> 'subItem'
->> 'email'
)
);
How do I create a Postgres JSONB index for that?

If you don't need the lower() in there, the query can be simple and efficient:
SELECT *
FROM mytable
WHERE myjsonb -> 'myArray' #> '[{"subItem": {"email": "foo#foo.com"}}]'
Supported by a jsonb_path_ops index:
CREATE INDEX mytable_myjsonb_gin_idx ON mytable
USING gin ((myjsonb -> 'myArray') jsonb_path_ops);
But the match is case-sensitive.
Case-insensitive!
If you need the search to match disregarding case, things get more complex.
You could use this query, similar to your original:
SELECT *
FROM t
WHERE EXISTS (
SELECT 1
FROM jsonb_array_elements(myjsonb -> 'myArray') arr
WHERE lower(arr #>>'{subItem, email}') = 'foo#foo.com'
);
But I can't think of a good way to use an index for this.
Instead, I would use an expression index based on a function extracting an array of lower-case emails:
Function:
CREATE OR REPLACE FUNCTION f_jsonb_arr_lower(_j jsonb, VARIADIC _path text[])
RETURNS jsonb LANGUAGE sql IMMUTABLE AS
'SELECT jsonb_agg(lower(elem #>> _path)) FROM jsonb_array_elements(_j) elem';
Index:
CREATE INDEX mytable_email_arr_idx ON mytable
USING gin (f_jsonb_arr_lower(myjsonb -> 'myArray', 'subItem', 'email') jsonb_path_ops);
Query:
SELECT *
FROM mytable
WHERE f_jsonb_arr_lower(myjsonb -> 'myArray', 'subItem', 'email') #> '"foo#foo.com"';
While this works with an untyped string literal or with actual jsonb values, it stops working if you pass text or varchar (like in a prepared statement). Postgres does not know how to cast because the input is ambiguous. You need an explicit cast in this case:
... #> '"foo#foo.com"'::text::jsonb;
Or pass a simple string without enclosing double quotes and do the conversion to jsonb in Postgres:
... #> to_jsonb('foo#foo.com'::text);
Related, with more explanation:
Query for array elements inside JSON type
Index for finding an element in a JSON array

Related

BigQuery: extract keys from json object, convert json from object to key-value array

I have a table with a column which contains a json-object, the value type is always a string.
I need 2 kind of information:
a list of the json keys
convert the json in an array of key-value pairs
This is what I got so far, which is working:
CREATE TEMP FUNCTION jsonObjectKeys(input STRING)
RETURNS Array<String>
LANGUAGE js AS """
return Object.keys(JSON.parse(input));
""";
CREATE TEMP FUNCTION jsonToKeyValueArray(input STRING)
RETURNS Array<Struct<key String, value String>>
LANGUAGE js AS """
let json = JSON.parse(input);
return Object.keys(json).map(e => {
return { "key" : e, "value" : json[e] }
});
""";
WITH input AS (
SELECT "{\"key1\": \"value1\", \"key2\": \"value2\"}" AS json_column
UNION ALL
SELECT "{\"key1\": \"value1\", \"key3\": \"value3\"}" AS json_column
UNION ALL
SELECT "{\"key5\": \"value5\"}" AS json_column
)
SELECT
json_column,
jsonObjectKeys(json_column) AS keys,
jsonToKeyValueArray(json_column) AS key_value
FROM input
The problem is that FUNCTION is not the best in term of compute optimization, so I'm trying to understand if there is a way to use plain SQL to achieve these 2 needs (or 1 of them at least) using only SQL w/o functions.
Below is for BigQuery Standard SQL
#standardsql
select
json_column,
array(select trim(split(kv, ':')[offset(0)]) from t.kv kv) as keys,
array(
select as struct
trim(split(kv, ':')[offset(0)]) as key,
trim(split(kv, ':')[offset(1)]) as value
from t.kv kv
) as key_value
from input,
unnest([struct(split(translate(json_column, '{}"', '')) as kv)]) t
If to apply to sample data from your question - output is

SQL Server: How to remove a key from a Json object

I have a query like (simplified):
SELECT
JSON_QUERY(r.SerializedData, '$.Values') AS [Values]
FROM
<TABLE> r
WHERE ...
The result is like this:
{ "2019":120, "20191":120, "201902":121, "201903":134, "201904":513 }
How can I remove the entries with a key length less then 6.
Result:
{ "201902":121, "201903":134, "201904":513 }
One possible solution is to parse the JSON and generate it again using string manipulations for keys with desired length:
Table:
CREATE TABLE Data (SerializedData nvarchar(max))
INSERT INTO Data (SerializedData)
VALUES (N'{"Values": { "2019":120, "20191":120, "201902":121, "201903":134, "201904":513 }}')
Statement (for SQL Server 2017+):
UPDATE Data
SET SerializedData = JSON_MODIFY(
SerializedData,
'$.Values',
JSON_QUERY(
(
SELECT CONCAT('{', STRING_AGG(CONCAT('"', [key] ,'":', [value]), ','), '}')
FROM OPENJSON(SerializedData, '$.Values') j
WHERE LEN([key]) >= 6
)
)
)
SELECT JSON_QUERY(d.SerializedData, '$.Values') AS [Values]
FROM Data d
Result:
Values
{"201902":121,"201903":134,"201904":513}
Notes:
It's important to note, that JSON_MODIFY() in lax mode deletes the specified key if the new value is NULL and the path points to a JSON object. But, in this specific case (JSON object with variable key names), I prefer the above solution.

How to update JSON array with PostgreSQL

I have the following inconvenience, I want to update a key of an JSON array using only PostgreSQL. I have the following json:
[
{
"ch":"1",
"id":"12",
"area":"0",
"level":"Superficial",
"width":"",
"length":"",
"othern":"5",
"percent":"100",
"location":" 2nd finger base"
},
{
"ch":"1",
"id":"13",
"area":"0",
"level":"Skin",
"width":"",
"length":"",
"othern":"1",
"percent":"100",
"location":" Abdomen "
}
]
I need to update the "othern" to another number if the "othern" = X
(X is any number that I pass to the query. Example, update othern if othern = 5).
This JSON can be much bigger, so I need something that can iterate in the JSON array and find all the "othern" that match X number and replace with the new one. Thank you!
I have tried with these functions json of Postgresql, but I do not give with the correct result:
SELECT * FROM jsonb_to_recordset('[{"ch":"1", "id":"12", "area":"0", "level":"Superficial", "width":"", "length":"", "othern":"5", "percent":"100", "location":" 2nd finger base"}, {"ch":"1", "id":"13", "area":"0", "level":"Skin", "width":"", "length":"", "othern":"1", "percent":"100", "location":" Abdomen "}]'::jsonb)
AS t (othern text);
I found this function in SQL that is similar to what I need but honestly SQL is not my strength:
CREATE OR REPLACE FUNCTION "json_array_update_index"(
"json" json,
"index_to_update" INTEGER,
"value_to_update" anyelement
)
RETURNS json
LANGUAGE sql
IMMUTABLE
STRICT
AS $function$
SELECT concat('[', string_agg("element"::text, ','), ']')::json
FROM (SELECT CASE row_number() OVER () - 1
WHEN "index_to_update" THEN to_json("value_to_update")
ELSE "element"
END "element"
FROM json_array_elements("json") AS "element") AS "elements"
$function$;
UPDATE plan_base
SET atts = json_array_update_index([{"ch":"1", "id":"12", "area":"0", "level":"Superficial", "width":"", "length":"", "othern":"5", "percent":"100", "location":" 2nd finger base"}, {"ch":"1", "id":"13", "area":"0", "level":"Skin", "width":"", "length":"", "othern":"1", "percent":"100", "location":" Abdomen "}], '{"othern"}', '{"othern":"71"}'::json)
WHERE id = 2;
The function you provided changes a JSON input, gives out the changed JSON and updates a table parallel.
For a simple update, you don't need a function:
demo:db<>fiddle
UPDATE mytable
SET myjson = s.json_array
FROM (
SELECT
jsonb_agg(
CASE WHEN elems ->> 'othern' = '5' THEN
jsonb_set(elems, '{othern}', '"7"')
ELSE elems END
) as json_array
FROM
mytable,
jsonb_array_elements(myjson) elems
) s
jsonb_array_elements() expands the array into one row per element
jsonb_set() changes the value of each othern field. The relevant JSON objects can be found with a CASE clause
jsonb_agg() reaggregates the elements into an array again.
This array can be used to update your column.
If you really need a function which gets the parameters and returns the changed JSON, then this could be a solution. Of course, this doesn't execute an update. I am not quite sure if you want to achieve this:
demo:db<>fiddle
CREATE OR REPLACE FUNCTION json_array_update_index(_myjson jsonb, _val_to_change int, _dest_val int)
RETURNS jsonb
AS $$
DECLARE
_json_output jsonb;
BEGIN
SELECT
jsonb_agg(
CASE WHEN elems ->> 'othern' = _val_to_change::text THEN
jsonb_set(elems, '{othern}', _dest_val::text::jsonb)
ELSE elems END
) as json_array
FROM
jsonb_array_elements(_myjson) elems
INTO _json_output;
RETURN _json_output;
END;
$$ LANGUAGE 'plpgsql';
If you want to combine both as you did in your question, of course, you can do this:
demo:db<>fiddle
CREATE OR REPLACE FUNCTION json_array_update_index(_myjson jsonb, _val_to_change int, _dest_val int)
RETURNS jsonb
AS $$
DECLARE
_json_output jsonb;
BEGIN
UPDATE mytable
SET myjson = s.json_array
FROM (
SELECT
jsonb_agg(
CASE WHEN elems ->> 'othern' = '5' THEN
jsonb_set(elems, '{othern}', '"7"')
ELSE elems END
) as json_array
FROM
mytable,
jsonb_array_elements(myjson) elems
) s
RETURNING myjson INTO _json_output;
RETURN _json_output;
END;
$$ LANGUAGE 'plpgsql';

MSSQL JSON_VALUE to match ANY Object in Array

I have a table with a JSON text field:
create table breaches(breach_id int, detail text);
insert into breaches values
( 1,'[{"breachedState": null},
{"breachedState": "PROCESS_APPLICATION",}]')
I'm trying to use MSSQL's in build JSON parsing functions to test whether ANY object in a JSON array has a matching member value.
If the detail field was a single JSON object, I could use:
select * from breaches
where JSON_VALUE(detail,'$.breachedState') = 'PROCESS_APPLICATION'
but it's an Array, and I want to know if ANY Object has breachedState = 'PROCESS_APPLICATION'
Is this possible using MSSQL's JSON functions?
You can use function OPENJSON to check each object, try this query:
select * from breaches
where exists
(
select *
from
OPENJSON (detail) d
where JSON_VALUE(value,'$.breachedState') = 'PROCESS_APPLICATION'
)
Btw, there is an extra "," in your insert query, it should be:
insert into breaches values
( 1,'[{"breachedState": null},
{"breachedState": "PROCESS_APPLICATION"}]')

How to delete array element in JSONB column based on nested key value?

How can I remove an object from an array, based on the value of one of the object's keys?
The array is nested within a parent object.
Here's a sample structure:
{
"foo1": [ { "bar1": 123, "bar2": 456 }, { "bar1": 789, "bar2": 42 } ],
"foo2": [ "some other stuff" ]
}
Can I remove an array element based on the value of bar1?
I can query based on the bar1 value using: columnname #> '{ "foo1": [ { "bar1": 123 } ]}', but I've had no luck finding a way to remove { "bar1": 123, "bar2": 456 } from foo1 while keeping everything else intact.
Thanks
Running PostgreSQL 9.6
Assuming that you want to search for a specific object with an inner object of a certain value, and that this specific object can appear anywhere in the array, you need to unpack the document and each of the arrays, test the inner sub-documents for containment and delete as appropriate, then re-assemble the array and the JSON document (untested):
SELECT id, jsonb_build_object(key, jarray)
FROM (
SELECT foo.id, foo.key, jsonb_build_array(bar.value) AS jarray
FROM ( SELECT id, key, value
FROM my_table, jsonb_each(jdoc) ) foo,
jsonb_array_elements(foo.value) AS bar (value)
WHERE NOT bar.value #> '{"bar1": 123}'::jsonb
GROUP BY 1, 2 ) x
GROUP BY 1;
Now, this may seem a little dense, so picked apart you get:
SELECT id, key, value
FROM my_table, jsonb_each(jdoc)
This uses a lateral join on your table to take the JSON document jdoc and turn it into a set of rows foo(id, key, value) where the value contains the array. The id is the primary key of your table.
Then we get:
SELECT foo.id, foo.key, jsonb_build_array(bar.value) AS jarray
FROM foo, -- abbreviated from above
jsonb_array_elements(foo.value) AS bar (value)
WHERE NOT bar.value #> '{"bar1": 123}'::jsonb
GROUP BY 1, 2
This uses another lateral join to unpack the arrays into bar(value) rows. These objects can now be searched with the containment operator to remove the objects from the result set: WHERE NOT bar.value #> '{"bar1": 123}'::jsonb. In the select list the arrays are re-assembled by id and key but now without the offending sub-documents.
Finally, in the main query the JSON documents are re-assembled:
SELECT id, jsonb_build_object(key, jarray)
FROM x -- from above
GROUP BY 1;
The important thing to understand is that PostgreSQL JSON functions only operate on the level of the JSON document that you can explicitly indicate. Usually that is the top level of the document, unless you have an explicit path to some level in the document (like {foo1, 0, bar1}, but you don't have that). At that level of operation you can then unpack to do your processing such as removing objects.

Resources