Count nested JSON array elements over all result rows - arrays

I have a SQL query that I am running in order to get results, where one of the column contains a JSON array.
I want to count the total of JSON elements in total from all returned rows.
I.e. if 2 rows were returned, where one row had 3 JSON array items in metadata column, and the second row had 4 JSON array items in metadata column, I'd like to see 7 as a returned count.
Is this possible?
This is my current SQL query:
WITH _result AS (
SELECT lo.*
FROM laser.laser_checks la
JOIN laser.laser_brands lo ON la.id = lo.brand_id
WHERE lo.type not in (1)
AND la.source in (1,4,5)
AND la.prod_id in (1, 17, 19, 22, 27, 29)
)
SELECT ovr.json -> 'id' AS object_uuid,
ovr.json -> 'username' AS username,
image.KEY AS image_uuid,
image.value AS metadata,
user_id as user_uuid
FROM _result ovr,
jsonb_array_elements(ovr."json" -> 'images') elem,
jsonb_each(elem) image

Unpack the arrays and count the elements:
WITH q AS (/* your query */)
SELECT object_uuid,
username,
image_uuid,
metadata,
user_uuid,
sum(elemcount) OVER () AS total_array_elements
FROM (SELECT q.object_uuid,
q.username,
q.image_uuid,
q.metadata,
q.user_uuid,
count(a.e) AS elemcount
FROM q
LEFT JOIN LATERAL jsonb_array_elements(q.metadata) AS a(e)
ON TRUE
GROUP BY q.object_uuid,
q.username,
q.image_uuid,
q.metadata,
q.user_uuid
) AS p;

An elephant managed to slip everybody's attention in this room: jsonb_array_length().
(Or json_array_length() for json.)
The manual:
Returns the number of elements in the top-level JSON array.
After you have already unnested the JSON array to your level of interest, you can apply the function to the (now) top level. Wrap it in a window function to get total counts for every result row.
Your query should work like this:
SELECT lo.json -> 'id' AS object_uuid
, lo.json -> 'username' AS username
, image.key AS image_uuid
, image.value AS metadata
, lo.user_id AS user_uuid
, sum(jsonb_array_length(image.value)) OVER () AS total_array_elements -- !!!
FROM laser.laser_checks la
JOIN laser.laser_brands lo ON la.id = lo.brand_id
, jsonb_array_elements(lo."json" -> 'images') elem
, jsonb_each(elem) image
WHERE lo.type NOT IN (1)
AND la.source IN (1,4,5)
AND la.prod_id IN (1, 17, 19, 22, 27, 29);
No need for a LATERAL subquery, aggregation, nor even for a CTE, really.
Related:
Sort by length of nested JSON array

Related

union request and pagination in cakephp4

I made two requests. The first one gives me 2419 results and I store the result in $requestFirst. The second, 1 result and I store the result in $requestTwo.
I make a union :
$requestTot = $requestFirst->union($requestTwo);
The total of the $requestTot is 2420 results so all is well so far.
Then :
$request = $this->paginate($requestTot);
$this->set(compact('request'));
And here I don't understand, on each page of the pagination I find the result of $requestTwo. Moreover the pagination displays me :
Page 121 of 121, showing 20 record(s) out of 2,420 total
This is the right number of results except that when I multiply the number of results per page by the number of pages I get 2540. This is the total number of results plus one per page.
Can anyone explain?
Check the generated SQL in Debug Kit's SQL panel, you should see that the LIMIT AND OFFSET clauses are being set on the first query, not appended as global clauses so that they would affect the unionized query.
It will look something like this:
(SELECT id, title FROM a LIMIT 20 OFFSET 0)
UNION
(SELECT id, title FROM b)
So what happens then is that pagination will only be applied to the $requestFirst query, and the $requestTwo query will be unionized on top of it each and every time, hence you'll see its result on every single page.
A workaround for this current limitation would be to use the union query as a subquery or a common table expression from which to fetch the results. In order for this to work you need to make sure that the fields of your queries for the union are being selected without aliasing! This can be achieved by either using Table::subquery():
$requestFirst = $this->TableA
->subquery()
->select(['a', 'b'])
// ...
$requestTwo = $this->TableB
->subquery()
->select(['c', 'd'])
// ...
or by explicitly selecting the fields with aliases equal to the column names:
$requestFirst = $this->TableA
->find()
->select(['a' => 'a', 'b' => 'b'])
// ...
$requestTwo = $this->TableB
->find()
->select(['c' => 'c', 'd' => 'd'])
// ...
Then you can safely use those queries for a union as a subquery:
$union = $requestFirst->union($requestTwo);
$wrapper = $this->TableA
->find()
->from([$this->TableA->getAlias() => $union]);
$request = $this->paginate($wrapper);
or as a common table expression (in case your DBMS supports them):
$union = $requestFirst->union($requestTwo);
$wrapper = $this->TableA
->find()
->with(function (\Cake\Database\Expression\CommonTableExpression $cte) use ($union) {
return $cte
->name('union_source')
->field(['a', 'b'])
->query($union)
})
->select(['a', 'b'])
->from([$this->TableA->getAlias() => 'union_source']);
$request = $this->paginate($wrapper);

Count the size of an array in another column?

I'm trying to update "Count_Column" in "My_Table" to get the count of the array size in Array_Column
UPDATE TABLE [dbo1].[My_Table]
SET [Count_Column] as COUNT(Array_Column.test);
Array_Column in a column that has a JSON array in it
It's unclear why you want to store this information a second time, ideally you would just query it when needed.
Be that as it may, you need to break out the array using OPENJSON
UPDATE dbo1.My_Table
SET Count_Column = (
SELECT COUNT(*)
FROM OPENJSON(My_Table.Array_Column, '$.test')
);
This assumes that the array is located in a property called test, for example:
{
"test" : [
1,
2
]
}
would return 2

Return Parts of an Array in Postgres

I have a column (text) in my Postgres DB (v.10) with a JSON format.
As far as i now it's has an array format.
Here is an fiddle example: Fiddle
If table1 = persons and change_type = create then i only want to return the name and firstname concatenated as one field and clear the rest of the text.
Output should be like this:
id table1 did execution_date change_type attr context_data
1 Persons 1 2021-01-01 Create Name [["+","name","Leon Bill"]]
1 Persons 2 2021-01-01 Update Firt_name [["+","cur_nr","12345"],["+","art_cd","1"],["+","name","Leon"],["+","versand_art",null],["+","email",null],["+","firstname","Bill"],["+","code_cd",null]]
1 Users 3 2021-01-01 Create Street [["+","cur_nr","12345"],["+","art_cd","1"],["+","name","Leon"],["+","versand_art",null],["+","email",null],["+","firstname","Bill"],["+","code_cd",null]]
Disassemble json array into SETOF using json_array_elements function, then assemble it back into structure you want.
select m.*
, case
when m.table1 = 'Persons' and m.change_type = 'Create'
then (
select '[["+","name",' || to_json(string_agg(a.value->>2,' ' order by a.value->>1 desc))::text || ']]'
from json_array_elements(m.context_data::json) a
where a.value->>1 in ('name','firstname')
)
else m.context_data
end as context_data
from mutations m
modified fiddle
(Note:
utilization of alphabetical ordering of names of required fields is little bit dirty, explicit order by case could improve readability
resulting json is assembled from string literals as much as possible since you didn't specified if "+" should be taken from any of original array elements
the to_json()::text is just for safety against injection
)

SQL Server: How to remove a key from a Json object

I have a query like (simplified):
SELECT
JSON_QUERY(r.SerializedData, '$.Values') AS [Values]
FROM
<TABLE> r
WHERE ...
The result is like this:
{ "2019":120, "20191":120, "201902":121, "201903":134, "201904":513 }
How can I remove the entries with a key length less then 6.
Result:
{ "201902":121, "201903":134, "201904":513 }
One possible solution is to parse the JSON and generate it again using string manipulations for keys with desired length:
Table:
CREATE TABLE Data (SerializedData nvarchar(max))
INSERT INTO Data (SerializedData)
VALUES (N'{"Values": { "2019":120, "20191":120, "201902":121, "201903":134, "201904":513 }}')
Statement (for SQL Server 2017+):
UPDATE Data
SET SerializedData = JSON_MODIFY(
SerializedData,
'$.Values',
JSON_QUERY(
(
SELECT CONCAT('{', STRING_AGG(CONCAT('"', [key] ,'":', [value]), ','), '}')
FROM OPENJSON(SerializedData, '$.Values') j
WHERE LEN([key]) >= 6
)
)
)
SELECT JSON_QUERY(d.SerializedData, '$.Values') AS [Values]
FROM Data d
Result:
Values
{"201902":121,"201903":134,"201904":513}
Notes:
It's important to note, that JSON_MODIFY() in lax mode deletes the specified key if the new value is NULL and the path points to a JSON object. But, in this specific case (JSON object with variable key names), I prefer the above solution.

How to delete array element in JSONB column based on nested key value?

How can I remove an object from an array, based on the value of one of the object's keys?
The array is nested within a parent object.
Here's a sample structure:
{
"foo1": [ { "bar1": 123, "bar2": 456 }, { "bar1": 789, "bar2": 42 } ],
"foo2": [ "some other stuff" ]
}
Can I remove an array element based on the value of bar1?
I can query based on the bar1 value using: columnname #> '{ "foo1": [ { "bar1": 123 } ]}', but I've had no luck finding a way to remove { "bar1": 123, "bar2": 456 } from foo1 while keeping everything else intact.
Thanks
Running PostgreSQL 9.6
Assuming that you want to search for a specific object with an inner object of a certain value, and that this specific object can appear anywhere in the array, you need to unpack the document and each of the arrays, test the inner sub-documents for containment and delete as appropriate, then re-assemble the array and the JSON document (untested):
SELECT id, jsonb_build_object(key, jarray)
FROM (
SELECT foo.id, foo.key, jsonb_build_array(bar.value) AS jarray
FROM ( SELECT id, key, value
FROM my_table, jsonb_each(jdoc) ) foo,
jsonb_array_elements(foo.value) AS bar (value)
WHERE NOT bar.value #> '{"bar1": 123}'::jsonb
GROUP BY 1, 2 ) x
GROUP BY 1;
Now, this may seem a little dense, so picked apart you get:
SELECT id, key, value
FROM my_table, jsonb_each(jdoc)
This uses a lateral join on your table to take the JSON document jdoc and turn it into a set of rows foo(id, key, value) where the value contains the array. The id is the primary key of your table.
Then we get:
SELECT foo.id, foo.key, jsonb_build_array(bar.value) AS jarray
FROM foo, -- abbreviated from above
jsonb_array_elements(foo.value) AS bar (value)
WHERE NOT bar.value #> '{"bar1": 123}'::jsonb
GROUP BY 1, 2
This uses another lateral join to unpack the arrays into bar(value) rows. These objects can now be searched with the containment operator to remove the objects from the result set: WHERE NOT bar.value #> '{"bar1": 123}'::jsonb. In the select list the arrays are re-assembled by id and key but now without the offending sub-documents.
Finally, in the main query the JSON documents are re-assembled:
SELECT id, jsonb_build_object(key, jarray)
FROM x -- from above
GROUP BY 1;
The important thing to understand is that PostgreSQL JSON functions only operate on the level of the JSON document that you can explicitly indicate. Usually that is the top level of the document, unless you have an explicit path to some level in the document (like {foo1, 0, bar1}, but you don't have that). At that level of operation you can then unpack to do your processing such as removing objects.

Resources