PostgreSQL aggregate over json arrays

I have seen many references to using json_array_elements to extract the elements of a JSON array. However, it appears to work only when applied to exactly one array. If I use it in a generic query, I get the error
ERROR: cannot call json_array_elements on a scalar
Given something like this:
orders
{ "order_id":"2", "items": [{"name": "apple","price": 1.10}]}
{ "order_id": "3","items": [{"name": "apple","price": 1.10},{"name": "banana","price": 0.99}]}
I would like to extract
item   | count
-------+-------
apple  |     2
banana |     1
Or
item   | total_value_sold
-------+------------------
apple  |             2.20
banana |             0.99
Is it possible to aggregate over json arrays like this using json_array_elements?

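Call jsonb_array_elements() on orders->'items' rather than on the whole column to flatten the data. The error in the question means the function was given a scalar value rather than an array. (If the column is of type json instead of jsonb, use json_array_elements().)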
select elem->>'name' as name, (elem->>'price')::numeric as price
from my_table
cross join jsonb_array_elements(orders->'items') as elem;
It is easy to get the aggregates you want from the flattened data:
select name, count(*), sum(price) as total_value_sold
from (
select elem->>'name' as name, (elem->>'price')::numeric as price
from my_table
cross join jsonb_array_elements(orders->'items') as elem
) s
group by name;
Db<>fiddle.
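For readers who want to see what the flatten-then-aggregate step does outside the database, here is a minimal Python sketch of the same logic. It is an illustration only, not part of the answer: the function name aggregate_items and the in-memory list orders_rows (built from the two sample rows in the question) are assumptions.

```python
import json
from collections import defaultdict
from decimal import Decimal

# The two sample rows from the question, as JSON text.
orders_rows = [
    '{"order_id": "2", "items": [{"name": "apple", "price": 1.10}]}',
    '{"order_id": "3", "items": [{"name": "apple", "price": 1.10},'
    ' {"name": "banana", "price": 0.99}]}',
]


def aggregate_items(rows):
    """Return {name: (count, total_value_sold)} over all items in all rows."""
    counts = defaultdict(int)
    totals = defaultdict(Decimal)
    for raw in rows:
        # parse_float=Decimal keeps prices exact, similar to the ::numeric cast.
        order = json.loads(raw, parse_float=Decimal)
        # Flatten: one iteration per array element, like the cross join.
        for item in order["items"]:
            counts[item["name"]] += 1
            totals[item["name"]] += item["price"]
    return {name: (counts[name], totals[name]) for name in counts}


print(aggregate_items(orders_rows))
```

The inner loop plays the role of jsonb_array_elements(): every array element becomes its own record before anything is grouped.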

Related

Flatten and aggregate two columns of arrays via distinct in Snowflake

Table structure is
+-------------+---------+
| Animals     | Herbs   |
+-------------+---------+
| [Cat, Dog]  | [Basil] |
| [Dog, Lion] | []      |
+-------------+---------+
Desired output (don't care about sorting of this list):
unique_things
+------------+
[Cat, Dog, Lion, Basil]
First attempt was something like
SELECT ARRAY_CAT(ARRAY_AGG(DISTINCT(animals)), ARRAY_AGG(herbs))
But this produces
[[Cat, Dog], [Dog, Lion], [Basil], []]
because DISTINCT operates on each array as a whole, rather than on the distinct elements across all arrays.
If I understand your requirements correctly, and assuming the source table is populated like this:
insert into tabarray select array_construct('cat', 'dog'), array_construct('basil');
insert into tabarray select array_construct('lion', 'dog'), null;
the query would look like this:
select array_agg(distinct value) from
(
select
value from tabarray
, lateral flatten( input => col1 )
union all
select
value from tabarray
, lateral flatten( input => col2 ))
;
UPDATE
It is possible without using FLATTEN, by using ARRAY_UNION_AGG:
Returns an ARRAY that contains the union of the distinct values from the input ARRAYs in a column.
For sample data:
CREATE OR REPLACE TABLE t AS
SELECT ['Cat', 'Dog'] AS Animals, ['Basil'] AS Herbs
UNION SELECT ['Dog', 'Lion'], [];
Query:
SELECT ARRAY_UNION_AGG(ARRAY_CAT(Animals, Herbs)) AS Result
FROM t
or:
SELECT ARRAY_UNION_AGG(Animals) AS Result
FROM (SELECT Animals FROM t
UNION ALL
SELECT Herbs FROM t);
You could flatten the combined array and then aggregate back:
SELECT ARRAY_AGG(DISTINCT F."VALUE") AS unique_things
FROM tab, TABLE(FLATTEN(ARRAY_CAT(tab.Animals, tab.Herbs))) f
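Here is another variation that handles NULL arrays in case they appear in the data set. ARRAY_CAT returns NULL when either argument is NULL, so each column is replaced with an empty array first:
SELECT ARRAY_AGG(DISTINCT a.VALUE) AS unique_things
FROM tab, TABLE(FLATTEN(ARRAY_CAT(COALESCE(tab.Animals, ARRAY_CONSTRUCT()), COALESCE(tab.Herbs, ARRAY_CONSTRUCT())))) a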
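The idea shared by the flatten-based answers above (combine both arrays per row, flatten, then keep distinct values) can be sketched in plain Python. This is only an illustration of the logic, not Snowflake code; the function name unique_things and the tuple layout of rows are assumptions.

```python
def unique_things(rows):
    """Collect the distinct elements found in any array of any row."""
    seen = []
    for animals, herbs in rows:
        # Treat a missing (None) array as empty, then flatten both columns.
        for value in (animals or []) + (herbs or []):
            if value not in seen:
                seen.append(value)
    return seen


# Sample data from the question: (Animals, Herbs) per row.
rows = [(["Cat", "Dog"], ["Basil"]), (["Dog", "Lion"], [])]
print(unique_things(rows))
```

Deduplication happens on individual elements after flattening, which is exactly what ARRAY_AGG(DISTINCT animals) on the unflattened column cannot do.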

BigQuery - Values referenced in UNNEST must be arrays. UNNEST contains expression of type STRUCT ... at [5:18]

Hello, this time I came across a new error while trying to group by a field of what I thought was an array. Below is the query I am running against the table, so you can tell me the solution. I tried ARRAY_TO_STRING, but it did not work in this case.
SELECT
individual_details.gender AS gender,
COUNT(DISTINCT profile.owner_id ) AS profile_count_distinct
FROM dataset.profile AS profile
LEFT JOIN UNNEST(profile.individual_details) as individual_details
GROUP BY 1
ORDER BY 2 DESC
Values referenced in UNNEST must be arrays. UNNEST contains expression
of type STRUCT at [5:18]
individual_details is not an ARRAY but a STRUCT, so you do not need to UNNEST it.
Try the query below:
SELECT
individual_details.gender AS gender,
COUNT(DISTINCT profile.owner_id ) AS profile_count_distinct
FROM dataset.profile AS profile
GROUP BY 1
ORDER BY 2 DESC
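To make the STRUCT versus ARRAY distinction concrete, here is a small Python analogy: a STRUCT behaves like a single record (a dict), so its field is read directly and there is nothing to unnest. The sample rows and the function name count_distinct_owners_by_gender are hypothetical and only stand in for the real table.

```python
from collections import defaultdict

# Hypothetical rows: individual_details is a single record (dict), not a list.
profiles = [
    {"owner_id": 1, "individual_details": {"gender": "F"}},
    {"owner_id": 2, "individual_details": {"gender": "M"}},
    {"owner_id": 2, "individual_details": {"gender": "M"}},
    {"owner_id": 3, "individual_details": {"gender": "F"}},
]


def count_distinct_owners_by_gender(rows):
    """Mimic COUNT(DISTINCT owner_id) ... GROUP BY gender ORDER BY 2 DESC."""
    owners = defaultdict(set)
    for row in rows:
        # Direct field access on the record; no flattening step is needed.
        owners[row["individual_details"]["gender"]].add(row["owner_id"])
    counts = {gender: len(ids) for gender, ids in owners.items()}
    return sorted(counts.items(), key=lambda pair: pair[1], reverse=True)


print(count_distinct_owners_by_gender(profiles))
```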

How to extract elements of a JSONB array?

Running PostgreSQL v10.5.
My table table_a has a column metadata of type jsonb.
It has a JSON array as one of its keys, array_key, with a value something like this:
[{"key1":"value11", "key2":"value21", "key3":"value31"},
{"key1":"value21", "key2":"value22", "key3":"value23"}]
This is how I can query this key
SELECT metadata->>'array_key' from table_a
This gives me the entire array. Is there any way I can query only selected keys and perhaps format them?
The type of the array is text, i.e. pg_typeof(metadata->>'array_key') is text.
An ideal output would be
"value11, value13", "value21, value23"
Use jsonb_array_elements() to get elements of the array (as value) which can be filtered by keys:
select value->>'key1' as key1, value->>'key3' as key3
from table_a
cross join jsonb_array_elements(metadata->'array_key');
key1 | key3
---------+---------
value11 | value31
value21 | value23
(2 rows)
Use an aggregate to get the output as a single value for each row, e.g.:
select string_agg(concat_ws(', ', value->>'key1', value->>'key3'), '; ')
from table_a
cross join jsonb_array_elements(metadata->'array_key')
group by id;
string_agg
------------------------------------
value11, value31; value21, value23
(1 row)
Working example in rextester.
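As a rough cross-check of what the string_agg query computes for one row, the same formatting can be sketched in Python. The function name format_keys is an assumption, and the metadata value is the sample array from the question wrapped in an object under array_key.

```python
import json

# Sample value of the metadata column, taken from the question.
metadata = json.loads("""
{"array_key": [
  {"key1": "value11", "key2": "value21", "key3": "value31"},
  {"key1": "value21", "key2": "value22", "key3": "value23"}
]}
""")


def format_keys(doc, keys=("key1", "key3")):
    """Join the selected keys of each element, then join the elements."""
    parts = []
    for element in doc["array_key"]:
        # Skip missing or null keys, similar to how concat_ws ignores NULLs.
        values = [element[k] for k in keys if element.get(k) is not None]
        parts.append(", ".join(values))
    return "; ".join(parts)


print(format_keys(metadata))
```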

Use Values from JSONB Array inside a WHERE IN Clause

I have a JSONB object in PostgreSQL:
'{"cars": ["bmw", "mercedes", "pinto"], "user_name": "ed"}'
I am trying to use values from the "cars" array inside it in the WHERE clause of a SELECT:
SELECT car_id FROM cars WHERE car_type IN ('bmw', 'mercedes', 'pinto');
This will correctly return the values 1, 2, and 3 - see table setup at the bottom of this post.
Currently, in my function I do this:
(1) Extract the "cars" array into a variable `v_car_results`.
(2) Use that variable in the `WHERE` clause.
Pseudo code:
DECLARE v_car_results TEXT
BEGIN
v_car_results = '{"cars": ["bmw", "mercedes", "pinto"], "user_name": "ed"}'::json#>>'{cars}';
-- this returns 'bmw', 'mercedes', 'pinto'
SELECT car_id FROM cars WHERE car_type IN ( v_car_results );
END
However, the SELECT statement is not returning any rows. I know it's reading those 3 car types as a single type. (If I only include one car_type in the "cars" element, the query works fine.)
How would I treat these values as an array inside the WHERE clause?
I've tried a few other things:
The ANY clause.
Various attempts at casting.
These queries:
SELECT car_id FROM cars
WHERE car_type IN (json_array_elements_text('["bmw", "mercedes", "pinto"]'));
...
WHERE car_type IN ('{"cars": ["bmw", "mercedes", "pinto"], "user_name": "ed"}'::json->>'cars');
It feels like it's something simple I'm missing. But I've fallen down the rabbit hole on this one. (Maybe I shouldn't even be using the ::json#>> operator?)
TABLE SETUP
CREATE TABLE cars (
car_id SMALLINT
, car_type VARCHAR(255)
);
INSERT INTO cars (car_id, car_type)
VALUES
(1, 'bmw')
, (2, 'mercedes')
, (3, 'pinto')
, (4, 'corolla');
SELECT car_id FROM cars
WHERE car_type IN ('bmw', 'mercedes', 'pinto'); -- Returns Values : 1, 2, 3
Assuming Postgres 9.5 or later.
Use the set-returning function jsonb_array_elements_text() (as table function!) and join to the result:
SELECT c.car_id
FROM jsonb_array_elements_text('{"cars": ["bmw", "mercedes", "pinto"]
, "user_name": "ed"}'::jsonb->'cars') t(car_type)
JOIN cars c USING (car_type);
Extract the JSON array from the object with jsonb->'cars' and pass the resulting JSON array (still data type jsonb) to the function. (The operator #> would do the job as well.)
Aside: ::json#>> isn't just an operator. It's a cast to json (::json), followed by the operator #>>. You don't need either.
The resulting type text conveniently matches your column type varchar(255), so we don't need type-casting. And assign the column name car_type to allow for the syntax shorthand with USING in the join condition.
This form is shorter, more elegant and typically a bit faster than alternatives with IN () or = ANY() - which would work too. Your attempts were pretty close, but you need a variant with a subquery. This would work:
SELECT car_id FROM cars
WHERE car_type IN (SELECT json_array_elements_text('["bmw", "mercedes", "pinto"]'));
Or, cleaner:
SELECT car_id FROM cars
WHERE car_type IN (SELECT * FROM json_array_elements_text('["bmw", "mercedes", "pinto"]'));
Detailed explanation:
How to use ANY instead of IN in a WHERE clause?
Related:
How to turn JSON array into Postgres array?
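The difference between the failing attempt and the working queries is whether the JSON array is treated as one text value or as a set of separate values. A minimal Python sketch of that difference, using the table and payload from the question (the names cars_table, matching_car_ids and matching_as_single_text are assumptions for illustration):

```python
import json

payload = '{"cars": ["bmw", "mercedes", "pinto"], "user_name": "ed"}'

# Rows of the cars table from the question: (car_id, car_type).
cars_table = [(1, "bmw"), (2, "mercedes"), (3, "pinto"), (4, "corolla")]


def matching_car_ids(payload_text, table):
    """Expand the JSON array into individual values, then filter the table."""
    wanted = set(json.loads(payload_text)["cars"])
    return [car_id for car_id, car_type in table if car_type in wanted]


def matching_as_single_text(payload_text, table):
    """What the failing attempt does: compare against the array as one string."""
    as_text = json.dumps(json.loads(payload_text)["cars"])
    return [car_id for car_id, car_type in table if car_type == as_text]


print(matching_car_ids(payload, cars_table))
print(matching_as_single_text(payload, cars_table))
```

The first function corresponds to joining against (or using IN with a subquery on) the expanded elements; the second shows why comparing against the unexpanded text returns no rows.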

How to extract json array elements in postgresql

What I want to do is sum 29.0 and 34.65, grouped by P_id.
Table: transaction_items
Columns: Debits (text), P_id (text)
Data:
Debits                                         | P_id
-----------------------------------------------+------
[{"amount":29.0,"description":"Fee_Type_1"}    | 16
[{"amount":"34.65","description":"Fee_Type_1"} | 16
I tried using the solution mentioned here: https://stackoverflow.com/questions/27834482/how-to-get-elements-from-json-array-in-postgresql
select     transaction_line_items.P_id,
           each_attribute ->> 'amount' Rev
from       transaction_line_items
cross join json_array_elements(to_json(Debits)) each_section
cross join json_array_elements(each_section -> 'attributes') each_attribute
where      (each_attribute -> 'amount') is not null;
However, I got an error saying "cannot deconstruct a scalar".
Can someone please let me know how to parse the values I am looking for?
Thank you.
It seems that your data is broken. The values of the Debits column are not valid JSON because the closing square brackets are missing. Assuming that your data should look like this:
[{"amount":29.0,"description":"Fee_Type_1"}]
[{"amount":"34.65","description":"Fee_Type_1"}]
the following query does what you want:
select p_id, sum(amount)
from (
select p_id, (elements->>'amount')::numeric amount
from transaction_items
cross join json_array_elements(debits::json) elements
) sub
group by p_id;
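The same parse, flatten and sum steps can be sketched in Python to check the expected total. This is only an illustration under the answer's assumption that the closing brackets are restored; the function name sum_debits and the rows list are made up for the example.

```python
import json
from collections import defaultdict
from decimal import Decimal

# (P_id, Debits) pairs, with the closing brackets restored as the answer assumes.
rows = [
    ("16", '[{"amount":29.0,"description":"Fee_Type_1"}]'),
    ("16", '[{"amount":"34.65","description":"Fee_Type_1"}]'),
]


def sum_debits(rows):
    """Sum the amount of every array element, grouped by P_id."""
    totals = defaultdict(Decimal)
    for p_id, debits_text in rows:
        # json.loads raises an error on the original, unterminated arrays.
        for element in json.loads(debits_text, parse_float=Decimal):
            # amount is a JSON number in one row and a JSON string in the
            # other; going through str() handles both, like ->> plus a cast.
            totals[p_id] += Decimal(str(element["amount"]))
    return dict(totals)


print(sum_debits(rows))
```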
