PostgreSQL array sum

Given an array column in a table of a PostgreSQL database containing the following:
{{765,4},{767,3},{569,5},{567,3},{725,5}}
How could I calculate the sum of all second elements of each subarray,
i.e. 4+3+5+3+5

You can try using UNNEST, which expands an array into a set of rows (a multidimensional array is flattened completely), and filtering on the ordinal position:
SELECT *, (
  SELECT SUM(v)
  FROM UNNEST(array_column) WITH ORDINALITY a(v, n)
  WHERE n % 2 = 0
) FROM your_table;
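As a sanity check on the arithmetic, here is a quick Python sketch (not part of the answer) of the even-position filter, using the flattened form of the array from the question:

```python
# Flattened view of {{765,4},{767,3},{569,5},{567,3},{725,5}}
flat = [765, 4, 767, 3, 569, 5, 567, 3, 725, 5]

# WITH ORDINALITY numbers elements starting at 1; keeping the even
# positions selects the second element of each pair.
second_elements = [v for n, v in enumerate(flat, start=1) if n % 2 == 0]

print(second_elements)       # [4, 3, 5, 3, 5]
print(sum(second_elements))  # 20
```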

I was able to achieve my objective by using a jsonb array instead.
The jsonb array: [{"an": 4, "qid": 765}, {"an": 3, "qid": 767}, {"an": 5, "qid": 569}, {"an": 3, "qid": 567}, {"an": 5, "qid": 725}]
The query that accomplishes the objective:
WITH answers AS (
  SELECT
    (jsonbArray ->> 'an')::int AS an,
    (jsonbArray ->> 'qid')::int AS qid
  FROM (
    SELECT jsonb_array_elements(jsonbArray) AS jsonbArray
    FROM "user" WHERE id = 1
  ) AS s
  GROUP BY qid, an
)
SELECT sum(an) AS score FROM answers WHERE qid IN (765, 725);
Result:
score
9
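The same arithmetic, sketched in Python purely for illustration (the data mirrors the jsonb array above):

```python
answers = [
    {"an": 4, "qid": 765}, {"an": 3, "qid": 767}, {"an": 5, "qid": 569},
    {"an": 3, "qid": 567}, {"an": 5, "qid": 725},
]

# Equivalent of: select sum(an) from answers where qid in (765, 725)
score = sum(a["an"] for a in answers if a["qid"] in (765, 725))
print(score)  # 9
```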


count jsonb array with condition in postgres

I have a Postgres database where some column data is stored as follows:
guest_composition                                             | charging_age
--------------------------------------------------------------|-------------
[{"a": 1, "b": 1, "c": 1, "children_ages": [10, 5, 2, 0.1]}]  | 3
[{"a": 1, "b": 1, "c": 1, "children_ages": [2.5, 1, 4]}]      | 3
I want to go over the children_ages array and return the count of children above the age of 3. I am having a hard time using the array data because it is returned as jsonb, not as an int array.
The first row should return 2 because there are 2 children above the age of 3. The second row should return 1 because there is 1 child above the age of 3.
I have tried the following but it didn't work:
WITH reservation AS (SELECT jsonb_array_elements(reservations.guest_composition)->'children_ages' as children_ages, charging_age FROM reservations
SELECT (CASE WHEN (reservations.charging_age IS NOT NULL AND reservation.children_ages IS NOT NULL) THEN SUM( CASE WHEN (reservation.children_ages)::int[] >=(reservations.charging_age)::int THEN 1 ELSE 0 END) ELSE 0 END) as children_to_charge
You can extract an array of all child ages using a SQL JSON path function:
select jsonb_path_query_array(r.guest_composition, '$[*].children_ages[*] ? (@ > 3)')
from reservations r;
The length of that array is then the count you are looking for:
select jsonb_array_length(jsonb_path_query_array(r.guest_composition, '$[*].children_ages[*] ? (@ > 3)'))
from reservations r;
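For reference, the filter-and-count that the JSON path expression performs can be sketched in Python (illustrative only, using the two sample rows from the question):

```python
rows = [
    [{"a": 1, "b": 1, "c": 1, "children_ages": [10, 5, 2, 0.1]}],
    [{"a": 1, "b": 1, "c": 1, "children_ages": [2.5, 1, 4]}],
]

def children_above(guest_composition, age=3):
    # Like $[*].children_ages[*] ? (@ > age): collect every age above
    # the threshold across all entries, then count the matches.
    matches = [a for entry in guest_composition
               for a in entry.get("children_ages", []) if a > age]
    return len(matches)

print([children_above(r) for r in rows])  # [2, 1]
```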
It's unclear to me if charging_age is a column and could change in every row. If that is the case, you can pass a parameter to the JSON path function:
select jsonb_path_query_array(
  r.guest_composition, '$[*].children_ages[*] ? (@ > $age)',
  jsonb_build_object('age', charging_age)
)
from reservations r;

What is the equivalent of PostgreSQL unnest() in Snowflake SQL

How do I rewrite this PostgreSQL expression in Snowflake?
UNNEST(ARRAY[
  'x' || to_char(date_trunc('MONTH', max(date)), 'Mon YYYY'),
  'y' || to_char(date_trunc('MONTH', max(date)), 'Mon YYYY')
])
You can use "flatten" to break out values from the array, and then "table" to convert the values into a table:
-- Use an array for testing:
select array_construct(1, 2, 3, 4, 5);
-- Flattens into a table with metadata for each row:
select * from table(flatten(input => array_construct(1, 2, 3, 4, 5)));
--Pulls out just the values from the array:
select value::integer from table(flatten(input => array_construct(1, 2, 3, 4, 5)));
The "::integer" part casts the values to the data type you want from the array. It's optional but recommended.
You can approximate the syntax of unnest by creating a user defined table function:
create or replace function UNNEST(V array)
returns table ("VALUE" variant)
language SQL
as
$$
select VALUE from table(flatten(input => V))
$$;
You would call it like this:
select * from table(unnest(array_construct(1, 2, 3, 4, 5)));
This returns a table with a single column named VALUE of type variant. You can make a version that returns strings, integers, etc.

Array difference in postgresql

I have two arrays [1,2,3,4,7,6] and [2,3,7] in PostgreSQL which may have common elements. What I am trying to do is to exclude from the first array all the elements that are present in the second.
So far I have achieved the following:
SELECT array
(SELECT unnest(array[1, 2, 3, 4, 7, 6])
EXCEPT SELECT unnest(array[2, 3, 7]));
However, the ordering is not correct as the result is {4,6,1} instead of the desired {1,4,6}.
How can I fix this?
I finally created a custom function with the following definition (taken from here) which resolved my issue:
create or replace function array_diff(array1 anyarray, array2 anyarray)
returns anyarray language sql immutable as $$
select coalesce(array_agg(elem), '{}')
from unnest(array1) elem
where elem <> all(array2)
$$;
I would use the WITH ORDINALITY option of UNNEST and put an ORDER BY in the array_agg() call when converting back to an array. NOT EXISTS is preferred over EXCEPT to keep it simple.
SELECT array_agg(e ORDER BY id)
FROM unnest(array[1, 2, 3, 4, 7, 6]) WITH ORDINALITY AS s1(e, id)
WHERE NOT EXISTS (
  SELECT 1
  FROM unnest(array[2, 3, 7]) AS s2(e)
  WHERE s2.e = s1.e
);
Simpler, with NULL support, and probably faster:
select array(
select v
from unnest(array[2,2,null,1,3,3,4,5,null]) with ordinality as t(v, pos)
where array_position(array[3,3,5,5], v) is null
order by pos
);
Result: {2,2,null,1,4,null}
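In Python terms, this variant keeps duplicates, NULLs, and the original order; a minimal sketch (with None standing in for NULL) purely to illustrate the semantics:

```python
def array_diff(array1, array2):
    # Keep each element of array1 (including None and duplicates) whose
    # value does not occur in array2; the order of array1 is preserved.
    return [v for v in array1 if v not in array2]

print(array_diff([2, 2, None, 1, 3, 3, 4, 5, None], [3, 3, 5, 5]))
# [2, 2, None, 1, 4, None]
```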
Postgres is unfortunately lacking this functionality. In my case, what I really needed was to detect cases where the array difference was not empty. In that specific case you can use the @> operator, which means "does the first array contain the second?":
ARRAY[1,4,3] @> ARRAY[3,1,3] → t
See the documentation on array operators.

Efficiently saving summable array values in RDBMs

I have a dataset where we track engagement per-percent (so 8 people are active at 38%, 7 people are active at 39%, etc.). This gives an array with 100 values, filled with integers.
I need to store this in a postgres table. The only/major requirement is that I need to be able to sum the values for each index to form a new array. Example:
Row 1: [5, 3, 5, ... 7]
Row 2: [2, 5, 3, ... 1]
Sum: [7, 8, 8, ... 8]
The naive way to save these would be 100 individual (BIG)INT columns, which would allow you to sum the values per-column over multiple rows. However, this makes the table very wide (and does not seem like the most efficient way to do it). I have looked into (BIG)INT[100] columns, but I cannot seem to find a good, native way to sum the values. Same thing with json(b) columns (with a native JSON array).
Have I overlooked something? Is there a good, efficient way to do this without completely bloating a table?
The solution using unnest() with ordinality:
with the_table(intarr) as (
  values
    (array[1, 2, 3, 4]),
    (array[1, 2, 3, 4]),
    (array[1, 2, 3, 4])
)
select array_agg(sum order by ordinality)
from (
  select ordinality, sum(unnest)
  from the_table,
  lateral unnest(intarr) with ordinality
  group by 1
) s;
array_agg
------------
{3,6,9,12}
(1 row)
Here is one method that seems to work:
select array_agg(sum_aval order by ind)
from (
  select ind, sum(aval) as sum_aval
  from (
    select id, unnest(a) as aval, generate_series(1, 3) as ind
    from (values (1, array[1, 2, 3]), (2, array[3, 4, 5])) v(id, a)
  ) x
  group by ind
) x;
That is, unnest the arrays and generate indexes for them using generate_series(). Then you can aggregate at the index level and then re-combine into an array (using two separate aggregations).
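Both answers compute the same column-wise sum. Assuming equal-length arrays, the underlying arithmetic is simply (illustrative Python, not part of either answer):

```python
rows = [
    [1, 2, 3, 4],
    [1, 2, 3, 4],
    [1, 2, 3, 4],
]

# Sum position-by-position across rows, like grouping the unnested
# values by their ordinality and aggregating back into an array.
column_sums = [sum(col) for col in zip(*rows)]
print(column_sums)  # [3, 6, 9, 12]
```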

How to select each value of array

Consider the following case:
Table : tab1
id serial primary key
arr int[]
Now I want to select each value of arr.
SELECT * FROM (SELECT arr FROM tab1) AS tab2
I need some kind of iteration over the array, e.g. given:
id arr
-----------------------------
1 [1,2]
2 [5,6,8]
So I could get result as
arr val
-------------------------------
[1,2] 1
[1,2] 2
[5,6,8] 5
[5,6,8] 6
[5,6,8] 8
Use unnest() for that:
WITH array_data(id,arr) AS ( VALUES
(1,ARRAY[1,2]),
(2,ARRAY[5,6,8])
)
SELECT arr,unnest(arr) AS val
FROM array_data;
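The expansion unnest() performs here can be sketched in Python (illustrative only, using the sample rows above): one output row per array element, keeping the source array alongside it.

```python
array_data = [(1, [1, 2]), (2, [5, 6, 8])]

# Like SELECT arr, unnest(arr) AS val FROM array_data:
# each array is paired with each of its own elements.
expanded = [(arr, v) for _, arr in array_data for v in arr]
print(expanded[:2])  # [([1, 2], 1), ([1, 2], 2)]
```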
I don't know if I've understood correctly, but here is everything you need:
select id,
unnest(arr),
array_to_string(arr,','),
array_length(arr, 1)
from array_data;
