Postgres: select array element based on value in other column

I have two columns in a Postgres table: one contains an array of variable length, the other the array position of the element I want to select.
How do I query these?
testtable

 testarray | arraypos
-----------+----------
 a,b,c     | 3
 a         | NULL
 NULL      |
 b,c,d,e   | 3
 a,c       | 1
What I want is something like:
SELECT testarray[arraypos] FROM testtable;
i.e. getting c, d, a.
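For what it's worth, PostgreSQL accepts exactly this form: an array can be subscripted with another column's value, and a NULL array or a NULL/out-of-range position simply yields NULL. A minimal sketch using the table and column names from the question:

CREATE TABLE testtable (testarray text[], arraypos int);
INSERT INTO testtable VALUES
    ('{a,b,c}', 3),
    ('{a}', NULL),
    (NULL, NULL),
    ('{b,c,d,e}', 3),
    ('{a,c}', 1);

-- Array subscripts are 1-based by default; NULL positions yield NULL.
SELECT testarray[arraypos] FROM testtable;

The non-NULL results are c, d and a, matching the expected output.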

Related

How to parse a table with a JSON array field in PostgreSQL into rows?

I have a table that contains a JSON array. Here is a sample of the field's contents, from:
SELECT json_array FROM table LIMIT 5;
Result:
[{"key1":"value1"}, {"key1":"value2"}, ..., {"key2":"value3"}]
[]
[]
[]{"key1":"value1"}
[]
How can I retrieve all the values and count how many of each value was found?
I am using PostgreSQL 9.5.14, and I have tried the solutions here: Querying a JSON array of objects in Postgres,
and the ones suggested to me by another generous Stack Overflow user in my last question: How can I parse JSON arrays in postgresql?
I tried:
SELECT
    value -> 'key1'
FROM
    table,
    json_array_elements(json_array);
which sadly does not work for me; it fails with the error: cannot call json_array_elements on a scalar
This error happens when using a query that returns more than one row or more than one column as a scalar subquery.
Another solution I tried was:
SELECT json_array AS json, (json_array->0),
    coalesce(
        case
            when (json_array->0) IS NULL then null
            else (json_array->0->>'key1')
        end,
        'No value') AS "Value"
FROM table;
which only returned null values for the "Value" column.
Referencing Querying a JSON array of objects in Postgres, I attempted this solution as well:
WITH json_test (col) AS (
    values (json_arrays)
)
SELECT
    y.x->'key1' "key1"
FROM json_test jt,
    LATERAL (SELECT json_array_elements(jt.col) x) y;
But I would need to be able to fit all the elements of the json_arrays into json_test.
So far I have only attempted to list all the values in all the JSON arrays, but my ideal end result for the query resembles this:
Value  | Amount
-------+-------
value1 |     48
value2 |    112
value3 |     93
value4 |      0
Yet again I am grateful for any help with this, thank you in advance.
step-by-step demo: db<>fiddle
SELECT
    each.value,
    COUNT(*)
FROM
    data,
    json_array_elements(json_array) elems,  -- 1
    json_each_text(elems) each              -- 2
GROUP BY each.value                         -- 3
1. Expand the array into one row for each array element
2. Split the key/value pairs into two columns
3. Group by the new value column / count
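If some rows hold scalar JSON rather than an array (which would explain the "cannot call json_array_elements on a scalar" error the asker hit), one way to cope is to keep only the array rows before expanding them. A sketch under that assumption, reusing the same data table and json_array column, with json_typeof (available since 9.4, so fine on 9.5.14):

SELECT
    each.value,
    COUNT(*)
FROM
    (SELECT json_array FROM data
     WHERE json_typeof(json_array) = 'array') d,  -- keep only real arrays
    json_array_elements(d.json_array) elems,
    json_each_text(elems) each
GROUP BY each.value;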

Counting rows in a table based on multiple array criteria

I am trying to count rows in a table based on multiple criteria in different columns of that table. The criteria are not directly in the formula though; they are arrays which I would like to refer to (rather than listing them in the formula directly).
Range table example:
Group  Status
a      1
b      4
b      6
c      4
a      6
d      5
d      4
a      2
b      2
d      3
b      2
c      1
c      2
c      1
a      4
b      3
Criteria/arrays example:
group
-----
a
b

status
------
1
2
3
I am able to do this if I only have one array to search through a range (one column) in that table:
{=SUM(COUNTIFS(data[Group],group[group]))}
Returns "9" as expected (=sum of rows in the group column of the data table which match any values in group[group])
But if I add a second different range and a different array I get an incorrect result:
{=SUM(COUNTIFS(data[Group],group[group], data[Status],status[status]))}
Returns "3" but should return "5" (=sum of rows which have 1, 2 or 3 in the status column and a or b in the group column)
I searched and googled for various ideas related to using SUMPRODUCT or defining the arrays within the formula instead of classifying the whole formula as an array formula, but I was not able to get the expected results via those means.
Thank you for your help.
Because your group and status criteria have a different number of values (2 values for group, but 3 values for status), I'm not sure you can do this in a single formula. The best way I know of is to use a helper column (which can be hidden if preferred).
Put this array formula in a helper column and copy down the length of your data (array formulas must be confirmed with Ctrl+Shift+Enter):
=AND(OR(data[@Group]=group[group]),OR(data[@Status]=status[status]))
And then get the count with: =COUNTIF(helpercolumn,TRUE)
You could use a slightly different approach, using Power Query / Power Pivot.
Name your tables Data, Group and Status, then create the following query, named Filtered Data:
let
    tbData = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
    tbGroup = Excel.CurrentWorkbook(){[Name="Group"]}[Content],
    tbStatus = Excel.CurrentWorkbook(){[Name="Status"]}[Content],
    #"Merged Group" = Table.NestedJoin(tbData,{"Group"},tbGroup,{"Group"},"tbGroup",JoinKind.Inner),
    #"Merged Status" = Table.NestedJoin(#"Merged Group",{"Status"},tbStatus,{"Status"},"Merged Status",JoinKind.Inner),
    #"Removed Columns" = Table.RemoveColumns(#"Merged Status",{"tbGroup", "Merged Status"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"Status", type number}})
in
    #"Changed Type"
Load To as connection only, and tick Load to Data Model
Now create a DAX measure:
Status Sum:=SUM ( 'Filtered Data'[Status] )
You can then use the following formula on your worksheet, to get the Sum of Status values, for rows matching the criteria specified in the Group and Status tables:
=CUBEVALUE("ThisWorkbookDataModel","[Measures].[Status Sum]")
Simply refresh the data connection to update the value.

Access the index of an element in a jsonb array

I would like to access the index of an element in a jsonb array, like this:
SELECT
    jsonb_array_elements(data->'Steps') AS Step,
    INDEX_OF_STEP
FROM my_process
I don't see any function in the manual for this.
Is this somehow possible?
Use WITH ORDINALITY. You have to call the function in the FROM clause to do this:
with my_process(data) as (
    values
        ('{"Steps": ["first", "second"]}'::jsonb)
)
select value as step, ordinality - 1 as index
from my_process
cross join jsonb_array_elements(data->'Steps') with ordinality
step | index
----------+-------
"first" | 0
"second" | 1
(2 rows)
Read in the documentation (7.2.1.4. Table Functions):
If the WITH ORDINALITY clause is specified, an additional column of type bigint will be added to the function result columns. This column numbers the rows of the function result set, starting from 1.
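Mapped back onto the shape of the original query, a sketch (keeping the my_process table and the Steps key from the question; ordinality - 1 gives a 0-based index):

SELECT
    step.value AS step,
    step.ordinality - 1 AS index_of_step
FROM my_process
CROSS JOIN jsonb_array_elements(data->'Steps')
    WITH ORDINALITY AS step(value, ordinality);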
You could try using
jsonb_each_text(jsonb)
which should supply both the key and value.
There is an example in this question:
Extract key, value from json objects in Postgres
except you would use the jsonb version.

Make a new array with items derived from another array

Given a PostgreSQL ARRAY of items of one type, how can I create a new array where each item is derived from the items in the initial array?
Example: I have an array of INTERVAL values. I want a new array where each item is a NUMERIC(10, 1) that is the total number of seconds in the corresponding INTERVAL value.
I know how to convert one INTERVAL value:
foo=> SELECT '00:01:20.000'::INTERVAL AS duration_interval;
duration_interval
-------------------
00:01:20
(1 row)
foo=> SELECT extract(EPOCH FROM date_trunc('second', '00:01:20.000'::INTERVAL))
::NUMERIC(10, 1) AS duration_seconds;
duration_seconds
------------------
80.0
(1 row)
The array does not exist in a table – this is a value returned from another function call – so the conversion code needs to operate on it as an array.
How can I convert an array of INTERVAL values to an array of corresponding NUMERIC values?
You need to unnest() the array, do the conversion and then aggregate back into an array.
Assuming you want to do this on a real table with a primary key:
SELECT pk, array_agg(extract(epoch from dur_int)::numeric(10,1)
                     ORDER BY ordinality) AS duration_seconds
FROM my_table, unnest(duration_interval) WITH ORDINALITY d(dur_int)
GROUP BY pk;
If you have a single array, such as the result from a function call:
SELECT array_agg(extract(epoch from dur_int)::numeric(10,1)
                 ORDER BY ordinality) AS duration_seconds
FROM unnest(function(...)) WITH ORDINALITY d(dur_int);
Note that you need the WITH ORDINALITY clause when unnesting the array. This will add a column ordinality to the result such that every row has two columns: (dur_int interval, ordinality bigint). When putting the array back again with seconds instead of an interval, you order the rows by the ordinality column. That way you ensure that the order in the resulting array of seconds is the same as in the original array of intervals. (In general, SQL row sources have no specific ordering, the server may present rows in any order it prefers.)
If you have access to the function and you are not breaking other uses of it, you might be better off by changing the function such that you can use its result directly.
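If changing the function isn't an option, the same unnest-and-reaggregate pattern can also be packaged once in a small SQL function. A sketch; the name interval_array_to_seconds is hypothetical, not an existing Postgres function:

CREATE FUNCTION interval_array_to_seconds(durs interval[])
RETURNS numeric(10,1)[] AS $$
    -- unnest, convert each element, and aggregate back in the original order
    SELECT array_agg(extract(epoch from d)::numeric(10,1) ORDER BY ordinality)
    FROM unnest(durs) WITH ORDINALITY AS t(d, ordinality)
$$ LANGUAGE sql IMMUTABLE;

SELECT interval_array_to_seconds(array['00:01:20','00:00:30']::interval[]);
-- {80.0,30.0}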
If there is a primary key then @Patrick's answer is enough. If not, then use row_number() to aggregate on:
with i(i) as (values
    (array['00:01:20.000','00:00:30.000']::interval[]),
    (array['00:02:10.000','00:01:30.000']::interval[])
)
select array_agg(extract(epoch from a)::numeric(10,1))
from (
    select i, row_number() over() as r
    from i
) s, unnest(i) a (a)
group by r;
array_agg
--------------
{80.0,30.0}
{130.0,90.0}
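Note that this variant leaves the element order to unnest()'s natural order. If you want the explicit guarantee described in the first answer, the WITH ORDINALITY clause can be folded in; a sketch:

with i(i) as (values
    (array['00:01:20.000','00:00:30.000']::interval[]),
    (array['00:02:10.000','00:01:30.000']::interval[])
)
select array_agg(extract(epoch from a.a)::numeric(10,1)
                 order by a.ordinality)
from (
    select i, row_number() over() as r
    from i
) s, unnest(s.i) with ordinality a (a, ordinality)
group by s.r;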

VBA two dimensional arrays connecting to a database

I am working with VBA in Excel and a database in Access. The Access database has a table containing 3 columns: OrderID, a column of numbers saying which order the particular item was in; OrderDescription, a column containing the description of the item; and Item #, a column that gives a number to each particular item (if an item is the same as another, they share the same item number).
I need to build a 2-dimensional array in excel using VBA holding which items were purchased in which orders. The rows will be the Order ID and the columns will be the Item ID. The elements of this array will contain an indicator (like True or a “1”) that indicates that this order contains certain items. For example, row 6 (representing order ID 6) will have “True” in columns 1, 5, and 26 if that order purchased item IDs 1, 5, and 26. All other columns for that order will be blank.
In order to do this, I think I will have to determine the max order number (39) and the max item number (33). This information is available in the database, which I can connect to using a .Connection and .Recordset. Some order numbers and some item numbers may not appear.
Note also that this will likely be a sparse array (not many entries) as most orders contain only a few items. We do not care how many of an item a customer purchased, only that the item was purchased on this order.
MY QUESTION is: how can I set up this array? I tried a loop that would assign the values of the order numbers to one array and the item numbers to another, and then dimensioning the array to those sizes, but it won't work.
Is there a way to make an element of an array return a value of True if it exists?
Thanks for your help
It seems to me that the best bet may be a crosstab query run on an Access connection. You can create your array with the ADO method GetRows: http://www.w3schools.com/ado/met_rs_getrows.asp.
TRANSFORM First(Nz([Item #],0)>0) AS Val
SELECT OrderNo
FROM Table
GROUP BY OrderNo
PIVOT [Item #]
With a Counter table containing integers from 1 to the maximum number of items in a column (field) named Num:
TRANSFORM First(q.Val) AS FirstOfVal
SELECT q.OrderNo
FROM (SELECT t.OrderNo, c.Num, Nz([Item #],0)>0 AS Val
      FROM TableX t RIGHT JOIN [Counter] c ON t.[Item #] = c.Num
      WHERE c.Num<12) q
GROUP BY q.OrderNo
PIVOT q.Num
Output:
OrderNo 1 2 3 4 5 6 7 8 9 10 11
0 0 0 0 0 0
1 -1 -1 -1 -1
2 -1 -1 -1 -1
