Is there a jsonb array overlap function for Postgres?

I am not able to extract and compare two arrays from jsonb in Postgres to do an overlap check. Is there a working function for this?
Example rows in the people_favorite_color table:
{
"person_id":1,
"favorite_colors":["red","orange","yellow"]
}
{
"person_id":2,
"favorite_colors":["yellow","green","blue"]
}
{
"person_id":3,
"favorite_colors":["black","white"]
}
My array overlap test:
select
p1.json_data->>'person_id',
p2.json_data->>'person_id',
p1.json_data->'favorite_colors' && p2.json_data->'favorite_colors'
from people_favorite_color p1 join people_favorite_color p2 on (1=1)
where p1.json_data->>'person_id' < p2.json_data->>'person_id'
Expected results:
p1.id;p2.id;likes_same_color
1;2;t
1;3;f
2;3;f
--edit--
Attempting to cast to text[] results in an error:
select
('{
"person_id":3,
"favorite_colors":["black","white"]
}'::jsonb->>'favorite_colors')::text[];
ERROR: malformed array literal: "["black", "white"]"
DETAIL: "[" must introduce explicitly-specified array dimensions.

The cast fails because ->> returns the JSON array as plain text, which is not a valid Postgres array literal. Instead, use jsonb_array_elements_text() and array_agg() to convert the jsonb array to a native text array:
with the_data as (
select id, array_agg(color) colors
from (
select json_data->'person_id' id, color
from
people_favorite_color,
jsonb_array_elements_text(json_data->'favorite_colors') color
) sub
group by 1
)
select p1.id, p2.id, p1.colors && p2.colors like_same_colors
from the_data p1
join the_data p2 on p1.id < p2.id
order by 1, 2;
id | id | like_same_colors
----+----+------------------
1 | 2 | t
1 | 3 | f
2 | 3 | f
(3 rows)
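If you would rather keep the && syntax from the original attempt, the conversion can also be wrapped in a small helper function. A minimal sketch (jsonb_text_array is a name introduced here, not a built-in):
create or replace function jsonb_text_array(j jsonb)
returns text[] language sql immutable as $$
    -- expand the jsonb array to text elements, then collect them into a text[]
    select array(select jsonb_array_elements_text(j))
$$;
select
    p1.json_data->>'person_id',
    p2.json_data->>'person_id',
    jsonb_text_array(p1.json_data->'favorite_colors')
        && jsonb_text_array(p2.json_data->'favorite_colors') as likes_same_color
from people_favorite_color p1
join people_favorite_color p2
    on p1.json_data->>'person_id' < p2.json_data->>'person_id';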

Related

Loading CSV data to Snowflake table

The column splits into multiple columns when trying to load the following data into a Snowflake table, since it is a CSV file.
Column data:
{"Department":"Mens Wear","Departmentid":"10.1;20.1","customername":"john4","class":"tops wear","subclass":"sweat shirts","product":"North & Face 2 Bangle","style":"Sweat shirt hoodie - Large - Black"}
Is there any other way to load the data into a single column?
The best solution would be to use a different delimiter instead of a comma in your CSV file. If that's not possible, you can ingest the data using a non-existent delimiter to get the whole line as one column, and then parse it. Of course it won't be as efficient as native loading:
cat test.csv
1,2020-10-12,Gokhan,{"Department":"Mens Wear","Departmentid":"10.1;20.1","customername":"john4","class":"tops wear","subclass":"sweat shirts","product":"North & Face 2 Bangle","style":"Sweat shirt hoodie - Large - Black"}
create file format csvfile type=csv FIELD_DELIMITER='NONEXISTENT';
select $1 from @my_stage (file_format => csvfile);
create table testtable( id number, d1 date, name varchar, v variant );
copy into testtable from (
select
split( split($1,',{')[0], ',' )[0],
split( split($1,',{')[0], ',' )[1],
split( split($1,',{')[0], ',' )[2],
parse_json( '{' || split($1,',{')[1] )
from @my_stage (file_format => csvfile )
);
select * from testtable;
+----+------------+--------+-----------------------------------------------------------------+
| ID | D1 | NAME | V |
+----+------------+--------+-----------------------------------------------------------------+
| 1 | 2020-10-12 | Gokhan | { "Department": "Mens Wear", "Departmentid": "10.1;20.1", ... } |
+----+------------+--------+-----------------------------------------------------------------+
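Once loaded, individual JSON fields can be queried straight from the variant column with Snowflake's path notation, e.g.:
select v:Department::string as department,
       v:customername::string as customer
from testtable;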

Concatenate string values of objects in a postgresql json array

I have a PostgreSQL table with a json column containing objects nested in an array. Now I want to build a query that returns the id and the concatenated string values of the objects in the json column. The PostgreSQL version is 9.5.
Example data
CREATE TABLE test
(
id integer,
data json
);
INSERT INTO test (id, data) VALUES (1, '{
"info":"a",
"items":[
{ "name":"a_1" },
{ "name":"a_2" },
{ "name":"a_3" }
]
}');
INSERT INTO test (id, data) VALUES (2, '{
"info":"b",
"items":[
{ "name":"b_1" },
{ "name":"b_2" },
{ "name":"b_3" }
]
}');
INSERT INTO test (id, data) VALUES (3, '{
"info":"c",
"items":[
{ "name":"c_1" },
{ "name":"c_2" },
{ "name":"c_3" }
]
}');
Not quite working as intended
So far I've been able to get the values from the table, unfortunately without the strings being concatenated:
SELECT
row.id,
item ->> 'name'
FROM
test as row,
json_array_elements(row.data #> '{items}' ) as item;
Which will output:
id | names
----------
1 | a_1
1 | a_2
1 | a_3
2 | b_1
2 | b_2
2 | b_3
3 | c_1
3 | c_2
3 | c_3
Intended output example
What would a query look like that returns this output?
id | names
----------
1 | a_1, a_2, a_3
2 | b_1, b_2, b_3
3 | c_1, c_2, c_3
Your original attempt was missing a GROUP BY step.
This should work:
SELECT
id
, STRING_AGG(item->>'name', ', ')
FROM
test,
json_array_elements(test.data->'items') as item
GROUP BY 1
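Note that STRING_AGG without an ORDER BY makes no guarantee about the order of the concatenated names; if the order matters, it can be pinned down inside the aggregate (a small variation of the query above):
SELECT
    id
    , STRING_AGG(item->>'name', ', ' ORDER BY item->>'name')
FROM
    test,
    json_array_elements(test.data->'items') as item
GROUP BY 1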
Changing the SQL for the second column into an ARRAY subquery should also give the required results:
SELECT
row.id, ARRAY (SELECT item ->> 'name'
FROM
test as row1,
json_array_elements(row.data #> '{items}' ) as item WHERE row.id=row1.id)
FROM
test as row;
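Keep in mind that the ARRAY variant returns a text[] rather than the comma-separated string shown in the intended output; to match it exactly, the array can be wrapped in array_to_string():
SELECT
    row.id,
    array_to_string(ARRAY(SELECT item ->> 'name'
        FROM test as row1,
             json_array_elements(row.data #> '{items}') as item
        WHERE row.id = row1.id), ', ') as names
FROM test as row;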

Check a value in an array inside a object json in PostgreSQL 9.5

I have a json object containing an array and other properties.
I need to check the first value of the array for each line of my table.
Here is an example of the json:
{"objectID2":342,"objectID1":46,"objectType":["Demand","Entity"]}
So I need, for example, to get all lines with objectType[0] = 'Demand' and objectID1 = 46.
These are the table columns:
id | relationName | content
The content column contains the json.
Just query them, like:
t=# with table_name(id, rn, content) as (values(1,null,'{"objectID2":342,"objectID1":46,"objectType":["Demand","Entity"]}'::json))
select * From table_name
where content->'objectType'->>0 = 'Demand' and content->>'objectID1' = '46';
id | rn | content
----+----+-------------------------------------------------------------------
1 | | {"objectID2":342,"objectID1":46,"objectType":["Demand","Entity"]}
(1 row)
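If the table is large, an expression index on the same two expressions can speed this query up. A sketch, assuming the json column is named content as above:
create index table_name_type_id_idx
    on table_name ((content->'objectType'->>0), (content->>'objectID1'));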

Using jsonb_set() for updating specific jsonb array value

Currently I am working with PostgreSQL 9.5 and trying to update a value inside an array of a jsonb field, but I am unable to get the index of the selected value.
My table just looks like this:
CREATE TABLE samples (
id serial,
sample jsonb
);
My JSON looks like this:
{"result": [
{"8410": "ABNDAT", "8411": "Abnahmedatum"},
{"8410": "ABNZIT", "8411": "Abnahmezeit"},
{"8410": "FERR_R", "8411": "Ferritin"}
]}
My SELECT statement to get the correct value works:
SELECT
id, value
FROM
samples s, jsonb_array_elements(s.sample#>'{result}') r
WHERE
s.id = 26 and r->>'8410' = 'FERR_R';
results in:
id | value
----------------------------------------------
26 | {"8410": "FERR_R", "8411": "Ferritin"}
Ok, this is what I wanted. Now I want to execute an update using the following UPDATE statement to add a new element "ratingtext" (if not already there):
UPDATE
samples s
SET
sample = jsonb_set(sample,
'{result,2,ratingtext}',
'"Some individual text"'::jsonb,
true)
WHERE
s.id = 26;
After executing the UPDATE statement, my data looks like this (also correct):
{"result": [
{"8410": "ABNDAT", "8411": "Abnahmedatum"},
{"8410": "ABNZIT", "8411": "Abnahmezeit"},
{"8410": "FERR_R", "8411": "Ferritin", "ratingtext": "Some individual text"}
]}
So far so good, but I manually looked up the index value of 2 to address the right element inside the JSON array. If the order changes, this won't work anymore.
So my problem:
Is there a way to get the index of the selected JSON array element and combine the SELECT statement and the UPDATE statement into one?
Just like:
UPDATE
samples s
SET
sample = jsonb_set(sample,
'{result,' || INDEX OF ELEMENT || ',ratingtext}',
'"Some individual text"'::jsonb,
true)
WHERE
s.id = 26;
The values of samples.id and "8410" are known before preparing the statement.
Or is this not possible at the moment?
You can find the index of a searched element using jsonb_array_elements() WITH ORDINALITY (note: ordinality starts from 1, while the first index of a json array is 0):
select
pos - 1 as elem_index
from
samples,
jsonb_array_elements(sample->'result') with ordinality arr(elem, pos)
where
id = 26 and
elem->>'8410' = 'FERR_R';
elem_index
------------
2
(1 row)
Use the above query to update the element based on its index (note that the second argument of jsonb_set() is a text array):
update
samples
set
sample =
jsonb_set(
sample,
array['result', elem_index::text, 'ratingtext'],
'"some individual text"'::jsonb,
true)
from (
select
pos - 1 as elem_index
from
samples,
jsonb_array_elements(sample->'result') with ordinality arr(elem, pos)
where
id = 26 and
elem->>'8410' = 'FERR_R'
) sub
where
id = 26;
Result:
select id, jsonb_pretty(sample)
from samples;
id | jsonb_pretty
----+--------------------------------------------------
26 | { +
| "result": [ +
| { +
| "8410": "ABNDAT", +
| "8411": "Abnahmedatum" +
| }, +
| { +
| "8410": "ABNZIT", +
| "8411": "Abnahmezeit" +
| }, +
| { +
| "8410": "FERR_R", +
| "8411": "Ferritin", +
| "ratingtext": "Some individual text"+
| } +
| ] +
| }
(1 row)
The last argument of jsonb_set() should be true to force adding a new value if its key does not exist yet. It may be skipped, however, as its default value is true.
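A standalone illustration of the flag's effect:
select jsonb_set('{"a": 1}', '{b}', '2', true);   -- {"a": 1, "b": 2}
select jsonb_set('{"a": 1}', '{b}', '2', false);  -- {"a": 1}, key not created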
Though concurrency issues seem unlikely (due to the restrictive WHERE condition and a potentially small number of affected rows), you may also be interested in Atomic UPDATE .. SELECT in Postgres.

Apply function to every element of an array in a SELECT statement

I am listing all functions of a PostgreSQL schema and need the human-readable types for every argument of the functions. The OIDs of the types are represented as an array in proallargtypes. I can unnest the array and apply format_type() to it, but that splits the query into multiple rows for a single function. To avoid that, I have to add an outer SELECT to GROUP the argument types again because, apparently, one cannot group an unnested array. All columns depend on proname, but since proname is not a primary key I still have to list every column in the GROUP BY clause, which is needlessly verbose.
Is there a better way to achieve my goal of an output like this:
proname | ... | protypes
-------------------------------------
test | ... | {integer,integer}
I am currently using this query:
SELECT
proname,
prosrc,
pronargs,
proargmodes,
array_agg(proargtypes), -- see here
proallargtypes,
proargnames,
prodefaults,
prorettype,
lanname
FROM (
SELECT
p.proname,
p.prosrc,
p.pronargs,
p.proargmodes,
format_type(unnest(p.proallargtypes), NULL) AS proargtypes, -- and here
p.proallargtypes,
p.proargnames,
pg_get_expr(p.proargdefaults, 0) AS prodefaults,
format_type(p.prorettype, NULL) AS prorettype,
l.lanname
FROM pg_catalog.pg_proc p
JOIN pg_catalog.pg_language l
ON l.oid = p.prolang
JOIN pg_catalog.pg_namespace n
ON n.oid = p.pronamespace
WHERE n.nspname = 'public'
) x
GROUP BY proname, prosrc, pronargs, proargmodes, proallargtypes, proargnames, prodefaults, prorettype, lanname
You can use the internal "undocumented" function pg_catalog.pg_get_function_arguments(p.oid):
postgres=# SELECT pg_catalog.pg_get_function_arguments('fufu'::regproc);
pg_get_function_arguments
---------------------------
a integer, b integer
(1 row)
There is no built-in "map" function, so unnest plus array_agg is the only way. You can simplify life with your own custom function:
CREATE OR REPLACE FUNCTION format_types(oid[])
RETURNS text[] AS $$
SELECT ARRAY(SELECT format_type(unnest($1), null))
$$ LANGUAGE sql IMMUTABLE;
And the result:
postgres=# SELECT format_types('{21,22,23}');
format_types
-------------------------------
{smallint,int2vector,integer}
(1 row)
Then your query should be:
SELECT proname, format_types(proallargtypes)
FROM pg_proc
WHERE pronamespace = 2200 AND proallargtypes IS NOT NULL;
But the result will probably not be what you expect, because the proallargtypes field is only populated when OUT parameters are used; usually it is empty. You should look at the proargtypes field instead, but it is of type oidvector, so you have to transform it to oid[] first:
postgres=# SELECT proname, format_types(string_to_array(proargtypes::text,' ')::oid[])
FROM pg_proc
WHERE pronamespace = 2200
LIMIT 10;
proname | format_types
------------------------------+----------------------------------------------------
quantile_append_double | {internal,"double precision","double precision"}
quantile_append_double_array | {internal,"double precision","double precision[]"}
quantile_double | {internal}
quantile_double_array | {internal}
quantile | {"double precision","double precision"}
quantile | {"double precision","double precision[]"}
quantile_cont_double | {internal}
quantile_cont_double_array | {internal}
quantile_cont | {"double precision","double precision"}
quantile_cont | {"double precision","double precision[]"}
(10 rows)
