Transform JSON array to boolean columns in PostgreSQL

I have a column that contains a JSON array of strings, which I would like to transform into boolean columns: a column is true if the value is present in the array.
Let's say I have the following column in Postgres.
|"countries"|
|------------|
|["NL", "BE"]|
|["UK"]|
I would like to transform this into boolean columns, one per market, e.g.
|"BE"|"NL"|"UK"|
|----|----|-----|
|True|True|False|
|False|False|True|
I know I can manually expand it using CASE statements for each country code, but there are 200+ countries.
Is there a more elegant solution?

Displaying a variable list of columns whose labels are only known at runtime is not so obvious with Postgres. You need some dynamic SQL.
Here is a fully dynamic solution whose result is close to your expected result. It relies on the creation of a user-defined composite type and on the standard functions jsonb_populate_record and jsonb_object_agg.
First you create the list of countries as a new composite type:
CREATE TYPE country_list AS ();

CREATE OR REPLACE PROCEDURE country_list() LANGUAGE plpgsql AS
$$
DECLARE
  country_list text;
BEGIN
  -- build the attribute list from the data, e.g. 'BE text,NL text,UK text'
  SELECT string_agg(DISTINCT c.country || ' text', ',')
  INTO country_list
  FROM your_table
  CROSS JOIN LATERAL jsonb_array_elements_text(countries) AS c(country);

  -- recreate the composite type with one text attribute per country
  EXECUTE 'DROP TYPE IF EXISTS country_list';
  EXECUTE 'CREATE TYPE country_list AS (' || country_list || ')';
END;
$$;
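With the sample data above, the statement generated and executed by the procedure would be roughly:
CREATE TYPE country_list AS (BE text,NL text,UK text)
The identifiers are not quoted, so the attribute names fold to lower case (be, nl, uk); that is why the final query further down aggregates on lower(c.country), and why the resulting column labels come out lower case.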
Then you can call the procedure country_list() just before executing the final query:
CALL country_list();
or, even better, call the procedure country_list() from triggers whenever the list of countries may be modified:
CREATE OR REPLACE FUNCTION your_table_insert_update()
RETURNS trigger LANGUAGE plpgsql VOLATILE AS
$$
BEGIN
  -- rebuild the type only when NEW.countries contains a country that is not
  -- yet an attribute of country_list (attribute names are folded to lower
  -- case, hence the lower() in the join condition)
  IF EXISTS (
       SELECT 1
       FROM (SELECT jsonb_object_keys(to_jsonb(a.*))
             FROM (SELECT (null::country_list).*) AS a) AS b(key)
       RIGHT JOIN jsonb_array_elements_text(NEW.countries) AS c(country)
              ON lower(c.country) = b.key
       WHERE b.key IS NULL
     )
  THEN
    CALL country_list();
  END IF;
  RETURN NEW;
END;
$$;

CREATE OR REPLACE TRIGGER your_table_insert_update
AFTER INSERT OR UPDATE OF countries ON your_table
FOR EACH ROW EXECUTE FUNCTION your_table_insert_update();
CREATE OR REPLACE FUNCTION your_table_delete()
RETURNS trigger LANGUAGE plpgsql VOLATILE AS
$$
BEGIN
  CALL country_list();
  RETURN OLD;
END;
$$;

CREATE OR REPLACE TRIGGER your_table_delete
AFTER DELETE ON your_table
FOR EACH ROW EXECUTE FUNCTION your_table_delete();
Finally, you should get the expected result with the following query, except that the column labels are lower case and NULL replaces false in the result:
SELECT (jsonb_populate_record(NULL::country_list,
                              jsonb_object_agg(lower(c.country), true))).*
FROM your_table AS t
CROSS JOIN LATERAL jsonb_array_elements_text(t.countries) AS c(country)
GROUP BY t;
Full test result in dbfiddle.
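As an aside: when the set of countries is small and fixed, the jsonb ? operator (does the array contain this string?) yields true/false directly and avoids the NULL caveat. A static sketch for the three sample codes, assuming countries is jsonb:
SELECT countries ? 'BE' AS "BE",
       countries ? 'NL' AS "NL",
       countries ? 'UK' AS "UK"
FROM your_table;
With 200+ countries this is exactly the manual expansion the question wants to avoid, but it is handy for a handful of fixed columns.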

Related

Nested json extraction into postgres table

I have used the following query to parse and store json elements into table 'pl'.
The 'test' table is used to store the raw json.
select
  each_attribute ->> 'id' id,
  each_attribute ->> 'sd' sd,
  each_attribute ->> 'v' v
from test
cross join json_array_elements(json_array) each_section
cross join json_array_elements(each_section -> 'firstt') each_attribute;
I am able to view the json values using the above query, but I am not able to insert them into another table using json_populate_recordset.
Definition of the tables I need to insert the nested json into:
id integer, sd character varying(6666), v character varying(99999)
Table1 (with the above definition) should store the values for key firstt.
Table2 (with the same definition) should store the values for key secondt.
JSON format:
{
  "firstt": [
    { "id": 1, "sd": "test3", "v": "2223" },
    { "id": 2, "sd": "test2", "v": "2222" }
  ],
  "secondt": [
    { "id": 1, "sd": "test3", "v": "2223" },
    { "id": 2, "sd": "test2", "v": "2222" }
  ]
}
Please assist. I have tried every possible thing from Stack Overflow solutions, but nothing covers inserting a nested array like this.
Adding my code for the dynamic query below. It does not work; the error is 'too few arguments for format' (the format string contains three placeholders but only one argument, tb_n, is supplied).
do $$
DECLARE
  my record;
  tb_n varchar(50);
BEGIN
  FOR my IN
    SELECT json_object_keys(json_array) as t FROM test
  LOOP
    tb_n := my.t;
    EXECUTE format($$ WITH tbl_record_arrays as(
        SELECT
          entries.*
        FROM
          test
          JOIN LATERAL json_each(json_array) as entries(tbl_name,tbl_data_arr) ON TRUE
      )
      INSERT INTO %I
      SELECT
        records.*
      FROM
        tbl_record_arrays
        JOIN LATERAL json_populate_recordset(null::%I,tbl_data_arr) records ON TRUE
      WHERE
        tbl_name = %I$$,tb_n);
  END LOOP;
END;
$$;
To create a plpgsql function that dynamically inserts a json array for a specified key into a specified table, you can do:
CREATE OR REPLACE FUNCTION dynamic_json_insert(key_name text, tbl text) RETURNS VOID AS $$
BEGIN
  -- the $<tag>$ syntax allows for generating a multiline string
  EXECUTE format($sql$
    INSERT INTO %1$I
    SELECT
      entries.*
    FROM test
    JOIN LATERAL json_populate_recordset(null::%1$I, json_data -> $1) as entries ON TRUE;
  $sql$::text, tbl) USING dynamic_json_insert.key_name;
END;
$$ LANGUAGE plpgsql
VOLATILE          -- modifies data
STRICT            -- returns NULL if any arguments are NULL
SECURITY INVOKER; -- execute this function with the role of the caller, rather than the role that defined the function
and call it like:
SELECT dynamic_json_insert('firstt','table_1');
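For reference, the statement that format() builds for this call looks like the following: %1$I is replaced by table_1 in both places, while $1 remains a bind parameter that the USING clause fills with 'firstt'. (Note the answer reads the raw json from a json_data column, whereas the question's table used json_array.)
INSERT INTO table_1
SELECT
  entries.*
FROM test
JOIN LATERAL json_populate_recordset(null::table_1, json_data -> $1) as entries ON TRUE;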
If you want to insert into multiple tables using multiple key/table pairs, you can write a plpgsql function that takes a variadic array of (key, table) pairs and generates a single Common Table Expression (CTE) combining all of the INSERTs into one atomic statement.
First create a custom type:
CREATE TYPE table_key as (
  tbl_key text,
  relation regclass -- special type that refers to a PostgreSQL relation
);
Then define the function:
CREATE OR REPLACE FUNCTION dynamic_json_insert(variadic table_keys table_key[]) RETURNS VOID AS $$
DECLARE
  tbl_key_len integer = array_length(dynamic_json_insert.table_keys,1);
BEGIN
  IF tbl_key_len > 0 THEN
    EXECUTE (
      -- generates a single atomic INSERT CTE when there are multiple table_keys,
      -- or a single INSERT statement otherwise.
      -- the SELECT is enclosed in parentheses because it generates a single text value which EXECUTE receives.
      SELECT
        -- prepend WITH if we have more than 1 table_key (for the CTE)
        CASE WHEN tbl_key_len > 1 THEN 'WITH ' ELSE '' END
        || string_agg(
             CASE
               WHEN is_aux
                 -- name the auxiliary statement and put it in parentheses
                 THEN format('%1$I as (%2$s)','ins_' || tk.tbl_key,stmt) || end_char
               ELSE stmt
             END, E'\n') || ';'
      FROM
        -- unnest the table_keys argument and get each element's index (rn)
        unnest(dynamic_json_insert.table_keys) WITH ORDINALITY AS tk(tbl_key,relation,rn)
        -- the JOIN LATERAL here means "for each unnested table_key, generate the rows of the following subquery"
        JOIN LATERAL (
          SELECT
            rn < tbl_key_len is_aux,
            -- we need a comma between auxiliary statements
            CASE WHEN rn = tbl_key_len - 1 THEN '' ELSE ',' END end_char,
            -- dynamically generate the INSERT statement
            format($sql$
              INSERT INTO %1$I
              SELECT
                entries.*
              FROM test
              JOIN LATERAL json_populate_recordset(null::%1$I,json_data -> %2$L) as entries ON TRUE
            $sql$::text,tk.relation,tk.tbl_key) stmt
        ) stmts ON TRUE
    );
  END IF;
END;
$$ LANGUAGE plpgsql
VOLATILE          -- modifies data
STRICT            -- returns NULL if any arguments are NULL
SECURITY INVOKER; -- execute this function with the role of the caller, rather than the role that defined the function
Then call the function like:
SELECT dynamic_json_insert(
('firstt','table_1'),
('secondt','table_2')
);
Because of the VARIADIC keyword, you can pass each element of the array as an individual argument and Postgres will cast them to the appropriate types automatically.
The generated/executed SQL for the above function call will be:
WITH ins_firstt as (
  INSERT INTO table_1
  SELECT
    entries.*
  FROM test
  JOIN LATERAL json_populate_recordset(null::table_1, json_data -> 'firstt') as entries ON TRUE
)
INSERT INTO table_2
SELECT
  entries.*
FROM test
JOIN LATERAL json_populate_recordset(null::table_2, json_data -> 'secondt') as entries ON TRUE;
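If you already have the (key, table) pairs in an array value, the same function can also be called by spelling out the VARIADIC keyword at the call site, e.g.:
SELECT dynamic_json_insert(
  VARIADIC ARRAY[('firstt','table_1'),('secondt','table_2')]::table_key[]
);
This produces exactly the same generated SQL as the pair-by-pair call above.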

Stored procedure handling multiple SQL statements in Snowflake

I'm creating a stored procedure in Snowflake that will eventually be called by a task.
However I'm getting the following error:
Multiple SQL statements in a single API call are not supported; use one API call per statement instead
And I'm not sure how to approach the advised solution within my JavaScript implementation.
Here's what I have:
CREATE OR REPLACE PROCEDURE myStoreProcName()
RETURNS VARCHAR
LANGUAGE javascript
AS
$$
var rs = snowflake.execute( { sqlText:
  `set curr_date = '2015-01-01';
   CREATE OR REPLACE TABLE myTableName AS
   with cte1 as (
     SELECT *
     FROM Table1
     where date = $curr_date
   )
   ,cte2 as (
     SELECT *
     FROM Table2
     where date = $curr_date
   )
   select * from
   cte1 as 1
   inner join cte2 as 2
   on(1.key = 2.key)
  `
} );
return 'Done.';
$$;
You could write your own helper function (idea from user waldente):
this.executeMany = (s) => s.split(';').map(sqlText => snowflake.createStatement({sqlText}).execute());
executeMany(`set curr_date = '2015-01-01';
CREATE OR REPLACE TABLE ...`);
The last statement should not end with a ;. The helper may also fail if a ; occurs inside one of the DDL statements where it was not intended as a separator.
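A slightly more defensive sketch of the same idea; it only guards against empty fragments (e.g. after a trailing ;), not against ; inside string literals or DDL bodies:
this.executeMany = (s) => s.split(';')
  .map(q => q.trim())
  .filter(q => q.length > 0) // drop empty fragments produced by the split
  .map(sqlText => snowflake.createStatement({sqlText}).execute());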
You can't have:
var rs = snowflake.execute( { sqlText:
`set curr_date = '2015-01-01';
CREATE OR REPLACE TABLE myTableName AS
...
`
Instead you need to call execute twice (or more), each call for a different query ending in ;.
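A minimal sketch of that two-call version, applied to the procedure from the question (the invalid join aliases 1 and 2 are replaced by the CTE names; the session variable set by the first call is visible to the second call in the same session):
CREATE OR REPLACE PROCEDURE myStoreProcName()
RETURNS VARCHAR
LANGUAGE javascript
AS
$$
// one API call per statement
snowflake.execute({ sqlText: `set curr_date = '2015-01-01'` });
snowflake.execute({ sqlText: `
  CREATE OR REPLACE TABLE myTableName AS
  with cte1 as (
    SELECT * FROM Table1 where date = $curr_date
  ),
  cte2 as (
    SELECT * FROM Table2 where date = $curr_date
  )
  select * from cte1
  inner join cte2 on (cte1.key = cte2.key)
` });
return 'Done.';
$$;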

Bypassing string array as a parameters and using dynamic query in PostgreSQL

I have been working with PostgreSQL for years, but I had no idea about the array concept and how to handle arrays in PostgreSQL. I need a dynamic query that selects columns from multiple tables and returns the result in a cursor; the column names should change dynamically.
For example, across multiple tables there are 30 columns in total. If a user needs col1, col5, col6, col25, the select statement should change dynamically, like:
select col1, col5, col6, col25 from table ....
Another user needs col2, col5, col7, col29, col26, so the select statement dynamically becomes:
select col2, col5, col7, col29, col26 from table .... and so on.
The stored procedure's parameters will include an array:
create or replace function func_my_method(check_in character varying, sel character varying[])...
This sel[] contains values like:
sel[0] := 'col1_name'
sel[1] := 'col5_name'
sel[2] := 'col6_name'
sel[3] := 'col25_name'
So first we have to split the array values into separate variables, and those variables will be spliced into the select statement:
'select ' || col1, col5, col6, col25 || ' from ...'
Briefly: I need to pass an array as a parameter, split the array into separate values, and use those values to build a select statement dynamically.
A bare refcursor can contain any number of columns, although you'll need a special statement to read from it: FETCH ...
CREATE OR REPLACE FUNCTION func_my_method(check_in text, sel text[], ref refcursor)
RETURNS refcursor
LANGUAGE plpgsql
AS $func$
BEGIN
OPEN ref FOR EXECUTE 'SELECT ' || (SELECT string_agg(quote_ident(c), ', ')
FROM unnest(sel) c) || ' FROM ...';
RETURN ref;
END;
$func$;
SELECT func_my_method('check_in', ARRAY['col1', 'col2'], 'sample_name');
FETCH ALL IN sample_name;
http://rextester.com/ZCZT84224
Note: You could omit the refcursor parameter & DECLARE one in your function body. This way PostgreSQL will generate a (non-conflicting) name for the refcursor, which will be returned when calling SELECT func_my_method(...). You'll need that name in the FETCH ... statement.
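A minimal sketch of that variant (the function name func_my_method_auto is made up for illustration; the FROM ... elision mirrors the original):
CREATE OR REPLACE FUNCTION func_my_method_auto(check_in text, sel text[])
RETURNS refcursor
LANGUAGE plpgsql
AS $func$
DECLARE
  ref refcursor; -- no name assigned: PostgreSQL generates one, e.g. <unnamed portal 1>
BEGIN
  OPEN ref FOR EXECUTE 'SELECT ' || (SELECT string_agg(quote_ident(c), ', ')
                                     FROM unnest(sel) c) || ' FROM ...';
  RETURN ref; -- the generated name is the function's return value, to be used in FETCH
END;
$func$;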
Update: If you want to fully qualify (some) columns (i.e. write table name & column too), you'll need either:
CREATE OR REPLACE FUNCTION func_my_method2(check_in text, sel text[], ref refcursor)
RETURNS refcursor
LANGUAGE plpgsql
AS $func$
BEGIN
OPEN ref FOR EXECUTE 'SELECT ' || (SELECT string_agg((SELECT string_agg(quote_ident(c), '.')
FROM unnest(string_to_array(fq, '.')) c), ', ')
FROM unnest(sel) fq) || ' FROM ...';
RETURN ref;
END;
$func$;
SELECT func_my_method2('check_in', ARRAY['col1', 'check_in.col2'], 'sample_name2');
FETCH ALL IN sample_name2;
(this will split the sel parameter into "parts" of the fully qualified name on . -- but it has a disadvantage: the table & column names themselves cannot contain .)
Or:
CREATE OR REPLACE FUNCTION func_my_method3(check_in text, sel text[][], ref refcursor)
RETURNS refcursor
LANGUAGE plpgsql
AS $func$
BEGIN
OPEN ref FOR EXECUTE 'SELECT ' || (SELECT string_agg((SELECT string_agg(quote_ident(sel[i][j]), '.')
FROM generate_subscripts(sel, 2) j), ', ')
FROM generate_subscripts(sel, 1) i) || ' FROM ...';
RETURN ref;
END;
$func$;
SELECT func_my_method3('check_in', ARRAY[ARRAY['check_in', 'col1'], ARRAY['check_in', 'col2']], 'sample_name3');
FETCH ALL IN sample_name3;
(but this has an uncomfortable consequence: since arrays need to be rectangular, all column sub-arrays need to have exactly the same dimensions; so you'll need to provide the table name for all of the columns or for none of them.)
http://rextester.com/JNI24740

Stored procedure syntax with IN condition

(1)
=>CREATE TABLE T1(id BIGSERIAL PRIMARY KEY, name TEXT);
CREATE TABLE
(2)
=>INSERT INTO T1
(name) VALUES
('Robert'),
('Simone');
INSERT 0 2
(3)
SELECT * FROM T1;
id | name
----+--------
1 | Robert
2 | Simone
(2 rows)
(4)
CREATE OR REPLACE FUNCTION test_me(id_list BIGINT[])
RETURNS BOOLEAN AS
$$
BEGIN
PERFORM * FROM T1 WHERE id IN ($1);
IF FOUND THEN
RETURN TRUE;
ELSE
RETURN FALSE;
END IF;
END;
$$
LANGUAGE 'plpgsql';
CREATE FUNCTION
My problem is when calling the procedure. I'm not able to find an example on the net showing how to pass a list of values of type BIGINT (or integer, whatsoever).
I tried what follows without success (syntax errors):
First syntax:
eway=> SELECT * FROM test_me('{1,2}'::BIGINT[]);
ERROR: operator does not exist: bigint = bigint[]
LINE 1: SELECT * FROM T1 WHERE id IN ($1)
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
QUERY: SELECT * FROM T1 WHERE id IN ($1)
CONTEXT: PL/pgSQL function test_me(bigint[]) line 3 at PERFORM
Second syntax:
eway=> SELECT * FROM test_me('{1,2}');
ERROR: operator does not exist: bigint = bigint[]
LINE 1: SELECT * FROM T1 WHERE id IN ($1)
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
QUERY: SELECT * FROM T1 WHERE id IN ($1)
CONTEXT: PL/pgSQL function test_me(bigint[]) line 3 at PERFORM
Third syntax:
eway=> SELECT * FROM test_me(ARRAY [1,2]);
ERROR: operator does not exist: bigint = bigint[]
LINE 1: SELECT * FROM T1 WHERE id IN ($1)
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
QUERY: SELECT * FROM T1 WHERE id IN ($1)
CONTEXT: PL/pgSQL function test_me(bigint[]) line 3 at PERFORM
Any clues about a working syntax?
It's as if the parser was trying to compare a BIGINT to a BIGINT[] in the PERFORM query, but it doesn't make any sense to me...
All your syntax variants to pass an array are correct.
Pass array literal to PostgreSQL function
The problem is with the expression inside the function. You can test with the ANY construct like #Mureinik provided or a number of other syntax variants. In any case run the test with an EXISTS expression:
CREATE OR REPLACE FUNCTION test_me(id_list bigint[])
  RETURNS bool AS
$func$
BEGIN
  IF EXISTS (SELECT 1 FROM t1 WHERE id = ANY ($1)) THEN
    RETURN true;
  ELSE
    RETURN false;
  END IF;
END
$func$ LANGUAGE plpgsql STABLE;
Notes
EXISTS is shortest and most efficient:
PL/pgSQL checking if a row exists - SELECT INTO boolean
The ANY construct applied to arrays is only efficient with small arrays. For longer arrays, other syntax variants are faster. Like:
IF EXISTS (SELECT 1 FROM unnest($1) id JOIN t1 USING (id)) THEN ...
How to do WHERE x IN (val1, val2,…) in plpgsql
Don't quote the language name, it's an identifier, not a string: LANGUAGE plpgsql
Simple variant
While you are returning a boolean value, it can be even simpler. It's probably just for the demo, but as a proof of concept:
CREATE OR REPLACE FUNCTION test_me(id_list bigint[])
RETURNS bool AS
$func$
SELECT EXISTS (SELECT 1 FROM t1 WHERE id = ANY ($1))
$func$ LANGUAGE sql STABLE;
Same result.
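Since all of the call syntaxes from the question are valid (as noted above), each of them works against either fixed version, e.g.:
SELECT test_me('{1,2}'::bigint[]); -- true
SELECT test_me('{1,2}');           -- true
SELECT test_me(ARRAY[1,2]);        -- true (integer[] is cast to bigint[])
SELECT test_me('{999}');           -- false with the sample data (no such id)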
The easiest way to check if an item is in an array is with = ANY:
CREATE OR REPLACE FUNCTION test_me(id_list BIGINT[])
RETURNS BOOLEAN AS
$$
BEGIN
  PERFORM * FROM T1 WHERE id = ANY ($1);
  IF FOUND THEN
    RETURN TRUE;
  ELSE
    RETURN FALSE;
  END IF;
END;
$$
LANGUAGE plpgsql;

Dependency Tracking function

I just wonder if anyone knows how to automate view re-creation after running DROP ... CASCADE?
Right now I first try to drop the view with a classic DROP VIEW myview statement. If I cannot drop the view because other objects still depend on it, I check all the object names that Postgres lists, save their CREATE statements, and then run the drop with CASCADE. Sometimes it's over a dozen objects. Maybe you have an idea for handling this in a more automated way?
Does anybody have such a function?
Next step... (continuation of my previous answer).
The function save_views(objectname text) stores the views depending on objectname (a view or table) in the table saved_views.
The function restore_views() restores the views from the table saved_views.
create or replace function save_views_oid(objectid oid)
returns void language plpgsql as $$
declare
  r record;
begin
  for r in
    select distinct c.oid, c.relname, n.nspname
    from pg_depend d
    join pg_rewrite w on w.oid = d.objid
    join pg_class c on c.oid = w.ev_class
    join pg_namespace n on n.oid = c.relnamespace
    where d.refclassid = 'pg_class'::regclass
      and d.classid = 'pg_rewrite'::regclass
      and d.refobjid = objectid
      and c.oid <> objectid
  loop
    insert into saved_views values (
      'CREATE VIEW ' || r.nspname || '.' || r.relname ||
      ' AS ' || pg_get_viewdef(r.oid, 'f'));
    perform save_views_oid(r.oid);
  end loop;
end; $$;
create or replace function save_views(objectname text)
returns void language plpgsql as $$
begin
  create table if not exists saved_views(viewbody text);
  truncate saved_views;
  perform save_views_oid(objectname::regclass);
end; $$;

create or replace function restore_views()
returns void language plpgsql as $$
declare
  viewtext text;
begin
  for viewtext in
    select viewbody from saved_views
  loop
    execute viewtext;
  end loop;
  drop table saved_views;
end; $$;
Test:
select save_views('my_view'); -- or save_views('my_schema.my_view');
select * from saved_views;
Use:
select save_views('my_view');
drop view my_view cascade;
create view my_view as ...
select restore_views();
The pg_depend table contains all the necessary information, but it is not so easy to interpret. Here is a sketch of a recursive function to retrieve the dependencies of a pg_class object in text format. You can tune the function to your needs (and show us the results :).
create or replace function dependency
  (class_id regclass, obj_id regclass, obj_subid integer, dep_type "char")
returns setof text language plpgsql as $$
declare
  r record;
begin
  return query
    select pg_describe_object(class_id, obj_id, obj_subid)
           || ' (' || dep_type || ')';
  for r in
    select classid, objid, objsubid, deptype
    from pg_depend
    where class_id = refclassid
      and obj_id = refobjid
      and (obj_subid = refobjsubid or obj_subid = 0)
  loop
    return query select dependency(r.classid, r.objid, r.objsubid, r.deptype);
  end loop;
end; $$;
Use:
select dependency('pg_class'::regclass, 'my_view'::regclass, 0, ' ');
select dependency('pg_class'::regclass, 'my_table'::regclass, 0, ' ');
