How to get distinct array elements with postgres?

I have an array with duplicate values in postgres. For example:
SELECT cardinality(string_to_array('1,2,3,4,4', ',')::int[]) as foo
=> "foo"=>"5"
I would like to get unique elements, for example:
SELECT cardinality(uniq(string_to_array('1,2,3,4,4', ',')::int[])) as foo
=> -- No function matches the given name and argument types. You might need to add explicit type casts.
Can I get unique elements of an array in postgres without using UNNEST ?

I prefer this syntax (about 5% faster)
create or replace function public.array_unique(arr anyarray)
returns anyarray as $body$
select array( select distinct unnest($1) )
$body$ language sql;
Usage:
select array_unique(ARRAY['1','2','3','4','4']);

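For integer arrays, use the intarray extension. Its uniq() only removes adjacent duplicates, so sort the array first unless it is already sorted:
create extension if not exists intarray;
select cardinality(uniq(sort(string_to_array('1,2,3,4,4', ',')::int[]))) as foo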
or the function
create or replace function public.array_unique(arr anyarray)
returns anyarray
language sql
as $function$
select array_agg(distinct elem)
from unnest(arr) as arr(elem)
$function$;
for any array. You can easily modify the function to preserve the original order of the array elements:
create or replace function public.array_unique_ordered(arr anyarray)
returns anyarray
language sql
as $function$
select array_agg(elem order by ord)
from (
select distinct on(elem) elem, ord
from unnest(arr) with ordinality as arr(elem, ord)
order by elem, ord
) s
$function$;
Example:
with my_data(arr) as (values ('{d,d,a,c,b,b,a,c}'::text[]))
select array_unique(arr), array_unique_ordered(arr)
from my_data
array_unique | array_unique_ordered
--------------+----------------------
{a,b,c,d} | {d,a,c,b}
(1 row)
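On the intarray route mentioned at the top of this answer: as far as I know, uniq() only collapses adjacent duplicates, which is why it is usually paired with sort(). A minimal sketch, assuming the intarray extension can be installed; the input array is made up for illustration:

```sql
create extension if not exists intarray;

-- uniq() alone leaves non-adjacent duplicates in place
select uniq('{1,2,2,3,1,1}'::int[]);        -- {1,2,3,1}

-- sorting first brings duplicates together so uniq() can drop them
select uniq(sort('{1,2,2,3,1,1}'::int[]));  -- {1,2,3}
```

Note that sort() gives up the original element order, so this is comparable to array_unique, not to array_unique_ordered.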

Going off of @klin's accepted answer, I modified it to also remove nulls while selecting only the distinct values.
create or replace function public.array_unique_no_nulls(arr anyarray)
returns anyarray
language sql
as $function$
select array_agg(distinct a)
from (
select unnest(arr) a
) alias
where a is not null
$function$;
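A quick usage sketch for the function above, with made-up input. One thing to be aware of: array_agg returns NULL rather than an empty array when no rows remain, so an empty or all-NULL input yields NULL.

```sql
-- duplicates and NULLs are dropped; the result holds only 1 and 2
select array_unique_no_nulls(array[1, null, 2, 2, null]);

-- nothing survives the filter, so array_agg yields NULL, not '{}'
select array_unique_no_nulls(array[null, null]::int[]) is null;  -- true
```

If an empty array is preferred in that case, the array(select distinct ...) form from the first answer is one option, since the ARRAY constructor returns an empty array for an empty subquery.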

Related

How can a PostgreSQL function accept an integer or array of integers for the same function?

I have a function in Postgres 9.6 that accepts an int[] parameter. I'd like to have the function accept a single int as well (and convert it to a single element array if necessary).
CREATE OR REPLACE FUNCTION get_subordinates(inp_persona_ids integer[])
-- Get all subordinates of the people passed in as an array
-- TODO allow a single persona ID (int) to be passed in as inp_persona_ids
RETURNS TABLE (persona_id int) AS
$$
BEGIN
RETURN QUERY(
WITH RECURSIVE children AS (
-- passed in persona_id
SELECT
id AS persona_id,
manager_id
FROM
personas
WHERE
id = ANY(inp_persona_ids)
UNION
-- and all subordinates
SELECT
p.id AS persona_id,
p.manager_id
FROM
personas p
JOIN children c ON p.manager_id = c.persona_id
)
SELECT
children.persona_id
FROM
children
LEFT JOIN
personas on children.persona_id = personas.id
WHERE personas.disabled IS NOT TRUE
);
END;
$$ LANGUAGE plpgsql
How would I change the function definition and also add some conditional logic to test for int and change to ARRAY[int] if necessary?

It is not possible to handle this in a single function, but you can just overload the function with an integer parameter and pass this as an array to your existing function:
CREATE OR REPLACE FUNCTION get_subordinates(inp_persona_id integer)
RETURNS TABLE (persona_id int) AS
$$
BEGIN
RETURN QUERY SELECT * FROM get_subordinates(ARRAY[inp_persona_id]);
END;
$$ LANGUAGE plpgsql;
Perhaps you might also want to check the argument(s) against NULL, this is up to you.

It is possible with a single function using the VARIADIC modifier:
CREATE OR REPLACE FUNCTION get_subordinates(VARIADIC inp_persona_ids int[])
RETURNS TABLE (persona_id int) AS
$func$
WITH RECURSIVE children AS ( -- passed in persona_id
SELECT id AS persona_id, manager_id, disabled
FROM personas
WHERE id = ANY(inp_persona_ids)
UNION ALL -- and all subordinates
SELECT p.id AS persona_id
, p.manager_id
, p.disabled -- both UNION branches must return the same columns
FROM children c
JOIN personas p ON p.manager_id = c.persona_id
)
SELECT c.persona_id
FROM children c
WHERE c.disabled IS NOT TRUE
$func$ LANGUAGE sql;
But you need to add the keyword VARIADIC in the call when providing an array instead of a list:
SELECT * FROM get_subordinates(VARIADIC '{1,2,3}'::int[]);
SELECT * FROM get_subordinates(1,2,3);
SELECT * FROM get_subordinates(1);
If that's not an option you are back to function overloading as suggested in another answer.
See:
Passing multiple values in single parameter
Pass multiple values in single parameter
Asides
Looks like this can be a simpler SQL function.
UNION made no sense. Duplicates can only occur if your tree goes in circles, which would create an endless loop and the rCTE would error out. Use the cheaper UNION ALL.
LEFT JOIN made no sense. The added WHERE forced it to behave like a plain [INNER] JOIN anyway.
But remove the join completely and retrieve the column disabled inside the rCTE.

Why recursive union does not work with composite types in PostgreSQL

I have a table with fields of composite type. When I've tried to perform recursive union with such fields I got an error.
drop type if exists example_t cascade;
create type example_t as (
value text,
key text
);
drop table if exists example cascade;
create table example (
inbound example_t,
outbound example_t,
primary key (inbound, outbound)
);
create or replace function example_fn(_attrs example_t[])
returns table (attr example_t) as $$
with recursive target as (
select outbound
from example
where array[inbound] <@ _attrs
union
select r.outbound
from target as t
inner join example as r on r.inbound = t.outbound
)
select unnest(_attrs)
union
select * from target;
$$ language sql immutable;
select example_fn(array[('foo', 'bar') ::example_t]);
ERROR: could not implement recursive UNION
DETAIL: All column datatypes must be hashable.
CONTEXT: SQL function "example_fn" during startup
SQL state: 0A000
A non-recursive union works fine:
create or replace function example_fn(_attrs example_t[])
returns table (attr example_t) as $$
select unnest(_attrs)
union
select * from example;
$$ language sql immutable;
select example_fn(array[('foo', 'bar') ::example_t]);
I can refactor my function this way to make it work, but it looks weird and is less readable. Is there a better way to do it?
create or replace function example_fn(_attrs example_t[])
returns table (attr example_t) as $$
with recursive target as (
select (outbound).value, (outbound).key
from example
where array[inbound] <@ _attrs
union
select (r.outbound).value, (r.outbound).key
from target as t
inner join example as r on r.inbound = (t.value, t.key) ::example_t
)
select (unnest(_attrs)).*
union
select * from target;
$$ language sql immutable;

There is a thread on the PostgreSQL hackers mailing list with a short explanation by Tom Lane:
In general we consider that a datatype's notion of equality can be defined either by its default btree opclass (which supports sort-based query algorithms) or by its default hash opclass (which supports hash-based query algorithms).
The plain UNION code supports either sorting or hashing, but we've not gotten around to supporting a sort-based approach to recursive UNION. I'm not convinced that it's worth doing ...
As a workaround use union all:
with recursive target as (
select outbound
from example
where inbound = ('a', 'a')::example_t
union all
select r.outbound
from target as t
inner join example as r on r.inbound = t.outbound
)
select *
-- or, if necessary
-- select distinct *
from target
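One caveat with union all: unlike union, it does not discard rows that were already produced, so cyclic data would make the recursion run forever. A minimal sketch of a guard, assuming the same example table; the path column is a hypothetical helper that records the values visited so far:

```sql
with recursive target as (
    select outbound,
           array[outbound] as path          -- visited values so far
    from example
    where inbound = ('a', 'a')::example_t
  union all
    select r.outbound,
           array_append(t.path, r.outbound)
    from target as t
    inner join example as r on r.inbound = t.outbound
    where r.outbound <> all (t.path)        -- stop when a value repeats
)
select distinct outbound
from target;
```

This relies on the record comparison operators rather than hashing, so it should not run into the "must be hashable" restriction, but treat it as a sketch and test it against your data.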

Stored procedure syntax with IN condition

(1)
=>CREATE TABLE T1(id BIGSERIAL PRIMARY KEY, name TEXT);
CREATE TABLE
(2)
=>INSERT INTO T1
(name) VALUES
('Robert'),
('Simone');
INSERT 0 2
(3)
SELECT * FROM T1;
id | name
----+--------
1 | Robert
2 | Simone
(2 rows)
(4)
CREATE OR REPLACE FUNCTION test_me(id_list BIGINT[])
RETURNS BOOLEAN AS
$$
BEGIN
PERFORM * FROM T1 WHERE id IN ($1);
IF FOUND THEN
RETURN TRUE;
ELSE
RETURN FALSE;
END IF;
END;
$$
LANGUAGE 'plpgsql';
CREATE FUNCTION
My problem is when calling the procedure. I'm not able to find an example on the net showing how to pass a list of values of type BIGINT (or integer, whatsoever).
I tried what follows without success (syntax errors):
First syntax:
eway=> SELECT * FROM test_me('{1,2}'::BIGINT[]);
ERROR: operator does not exist: bigint = bigint[]
LINE 1: SELECT * FROM T1 WHERE id IN ($1)
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
QUERY: SELECT * FROM T1 WHERE id IN ($1)
CONTEXT: PL/pgSQL function test_me(bigint[]) line 3 at PERFORM
Second syntax:
eway=> SELECT * FROM test_me('{1,2}');
ERROR: operator does not exist: bigint = bigint[]
LINE 1: SELECT * FROM T1 WHERE id IN ($1)
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
QUERY: SELECT * FROM T1 WHERE id IN ($1)
CONTEXT: PL/pgSQL function test_me(bigint[]) line 3 at PERFORM
Third syntax:
eway=> SELECT * FROM test_me(ARRAY [1,2]);
ERROR: operator does not exist: bigint = bigint[]
LINE 1: SELECT * FROM T1 WHERE id IN ($1)
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
QUERY: SELECT * FROM T1 WHERE id IN ($1)
CONTEXT: PL/pgSQL function test_me(bigint[]) line 3 at PERFORM
Any clues about a working syntax?
It's as if the parser were trying to convert a BIGINT to BIGINT[] in the PERFORM query, but that doesn't make any sense to me...

All your syntax variants to pass an array are correct.
Pass array literal to PostgreSQL function
The problem is with the expression inside the function. You can test with the ANY construct like @Mureinik provided or a number of other syntax variants. In any case, run the test with an EXISTS expression:
CREATE OR REPLACE FUNCTION test_me(id_list bigint[])
RETURNS bool AS
$func$
BEGIN
IF EXISTS (SELECT 1 FROM t1 WHERE id = ANY ($1)) THEN
RETURN true;
ELSE
RETURN false;
END IF;
END
$func$ LANGUAGE plpgsql STABLE;
Notes
EXISTS is shortest and most efficient:
PL/pgSQL checking if a row exists - SELECT INTO boolean
The ANY construct applied to arrays is only efficient with small arrays. For longer arrays, other syntax variants are faster. Like:
IF EXISTS (SELECT 1 FROM unnest($1) id JOIN t1 USING (id)) THEN ...
How to do WHERE x IN (val1, val2,…) in plpgsql
Don't quote the language name, it's an identifier, not a string: LANGUAGE plpgsql
Simple variant
While you are returning a boolean value, it can be even simpler. It's probably just for the demo, but as a proof of concept:
CREATE OR REPLACE FUNCTION test_me(id_list bigint[])
RETURNS bool AS
$func$
SELECT EXISTS (SELECT 1 FROM t1 WHERE id = ANY ($1))
$func$ LANGUAGE sql STABLE;
Same result.
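To connect this back to the error in the question: IN (a, b, c) expects a list of scalar values, so IN ($1) with a single array parameter boils down to id = $1, a bigint compared with a bigint[]. With either version of the function above and the two rows inserted in the question (ids 1 and 2), the calls would look like this:

```sql
-- t1 holds ids 1 and 2 (see the INSERT in the question)
SELECT test_me('{1,2}');       -- true: at least one id is present
SELECT test_me(ARRAY[2, 99]);  -- true: 2 matches
SELECT test_me('{98,99}');     -- false: no id matches
```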

The easiest way to check if an item is in an array is with = ANY:
CREATE OR REPLACE FUNCTION test_me(id_list BIGINT[])
RETURNS BOOLEAN AS
$$
BEGIN
PERFORM * FROM T1 WHERE id = ANY ($1);
IF FOUND THEN
RETURN TRUE;
ELSE
RETURN FALSE;
END IF;
END;
$$
LANGUAGE plpgsql;

Postgres: How to return an integer array from stored function

I want to create a query using something like the following:
select id, array(id_adj(id)) from existingtable
which would be two columns: 1 with the id, and the 2nd column with an array of integers.
The function id_adj returns a set of rows (single column of integers) and is written as follows:
DROP FUNCTION IF EXISTS id_adj(hz_id int);
CREATE FUNCTION id_adj(id int) returns SETOF int AS $$
select b.id
from existingtable a, existingtable b
where a.id != b.id
and a.id=$1
and ST_Distance(a.wkb_geometry, b.wkb_geometry) <= 0.05
$$ LANGUAGE SQL;
The above function works for a single id. For example:
select id_adj(462);
returns a single column with integer values.
I know that the array() function returns an array of values given a query result from a SELECT statement. For example:
select array(select id from existingtable where id<10);
returns an array "{6,5,8,9,7,3,4,1,2}".
But combining the two together does not seem to work. Note that although I'm using a postgis ST_Distance function above, it is not required to test a solution to my problem.
I'm also open to having the function return an array instead of a setof records, but that seemed more complicated at first.

You are missing a select statement
select
id,
array(select id_adj(id))
from existingtable
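Since the question mentions being open to a function that returns an array directly, here is a minimal sketch of that variant. The name id_adj_arr is hypothetical; the table, columns and distance threshold are taken from the question, and PostGIS is assumed for ST_Distance:

```sql
-- hypothetical variant: returns int[] instead of SETOF int
CREATE FUNCTION id_adj_arr(int) RETURNS int[] AS $$
  SELECT array(
    SELECT b.id
    FROM existingtable a
    JOIN existingtable b ON a.id <> b.id
    WHERE a.id = $1
      AND ST_Distance(a.wkb_geometry, b.wkb_geometry) <= 0.05
  )
$$ LANGUAGE sql STABLE;

-- the array column then needs no extra subquery
SELECT id, id_adj_arr(id) FROM existingtable;
```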

Postgres function with text array and select where in query

I need to create a function like this (scaled down to a minimum) where I send an array of strings that should be matched. But I can't get the query to work.
create or replace function bar(x text[]) returns table (c bigint) language plpgsql as $$
begin
return query select count(1) as counter from my_table where my_field in (x);
end;$$;
and call it like this
select * from bar(ARRAY ['a','b']);
I could try to let the parameter x be a single text string and then use something like
return query execute 'select ... where myfield in ('||x||')';
So how would I make it work with the parameter as an array?
Would that be better or worse than passing the parameter as a string?

Yes, an array is the cleaner form. String matching would leave corner cases where separators and patterns combined match ...
To find strings that match any of the given patterns, use the ANY construct:
CREATE OR REPLACE FUNCTION bar(x text[])
RETURNS bigint LANGUAGE sql AS
$func$
SELECT count(*) -- an alias wouldn't be visible outside the function
FROM my_table
WHERE my_field = ANY(x);
$func$;
count(*) is slightly faster than count(1). Same result.
Note, I am using a plain SQL function (instead of plpgsql). Either has its pros and cons.

That's fixed with the help of unnest that converts an array to a set (btw, the function doesn't have to be plpgsql):
CREATE OR REPLACE FUNCTION bar(x text[]) RETURNS BIGINT LANGUAGE sql AS $$
SELECT count(1) AS counter FROM my_table
WHERE my_field IN (SELECT * FROM unnest(x));
$$;

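The error from using the array goes away with
return query select count(1) as counter from my_table where my_field in (array_to_string(x,','));
Note, however, that array_to_string(x, ',') produces a single string such as 'a,b', so this only matches rows where my_field equals that whole string. It does not test membership in the array; my_field = ANY(x) does.
The question of efficiency remains open.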
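It may help to check what that expression actually compares. A small sketch with literal values, no table needed:

```sql
-- array_to_string joins the elements into ONE string
SELECT array_to_string(ARRAY['a','b'], ',');           -- a,b

-- so IN (...) compares against that single string
SELECT 'a' IN (array_to_string(ARRAY['a','b'], ','));  -- false

-- = ANY tests membership in the array itself
SELECT 'a' = ANY (ARRAY['a','b']);                     -- true
```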