PostgreSQL C aggregate function: How to return multiple values in transition function [duplicate]

Is the only way to pass an extra parameter to the final function of a PostgreSQL aggregate to create a special TYPE for the state value?
e.g.:
CREATE TYPE geomvaltext AS (
geom public.geometry,
val double precision,
txt text
);
And then to use this type as the state variable so that the third parameter (text) finally reaches the final function?
Why can't aggregates pass extra parameters to the final function themselves? Is there an implementation reason?
So we could easily construct, for example, aggregates taking a method:
SELECT ST_MyAgg(accum_number, 'COMPUTE_METHOD') FROM blablabla
Thanks

You can define an aggregate with more than one parameter.
I don't know if that solves your problem, but you could use it like this:
CREATE OR REPLACE FUNCTION myaggsfunc(integer, integer, text) RETURNS integer
IMMUTABLE STRICT LANGUAGE sql AS
$f$
SELECT CASE $3
WHEN '+' THEN $1 + $2
WHEN '*' THEN $1 * $2
ELSE NULL
END
$f$;
CREATE AGGREGATE myagg(integer, text) (
SFUNC = myaggsfunc(integer, integer, text),
STYPE = integer
);
It could be used like this:
CREATE TABLE mytab
AS SELECT * FROM generate_series(1, 10) i;
SELECT myagg(i, '+') FROM mytab;
myagg
-------
55
(1 row)
SELECT myagg(i, '*') FROM mytab;
myagg
---------
3628800
(1 row)

I solved a similar issue by making a custom aggregate function that did all the operations at once and stored their states in an array.
CREATE AGGREGATE myagg(integer)
(
INITCOND = '{ 0, 1 }',
STYPE = integer[],
SFUNC = myaggsfunc
);
and its transition function, which has to exist before the aggregate is created:
CREATE OR REPLACE FUNCTION myaggsfunc(agg_state integer[], agg_next integer)
RETURNS integer[] IMMUTABLE STRICT LANGUAGE 'plpgsql' AS $$
BEGIN
agg_state[1] := agg_state[1] + agg_next;
agg_state[2] := agg_state[2] * agg_next;
RETURN agg_state;
END;
$$;
Then I made another function that selects one of the results based on the second argument:
CREATE OR REPLACE FUNCTION myagg_pick(agg_state integer[], agg_fn character varying)
RETURNS integer IMMUTABLE STRICT LANGUAGE 'plpgsql' AS $$
BEGIN
CASE agg_fn
WHEN '+' THEN RETURN agg_state[1];
WHEN '*' THEN RETURN agg_state[2];
ELSE RETURN 0;
END CASE;
END;
$$;
Usage:
SELECT myagg_pick(myagg("accum_number"), 'COMPUTE_METHOD') FROM "mytable" GROUP BY ...
The obvious downside is the overhead of computing every operation instead of just the one requested. For simple operations such as addition or multiplication, however, that should be acceptable in most cases.

You would have to rewrite the final function itself, and in that case you might as well write a set of new aggregate functions, one for each possible COMPUTE_METHOD. If the COMPUTE_METHOD is a data value or implied by a data value, then a CASE statement can be used to select the appropriate aggregate method. Alternatively, you may want to create a custom composite type with fields for accum_number and COMPUTE_METHOD, and write a single new aggregate function that uses this new data type.
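The composite-type approach mentioned above could look roughly like the following. This is only a sketch: the names method_agg_state, method_agg_sfunc, method_agg_ffunc and method_agg, as well as the 'SUM' and 'MEAN' method labels, are made up for illustration. The transition function copies the method argument into the state, so the final function can read it from there.

```sql
-- Hypothetical state type: running total, row count, and the requested method.
CREATE TYPE method_agg_state AS (
    total  double precision,
    n      bigint,
    method text
);

-- Transition function: not STRICT, so it also handles the initial NULL state.
CREATE FUNCTION method_agg_sfunc(state method_agg_state, val double precision, method text)
RETURNS method_agg_state
LANGUAGE sql IMMUTABLE AS
$$
    SELECT ROW(COALESCE(state.total, 0) + COALESCE(val, 0),
               COALESCE(state.n, 0) + (val IS NOT NULL)::int,
               method)::method_agg_state
$$;

-- Final function: picks the result according to the method stored in the state.
CREATE FUNCTION method_agg_ffunc(state method_agg_state)
RETURNS double precision
LANGUAGE sql IMMUTABLE AS
$$
    SELECT CASE state.method
               WHEN 'SUM'  THEN state.total
               WHEN 'MEAN' THEN state.total / NULLIF(state.n, 0)
           END
$$;

CREATE AGGREGATE method_agg(double precision, text) (
    SFUNC     = method_agg_sfunc,
    STYPE     = method_agg_state,
    FINALFUNC = method_agg_ffunc
);

SELECT method_agg(i, 'MEAN') FROM generate_series(1, 10) AS i;
```

The method is stored again on every row, so the value seen by the final function is whatever the last processed row passed in. That is harmless when the second argument is a constant, as in the call above.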

Related

Recursive SQL function returning array has extra elements when self-invocation uses array function

Goal: write a function in PostgreSQL SQL that takes as input an integer array whose each element is either 0, 1, or -1 and returns an array of the same length, where each element of the output array is the sum of all adjacent nonzero values in the input array having the same or lower index.
Example, this input:
{0,1,1,1,1,0,-1,-1,0}
should produce this result:
{0,1,2,3,4,0,-1,-2,0}
Here is my attempt at such a function:
CREATE FUNCTION runs(input int[], output int[] DEFAULT '{}')
RETURNS int[] AS $$
SELECT
CASE WHEN cardinality(input) = 0 THEN output
ELSE runs(input[2:],
array_append(output, CASE
WHEN input[1] = 0 THEN 0
ELSE output[cardinality(output)] + input[1]
END)
)
END
$$ LANGUAGE SQL;
Which gives unexpected (to me) output:
# select runs('{0,1,1,1,1,0,-1,-1,-1,0}');
runs
----------------------------------------
{0,1,2,3,4,5,6,0,0,0,-1,-2,-3,-4,-5,0}
(1 row)
I'm using PostgreSQL 14.4. While I am ignorant of why there are more elements in the output array than the input, the cardinality() in the recursive call seems to be causing it, as also does using array_length() or array_upper() in the same place.
Question: how can I write a function that gives me the output I want (and why is the function I wrote failing to do that)?
Bonus extra: For context, this input array is coming from array_agg() invoked on a table column and the output will go back into a table using unnest(). I'm converting to/from an array since I see no way to do it directly on the table, in particular because WITH RECURSIVE forbids references to the recursive table in either an outer join or subquery. But if there's a way around using arrays (especially with a lack of tail-recursion optimization) that will answer the general question (But I am still very very curious why I'm seeing the extra elements in the output array).
Everything indicates that you have found a reportable Postgres bug. The function should work properly, and a slight modification unexpectedly changes its behavior. Add SELECT; right after $$ to get the function to run as expected, see Db<>fiddle.
A good alternative to a recursive solution is a simple iterative function. Handling arrays in PL/pgSQL is typically simpler and faster than recursion.
create or replace function loop_function(input int[])
returns int[] language plpgsql as $$
declare
val int;
tot int = 0;
res int[];
begin
foreach val in array input loop
if val = 0 then tot = 0;
else tot := tot + val;
end if;
res := res || tot;
end loop;
return res;
end $$;
The OP wrote:
this input array is coming from array_agg() invoked on a table column and the output will go back into a table using unnest().
You can calculate these cumulative sums directly in the table with the help of window functions.
select id, val, sum(val) over w
from (
select
id,
val,
sum((val = 0)::int) over w as series -- running count of zeros: every 0 starts a new group
from my_table
window w as (order by id)
) t
window w as (partition by series order by id)
order by id
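To try the idea without creating a table, here is a self-contained sketch that uses a CTE in place of my_table. The sample values and the run_total alias are made up for illustration. This variant uses the running count of zeros alone as the partition key, so every 0 starts a new group and resets the sum:

```sql
WITH my_table(id, val) AS (
    VALUES (1, 1), (2, 1), (3, 0), (4, -1), (5, -1), (6, 0)
), grouped AS (
    SELECT id,
           val,
           -- number of zeros seen so far, including the current row
           sum((val = 0)::int) OVER (ORDER BY id) AS series
    FROM my_table
)
SELECT id,
       val,
       -- cumulative sum that restarts in every group
       sum(val) OVER (PARTITION BY series ORDER BY id) AS run_total
FROM grouped
ORDER BY id;
```

For these six rows the run_total column comes out as 1, 2, 0, -1, -2, 0.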

Postgres: calling function with text[] param fails with array literal

I have a Postgres function that accepts a text[] as input. For example
create function temp1(player_ids text[])
returns void
language plpgsql
as
$$
begin
update players set player_xp = 0
where id in (player_ids);
-- the body is actually 20 lines long, updating a lot of tables
end;
$$;
and I'm trying to call it, but I keep getting
[42883] ERROR: operator does not exist: text = text[] Hint: No operator matches the given name and argument types. You might need to add explicit type casts. Where: PL/pgSQL function temp1(text[]) line 3 at SQL statement
I have tried these so far
select temp1('{F7AWLJWYQ5BMPKGXLMDNQKQ4NY,AQPBAFKQONGLBKIMCSOD747GY4}');
select temp1('{F7AWLJWYQ5BMPKGXLMDNQKQ4NY,AQPBAFKQONGLBKIMCSOD747GY4}'::text[]);
select temp1(array['F7AWLJWYQ5BMPKGXLMDNQKQ4NY,AQPBAFKQONGLBKIMCSOD747GY4']);
select temp1(array['F7AWLJWYQ5BMPKGXLMDNQKQ4NY,AQPBAFKQONGLBKIMCSOD747GY4']::text[]);
I have to be missing something obvious...how do I call this function with an array literal?
Use = any instead of in:
...
update players set player_xp = 0
where id = any(player_ids);
...
The IN operator acts on an explicit list of values.
expression IN (value [, ...])
When you want to compare a value to each element of an array, use ANY instead.
expression operator ANY (array expression)
Note that there are variants of both constructs for subqueries expression IN (subquery) and expression operator ANY (subquery). The first one was properly used in the other answer though a subquery seems excessive in this case.
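A few standalone statements that illustrate the difference (the literal values are arbitrary):

```sql
-- IN compares against an explicit list of values.
SELECT 'b'::text IN ('a', 'b');

-- ANY compares against each element of a single array value.
SELECT 'b'::text = ANY ('{a,b}'::text[]);
SELECT 'b'::text = ANY (ARRAY['a', 'b']);

-- An array placed inside IN (...) is a one-element list whose only element
-- is the whole array, so Postgres looks for an operator text = text[] and
-- fails with the error from the question. Uncomment to reproduce:
-- SELECT 'b'::text IN ('{a,b}'::text[]);
```

The first three statements return true.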
You can also use the unnest function, which expands the array elements into rows that IN can then consume as a subquery. It is easy to use and performs well. Example:
create function temp1(player_ids text[])
returns void
language plpgsql
as
$$
begin
update players set player_xp = 0
where id in (select pl.id from unnest(player_ids) as pl(id));
-- the body is actually 20 lines long, updating a lot of tables
end;
$$;
With unnest you can also easily cast the array elements to another type.
Example:
update players set player_xp = 0
where id in (select pl.id::integer from unnest(player_ids) as pl(id));

Call set-returning plpgsql function for each row returned from a query

In my Postgres 9.6 database I have the following custom domain and table definition:
create domain lowResData as
float[21];
create table myRawValues (
id text,
myData lowResData,
xAxis lowResData,
primary key(id)
);
The following functions are able to produce the result I want for a single item.
create function getData(_id text) returns float[] as $$
select myData
from myRawValues
where id = _id
$$ language sql;
create function getAxis(_id text) returns float[] as $$
select xAxis
from myRawValues
where id = _id
$$ language sql;
create function myPlotter(myarray float[], myData float[])
returns table (frequency float, amplitude float) as
$$
select *
from unnest(myarray, myData) as u;
$$ language sql;
select * from myPlotter(getAxis('123'), getData('123'));
I want to do the same for all id's produced from executing a particular query and end up with a result like this:
create or replace function allIdLowResData() returns setof float[] as
$body$
declare r text;
begin
for r in (select id from myRawValues where /*SOME CONDITION*/)
loop
return next myPlotter(getAxis(r), getData(r));
end loop;
return;
end
$body$
language plpgsql;
Use a LATERAL join to combine your set-returning function with the rest of the query. Like:
CREATE OR REPLACE FUNCTION allIdLowResData()
RETURNS TABLE (frequency float, amplitude float, id text) AS
$func$
SELECT p.*, r.id
FROM myRawValues r
LEFT JOIN LATERAL myPlotter(r.xAxis, r.myData) p ON true
WHERE /*SOME CONDITION*/
$func$ LANGUAGE sql;
See:
What is the difference between LATERAL and a subquery in PostgreSQL?
Plus, the declared return type of the function (RETURNS) must match what's actually returned.
I am using a simpler SQL function here. You can do the same with PL/pgSQL; lead with RETURN QUERY in that case.
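A PL/pgSQL version might look like the sketch below. It assumes the table myRawValues and the function myPlotter() from the question already exist. The function name allIdLowResData_plpgsql is made up, and the WHERE clause is left out because the condition is not given in the question.

```sql
-- Assumes myRawValues and myPlotter() from the question; the name is hypothetical.
CREATE OR REPLACE FUNCTION allIdLowResData_plpgsql()
  RETURNS TABLE (frequency float, amplitude float, id text)
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- Table-qualify every column: the OUT parameters share names with them.
   RETURN QUERY
   SELECT p.frequency, p.amplitude, r.id
   FROM   myRawValues r
   LEFT   JOIN LATERAL myPlotter(r.xAxis, r.myData) p ON true;
END
$func$;

SELECT * FROM allIdLowResData_plpgsql();
```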
You might be interested in these details about Postgres array definitions, quoted from the manual:
However, the current implementation ignores any supplied array size
limits, i.e., the behavior is the same as for arrays of unspecified
length.
The current implementation does not enforce the declared number of
dimensions either. Arrays of a particular element type are all
considered to be of the same type, regardless of size or number of
dimensions. So, declaring the array size or number of dimensions in
CREATE TABLE is simply documentation; it does not affect run-time behavior.
Meaning, your domain is currently noise without any effect (aside from complications). To actually enforce 1-dimensional arrays with exactly 21 elements in your table, use a CHECK constraint. Like:
CREATE DOMAIN lowResData AS float[21] -- "[21]" is just for documentation
CONSTRAINT dim1_elem21 CHECK (array_ndims(VALUE) = 1 AND array_length(VALUE, 1) = 21);
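A quick way to see the constraint at work, using a throwaway domain (lowres_demo is a made-up name so that it does not clash with the real lowResData):

```sql
-- Hypothetical demo domain with the same CHECK constraint as above.
CREATE DOMAIN lowres_demo AS float[21]
   CONSTRAINT dim1_elem21 CHECK (array_ndims(VALUE) = 1 AND array_length(VALUE, 1) = 21);

-- A 1-dimensional array with 21 elements is accepted.
SELECT array_fill(0::float, ARRAY[21])::lowres_demo;

-- A 3-element array violates the CHECK constraint. Uncomment to see the error:
-- SELECT '{1,2,3}'::lowres_demo;
```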
I would also ditch the functions getData() and getAxis() unless there is more to them.

How to return array of values from PostgreSQL function by INT id

I am trying to create a simple PostgreSQL function that takes an INT parameter and returns an array. The example below does not work, but it should give an idea of what I am trying to get back from the function. Thanks.
CREATE OR REPLACE FUNCTION contact_countries_array(INT)
RETURNS ANYARRAY AS '
SELECT ARRAY[contacts_primarycountry, contacts_othercountry] FROM contacts WHERE contacts_id = $1'
LANGUAGE SQL;
The data type of contacts_primarycountry and contacts_othercountry is integer. contacts_id is unique and integer.
Per the docs:
It is permitted to have polymorphic arguments with a fixed return
type, but the converse is not.
As such, I think your attempt to return anyarray won't work.
Your fields look like text, so I think if you altered it to something like this, it would work:
CREATE OR REPLACE FUNCTION contact_countries_array(INT)
RETURNS text[] AS $$
select array[contacts_primarycountry::text, contacts_othercountry::text]
FROM contacts WHERE contacts_id = $1
$$
LANGUAGE SQL;
This should compile, and it might work, but I'm honestly not sure:
CREATE OR REPLACE FUNCTION contact_countries_array(anyelement)
RETURNS anyarray AS $$
select array[contacts_primarycountry::text, contacts_othercountry::text]
FROM contacts WHERE contacts_id = $1
$$
LANGUAGE SQL;
I think the datatypes would have to match perfectly for this to work, unless you did casting.
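Since the question states that both country columns are integer, another option is to declare a concrete integer[] return type and skip the casts. This is a sketch that assumes the contacts table from the question exists; the function name contact_countries_int_array is made up so that it does not collide with the versions above.

```sql
-- Assumes table contacts with integer columns contacts_id,
-- contacts_primarycountry and contacts_othercountry (as described in the question).
CREATE OR REPLACE FUNCTION contact_countries_int_array(INT)
RETURNS integer[] AS $$
   SELECT ARRAY[contacts_primarycountry, contacts_othercountry]
   FROM   contacts
   WHERE  contacts_id = $1
$$ LANGUAGE sql STABLE;

SELECT contact_countries_int_array(1);
```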
Declaring an array, looping, adding items to the array, and returning the array from a Postgres function:
You could declare an INTEGER array instead of TEXT, which avoids both the cast (counter::TEXT) and the TEXT[] return type. (I added those for reference.)
CREATE OR REPLACE FUNCTION "GetNumbers"(maxNo INTEGER) RETURNS TEXT[] AS $nums$
DECLARE
counter INTEGER := 0;
nums TEXT[] := ARRAY[]::TEXT[];
BEGIN
LOOP
EXIT WHEN counter = maxNo;
counter = counter + 1;
nums = array_append(nums, counter::TEXT);
END LOOP;
RETURN nums;
END ;
$nums$ LANGUAGE plpgsql;
SELECT "GetNumbers"(5); -- {1,2,3,4,5}

Return rows matching elements of input array in plpgsql function

I would like to create a PostgreSQL function that does something like the following:
CREATE FUNCTION avg_purchases( IN last_names text[] DEFAULT '{}' )
RETURNS TABLE(last_name text[], avg_purchase_size double precision)
AS
$BODY$
DECLARE
qry text;
BEGIN
qry := 'SELECT last_name, AVG(purchase_size)
FROM purchases
WHERE last_name = ANY($1)
GROUP BY last_name'
RETURN QUERY EXECUTE qry USING last_names;
END;
$BODY$
But I see two problems here:
It is not clear to me that array type is the most useful type of input.
This is currently returning zero rows when I do:
SELECT avg_purchases($${'Brown','Smith','Jones'}$$);
What am I missing?
This works:
CREATE OR REPLACE FUNCTION avg_purchases(last_names text[] = '{}')
RETURNS TABLE(last_name text, avg_purchase_size float8)
LANGUAGE sql AS
$func$
SELECT last_name, avg(purchase_size)::float8
FROM purchases
WHERE last_name = ANY($1)
GROUP BY last_name
$func$;
Call:
SELECT * FROM avg_purchases('{foo,Bar,baz,"}weird_name''$$"}');
Or (example with dollar-quoting):
SELECT * FROM avg_purchases($x${foo,Bar,baz,"}weird_name'$$"}$x$);
How to quote string literals:
Insert text with single quotes in PostgreSQL
You don't need dynamic SQL here.
While you can wrap it into a plpgsql function (which may be useful), a simple SQL function is doing the basic job just fine.
You had type mismatches:
The result of avg() may be numeric to hold a precise result. A cast to float8 (alias for double precision) makes it work. For perfect precision, use numeric instead.
The OUT parameter last_name must be text instead of text[].
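The call in the question probably also contributes to the empty result. Inside an array literal, single quotes are not delimiters, so they become part of each element. The statements below use the names from the question to illustrate this:

```sql
-- The elements come back with the quotes included.
SELECT unnest($${'Brown','Smith','Jones'}$$::text[]) AS element;

-- So a plain name does not match any element (false) ...
SELECT 'Brown' = ANY ($${'Brown','Smith','Jones'}$$::text[]);

-- ... while it does match once the inner quotes are dropped (true).
SELECT 'Brown' = ANY ('{Brown,Smith,Jones}'::text[]);
```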
VARIADIC
An array is a useful type of input. If it is easier for your client, you can also use a VARIADIC input parameter, which lets you pass the array as a list of elements:
CREATE OR REPLACE FUNCTION avg_purchases(VARIADIC last_names text[] = '{}')
RETURNS TABLE(last_name text, avg_purchase_size float8)
LANGUAGE sql AS
$func$
SELECT last_name, avg(purchase_size)::float8
FROM purchases
JOIN (SELECT unnest($1)) t(last_name) USING (last_name)
GROUP BY 1
$func$;
Call:
SELECT * FROM avg_purchases('foo', 'Bar', 'baz', '"}weird_name''$$"}');
Or (with dollar-quoting):
SELECT * FROM avg_purchases('foo', 'Bar', 'baz', $y$'"}weird_name'$$"}$y$);
Stock Postgres only allows a maximum of 100 elements. This is determined at compile time by the preset option:
max_function_args (integer)
Reports the maximum number of function arguments. It is determined by the value of FUNC_MAX_ARGS when building the server. The default value is 100 arguments.
You can still call it with array notation when prefixed with the keyword VARIADIC:
SELECT * FROM avg_purchases(VARIADIC '{1,2,3, ... 99,100,101}');
For bigger arrays (100+), consider using unnest() in a subquery and joining to it, which tends to scale better:
Optimizing a Postgres query with a large IN
