Different results on printing the values of array - arrays

I am writing a stored procedure that accepts a string as input and converts it to an array, using a comma as the delimiter. Once I have that array, I append the string '_1' to each element, because I need to use the result further on. However, when I execute this stored procedure, RAISE INFO prints the array differently the second time (the values change, obviously, but the format in which they are displayed changes as well):
CREATE OR REPLACE FUNCTION Test1(inputlist text) RETURNS text AS $BODY$
DECLARE
a text;
acceptList text[];
counter integer;
length integer;
BEGIN
acceptList = string_to_array(inputList,',');
SELECT array_length(acceptList,1) into length;
RAISE INFO 'Length : %',length;
RAISE INFO 'AcceptList Print 1 : %',acceptList;
counter = 0;
FOREACH a in ARRAY acceptList LOOP
acceptList[counter] = a||'_1';
counter = counter + 1;
END LOOP;
RAISE INFO 'AcceptList Print 2 : %',acceptList;
END;
$BODY$
LANGUAGE plpgsql;
The output in messages tab would be :
INFO: Length : 4
INFO: AcceptList Print 1 : {INC000073535133,INC000073533828,INC000073535942,INC000073535857}
INFO: Acceptlist Print 2 : [0:4]={INC000073535133_1,INC000073533828_1,INC000073535942_1,INC000073535857_1,INC000073535857}
As you can see in the output above, the values are appended correctly. However, Print 2 also shows the array bounds and an equals sign before the values of the array.
I want to understand why this behavior occurs.

By default, arrays in Postgres are indexed from 1, while in the function body the first index of the modified array is 0. As a result, the original array has been extended by one element. The notation [0:4] = {...} means a five-element array with the non-standard first index of 0. Of course, this could be fixed in a simple way:
...
counter = 1;
FOREACH a in ARRAY acceptList LOOP
...
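To make the effect of the starting index concrete, here is a minimal Python sketch that models a one-dimensional Postgres array with an explicit lower bound. The `PgArray` class and the `append_suffix` helper are hypothetical names invented for this illustration; they only imitate the behavior described above and are not part of Postgres.

```python
class PgArray:
    """Toy model of a one-dimensional Postgres array with an explicit lower bound."""

    def __init__(self, items, lower=1):
        self.items = list(items)
        self.lower = lower  # Postgres arrays start at subscript 1 by default

    def assign(self, index, value):
        upper = self.lower + len(self.items) - 1
        if index < self.lower:
            # Assigning below the lower bound extends the array to the left.
            self.items = [None] * (self.lower - index) + self.items
            self.lower = index
        elif index > upper:
            # Assigning above the upper bound extends the array to the right.
            self.items += [None] * (index - upper)
        self.items[index - self.lower] = value

    def __str__(self):
        body = "{" + ",".join("NULL" if v is None else str(v) for v in self.items) + "}"
        upper = self.lower + len(self.items) - 1
        # The bounds prefix is only printed when the lower bound is not 1.
        return body if self.lower == 1 else f"[{self.lower}:{upper}]={body}"


def append_suffix(values, start):
    """Mimic the loop from the question, with the counter starting at `start`."""
    arr = PgArray(values)
    counter = start
    for a in values:  # assumption: the loop walks the original elements, as the output in the question suggests
        arr.assign(counter, a + "_1")
        counter += 1
    return str(arr)


print(append_suffix(["a", "b", "c", "d"], 0))  # [0:4]={a_1,b_1,c_1,d_1,d}
print(append_suffix(["a", "b", "c", "d"], 1))  # {a_1,b_1,c_1,d_1}
```

Starting the counter at 0 leaves the last original element untouched at subscript 4, which is why the second print in the question shows five elements.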
However, note the comment by horse_with_no_name that indicated how this should be done in Postgres with a single query:
select string_agg(concat(t.element, '_1'), ',' order by t.nr)
from unnest(string_to_array('one,two,three',',')) with ordinality as t(element, nr);
Read about arrays in the documentation.


Recursive SQL function returning array has extra elements when self-invocation uses array function

Goal: write a function in PostgreSQL SQL that takes as input an integer array whose each element is either 0, 1, or -1 and returns an array of the same length, where each element of the output array is the sum of all adjacent nonzero values in the input array having the same or lower index.
Example, this input:
{0,1,1,1,1,0,-1,-1,0}
should produce this result:
{0,1,2,3,4,0,-1,-2,0}
Here is my attempt at such a function:
CREATE FUNCTION runs(input int[], output int[] DEFAULT '{}')
RETURNS int[] AS $$
SELECT
CASE WHEN cardinality(input) = 0 THEN output
ELSE runs(input[2:],
array_append(output, CASE
WHEN input[1] = 0 THEN 0
ELSE output[cardinality(output)] + input[1]
END)
)
END
$$ LANGUAGE SQL;
Which gives unexpected (to me) output:
# select runs('{0,1,1,1,1,0,-1,-1,-1,0}');
runs
----------------------------------------
{0,1,2,3,4,5,6,0,0,0,-1,-2,-3,-4,-5,0}
(1 row)
I'm using PostgreSQL 14.4. While I am ignorant of why there are more elements in the output array than the input, the cardinality() in the recursive call seems to be causing it, as also does using array_length() or array_upper() in the same place.
Question: how can I write a function that gives me the output I want (and why is the function I wrote failing to do that)?
Bonus extra: For context, this input array is coming from array_agg() invoked on a table column and the output will go back into a table using unnest(). I'm converting to/from an array since I see no way to do it directly on the table, in particular because WITH RECURSIVE forbids references to the recursive table in either an outer join or subquery. But if there's a way around using arrays (especially with a lack of tail-recursion optimization) that will answer the general question (But I am still very very curious why I'm seeing the extra elements in the output array).
Everything indicates that you have found a reportable Postgres bug. The function should work properly, and a slight modification unexpectedly changes its behavior. Add SELECT; right after $$ to get the function to run as expected (see Db<>fiddle).
A good alternative to a recursive solution is a simple iterative function. Handling arrays in PL/pgSQL is typically simpler and faster than recursion.
create or replace function loop_function(input int[])
returns int[] language plpgsql as $$
declare
val int;
tot int = 0;
res int[];
begin
foreach val in array input loop
if val = 0 then tot = 0;
else tot := tot + val;
end if;
res := res || tot;
end loop;
return res;
end $$;
Test it in Db<>fiddle.
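For readers who want to check the logic outside the database, the same iteration can be sketched in Python. This is only an illustration of the algorithm; the function name `runs` is borrowed from the question, and returning an empty list for empty input is an assumption of this sketch.

```python
def runs(values):
    """Running sum that resets to 0 whenever a 0 is encountered."""
    total = 0
    result = []
    for v in values:
        # A zero ends the current run; anything else extends it.
        total = 0 if v == 0 else total + v
        result.append(total)
    return result


print(runs([0, 1, 1, 1, 1, 0, -1, -1, 0]))  # [0, 1, 2, 3, 4, 0, -1, -2, 0]
```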
The OP wrote:
this input array is coming from array_agg() invoked on a table column and the output will go back into a table using unnest().
You can calculate these cumulative sums directly in the table with the help of window functions.
select id, val, sum(val) over w
from (
select
id,
val,
case val
when 0 then -1 -- zeros get their own series; 0 is also the series of nonzero rows before the first zero
else sum((val = 0)::int) over w
end as series
from my_table
window w as (order by id)
) t
window w as (partition by series order by id)
order by id
Test it in Db<>fiddle.

Specify string length for substring length parameter in T-SQL

According to this msdn article and using the Substring:
SUBSTRING (value_expression ,start_expression ,length_expression )
length_expression
Is a positive integer or bigint expression that specifies how many
characters of the value_expression will be returned. If
length_expression is negative, an error is generated and the statement
is terminated. If the sum of start_expression and length_expression is
greater than the number of characters in value_expression, the whole
value expression beginning at start_expression is returned.
So if I had:
DECLARE @data varchar(max)
SELECT TOP 1 @data = Data
FROM [SomeDatabase].[dbo].[MyTable]
SELECT SUBSTRING(@data, 10, LEN(@data))
My understanding is that, because SUBSTRING finds that the requested length is longer than the string supplied, it returns everything from start_expression to the end of the string.
For example, if:
SET @data = 'hey there'; -- char length of 9
SELECT SUBSTRING(@data, 4, 20)
This should return there
Is this a particularly bad thing to do?
Are there any caveats to doing this?
Should I be explicit about the length of string to return?
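The clamping behavior quoted from the documentation can be modeled with a short Python sketch. The function `substring` below is a hypothetical stand-in written for this illustration, not the T-SQL implementation; its handling of start positions below 1 is an assumption based on the documented behavior.

```python
def substring(value, start, length):
    """Model of SUBSTRING(value, start, length) with 1-based positions."""
    if length < 0:
        raise ValueError("length must not be negative")
    end = start + length   # 1-based position just past the last requested character
    first = max(start, 1)  # positions before 1 do not exist, but still count toward length
    # Python slicing silently clamps at the end of the string, like SUBSTRING does.
    return value[first - 1:max(end - 1, first - 1)]


data = "hey there"
print(repr(substring(data, 4, 20)))         # ' there'
print(repr(substring(data, 4, len(data))))  # ' there'
print(repr(substring(data, 5, 20)))         # 'there'
```

In this model an oversized length never raises an error, so passing the full string length or a large constant gives the same result. Note that position 4 of 'hey there' is the space, so the result starts with a space.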

counting string-indexed tables in lua

I am trying to count elements in a table that has some elements indexed with strings. When I try to use the # operator, it just ignores string indexed ones. example:
local myTab = {1,2,3}
print(#myTab)
will return 3
local myTab = {}
myTab["hello"] = 100
print(#myTab)
will return 0
mixing them, I tried
local myTab = {1,2,3,nil,5,nil,7}
print(#myTab)
myTab["test"] = try
print(#myTab)
This printed 7 and then 3. The 3 seems right, because I read somewhere that the # operator stops when it finds a nil value (but then why did the first print give 7?).
last, I tried
local myT = {123,456,789}
myT["test"] = 10
print(#myT)
printing 3, not 4
Why?
The rule is simple, from the length operator:
Unless a __len metamethod is given, the length of a table t is only defined if the table is a sequence, that is, the set of its positive numeric keys is equal to {1..n} for some non-negative integer n. In that case, n is its length.
In your example:
local myTab = {1,2,3,nil,5,nil,7}
#myTab is undefined because myTab isn't a sequence, with or without myTab["test"] = try.
local myT = {123,456,789}
myT is a sequence, and the length is 3, with or without myT["test"] = 10
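The sequence rule quoted above can be checked mechanically. The Python sketch below models a Lua table as a dict and treats `None` values as absent keys, which is an assumption of this model. The names `lua_length` and `count_entries` are hypothetical. Counting every key, as one would with a pairs() loop in Lua, is what gives the total number of elements, including the string-indexed ones.

```python
def lua_length(table):
    """Return n if the positive integer keys are exactly {1..n}, else None (length not defined)."""
    keys = {
        k for k, v in table.items()
        if isinstance(k, int) and not isinstance(k, bool) and k > 0 and v is not None
    }
    n = len(keys)
    return n if keys == set(range(1, n + 1)) else None


def count_entries(table):
    """Count every key that holds a non-nil value, whatever its type."""
    return sum(1 for v in table.values() if v is not None)


my_t = {1: 123, 2: 456, 3: 789, "test": 10}
print(lua_length(my_t))     # 3
print(count_entries(my_t))  # 4

holes = {1: 1, 2: 2, 3: 3, 4: None, 5: 5, 6: None, 7: 7}
print(lua_length(holes))    # None
```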

Error for Associative Array : Missing IN OUT Parameter

I'm learning about Collections and trying out Associative Arrays in Oracle 11g. I'm using SQL Developer to write and test my code below and I am getting the error which I can't troubleshoot :
Error Report
Missing IN OUT Parameter at index ::1
Code I have written is as follows:
---SIMPLE collections EXAMPLE
DECLARE
TYPE prospect_towns IS TABLE OF VARCHAR2 (25)
INDEX BY PLS_INTEGER;
a_big_towns prospect_towns; -- associative array
i PLS_INTEGER := 1; -- index for the array
v_counter NUMBER;
v_town VARCHAR2(25);
BEGIN
a_big_towns(1):='Birmingham';
a_big_towns(2):='London':
a_big_towns(3):='Manchester';
-- v_counter := 1;
FOR i IN 1..a_big_towns.COUNT
LOOP <<big towns>>
--v_town := a_big_towns(i);
DBMS_OUTPUT.PUT_LINE('Inside Loop, town is '||a_big_towns(i));
i= a_big_towns.next:
END LOOP<<big towns>>
END;
/
Any ideas what's wrong ?
The second of these lines:
a_big_towns(1):='Birmingham';
a_big_towns(2):='London':
a_big_towns(3):='Manchester';
... has a colon at the end, instead of a semicolon. That's causing the following a_big_towns to be interpreted as a bind variable name by the parser. So it should be:
a_big_towns(2):='London';
Once you get past that, this line isn't needed, and would need := instead of = if it was, and also has a colon instead of a semicolon at the end:
i= a_big_towns.next:
... so remove that completely.
I'm not sure the labels are really adding anything here, but if you do have a label it doesn't need to be repeated at the end, and the name can't have a space in it, so make it:
<<big_towns>>
FOR i IN 1..a_big_towns.COUNT LOOP
And this needs a semicolon at the end:
END LOOP;
This SQL Fiddle compiles.

PostgreSQL PL/pgSQL random value from array of values

How can I declare an array like variable with two or three values and get them randomly during execution?
a := [1, 2, 5] -- sample sake
select random(a) -- returns random value
Any suggestion where to start?
Try this one:
select (array['Yes', 'No', 'Maybe'])[floor(random() * 3 + 1)];
Updated 2023-01-10 to fix the broken array literal. Made it several times faster while being at it:
CREATE OR REPLACE FUNCTION random_pick()
RETURNS int
LANGUAGE sql VOLATILE PARALLEL SAFE AS
$func$
SELECT ('[0:2]={1,2,5}'::int[])[trunc(random() * 3)::int];
$func$;
random() returns a value x where 0.0 <= x < 1.0. Multiply by 3 and truncate it with trunc() (slightly faster than floor()) to get 0, 1, or 2 with exactly equal chance.
Postgres array subscripts are 1-based by default (as per the SQL standard), so this would be off by one. We could add 1 every time, but for efficiency I declare the array with a lower bound of 0 instead, which is slightly faster still. See:
Normalize array subscripts so they start with 1
The manual on mathematical functions.
PARALLEL SAFE for Postgres 9.6 or later. See:
PARALLEL label for a function with SELECT and INSERT
When to mark functions as PARALLEL RESTRICTED vs PARALLEL SAFE?
You can use the plain SELECT statement if you don't want to create a function:
SELECT ('[0:2]={1,2,5}'::int[])[trunc(random() * 3)::int];
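The index arithmetic is easy to verify in isolation. The following Python sketch mirrors the expression `trunc(random() * 3)` against a zero-based list. The function name and the optional `rng` parameter are assumptions made for this illustration.

```python
import math
import random


def random_pick(items, rng=random):
    """Pick one element using trunc(random() * n) as a zero-based index."""
    # rng.random() is in [0.0, 1.0), so the truncated product is in 0..n-1.
    return items[math.trunc(rng.random() * len(items))]


rng = random.Random(42)  # seeded only to make the demonstration repeatable
picks = [random_pick([1, 2, 5], rng) for _ in range(1000)]
print(sorted(set(picks)))  # [1, 2, 5]
```

Because Python lists are zero-based, no offset is needed here, which corresponds to declaring the Postgres array as `[0:2]`.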
Erwin Brandstetter answered the OP's question well enough. However, for others who want to understand how to randomly pick elements from more complex arrays (like me some two months ago), I expanded his function:
CREATE OR REPLACE FUNCTION random_pick( a anyarray, OUT x anyelement )
RETURNS anyelement AS
$func$
BEGIN
IF a = '{}' THEN
x := NULL::TEXT;
ELSE
WHILE x IS NULL LOOP
x := a[floor(array_lower(a, 1) + (random()*( array_upper(a, 1) - array_lower(a, 1)+1) ) )::int];
END LOOP;
END IF;
END
$func$ LANGUAGE plpgsql VOLATILE RETURNS NULL ON NULL INPUT;
A few assumptions:
this is not only for integer arrays, but for arrays of any type
we ignore NULL data; NULL is returned only if the array is empty or if NULL is inserted (values of other non-array types produce an error)
the array doesn't need to be formatted as usual - the array index may start and end anywhere, may have gaps etc.
this is for one-dimensional arrays
Other notes:
without the first IF statement, empty array would lead to an endless loop
without the loop, gaps and NULLs would make the function return NULL
omit both array_lower calls if you know that your arrays start at zero
with gaps in the index, you will need array_upper instead of array_length; without gaps, it's the same (not sure which is faster, but they shouldn't be much different)
the +1 after second array_lower serves to get the last value in the array with the same probability as any other; otherwise it would need the random()'s output to be exactly 1, which never happens
this is considerably slower than Erwin's solution, and likely to be overkill for your needs; in practice, most people would mix an ideal cocktail from the two
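The notes above (arbitrary lower bound, the +1 on the range, skipping NULLs, guarding against an endless loop) can be sketched in Python as follows. The function `random_pick_non_null` and its `lower` parameter are hypothetical names for this illustration. Unlike the PL/pgSQL version, the sketch also returns `None` when every element is `None`, which is an added assumption to keep it from looping forever.

```python
import math
import random


def random_pick_non_null(items, lower=1, rng=random):
    """Pick a random non-None element from a list whose first subscript is `lower`."""
    if not items or all(v is None for v in items):
        return None  # without this guard the loop below would never end
    upper = lower + len(items) - 1
    while True:
        # The +1 gives the last subscript the same chance as every other one.
        subscript = math.floor(lower + rng.random() * (upper - lower + 1))
        value = items[subscript - lower]
        if value is not None:
            return value


rng = random.Random(7)  # seeded only to make the demonstration repeatable
values = [None, "a", None, "b"]  # modeled as subscripts -2..1
picks = {random_pick_non_null(values, lower=-2, rng=rng) for _ in range(200)}
print(sorted(picks))  # ['a', 'b']
```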
Here is another way to do the same thing
WITH arr AS (
SELECT '{1, 2, 5}'::INT[] a
)
SELECT a[1 + floor((random() * array_length(a, 1)))::int] FROM arr;
You can change the array to any type you would like.
CREATE OR REPLACE FUNCTION pick_random( members anyarray )
RETURNS anyelement AS
$$
BEGIN
RETURN members[trunc(random() * array_length(members, 1) + 1)];
END
$$ LANGUAGE plpgsql VOLATILE;
or
CREATE OR REPLACE FUNCTION pick_random( members anyarray )
RETURNS anyelement AS
$$
SELECT (array_agg(m1 order by random()))[1]
FROM unnest(members) m1;
$$ LANGUAGE SQL VOLATILE;
For bigger datasets, see:
http://blog.rhodiumtoad.org.uk/2009/03/08/selecting-random-rows-from-a-table/
http://www.depesz.com/2007/09/16/my-thoughts-on-getting-random-row/
https://blog.2ndquadrant.com/tablesample-and-other-methods-for-getting-random-tuples/
https://www.postgresql.org/docs/current/static/functions-math.html
CREATE FUNCTION random_pick(p_items anyarray)
RETURNS anyelement AS
$$
SELECT unnest(p_items) ORDER BY RANDOM() LIMIT 1;
$$ LANGUAGE SQL;
