Selecting arrays in nicely printed format in SQL - database

I'm selecting a two-dimensional array of integers and directing the output to a file. Is there a way to write a PostgreSQL statement that formats the output nicely, so that each integer array that is an element of the 2D array appears on its own line?
Right now I just get this output:
SELECT array FROM table LIMIT 1;
{{0,0,0},{1,1,1},{2,2,2},{3,3,3},{0,0,0},{1,1,1},{2,2,2},{3,3,3}
,{0,0,0},{1,1,1},{2,2,2},{3,3,3},{0,0,0},{1,1,1},{2,2,2},{3,3,3}
,{0,0,0},{1,1,1},{2,2,2},{3,3,3},{0,0,0},{1,1,1},{2,2,2},{3,3,3}}
And I would like to get something more like this:
{0,0,0}
{1,1,1}
{2,2,2}
...
I can do this with some parsing after the query returns, but if it's possible to do it in Postgres itself, that would be ideal.

There are several ways. One way is to cast the array to text and split it up with regexp_split_to_table().
This function is present in PostgreSQL 8.3 or later.
SELECT regexp_split_to_table(trim(my_2d_intarr::text, '{}'), '},{');
Output:
0,0,0
1,1,1
2,2,2
If you want the enclosing brackets (maybe you don't?), add them back like this:
SELECT '{' || regexp_split_to_table(trim(my_2d_intarr::text, '{}'), '},{') || '}';
Output:
{0,0,0}
{1,1,1}
{2,2,2}
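For comparison, the same trim-and-split idea can be done client-side after the query returns, as the question mentions. This is only a minimal Python sketch; the function name split_2d_array_text is hypothetical, and it assumes the array arrives as PostgreSQL's text representation with plain integer elements (no quoting, no NULLs).

```python
def split_2d_array_text(array_text):
    """Split the text form of a 2D integer array into one '{...}' string per row."""
    # Drop the outer and first/last inner braces, like trim(arr::text, '{}').
    inner = array_text.strip("{}")
    if not inner:
        return []
    # Split on the row separator, like regexp_split_to_table(..., '},{'),
    # then add the enclosing braces back to each row.
    return ["{" + row + "}" for row in inner.split("},{")]


for line in split_2d_array_text("{{0,0,0},{1,1,1},{2,2,2}}"):
    print(line)
```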
Alternative:
This should also work with PostgreSQL 8.2 or maybe even earlier, but I did not test that.
SELECT my_2d_intarr[x:x][1:3]
FROM (SELECT generate_series(1, array_upper(my_2d_intarr, 1), 1)::int4 AS x) x
Output:
{{0,0,0}}
{{1,1,1}}
{{2,2,2}}
(You may want to strip some curly brackets ..)
Else, I would write a plpgsql function that loops through the array. Fairly easy.
There is also the related unnest() function, but it returns a row per base element (integer in this case), so it's no use here.
One (fast!) way to output the result: COPY.

Related

Position of the lowest value greater than x in ordered postgresql array (optimization)

I'm looking at the Postgres function array_position(anyarray, anyelement [, int]).
My problem is similar, but I'm looking for the position of the first value in an array that is greater than an element. I'm running this on small arrays, but really large tables.
This works:
CREATE OR REPLACE FUNCTION arr_pos_min(anyarray,real)
RETURNS int LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
'select array_position($1,(SELECT min(i) FROM unnest($1) i where i>$2))';
array_position takes advantage of the fact that my array is ordered, but the second part doesn't. I feel the second part could just return the position directly, without having to re-query.
My arrays are only 100 elements long, but I have to run this millions of times, so I'm looking for a performance gain.
Suggestions appreciated.
This seems to be a bit faster:
CREATE OR REPLACE FUNCTION arr_pos_min(p_input anyarray, p_to_check real)
RETURNS int
AS
$$
select t.idx
from unnest(p_input) with ordinality as t(i, idx)
where t.i > p_to_check
order by t.idx
limit 1
$$
LANGUAGE sql
IMMUTABLE
PARALLEL SAFE
;
The above uses the fact that the values in the array are already sorted. Sorting by the array index is therefore quite fast. I am not sure whether unnest() is guaranteed in this context to return the elements in the order they are stored in the array. If that were the case, you could remove the order by and make it even faster.
I don't think that there is a more efficient solution than yours, except if you write a dedicated C function for that.
Storing large arrays is often a good recipe for bad performance.
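As an illustration of what a dedicated function could exploit, the lookup on a sorted array is a binary search. The sketch below is Python, not PostgreSQL, and only shows the logic; it assumes the array is sorted ascending and returns a 1-based position (as PostgreSQL arrays use by default), or None where the SQL versions would return NULL.

```python
from bisect import bisect_right


def arr_pos_min(values, threshold):
    """Return the 1-based position of the first element greater than threshold."""
    # bisect_right returns the 0-based index of the first element > threshold
    # in a list sorted in ascending order.
    idx = bisect_right(values, threshold)
    # No element is greater than threshold: mirror SQL NULL with None.
    return idx + 1 if idx < len(values) else None


print(arr_pos_min([1.0, 3.0, 5.0, 7.0], 3.0))
```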

How to concatenate multiple ranges within a Match function

I have a list of values that I would like to match against the combination of multiple ranges.
So, for example, my ranges are A1:A100 and B1:B100.
Instead of concatenating A with B in a new column C, i.e.
CONCAT(A1,B1)...CONCAT(A100,B100)
and then matching my value against that new column - I would like to do something like this:
MATCH(value,CONCATENATE(A1:B100),0)
And copy this down a column near my list of values.
I have a feeling this can be done with some sort of array formula...
Yes as an array formula:
=MATCH(value,$A$1:$A$100 & $B$1:$B$100,0)
Being an array formula, it must be confirmed with Ctrl-Shift-Enter instead of Enter when exiting edit mode.
Though the two may seem similar in approach, they are not. CONCATENATE returns a single string, not an array, so MATCH would receive all 200 values in one long string. The formula above returns an array of 100 values, one per row with A and B concatenated, which MATCH can search.
One further note: if performance becomes an issue, array formulas are inherently slower; adding the helper column and using a regular MATCH will improve responsiveness.
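The difference between row-wise concatenation and one long string can be sketched outside Excel. This Python snippet is only an analogy; the function name and sample data are made up for illustration.

```python
def match_concatenated(value, col_a, col_b):
    """Return the 1-based row whose A and B values concatenate to value, else None."""
    # Row-wise concatenation, like $A$1:$A$100 & $B$1:$B$100: one key per row.
    keys = [a + b for a, b in zip(col_a, col_b)]
    try:
        return keys.index(value) + 1
    except ValueError:
        # Plays the role of MATCH returning #N/A.
        return None


col_a = ["x", "y", "z"]
col_b = ["1", "2", "3"]
print(match_concatenated("y2", col_a, col_b))

# Joining everything into a single string gives one value, not one per row,
# so an exact match against an individual key cannot succeed.
one_long_string = "".join(a + b for a, b in zip(col_a, col_b))
print(one_long_string == "y2")
```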
This should work, basically you just need to concatenate it yourself using &
=MATCH(D1,A1:A10&B1:B10,0)
D1 is the value you're trying to look for.
This is an array formula, so remember to press Ctrl+Shift+Enter when you enter it.

Excel: creating an array with n times a constant

I have been looking around for a while but unable to find an answer to my question.
In Excel, what compact formula can I use to create an array made up of a single element repeated n times, where n is an input (potentially hard-coded)?
For example, something that would look like this (the formula below does not work but gives an idea of what I am looking for):
{={"Constant"}*3}
Note: I am not looking for a VBA-based solution.
EDIT: Reading @AxelRichter's answer, I see I should also indicate that the formulas below assume Constant is a number. If Constant is text, then this solution will not work.
Volatile:
=ROW(INDIRECT("1:" & Repts))/ROW(INDIRECT("1" & ":" & Repts)) * Constant
non-Volatile:
=ROW(INDEX($1:$65535,1,1):INDEX($1:$65535,Repts,1))/ROW(INDEX($1:$65535,1,1):INDEX($1:$65535,Repts,1))*Constant
If
Constant = 14
Repts = 3
then
Result = {14;14;14}
The first part of each formula creates an array of 1s repeated Repts times. Then we multiply that array by Constant to get the desired result.
And after reading @MacroMarc's comment, the following non-volatile formula should also work for numbers:
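The arithmetic behind the two formulas above can be sketched in Python. This is an analogy only; repeat_constant is a hypothetical name, and like the formulas it assumes Constant is a number.

```python
def repeat_constant(constant, repts):
    """Build a list holding constant repeated repts times via the ROW/ROW trick."""
    # ROW(...) yields 1, 2, ..., Repts; dividing each row number by itself gives 1.
    ones = [row / row for row in range(1, repts + 1)]
    # Multiplying the array of ones by the constant repeats the constant.
    return [one * constant for one in ones]


print(repeat_constant(14, 3))
```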
=(ROW($A$1:INDEX($A:$A,Repts))>0)*Constant
One could concatenate 1:n empty cells to the "Constant" to create a string array having n items "Constant":
"Constant"&INDEX(XFD:XFD,1):INDEX(XFD:XFD,3)
Here 3 is n.
Used in Formula
=INDEX("Constant"&INDEX(XFD:XFD,1):INDEX(XFD:XFD,3),0)
Evaluate Formula shows that it works.
Here column XFD is used because in most cases this column will be empty and a column which is guaranteed to be empty is needed for this solution.
If you instead use
"Constant"&T(ROW($A$1:INDEX($A:$A,3)))
=INDEX("Constant"&T(ROW($A$1:INDEX($A:$A,3))),0)
the need for an empty column disappears. ROW returns numbers, and T returns an empty string when its argument is not text, so an empty string is concatenated to "Constant" for each of the rows 1:3 (n).
Thanks to @MacroMarc for the hint.
Try:
REPT("Constant", SEQUENCE(3,1,1,0))
Or, if the reference is to a dynamic array:
REPT("Constant", SEQUENCE(A1#,1,1,0))
The dynamic array spills, and has your constant repeated one time.
Using SEQUENCE with a step of 0 is a much cleaner way to make an array of constants. You can choose whether you want rows or columns (or both!) as well.
=SEQUENCE(Repts,1,Constant,0)
I will generally use a sequence (like Claire (above) said). But if you want to provide an output of text objects, I would do it this way:
=IF(SEQUENCE(A1,A2,1,0),A3)
Where:
A1 has the number of rows
A2 has the number of columns
A3 has the thing you want repeated into an array
SEQUENCE creates a matrix of 1s; IF treats each 1 as TRUE, so every cell returns the TRUE expression (the contents of A3).
So, if you wanted a vertical list of 3 items that says "Constant", this would do it:
=IF(SEQUENCE(3,,1,0),"Constant")
If you would prefer it be arranged horizontally instead of vertically, just amend the SEQUENCE function:
=IF(SEQUENCE(,3,1,0),"Constant")

extract "N" sized sequences from an array in R

Suppose I have the following array:
a <- sample(letters,100,replace=TRUE)
Then suppose those letters are ordered in a sequence, I want to extract all possible 'n' sized sequences from that array. For example:
For n=2 I would do: paste0(a[1:99],"->",a[2:100])
for n=3 I would do: paste0(a[1:98],"->",a[2:99],"->",a[3:100])
you get the point. Now, my goal is to create a function that would take as input n and would give me back the corresponding set of sequences of the given length from array a
I was able to do it using loops and all that but I was hoping for a high performance one liner.
I am a bit new to R so I'm not aware of all existing functions.
You can use embed. For embed(a, 3), this gives a matrix with columns
a[3:100]
a[2:99]
a[1:98]
in that order.
To reverse the column order use matrix syntax m[rows, cols]:
res = embed(a, 3)[, 3:1]
If you want arrows printed between the columns, then
do.call(paste, c(split(res, col(res)), sep = " -> "))
is one way. This is probably better than apply(res, 1, something), performance-wise, since this is vectorized while apply would loop over rows.
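The shifted-columns idea behind embed can also be shown in Python. This is a minimal sketch, not a replacement for the R code; the name windows is hypothetical. Each shifted copy of the sequence acts as one column, and zipping the copies yields the rows.

```python
def windows(seq, n, sep=" -> "):
    """Return every run of n consecutive items in seq, joined by sep."""
    # seq[0:], seq[1:], ..., seq[n-1:] correspond to the columns
    # a[1:98], a[2:99], a[3:100] in the question; zip stops at the shortest.
    shifted = (seq[i:] for i in range(n))
    return [sep.join(row) for row in zip(*shifted)]


print(windows(list("abcd"), 3))
```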
As pointed out by @DavidArenburg, this can similarly be done with data.table:
library(data.table)
do.call(paste, c(shift(a, 2:0), sep = " -> "))[-(1:2)]
shift is like embed, except it ...
returns a list instead of a matrix, so we don't need to split by col to paste
pads with missing values to keep the full length, so we need to drop with -(1:2)
I was hoping to say something useful about how to find obscure functions in R, but came up mostly blank on how embed might be found. Maybe...
Go to any HTML help page
Click the "Index" hyperlink at the bottom
Read every single page
?

How to get array from non-array field for INSERT command?

I'm trying to use a SELECT statement together with an INSERT INTO command. Everything would work fine if it weren't for a small problem: some fields of the table are defined as real[], but my input is numeric. Thus, the question:
Is there a function in PostgreSQL to create out of the single numeric input an array of type real (with just one element)?
My setting looks like this:
tempLogTable(..., logValue NUMERIC, ...)
finalLogTable(..., logValues REAL[], ...)
The idea is to insert the tuples from the tempLogTable to the finalLogTable using INSERT INTO ... SELECT .... Unfortunately, because of various reasons the data types are given and I would not like to change these for the moment (not to break anything).
I'm using PostgreSQL 9.2.
SELECT ARRAY[thenumeric::real] FROM the_table;
or
SELECT ARRAY[thenumeric]::real[] FROM the_table;
They're not really any different for a one-element array.
real has limits that numeric doesn't. In particular, comparing real values for equality doesn't work reliably; you should instead check whether two values differ by less than a small (somewhat task-specific) tolerance. It also can't represent values as big or small as numeric can. See the floating point guide among other info on comparing floats. This will be much harder to do right when they're wrapped in arrays.
For the purpose you describe, where it sounds like you are just collecting stats or historical data, that isn't going to be a problem. It usually only turns out to be an issue where people try to write:
WHERE some_real = some_other_real
which will result in surprising and unexpected behaviour.
You should be fine with an INSERT INTO ... SELECT as described.
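To make the equality caveat concrete, here is a small Python sketch. It assumes real behaves like an IEEE 754 single-precision float, which is its usual implementation; to_real and nearly_equal are hypothetical helper names, and the tolerance 1e-6 is an arbitrary illustrative choice.

```python
import struct


def to_real(value):
    """Round a Python float (double precision) to single precision and back."""
    return struct.unpack("f", struct.pack("f", value))[0]


def nearly_equal(a, b, tolerance=1e-6):
    """Compare two floats by absolute difference instead of exact equality."""
    return abs(a - b) <= tolerance


stored = to_real(0.1)
# Exact equality fails because 0.1 is rounded differently in single precision.
print(stored == 0.1)
# A tolerance-based comparison treats the two values as the same.
print(nearly_equal(stored, 0.1))
```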

Resources