Splice Array in Postgres

Is it possible to (easily) splice arrays in Postgres? For example, I want to replace all values of 4 with the values 8 and 12, so an array of {2, 4, 7} should become {2, 8, 12, 7}. Perhaps I'm going about this the wrong way, but I need to maintain the integer array column type for these columns. Thanks for any guidance you can give me.

Perhaps UNNEST?
WITH rep(ord, what, with_what) AS (
    VALUES (1, 4, 8),
           (2, 4, 12)
)
SELECT array_agg(COALESCE(with_what, elem) ORDER BY no, ord) AS new_array
FROM (
    SELECT *
    FROM UNNEST('{2, 4, 7}'::INTEGER[]) WITH ORDINALITY AS arr(elem, no)
    LEFT JOIN rep ON arr.elem = rep.what
) AS q;
This way you can easily define a whole set of replacements.

Related

Most computationally efficient way to batch alter values in each array of a 2d array, based on conditions for particular values by indices

Say that I have a batch of arrays and I would like to alter them based on conditions involving particular values located by index.
For example, say that I would like to increase and decrease particular values if the difference between those values is less than two.
For a single 1D array it can be done like this:
import numpy as np
single2 = np.array([8, 8, 9, 10])
if abs(single2[1] - single2[2]) < 2:
    single2[1] = single2[1] - 1
    single2[2] = single2[2] + 1
single2
array([ 8,  7, 10, 10])
But I do not know how to do it for a batch of arrays. This is my initial attempt:
import numpy as np
single1 = np.array([6, 0, 3, 7])
single2 = np.array([8, 8, 9, 10])
single3 = np.array([2, 15, 15, 20])
batch = np.array([
    np.copy(single1),
    np.copy(single2),
    np.copy(single3),
])
if abs(batch[:, 1] - batch[:, 2]) < 2:
    batch[:, 1] = batch[:, 1] - 1
    batch[:, 2] = batch[:, 2] + 1
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Looking at np.any and np.all, they are used to create an array of boolean values, and I am not sure how they could be used in the code snippet above.
My second attempt uses np.where, following the method described here for comparing particular values across a batch of arrays by creating new versions of the arrays with padding values added to the front/back:
https://stackoverflow.com/a/71297663/3259896
In the case of the example, I am comparing values that are right next to each other, so I created copies that shift the arrays forwards and backwards by 1. I also use only the particular slice of the array that I am comparing, since the other numbers would otherwise also be included in the np.where comparison.
batch_ap = np.concatenate(
    (batch[:, 1:2+1], np.repeat(-999, 3).reshape(3, 1)),
    axis=1
)
batch_pr = np.concatenate(
    (np.repeat(-999, 3).reshape(3, 1), batch[:, 1:2+1]),
    axis=1
)
Finally, I do the comparisons and adjust the values:
batch[:, 1:2+1] = np.where(
    abs(batch_ap[:, 1:] - batch_ap[:, :-1]) < 2,
    batch[:, 1:2+1] - 1,
    batch[:, 1:2+1]
)
batch[:, 1:2+1] = np.where(
    abs(batch_pr[:, 1:] - batch_pr[:, :-1]) < 2,
    batch[:, 1:2+1] + 1,
    batch[:, 1:2+1]
)
print(batch)
[[ 6  0  3  7]
 [ 8  7 10 10]
 [ 2 14 16 20]]
I am not sure, though, whether this is the most computationally efficient or the most elegant method for the task. It seems like a lot of operations and code, but I do not have a strong enough mastery of numpy to be certain.
This works:
mask = abs(batch[:,1]-batch[:,2])<2
batch[mask,1] -= 1
batch[mask,2] += 1

How do I sort a multidimensional table in Lua?

I have a table consisting basically of the following:
myTable = {{1, 6.345}, {2, 3.678}, {3, 4.890}}
and I'd like to sort the table by the decimal values.
So I'd like the output to be:
{{2, 3.678}, {3, 4.890}, {1, 6.345}}
If possible, I'd like to use the table.sort() function. Thank you in advance for the help :-)
Given that your table is a sequence, you can use table.sort directly. This function accepts a comparison predicate as its second argument, which prescribes the comparison logic:
require 'table'
myTable = {{1, 6.345}, {2, 3.678}, {3, 4.890}}
table.sort(myTable, function(lhs, rhs) return lhs[2] < rhs[2] end)
Printing the table e.g. as for _, v in ipairs(myTable) do print(v[1], v[2]) end then shows the desired ordering:
2 3.678
3 4.89
1 6.345
The key here is not the number of dimensions of the table to sort, but the fact that it is a sequence, i.e. its keys are the consecutive integers 1..n.

Array difference in postgresql

I have two arrays [1,2,3,4,7,6] and [2,3,7] in PostgreSQL which may have common elements. What I am trying to do is to exclude from the first array all the elements that are present in the second.
So far I have achieved the following:
SELECT array(
    SELECT unnest(array[1, 2, 3, 4, 7, 6])
    EXCEPT
    SELECT unnest(array[2, 3, 7])
);
However, the ordering is not correct as the result is {4,6,1} instead of the desired {1,4,6}.
How can I fix this ?
I finally created a custom function with the following definition (taken from here) which resolved my issue:
create or replace function array_diff(array1 anyarray, array2 anyarray)
returns anyarray language sql immutable as $$
    select coalesce(array_agg(elem), '{}')
    from unnest(array1) elem
    where elem <> all(array2)
$$;
I would use the ORDINALITY option of UNNEST and put an ORDER BY in the array_agg call when converting back to an array. NOT EXISTS is preferred over EXCEPT to keep it simple.
SELECT array_agg(e ORDER BY id)
FROM unnest(array[1, 2, 3, 4, 7, 6]) WITH ORDINALITY AS s1(e, id)
WHERE NOT EXISTS (
    SELECT 1
    FROM unnest(array[2, 3, 7]) AS s2(e)
    WHERE s2.e = s1.e
);
Simpler, with NULL support, and probably faster:
select array(
    select v
    from unnest(array[2, 2, null, 1, 3, 3, 4, 5, null]) with ordinality as t(v, pos)
    where array_position(array[3, 3, 5, 5], v) is null
    order by pos
);
Result: {2,2,null,1,4,null}
Function array_diff() with tests.
Postgres is unfortunately lacking this functionality. In my case, what I really needed was to detect cases where the array difference was not empty. In that specific case you can use the @> operator, which means "Does the first array contain the second?"
ARRAY[1,4,3] @> ARRAY[3,1,3] → t
See the documentation.

Efficiently saving summable array values in RDBMs

I have a dataset where we track engagement per-percent (so 8 people are active at 38%, 7 people are active at 39%, etc.). This gives an array with 100 values, filled with integers.
I need to store this in a postgres table. The only/major requirement is that I need to be able to sum the values for each index to form a new array. Example:
Row 1: [5, 3, 5, ... 7]
Row 2: [2, 5, 3, ... 1]
Sum: [7, 8, 8, ... 8]
The naive way to save these would be 100 individual (BIG)INT columns, which would allow you to sum the values per-column over multiple rows. However, this makes the table very wide (and does not seem like the most efficient way to do it). I have looked into (BIG)INT[100] columns, but I cannot seem to find a good, native way to sum the values. Same thing with json(b) columns (with a native JSON array).
Have I overlooked something? Is there a good, efficient way to do this without completely bloating a table?
The solution using unnest() with ordinality:
with the_table(intarr) as (
    values
        (array[1, 2, 3, 4]),
        (array[1, 2, 3, 4]),
        (array[1, 2, 3, 4])
)
select array_agg(sum order by ordinality)
from (
    select ordinality, sum(unnest)
    from the_table,
    lateral unnest(intarr) with ordinality
    group by 1
) s;

 array_agg
------------
 {3,6,9,12}
(1 row)
Here is one method that seems to work:
select array_agg(sum_aval order by ind)
from (
    select ind, sum(aval) as sum_aval
    from (
        select id, unnest(a) as aval, generate_series(1, 3) as ind
        from (values (1, array[1, 2, 3]), (2, array[3, 4, 5])) v(id, a)
    ) x
    group by ind
) x;
That is, unnest the arrays and generate indexes for them using generate_series(). Then you can aggregate at the index level and then re-combine into an array (using two separate aggregations).

Save integers into array given by first integer

I need to know how to save integers from stdin into an array chosen by the first integer on the line... Ehm... I hope you understand. I will give you an example.
On stdin I have:
0 : [ 1, 2, 3 ]
5 : [ 10, 11, 12, 13]
6 : [ 2, 4, 9 ]
0 : [ 4, 9, 8 ]
5 : [ 9, 6, 7 ]
5 : [ 1 ]
And I need to save these integers into arrays like this:
0={1, 2, 3, 4, 9, 8}
5={10, 11, 12, 13, 9, 6, 7, 1}
6={2, 4, 9}
I absolutely don't know how to do it. The problem is that the number of arrays (in this case 0, 5, 6, so 3 arrays) can be very high, and I need to work efficiently with memory... So I guess I will need something like malloc and free to solve this problem, or am I wrong? The array names (0, 5, 6) can change, and the number of integers in the brackets has no upper limit.
Thank you for any help.
I assume this is homework, and I assume it isn't your first homework assignment, so I won't present a solution but instead some tips that will help you solve it yourself.
Given the input line
5 : [ 10, 11, 12, 13]
I will call "5" the "array name" and 10, 11, 12 and 13 the values to add.
You should implement some system to map array names to indices. A trivial approach would be like this:
size_t num_arrays;
size_t * array_names;
Here, for your example input, num_arrays will end up being 3, with array_names[3] = { 0, 5, 6 }. If you find a new array name, realloc and add the new array name. You also need the actual arrays for the values:
int * * array;
You need to realloc array for each new array name (just like you realloc array_names). array[0] will represent the array named array_names[0] (here, array 0), array[1] the array named array_names[1] (here, array 5), and array[2] the array named array_names[2] (here, array 6).
To access an array, find its index like so:
size_t index;
for (index = 0; index < num_arrays && array_names[index] != search; ++index)
    ; /* after the loop, index == num_arrays means the name was not found */
The second step is easy. Once you have figured out the index, use array[index] to add elements: realloc it (array[index] = realloc(array[index], new_size)) and append the elements there (array[index][old_size + i] = new_value[i]).
Obviously, you need to keep track of the number of elements in your separate arrays as well ;)
Hint: if searching for the array names takes too long, you will have to replace the trivial mapping with a more sophisticated data structure, like a hash map or a binary search tree. The rest of the concept can stay more or less the same.
Should you have problems parsing the input lines, I suggest you open a new question specific to that parsing part.
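Pulling those tips together, here is a minimal sketch of the bookkeeping described above. The helper names find_or_add_name and add_value are made up for illustration; input parsing, error checking, and freeing are left out.

#include <stdlib.h>
#include <stdio.h>

static size_t  num_arrays  = 0;     /* how many distinct array names exist     */
static size_t *array_names = NULL;  /* array_names[i] is the name of array i   */
static int   **array       = NULL;  /* array[i] holds the values of array i    */
static size_t *array_len   = NULL;  /* array_len[i] is its element count       */

/* Find the index for a name, creating a new (empty) array if it is unknown. */
static size_t find_or_add_name(size_t name)
{
    size_t index;
    for (index = 0; index < num_arrays && array_names[index] != name; ++index)
        ;
    if (index == num_arrays) {            /* not found: grow the bookkeeping */
        ++num_arrays;
        array_names = realloc(array_names, num_arrays * sizeof *array_names);
        array       = realloc(array,       num_arrays * sizeof *array);
        array_len   = realloc(array_len,   num_arrays * sizeof *array_len);
        array_names[index] = name;
        array[index]       = NULL;
        array_len[index]   = 0;
    }
    return index;
}

/* Append one value to the array identified by `name`. */
static void add_value(size_t name, int value)
{
    size_t index = find_or_add_name(name);
    array[index] = realloc(array[index], (array_len[index] + 1) * sizeof **array);
    array[index][array_len[index]++] = value;
}

int main(void)
{
    /* First two lines of the example input: 0 : [ 1, 2, 3 ] and 5 : [ 10, 11, 12, 13 ] */
    add_value(0, 1); add_value(0, 2); add_value(0, 3);
    add_value(5, 10); add_value(5, 11); add_value(5, 12); add_value(5, 13);
    printf("array 0 has %zu values, array 5 has %zu\n",
           array_len[find_or_add_name(0)], array_len[find_or_add_name(5)]);
    return 0;
}

Each realloc here grows by a single element, which is fine for a sketch; a real solution would grow capacity geometrically and check the realloc return values.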
In algorithmic terms, you need a map (associative array) from ints to arrays. This was solved long ago in most higher-level languages.
If you have to implement it manually, you have a few options:
a simple "master" array where you store your names 0, 5, 6, 1000000 and map them to indices 0, 1, 2, 3 by doing a linear search each time you have to access one (this becomes too time-consuming when the number of arrays is large);
a hash table: write a simple hash function to map 0, 5, 6, 1000000 (these are called keys) to values less than, say, 1000, allocate an array of 1000 buckets, and keep a small "master" structure for each hash result (a rough sketch follows below);
some kind of tree (e.g. a red-black tree), which may be a bit complex to implement manually.
The last two structures are programming classics and are well described in various articles and books.
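For the hash-table option, a rough sketch could look like the following. NUM_BUCKETS and the helper lookup are made-up names, collisions are handled by chaining within each bucket, and error handling is again omitted.

#include <stdlib.h>
#include <stdio.h>

#define NUM_BUCKETS 1000

struct entry {
    size_t        name;    /* the array name, e.g. 0, 5, 6, 1000000 */
    int          *values;  /* the integers collected for this name  */
    size_t        count;   /* how many values are stored            */
    struct entry *next;    /* next entry in the same bucket (chain) */
};

static struct entry *buckets[NUM_BUCKETS];

/* Trivial hash function: map any name to a bucket index below NUM_BUCKETS. */
static size_t hash(size_t name)
{
    return name % NUM_BUCKETS;
}

/* Find the entry for `name`, creating an empty one if it does not exist yet. */
static struct entry *lookup(size_t name)
{
    struct entry *e = buckets[hash(name)];
    while (e != NULL && e->name != name)
        e = e->next;
    if (e == NULL) {                      /* not found: prepend a new entry */
        e = calloc(1, sizeof *e);
        e->name = name;
        e->next = buckets[hash(name)];
        buckets[hash(name)] = e;
    }
    return e;
}

int main(void)
{
    struct entry *e = lookup(5);          /* creates the entry for name 5 */
    e->values = realloc(e->values, (e->count + 1) * sizeof *e->values);
    e->values[e->count++] = 10;
    printf("name %zu now has %zu value(s)\n", e->name, e->count);
    return 0;
}

The append logic is the same as in the first sketch; only the name-to-array lookup changes from a linear scan over all names to a hashed lookup plus a short scan within one bucket.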
