Postgresql: Filter array by series - arrays

Is there any way to filter an array column by a series?
It's simpler to explain with an example.
Imagine I have this data:
table: data
id tag
1 {a,b,c,d}
2 {a,c,d,b}
3 {c,d,a,b}
4 {d,c,b,a}
5 {d,a,b,c}
6 {d,a,c,b}
Now I want to get all rows, which have ["a", "b"] in that order and no items in between:
SELECT id from data where tags ???? ["a", "b"]
That query should return: 1,3,5
UPDATE 1:
After taking a look at array_position and array_positions: https://www.postgresql.org/docs/10/functions-array.html
I wrote this query:
select id
from data
where 'a' = ANY(tags)
and 'b' = ANY(tags)
and (array_position(tags, 'a') + 1) = any(array_positions(tags, 'b' ))
Which works as expected
UPDATE 2:
As @klin commented, this would produce a wrong result if 'a' can appear multiple times, for example {a,a,b,c,d}. So this is a more generic answer:
select *
from data
where 'a' = any(tags)
and 'b' = any(tags)
and (
array_position(tags, 'a') + 1 = any(array_positions(tags, 'b' ))
or array_position(tags, 'b') - 1 = any(array_positions(tags, 'a' )))

You can use a regular expression on the text representation of the arrays.
select *
from my_table
where tag::text ~ '[\{,]a,b[,\}]'
Db<>Fiddle.
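As a cross-check that does not depend on PostgreSQL, the condition "a immediately followed by b" can be sketched in Python. This is only an illustration: the helper name contains_run and the rows dictionary are made up here, and the data is the sample table from the question.

```python
def contains_run(tags, run):
    """Return True if `run` occurs in `tags` as a contiguous slice."""
    n = len(run)
    return any(tags[i:i + n] == run for i in range(len(tags) - n + 1))


# Sample rows from the question, keyed by id (hypothetical in-memory copy).
rows = {
    1: ["a", "b", "c", "d"],
    2: ["a", "c", "d", "b"],
    3: ["c", "d", "a", "b"],
    4: ["d", "c", "b", "a"],
    5: ["d", "a", "b", "c"],
    6: ["d", "a", "c", "b"],
}

matching_ids = [
    row_id for row_id, tags in rows.items() if contains_run(tags, ["a", "b"])
]
print(matching_ids)  # [1, 3, 5]
```

Because every start position is tested, the check also accepts arrays with repeated elements such as {a,c,b,a,b}, which a test based only on the first position of each element would miss.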

Related

count jsonb array with condition in postgres

I have a postgres database where some column data are stored as follow:
guest_composition
charging_age
[{"a": 1, "b": 1, "c": 1, "children_ages": [10, 5, 2, 0.1]}]
3
[{"a": 1, "b": 1, "c": 1, "children_ages": [2.5, 1, 4]}]
3
I want to go over the children_ages array and return the count of children who are above the age of 3. I am having a hard time using the array data because it is returned as jsonb and not as an int array.
The first row should return 2 because there are 2 children above the age of 3. The second row should return 1 because there is 1 child above the age of 3.
I have tried the following but it didn't work:
WITH reservation AS (SELECT jsonb_array_elements(reservations.guest_composition)->'children_ages' as children_ages, charging_age FROM reservations
SELECT (CASE WHEN (reservations.charging_age IS NOT NULL AND reservation.children_ages IS NOT NULL) THEN SUM( CASE WHEN (reservation.children_ages)::int[] >=(reservations.charging_age)::int THEN 1 ELSE 0 END) ELSE 0 END) as children_to_charge
You can extract an array of all child ages above 3 using a SQL JSON path function:
select jsonb_path_query_array(r.guest_composition, '$[*].children_ages[*] ? (@ > 3)')
from reservations r;
The length of that array is then the count you are looking for:
select jsonb_array_length(jsonb_path_query_array(r.guest_composition, '$[*].children_ages[*] ? (@ > 3)'))
from reservations r;
It's unclear to me if charging_age is a column and could change in every row. If that is the case, you can pass a parameter to the JSON path function:
select jsonb_path_query_array(
r.guest_composition, '$[*].children_ages[*] ? (@ > $age)',
jsonb_build_object('age', charging_age)
)
from reservations r;
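To sanity-check the expected counts outside the database, the same filter can be sketched in Python. This is a minimal illustration, not the JSON path implementation: the function name count_children_above is made up, and the two rows are copied from the question.

```python
import json


def count_children_above(guest_composition, charging_age):
    """Count ages in every children_ages list that exceed charging_age."""
    groups = json.loads(guest_composition)
    return sum(
        1
        for group in groups
        for age in group.get("children_ages", [])
        if age > charging_age
    )


# (guest_composition, charging_age) pairs taken from the question.
rows = [
    ('[{"a": 1, "b": 1, "c": 1, "children_ages": [10, 5, 2, 0.1]}]', 3),
    ('[{"a": 1, "b": 1, "c": 1, "children_ages": [2.5, 1, 4]}]', 3),
]

counts = [count_children_above(doc, age) for doc, age in rows]
print(counts)  # [2, 1]
```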

How to select each value of array

Consider following case
Table : tab1
id serial primary key
arr int[]
Now I want to select each value of arr.
SELECT * FROM (SELECT arr FROM tab1) AS tab2
I need some kind of iteration over the array.
e.g.
id arr
-----------------------------
1 [1,2]
2 [5,6,8]
So I could get result as
arr val
-------------------------------
[1,2] 1
[1,2] 2
[5,6,8] 5
[5,6,8] 6
[5,6,8] 8
Use unnest() for that:
WITH array_data(id,arr) AS ( VALUES
(1,ARRAY[1,2]),
(2,ARRAY[5,6,8])
)
SELECT arr,unnest(arr) AS val
FROM array_data;
I'm not sure I understood the question correctly, but this should give you everything you need:
select id,
unnest(arr),
array_to_string(arr,','),
array_length(arr, 1)
from array_data;
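For readers who think in terms of loops, what unnest() does to each row can be pictured with a short Python sketch. The list array_data below mirrors the VALUES from the first answer; it is only an analogy for the row expansion, not how PostgreSQL implements it.

```python
# (id, arr) rows, mirroring the VALUES list in the first answer.
array_data = [
    (1, [1, 2]),
    (2, [5, 6, 8]),
]

# One output row per array element, repeating the source array each time.
unnested = [(arr, val) for _id, arr in array_data for val in arr]

for arr, val in unnested:
    print(arr, val)
```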

How to pass elements to a different cell array given a specific condition

I am working with Matlab, and I have two different cell arrays with several elements.
A primary and a secondary one. There is one element that is common in both cells, although the number of rows and order is not equal. What I would like is for X extra elements from the secondary cell to ‘pass’ to the primary one every time a condition is verified. The condition would be if column Y (from primary cell) and Z (from secondary cell) match. For instance:
Primary cell array:
ABC 970508 …
FED 970524 …
BAC 970601 …
IGH 970606 …
Secondary cell array
IGH FINANCE BANK1
FED HEALTH PILLS
ABC FINANCE BANK3
What I would like to get in the ‘new’ primary cell array:
ABC 970508 FINANCE BANK3
FED 970524 HEALTH PILLS
BAC 970601 …
IGH 970606 FINANCE BANK1
Can anyone help me?
How would you like a vectorized solution with bsxfun?
Code
%%// a1 and a2 are primary and secondary cell arrays respectively
a1 ={
'a' 'ABC is correct' '970508'
'bb' 'FED' '970524'
'dwd' 'BAC' '970601'
'hoi' 'IGH' '970606'}
a2 = {
'what' 'gap' 'IGH' 'FINANCE BANK1'
'nope' 'seal' 'FED' 'HEALTH PILLS'
'yes' 'solo' 'ABC is correct' 'FINANCE BANK3'}
X = 1;%// Number of extra columns to append from secondary cell array
Y = 2;%%// Column number from primary cell array to choose from
Z = 3;%%// Column number from secondary cell array to choose from
a1col = char(a1(:,Y))
a2col = char(a2(:,Z))
ad1 = size(a2col,2)-size(a1col,2)
a1col = [a1col repmat(' ',size(a1,1),ad1)]
a2col = [a2col repmat(' ',size(a2,1),-ad1)]
tt1 = a1col-'0'
tt2 = a2col-'0'
tt3 = permute(tt2,[3 2 1])
p1 = bsxfun(@eq,tt1,tt3)
p2 = squeeze(all(p1,2))
[v1,v2] = max(p2,[],2)
%// out is the desired output
out = cell(size(a1,1),size(a1,2)+X)
out(:,1:size(a1,2)) = a1
out(v1,:) = horzcat(a1(v1,:),a2(v2(v1),Z+1:Z+X))
Output
a1 =
'a' 'ABC is correct' '970508'
'bb' 'FED' '970524'
'dwd' 'BAC' '970601'
'hoi' 'IGH' '970606'
a2 =
'what' 'gap' 'IGH' 'FINANCE BANK1'
'nope' 'seal' 'FED' 'HEALTH PILLS'
'yes' 'solo' 'ABC is correct' 'FINANCE BANK3'
out =
'a' 'ABC is correct' '970508' 'FINANCE BANK3'
'bb' 'FED' '970524' 'HEALTH PILLS'
'dwd' 'BAC' '970601' []
'hoi' 'IGH' '970606' 'FINANCE BANK1'
I believe that what you want to do would be an inner join if you were working with databases. Joining cell arrays is discussed more fully in Join Matrices in MATLAB, but applying it to your specific case the following code should give you what you want:
d1 = {'ABC' 970508
'FED' 970524
'BAC' 970601
'IGH' 970606};
d2 = {'IGH' 'FINANCE' 'BANK1'
'FED' 'HEALTH' 'PILLS'
'ABC' 'FINANCE' 'BANK3'};
%// get all possible keys, and convert them to indices starting at 1
[keys,~,ind] = unique( [d1(:,1);d2(:,1)] );
%// inner join
ind1 = ind(1:size(d1,1));
ind2 = ind(size(d1,1)+1:end);
%// loc1 flags rows of d1 that have a match; pos is the matching row in d2
[loc1,pos] = ismember(ind1, ind2);
innerJoin = cell(sum(loc1),4);
innerJoin(:,1:2) = d1(loc1,1:2);
%// index d2 through pos so each row is paired with its own key
innerJoin(:,3:4) = d2(pos(loc1),2:3);
This gives
innerJoin =
'ABC' [970508] 'FINANCE' 'BANK3'
'FED' [970524] 'HEALTH' 'PILLS'
'IGH' [970606] 'FINANCE' 'BANK1'
It also appears that recent versions of MATLAB have a table type which has a native innerjoin function, so you could use these instead of cell arrays. See the Mathworks documentation. However, this isn't in my version of MATLAB, which is 2012a.
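The output requested in the question keeps the unmatched BAC row, which is closer to a left join than an inner join. The lookup logic for that variant can be sketched in Python (not MATLAB; the names left_join, primary and secondary are invented for this illustration, and the elided "…" columns from the question are left out):

```python
# Rows from the question; the first field of each row is the join key.
primary = [
    ("ABC", 970508),
    ("FED", 970524),
    ("BAC", 970601),
    ("IGH", 970606),
]
secondary = [
    ("IGH", "FINANCE", "BANK1"),
    ("FED", "HEALTH", "PILLS"),
    ("ABC", "FINANCE", "BANK3"),
]


def left_join(primary, secondary):
    """Append the secondary row's extra fields to each matching primary row."""
    extras_by_key = {row[0]: row[1:] for row in secondary}
    # Rows without a match are kept unchanged (empty tuple appended).
    return [row + extras_by_key.get(row[0], ()) for row in primary]


joined = left_join(primary, secondary)
for row in joined:
    print(row)
```

Building the key lookup once means each primary row is matched by key rather than by position, so the differing row order of the two inputs does not matter.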

Mapping algorithm in C

Scala can express a mapping (ADT), so we can write a pair like ('A', 3) of type (Char, Int).
How about in C?
I want to build such mappings, check all their relations, and compare two maps.
'a' = 1, 'b' = 3, 'c' = 4 is the mapping built from abbbcccc,
and 'e' = 1, 'b' = 3, 'g' = 4 is the mapping built from bbbegggg.
I want to find relations like these: ('a', 1) is not in the map ('e' = 1, 'b' = 3, 'g' = 4), but
('b', 3) is in the map ('e' = 1, 'b' = 3, 'g' = 4), so count++.
How can I build maps like these? Can I do it with an array?
Not in plain C, no.
You could implement one with an array or two, but you would have to implement either a hashing algorithm, or some kind of comparison and search algorithm. Alternatively you could use some kind of search tree to implement it.
If you don't want to write a map data type, you will have to use a library with that functionality. GLib contains one: http://developer.gnome.org/glib/2.30/glib-Hash-Tables.html
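Whatever container is chosen, the comparison itself is small. The sketch below uses Python's built-in Counter purely to show the logic that a C version would have to reproduce; the function name shared_pairs is made up. It assumes the numbers in the question are per-character occurrence counts. For single-byte keys, one possible C stand-in for the map would be an int array indexed by the character value, which is a suggestion rather than something taken from the answer above.

```python
from collections import Counter


def shared_pairs(first, second):
    """Return the (char, count) pairs that appear in both strings' maps."""
    first_counts = Counter(first)
    second_counts = Counter(second)
    return sorted(
        (ch, n) for ch, n in first_counts.items() if second_counts.get(ch) == n
    )


common = shared_pairs("abbbcccc", "bbbegggg")
print(common, len(common))  # [('b', 3)] 1
```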

Selecting Consecutive Entries with LINQ to Entities

I have a database table with rows that each contain a sequential index. I want to select groups of rows that are consecutive based upon this index column. For example, if I had rows with the following index values:
1
3
4
5
7
9
10
11
12
15
16
and I wanted to select all groups with 3 consecutive indices (this number will vary). I would get the following groups:
3, 4, 5
9, 10, 11
10, 11, 12
Basically, I'm trying to achieve something similar to the question posed here:
selecting consecutive numbers using SQL query
However, I want to implement this with LINQ to Entities, not actual SQL. I would also prefer not to use stored procedures, and I don't want to do any sort of ToList/looping approach.
Edit: Groups with more than the requested consecutive elements don't necessarily need to be split apart. i.e. in the previous example, a result of 9, 10, 11, 12 would also be acceptable.
So I think I've come up with a pretty good solution modeled after Brian's answer in the topic I linked to.
var q = from a in query
from b in query
where a.Index < b.Index
&& b.Index < a.Index + 3
group b by new { a.Index }
into myGroup
where myGroup.Count() + 1 == 3
select myGroup.Key.Index;
Change 3 to the number of consecutive rows you want. This gives you the first index of every group of consecutive rows. Applied to the original example I provided, you would get:
3
9
10
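The expected result can be checked independently of LINQ with a few lines of Python. This is a sketch of the same idea (an index starts a group when the next count - 1 indices all exist); the function name consecutive_group_starts is invented for the example.

```python
def consecutive_group_starts(indices, count):
    """Return every index i such that i, i+1, ..., i+count-1 are all present."""
    present = set(indices)
    return [
        i
        for i in sorted(present)
        if all(i + k in present for k in range(1, count))
    ]


# Index values from the question.
indices = [1, 3, 4, 5, 7, 9, 10, 11, 12, 15, 16]
starts = consecutive_group_starts(indices, 3)
print(starts)  # [3, 9, 10]
```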
I think this might work pretty efficiently (C# though):
int[] query = { 1, 3, 4, 5, 7, 9, 10, 11, 12, 15, 16 };
int count = 3;
List<List<int>> numbers = query
.Where(p => query.Where(q => q >= p && q < p + count).Count() == count)
.Select(p => Enumerable.Range(p, count).ToList())
.ToList();
using (var model = new AlbinTestEntities())
{
var triples = from t1 in model.Numbers
from t2 in model.Numbers
from t3 in model.Numbers
where t1.Number + 1 == t2.Number
where t2.Number + 1 == t3.Number
select new
{
t1 = t1.Number,
t2 = t2.Number,
t3 = t3.Number,
};
foreach (var res in triples)
{
Console.WriteLine(res.t1 + ", " + res.t2 + ", " + res.t3);
}
}
It generates the following SQL
SELECT
[Extent1].[Number] AS [Number],
[Extent2].[Number] AS [Number1],
[Extent3].[Number] AS [Number2]
FROM [dbo].[Numbers] AS [Extent1]
CROSS JOIN [dbo].[Numbers] AS [Extent2]
CROSS JOIN [dbo].[Numbers] AS [Extent3]
WHERE (([Extent1].[Number] + 1) = [Extent2].[Number]) AND (([Extent2].[Number] + 1) = [Extent3].[Number])
It might be even better to use an inner join like this
using (var model = new AlbinTestEntities())
{
var triples = from t1 in model.Numbers
join t2 in model.Numbers on t1.Number + 1 equals t2.Number
join t3 in model.Numbers on t2.Number + 1 equals t3.Number
select new
{
t1 = t1.Number,
t2 = t2.Number,
t3 = t3.Number,
};
foreach (var res in triples)
{
Console.WriteLine(res.t1 + ", " + res.t2 + ", " + res.t3);
}
}
but when I compare the resulting queries in Management Studio, they generate the same execution plan and take exactly the same time to execute. I only have this limited dataset, so if yours is larger you might compare the performance of both on it and pick the better one if they differ.
The following code will find every "root".
var query = this.commercialRepository.GetQuery();
var count = 2;
for (int i = 0; i < count; i++)
{
query = query.Join(query, outer => outer.Index + 1, inner => inner.Index, (outer, inner) => outer);
}
var dummy = query.ToList();
It will only find the first item in each group, so you will either have to modify the query to remember the other ones, or make a second query based on the fact that you have the roots and from those you know which indexes to get. I'm sorry I couldn't wrap it up before I had to go, but maybe it helps a bit.
PS. If count is 2, as in this case, it finds groups of 3.
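Why repeating the self-join narrows the set down to the roots can be seen in a small Python model, where each join round keeps only the indices whose successor survived the previous round. This models the idea with plain sets and is not what LINQ to Entities executes; the function name roots_by_repeated_join is hypothetical.

```python
def roots_by_repeated_join(indices, joins):
    """Apply `joins` rounds of 'keep i if i + 1 is still present'."""
    current = set(indices)
    for _ in range(joins):
        # Mirrors joining the current set to itself on Index + 1 == Index
        # and keeping the outer row.
        current = {i for i in current if i + 1 in current}
    return sorted(current)


indices = [1, 3, 4, 5, 7, 9, 10, 11, 12, 15, 16]
roots = roots_by_repeated_join(indices, 2)
print(roots)  # [3, 9, 10]
```

After the first round an index survives if it has one successor, and after the second round if it has two, which is why two joins find groups of three.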

Resources