Find pairs with distance in Ruby array

I have a big array with a sequence of values.
To check whether the value at position x influences the value at position x+distance,
I want to find all the pairs
pair = [values[x], values[x+distance]]
The following code works:
pairs_with_distance = []
values.each_cons(1+distance) do |sequence|
pairs_with_distance << [sequence[0], sequence[-1]]
end
but it looks complicated, and I wonder if I can make it shorter and clearer.

You can make the code shorter by using map directly:
pairs_with_distance = values.each_cons(1 + distance).map { |seq|
[seq.first, seq.last]
}
I prefer something like the example below, because it has short, readable lines of code and because it separates the steps. That approach lets you give meaningful names to intermediate calculations (groups in this case). You can probably come up with better names based on the real domain of the application.
values = [11,22,33,44,55,66,77]
distance = 2
groups = values.each_cons(1 + distance)
pairs = groups.map { |seq| [seq.first, seq.last] }
p pairs

Related

how to sum only the max value for common prefix inside the array in scala

I have an array of string items in Scala. Each item consists of a prefix + || + a double value, like below:
var y = Array("Zara||6.0", "Nuha||4.0","Zara||2.0","Zara||0.1")
What I want to do:
I need to sum all the double values from the array above (y(i).split("\\|\\|")(1)). But if a prefix is duplicated in the array, I only want to add its max value, like below:
for item Zara we have 3 values, so I want to take the max (in our sample it is 6.0)
for item Nuha, which is unique, I will take its value (4.0)
the expected output is (6.0+4.0)=10.0
Is there any way to do it in Scala other than using two nested loops?
Prepare your array: extract the prefix and value of each item into a tuple. Then use foldLeft to keep the max value for each prefix, and sum the values:
val res = y.map(_.split("\\|\\|")).map(arr => (arr(0), arr(1).toDouble))
.foldLeft(Map.empty[String, Double]) { (acc, elem) =>
val value = acc.get(elem._1).map(math.max(_, elem._2)).getOrElse(elem._2)
acc + (elem._1 -> value)
}.values.sum
println(res)
You can do it pretty much in one step. Technically it is three steps, but only one specifically addresses your requirement; the other two (split and sum) are a given either way.
y
.view
.map(_.split("""\|\|"""))
.groupMapReduce(_.head)(_.last.toDouble)(_ max _)
.values
.sum
Also, do not use vars, even if you are just putting together a quick sample. Vars are evil; just pretend they do not exist at all, at least for a while, until you have enough command of the language to recognize the 1% of situations where you might actually need them. Avoid using Arrays as much as possible too.

MATLAB - repmat values into cell array where individual cell elements have unequal size

I am trying to repeat values from an array (values) to a cell array where the individual elements have unequal sizes (specified by array_height and array_length).
I hope to apply this to a larger data set (containing ~100 x ~100 values) and my current solution is to have a line of code for each value (code example below). Surely there is a better way... Please could someone offer an alternative solution?
C = cell(3,2);
values = rand(3,2);
array_height = randi(10,3,2);
array_length = randi(10,3,2);
C{1,1} = repmat((values(1,1)),[array_height(1,1),array_length(1,1)]);
C{2,1} = repmat((values(2,1)),[array_height(2,1),array_length(2,1)]);
C{3,1} = repmat((values(3,1)),[array_height(3,1),array_length(3,1)]);
C{1,2} = repmat((values(1,2)),[array_height(1,2),array_length(1,2)]);
C{2,2} = repmat((values(2,2)),[array_height(2,2),array_length(2,2)]);
C{3,2} = repmat((values(3,2)),[array_height(3,2),array_length(3,2)]);
If you did this in a for loop, it might look something like this:
for i = 1:size(C,1)
for j = 1:size(C,2)
C{i,j} = repmat(values(i,j),[array_height(i,j),array_length(i,j)]);
end
end
However, if you use this with a much larger dataset, the nested loop may be slow. I suspect your overall objective would be better served by MATLAB's optimized matrix and vector operations, but without more information I can't say more than that.

How to structure multiple python arrays for sorting

A Fourier analysis I'm doing outputs 5 data fields, each of which I've collected into 1-d numpy arrays: freq bin #, amplitude, wavelength, normalized amplitude, %power.
How best to structure the data so I can sort by descending amplitude?
When testing with just one data field, I was able to use a dict as follows:
fourier_tuples = zip(range(len(fourier)), fourier)
fourier_map = dict(fourier_tuples)
import operator
fourier_sorted = sorted(fourier_map.items(), key=operator.itemgetter(1))
fourier_sorted = np.argsort(-fourier)[:3]
My intent was to add the other arrays to line 1, but this doesn't work since dicts only accept 2 terms. (That's why this post doesn't solve my issue.)
Stepping back, is this a reasonable approach, or are there better ways to combine & sort separate arrays? Ultimately, I want to take the data values from the top 3 freqs and associated other data, and write them to an output data file.
Here's a snippet of my data:
fourier = np.array([1.77635684e-14, 4.49872050e+01, 1.05094837e+01, 8.24322470e+00, 2.36715913e+01])
freqs = np.array([0. , 0.00246951, 0.00493902, 0.00740854, 0.00987805])
wavelengths = np.array([np.inf, 404.93827165, 202.46913583, 134.97942388, 101.23456791])
amps = np.array([4.33257766e-16, 1.09724890e+00, 2.56328871e-01, 2.01054261e-01, 5.77355886e-01])
powers_pct = np.array([4.8508237956526163e-32, 0.31112370227749603, 0.016979224022185751, 0.010445983875848858, 0.086141014686372669])
The last 4 arrays are other fields corresponding to 'fourier'. (Actual array lengths are 42, but pared down to 5 for simplicity.)
You appear to be using numpy, so here is the numpy way of doing this. You have the right function, np.argsort, in your post, but you don't seem to use it correctly:
order = np.argsort(amplitudes)
This is similar to your dictionary trick, except that it computes the inverse shuffling compared to your procedure. By the way, why go through a dictionary and not simply a list of tuples?
The contents of order are now indices into amplitudes: the first cell of order contains the position of the smallest element of amplitudes, the second cell contains the position of the next smallest, and so on. Therefore the indices of the five largest amplitudes, largest first, are
top5 = order[:-6:-1]
Provided your data are 1-d numpy arrays, you can use top5 to extract the elements corresponding to the top 5 amplitudes by using advanced indexing:
freq_bin[top5]
amplitudes[top5]
wavelength[top5]
If you want, you can group them together as columns and apply top5 to the rows of the resulting n-by-5 array:
np.c_[freq_bin, amplitudes, wavelength, ...][top5, :]
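As a minimal sketch of the argsort approach applied to the sample arrays from the question, the following sorts by descending amplitude and keeps the top 3 bins. The names top3 and table are illustrative and do not appear in the original posts, and only three of the five fields are included to keep it short.

```python
import numpy as np

# Sample data copied from the question (pared down to 5 bins).
fourier = np.array([1.77635684e-14, 4.49872050e+01, 1.05094837e+01,
                    8.24322470e+00, 2.36715913e+01])
freqs = np.array([0.0, 0.00246951, 0.00493902, 0.00740854, 0.00987805])
wavelengths = np.array([np.inf, 404.93827165, 202.46913583,
                        134.97942388, 101.23456791])

# argsort sorts ascending, so negating the key gives descending order.
top3 = np.argsort(-fourier)[:3]

# One row per selected bin: bin number, amplitude, frequency, wavelength.
table = np.column_stack([top3, fourier[top3], freqs[top3], wavelengths[top3]])
print(top3)
print(table)
```

The rows of table could then be written to a file, for example with np.savetxt.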
If I understand correctly, you have 5 separate lists of the same length and you are trying to sort all of them based on one of them. To do that you can either use numpy or do it with vanilla Python. Here are two examples off the top of my head (sorting is based on the 2nd list).
a = [11,13,10,14,15]
b = [2,4,1,0,3]
c = [22,20,23,25,24]
#numpy solution
import numpy as np
my_array = np.array([a,b,c])
my_sorted_array = my_array[:,my_array[1,:].argsort()]
#vanilla python solution
from operator import itemgetter
my_list = zip(a,b,c)
my_sorted_list = sorted(my_list,key=itemgetter(1))
You can then flip the array with my_sorted_array = np.fliplr(my_sorted_array) if you wish, or, if you are working with lists, reverse the list in place with my_sorted_list.reverse().
EDIT:
To get only the first n values, you simply slice the array, similarly to what @Paul is suggesting. Slicing works like classic list slicing: you specify start:stop:step (the step can be omitted). In your case, for the top 5 columns it would be [:,-5:]. So in the example above you can take the top 2 columns from each row like this:
my_sliced_sorted_array = my_sorted_array[:,-2:]
result will be:
array([[15, 13],
[ 3, 4],
[24, 20]])
Hope it helps.

MATLAB solve array

I've got multiple arrays that you can't quite fit a curve/equation to, but I do need to solve them for a lot of values. Simplified, it looks like this when I plot it, but the real ones have a lot more points:
So say I would like to solve for y=22, how would I do that? As you can see, there would be three solutions to this, but I only need the leftmost one.
Linear is okay, but I'd rather use a non-linear method.
The only way I found is to fit an equation to a set of points and solve that equation, but an equation can't approximate the array accurately enough.
This implementation uses first-order (linear) interpolation. If you're looking for higher accuracy and it feels appropriate, you can use a similar strategy with a higher-order estimator.
It assumes that data is the name of your array, with x values in the first column and y values in the second, that the rows are sorted by increasing or decreasing x values, and that you want to find all points where y = 22:
searchPoint = 22; %search for all solutions where y = 22
matchPoints = []; %matrix containing all values of x
for ii = 1:length(data)-1
if (data(ii,2)>searchPoint)&&(data(ii+1,2)<searchPoint)
xMatch = data(ii,1)+(searchPoint-data(ii,2))*(data(ii+1,1)-data(ii,1))/(data(ii+1,2)-data(ii,2)); %Linear interpolation to solve for xMatch
matchPoints = [matchPoints xMatch];
elseif (data(ii,2)<searchPoint)&&(data(ii+1,2)>searchPoint)
xMatch = data(ii,1)+(searchPoint-data(ii,2))*(data(ii+1,1)-data(ii,1))/(data(ii+1,2)-data(ii,2)); %Linear interpolation to solve for xMatch
matchPoints = [matchPoints xMatch];
elseif (data(ii,2)==searchPoint) %check if data(ii,2) is equal
matchPoints = [matchPoints data(ii,1)];
end
end
if(data(end,2)==searchPoint) %The loop stops one row before the end, so check the last row separately
matchPoints = [matchPoints data(end,1)];
end
This was written sans-compiler, but the logic was tested in octave (in other words, sorry if there's a slight typo in variable names, but the math should be correct)
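For comparison, the same crossing search can be sketched in Python with numpy. This is only an illustration of the linear-interpolation idea under the same assumptions (x sorted, at least one data point); the function name find_crossings and the sample data are hypothetical and not from the original posts.

```python
import numpy as np

def find_crossings(x, y, target):
    """Return the x positions where the piecewise-linear curve through
    the points (x, y) takes the value target."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    d = y - target  # signed distance of each sample from the target
    matches = []
    for i in range(len(x) - 1):
        if d[i] == 0:
            matches.append(x[i])  # exact hit on a sample point
        elif d[i] * d[i + 1] < 0:
            # Sign change: the curve crosses the target between i and i+1.
            t = -d[i] / (d[i + 1] - d[i])
            matches.append(x[i] + t * (x[i + 1] - x[i]))
    if d[-1] == 0:
        matches.append(x[-1])  # the loop stops one sample before the end
    return matches

# Hypothetical data, not taken from the question.
x = [0, 1, 2, 3, 4]
y = [10, 30, 20, 24, 40]
crossings = find_crossings(x, y, 22)
print(crossings)
print(crossings[0] if crossings else None)  # leftmost solution
```

Because the matches are collected in order of increasing index, the first entry is the leftmost solution when x is sorted in increasing order.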

Help with a special case of permutations algorithm (not the usual)

I have always been interested in algorithms, sort, crypto, binary trees, data compression, memory operations, etc.
I read Mark Nelson's article about permutations in C++ with the STL function next_permutation(), which was very interesting and useful. After that I wrote a class method to get the next permutation in Delphi, since that is the tool I presently use most. This function works in lexicographic order; I got the idea for the algorithm from an answer in another topic here on Stack Overflow. But now I have a big problem: I'm working with permutations with repeated elements in a vector, and there are a lot of permutations that I don't need. For example, I have this first permutation for 7 elements in lexicographic order:
6667778 (6 = 3 times consecutively, 7 = 3 times consecutively)
For my work I consider valid only those permutations with at most 2 elements repeated consecutively, like this:
6676778 (6 = 2 times consecutively, 7 = 2 times consecutively)
In short, I need a function that returns only permutations that have at most N consecutive repetitions, according to the parameter received.
Does anyone know if there is some algorithm that already does this?
Sorry for any mistakes in the text, I still don't speak English very well.
Thank you so much,
Carlos
My approach is a recursive generator that doesn't follow branches that contain illegal sequences.
Here's the Python 3 code:
def perm_maxlen(elements, prefix = "", maxlen = 2):
    if not elements:
        yield prefix + elements
        return
    used = set()
    for i in range(len(elements)):
        element = elements[i]
        if element in used:
            #already searched this path
            continue
        used.add(element)
        suffix = prefix[-maxlen:] + element
        if len(suffix) > maxlen and len(set(suffix)) == 1:
            #would exceed maximum run length
            continue
        sub_elements = elements[:i] + elements[i+1:]
        for perm in perm_maxlen(sub_elements, prefix + element, maxlen):
            yield perm
for perm in perm_maxlen("6667778"):
    print(perm)
The implementation is written for readability, not speed, but the algorithm should be much faster than naively filtering all permutations. For example, it runs the following in milliseconds, where the naive filtering solution would take millennia or something:
print(len(list(perm_maxlen("a"*100 + "b"*100, "", 1))))
So, in the homework-assistance kind of way, I can think of two approaches.
Work out all permutations that contain 3 or more consecutive repetitions (which you can do by treating the three-in-a-row as just one pseudo-digit and feeding it to a normal permutation generation algorithm). Make a lookup table of all of these. Now generate all permutations of your original string, and look each one up in the lookup table before adding it to the result, skipping any that appear there.
Use a recursive permutation generating algorithm (select each possibility for the first digit in turn, recurse to generate permutations of the remaining digits), but in each recursion pass along the last two digits generated so far. Then in the recursively called function, if the two values passed in are the same, don't allow the first digit to be the same as those.
Why not just make a wrapper around the normal permutation function that skips permutations with more than N consecutive repetitions? Something like:
(pseudocode)
function custom_perm(int max_rep)
do
p := next_perm()
while count_max_reps(p) > max_rep
return p
Krusty, I'm already doing that at the end of the function, but it doesn't solve the problem, because it still needs to generate all permutations and check each one.
consecutive := 1;
IsValid := True;
for n := 0 to len - 2 do
begin
if anyVector[n] = anyVector[n + 1] then
consecutive := consecutive + 1
else
consecutive := 1;
if consecutive > MaxConsecutiveRepeats then
begin
IsValid := False;
Break;
end;
end;
Since I start with the first permutation in lexicographic order, this approach ends up generating a lot of unnecessary permutations.
This is easy to make, but rather hard to make efficient.
If you need to build a single piece of code that only considers valid outputs, and thus doesn't bother walking over the entire combination space, then you're going to have some thinking to do.
On the other hand, if you can live with the code internally producing all combinations, valid or not, then it should be simple.
Make a new enumerator, one which you can call that next_perm method on, and have this internally use the other enumerator, the one that produces every combination.
Then simply make the outer enumerator run in a while loop asking the inner one for more permutations until you find one that is valid, then produce that.
Pseudo-code for this:
generator1:
when called, yield the next combination
generator2:
internally keep a generator1 object
when called, keep asking generator1 for a new combination
check the combination
if valid, then yield it
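The two-generator pseudo-code above can be sketched in Python as follows. This is the simple variant that still walks the whole permutation space, not an efficient one; the names max_run, all_perms and valid_perms are illustrative, and itertools.permutations stands in for the asker's own next-permutation routine.

```python
from itertools import groupby, permutations

def max_run(seq):
    """Length of the longest run of equal consecutive items."""
    return max((len(list(group)) for _, group in groupby(seq)), default=0)

def all_perms(items):
    """Inner generator: every distinct permutation, valid or not."""
    seen = set()  # itertools.permutations repeats results when items repeat
    for p in permutations(items):
        if p not in seen:
            seen.add(p)
            yield p

def valid_perms(items, max_rep):
    """Outer generator: keep asking the inner one, pass on only valid results."""
    for p in all_perms(items):
        if max_run(p) <= max_rep:
            yield p

for p in valid_perms("6678", 1):
    print("".join(p))
```

Keeping every permutation in the seen set costs memory, so this only suits short inputs; a generator that prunes invalid prefixes early avoids both the memory and the wasted work.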

Resources