Ruby: Comparing arrays and creating new ones based on specific conditions

I have 3 arrays of equal length. Some spots are nil, which complicates things, but I need to retain their order.
a = [5.2, 3.0, 1.21, 7.0, 5.0, 5.0, 6.0, 8.0, 10.0, 10.0]
b = [nil, nil, [{"price"=>1.99, "size"=>269.897475661239}], nil, nil, nil, nil, nil, nil, nil]
x = [6.0, 6.2, 2.5, 5.0, 9.0, 2.36, 15.5, 20.0, nil, nil]
(Step One, I want to iterate over b so that b = [nil, nil, 1.99, nil, nil, nil, nil, nil, nil, nil]. Just need ["price"], ignore ["size"]. Couldn't figure that out.)
Step Two, I want to create a new array (c) that averages a and b, but where one of them is nil, just takes the one that has a value. In other words, c would = [5.2, 3.0, 1.6, 7.0, 5.0, 5.0, 6.0, 8.0, 10.0, 10.0], which looks like a except that the third spot is the average of 1.21 and 1.99 (1.6).
So I have my original third array x = [6.0, 6.2, 2.5, 5.0, 9.0, 2.36, 15.5, 20.0, nil, nil]. Step Three, I want to compare c and x and create a new array z that takes the SMALLER* of the two numbers, or if one is nil, the one that has a value. z is the result I would like.
Thus z should = [6.0, 6.2, 2.5, 7.0, 9.0, 5.0, 15.5, 20.0, 10.0, 10.0] (if my eyes are correct). (*Edit: I meant the larger of the two numbers, which is why this array doesn't match below answer, so I used that answer below but used .max instead of .min )
I know those steps are tedious, but I need to go in that order because I have lots of arrays where I need to average 2 then compare with a third and take the larger number and with random nil values laced throughout, it gets beyond my abilities. Can't figure it out, and would greatly appreciate some help! Thank you!

bb = b.map { |e| e.is_a?(Array) ? e.first["price"] : e }
#=> [nil, nil, 1.99, nil, nil, nil, nil, nil, nil, nil]
c = a.zip(bb).map { |ea, ebb| ebb.nil? ? ea : (ea+ebb)/2.0 }
#=> [5.2, 3.0, 1.6, 7.0, 5.0, 5.0, 6.0, 8.0, 10.0, 10.0]
c.zip(x).map { |cc,xx| xx.nil? ? cc : [cc,xx].min }
#=> [5.2, 3.0, 1.6, 5.0, 5.0, 2.36, 6.0, 8.0, 10.0, 10.0]
If only bb and the return value were needed, you might perform the following calculation.
[a,bb,x].transpose.map do |ae,bbe,xe|
ab_avg = bbe ? (ae+bbe)/2.0 : ae
xe ? [ab_avg, xe].min : ab_avg
end
#=> [5.2, 3.0, 1.6, 5.0, 5.0, 2.36, 6.0, 8.0, 10.0, 10.0]

Step 1.
b.map!{ |x| x.first.values.first if x }
Step 2.
c = a.map.each_with_index{ |x, i| (x && b[i]) ? ((x || 0) + (b[i] || 0))/2 : (x || b[i]) }
Step 3.
c.map.each_with_index{ |k, i| (k && x[i]) ? [k, x[i]].max : (k || x[i]) }
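For reference, here is a minimal sketch of all three steps chained together, using the asker's correction (the larger of the two numbers in step three). The helper name mean_of_present is made up for illustration and does not come from either answer; Array#sum assumes Ruby 2.4 or newer.

```ruby
a = [5.2, 3.0, 1.21, 7.0, 5.0, 5.0, 6.0, 8.0, 10.0, 10.0]
b = [nil, nil, [{"price"=>1.99, "size"=>269.897475661239}], nil, nil, nil, nil, nil, nil, nil]
x = [6.0, 6.2, 2.5, 5.0, 9.0, 2.36, 15.5, 20.0, nil, nil]

# Hypothetical helper (not from the answers above): mean of the non-nil
# values, or nil when every value is nil.
def mean_of_present(values)
  present = values.compact
  return nil if present.empty?
  present.sum / present.size.to_f
end

# Step one: keep only the "price" of each non-nil entry of b.
prices = b.map { |entry| entry.is_a?(Array) ? entry.first["price"] : entry }

# Step two: average a and prices position by position, ignoring nils.
c = a.zip(prices).map { |pair| mean_of_present(pair) }

# Step three: take the larger of c and x, ignoring nils.
z = c.zip(x).map { |pair| pair.compact.max }
```

Because compact drops the nils before averaging or comparing, the same two lines would still work if a (rather than b or x) contained the nil at some position.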

Related

Do math across two arrays

I need to subtract one array from another, by index:
a = [3,4,3,5]
b = [1,2,2,1]
c = [2,2,1,4]
I would use Array#zip and then Array#map:
a = [3,4,3,5]
b = [1,2,2,1]
c = a.zip(b).map { |a, b| a - b }
#=> [2, 2, 1, 4]
There are multiple ways to do it in Ruby. Some examples:
The most straightforward approach:
a = [3.0, 4.0, 3.0, 5.0]
b = [1.0, 2.0, 2.0, 1.0]
length = a.length
c = Array.new(length, 0.0) # Where 0.0 is default array value.
length.times do |i|
c[i] = a[i] - b[i]
end
Using Vector class from Ruby standard library:
require 'matrix'
a = Vector[3.0, 4.0, 3.0, 5.0]
b = Vector[1.0, 2.0, 2.0, 1.0]
(a - b).to_a
Using Enumerable#inject.
a = [3, 4, 3, 5]
b = [1, 2, 2, 1]
c = a.zip(b).map { |i| i.inject(&:-) }
# => [2, 2, 1, 4]
I like :-) symbols there :-)
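The zip-and-map pattern from the answers above works for any binary operator, not just subtraction. Below is a small sketch of that idea; the method name elementwise and the length check are my own additions, not part of any answer.

```ruby
# Hypothetical helper: apply a binary operator to two arrays position by
# position. Raises if the lengths differ, because zip would otherwise pad
# the shorter side with nil.
def elementwise(left, right, operator)
  unless left.length == right.length
    raise ArgumentError, "arrays must have the same length"
  end
  left.zip(right).map { |l, r| l.public_send(operator, r) }
end

difference = elementwise([3, 4, 3, 5], [1, 2, 2, 1], :-)
product    = elementwise([3, 4, 3, 5], [1, 2, 2, 1], :*)
```

Here difference is the same [2, 2, 1, 4] as in the question, and product multiplies the arrays position by position.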

fast way to compute vectorized percentiles in numpy/scipy?

given an array of values:
v = np.random.randn(100)
what's the fastest way to compute the percentile of each element in the array? the following is slow:
%timeit map(lambda e: scipy.stats.percentileofscore(v, e), v)
100 loops, best of 3: 5.1 ms per loop
You could use scipy.stats.rankdata() to achieve the same result:
In [58]: v = np.random.randn(10)
In [59]: print(list(map(lambda e: scipy.stats.percentileofscore(v, e), v)))
[30.0, 40.0, 50.0, 90.0, 20.0, 60.0, 10.0, 70.0, 80.0, 100.0]
In [60]: from scipy.stats import rankdata
In [61]: rankdata(v)*100/len(v)
Out[61]: array([ 30., 40., 50., 90., 20., 60., 10., 70., 80., 100.])

Count number of Double.NaN values in scala Array

For example:
scala> val my_array = Array(4,5,Double.NaN,6,5,6, Double.NaN)
my_array: Array[Double] = Array(4.0, 5.0, NaN, 6.0, 5.0, 6.0, NaN)
scala> my_array.count(_ == Double.NaN)
res13: Int = 0
I understand that two Double.NaN are not equal to each other
scala> Double.NaN == Double.NaN
res14: Boolean = false
and therefore, I get the result that I get, but I can't find a function that would tell me the number of Double.NaNs, what am I missing?
In python the behaviour would look like this:
In [43]: import numpy as np
In [44]: a = np.array([5,np.nan,5,7,4,np.nan])
In [45]: np.isnan(a)
Out[45]: array([False, True, False, False, False, True], dtype=bool)
In [46]: np.isnan(a).sum()
Out[46]: 2
Double's isNaN method does the job:
scala> val array = Array(4,5,Double.NaN,6,5,6, Double.NaN)
array: Array[Double] = Array(4.0, 5.0, NaN, 6.0, 5.0, 6.0, NaN)
scala> array.count(_.isNaN)
res0: Int = 2

How to find correspondent elements of two linearly transformed arrays in Ruby?

I have two arrays of floats (x,y) with unique elements, one of them is a linear transform of the other y=a*x+b, for example:
a=0.95;
b1=3.33;
b2=5.55;
x=[1,3,4,6,9,13,20,22,31,35,37,40];
y=x.collect.with_index{|z,i| i>6 ? z*a+b1 : z*a+b2}
=> [6.5, 8.4, 9.35, 11.25, 14.1, 17.9, 24.55, 24.23, 32.78, 36.58, 38.48, 41.33]
The linear transformation is applied to the x array with two different b values. Let's suppose I don't know the rule that determines which b value is applied (here, a function of the index i).
My goal is that if I know the value of a and I also know the possible values of b in the form of a two element array bs=[b1,b2], then I would like to find out the correspondent b value for every element of y even if the two arrays (x,y) are scrambled. My idea (doesn't work correctly, I need help here):
def ybs(x,y,bs,a)
difference=0.0
xelem=0.0
return y.map do |z|
cb=bs.min_by do |b|
xelem=x.min_by do |q|
(q-(z-b)*1/a).abs
end
difference=(xelem-(z-b)*1/a).abs
end
difference=(xelem-(z-cb)*1/a).abs
[z,xelem,(z-cb)*1/a,cb,difference]
end
end
It would return five values for every element of the y array, in the form:
[<value from y>,<correspondent value from x>,<inverse transformed value of y, should be equal to xelem>,<correspondent b value of the linear transformation>,<difference, error, usually 0.0>]
My output when I call ybs(x,y,bs,a):
[[1, 6.5, -2.4526315789473685, 3.33, 8.952631578947368],
[3, 6.5, -0.34736842105263166, 3.33, 6.847368421052631],
[4, 6.5, 0.7052631578947368, 3.33, 5.794736842105263],
[6, 6.5, 2.8105263157894735, 3.33, 3.6894736842105265],
[9, 6.5, 5.968421052631579, 3.33, 0.5315789473684207],
[13, 8.4, 7.842105263157896, 5.55, 0.5578947368421048],
[20, 14.1, 17.547368421052635, 3.33, 3.4473684210526354],
[22, 17.9, 17.31578947368421, 5.55, 0.5842105263157897],
[31, 24.55, 26.789473684210527, 5.55, 2.2394736842105267],
[35, 32.78, 33.33684210526316, 3.33, 0.5568421052631578],
[37, 32.78, 33.10526315789474, 5.55, 0.3252631578947387],
[40, 36.58, 38.6, 3.33, 2.020000000000003]]
I need this method for my subtitle syncing program, where different parts of the subtitles' time codes can be shifted by different amount, for example when a scene is missing from a different version of the movie.
The problem was that you weren't keeping your ordered pairs together. For each y value, your code 'thinks' that the x associated with it is the one for which (q-(z-b)*1/a).abs is the least. However, it could be that taking the "wrong" b value for the y value being considered, together with the wrong x value would lead to a value of (q-(z-b)*1/a).abs that was slightly (or much) less than that which you get by taking the "right" b and x values.
I ran your code (rounding off the values for clarity) and got:
[6.5, 1.0, 1.0, 5.55, 0.0]
[8.4, 3.0, 3.0, 5.55, 0.0]
[9.35, 4.0, 4.0, 5.55, 0.0]
[11.25, 6.0, 6.0, 5.55, 0.0]
[14.1, 9.0, 9.0, 5.55, 0.0]
[17.9, 13.0, 13.0, 5.55, 0.0]
[24.55, 20.0, 20.0, 5.55, 0.0]
[24.23, 20.0, 22.0, 3.33, 2.0]
[32.78, 31.0, 31.0, 3.33, 0.0]
[36.58, 31.0, 35.0, 3.33, 4.0]
[38.48, 35.0, 37.0, 3.33, 2.0]
[41.33, 37.0, 40.0, 3.33, 3.0]
You can see that the x values do not follow the original sequence. Since there's no need to take a chance letting 'y's get associated with the wrong 'x's, let's just force them to stay together.
Here is how I modified your code to keep the ys and xs together.
def ybs(pairs,bs,a)
difference=0.0
xelem=0.0
return pairs.map do |pair|
x,y = pair[0], pair[1]
cb = bs.min_by do |b|
(x-(y-b)*1/a).abs
end
difference = (x-(y-cb)*1/a).abs
[y,x,(y-cb)*1/a,cb,difference]
end
end
a=0.95;
b1=3.33;
b2=5.55;
bs = [b1, b2]
x=[1,3,4,6,9,13,20,22,31,35,37,40];
y=x.collect.with_index{|z,i| i>6 ? z*a+b1 : z*a+b2}
c = x.count-1
pairs = (0..c).collect do |i|
[x[i],y[i]]
end
r = ybs(pairs,bs,a)
r.each do |q|
(0..4).each do |p|
q[p] = q[p].round(2)
end
p q
end
and here is my output:
[6.5, 1.0, 1.0, 5.55, 0.0]
[8.4, 3.0, 3.0, 5.55, 0.0]
[9.35, 4.0, 4.0, 5.55, 0.0]
[11.25, 6.0, 6.0, 5.55, 0.0]
[14.1, 9.0, 9.0, 5.55, 0.0]
[17.9, 13.0, 13.0, 5.55, 0.0]
[24.55, 20.0, 20.0, 5.55, 0.0]
[24.23, 22.0, 22.0, 3.33, 0.0]
[32.78, 31.0, 31.0, 3.33, 0.0]
[36.58, 35.0, 35.0, 3.33, 0.0]
[38.48, 37.0, 37.0, 3.33, 0.0]
[41.33, 40.0, 40.0, 3.33, 0.0]
All of the errors are small, and the bs are correct... they are 5.55 until the 7th row, where they switch to 3.33, as your rule prescribes.
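The code above relies on x and y being paired by index. For the scrambled case the question mentions, one possible approach (a different technique from the answer above, sketched here under the assumption that every y has exactly one true counterpart in x) is to search all (b, x) combinations for each y and keep the b and x that were evaluated together. The method name match_scrambled and the hash keys are made up for illustration.

```ruby
# Hypothetical helper: for each y, try every (b, x) combination and keep the
# pair with the smallest inverse-transform error. Returning b and x from the
# same min_by call means they cannot come from different candidates.
def match_scrambled(xs, ys, bs, a)
  ys.map do |y|
    b, x = bs.product(xs).min_by { |cb, cx| (cx - (y - cb) / a).abs }
    { y: y, x: x, b: b, error: (x - (y - b) / a).abs }
  end
end

a  = 0.95
bs = [3.33, 5.55]
xs = [1, 3, 4, 6, 9, 13, 20, 22, 31, 35, 37, 40]
ys = xs.each_with_index.map { |x, i| i > 6 ? x * a + bs[0] : x * a + bs[1] }

# Shuffle both arrays independently to simulate the scrambled case.
matches = match_scrambled(xs.shuffle, ys.shuffle, bs, a)
```

This costs one pass over all x and b values per y, and it can mis-assign when a wrong (b, x) combination happens to land as close to y as the right one. With the sample data that does not happen, because the two b values differ by 2.22 and 2.22/0.95 is not close to an integer gap between x values.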

How do I append one matrix to another in Scala?

If I have the following code:
var A = Array[Array[Double]]() // where A becomes an MxP matrix
var B = Array[Array[Double]]() // where B becomes an NxP matrix
What are some efficient ways to append one matrix to the other, resulting in a single matrix, as the following pseudocode would suggest?
val C = A append B // where C is a (M+N)xP matrix
Obviously, one of the dimensions (in this case P) is held constant.
EDIT: So far, both of the provided solutions are growing in the second dimension. I am trying to hold the second dimension fixed.
Functional, but not as performant as the imperative alternative would be:
scala> val a = Array.tabulate(2, 3)((_, _) => (math.random * 100).toInt)
a: Array[Array[Int]] = Array(Array(52, 61, 58), Array(35, 69, 39))
scala> val b = Array.tabulate(2, 4)((_, _) => (math.random * 100).toInt)
b: Array[Array[Int]] = Array(Array(51, 54, 87, 10), Array(52, 76, 18, 85))
scala> (a, b).zipped.map(_ ++ _)
res0: Array[Array[Int]] = Array(Array(52, 61, 58, 51, 54, 87, 10), Array(35, 69, 39, 52, 76, 18, 85))
(In reply to the comment...)
Holding the second dimension fixed:
scala> val x = Array.tabulate(3, 2)((_, _) => (math.random * 100).toInt)
x: Array[Array[Int]] = Array(Array(13, 26), Array(96, 6), Array(68, 58))
scala> val y = Array.tabulate(2, 2)((_, _) => (math.random * 100).toInt)
y: Array[Array[Int]] = Array(Array(82, 5), Array(0, 76))
scala> x ++ y
res1: Array[Array[Int]] = Array(Array(13, 26), Array(96, 6), Array(68, 58), Array(82, 5), Array(0, 76))
scala> val a = Array.fill(4,3) { 1.0 }
a: Array[Array[Double]] = Array(Array(1.0, 1.0, 1.0), Array(1.0, 1.0, 1.0), Array(1.0, 1.0, 1.0), Array(1.0, 1.0, 1.0))
scala> val b = Array.fill(4,6) { 2.0 }
b: Array[Array[Double]] = Array(Array(2.0, 2.0, 2.0, 2.0, 2.0, 2.0), Array(2.0, 2.0, 2.0, 2.0, 2.0, 2.0), Array(2.0, 2.0, 2.0, 2.0, 2.0, 2.0), Array(2.0, 2.0, 2.0, 2.0, 2.0, 2.0))
scala> for((aa,bb) <- a zip b) yield (aa ++ bb)
res0: Array[Array[Double]] = Array(Array(1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0), Array(1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0), Array(1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0), Array(1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0))
