Passing an array into a block - arrays

I am a little lost on the block below.
def sort_string(string)
string.split(" ").sort{|a,b| a.length <=> b.length}.join(" ")
end
The array is sorted based on the length (least to greatest). My confusion comes from what the variable b in the block of code is.
If I split the string "example string here" into an array and then sort it, how is [example],[string],[here] passed into the block {|a,b| a.length <=> b.length}? I don't understand how the elements of the array are passed into the code and then compared.

When using sort, Ruby passes two objects into the block. They are to be compared, either using the built-in <=> method, or by some machination you devise that determines whether one is less-than (-1), equal-to (0), or greater-than (1) the other. So, a is one and b is the other.
Meditate on this:
[1, 2, 3, 4].shuffle # => [4, 1, 3, 2]
.sort { |i, j|
[i, j] # => [4, 1], [4, 3], [1, 3], [4, 2], [3, 2], [1, 2]
i <=> j # => 1, 1, -1, 1, 1, -1
}
# => [1, 2, 3, 4]
Remember what <=> does and compare the values returned for the i <=> j comparison each time through the loop.
But of course you knew this from reading the documentation for sort:
http://ruby-doc.org/core-2.3.0/Enumerable.html#method-i-sort
http://ruby-doc.org/core-2.3.0/Array.html#method-i-sort
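As a rough illustration of the answer above, this sketch records every pair that sort hands to the block for the string from the question. The seen array is a name introduced here for illustration. Which pairs appear, and in what order, depends on Ruby's internal sort algorithm, so only the final result is shown.

```ruby
words = "example string here".split(" ")
seen = []   # hypothetical helper: collects each pair passed to the block

sorted = words.sort do |a, b|
  seen << [a, b]            # a and b are two elements of words
  a.length <=> b.length     # -1, 0 or 1 tells sort how to order them
end

sorted.join(" ")  # => "here string example"
```

Inspecting seen after the call shows that the block never receives the whole array, only two of its elements at a time.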

Related

How to move an element of an array to the beginning of the array

I want to move the element at index 2 to the start of the array [1, 2, 3, 4], the resulting array should look like [3, 1, 2, 4].
My solution was to do the following
[3] + ([1, 2, 3, 4] - [3])
Is there a better way to do this?
A method that takes the first n elements from an array and rotates them by one, then adds back the remaining elements.
def rotate_first_n_right(arr, n)
arr[0...n].rotate(-1) + arr[n..-1]
end
rotate_first_n_right([1,2,3,4], 3)
# => [3, 1, 2, 4]
This does fail if we try to use it on an array that is too short, as the arr[n..-1] slice will yield nil which will cause an error when we try to add it to the first array.
We can fix this by expanding both slices into a list.
def rotate_first_n_right(arr, n)
[*arr[0...n].rotate(-1), *arr[n..-1]]
end
To see why this works, a very simple example:
[*[1, 2, 3], *nil]
# => [1, 2, 3]
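To make the nil case concrete, here is a small sketch of the splat version applied to an array shorter than n. The short variable is introduced only for this example.

```ruby
def rotate_first_n_right(arr, n)
  # [*nil] is [], so a nil tail simply contributes no elements
  [*arr[0...n].rotate(-1), *arr[n..-1]]
end

short = [1, 2]
short[3..-1]                            # => nil (start index is past the end)
short[2..-1]                            # => []  (start index equals the length)
rotate_first_n_right(short, 3)          # => [2, 1]
rotate_first_n_right([1, 2, 3, 4], 3)   # => [3, 1, 2, 4]
```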
A problem with your example is what happens if 3 occurs in the array more than once. E.g.
[1,2,3,3,3,4] - [3]
# => [1, 2, 4]
Not sure what you mean about "rotation" as this is not exactly a rotation but you could go with
def move_element_to_front(arr, idx)
# ruby >= 2.6 arr.dup.then {|a| a.unshift(a.delete_at(idx)) }
arr = arr.dup
arr.unshift(arr.delete_at(idx))
end
This will move the element at idx to the first position in the returned Array
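A possible variation, sketched under the assumption that an out-of-range index should leave the array as it is: delete_at returns nil for such an index, so without a guard a stray nil would end up at the front. The guard is an addition for illustration, not part of the answer above.

```ruby
def move_element_to_front(arr, idx)
  copy = arr.dup
  # Assumption: an out-of-range idx means "nothing to move"
  return copy unless idx.between?(-copy.size, copy.size - 1)
  copy.unshift(copy.delete_at(idx))
end

original = [1, 2, 3, 3, 3, 4]
moved = move_element_to_front(original, 2)
# moved    => [3, 1, 2, 3, 3, 4]  (duplicates of 3 are kept)
# original => [1, 2, 3, 3, 3, 4]  (the caller's array is not mutated)
move_element_to_front([1, 2], 5)  # => [1, 2]
```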
def move_element_to_front(arr, idx)
[arr[idx]].concat(arr[0,idx], arr[idx+1..])
end
arr = [:dog, :cat, :pig, :hen]
move_element_to_front(arr, 2)
#=> [:pig, :dog, :cat, :hen]
move_element_to_front(arr, 0)
#=> [:dog, :cat, :pig, :hen]
move_element_to_front(arr, 3)
#=> [:hen, :dog, :cat, :pig]
The operative line of the method could alternatively be expressed
[arr[idx], *arr[0,idx], *arr[idx+1..]]

What is the difference between Ruby + and concat for arrays?

I've been trying to collect arrays of digits into one array. If I use +, it returns an empty array as output. Using concat returns the expected array of digits. How does this work, and what is the main difference between these Ruby methods?
0.step.with_object([]) do |index, output|
output + [index]
break output if index == 100
end # returns an empty array
0.step.with_object([]) do |index, output|
output.concat [index]
break output if index == 100
end # returns an array containing the numbers from 0 to 100
Unlike Enumerable#reduce, Enumerable#each_with_object passes the same object through the whole reducing process.
Array#+ creates a new instance, leaving the original object untouched.
Array#concat mutates the original object.
With reduce the result will be the same:
0.step.reduce([]) do |acc, index|
break acc if index > 100
acc + [index]
end
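The difference can be checked directly with equal?, which tests object identity. The sketch below uses a short bounded range instead of 0.step to keep it small; the variable names are illustrative.

```ruby
output = []
sum = output + [1]                  # builds a new array
sum.equal?(output)                  # => false; output is still []
output.concat([1]).equal?(output)   # => true; output itself was changed

# each_with_object always yields the same object, so only in-place changes stick
kept = (0..3).each_with_object([]) { |i, acc| acc.concat([i]) }
lost = (0..3).each_with_object([]) { |i, acc| acc + [i] }
# kept => [0, 1, 2, 3]
# lost => []

# reduce feeds each block result into the next step, so + works there
reduced = (0..3).reduce([]) { |acc, i| acc + [i] }
# reduced => [0, 1, 2, 3]
```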
Let's create two arrays:
a = [1, 2]
b = [3, 4]
Like all objects, these arrays have unique object ids:
a.object_id #=> 48242540181360
b.object_id #=> 48242540299680
Now let's add them together:
c = a + b #=> [1, 2, 3, 4]
This creates a new object (held by the variable c):
c.object_id #=> 48242540315060
and leaves (the objects held by) a and b (and their object ids) unchanged:
a #=> [1, 2]
b #=> [3, 4]
Now, let's write:
a += b #=> [1, 2, 3, 4]
which Ruby changes to:
a = a + b
when it compiles the code. We obtain:
a #=> [1, 2, 3, 4]
a.object_id #=> 48242541482040
The variable a now holds a new object that equals the previous value of a plus b.
Now let's concatenate b with (the new value of) a:
a.concat(b) #=> [1, 2, 3, 4, 3, 4]
This changes (mutates) a, but of course does not change a's object id:
a #=> [1, 2, 3, 4, 3, 4]
a.object_id #=> 48242541482040
Lastly, we could replace a's value with c, without affecting a's object id:
a.replace(c) #=> [1, 2, 3, 4]
a #=> [1, 2, 3, 4]
a.object_id #=> 48242541482040
See Array#+, Array#concat and Array#replace.
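One practical consequence of the walkthrough above shows up when a second variable refers to the same array. This is a minimal sketch; other and shared are names made up for the example.

```ruby
a = [1, 2]
other = a            # second reference to the same array
a += [3]             # rebinds a to a brand-new array
other                # => [1, 2]
a.equal?(other)      # => false

c = [1, 2]
shared = c
c.concat([3])        # mutates the one array both variables refer to
shared               # => [1, 2, 3]
c.equal?(shared)     # => true
```

So += is safe when other code still holds the old array, while concat is the one to reach for when every holder should see the change.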

How to find indices of max n elements in array in stable order

I have a number and an array:
n = 4
a = [0, 1, 2, 3, 3, 4]
I want to find the indices corresponding to the maximal n elements of a in the reverse order of the element size, and in stable order when the element sizes are equal. The expected output is:
[5, 3, 4, 2]
This code:
a.each_with_index.max(n).map(&:last)
# => [5, 4, 3, 2]
gives the right indices, but changes the order.
Code
def max_with_order(arr, n)
arr.each_with_index.max_by(n) { |x,i| [x,-i] }.map(&:last)
end
Examples
a = [0,1,2,3,3,4]
max_with_order(a, 1) #=> [5]
max_with_order(a, 2) #=> [5, 3]
max_with_order(a, 3) #=> [5, 3, 4]
max_with_order(a, 4) #=> [5, 3, 4, 2]
max_with_order(a, 5) #=> [5, 3, 4, 2, 1]
max_with_order(a, 6) #=> [5, 3, 4, 2, 1, 0]
Explanation
For n = 3 the steps are as follows.
b = a.each_with_index
#=> #<Enumerator: [0, 1, 2, 3, 3, 4]:each_with_index>
We can convert b to an array to see the (six) values it will generate and pass to the block.
b.to_a
#=> [[0, 0], [1, 1], [2, 2], [3, 3], [3, 4], [4, 5]]
Continuing,
c = b.max_by(n) { |x,i| [x,-i] }
#=> [[4, 5], [3, 3], [3, 4]]
c.map(&:last)
#=> [5, 3, 4]
Note that the elements of arr need not be numeric, merely comparable.
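The block works because arrays are themselves compared element by element with <=>, so the negated index only matters when the values tie. A short sketch, with a non-numeric example array (words) added here to illustrate the last remark:

```ruby
[3, -3] <=> [3, -4]   # => 1  (tie on value, so index 3 outranks index 4)
[4, -5] <=> [3, -3]   # => 1  (the larger value wins regardless of index)

def max_with_order(arr, n)
  arr.each_with_index.max_by(n) { |x, i| [x, -i] }.map(&:last)
end

words = %w[b a c c]
max_with_order(words, 3)  # => [2, 3, 0]
```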
You can supply a block to max to make the determination more specific like so
a.each_with_index.max(n) do |a,b|
if a[0] == b[0] # the numbers are the same
b[1] <=> a[1] # compare the indexes in reverse
else
a[0] <=> b[0] # compare the numbers themselves
end
end.map(&:last)
#=> [5,3,4,2]
The block given to max is expected to return a comparison result (-1, 0 or 1). Here we are saying that if the numbers are the same, the indexes are compared in reverse order. For example, when a is [3, 4] and b is [3, 3], the block evaluates b[1] <=> a[1], i.e. 3 <=> 4 #=> -1. The -1 indicates that a ranks lower, so index 4 is placed after index 3.
Also, to expand on @CarySwoveland's answer (which I am a bit jealous I did not think of): since you only care about returning the indices, we could implement it as follows, without a secondary map
a.each_index.max_by(n) { |x| [a[x],-x] }
#=> [5,3,4,2]
@compsy you wrote "without changing order", so it would be:
a = [0,1,2,3,3,4]
n = a.max
i = 0
a.each do |x|
break if x == n
i += 1
end
I use the variable i as the index. When x (the value being analyzed) equals n, we use break to stop the each loop, preserving the last value of i, which corresponds to the position of the max value in the array. Be aware that the value of i differs by one from the natural (1-based) position, because array indices start at 0, not 1.
I break out of each because there is no need to keep checking the other values of the array once we have found the position of the value.
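For comparison, the same lookup can be sketched with built-in methods. Note that this, like the loop above, finds the position of a single maximum only, not the top n indices asked for in the question. The tied array is an extra example added here.

```ruby
a = [0, 1, 2, 3, 3, 4]
a.index(a.max)     # => 5  (first position holding the maximum)

tied = [3, 1, 3]
tied.index(tied.max)    # => 0  (first occurrence)
tied.rindex(tied.max)   # => 2  (last occurrence)
```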

Array of tuples, sum the values when the first element is the same

I am trying to sum the elements of an array by grouping by the first element.
ex:
[[1, 8], [3, 16], [1, 0], [1, 1], [1, 1]]
should give
[ {1 => 10}, {3 => 16} ]
It is summing the values in the original array where the first element was 1 and 3. The data structures in the end result don't matter, ex: an array of arrays, an array of hash or just a hash is fine.
Some tries:
k = [[1, 8], [3, 16], [1, 0], [1, 1], [1, 1]]
h = {}
k.inject({}) { |(a,b)| h[a] += b}
#=> undefined method `+' for nil:NilClass
data = [[1, 8], [3, 16], [1, 0], [1, 1], [1, 1]]
data.each_with_object({}) { |(k, v), res| res[k] ||= 0; res[k] += v }
gives
{1=>10, 3=>16}
There is also an inject version, although it is not as concise:
data.inject({}) { |res, (k, v)| res[k] ||= 0; res[k] += v; res }
inject vs each_with_object
You're pretty close; only a few changes are needed in your code:
k.inject({}) do |hash, (a, b)|
if hash[a].nil?
hash[a] = b
else
hash[a] += b
end
hash
end
First of all, you don't need the h variable. #inject accepts an argument, often called the accumulator, which you can change for each array element and then get back as the return value. Since you're already passing an empty hash to inject, you don't need the variable.
Next, you have to handle the case where the key doesn't yet exist on the hash, hence the if hash[a].nil?. In that case, we assign the value of b to the hash where the key is a. When the key exists in the hash, we can safely sum the value.
Another thing to notice is that you are using the wrong arguments of the block. When calling #inject, you first receive the accumulator (in this case, the hash), then the iteration element.
Documentation for #inject
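A possible shortening of the nil check, sketched here as an alternative rather than as the answer's own code: a hash created with Hash.new(0) returns 0 for missing keys, so += works on the first visit to a key.

```ruby
k = [[1, 8], [3, 16], [1, 0], [1, 1], [1, 1]]

totals = k.inject(Hash.new(0)) do |hash, (a, b)|
  hash[a] += b   # a missing key reads as 0, so no nil check is needed
  hash           # the block must return the accumulator
end
# totals has the entries 1 => 10 and 3 => 16

totals[99]       # => 0, and no key 99 is added by merely reading it
```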
k.group_by(&:first).transform_values {|v| v.map(&:last).sum }
You actually used the words "group by" in your question, but you never grouped the array in your code. Here, I first group the inner arrays by their first elements, ending up with:
{ 1 => [[1, 8], [1, 0], [1, 1], [1, 1]], 3 => [[3, 16]] }
Next, I only want the last element of all of the inner arrays, since I already know that the first is always going to be the key, so I use Hash#transform_values to map the two-element arrays to their last element. Lastly, I Enumerable#sum those numbers.
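The three steps described above can be spelled out one at a time. The intermediate variable names are introduced only for this sketch, and the block form of sum at the end is an alternative, not part of the answer.

```ruby
k = [[1, 8], [3, 16], [1, 0], [1, 1], [1, 1]]

grouped = k.group_by(&:first)
# 1 => [[1, 8], [1, 0], [1, 1], [1, 1]], 3 => [[3, 16]]

last_values = grouped.transform_values { |pairs| pairs.map(&:last) }
# 1 => [8, 0, 1, 1], 3 => [16]

sums = last_values.transform_values(&:sum)
# 1 => 10, 3 => 16

# Alternative: sum with a block skips the intermediate map
sums_alt = grouped.transform_values { |pairs| pairs.sum { |_, v| v } }
```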

Prevent identical pairs when shuffling and slicing Ruby array

I'd like to prevent producing pairs with the same items when producing a random set of pairs in a Ruby array.
For example:
[1,1,2,2,3,4].shuffle.each_slice(2).to_a
might produce:
[[1, 1], [3, 4], [2, 2]]
I'd like to be able to ensure that it produces a result such as:
[[4, 1], [1, 2], [3, 2]]
Thanks in advance for the help!
arr = [1,1,2,2,3,4]
loop do
sliced = arr.shuffle.each_slice(2).to_a
break sliced if sliced.none? { |a| a.reduce(:==) }
end
Here are three ways to produce the desired result (not including the approach of sampling repeatedly until a valid sample is found). The following array will be used for illustration.
arr = [1,4,1,2,3,2,1]
Use Array#combination and Array#sample
If pairs sampled were permitted to have the same number twice, the sample space would be
arr.combination(2).to_a
#=> [[1, 4], [1, 1], [1, 2], [1, 3], [1, 2], [1, 1], [4, 1], [4, 2],
# [4, 3], [4, 2], [4, 1], [1, 2], [1, 3], [1, 2], [1, 1], [2, 3],
# [2, 2], [2, 1], [3, 2], [3, 1], [2, 1]]
The pairs containing the same value twice--here [1, 1] and [2, 2]--are not wanted so they are simply removed from the above array.
sample_space = arr.combination(2).reject { |x,y| x==y }
#=> [[1, 4], [1, 2], [1, 3], [1, 2], [4, 1], [4, 2], [4, 3],
# [4, 2], [4, 1], [1, 2], [1, 3], [1, 2], [2, 3], [2, 1],
# [3, 2], [3, 1], [2, 1]]
We evidently are to sample arr.size/2 elements from sample_space. Depending on whether this is to be done with or without replacement we would write
sample_space.sample(arr.size/2)
#=> [[4, 3], [1, 2], [1, 3]]
for sampling without replacement and
Array.new(arr.size/2) { sample_space.sample }
#=> [[1, 3], [4, 1], [2, 1]]
for sampling with replacement.
Sample elements of each pair sequentially, Method 1
This method, like the next, can only be used to sample with replacement.
Let's first consider sampling a single pair. We could do that by selecting the first element of the pair randomly from arr, remove all instances of that element in arr and then sample the second element from what's left of arr.
def sample_one_pair(arr)
first = arr.sample
[first, (arr - [first]).sample]
end
To draw a sample of arr.size/2 pairs we then execute the following.
Array.new(arr.size/2) { sample_one_pair(arr) }
#=> [[1, 2], [4, 3], [1, 2]]
Sample elements of each pair sequentially, Method 2
This method is a very fast way of sampling large numbers of pairs with replacement. Like the previous method, it cannot be used to sample without replacement.
First, compute the cdf (cumulative distribution function) for drawing an element of arr at random.
counts = arr.group_by(&:itself).transform_values { |v| v.size }
#=> {1=>3, 4=>1, 2=>2, 3=>1}
def cdf(sz, counts)
frac = 1.0/sz
counts.each_with_object([]) { |(k,v),a|
a << [k, frac * v + (a.empty? ? 0 : a.last.last)] }
end
cdf_first = cdf(arr.size, counts)
#=> [[1, 0.429], [4, 0.571], [2, 0.857], [3, 1.0]]
This means that there is a probability of 0.429 (rounded) of randomly drawing a 1, 0.571 of drawing a 1 or a 4, 0.857 of drawing a 1, 4 or 2 and 1.0 of drawing one of the four numbers. We can therefore randomly sample a number from arr by obtaining a (pseudo-) random number between zero and one (p = rand) and then determining the first element [n, q] of cdf_first for which p <= q:
def draw_random(cdf)
p = rand
cdf.find { |n,q| p <= q }.first
end
draw_random(cdf_first) #=> 1
draw_random(cdf_first) #=> 4
draw_random(cdf_first) #=> 1
draw_random(cdf_first) #=> 1
draw_random(cdf_first) #=> 2
draw_random(cdf_first) #=> 3
In simulation models, incidentally, this is the standard way of generating pseudo-random variates from discrete probability distributions.
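The following is a self-contained sketch of the same cdf technique, not the answer's exact code. build_cdf and draw_from are hypothetical names, a seeded Random is passed in so the run is repeatable, and a fallback to the last entry is added on the assumption that floating-point rounding could leave the final cumulative value slightly below 1.0.

```ruby
# Build [value, cumulative_probability] pairs from an array of values.
def build_cdf(values)
  total = values.size.to_f
  running = 0.0
  values.group_by(&:itself).map do |value, group|
    running += group.size / total
    [value, running]
  end
end

# Draw one value: pick the first entry whose cumulative probability
# reaches the random number u; fall back to the last entry if none does.
def draw_from(cdf, rng = Random.new)
  u = rng.rand
  (cdf.find { |_, q| u <= q } || cdf.last).first
end

arr = [1, 4, 1, 2, 3, 2, 1]
cdf = build_cdf(arr)
rng = Random.new(1234)
draws = Array.new(10_000) { draw_from(cdf, rng) }

# Observed share of each value; the share of 1 should be close to 3/7
freq = draws.group_by(&:itself).transform_values { |v| v.size / 10_000.0 }
```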
Before drawing the second random number of the pair we need to modify cdf_first to reflect the fact that the first number cannot be drawn again. Assuming there will be many pairs to generate randomly, it is most efficient to construct a hash cdf_second whose keys are the first values drawn randomly for the pair and whose values are the corresponding cdf's.
cdf_second = counts.keys.each_with_object({}) { |n, h|
h[n] = cdf(arr.size - counts[n], counts.reject { |k,_| k==n }) }
#=> {1=>[[4, 0.25], [2, 0.75], [3, 1.0]],
# 4=>[[1, 0.5], [2, 0.833], [3, 1.0]],
# 2=>[[1, 0.6], [4, 0.8], [3, 1.0]],
# 3=>[[1, 0.5], [4, 0.667], [2, 1.0]]}
If, for example, a 2 is drawn for the first element of the pair, the probability is 0.6 of drawing a 1 for the second element, 0.8 of drawing a 1 or 4 and 1.0 of drawing a 1, 4, or 3.
We can then sample one pair as follows.
def sample_one_pair(cdf_first, cdf_second)
first = draw_random(cdf_first)
[first, draw_random(cdf_second[first])]
end
As before, to sample arr.size/2 pairs with replacement, we execute
Array.new(arr.size/2) { sample_one_pair(cdf_first, cdf_second) }
#=> [[2, 1], [3, 2], [1, 2]]
With replacement, you may get results like:
unique_pairs([1, 1, 2, 2, 3, 4]) # => [[4, 1], [1, 2], [1, 3]]
Note that 1 gets chosen three times, even though it's only in the original array twice. This is because the 1 is "replaced" each time it's chosen. In other words, it's put back into the collection to potentially be chosen again.
Here's a version of Cary's excellent sample_one_pair solution without replacement:
def unique_pairs(arr)
dup = arr.dup
Array.new(dup.size / 2) do
dup.shuffle!
first = dup.pop
second_index = dup.rindex { |e| e != first }
raise StopIteration unless second_index
second = dup.delete_at(second_index)
[first, second]
end
rescue StopIteration
retry
end
unique_pairs([1, 1, 2, 2, 3, 4]) # => [[4, 3], [1, 2], [2, 1]]
This works by creating a copy of the original array and deleting elements out of it as they're chosen (so they can't be chosen again). The rescue/retry is in there in case it becomes impossible to produce the correct number of pairs. For example, if [1, 3] is chosen first, and [1, 4] is chosen second, it becomes impossible to make three unique pairs because [2, 2] is all that's left; the sample space is exhausted.
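Because the method retries until it succeeds, it is worth knowing when success is possible at all. The sketch below uses two hypothetical helpers and assumes an even-sized array, so that every element lands in a pair. Under that assumption a valid pairing exists exactly when no value fills more than half of the array.

```ruby
# Hypothetical helper: can the array be split into pairs of differing values?
# Assumes an even number of elements.
def pairing_possible?(arr)
  return false if arr.size.odd?
  arr.group_by(&:itself).values.map(&:size).max.to_i <= arr.size / 2
end

# Hypothetical helper: does pairs use exactly the elements of source,
# with no pair holding the same value twice?
def valid_pairing?(pairs, source)
  pairs.none? { |x, y| x == y } && pairs.flatten.sort == source.sort
end

pairing_possible?([1, 1, 2, 2, 3, 4])   # => true
pairing_possible?([1, 1, 1, 2])         # => false; retrying would never end

valid_pairing?([[4, 1], [1, 2], [3, 2]], [1, 1, 2, 2, 3, 4])  # => true
valid_pairing?([[1, 1], [3, 4], [2, 2]], [1, 1, 2, 2, 3, 4])  # => false
```

Checking pairing_possible? before calling a retrying method is one way to avoid an endless loop on inputs such as [1, 1, 1, 2].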
This should be slower than Cary's solution (with replacement) but faster (on average) than the posted solutions (without replacement) that require looping and retrying. Welp, chalk up another point for "always benchmark!" I was wrong about most of my assumptions. Here are the results on my machine with an array of 16 numbers ([1, 1, 2, 2, 3, 4, 5, 5, 5, 6, 7, 7, 8, 9, 9, 10]):
cary_with_replacement
93.737k (± 2.9%) i/s - 470.690k in 5.025734s
mwp_without_replacement
187.739k (± 3.3%) i/s - 943.415k in 5.030774s
mudasobwa_without_replacement
129.490k (± 9.4%) i/s - 653.150k in 5.096761s
EDIT: I've updated the above solution to address Stefan's numerous concerns. In hindsight, the errors are obvious and embarrassing! On the plus side, the revised solution is now faster than mudasobwa's solution, and I've confirmed that the two solutions have the same biases.
You can check whether there are any matches and shuffle again:
a = [1,1,2,2,3,4]
# first time shuffle
sliced = a.shuffle.each_slice(2).to_a
# checking if there are matches and shuffle if there are
while sliced.combination(2).any? { |a, b| a.sort == b.sort } do
sliced = a.shuffle.each_slice(2).to_a
end
It is unlikely, but be aware of the possibility of an infinite loop.
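One way to rule out the endless loop is to cap the number of attempts. This is only a sketch: shuffled_pairs and max_attempts are names invented here, and the check rejects pairs whose two items are equal, which is the criterion from the question rather than the pair-versus-pair comparison used in the loop above.

```ruby
# Returns an array of pairs with no pair holding the same item twice,
# or nil if no such arrangement turns up within max_attempts shuffles.
def shuffled_pairs(arr, max_attempts: 1_000)
  max_attempts.times do
    pairs = arr.shuffle.each_slice(2).to_a
    return pairs if pairs.none? { |x, y| x == y }
  end
  nil
end

shuffled_pairs([1, 1, 2, 2, 3, 4])                  # e.g. [[4, 1], [1, 2], [3, 2]]
shuffled_pairs([1, 1, 1, 1], max_attempts: 10)      # => nil, no valid result exists
```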
