Turn array into array of arrays following structure of another array - arrays

I would like to turn a flat array into an array of arrays that follows the structure of another array of arrays. I'm not sure how to do this. Here are the arrays:
orig_array = [[0,1],[4],[3],[],[3,2,6],[]]
my_array = [2,0,1,3,3,4,5]
wanted_array = [[2,0],[1],[3],[],[3,4,5],[]]
I would like to keep the empty arrays.
Thanks

Get the length of each element in orig_array, take the cumulative sum of those lengths to get the indices at which my_array needs to be split, and finally use np.split to perform the splitting. The implementation would look something like this -
import numpy as np
lens = [len(item) for item in orig_array]
# np.split leaves one trailing piece after the last index, so drop it
out = np.split(my_array,np.cumsum(lens))[:-1]
Sample run -
In [72]: orig_array = np.array([[0,1],[4],[3],[],[3,2,6],[]], dtype=object)
...: my_array = np.array([2,0,1,3,3,4,5])
...:
In [73]: lens = [len(item) for item in orig_array]
...: out = np.split(my_array,np.cumsum(lens))[:-1]
...:
In [74]: out
Out[74]:
[array([2, 0]),
array([1]),
array([3]),
array([], dtype=int64),
array([3, 4, 5]),
array([], dtype=int64)]

def do(format, values):
    # Recurse into nested lists; replace every leaf with the next value
    if type(format) == list:
        return [do(v, values) for v in format]
    else:
        return values.pop(0)
print(do(orig_array, my_array))
Note: this empties the list the values come from (my_array here).

You could do the following:
import copy
def reflect_array(orig_array, order):
    wanted_array = copy.deepcopy(orig_array)
    for i, part_list in enumerate(orig_array):
        for j, _ in enumerate(part_list):
            # pop(0) takes values from the front, preserving their order
            wanted_array[i][j] = order.pop(0)
    return wanted_array
Test run:
orig_array = [[0,1],[4],[3],[],[3,2,6],[]]
my_array = [2,0,1,3,3,4,5]
print(reflect_array(orig_array, my_array))
# [[2, 0], [1], [3], [], [3, 4, 5], []]

In [858]: my_array = [2,0,1,3,3,4,5]
In [859]: [[my_array.pop(0) for _ in range(len(x))] for x in orig_array]
Out[859]: [[2, 0], [1], [3], [], [3, 4, 5], []]
Pop from a copy (b = my_array[:]) if you don't want to change my_array.
This operates on the same principle as @karoly's answer; it is just more direct because it assumes only one level of nesting.

Related

replace numpy elements with non-scalar dictionary values

import pandas as pd
import numpy as np
column = np.array([5505, 5505, 5505, 34565, 34565, 65539, 65539])
column = pd.Series(column)
myDict = column.groupby(by = column ).groups
I am creating a dictionary from a pandas df using df.groupby(by=..), which has the form:
>>> myDict
{5505: Int64Index([0, 1, 2], dtype='int64'), 65539: Int64Index([5, 6], dtype='int64'), 34565: Int64Index([3, 4], dtype='int64')}
I have a numpy array, e.g.
myArray = np.array([34565, 34565, 5505,65539])
and I want to replace each of the array's elements with the dictionary's values.
I have tried several solutions that I have found (e.g. here and here) but these examples have dictionaries with single dictionary values, and I am always getting the error of setting an array element with a sequence. How can I get over this problem?
My intended output is
np.array([3, 4, 3, 4, 0, 1, 2, 5, 6])
One approach based on np.searchsorted -
# Extract dict info
k = list(myDict.keys())
v = list(myDict.values())
# Use argsort of k to find search sorted indices from myArray in keys
# Index into the values of dict based on those indices for output
sidx = np.argsort(k)
idx = sidx[np.searchsorted(k,myArray,sorter=sidx)]
out_arr = np.concatenate([v[i] for i in idx])
Sample input, output -
In [369]: myDict
Out[369]:
{5505: Int64Index([0, 1, 2], dtype='int64'),
34565: Int64Index([3, 4], dtype='int64'),
65539: Int64Index([5, 6], dtype='int64')}
In [370]: myArray
Out[370]: array([34565, 34565, 5505, 65539])
In [371]: out_arr
Out[371]: array([3, 4, 3, 4, 0, 1, 2, 5, 6])

Sort an array of arrays by the number of same occurrences in Ruby

This question is different from this one.
I have an array of arrays of AR items looking something like:
[[1,2,3], [4,5,6], [7,8,9], [7,8,9], [1,2,3], [7,8,9]]
I would like to sort it by how many times each sub-array occurs, most frequent first, keeping one copy of each:
[[7,8,9], [1,2,3], [4,5,6]]
My real data is more complex, looking something like:
raw_data = {}
raw_data[:grapers] = []
suggested_data = {}
suggested_data[:grapers] = []
varietals = []
similar_vintage.varietals.each do |varietal|
# sub_array
varietals << Graper.new(:name => varietal.grape.name, :grape_id => varietal.grape_id, :percent => varietal.percent)
end
raw_data[:grapers] << varietals
So I want to sort raw_data[:grapers] by how often each varietals array occurs, comparing the arrays by the grape_id values inside them.
When I need to sort a plain array of data by most occurrences, I do this:
grapers_with_frequency = raw_data[:grapers].inject(Hash.new(0)) { |h,v| h[v] += 1; h }
suggested_data[:grapers] << raw_data[:grapers].max_by { |v| grapers_with_frequency[v] }
This code doesn't work because the elements are sub-arrays containing AR models that I need to analyze.
Possible solution:
array.group_by(&:itself)        # grouping
     .sort_by {|k, v| -v.size } # sorting
     .map(&:first)              # optional step, depends on your real data
#=> [[7, 8, 9], [1, 2, 3], [4, 5, 6]]
I recommend you take a look at the Ruby documentation for the sort_by method. It allows you to sort an array using anything associated with the elements, rather than the values of the elements.
my_array.sort_by { |elem| -my_array.count(elem) }.uniq
=> [[7, 8, 9], [1, 2, 3], [4, 5, 6]]
This example sorts by the count of each element in the original array. This is preceded with a minus so that the elements with the highest count are first. The uniq is to only have one instance of each element in the final result.
You can include anything you like in the sort_by block.
As Ilya has pointed out, having my_array.count(elem) in each iteration will be costlier than using group_by beforehand. This may or may not be an issue for you.
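To illustrate the cost point above, here is a minimal sketch that counts every element once up front and then sorts by the stored counts. It assumes Ruby 2.7 or later for Enumerable#tally, and the method name unique_by_frequency is made up for this example. Ruby's sort_by is not stable, so elements with equal counts may come out in any order.

```ruby
# Hypothetical helper: count each element once, then order by count.
def unique_by_frequency(items)
  items
    .tally                              # {element => number of occurrences}
    .sort_by { |_item, count| -count }  # highest count first
    .map(&:first)                       # keep only the elements
end
```

Because tally keys the hash on each distinct element, the result is already free of duplicates and no uniq call is needed.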
arr = [[1,2,3], [4,5,6], [7,8,9], [7,8,9], [1,2,3], [7,8,9]]
arr.each_with_object(Hash.new(0)) { |a,h| h[a] += 1 }.
    sort_by(&:last).
    reverse.
    map(&:first)
#=> [[7,8,9], [1,2,3], [4,5,6]]
This uses the form of Hash::new that takes an argument (here 0) that is the hash's default value.
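For the real data in the question, where each sub-array holds objects rather than numbers, one option is to group on a key derived from the objects. The sketch below is only an illustration under assumptions: Graper is modelled as a plain Struct standing in for the question's class, the method name rank_by_grape_ids is made up, and two sub-arrays are treated as the same when they contain the same grape_id values in any order.

```ruby
# Stand-in for the question's Graper class (assumption for this sketch).
Graper = Struct.new(:name, :grape_id, :percent, keyword_init: true)

# Hypothetical helper: one representative per distinct set of grape ids,
# most frequent set first.
def rank_by_grape_ids(groups)
  groups
    .group_by { |varietals| varietals.map(&:grape_id).sort }
    .sort_by { |_ids, members| -members.size }
    .map { |_ids, members| members.first }
end
```

Dropping the sort inside the group_by block would make the comparison order-sensitive instead.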

Using self.dup, but failing rspec test to not modify original array

I'm creating a method to transpose square 2-d arrays. My method passes every test, except the "does not modify original array" one. I'm only working on the duped array, so I'm confused on why the test is failing.
Code:
class Array
  def my_transpose
    orig_arr = self.dup; array = []
    orig_arr[0].length.times do
      temp_arr = []
      orig_arr.each { |arr| temp_arr << arr.shift }
      array << temp_arr
    end
    array
  end
end
RSpec:
describe Array do
  describe "#my_transpose" do
    let(:arr) { [
      [1, 2, 3],
      [4, 5, 6],
      [7, 8, 9]
    ] }
    let(:small_arr) { [
      [1, 2],
      [3, 4]
    ] }
    it "transposes a small matrix" do
      expect(small_arr.my_transpose).to eq([
        [1, 3],
        [2, 4]
      ])
    end
    it "transposes a larger matrix" do
      expect(arr.my_transpose).to eq([
        [1, 4, 7],
        [2, 5, 8],
        [3, 6, 9]
      ])
    end
    it "should not modify the original array" do
      small_arr.my_transpose
      expect(small_arr).to eq([
        [1, 2],
        [3, 4]
      ])
    end
    it "should not call the built-in #transpose method" do
      expect(arr).not_to receive(:transpose)
      arr.my_transpose
    end
  end
end
Output:
7) Array#my_transpose should not modify the original array
Failure/Error: expect(small_arr).to eq([
expected: [[1, 2], [3, 4]]
got: [[], []]
(compared using ==)
# ./spec/00_array_extensions_spec.rb:123:in `block (3 levels) in <top (required)>'
When you call dup on an array, it only duplicates the array itself; the array's contents are not also duplicated. So, for example:
a = [[1,2],[3,4]]
b = a.dup
a.object_id == b.object_id # => false
a[0].object_id == b[0].object_id # => true
Thus, modifications to a itself are not reflected in b (and vice versa), but modifications in the elements of a are reflected in b, because those elements are the same objects.
That being the case, the problem crops up here:
orig_arr.each { |arr| temp_arr << arr.shift }
arr is an element of orig_arr, but it is also an element of self. If you did something like remove it from orig_arr, you would not also remove it from self, but if you change it, it's changed, no matter how you are accessing it, and as it turns out, Array#shift is a destructive operation.
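Probably the smallest change you could make to your code to make it work as you expect would be to take the index that times yields to its block and use it to index into arr, rather than calling arr.shift. (The index from each_with_index would be the row number, which would give you the diagonal rather than a column.) So:
orig_arr[0].length.times do |i|
  temp_arr = []
  orig_arr.each { |arr| temp_arr << arr[i] }
  array << temp_arr
end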
In fact, though, once you're doing that, you're not doing any destructive operations at all and you don't need orig_arr, you can just use self.
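As a rough illustration of that last point, here is an index-based transpose that never mutates its input. It is written as a standalone method with a made-up name (transpose_without_mutation) rather than as the Array#my_transpose patch from the question, and it assumes every row has the same length as the first.

```ruby
# Hypothetical helper: build each output row from one column of the input.
def transpose_without_mutation(matrix)
  return [] if matrix.empty?

  Array.new(matrix.first.length) do |col|
    matrix.map { |row| row[col] }  # reading by index leaves row untouched
  end
end
```

Since only row[col] reads are involved, no copy of the input is needed at all.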
The original array isn’t being modified, but the arrays within it are, as dup is a shallow clone.
xs = [[1,2],[3,4]]
ids = xs.map(&:object_id)
xs.my_transpose
ids == xs.map(&:object_id) #=> true
Since shift is a mutating operation (being performed on the nested array elements), you need to dup the elements within the array as well, e.g.
orig_arr = dup.map(&:dup)
With this modification, your test should pass.
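The difference between the two kinds of copy can be seen in a small sketch. Both helper names below are made up for the illustration; each pulls the first column out with shift, once through a shallow copy and once through per-row copies.

```ruby
# Shallow copy: the outer array is new, but the rows are the same objects,
# so shift empties the caller's rows as well.
def first_column_shallow(matrix)
  matrix.dup.map { |row| row.shift }
end

# Per-row copies: shift only touches the duplicated rows.
def first_column_copied(matrix)
  matrix.map(&:dup).map { |row| row.shift }
end
```

Both return the same column; only the first one leaves the caller's rows shortened.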

Destructive reject from an array returning the values rejected

Is there a sensible way to do the following:
I want to take an array and select specific items from the array according to conditions, removing them from the array as they go.
(I basically want to split the contents of an array into categories).
array = [1,2,3,4,5,6,7,8]
less_than_three = array.reject_destructively{|v| v<3}
=> [1,2]
array
=> [3,4,5,6,7,8]
more_than_five = array.reject_destructively{|v| v>5}
=> [6,7,8]
array
=> [3,4,5]
I've tried delete_if, select!, reject! and none of them seem to be able to give you the affected items whilst leaving the array with the rest.
Unless I'm going mad, which is entirely possible.
As I understood the question, you do not want to produce two new objects. Here you go:
class Array
  # Explicit &block: a bare Proc.new no longer picks up the method's block in Ruby 3.0+
  def carve!(&block)
    dup.tap { delete_if(&block) } - self
  end
end
array = [1,2,3,4,5,6,7,8]
p array.carve! { |v| v < 3 }
#⇒ [1, 2] # returned by Array#carve! method
p array
#⇒ [3, 4, 5, 6, 7, 8] # remained in original array
Using this solution, array.__id__ remains the same. And this is the golfiest answer all around :)
You can build your own method for this...
class Array
  def extract(&block)
    temp = self.select(&block)
    self.reject!(&block)
    temp
  end
end
then...
a = [1, 2, 3, 4, 5]
a.extract{|x| x < 3}
=> [1,2]
p a
=> [3, 4, 5]
EDIT: If you don't want to monkey patch (but monkey patching isn't evil in itself) you can do it with a vanilla method...
def select_from_array(array, &block)
  temp = array.select(&block)
  array.reject!(&block)
  temp
end
array = [1,2,3,4,5,6,7,8]
less_than_three = select_from_array(array){|v| v<3}
=> [1,2]
array
=> [3,4,5,6,7,8]
more_than_five = select_from_array(array){|v| v>5}
=> [6,7,8]
array
=> [3,4,5]
In rails 6 there is a method extract!:
a = [1, 2, 3] #=> [1, 2, 3]
a.extract! { |num| num.odd? } #=> [1, 3]
a #=> [2]
irb(main):001:0> array = [1,2,3,4,5,6,7,8]
=> [1, 2, 3, 4, 5, 6, 7, 8]
irb(main):002:0> array.partition{|v| v < 3}
=> [[1, 2], [3, 4, 5, 6, 7, 8]]
Is there a specific reason why this has to be destructive?
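If the destructive behaviour is required, partition can still do the splitting. The sketch below is one way to combine it with Array#replace so the original object keeps its identity; the method name extract_matching! is made up for this example.

```ruby
# Hypothetical helper: split once, keep the rest in the same array object,
# and return the matched items.
def extract_matching!(array, &block)
  matched, rest = array.partition(&block)
  array.replace(rest)  # same object, new contents
  matched
end
```

Unlike the select-then-reject! versions, this evaluates the block only once per element.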
Will this help?
class Array
  def reject_destructively(&block)
    arr = self.select(&block)
    arr.each{ |i| self.delete(i) }
    arr
  end
end
array = [1,2,3,4,5,6,7,8]
p less_than_three = array.reject_destructively{|v| v<3}
#=> [1,2]
p array
#=> [3,4,5,6,7,8]
p more_than_five = array.reject_destructively{|v| v>5}
#=> [6,7,8]
p array
#=> [3,4,5]
The above code can be simplified further to look like:
class Array
  def reject_destructively(&block)
    self.select(&block).each{ |i| self.delete(i) }
  end
end
OK. This works, avoids monkey patching and keeps it to one line, but it's damn ugly. Note that reject! returns nil when no element matches, in which case the subtraction raises a TypeError.
less_than_three = array.dup - array.reject!{|v| v<3}
=> [1,2]
array
=> [3,4,5,6,7,8]
more_than_five = array.dup - array.reject!{|v| v>5}
=> [6,7,8]
array
=> [3,4,5]
module Enumerable
  def reject_destructively
    array=[]
    self.each do |y|
      if yield(y)
        array<<y
      end
    end
    array.each do |x|
      self.delete(x)
    end
    return array
  end
end
array=[10,9,2,1,3,45,52]
print less_than_three = array.reject_destructively{|v| v < 3}
print array
You can use group_by to get all of the elements that satisfy the condition in one group, and all of the rest in the other.
For example
[1,2,3,4,5].group_by{|i| i > 3}
gives
{false=>[1, 2, 3], true=>[4, 5]}
More information is available at http://ruby-doc.org/core-2.1.1/Enumerable.html#method-i-group_by
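Since the question mentions splitting the array into categories, group_by can also return a label per element instead of true or false. The sketch below assumes the three buckets from the question; the method name categorize and the symbol names are made up for the illustration.

```ruby
# Hypothetical helper: label each value with a category, grouping in one pass.
def categorize(values)
  values.group_by do |v|
    if v < 3
      :less_than_three
    elsif v > 5
      :more_than_five
    else
      :rest
    end
  end
end
```

This leaves the original array untouched; a category with no members is simply absent from the resulting hash.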

Deleting an array from a multidimensional array in ruby

How would this get done? Assume I have the following
arr = [['test', 0, 0, 0], ['apples', 0, 9, 8]]
I know I would do something like:
def delete_me(item)
  arr.each do |a|
    if a[0] == item
      #delete the array containing test
    end
  end
end
delete_me('test')
As far as I can see you can only do a.remove(), but that leaves me with an empty []. I don't want that; I want it completely gone.
You can use delete_if and match the first term to your argument:
arr = [['test', 0, 0, 0], ['apples', 0, 9, 8]]
def delete_me(array, term)
  array.delete_if {|x, *_| x == term }
end
(I've included the array as an argument as well, as the execution context is not clear from your post).
Following up on @iamnotmaynard's suggestion:
arr.delete_if { |a| a[0] == 'test' }
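For comparison, a short sketch of the in-place and the non-destructive variants side by side. Both method names (delete_row! and without_row) are made up for this example.

```ruby
# Removes matching rows from the array it is given and returns that array.
def delete_row!(rows, key)
  rows.delete_if { |row| row.first == key }
end

# Returns a new array without the matching rows; the input is left as it was.
def without_row(rows, key)
  rows.reject { |row| row.first == key }
end
```

Which one fits depends on whether other code still holds a reference to the original array.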
You can also use assoc:
arr.delete(arr.assoc("test"))
I had a similar need to remove one or more columns that matched a text pattern.
col_to_delete = 'test'
arr = [['test','apples','pears'],[2,3,5],[3,6,8],[1,3,1]]
arr.transpose.collect{|a| a if (a[0] != col_to_delete)}.reject(&:nil?).transpose
=> [["apples", "pears"], [3, 5], [6, 8], [3, 1]]
