Suppose I have an array a and a boolean array b. I want to extract a fixed number of elements from the valid elements in each row of a, where the valid elements are the ones indicated by b.
Here is an example:
a = np.arange(24).reshape(4,6)
b = np.array([[0,0,1,1,0,0],[0,1,0,1,0,1],[0,1,1,1,1,0],[0,0,0,0,1,1]]).astype(bool)
x = []
for i in range(a.shape[0]):
    c = a[i, b[i]]
    d = np.random.choice(c, 2)
    x.append(d)
Here I used a for loop, which will be slow when these arrays are big and high-dimensional. Is there a more efficient way to do this? Thanks.
Generate a random uniform [0, 1] matrix with the same shape as a.
Multiply this matrix by the mask b to set invalid elements to zero.
Select the indices of the k maximum values in each row (this simulates an unbiased random k-sample from only the valid elements of that row).
(Optional) use these indices to get the elements.
a = np.arange(24).reshape(4,6)
b = np.array([[0,0,1,1,0,0],[0,1,0,1,0,1],[0,1,1,1,1,0],[0,0,0,0,1,1]])
k = 2
r = np.random.uniform(size=a.shape)
indices = np.argpartition(-r * b, k)[:,:k]
To get the elements from the indices:
>>> indices
array([[3, 2],
       [5, 1],
       [3, 2],
       [4, 5]])
>>> a[np.arange(a.shape[0])[:,None], indices]
array([[ 3,  2],
       [11,  7],
       [15, 14],
       [22, 23]])
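To see why this picks only valid elements: masked entries of -r * b are exactly 0, while valid entries are negative random values, so the k smallest values in a row can only sit at valid positions as long as each row of b has at least k True entries. A minimal sanity check along those lines (my own sketch, using the same setup as above):
import numpy as np

a = np.arange(24).reshape(4, 6)
b = np.array([[0,0,1,1,0,0],
              [0,1,0,1,0,1],
              [0,1,1,1,1,0],
              [0,0,0,0,1,1]]).astype(bool)
k = 2

r = np.random.uniform(size=a.shape)
indices = np.argpartition(-r * b, k)[:, :k]

# Every selected index should point at a True position in b
# (this holds whenever each row has at least k valid elements).
rows = np.arange(a.shape[0])[:, None]
assert b[rows, indices].all()
print(a[rows, indices])   # k randomly chosen valid elements per row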
I'm working to update the SVG::Graph gem, and have made many improvements to my version, but have found a bottleneck with multiple array sorting.
There is a "sort_multiple" function built in, which keeps an array of arrays (all of equal size) sorted by the first array in the group.
The issue I have is that this sort works well on truly random data, but really badly on sorted or almost-sorted data:
def sort_multiple( arrys, lo=0, hi=arrys[0].length-1 )
  if lo < hi
    p = partition(arrys, lo, hi)
    sort_multiple(arrys, lo, p-1)
    sort_multiple(arrys, p+1, hi)
  end
  arrys
end

def partition( arrys, lo, hi )
  p = arrys[0][lo]
  l = lo
  z = lo+1
  while z <= hi
    if arrys[0][z] < p
      l += 1
      arrys.each { |arry| arry[z], arry[l] = arry[l], arry[z] }
    end
    z += 1
  end
  arrys.each { |arry| arry[lo], arry[l] = arry[l], arry[lo] }
  l
end
This routine appears to use a variant of the Lomuto partition scheme from Wikipedia: https://en.wikipedia.org/wiki/Quicksort#Lomuto_partition_scheme
I have an array of 5000+ numbers, which is already sorted, and this function adds about half a second per chart.
I have modified the "sort_multiple" routine with the following:
def sort_multiple( arrys, lo=0, hi=arrys[0].length-1 )
  first = arrys.first
  return arrys if first == first.sort
  if lo < hi
    ...
which has "fixed" the problem with sorted data, but I was wondering if there is any way to utilise the better sort functions built into Ruby to make this sort much quicker. E.g. do you think I could utilise TSort to speed this up? https://ruby-doc.org/stdlib-2.6.1/libdoc/tsort/rdoc/TSort.html
Looking at my benchmarking, the completely random first group appears to be very fast.
Current benchmarking:
def sort_multiple( arrys, lo=0, hi=arrys[0].length-1 )
  if lo < hi
    p = partition(arrys, lo, hi)
    sort_multiple(arrys, lo, p-1)
    sort_multiple(arrys, p+1, hi)
  end
  arrys
end

def partition( arrys, lo, hi )
  p = arrys[0][lo]
  l = lo
  z = lo+1
  while z <= hi
    if arrys[0][z] < p
      l += 1
      arrys.each { |arry| arry[z], arry[l] = arry[l], arry[z] }
    end
    z += 1
  end
  arrys.each { |arry| arry[lo], arry[l] = arry[l], arry[lo] }
  l
end
first = (1..5400).map { rand }
second = (1..5400).map { rand }
unsorted_arrys = [first.dup, second.dup, Array.new(5400), Array.new(5400), Array.new(5400)]
sorted_arrys = [first.sort, second.dup, Array.new(5400), Array.new(5400), Array.new(5400)]
require 'benchmark'
Benchmark.bmbm do |x|
  x.report("unsorted") { sort_multiple( unsorted_arrys.map(&:dup) ) }
  x.report("sorted")   { sort_multiple( sorted_arrys.map(&:dup) ) }
end
results:
Rehearsal --------------------------------------------
unsorted   0.070699   0.000008   0.070707 (  0.070710)
sorted     0.731734   0.000000   0.731734 (  0.731742)
----------------------------------- total: 0.802441sec

               user     system      total        real
unsorted   0.051636   0.000000   0.051636 (  0.051636)
sorted     0.715730   0.000000   0.715730 (  0.715733)
#EDIT#
Final accepted solution:
def sort( *arrys )
  new_arrys = arrys.transpose.sort_by(&:first).transpose
  new_arrys.each_index { |k| arrys[k].replace(new_arrys[k]) }
end
I have an array of 5000+ numbers, which is already sorted, and this function adds about half a second per chart.
Unfortunately, algorithms implemented in Ruby can become quite slow. It's often much faster to delegate the work to the built-in methods that are implemented in C, even if it comes with an overhead.
To sort a nested array, you could transpose it, then sort_by its first element, and transpose again afterwards:
arrays.transpose.sort_by(&:first).transpose
It works like this:
arrays #=> [[3, 1, 2], [:c, :a, :b]]
.transpose #=> [[3, :c], [1, :a], [2, :b]]
.sort_by(&:first) #=> [[1, :a], [2, :b], [3, :c]]
.transpose #=> [[1, 2, 3], [:a, :b, :c]]
And although it creates several temporary arrays along the way, the result seems to be an order of magnitude faster than the "unsorted" variant:
unsorted    0.035297   0.000106   0.035403 (  0.035458)
sorted      0.474134   0.003065   0.477199 (  0.480667)
transpose   0.001572   0.000082   0.001654 (  0.001655)
In the long run, you could try to implement your algorithm as a C extension.
I confess I don't fully understand the question and don't have the time to study the code at the link, but it seems that you have one sorted array that you are repeatedly mutating only slightly, and with each change you may mutate several other arrays, each a little or a lot. After each set of mutations you re-sort the first array and then rearrange each of the other arrays consistent with the changes in the indices of elements in the first array.
If, for example, the first array were
arr = [2,4,6,8,10]
and the change to arr were to replace the element at index 1 (4) with 9 and the element at index 3 (8) with 3, arr would become [2,9,6,3,10], which, after re-sorting, would be [2,3,6,9,10]. We could do that as follows:
new_arr, indices = [2,9,6,3,10].each_with_index.sort.transpose
#=> [[2, 3, 6, 9, 10], [0, 3, 2, 1, 4]]
Therefore,
new_arr
#=> [2, 3, 6, 9, 10]
indices
#=> [0, 3, 2, 1, 4]
the intermediate calculation being
[2,9,6,3,10].each_with_index.sort
#=> [[2, 0], [3, 3], [6, 2], [9, 1], [10, 4]]
Considering that
new_arr == [2,9,6,3,10].values_at(*indices)
#=> true
we see that each of the other arrays, after having been mutated, can be reordered to conform with the sorting of the first array using the following method, which is quite fast.
def sort_like_first(a, indices)
  a.values_at(*indices)
end
For example,
a = [5,4,3,1,2]
a.replace(sort_like_first(a, indices))
a #=> [5, 1, 3, 4, 2]

a = %w|dog cat cow pig owl|
a.replace(sort_like_first(a, indices))
a #=> ["dog", "pig", "cow", "cat", "owl"]
In fact, it's not necessary to sort each of the other arrays until they are required in the calculations.
I would now like to consider a special case, namely, when only a single element in the first array is to be changed.
Suppose (as before)
arr = [2,4,6,8,10]
and the element at index 3 (8) is to be replaced with 5, resulting in [2,4,6,5,10]. A fast sort can be done with the following method, which employs a binary search.
def new_indices(arr, replace_idx, replace_val)
  new_loc = arr.bsearch_index { |n| n >= replace_val } || arr.size
  indices = (0..arr.size-1).to_a
  index_removed = indices.delete_at(replace_idx)
  new_loc -= 1 if new_loc > replace_idx
  indices.insert(new_loc, index_removed)
end
arr.bsearch_index { |n| n >= replace_val } returns nil if n >= replace_val is false for all n. It is for that reason I have tacked on || arr.size.
See Array#bsearch_index, Array#delete_at and Array#insert.
Let's try it. If
arr = [2,4,6,8,10]
replace_idx = 3
replace_val = 5
then
indices = new_indices(arr, replace_idx, replace_val)
#=> [0, 1, 3, 2, 4]
Only now can we replace the element of arr at index replace_idx.
arr[replace_idx] = replace_val
arr
#=> [2, 4, 6, 5, 10]
We see that the re-sorted array is as follows.
arr.values_at(*indices)
#=> [2, 4, 5, 6, 10]
The other arrays are sorted as before, using sort_like_first:
a = [5,4,3,1,2]
a.replace(sort_like_first(a, indices))
#=> [5, 4, 1, 3, 2]
a = %w|dog cat cow pig owl|
a.replace(sort_like_first(a, indices))
#=> ["dog", "cat", "pig", "cow", "owl"]
Here's a second example.
arr = [2,4,6,8,10]
replace_idx = 3
replace_val = 12
indices = new_indices(arr, replace_idx, replace_val)
#=> [0, 1, 2, 4, 3]
arr[replace_idx] = replace_val
arr
#=> [2, 4, 6, 12, 10]
The first array sorted is therefore
arr.values_at(*indices)
#=> [2, 4, 6, 10, 12]
The other arrays are sorted as follows.
a = [5,4,3,1,2]
a.replace(sort_like_first(a, indices))
a #=> [5, 4, 3, 2, 1]

a = %w|dog cat cow pig owl|
a.replace(sort_like_first(a, indices))
a #=> ["dog", "cat", "cow", "owl", "pig"]
I have two numpy arrays:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
and I want to concatenate them into two columns, like:
1 4
2 5
3 6
Is there any way to do this without transposing or reshaping the arrays?
You can try:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = np.concatenate((a[np.newaxis, :], b[np.newaxis, :]), axis = 0).T
And you get:
c = array([[1, 4],
           [2, 5],
           [3, 6]])
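If avoiding the explicit .T matters, one possible alternative is np.column_stack, which builds the two-column layout directly from the 1-D arrays; a minimal sketch:
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Stack the 1-D arrays as the columns of a 2-D array, no transpose needed.
c = np.column_stack((a, b))
print(c)
# [[1 4]
#  [2 5]
#  [3 6]]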
I had two lists:
a=[0,0,0,1,1,1,1,2,2]
b=[2,5,12,2,3,8,9,4,6]
And I wanted to get:
c=[[0,2,5,12],[1,2,3,8,9],[2,4,6]]
a and b are correlated with each other: a[i] relates to b[i]. When the value in a changes, for example from 0 to 1, the current inner list of c ends, so 12 is the last element of the first inner list.
I tried it with an if/else statement, but it failed.
How can I get c in Python?
This code produces c in a good enough way (provided a and b are always arranged in the same way as in the example):
a=[0,0,0,1,1,1,1,2,2]
b=[2,5,12,2,3,8,9,4,6]
c = []
i = 0
while i < len(a):
    d = a.count(a[i])
    c.append([a[i]] + b[i:i + d])
    i += d
print(c) # ==> [[0, 2, 5, 12], [1, 2, 3, 8, 9], [2, 4, 6]]
We can zip the lists, group by the first value (from a), and make lists of the second values:
from itertools import groupby
from operator import itemgetter
a=[0,0,0,1,1,1,1,2,2]
b=[2,5,12,2,3,8,9,4,6]
[list(map(itemgetter(1), group)) for _, group in groupby(zip(a, b), key=itemgetter(0))]
#[[2, 5, 12], [2, 3, 8, 9], [4, 6]]
Similar to @Thierry Lathuille's answer, but this one does actually prepend the keys to the sublists, as requested by the OP:
import itertools as it
ib = iter(b)
[[k, *(next(ib) for _ in gr)] for k, gr in it.groupby(a)]
# [[0, 2, 5, 12], [1, 2, 3, 8, 9], [2, 4, 6]]
Here's my simple solution. Notice that you are splitting the list b by the counts of the elements in list a. A deque is used for popping elements from the left in O(1) time.
from collections import Counter, deque

a = [0,0,0,1,1,1,1,2,2]
b = deque([2,5,12,2,3,8,9,4,6])
c = Counter(a)                      # occurrences of each value in a
new_list = []
for x in c:
    # pop as many items from b as there are occurrences of x in a
    new_list.append([x] + [b.popleft() for i in range(c[x])])
print(new_list)  # ==> [[0, 2, 5, 12], [1, 2, 3, 8, 9], [2, 4, 6]]
I want to add the values of arrays that have different lengths.
a =[1,2,3]
b= [1,2]
c = [1,2,3,4]
and so on...
I want the result to be [3, 6, 6, 4]. How can I do this in Ruby on Rails?
In order to make it dynamic, I would create an array of arrays from your a, b, c:
a = [1, 2, 3]
b = [1, 2]
c = [1, 2, 3, 4]
arrays = [a, b, c]
Then I would retrieve the max size:
max_size = arrays.map(&:size).max #=> 4
Then the following line would give you your answer:
max_size.times.map{ |i| arrays.reduce(0){|s, a| s + a.fetch(i, 0)}} #=> [3, 6, 6, 4]
You can build a new array that consists of all those arrays, and then write the following code to get an array with the combined entries of each array:
a = [1,2,3]
b = [1,2,3]
c = [1,2,3,4]
Before you apply the rest of the code, you need to make sure that each array has the same length. For that, you can append 0s where needed, so that each array is as long as the longest one.
a = [1,2,3,0]
b = [1,2,3,0]
c = [1,2,3,4]
combined_array = [a,b,c]
result = combined_array.transpose.map { |a| a.reduce :+ }
Extending the answer from @Arslan Ali:
I added a way to make all the arrays the same size, so that his method of summing can be applied:
a = [1,2,3]
b = [1,2,3]
c = [1,2,3,4]
arrays = [a, b, c]
size = arrays.map { |a| a.size }.max                              # Compute maximum size
combined_array = arrays.map { |a| a.fill(a.size...size) { 0 } }   # Fill arrays to maximum size
result = combined_array.transpose.map { |a| a.reduce :+ } # Sum everything
Here's one way:
a = [1, 2, 3]
b = [1, 2, 3]
c = [1, 2, 3, 4]
[a, b, c].inject([]) do |totals, add|
  add.each_with_index do |n, i|
    totals[i] = (totals[i] || 0) + n
  end
  totals
end