remove duplicates based on one field in a numpy array


How do I remove rows from a numpy array when one field contains duplicate values?
For example, I have an array like this:
vals = numpy.array([[1,2,3],[1,5,6],[1,8,7],[0,4,5],[2,2,1],[0,0,0],[5,4,3]])
array([[1, 2, 3],
[1, 5, 6],
[1, 8, 7],
[0, 4, 5],
[2, 2, 1],
[0, 0, 0],
[5, 4, 3]])
I need to remove the duplicates for field [0], so that I get a result like:
array([[1, 2, 3],
[0, 4, 5],
[2, 2, 1],
[5, 4, 3]])

You can use numpy.unique:
In [11]: vals
Out[11]:
array([[1, 2, 3],
[1, 5, 6],
[1, 8, 7],
[0, 4, 5],
[2, 2, 1],
[0, 0, 0],
[5, 4, 3]])
In [12]: unique_keys, indices = np.unique(vals[:,0], return_index=True)
In [13]: vals[indices]
Out[13]:
array([[0, 4, 5],
[1, 2, 3],
[2, 2, 1],
[5, 4, 3]])
To maintain the original order:
In [17]: vals[np.sort(indices)]
Out[17]:
array([[1, 2, 3],
[0, 4, 5],
[2, 2, 1],
[5, 4, 3]])
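
The two steps above can be wrapped in a small helper. This is only a sketch: the function name drop_duplicate_keys and its column parameter are made up for illustration and are not part of NumPy. It assumes a 2-D array and keeps the first row seen for each key.

```python
import numpy as np

def drop_duplicate_keys(arr, column=0):
    """Keep the first row for each distinct value in `column`, preserving row order.

    Hypothetical helper for illustration; not a NumPy function.
    """
    # return_index yields the index of the first occurrence of each unique key
    _, first_idx = np.unique(arr[:, column], return_index=True)
    # sorting those indices restores the original row order
    return arr[np.sort(first_idx)]

vals = np.array([[1, 2, 3], [1, 5, 6], [1, 8, 7], [0, 4, 5],
                 [2, 2, 1], [0, 0, 0], [5, 4, 3]])
result = drop_duplicate_keys(vals)
print(result)
```

Passing a different column index deduplicates on that field instead.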

Related

How to insert a number to each array in an array of arrays?

From:
arr1 = np.array([ [1, 2, 3], [4, 5, 6], [7, 8, 9] ])
To:
arr1 = np.array([ [0, 1, 2, 3], [0, 4, 5, 6], [0, 7, 8, 9] ])

You can try something like this with numpy.full:
x = 0
new = np.full((arr1.shape[0], arr1.shape[1] + 1), x)
new[:, 1:] = arr1
Output
new
array([[0, 1, 2, 3],
[0, 4, 5, 6],
[0, 7, 8, 9]])
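Note that you can assign any value to x. Because no dtype is passed, np.full infers the dtype of new from x, so a float x produces a float array.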

Your (3,3) 2d array:
In [100]: arr1 = np.array([ [1, 2, 3], [4, 5, 6], [7, 8, 9] ])
In [101]: arr1
Out[101]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
A new (3,4) array:
In [102]: np.concatenate((np.zeros((3,1),int),arr1), axis=1)
Out[102]:
array([[0, 1, 2, 3],
[0, 4, 5, 6],
[0, 7, 8, 9]])
Any other (3,1) array (or even a (3,n)) could be added "at the start" like this.
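
A third option, not used in the answers above, is numpy.insert, which can place a scalar before a given column index in every row. A minimal sketch with the same arr1:

```python
import numpy as np

arr1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Insert the scalar 0 before column index 0 of every row (axis=1 means columns)
new = np.insert(arr1, 0, 0, axis=1)
print(new)
```

Like the other approaches, this returns a new array and leaves arr1 unchanged.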

Slice sets of columns in numpy

Consider a numpy array as such:
>>> a = np.array([[1, 2, 3, 0, 1], [2, 3, 2, 2, 2], [0, 3, 3, 2, 2]])
>>> a
array([[1, 2, 3, 0, 1],
[2, 3, 2, 2, 2],
[0, 3, 3, 2, 2]])
And a list of pairs of column indices to slice (a specific column can appear in more than one pair):
b = [[0,1], [0,3], [1,4]]
How can I slice/broadcast/stride a using b to get a result as such:
array([[[1, 2],
[2, 3],
[0, 3]],
[[1, 0],
[2, 2],
[0, 2]],
[[2, 1],
[3, 2],
[3, 2]]])

Use b as column indices to subset the array, then swap the first two axes of the result:
a[:, b].swapaxes(0, 1)
# array([[[1, 2],
# [2, 3],
# [0, 3]],
# [[1, 0],
# [2, 2],
# [0, 2]],
# [[2, 1],
# [3, 2],
# [3, 2]]])
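
To see why the axis swap is needed, it may help to look at the intermediate shapes. The sketch below also builds the same result more explicitly with np.stack for comparison; the names fancy and stacked are only illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3, 0, 1], [2, 3, 2, 2, 2], [0, 3, 3, 2, 2]])
b = [[0, 1], [0, 3], [1, 4]]

# Indexing columns with a nested list gives axes (row, pair, column-in-pair)
fancy = a[:, b]
# Swapping the first two axes reorders them to (pair, row, column-in-pair)
result = fancy.swapaxes(0, 1)

# Equivalent, more explicit construction: one two-column slice per pair
stacked = np.stack([a[:, pair] for pair in b])

print(fancy.shape, result.shape)
print(np.array_equal(result, stacked))
```

Here both shapes happen to be (3, 3, 2) because a has three rows and b has three pairs; with a different number of rows or pairs the two leading dimensions would differ.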

Summing elements from arrays with matching elements

I have an array that looks like this:
original = [[1, 2, 3], [2, 2, 2], [1, 2, 3], [2, 2, 2], [2, 2, 3], [1, 2, 2], [5, 4, 2]]
I'd like to get a new array in which elements that share the same second and third values are merged, with their first values summed, like this:
expected_output = [[4, 2, 3], [5, 2, 2], [5, 4, 2]]
I got to grouping the elements from the array as follows:
new_array = original.group_by {|n| n[1] && n[2] }
# => {3=>[[1, 2, 3], [1, 2, 3], [2, 2, 3]], 2=>[[2, 2, 2], [2, 2, 2], [1, 2, 2], [5, 4, 2]]}
It is still far from my desired output.

Here's one way: group the sub-arrays by their second and third elements, then sum the first elements within each group:
arr = [[1, 2, 3], [2, 2, 2], [1, 2, 3], [2, 2, 2], [2, 2, 3], [1, 2, 2], [5, 4, 2]]
array_groups = arr.group_by { |sub_arr| sub_arr[1, 2] }
result = array_groups.map do |k, v|
k.unshift(v.map(&:first).inject(:+))
end
result
# => [[4, 2, 3], [5, 2, 2], [5, 4, 2]]
Hope this helps!

This produces the same result by grouping on the pair [n[1], n[2]] rather than on n[1] && n[2], which evaluates to n[2] alone.
original = [[1, 2, 3], [2, 2, 2], [1, 2, 3], [2, 2, 2], [2, 2, 3], [1, 2, 2], [5, 4, 2]]
new = original.group_by {|n| [n[1], n[2]] }
added = new.map{|x| [new[x.first].map(&:first).inject(0, :+),x.first].flatten}
puts added.to_s

original.each_with_object(Hash.new(0)) { |(f,*rest),h| h[rest] += f }.
map { |(s,t),f| [f,s,t] }
# => [[4, 2, 3], [5, 2, 2], [5, 4, 2]]
Note that
original.each_with_object(Hash.new(0)) { |(f,*rest),h| h[rest] += f }
#=> {[2, 3]=>4, [2, 2]=>5, [4, 2]=>5}
Hash.new(0) is sometimes called a counting hash. To understand how that works, see Hash::new, especially the explanation of the effect of providing a default value as an argument of new. In brief, if a hash is defined h = Hash.new(0), then if h does not have a key k, h[k] returns the default value, here 0 (and the hash is not changed).

Finding all permutations of numbers plucked from an array which sum to 16

I would like to find all the permutations of plucking 3, 4 or 5 numbers from [2,3,4,5,6,7,8], repeats allowed, such that their sum is 16. So [8,5,3], [8,3,5] and [4,3,3,3,3] are valid permutations. Also circular permutations should be removed so [3,3,3,3,4] wouldn't also be added to the answer.
I can do this in Ruby without allowing repeats like this:
d = [2,3,4,5,6,7,8]
number_of_divisions = [3,4,5]
number_of_divisions.collect do |n|
d.permutation(n).to_a.reject do |p|
p[0..n].inject(0) { |sum,x| sum + x } != 16
end
end
How could I allow repeats so that [3,3,3,3,4] was included?

For all permutations, including those with repeated elements, one can use Array#repeated_permutation:
d = [2,3,4,5,6,7,8]
number_of_divisions = [3,4,5]
number_of_divisions.flat_map do |n|
  d.repeated_permutation(n).reject do |p| # no need for `to_a`
    p.inject(:+) != 16
  end
end
or, if the order of the numbers does not matter, with Array#repeated_combination, which yields far fewer candidates:
number_of_divisions.flat_map do |n|
  d.repeated_combination(n).reject do |p| # no need for `to_a`
    p.inject(:+) != 16
  end
end

There are far fewer repeated combinations than repeated permutations, so let's find the repeated combinations that sum to the given value, then permute each of those. Moreover, by applying uniq at each of several steps of the calculation we can significantly reduce the number of repeated combinations and permutations considered.
Code
require 'set'
def rep_perms_for_all(arr, n_arr, tot)
  n_arr.flat_map { |n| rep_perms_for_1(arr, n, tot) }
end
def rep_perms_for_1(arr, n, tot)
  rep_combs_to_rep_perms(rep_combs_for_1(arr, n, tot)).uniq
end
def rep_combs_for_1(arr, n, tot)
  arr.repeated_combination(n).uniq.select { |c| c.sum == tot }
end
def rep_combs_to_rep_perms(combs)
  combs.flat_map { |c| comb_to_perms(c) }.uniq
end
def comb_to_perms(comb)
  comb.permutation(comb.size).uniq.uniq do |p|
    p.size.times.with_object(Set.new) { |i,s| s << p.rotate(i) }
  end
end
Examples
rep_perms_for_all([2,3,4,5], [3], 12)
#=> [[2, 5, 5], [3, 4, 5], [3, 5, 4], [4, 4, 4]]
rep_perms_for_all([2,3,4,5,6,7,8], [3,4,5], 16).size
#=> 93
rep_perms_for_all([2,3,4,5,6,7,8], [3,4,5], 16)
#=> [[2, 6, 8], [2, 8, 6], [2, 7, 7], [3, 5, 8], [3, 8, 5], [3, 6, 7],
# [3, 7, 6], [4, 4, 8], [4, 5, 7], [4, 7, 5], [4, 6, 6], [5, 5, 6],
# [2, 2, 4, 8], [2, 2, 8, 4], [2, 4, 2, 8], [2, 2, 5, 7], [2, 2, 7, 5],
# [2, 5, 2, 7], [2, 2, 6, 6], [2, 6, 2, 6], [2, 3, 3, 8], [2, 3, 8, 3],
# ...
# [3, 3, 3, 7], [3, 3, 4, 6], [3, 3, 6, 4], [3, 4, 3, 6], [3, 3, 5, 5],
# [3, 5, 3, 5], [3, 4, 4, 5], [3, 4, 5, 4], [3, 5, 4, 4], [4, 4, 4, 4],
# ...
# [2, 2, 4, 5, 3], [2, 2, 5, 3, 4], [2, 2, 5, 4, 3], [2, 3, 2, 4, 5],
# [2, 3, 2, 5, 4], [2, 3, 4, 2, 5], [2, 3, 5, 2, 4], [2, 4, 2, 5, 3],
# ...
# [2, 5, 3, 3, 3], [2, 3, 3, 4, 4], [2, 3, 4, 3, 4], [2, 3, 4, 4, 3],
# [2, 4, 3, 3, 4], [2, 4, 3, 4, 3], [2, 4, 4, 3, 3], [3, 3, 3, 3, 4]]
Explanation
rep_combs_for_1 uses the method Enumerable#sum, which made its debut in Ruby v2.4. For earlier versions, use c.reduce(:+) == tot.
In comb_to_perms, the first uniq simply removes duplicates. The second uniq, with a block, keeps only one of the p.size elements (arrays) that are rotations of one another, discarding the other p.size-1. For example,
p = [1,2,3]
p.size.times.with_object(Set.new) { |i,s| s << p.rotate(i) }
#=> #<Set: {[1, 2, 3], [2, 3, 1], [3, 1, 2]}>

Ruby: match first, second, third, etc. elements from a dimensional array

I have an array of arrays. I want to concatenate the first, second, third elements of arrays.
Example arrays:
a = [[4, 5, 6], [1, 2, 3], [8, 9, 10]]
a1 = [[1, 2, 3], [8, 9, 10]]
a2 = [[4, 5, 6], [1, 2, 3], [8, 9, 10], [11, 21, 31]]
Output:
out of a: [[4,1,8],[5,2,9],[6,3,10]]
out of a1: [[1,8],[2,9],[3,10]]
out of a2: [[4,1,8,11],[5,2,9,21],[6,3,10,31]]

Use the transpose method:
a.transpose
=> [[4, 1, 8], [5, 2, 9], [6, 3, 10]]

Array#transpose:
[a, a1, a2].map(&:transpose)
# [
# [[4, 1, 8], [5, 2, 9], [6, 3, 10]],
# [[1, 8], [2, 9], [3, 10]],
# [[4, 1, 8, 11], [5, 2, 9, 21], [6, 3, 10, 31]]
# ]

Whenever Array#transpose can be used, so can Array#zip.
a.first.zip(*a.drop(1))
#=> [[4,1,8],[5,2,9],[6,3,10]]
