Find most duplicated numbers inside an array - arrays

I have the following array
[1, 2, 3, 4, 5, 1, 2, 5, 3, 4, 2, 3, 1, 3, 2, 2]
I want to find out 2 things:
1) How many duplicates of each number are there?
For instance: 1, 3 times, 4, 2 times etc.
2) Find 3 most duplicated numbers in the array.
For instance: [2, 3, 1] since 2 is duplicated 5 times, 3 is duplicated 4 times & 1 is duplicated 3 times.
I have tried
arr = [1, 2, 3, 4, 5, 1, 2, 5, 3, 4, 2, 3, 1, 3, 2, 2]
arr.group_by { |e| e }.map { |e| e[0] if e[1][1] }.compact
But results are not what I am looking for: [1, 2, 3, 4, 5]

▶ arr.group_by { |e| e } # arr.group_by(&:itself) for Ruby >= 2.2
.map { |k, v| [k, v.count] } #⇒ [[1, 3], [2, 5], [3, 4], [4, 2], [5, 2]]
.sort_by { |(_, cnt)| -cnt } #⇒ [[2, 5], [3, 4], [1, 3], [4, 2], [5, 2]]
.take(3) #⇒ [[2, 5], [3, 4], [1, 3]]
.map(&:first)
#⇒ [2, 3, 1]
Remove the last three clauses to get the whole unsorted result.

To get a count of duplicated entries per duplicate you can go with:
arr.group_by(&:itself)
.each_with_object({}) {|(k, v), hash| hash[k] = v.size }
#=> {1=>3, 2=>5, 3=>4, 4=>2, 5=>2}
To get 3 most duplicated entries:
arr.group_by(&:itself)
.sort_by { |_k, v| -v.size }
.take(3)
.map(&:first)
#=> [2, 3, 1]

1) How many duplicates of each number are there?
counts = Hash[arr.uniq.map{|_x| [_x, arr.count(_x)]}]
=> {1=>3, 2=>5, 3=>4, 4=>2, 5=>2}
2) Find 3 most duplicated numbers in the array
counts.sort_by { |a, b| -b }.take(3).map(&:first)
=> [2, 3, 1]

arr = [1, 2, 3, 4, 5, 1, 2, 5, 3, 4, 2, 3, 1, 3, 2, 2]
I suggest using a counting hash (see the reference to "default value" at Hash::new):
h = arr.each_with_object(Hash.new(0)) { |n,h| h[n] += 1 }
# => {1=>3, 2=>5, 3=>4, 4=>2, 5=>2}
and use the method Enumerable#max_by with an argument of 3 to obtain the three keys of h having the largest values:
h.max_by(3, &:last).map(&:first)
#=> [2, 3, 1]
Note that if h is largish, using max_by with an argument is more efficient than using Enumerable#sort_by or Array#sort and then discarding all but the three largest values. The Enumerable methods max_by, min_by, max and min were changed to permit an argument (which defaults to 1) in Ruby v2.2.
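On Ruby 2.7 or later, Enumerable#tally builds the same counting hash in a single call. Below is a minimal sketch that combines it with max_by; the helper name top_duplicates is made up for illustration and does not come from any of the answers above.

```ruby
# Hypothetical helper: the n most frequent elements, most frequent first.
# Assumes Ruby >= 2.7, where Enumerable#tally is available.
def top_duplicates(arr, n = 3)
  counts = arr.tally # value => number of occurrences
  counts.max_by(n) { |_value, count| count }.map(&:first)
end

arr = [1, 2, 3, 4, 5, 1, 2, 5, 3, 4, 2, 3, 1, 3, 2, 2]
counts = arr.tally
#=> {1=>3, 2=>5, 3=>4, 4=>2, 5=>2}
top_three = top_duplicates(arr)
#=> [2, 3, 1]
```

When several values share the same count, which of them makes the cut is not something this sketch controls.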

Related

Select value range from an array, including duplicates

I am given an array arr of integers that is sorted in ascending or descending order. If arr contains at least two distinct elements, I need to find the longest arr.last(n) that has exactly two distinct elements (i.e., with the largest n). Otherwise, it should return arr. Some examples are:
arr = [6, 4, 3, 2, 2], then [3, 2, 2] is to be returned
arr = [6, 4, 3, 3, 2], then [3, 3, 2] is to be returned
arr = [1], then arr is to be returned.
I would be grateful for suggestions on how to compute the desired result.
Here's a fairly inefficient approach that uses take_while:
def last_non_dupe(array, count = 2)
result = [ ]
array.reverse.take_while do |n|
result << n
result.uniq.length <= count
end.reverse
end
It can be improved on by using a Set which is automatically unique:
require 'set'
def last_non_dupe(array, count = 2)
result = Set.new
array.reverse.take_while do |n|
result << n
result.length <= count
end.reverse
end
Where in either case you do:
last_non_dupe([6, 4, 3, 2, 2])
# => [3, 2, 2]
The count argument can be changed as necessary for longer or shorter lists.
def last_two_different(arr, count)
arr.reverse_each.
lazy.
chunk(&:itself).
first(count).
flat_map(&:last).
reverse
end
last_two_different [6, 4, 3, 2, 2], 2 #=> [3, 2, 2]
last_two_different [3, 4, 3, 3, 2], 2 #=> [3, 3, 2]
last_two_different [3, 4, 3, 3, 2], 3 #=> [4, 3, 3, 2]
last_two_different [3, 4, 3, 3, 2], 4 #=> [3, 4, 3, 3, 2]
last_two_different [1, 2], 2 #=> [1, 2]
last_two_different [1, 1], 2 #=> [1, 1]
last_two_different [1], 2 #=> [1]
last_two_different [], 2 #=> []
The steps are as follows.
arr = [6, 4, 3, 2, 2]
count = 2
enum0 = arr.reverse_each
#=> #<Enumerator: [6, 4, 3, 2, 2]:reverse_each>
We can convert this enumerator to an array to see the values it will generate.
enum0.to_a
#=> [2, 2, 3, 4, 6]
First, suppose we wrote the following.
enum1 = enum0.chunk(&:itself)
#=> #<Enumerator: #<Enumerator::Generator:0x00005c29be132b00>:each>
enum1.to_a
#=> [[2, [2, 2]], [3, [3]], [4, [4]], [6, [6]]]
We want the first count #=> 2 elements generated by enum1, from which we could extract the desired result. That tells us that we want a lazy enumerator.
enum2 = enum0.lazy
#=> #<Enumerator::Lazy: #<Enumerator: [6, 4, 3, 2, 2]:reverse_each>>
enum3 = enum2.chunk(&:itself)
#=> #<Enumerator::Lazy: #<Enumerator:
# #<Enumerator::Generator:0x00005c29bdf48cb8>:each>>
enum3.to_a
#=> [[2, [2, 2]], [3, [3]], [4, [4]], [6, [6]]]
a = enum3.first(count)
#=> [[2, [2, 2]], [3, [3]]]
b = a.flat_map(&:last)
#=> [2, 2, 3]
b.reverse
#=> [3, 2, 2]
Not sure about the efficiency, but here is another way to do it:
arr = [6, 4, 3, 2, 2]
uniq = arr.uniq.last(2) # => [3, 2]
arr.select{|e| uniq.include?(e)} # => [3, 2, 2]
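Note that the uniq/select version relies on the array being sorted, which the question guarantees. As a further sketch, the same result can be obtained with a single backward scan that neither reverses nor copies the array until the final slice. The name last_distinct_tail is hypothetical, not something from the answers above.

```ruby
# Hypothetical helper: longest suffix of arr holding at most `count` distinct values.
def last_distinct_tail(arr, count = 2)
  seen  = []        # distinct values met so far, scanning from the end
  start = arr.size  # index at which the suffix begins
  (arr.size - 1).downto(0) do |i|
    unless seen.include?(arr[i])
      break if seen.size == count # one more distinct value would be too many
      seen << arr[i]
    end
    start = i
  end
  arr[start..-1]
end

last_distinct_tail([6, 4, 3, 2, 2]) #=> [3, 2, 2]
last_distinct_tail([6, 4, 3, 3, 2]) #=> [3, 3, 2]
last_distinct_tail([1])             #=> [1]
```

Since seen never holds more than count values, the include? check stays cheap for small counts.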

How to get all sub matrices of 2D array without numpy?

I need to get all submatrices of the 2D array and to do the manipulation for each submatrix. So I created example matrix:
M3 = [list(range(5)) for i in range(6)]
[[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]]
I need to capture 3 rows and 3 columns and then shift this "window" till I get all submatrices. The first submatrix would be:
[[0, 1, 2],
[0, 1, 2],
[0, 1, 2]]
and the last one is:
[[2, 3, 4],
[2, 3, 4],
[2, 3, 4]]
For this matrix I need 12 submatrices. However, I get more than that with the code I used to try to solve the problem:
for j in range(len(M3[0])-3):
    for i in range(len(M3)-3):
        for row in M3[0+j:3+j]:
            X_i_j = [row[0+i:3+i] for row in M3[0+j:3+j]]
            print(X_i_j)
I get 18 but not 12 (with two duplicates of each submatrix):
[[0, 1, 2], [0, 1, 2], [0, 1, 2]]
[[0, 1, 2], [0, 1, 2], [0, 1, 2]]
[[0, 1, 2], [0, 1, 2], [0, 1, 2]]
[[1, 2, 3], [1, 2, 3], [1, 2, 3]]
[[1, 2, 3], [1, 2, 3], [1, 2, 3]]
[[1, 2, 3], [1, 2, 3], [1, 2, 3]]
...
[[2, 3, 4], [2, 3, 4], [2, 3, 4]]
[[2, 3, 4], [2, 3, 4], [2, 3, 4]]
And with this sample of code I get 6 submatrices with 1 duplicate for each:
for i in range(len(M3)-3):
    for j in range(len(M3[0])-3):
        X_i_j = [row[0+i:3+i] for row in M3[0+j:3+j]]
        print(X_i_j)
I do not see what is wrong with it and why I get the duplicates. How can I get all sub matrices of 2D array without numpy for this case?
Your code works once each loop variable is matched to the right dimension (j over the rows, i over the columns) and the range bounds use -2 instead of -3:
for j in range(len(M3)-2):
    for i in range(len(M3[0])-2):
        X_i_j = [row[0+i:3+i] for row in M3[0+j:3+j]]
        print('=======')
        for x in X_i_j:
            print(x)
I would solve it slightly differently, with two helpers:
a function to read y-number-of-rows,
then a function to read x-number-of-columns from those rows, which gives you the sub-array.
This would work for any (2D) array / sub-array.
Sample:
def read_y_rows(array, rows, offset):
    return array[offset:rows + offset]

def read_x_cols(array, cols, offset):
    return list(row[offset:cols + offset] for row in array)

def get_sub_arrays(array, x_dim_cols, y_dim_rows):
    """
    get 2D sub arrays by x_dim columns and y_dim rows
    from 2D array (list of lists)
    """
    result = []
    for start_row in range(len(array) - y_dim_rows + 1):
        y_rows = read_y_rows(array, y_dim_rows, start_row)
        for start_col in range(len(max(array, key=len)) - x_dim_cols + 1):
            x_columns = read_x_cols(y_rows, x_dim_cols, start_col)
            result.append(x_columns)
    return result
To use it you could do:
M3 = [list(range(5)) for i in range(6)]
sub_arrays = get_sub_arrays(M3, 3, 3)  # this would also work for 2x2 sub-arrays
sub_arrays is again a list of lists, containing all the sub-arrays found. You could print them like this:
for sub_array in sub_arrays:
    print()
    for row in sub_array:
        print(row)
I know it is a lot more code than above, just wanted to share this code.

Finding the first combination of two integers in an array whose latter element appears the earliest and sum matches a given value

I have array and sum_of_two:
array = [10, 5, 1, 9, 7, 8, 2, 4, 6, 9, 3, 2, 1, 4, 8, 7, 5]
sum_of_two = 10
I'm trying to find the combination of two integers in array whose latter element of the two appears the earliest among those of such combinations whose sum equals sum_of_two. For example, both [5, 5] and [1, 9] are candidates for such combinations, but 9 of [1, 9] (which appears later than 1 in array) appears earlier than the second 5 of [5, 5] (which is the last element in array). So I would like to return [1, 9].
I tried using combination and find:
array.combination(2).find{|x,y| x + y == sum_of_two} #=> [5, 5]
However, it returns a combination of the first matching integer in the array, 5, and another integer further along the array, also 5.
If I use find_all instead of find, I get all combinations of two integers that add up to sum_of_two:
array.combination(2).find_all{|x,y| x + y == sum_of_two}
#=> [[5, 5], [1, 9], [1, 9], [9, 1], [7, 3], [8, 2], [8, 2], [2, 8], [4, 6], [6, 4], [9, 1], [3, 7], [2, 8]]
But then I'm not sure how to get the first one.
I would use Set (which would be a bit more efficient than using Array#include?) and do something like this:
array = [10, 5, 1, 9, 7, 8, 2, 4, 6, 9, 3, 2, 1, 4, 8, 7, 5]
sum_of_two = 10
require 'set'
array.each_with_object(Set.new) do |element, set|
if set.include?(sum_of_two - element)
break [sum_of_two - element, element]
else
set << element
end
end
#=> [1, 9]
x = array.find.with_index{|e, i| array.first(i).include?(sum_of_two - e)}
[sum_of_two - x, x] # => [1, 9]
Array#combination(n) does not give the elements in the order you want, so you must build the pairs yourself. It's easy if you begin from the second index. Here is a lazy implementation (it stops at the first match, although it may examine O(n²) pairs in the worst case); let's call the input xs:
pairs = (1...xs.size).lazy.flat_map { |j| (0...j).lazy.map { |i| [xs[i], xs[j]] } }
first_matching_pair = pairs.detect { |i, j| i + j == 10 }
#=> [1, 9]
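One detail of the Set-based each_with_object answer above: when no pair matches, the block never hits break, so the expression returns the Set itself rather than nil. A minimal sketch of the same idea wrapped in a method with an explicit nil result follows; the name first_pair_with_sum is hypothetical.

```ruby
require 'set'

# Hypothetical helper: the pair [earlier, later] summing to target whose
# `later` element has the smallest index, or nil if no such pair exists.
def first_pair_with_sum(array, target)
  seen = Set.new
  array.each do |element|
    complement = target - element
    return [complement, element] if seen.include?(complement)
    seen << element
  end
  nil
end

array = [10, 5, 1, 9, 7, 8, 2, 4, 6, 9, 3, 2, 1, 4, 8, 7, 5]
pair = first_pair_with_sum(array, 10)
#=> [1, 9]
no_pair = first_pair_with_sum([1, 2, 3], 100)
#=> nil
```

Each element is visited once and each Set lookup is constant time on average, so this runs in roughly linear time.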

a repeated permutation with limitations

I am trying to generate all possible combinations of certain values in an array of 15 which add up to 50.
$a = [3, 4, 1, 2, 5]
print $a.repeated_permutation(15).to_a
In this case,
[2,2,2,2,4,4,4,4,4,4,4,4,4,3,3]
[2,2,2,4,2,4,4,4,4,4,4,4,4,3,3]
[2,2,4,2,2,4,4,4,4,4,4,4,4,3,3]
are all possible answers.
After some investigation I realize the code to do this is a bit over my head, but I will leave the question up if it might help someone else.
For some reference as to what I am working on, Project Euler, problem 114. It's pretty difficult, and so I am attempting to solve only a single case where my 50-space-long grid is filled only with 3-unit-long blocks. The blocks must be separated by at least one blank, so I am counting the blocks as 4. This (with some tweaking, which I have left out as this is confusing enough already) allows for twelve blocks plus three single blanks, or a maximum of fifteen elements.
Approach
I think recursion is the way to go here, where your recursive method looks like this:
def recurse(n,t)
where
n is the number of elements required; and
t is the required total.
If we let @arr be the array of integers you are given, recurse(n,t) returns an array of all permutations of n elements from @arr that sum to t.
Assumption
I have assumed that the elements of @arr are non-negative integers, sorted by size, but the method can be easily modified if it includes negative integers (though performance will suffer). Without loss of generality, we can assume the elements of @arr are unique, sorted by increasing magnitude.
Code
def recurse(n,t)
  if n == 1
    @arr.include?(t) ? [[t]] : nil
  else
    @arr.each_with_object([]) do |i,a|
      # elements of @arr are non-decreasing, so no later element can fit;
      # return what has been collected so far (a bare break would return nil)
      break a if i > t
      if (ret = recurse(n-1,t-i))
        ret.each { |b| a << [i,*b] }
      end
    end
  end
end
Examples
@arr = [3, 4, 1, 2, 5].sort
#=> [1, 2, 3, 4, 5]
recurse(1,4)
#=> [[4]]
recurse(2,6)
#=> [[1, 5], [2, 4], [3, 3], [4, 2], [5, 1]]
recurse(3,10)
#=> [[1, 4, 5], [1, 5, 4], [2, 3, 5], [2, 4, 4], [2, 5, 3],
# [3, 2, 5], [3, 3, 4], [3, 4, 3], [3, 5, 2], [4, 1, 5],
# [4, 2, 4], [4, 3, 3], [4, 4, 2], [4, 5, 1], [5, 1, 4],
# [5, 2, 3], [5, 3, 2], [5, 4, 1]]
recurse(3,50)
#=> []
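For small n, the recursive results can be cross-checked by brute force with Array#repeated_permutation, which the question already uses. This is only a sanity check under the same example values, not a replacement for the recursion: it enumerates all 5**n sequences.

```ruby
values = [1, 2, 3, 4, 5]

# Enumerate every length-3 sequence (repetition allowed) and keep those summing to 10.
brute = values.repeated_permutation(3).select { |seq| seq.sum == 10 }

brute.size
#=> 18
brute.sort.first(3)
#=> [[1, 4, 5], [1, 5, 4], [2, 3, 5]]
```

The 18 sequences agree with the recurse(3,10) output listed above.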
Improvement
We can do better, however, by first computing all combinations, and then computing the permutations of each of those combinations.
def combo_recurse(n,t,last=0)
  ndx = @arr.index { |i| i >= last }
  return nil if ndx.nil?
  arr_above = @arr[ndx..-1]
  if n == 1
    arr_above.include?(t) ? [[t]] : nil
  else
    arr_above.each_with_object([]) do |i,a|
      # elements of @arr are non-decreasing, so no later element can fit;
      # return what has been collected so far (a bare break would return nil)
      break a if i > t
      if (ret = combo_recurse(n-1,t-i,i))
        ret.each { |b| a << [i,*b] }
      end
    end
  end
end
combo_recurse(1,4)
#=> [[4]]
combo_recurse(2,6)
#=> [[1, 5], [2, 4], [3, 3]]
combo_recurse(3,10)
#=> [[1, 4, 5], [2, 3, 5], [2, 4, 4], [3, 3, 4]]
combo_recurse(3,50)
#=> []
combo_recurse(15,50).size
#=> 132
combo_recurse(15,50).first(5)
#=> [[1, 1, 1, 1, 1, 1, 4, 5, 5, 5, 5, 5, 5, 5, 5],
# [1, 1, 1, 1, 1, 2, 3, 5, 5, 5, 5, 5, 5, 5, 5],
# [1, 1, 1, 1, 1, 2, 4, 4, 5, 5, 5, 5, 5, 5, 5],
# [1, 1, 1, 1, 1, 3, 3, 4, 5, 5, 5, 5, 5, 5, 5],
# [1, 1, 1, 1, 1, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5]]
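The combination counts can also be checked independently with Array#repeated_combination. With five values and 15 slots there are only 3,876 multisets to test, so brute force is cheap here. A sketch under the same example values:

```ruby
values = [1, 2, 3, 4, 5]

# Multisets of size 3 that sum to 10.
small = values.repeated_combination(3).select { |c| c.sum == 10 }
small.sort
#=> [[1, 4, 5], [2, 3, 5], [2, 4, 4], [3, 3, 4]]

# Multisets of size 15 that sum to 50.
total = values.repeated_combination(15).count { |c| c.sum == 50 }
#=> 132
```

Both results agree with the combo_recurse output above.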
We can then compute the permutations from the combinations:
combo_recurse(2,6).flat_map { |a| a.permutation(a.size).to_a }.uniq
#=> [[1, 5], [5, 1], [2, 4], [4, 2], [3, 3]]
combo_recurse(3,10).flat_map { |a| a.permutation(a.size).to_a }.uniq
#=> [[1, 4, 5], [1, 5, 4], [4, 1, 5], [4, 5, 1], [5, 1, 4],
# [5, 4, 1], [2, 3, 5], [2, 5, 3], [3, 2, 5], [3, 5, 2],
# [5, 2, 3], [5, 3, 2], [2, 4, 4], [4, 2, 4], [4, 4, 2],
# [3, 3, 4], [3, 4, 3], [4, 3, 3]]
We can bound the number of permutations for (15,50) from above (the bound is somewhat high because uniq is not applied) by summing the permutation count of each of the 132 combinations:
def factorial(n)
  (1..n).reduce :*
end
combo_recurse(15,50).reduce(0) { |t,a| t + factorial(a.size) }
#=> 172613016576000
That is, up to about 1.7 × 10^14 arrays of 15 elements each. What platform will you be running this on?
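The overcount caused by repeated values can be avoided without generating anything: the number of distinct orderings of a combination is the multinomial coefficient n! / (k1! * k2! * ...), where the k's are the multiplicities of each distinct value. A minimal sketch follows, assuming Ruby >= 2.7 for tally; the helper name distinct_permutation_count is hypothetical.

```ruby
def factorial(n)
  (1..n).reduce(1, :*) # the initial value 1 makes factorial(0) == 1
end

# Hypothetical helper: number of distinct orderings of a multiset.
def distinct_permutation_count(combo)
  combo.tally.values.reduce(factorial(combo.size)) { |acc, k| acc / factorial(k) }
end

distinct_permutation_count([1, 4, 5]) #=> 6
distinct_permutation_count([2, 4, 4]) #=> 3

combos = [[1, 4, 5], [2, 3, 5], [2, 4, 4], [3, 3, 4]]
combos.sum { |c| distinct_permutation_count(c) }
#=> 18
```

The total of 18 for the (3,10) case matches the number of arrays returned by recurse(3,10). Applying the same sum to the output of combo_recurse(15,50) would give the exact count for the full problem.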

How to transpose an array in Python 3?

I've been scanning the forums and haven't found an answer yet that I can apply to my situation. I need to be able to take an n by n array and transpose it in Python-3. The example given is that I have this list input into the function:
[[4, 2, 1], ["a", "a", "a"], [-1, -2, -3]] and it needs to be transposed to read:
[[4, 'a', -1], [2, 'a', -2], [1, 'a', -3]] So basically reading vertically instead of horizontally.
I CANNOT use things like zip or numpy, I have to make my own function.
Been rattling my brain at this for two nights and it's a huge headache. If anyone could help and then provide an explanation so I can learn it, I'd be grateful.
Edit:
I should add for reference sake that the argument variable is M. The function we're supposed to write is trans(M):
A one-liner:
def trans(M):
    return [[M[j][i] for j in range(len(M))] for i in range(len(M[0]))]
result:
>>> M = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> trans(M)
[[1, 4, 7], [2, 5, 8], [3, 6, 9]]
# or for a non-square matrix:
>>> N = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
>>> trans(N)
[[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
Additional Note: If you look up the tutorial on list comprehension, one of the examples is in fact transposition of a matrix array.
A variant that should work for matrices with irregular row lengths:
m=[[3, 2, 1],
[0, 1],
[2, 1, 0]]
m_T = [ [row[c] for row in m if c < len(row)] for c in range(0, max([len(row) for row in m])) ]
Here is an in place solution that works for square matrices:
def trans(M):
    n = len(M)
    for i in range(n - 1):
        for j in range(i + 1, n):
            M[i][j], M[j][i] = M[j][i], M[i][j]
Example Usage:
def print_matrix(M):
    for row in M:
        for ele in row:
            print(ele, end='\t')
        print()
M = [[4, 2, 1], ["a", "a", "a"], [-1, -2, -3]]
print('Original Matrix:')
print_matrix(M)
trans(M)
print('Transposed Matrix:')
print_matrix(M)
Output:
Original Matrix:
4 2 1
a a a
-1 -2 -3
Transposed Matrix:
4 a -1
2 a -2
1 a -3
y=([1,2], [3,4], [5,6])
transpose=[[row[i] for row in y] for i in range(len(y[0]))]
the output is
[[1, 3, 5], [2, 4, 6]]
You can also use the function in numpy to transpose - if you need the answer as a list it is straightforward to convert back using tolist:
from numpy import transpose
M = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
transpose(M).tolist()
the output is
[[1, 4, 7], [2, 5, 8], [3, 6, 9]]
Haven't timed it (no time!) but I strongly suspect this will be a lot faster than iterators for large arrays, especially if you don't need to convert back to a list.
