Get the maximum N elements (along with their indices) of an Array

I have an array of integers like the one shown below:
val my_array = Array(10, 20, 6, 31, 0, 2, -2)
I need to get the 3 largest elements of this array along with their corresponding indices (using either a single function or two separate functions).
For example, the output might be something like:
// max values
Array(31, 20, 10)
// max indices
Array(3, 1, 0)
Although the operations look simple, I could not find any built-in function that does this.

Here's a straightforward way - zipWithIndex followed by sorting:
val (values, indices) = my_array
  .zipWithIndex       // add indices
  .sortBy(t => -t._1) // sort by values (descending)
  .take(3)            // take first 3
  .unzip              // "unzip" the array-of-tuples into tuple-of-arrays

Here's another way to do it:
(my_array zip Stream.from(0)).
sortWith(_._1 > _._1).
take(3)
res1: Array[(Int, Int)] = Array((31,3), (20,1), (10,0))
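
For comparison, here is a minimal sketch of the same pair-with-index idea in Python rather than Scala. It swaps the full sort for the standard library's heapq.nlargest, which only keeps track of the N largest pairs. The helper name top_n_with_indices is hypothetical and used only for illustration.

```python
import heapq

def top_n_with_indices(values, n):
    """Return (n largest values, their indices), largest first."""
    # enumerate() pairs each index with its value; nlargest keeps the n
    # pairs with the largest value (the second item of each pair).
    pairs = heapq.nlargest(n, enumerate(values), key=lambda pair: pair[1])
    indices = [i for i, _ in pairs]
    top_values = [v for _, v in pairs]
    return top_values, indices

my_array = [10, 20, 6, 31, 0, 2, -2]
values, indices = top_n_with_indices(my_array, 3)
print(values)   # [31, 20, 10]
print(indices)  # [3, 1, 0]
```

For small arrays the sort-based version is just as good; avoiding the full sort only matters when N is much smaller than the array length.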

Related

Most computationally efficient way to batch alter values in each array of a 2d array, based on conditions for particular values by indices

Say that I have a batch of arrays, and I would like to alter them based on conditions of particular values located by indices.
For example, say that I would like to decrease one value and increase another if the difference between them is less than two.
For a single 1D array it can be done like this
import numpy as np
single2 = np.array([8, 8, 9, 10])
if abs(single2[1]-single2[2])<2:
    single2[1] = single2[1] - 1
    single2[2] = single2[2] + 1
single2
array([ 8, 7, 10, 10])
But I do not know how to do it for a batch of arrays. This is my initial attempt:
import numpy as np
single1 = np.array([6, 0, 3, 7])
single2 = np.array([8, 8, 9, 10])
single3 = np.array([2, 15, 15, 20])
batch = np.array([
    np.copy(single1),
    np.copy(single2),
    np.copy(single3),
])
if abs(batch[:,1]-batch[:,2])<2:
    batch[:,1] = batch[:,1] - 1
    batch[:,2] = batch[:,2] + 1
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Looking at np.any and np.all, they seem to work on arrays of boolean values, and I am not sure how they could be used in the code snippet above.
My second attempt uses np.where, using the method described here for comparing particular values of a batch of arrays by creating new versions of the arrays with values added to the front/back of the arrays.
https://stackoverflow.com/a/71297663/3259896
In the case of the example, I am comparing values that are right next to each other, so I created copies that shift the arrays forwards and backwards by 1. I also use only the particular slice of the array that I am comparing, since the other numbers would also be used in the comparison in np.where.
batch_ap = np.concatenate(
    (batch[:, 1:2+1], np.repeat(-999, 3).reshape(3,1)),
    axis=1
)
batch_pr = np.concatenate(
    (np.repeat(-999, 3).reshape(3,1), batch[:, 1:2+1]),
    axis=1
)
Finally, I do the comparisons and adjust the values:
batch[:, 1:2+1] = np.where(
    abs(batch_ap[:,1:]-batch_ap[:,:-1])<2,
    batch[:, 1:2+1]-1,
    batch[:, 1:2+1]
)
batch[:, 1:2+1] = np.where(
    abs(batch_pr[:,1:]-batch_pr[:,:-1])<2,
    batch[:, 1:2+1]+1,
    batch[:, 1:2+1]
)
print(batch)
[[ 6 0 3 7]
[ 8 7 10 10]
[ 2 14 16 20]]
However, I am not sure whether this is the most computationally efficient or the most elegant way to do this. It seems like a lot of operations and code for the task, but I do not know numpy well enough to be certain.

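This works. Build a boolean mask of the rows that meet the condition, then use it to select the rows to update: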
mask = abs(batch[:,1]-batch[:,2])<2
batch[mask,1] -= 1
batch[mask,2] += 1
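
To spell out why the mask works where the if statement failed: the comparison produces one boolean per row rather than a single True/False, so it cannot drive an if, but it can be used directly as a row index. A self-contained sketch using the batch from the question:

```python
import numpy as np

batch = np.array([
    [6, 0, 3, 7],
    [8, 8, 9, 10],
    [2, 15, 15, 20],
])

# One boolean per row: True where columns 1 and 2 differ by less than 2.
mask = np.abs(batch[:, 1] - batch[:, 2]) < 2

# Boolean indexing selects only the rows where mask is True,
# and the in-place operators write back into those rows.
batch[mask, 1] -= 1
batch[mask, 2] += 1

print(mask.tolist())  # [False, True, True]
print(batch)
```

The first row is left untouched because its two values differ by 3; the other two rows are adjusted exactly as in the np.where version.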

How to get average of values in array between two given indexes in Swift

I'm trying to get the average of the values between two indexes in an array. The solution I first came to reduces the array to the required range, before taking the sum of values divided by the number of values. A simplified version looks like this:
let array = [0, 2, 4, 6, 8, 10, 12]
// The aim is to take the average of the values between array[n] and array[.count - 1].
I attempted with the following code:
func avgOf(x: Int) throws -> String {
    let avgforx = solveList.count - x
    // Error handling to check if x in average of x does not overstep bounds
    guard avgforx > 0 else {
        throw FuncError.avgNotPossible
    }
    solveList.removeSubrange(ClosedRange(uncheckedBounds: (lower: 0, upper: avgforx - 1)))
    let avgx = (solveList.reduce(0, +)) / Double(x)
    // Rounding
    let roundedAvgOfX = (avgx * 1000).rounded() / 1000
    print(roundedAvgOfX)
    return "\(roundedAvgOfX)"
}
where avgforx is used to represent the lower bound:
array[(.count - 1) - x]
The guard statement makes sure that if the index is out of range, the error is handled properly.
solveList.removeSubrange was my initial solution, as it removes the values outside of the needed index range (and subsequently delivers the needed result), but this has proved to be problematic as the values not taken in the average should remain.
The line in removeSubrange basically takes a needed index field (e.g. array[5] to array[10]), removes all the values from array[0] to array[4], and then takes the sum of the resulting array divided by the number of elements.
Instead, the values in array[0] to array[4] should remain.
I would appreciate any help.
(Swift 4, Xcode 10)

Apart from the fact that the original array is modified, the error in your code is that it divides the sum of the remaining elements by the count of the removed elements (x) instead of dividing by the count of remaining elements.
A better approach might be to define a function which computes the average of a collection of integers:
func average<C: Collection>(of c: C) -> Double where C.Element == Int {
    precondition(!c.isEmpty, "Cannot compute average of empty collection")
    return Double(c.reduce(0, +))/Double(c.count)
}
Now you can use that with slices, without modifying the original array:
let array = [0, 2, 4, 6, 8, 10, 12]
let avg1 = average(of: array[3...]) // Average from index 3 to the end
let avg2 = average(of: array[2...4]) // Average from index 2 to 4
let avg3 = average(of: array[..<5]) // Average of first 5 elements
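
The same slice-then-average pattern, sketched in Python for readers who want to check the arithmetic outside Xcode. This is an illustration, not a translation of the Swift API: Python slices use an exclusive end index, and the helper name average is chosen here only to mirror the function above.

```python
def average(values):
    """Mean of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("Cannot compute average of empty sequence")
    return sum(values) / len(values)

array = [0, 2, 4, 6, 8, 10, 12]

# Slicing builds a new list, so the original array is never modified.
avg1 = average(array[3:])    # from index 3 to the end
avg2 = average(array[2:5])   # indices 2 through 4 (end index is exclusive)
avg3 = average(array[:5])    # first 5 elements

print(avg1, avg2, avg3)      # 9.0 6.0 4.0
print(array)                 # [0, 2, 4, 6, 8, 10, 12]
```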

Append new variables to IDL for loop array

If I have the following array:
x = double([1, 1, 1, 10, 1, 1, 50, 1, 1, 1 ])
I want to do the following:
Group the array into groups of 5 which will each be evaluated separately.
Identify the MAX value of each group.
Remove that MAX value and put it into another array.
Finally, I want to print the updated array x without the MAX values, and the new array containing the MAX values.
How can I do this? I am new to IDL and have had no formal training in coding.
I understand that I can write the code to group and find the max values this way:
FOR i = 1, (n_elements(x)-4) do begin
  print, "MAX of array", MAX(x[i-1:i+3])
ENDFOR
However, how do I implement all of what I specified above? I know I have to create an empty array that will append the values found by the for loop, but I don't know how to do that.
Thanks

I changed your x to have unique elements to make sure I wasn't fooling myself. In this approach, the number of elements of x must be divisible by group_size:
x = double([1, 2, 3, 10, 4, 5, 50, 6, 7, 8])
group_size = 5
maxes = max(reform(x, group_size, n_elements(x) / group_size), ind, dimension=1)
all = bytarr(n_elements(x))
all[ind] = 1
x_without_maxes = x[where(all eq 0)]
print, maxes
print, x_without_maxes
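
For readers who want to check the expected result outside IDL, here is a rough Python sketch of the same group-wise logic. It is a plain loop, not a translation of the reform/max call above, and the helper name split_group_maxes is hypothetical. Unlike the IDL version, it tolerates a shorter final group.

```python
def split_group_maxes(values, group_size):
    """Split values into consecutive groups; return (maxes, remaining)."""
    maxes, remaining = [], []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        # Index of the first occurrence of the group's maximum.
        i_max = max(range(len(group)), key=group.__getitem__)
        maxes.append(group[i_max])
        # Keep everything in the group except that one element.
        remaining.extend(group[:i_max] + group[i_max + 1:])
    return maxes, remaining

x = [1.0, 2.0, 3.0, 10.0, 4.0, 5.0, 50.0, 6.0, 7.0, 8.0]
maxes, x_without_maxes = split_group_maxes(x, 5)
print(maxes)            # [10.0, 50.0]
print(x_without_maxes)  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
```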

Lists are good for this, because they allow you to pop out values at specific indices, rather than rewriting the whole array again. You might try something like the following. I've used a while loop here, rather than a for loop, because it makes it a little easier in this case.
x = List(1, 1, 1, 10, 1, 1, 50, 1, 1, 1)
maxValues = List()
pos = 4
while (pos le x.length) do begin
  maxValues.add, max(x[pos-4:pos].toArray(), iMax)
  x.Remove, iMax+pos-4
  pos += 5-1
endwhile
print, "Max Values : ", maxValues.toArray()
print, "Remaining Values : ", x.toArray()
This allows you to do what you want I think. At the end, you have a List object (which can easily be converted to an array) with the max values for each group of 5, and another containing the remaining values.
Also, please tag this as idl-programming-language rather than idl. They are two different tags.

How to sum up every n elements of a Scala array?

What's an efficient way to sum up every n elements of an array in Scala? For example, if my array looks like this:
val arr = Array(3,1,9,2,5,8,...)
and I want to sum up every 3 elements of this array and get a new array like below:
newArr = Array(13, 15, ...)
How can I do this efficiently in Spark Scala? Thank you very much.

grouped followed by map should do the trick:
scala> val arr = Array(3,1,9,2,5,8)
arr: Array[Int] = Array(3, 1, 9, 2, 5, 8)
scala> arr.grouped(3).map(_.sum).toArray
res0: Array[Int] = Array(13, 15)
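
As a cross-check of the group-then-sum idea, here is a short Python sketch (not Scala, and not Spark-specific). The helper name sum_every_n is hypothetical. Like Scala's grouped, it keeps a shorter final group when the length is not a multiple of n.

```python
def sum_every_n(values, n):
    """Sum consecutive groups of n elements; the last group may be shorter."""
    return [sum(values[i:i + n]) for i in range(0, len(values), n)]

arr = [3, 1, 9, 2, 5, 8]
print(sum_every_n(arr, 3))           # [13, 15]
print(sum_every_n([3, 1, 9, 2], 3))  # [13, 2]
```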

Calling the toIterator method on the array before calling grouped should speed things up a bit, i.e.
arr.toIterator.grouped(3).map(_.sum).toArray
For example, using
val xs = Array.range(0, 10000)
10000 iterations of
xs.toIterator.grouped(3).map(_.sum).toArray
take about 16.93 seconds, while 10000 iterations of
xs.grouped(3).map(_.sum).toArray
require approximately 21.49 seconds.

Looping through slices of Theano tensor

I have two 2D Theano tensors, call them x_1 and x_2, and suppose for the sake of example, both x_1 and x_2 have shape (1, 50). Now, to compute their mean squared error, I simply run:
T.sqr(x_1 - x_2).mean(axis = -1)
However, what I wanted to do was construct a new tensor that consists of their mean squared error in chunks of 10. In other words, since I'm more familiar with NumPy, what I had in mind was to create the following tensor M in Theano:
M = [theano.tensor.sqr(x_1[:, i:i+10] - x_2[:, i:i+10]).mean(axis = -1) for i in xrange(0, 50, 10)]
Now, since Theano doesn't have for loops, but instead uses scan (which map is a special case of), I thought I would try the following:
sequence = T.arange(0, 50, 10)
M = theano.map(lambda i: theano.tensor.sqr(x_1[:, i:i+10] - x_2[:, i:i+10]).mean(axis = -1), sequence)
However, this does not seem to work, as I get the error:
only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices
Is there a way to loop through the slices using theano.scan (or map)? Thanks in advance, as I'm new to Theano!

Similar to what can be done in numpy, a solution would be to reshape your (1, 50) tensor to a (1, 5, 10) tensor (or even a (5, 10) tensor), so that each chunk of 10 values lies along the last axis, and then compute the mean along that last axis.
To illustrate this with numpy, suppose I want to compute means by slices of 2:
x = np.array([0, 2, 0, 4, 0, 6])
x = x.reshape([3, 2])
np.mean(x, axis=1)
outputs
array([ 1., 2., 3.])
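
Applied to the shapes in the question, the reshape trick can be sketched in NumPy as follows. This is NumPy, not Theano: it assumes the same reshape-then-mean pattern carries over to Theano tensors, and the random inputs are placeholders. The explicit loop is there only to check the reshaped result.

```python
import numpy as np

rng = np.random.default_rng(0)
x_1 = rng.normal(size=(1, 50))  # placeholder data
x_2 = rng.normal(size=(1, 50))  # placeholder data

sq_err = (x_1 - x_2) ** 2       # shape (1, 50)

# 5 chunks of 10 values each; average within each chunk (last axis).
chunked = sq_err.reshape(1, 5, 10).mean(axis=-1)  # shape (1, 5)

# Reference: the explicit loop over slices from the question.
looped = np.stack(
    [sq_err[:, i:i + 10].mean(axis=-1) for i in range(0, 50, 10)],
    axis=1,
)

print(chunked.shape)                 # (1, 5)
print(np.allclose(chunked, looped))  # True
```

Because reshape keeps elements in row-major order, each row of the last axis holds exactly one contiguous slice of 10 values, which is why the two results agree.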