Applying a threshold to all values in an array

I have an array of 1000 elements whose values range from 0 to 1.
I want to scan that array and zero all values below a certain threshold, say 0.3.
I know I can do something like
let filteredArrayOnDict = myArray.filter { $0 > 0.3 }
and I will get a new array containing only the elements above 0.3. But that is not what I want: I want to zero the elements below 0.3 and keep the resulting array at the same number of elements.
I can iterate over the array like
var newArray: [Double] = []
for item in myArray {
    if item > 0.3 {
        newArray.append(item)
    } else {
        newArray.append(0)
    }
}
but I wonder if there is a more elegant method using those magical commands like filter, map, flatMap, etc.

The Accelerate framework has a dedicated function vDSP_vthresD for this purpose:
Vector threshold with zero fill; double precision.
Example:
import Accelerate
let array = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
var threshold = 0.3
var result = Array(repeating: 0.0, count: array.count)
vDSP_vthresD(array, 1, &threshold, &result, 1, vDSP_Length(array.count))
print(result) // [0.0, 0.0, 0.0, 0.3, 0.4, 0.5, 0.6]
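On newer systems the same operation is also exposed through Accelerate's Swift overlay. A minimal sketch, assuming a deployment target (macOS 10.15 / iOS 13 or later) where vDSP.threshold is available:

import Accelerate

let array = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

// The .zeroFill rule replaces values below the threshold with zero
// and passes values at or above it through unchanged.
let result = vDSP.threshold(array, to: 0.3, with: .zeroFill)
print(result) // [0.0, 0.0, 0.0, 0.3, 0.4, 0.5, 0.6]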

You can try map for this:
let resultArray = myArray.map { $0 > 0.3 ? $0 : 0 }

Related

How to remove numpy array row which matches the string in list

I have got an array which looks like
array = array([['Mango', 0.75, 0.25],
               ['Honey', 0.75, 0.25],
               ['Grape', 0.625, 0.375],
               ['Pineapple', 0.5, 0.5]], dtype=object)
and a list item = {'Honey', 'Grape'}.
Now I have to remove the rows of the array that match the items in the list.
Expected Output:
array = array([['Mango', 0.75, 0.25],
               ['Pineapple', 0.5, 0.5]], dtype=object)
I have tried the code below, but it doesn't work.
array[:] = [x for x in array[:,0] if item not in x]
Help me with this. Thanks in advance!
You can use:
out = array[~np.isin(array[:, 0], list(item))]
(item is written as a set literal, and np.isin does not handle sets as expected, so cast it to a list first.)
Output:
array([['Mango', 0.75, 0.25],
       ['Pineapple', 0.5, 0.5]], dtype=object)
but you may want to have a look at np.recarray or a pandas DataFrame, which are better suited to this kind of mixed data.
Another possible solution, using numpy broadcasting:
array[np.all(array[:, 0][:, None] != list(item), axis=1), :]
Output:
array([['Mango', 0.75, 0.25],
       ['Pineapple', 0.5, 0.5]], dtype=object)

Remove elements of 2d array whose 1st column values are duplicates (Swift)

This query is a small twist on a commonly asked question.
The goal is to filter a 2d array, removing element pairs whose first-column values are duplicates. For example:
[[1, 1, 1, 2, 2, 3, 3, 3, 3], [0.1, 0.15, 0.2, 0.05, 0.1, 0.2, 0.25, 0.3, 0.35]]
-> [[1, 2, 3],[0.2, 0.1, 0.35]]
Since the second-column values vary, there is obviously some discretion that needs to be applied when filtering: here, the last value of the set of duplicates is chosen.
One of the myriad answers to this related question -- a functional programming solution by Tim MB -- can be adapted to the task:
// Use FP-style filtering to eliminate repeated elements
let rawArray: [[Float]] = [...]
let filteredArray = rawArray
    .transpose
    .enumerated()
    .filter { rawArray[0].lastIndex(of: $0.1[0]) == $0.0 }
    .map { $0.1 }
    .transpose
However, this solution is rather slow, which is unfortunate because it's elegant.
A faster solution that keeps the FP spirit is to use dictionary hashing:
// Use the array -> dictionary -> array trick to remove repeated elements
let rawArray: [[Float]] = [...]
let filteredArray = rawArray
    .transpose
    .reduce(into: [Float: Float]()) { dict, elements in
        dict[elements[0]] = elements[1] // later duplicates overwrite earlier ones
    }
    .map { ($0.key, $0.value) }
    .sorted { $0.0 < $1.0 }
    .map { [$0.0, $0.1] }
    .transpose
My questions are:
Is this dictionary trick a good idea? Given that it uses floats as keys?
Why is the FP solution slow? Can it be sped up?
Are there better alternatives?
Note on terminology: I'll use a to refer to your array, length to refer to its count (a.count), and width to refer to its elements' widths (a[0].count).
There are a few things here that are each pretty brutal on your performance.
Transposing
Firstly, each array transposition is O(width * length). Depending on the implementation, it could also be particularly rough on your cache. And you do it twice. Thus, it's an important goal to avoid transposition if possible.
In your case, since you have vectors with only two elements, you can use zip to iterate your two column vectors in tandem. The result is a sequence that does so lazily, so no copying happens, and no extra memory or time is used.
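For instance, a minimal sketch of that idea (assuming the two-row [[Float]] layout from the question):

let rawArray: [[Float]] = [[1, 1, 1, 2, 2, 3, 3, 3, 3],
                           [0.1, 0.15, 0.2, 0.05, 0.1, 0.2, 0.25, 0.3, 0.35]]

// zip walks the two row vectors in tandem, lazily: no transposed copy is built.
for (key, value) in zip(rawArray[0], rawArray[1]) {
    // process each (key, value) pair here
    print(key, value)
}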
Deduplication
The implementation of deduplication that you stumbled on (.filter { rawArray[0].lastIndex(of: $0.1[0]) == $0.0 }) is hot garbage. It's O(width²). It's actually worse than approaches that use Array.contains to maintain an array of "already seen" elements. When contains is looking for an element, it can bail early when it finds a match. lastIndex(of:) always has to go through the entire array, never returning early, because there could always be a later instance of the sought-after element.
Where possible, use an implementation that takes advantage of the Hashability of your elements. Using a Set to track the "already seen" elements gives you O(1) contains checks, instead of an array's O(count). I strongly recommend Cœur's implementation.
There's only one catch: that implementation keeps only the first occurrences, not the last. Luckily, that's really easy to fix: just reverse the elements, unique them (keeping the firsts of the reversed elements is like keeping the lasts of the original elements), and reverse them back.
My solution:
extension Sequence {
    /// Returns an array containing, in order, the first instances of
    /// elements of the sequence that compare equally for the projected key.
    /// (A closure is used rather than a KeyPath because Swift key paths
    /// cannot reference tuple elements such as .0)
    func unique<T: Hashable>(for key: (Element) -> T) -> [Element] {
        var unique = Set<T>()
        return filter { unique.insert(key($0)).inserted }
    }
}
let points = zip(array[0], array[1])
let pointsUniquedByXs = points.reversed() // O(1) for collections
    .unique(for: { $0.0 }) // O(count)
    .reversed() // O(1) until you need to materialize it as a reversed collection
You can accomplish what you want by first filtering the indices of the first subarray for which the element is the first occurrence when traversing in reverse order. Then you just need to map the subsequences using those indices:
let rawArray: [[Float]] = [[1, 1, 1, 2, 2, 3, 3, 3, 3],
                           [0.1, 0.15, 0.2, 0.05, 0.1, 0.2, 0.25, 0.3, 0.3]]
var set: Set<Float> = []
let indices = rawArray
    .first?
    .indices
    .reversed()
    .filter { set.insert(rawArray.first![$0]).inserted }
    .reversed() ?? []
let result = rawArray.map { elements in indices.map { elements[$0] } }
print(result) // [[1, 2, 3], [0.2, 0.1, 0.3]]
Another option is to create two empty subsequences, iterate the indices of the first rawArray subsequence in reverse, and try to insert each float value into a set; if it is inserted, append the corresponding elements to the subsequences. Then you just need to recreate the resulting array from those two new sequences, reversed:
let rawArray: [[Float]] = [[1, 1, 1, 2, 2, 3, 3, 3, 3],
                           [0.1, 0.15, 0.2, 0.05, 0.1, 0.2, 0.25, 0.3, 0.3]]
var set: Set<Float> = []
var sub1: [Float] = []
var sub2: [Float] = []
rawArray[0].indices.reversed().forEach {
    let value = rawArray[0][$0]
    if set.insert(value).inserted {
        sub1.append(value)
        sub2.append(rawArray[1][$0])
    }
}
let result: [[Float]] = [sub1.reversed(), sub2.reversed()] // [[1, 2, 3], [0.2, 0.1, 0.3]]
You can make it even faster if the result array is declared as a reversed collection of floating-point values: reversing is then O(1) for [ReversedCollection<[Float]>] instead of O(n) for [[Float]] for each subsequence.
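A minimal sketch of that variant, reusing sub1 and sub2 from the snippet above:

// Annotating the result with ReversedCollection keeps the lazy O(1)
// reversed views instead of forcing the O(n) array copies that the
// [[Float]] annotation above makes.
let lazyResult: [ReversedCollection<[Float]>] = [sub1.reversed(), sub2.reversed()]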
Thanks to Alexander, here is a solution adapted from Cœur's method in the long related thread.
let rawArray: [[Float]] = [[1, 1, 1, 2, 2, 3, 3, 3, 3],
                           [0.1, 0.15, 0.2, 0.05, 0.1, 0.2, 0.25, 0.3, 0.35]]
let filteredArray = rawArray
    .transpose
    .reversed()
    .map { ($0[0], $0[1]) }
    .unique(for: { $0.0 })
    .map { [$0.0, $0.1] }
    .reversed()
    .transpose
All that cruft arises because the data is a two-column float array rather than a 1d array of tuples, and because the last rather than the first duplicate value is required to be selected.
For this to work, Array needs the following extensions, the first courtesy of Alexander and Cœur, the second (a revision) thanks to Leo Dabus:
extension RangeReplaceableCollection {
    /// Returns a collection containing, in order, the first instances of
    /// elements of the sequence that compare equally for the projected key.
    /// (A closure is used rather than a KeyPath because Swift key paths
    /// cannot reference tuple elements such as .0)
    func unique<T: Hashable>(for key: (Element) -> T) -> Self {
        var unique = Set<T>()
        return filter { unique.insert(key($0)).inserted }
    }
}
extension RandomAccessCollection where Element: RandomAccessCollection {
    /// Perform a transpose operation
    var transpose: [[Element.Element]] {
        guard !isEmpty,
              var index = first?.startIndex,
              let endIndex = first?.endIndex
        else { return [] }
        var result: [[Element.Element]] = []
        while index < endIndex {
            result.append(map { $0[index] })
            first?.formIndex(after: &index)
        }
        return result
    }
}

Scala: Append to Array within a Map

I am learning immutable types in Scala and am struggling to get this elementary task done. I simply need to append to an array of doubles that is within a map. I do not want to use ArrayBuffer.
My ultimate goal is to make an adjacency matrix. When I append a new item to the map (an (Int, Double) tuple), I want to increase the size of each array within the map, essentially increasing the dimension of the matrix.
var map = Map[Int, Array[Double]]()
map += (0 -> new Array[Double](5))

// HOW TO DO THIS
map(0) = map(0) :+ 0.01

for ((i, a) <- map) {
  print(i + ": ")
  for (d <- a) print(d + ", ")
}
What I have written above does not compile. However, map(0) :+ 0.01 alone will compile, but it does not achieve my goal of appending to an immutable array within a map.
Because it is an immutable Map, you can't change the value in place, as you've tried to do with map(0) = map(0) :+ 0.01.
One possible solution is the updated method, which returns an updated map (all methods that add, remove, or modify entries in immutable data structures return a new data structure):
map = map.updated(0, map(0) :+ 0.01)
Some examples to prove:
var map = Map[Int, Array[Double]]()
map += (0 -> new Array[Double](5))
map = map.updated(0, map(0) :+ 0.01)
map(0) // res1: Array[Double] = Array(0.0, 0.0, 0.0, 0.0, 0.0, 0.01)
map = map.updated(0, map(0) :+ 0.02)
map(0) // res2: Array[Double] = Array(0.0, 0.0, 0.0, 0.0, 0.0, 0.01, 0.02)

Scala: merge two arrays in one single structure

I have two arrays:
val diceProbDist = new Array[Double](2 * DICE + 1)
and
val diceExpDist = new Array[Double](2 * DICE + 1)
and I want to merge them into one single structure (some sort of tuple, maybe):
(0, 0.0, 0.0)(1, 0.0, 0.0)(2, 0.02778, 0.02878)...
where the first entry is the array index, the second entry is the first array value and the third entry is the second array value.
Is there some Scala function to accomplish that (zip with map or something like that)?
thanks,
ML
val diceProbDist = Array(0.1, 0.2, 0.3)
val diceExpDist = Array(0.11, 0.22, 0.33)
diceProbDist
  .zip(diceExpDist)
  .zipWithIndex
  .map { case ((v1, v2), i) => (i, v1, v2) }
// result: Array((0,0.1,0.11), (1,0.2,0.22), (2,0.3,0.33))
A simple for comprehension should also do the trick:
for {
  index <- 0 until math.min(diceExpDist.length, diceProbDist.length)
} yield (index, diceProbDist(index), diceExpDist(index))
Another solution, similar to tkachuko's, without the for comprehension:
val diceProbDist = List(0.1, 0.2, 0.3)
val diceExpDist = List(0.11, 0.22, 0.33)

val range = 0 until math.min(diceExpDist.length, diceProbDist.length)
range.map { idx => (idx, diceProbDist(idx), diceExpDist(idx)) }
// result: Vector((0,0.1,0.11), (1,0.2,0.22), (2,0.3,0.33))

Create array with non-integer increments

I am trying to create time stamp arrays in Swift.
So, say I want to go from 0 to 4 seconds, I can use Array(0...4), which gives [0, 1, 2, 3, 4]
But how can I get [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]?
Essentially I want a flexible delta, such as 0.5, 0.05, etc.
You can use stride(from:through:by:):
let a = Array(stride(from: 0.0, through: 4.0, by: 0.5))
An alternative for non-constant increments (even more viable in Swift 3.1)
The stride(from:through:by:) function covered in Alexander's answer is the fit-for-purpose solution here, but for readers of this Q&A who want to construct a sequence (or collection) of non-constant increments (a case where the linear-sequence-constructing stride(...) falls short), I'll also include another alternative.
For such scenarios, sequence(first:next:) is a good method of choice; it constructs a lazy sequence that can be repeatedly queried for the next element.
E.g., constructing the first 5 ticks for a log10 scale (Double array)
let log10Seq = sequence(first: 1.0, next: { 10*$0 })
let arr = Array(log10Seq.prefix(5)) // [1.0, 10.0, 100.0, 1000.0, 10000.0]
Swift 3.1 is intended to be released in the spring of 2017, and with this (among lots of other things) comes the implementation of the following accepted Swift evolution proposal:
SE-0045: Add prefix(while:) and drop(while:) to the stdlib
prefix(while:) in combination with sequence(first:next:) provides a neat tool for generating sequences with everything from simple next closures (such as imitating the behaviour of stride(...)) to more advanced ones. The stride(...) example of this question is a good minimal (very simple) example of such usage:
/* this we can do already in Swift 3.0 */
let delta = 0.5
let seq = sequence(first: 0.0, next: { $0 + delta })

/* 'prefix(while:)' soon available in Swift 3.1 */
let arr = Array(seq.prefix(while: { $0 <= 4.0 }))
// [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
// ...
for elem in sequence(first: 0.0, next: { $0 + delta })
    .prefix(while: { $0 <= 4.0 }) {
    // ...
}
Again, this is not in contest with stride(...) in the simple case of this question, but it is very viable as soon as the simple applications of stride(...) fall short, e.g. for constructing non-linear sequences.
