Swift Array extension for standard deviation - arrays

I frequently need to calculate the mean and standard deviation of numeric arrays, so I've written a small protocol and extensions for numeric types that seem to work. I would just like feedback on whether there is anything wrong with how I have done this. Specifically, I am wondering if there is a better way to check whether the type can be cast to Double, to avoid the need for the asDouble variable and the init(_: Double) initializer.
I know there are issues with protocols that allow for arithmetic, but this seems to work ok and saves me from putting the standard deviation function into classes that need it.
protocol Numeric {
var asDouble: Double { get }
init(_: Double)
}
extension Int: Numeric {var asDouble: Double { get {return Double(self)}}}
extension Float: Numeric {var asDouble: Double { get {return Double(self)}}}
extension Double: Numeric {var asDouble: Double { get {return Double(self)}}}
extension CGFloat: Numeric {var asDouble: Double { get {return Double(self)}}}
extension Array where Element: Numeric {
var mean : Element { get { return Element(self.reduce(0, combine: {$0.asDouble + $1.asDouble}) / Double(self.count))}}
var sd : Element { get {
let mu = self.reduce(0, combine: {$0.asDouble + $1.asDouble}) / Double(self.count)
let variances = self.map{pow(($0.asDouble - mu), 2)}
return Element(sqrt(variances.mean))
}}
}
edit: I know it's kind of pointless to get [Int].mean and sd, but I might use Numeric elsewhere, so it's for consistency.
edit: as @Severin Pappadeux pointed out, variance can be expressed in a manner that avoids the triple pass over the array (mean, then map, then mean). Here is the final standard deviation extension:
extension Array where Element: Numeric {
var sd : Element { get {
let sss = self.reduce((0.0, 0.0)){ return ($0.0 + $1.asDouble, $0.1 + ($1.asDouble * $1.asDouble))}
let n = Double(self.count)
return Element(sqrt(sss.1/n - (sss.0/n * sss.0/n)))
}}
}

Swift 4 Array extension with FloatingPoint elements:
extension Array where Element: FloatingPoint {
func sum() -> Element {
return self.reduce(0, +)
}
func avg() -> Element {
return self.sum() / Element(self.count)
}
func std() -> Element {
let mean = self.avg()
let v = self.reduce(0, { $0 + ($1-mean)*($1-mean) })
return sqrt(v / (Element(self.count) - 1))
}
}

There's actually a class that provides this functionality already - called NSExpression. You could reduce your code size and complexity by using this instead. There's quite a bit of stuff to this class, but a simple implementation of what you want is as follows.
let expression = NSExpression(forFunction: "stddev:", arguments: [NSExpression(forConstantValue: [1,2,3,4,5])])
let standardDeviation = expression.expressionValueWithObject(nil, context: nil)
You can calculate mean too, and much more. Info here: http://nshipster.com/nsexpression/

In Swift 3 you might (or might not) be able to save yourself some duplication with the FloatingPoint protocol, but otherwise what you're doing is exactly right.

To follow up on Matt's observation, I'd do the main algorithm on FloatingPoint, taking care of Double, Float, CGFloat, etc. But then I'd do another rendition of this on BinaryInteger, to take care of all of the integer types.
E.g. on FloatingPoint:
extension Array where Element: FloatingPoint {
/// The mean average of the items in the collection.
var mean: Element { return reduce(Element(0), +) / Element(count) }
/// The unbiased sample standard deviation. Is `nil` if the collection has too few items.
var stdev: Element? {
guard count > 1 else { return nil }
return sqrt(sumSquaredDeviations() / Element(count - 1))
}
/// The population standard deviation. Is `nil` if the collection has too few items.
var stdevp: Element? {
guard count > 0 else { return nil }
return sqrt(sumSquaredDeviations() / Element(count))
}
/// Calculate the sum of the squares of the differences of the values from the mean
///
/// A calculation common for both sample and population standard deviations.
///
/// - calculate mean
/// - calculate deviation of each value from that mean
/// - square that
/// - sum all of those squares
private func sumSquaredDeviations() -> Element {
let average = mean
return map {
let difference = $0 - average
return difference * difference
}.reduce(Element(0), +)
}
}
But then on BinaryInteger:
extension Array where Element: BinaryInteger {
var mean: Double { return map { Double(exactly: $0)! }.mean }
var stdev: Double? { return map { Double(exactly: $0)! }.stdev }
var stdevp: Double? { return map { Double(exactly: $0)! }.stdevp }
}
Note, in my scenario, even when dealing with integer input data, I generally want floating point mean and standard deviations, so I arbitrarily chose Double. And you might want to do safer unwrapping of Double(exactly:). You can handle this scenario any way you want. But it illustrates the idea.
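The safer unwrapping mentioned above could look something like the following. This is only a sketch: it assumes that a rounded conversion is acceptable whenever an integer has no exact Double representation, and the property name meanAsDouble is hypothetical. Double(exactly:) and Double(_:) are the standard library initializers.

```swift
extension Array where Element: BinaryInteger {
    /// Mean of the integers as a Double, or nil for an empty array.
    /// Falls back to the rounding initializer Double(_:) when a value
    /// cannot be represented exactly, instead of trapping on `!`.
    var meanAsDouble: Double? {
        guard !isEmpty else { return nil }
        let total = reduce(0.0) { $0 + (Double(exactly: $1) ?? Double($1)) }
        return total / Double(count)
    }
}

let samples = [2, 4, 4, 4, 5, 5, 7, 9]
if let mean = samples.meanAsDouble {
    print(mean)
}
```

Whether to round, return nil, or throw for values that do not fit is a design choice; rounding is used here only because it keeps the example short.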

Not that I know Swift, but from a numerics point of view you're doing it a bit inefficiently.
Basically, you're doing two passes (actually, three) over the array to compute two values, where one pass should be enough. Variance can be expressed as E(X^2) - E(X)^2, so in some pseudo-code:
tuple<float,float> get_mean_sd(data) {
float s = 0.0f;
float s2 = 0.0f;
for(float v: data) {
s += v;
s2 += v*v;
}
s /= count;
s2 /= count;
s2 -= s*s;
return tuple(s, sqrt(s2 > 0.0 ? s2 : 0.0));
}
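For readers who want the pseudo-code above in Swift, here is a rough translation. It is a sketch under a few assumptions: the input is [Double], an empty array yields nil, and the function name meanAndPopulationSD is made up for illustration. Like the pseudo-code, it clamps the variance at zero before taking the square root.

```swift
/// One-pass mean and population standard deviation, following the
/// E(X^2) - E(X)^2 identity. Returns nil for an empty array.
func meanAndPopulationSD(_ data: [Double]) -> (mean: Double, sd: Double)? {
    guard !data.isEmpty else { return nil }
    var sum = 0.0
    var sumOfSquares = 0.0
    for value in data {
        sum += value
        sumOfSquares += value * value
    }
    let n = Double(data.count)
    let mean = sum / n
    // Rounding can push the difference slightly below zero, so clamp it.
    let variance = max(sumOfSquares / n - mean * mean, 0)
    return (mean, variance.squareRoot())
}

if let stats = meanAndPopulationSD([2, 4, 4, 4, 5, 5, 7, 9]) {
    print(stats.mean, stats.sd)
}
```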

Just a heads-up, but when I tested the code outlined by Severin Pappadeux the result was a "population standard deviation" rather than a "sample standard deviation". You would use the first in an instance where 100% of the relevant data is available to you, such as when you are computing the variance around an average grade for all 20 students in a class. You would use the second if you did not have universal access to all the relevant data, and had to estimate the variance from a much smaller sample, such as estimating the height of all males within a large country.
The population standard deviation is often denoted as StDevP. The Swift 5.0 code I used is shown below. Note that this is not suitable for very large arrays, because the low-order ("small value") bits are lost as the summations get large. Especially when the variance is close to zero you might run into run-time problems: rounding can make the computed variance slightly negative, in which case sqrt returns NaN. For such serious work you might have to introduce an algorithm called compensated summation.
import Foundation
extension Array where Element: FloatingPoint
{
var sum: Element {
return self.reduce( 0, + )
}
var average: Element {
return self.sum / Element( count )
}
/**
(for a floating point array) returns a tuple containing the average and the "standard deviation for populations"
*/
var averageAndStandardDeviationP: ( average: Element, stDevP: Element ) {
let sumsTuple = sumAndSumSquared
let populationSize = Element( count )
let average = sumsTuple.sum / populationSize
let expectedXSquared = sumsTuple.sumSquared / populationSize
let variance = expectedXSquared - (average * average )
return ( average, sqrt( variance ) )
}
/**
(for a floating point array) returns a tuple containing the sum of all the values and the sum of all the values-squared
*/
private var sumAndSumSquared: ( sum: Element, sumSquared: Element ) {
return self.reduce( (Element(0), Element(0) ) )
{
( arg0, x) in
let (sumOfX, sumOfSquaredX) = arg0
return ( sumOfX + x, sumOfSquaredX + ( x * x ) )
}
}
}
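The compensated summation mentioned above is usually implemented as Kahan summation. The following is a minimal sketch of that technique, not part of the code I tested; the property name compensatedSum is hypothetical.

```swift
extension Array where Element: FloatingPoint {
    /// Sum using Kahan compensated summation: a running correction term
    /// recovers the low-order bits lost when a small value is added to a
    /// large running total.
    var compensatedSum: Element {
        var total: Element = 0
        var compensation: Element = 0
        for value in self {
            let corrected = value - compensation
            let newTotal = total + corrected
            // (newTotal - total) is what was actually added; subtracting
            // `corrected` leaves the rounding error of this step.
            compensation = (newTotal - total) - corrected
            total = newTotal
        }
        return total
    }
}

let manySmallValues = [Float](repeating: 0.1, count: 1_000_000)
print(manySmallValues.reduce(0, +), manySmallValues.compensatedSum)
```

Swapping this in for the plain reduce in sum reduces the accumulated rounding error, at the cost of a few extra floating-point operations per element. It does not remove the cancellation in E(X^2) - E(X)^2 itself.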

Related

iOS How to Find Minimum Difference Between the values of Array of Floating/Integer Values

I have an array of floating values:
let array:[Double] = [2270.87, 2285.15, 2273.49, 2312.89, 2323.07, 2336.14, 2355.09, 2633.0, 2671.34]
I need a single-line solution, using Swift higher-order functions or an array extension, that finds the minimum of all the differences between the values.
I tried but I'm unable to move further:
let array = [2270.87, 2285.15, 2273.49, 2312.89, 2323.07, 2336.14, 2355.09, 2633.0, 2671.34]
let minDiff = array.map( { *All differences between array of values* } ).reduce(0, min)
Actually, I am showing these values in a graph, so I want the minimum absolute fluctuation between consecutive values. In the above example, 2312.89 and 2323.07 have the minimum fluctuation, 10.18.
You can zip the array with a copy of itself that drops the first element, so that each pair holds two consecutive elements. Then use map() to turn each pair into the absolute value of its difference, and take min() of those absolute values.
zip(array, array.dropFirst()).map { abs($1 - $0) }.min() // 10.180000000000291
If you want to find the minimum absolute difference between two consecutive elements of your array, you can use below extension, which maps over the indices of the array, to access 2 consecutive elements at a time.
dropLast is important to ensure that we stop iteration before the last element (since we calculate the diff between the penultimate and last element before reaching the last index).
extension Array where Element: Comparable, Element: SignedNumeric {
func minConsecutiveDiff() -> Element? {
indices.dropLast().map { abs(self[$0] - self[$0+1])}.min()
}
}
let array:[Double] = [2270.87, 2285.15, 2273.49, 2312.89, 2323.07, 2336.14, 2355.09, 2633.0, 2671.34]
array.minConsecutiveDiff() // 10.18
If you were interested in the diff between any two elements of the array, not just consecutive ones, you could get that by first sorting the Array and then calculating the diff between the consecutive elements of the sorted array, as @MartinR pointed out in comments.
extension Array where Element: Comparable, Element: SignedNumeric {
func minDiff() -> Element? {
sorted().minConsecutiveDiff()
}
}
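To make the difference between the two variants concrete, here is a small self-contained sketch. It uses free functions on [Int] with hypothetical names (minAdjacentGap, minAnyGap) rather than the extensions above, purely so the example stands alone.

```swift
/// Smallest absolute difference between neighbouring elements, in array order.
func minAdjacentGap(_ values: [Int]) -> Int? {
    return zip(values, values.dropFirst()).map { abs($1 - $0) }.min()
}

/// Smallest absolute difference between any two elements: after sorting,
/// the closest pair must sit next to each other.
func minAnyGap(_ values: [Int]) -> Int? {
    return minAdjacentGap(values.sorted())
}

let readings = [10, 1, 9, 2]
print(minAdjacentGap(readings) as Any)  // neighbours: |1-10|, |9-1|, |2-9|
print(minAnyGap(readings) as Any)       // neighbours after sorting: [1, 2, 9, 10]
```

Both functions return nil when the array has fewer than two elements, because there is no pair to compare.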
Here is one way using forEach
var min: Double = array.max()!
var previous: Double?
array.forEach {
if let prev = previous, min > abs($0 - prev) {
min = abs($0 - prev)
}
previous = $0
}
Another option is to use reduce(into:) with a tuple:
let min = array.reduce(into: (Double, Double)(array.first!, 0)) {
if abs($1 - $0.1) < $0.0 {
$0.0 = abs($1 - $0.1)
}
$0.1 = $1
}.0
Both give 10.18 as the smallest difference.

Type Int does not conform to protocol sequence

I have the following code in Swift 3:
var numbers = [1,2,1]
for number in numbers.count - 1 { // error
if numbers[number] < numbers[number + 1] {
print(number)
}
}
I am checking if the value on the index [number] is always higher than the value on the index [number + 1]. I am getting an error:
Type Int does not conform to protocol sequence
Any idea?
It may be swift.
You can use this iteration.
for number in 0..<(numbers.count-1)
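One caveat with this range, as far as I can tell: 0..<(numbers.count-1) traps at run time when the array is empty, because the upper bound ends up below the lower bound. A sketch that sidesteps this by iterating over indices.dropLast() is shown here; the function name risingIndices is made up for the example.

```swift
/// Returns every index i where numbers[i] < numbers[i + 1].
func risingIndices(in numbers: [Int]) -> [Int] {
    var result: [Int] = []
    // indices.dropLast() equals 0..<(count - 1) for non-empty arrays and
    // is simply empty for an empty array, so the loop never traps.
    for i in numbers.indices.dropLast() {
        if numbers[i] < numbers[i + 1] {
            result.append(i)
        }
    }
    return result
}

print(risingIndices(in: [1, 2, 1]))
```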
The error is because Int is not a Sequence. You can create a range as already suggested, which does conform to a sequence and will allow iteration using for in.
One way to make Int conform to a sequence is:
extension Int: Sequence {
public func makeIterator() -> CountableRange<Int>.Iterator {
return (0..<self).makeIterator()
}
}
Which would then allow using it as a sequence with for in.
for i in 5 {
print(i)
}
but I wouldn't recommend doing this. It's only to demonstrate the power of protocols but would probably be confusing in an actual codebase.
From your example, it looks like you are trying to compare consecutive elements of the collection. A custom iterator can do just that while keeping the code fairly readable:
public struct ConsecutiveSequence<T: IteratorProtocol>: IteratorProtocol, Sequence {
private var base: T
private var index: Int
private var previous: T.Element?
init(_ base: T) {
self.base = base
self.index = 0
}
public typealias Element = (T.Element, T.Element)
public mutating func next() -> Element? {
guard let first = previous ?? base.next(), let second = base.next() else {
return nil
}
previous = second
return (first, second)
}
}
extension Sequence {
public func makeConsecutiveIterator() -> ConsecutiveSequence<Self.Iterator> {
return ConsecutiveSequence(self.makeIterator())
}
}
which can be used as:
for (x, y) in [1,2,3,4].makeConsecutiveIterator() {
if (x < y) {
print(x)
}
}
In the above example, the iterator will go over the following pairs:
(1, 2)
(2, 3)
(3, 4)
This may be a little late, but you could have done:
for number in numbers { }
instead of:
for number in numbers.count - 1 { }
For a for loop to work, a sequence (such as a range) is needed. A range consists of a starting value, an ending value, and everything in between. This means that a for loop can be told to loop through a sequence with either
for number in 0...numbers.count-1 { }
or
for number in numbers { }
Both examples give the necessary sequence, whereas:
for number in numbers.count - 1 { }
gives only one value, which could be either the starting or the ending value, making it impossible to work out how many times the for loop has to run.
For more information, see Apple's Swift control flow documentation.
This error can also come about if you try to enumerate an array instead of the enumerated array. For example:
for (index, element) in [0, 3, 4] {
}
Should be:
for (index, element) in [0, 3, 4].enumerated() {
}
So first you need to understand what a sequence is:
A type that provides sequential, iterated access to its elements.
A sequence is a list of values that you can step through one at a time. The most common way to iterate over the elements of a sequence is to use a for-in loop:
let oneTwoThree = 1...3 // Sequence
A for loop actually means
for number in sequence {}
So you need to use
for number in 0..<(numbers.count-1) {}
The error is because number is not an index, but the element of the array on each iteration. You can modify your code like this:
var numbers = [1,2,1,0,3]
for number in 0..<numbers.count - 1 {
if numbers[number] < numbers[number + 1] {
print(numbers[number])
}
}
Or there is a trick using the sort method, but that's kind of a hack (and yes, the subindexes are right, but look like inverted; you can try this directly on a Playground):
var numbers = [1,2,1,0,3]
numbers.sort {
if $0.1 < $0.0 {
print ($0.1)
}
return false
}
For me, this error occurred when I tried writing a for loop, not for an array but a single element of the array.
For example:
let array = [1,2,3,4]
let item = array[0]
for its in item
{
print(its)
}
This gives an error like: Type Int does not conform to protocol 'sequence'
So, if you get this error in for loop, please check whether you are looping an array or not.

Swift - speed and efficiency of higher order functions (reduce)

A quick question about the efficiency of higher-order Swift functions with large input data. During a recent test I was asked to find 'equilibrium indexes' in arrays, i.e. an index of an array where the sum of all elements below the index equals the sum of all elements above the index.
An equilibrium index of this array is any integer P such that 0 ≤ P <
N and the sum of elements of lower indices is equal to the sum of
elements of higher indices, i.e.
A[0] + A[1] + ... + A[P−1] = A[P+1] + ... + A[N−2] + A[N−1].
The challenge was to write a short function that computes the first (or any) index considered an 'equilibrium'.
I put together a simple snippet which scored highly but failed some of the 'performance' tests which used large input data (array sizes around 100,000).
Here's the code
public func solution(inout A : [Int]) -> Int {
var index = 0;
for _ in A {
let sumBefore = A[0...index].reduce(0) { $0 + $1 }
let sumAfter = A[index...A.count-1].reduce(0) { $0 + $1 }
if (sumBefore == sumAfter) { return index; }
index += 1;
}
return -1;
}
Would anyone please be able to explain why the code performs so poorly with large sets of data, or any recommended alternatives?
Here, for example, is a description of a failing performance test:
Large performance test, O(n^2) solutions should fail.
✘ TIMEOUT ERROR
running time: >6.00 sec., time limit: 0.22 sec.
It looks like the challenge is failing because your solution is O(n^2).
Your for loop runs n iterations, and the two reduce calls inside it together traverse roughly n elements each time, so the solution performs on the order of n^2 additions.
A simpler solution is to first compute the whole sum, and then iterate through the elements once, subtracting each value from the whole sum, one by one, thus having access to the left and right sums, for comparison.
Using Swift 3.0, Xcode 8:
func findEquilibriumIndex(in array: [Int]) -> Int? {
var leftSum = 0
var rightSum = array.reduce(0, +) // Swift 3 dropped the combine: label
for (index, value) in array.enumerated() {
rightSum -= value
if leftSum == rightSum {
return index
}
leftSum += value
}
return nil
}
let sampleArray = [-7, 1, 5, 2, -4, 3, 0]
findEquilibriumIndex(in: sampleArray)
The problem is not that "the built-in functions perform so poorly."
Your solution is slow because in each iteration, N elements are
added (N being the length of the array). It would be more efficient
to compute the total sum once and update the "before sum"
and "after sum" while traversing through the array. This reduces
the complexity from O(N^2) to O(N):
public func solution(A : [Int]) -> Int {
var sumBefore = 0
var sumAfter = A.reduce(0, combine: +)
for (idx, elem) in A.enumerate() {
sumAfter -= elem
if sumBefore == sumAfter {
return idx
}
sumBefore += elem
}
return -1
}
(Swift 2.2, Xcode 7.3.1)
Remarks:
There is no reason to pass the array as inout parameter.
An operator (in this case +) can be passed as an argument to the reduce() function.
enumerate() returns a sequence of array indices together with
the corresponding element, this saves another array access.
Note also that a more "Swifty" design would be to make the return type
an optional Int? which is nil if no solution was found.
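The optional-returning design suggested in the last remark could look roughly like this in current Swift (5.x). It is a sketch of the same O(N) idea, with a function name, firstEquilibriumIndex, chosen only for this example.

```swift
/// Returns the first equilibrium index, or nil if there is none.
func firstEquilibriumIndex(in values: [Int]) -> Int? {
    var sumBefore = 0
    var sumAfter = values.reduce(0, +)   // total of all elements
    for (index, value) in values.enumerated() {
        sumAfter -= value                // now the sum strictly after `index`
        if sumBefore == sumAfter {
            return index
        }
        sumBefore += value               // include `value` for the next index
    }
    return nil
}

if let index = firstEquilibriumIndex(in: [-7, 1, 5, 2, -4, 3, 0]) {
    print(index)
}
```

The element at the candidate index is excluded from both sums, which matches the definition quoted in the question.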
The incrementalSums extension
If you define this extension
extension Array where Element : SignedInteger {
var incrementalSums: [Element] {
return Array(reduce([0]) { $0.0 + [$0.0.last! + $0.1] }.dropLast())
}
}
given an array of Int(s) you can build an array where the Int at the n-th position represents the sum of the values from 0 to (n-1) in the original array.
Example
[1, 2, 3, 10, 2].incrementalSums // [0, 1, 3, 6, 16]
The equilibriumIndex function
Now you can build a function like this
func equilibriumIndex(nums: [Int]) -> Int? {
let leftSums = nums.incrementalSums
let rightSums = nums.reversed().incrementalSums.reversed()
return Array(zip(leftSums, rightSums)).index { $0 == $1 }
}
Here is a functional version of the solution in Swift 3
let total = sampleArray.reduce(0,+)
var sum = 0
let index = sampleArray.index{ v in defer {sum += v}; return sum * 2 == total - v }
If I understand correctly the element at the resulting index is excluded from the sum on each side (which I'm not certain the other solutions achieve)

How do I pick the nearest element in an array in Swift?

I was surprised I could not find a thread on this, but I need to check a series of arrays for a specific value, and if not present, check if the value falls between the max and min value, and then choose the closest, most negative value to assign to a variable.
I attempted to accomplish this with the function below, but it yields a compiler error: Cannot call value of non-function type "Float!"
Is there any way to overcome the compiler error, or should I try a different approach?
func nearestElement(powerD : Float, array : [Float]) -> Float {
var n = 0
var nearestElement : Float!
while array[n] <= powerD {
n++;
}
nearestElement = array[n] // error: Cannot call value of non-function type "Float!"
return nearestElement;
}
I'd like to then call nearestElement() when I check each array, within arrayContains():
func arrayContains(array: [Float], powerD : Float) {
var nearestElement : Float!
if array.minElement() < powerD && powerD < array.maxElement() {
if array.contains(powerD) {
contactLensSpherePower = vertexedSpherePower
} else {
contactLensSpherePower = nearestElement(powerD, array)
}
}
}
First, it's worth noting the behavior is largely dependent upon the version of Swift you're using.
In general though, your issue is with naming a variable the same as a method:
func nearestElement(powerD : Float, array : [Float]) -> Float {
var n = 0
var nearestElement : Float! //<-- this has the same name as the function
while array[n] <= powerD {
n++;
}
nearestElement = array[n] // error: Cannot call value of non-function type "Float!"
return nearestElement;
}
In arrayContains, you'll also want to rename var nearestElement : Float! so there's no ambiguity there either.
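As an illustration of the rename, here is one possible rewrite in current Swift. It is a sketch under assumptions: the array is taken to be sorted in ascending order (as the original while loop implies), the return type is made optional so that running off the end of the array yields nil instead of a crash, and the argument labels to: and in: are my own choice.

```swift
/// Returns the first element greater than `powerD`, or nil if there is none.
/// Assumes `array` is sorted in ascending order, as the original loop does.
func nearestElement(to powerD: Float, in array: [Float]) -> Float? {
    // The local has a different name than the function, so a call to
    // nearestElement(to:in:) is never shadowed by a variable.
    let candidate = array.first(where: { $0 > powerD })
    return candidate
}

print(nearestElement(to: 0.5, in: [-1, 0, 1, 2]) as Any)
```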
Optimised solution using higher order functions:
func closestMatch(values: [Int64], inputValue: Int64) -> Int64? {
guard let first = values.first else { return nil } // an empty array has no closest match
return values.reduce(first) { abs($0 - inputValue) < abs($1 - inputValue) ? $0 : $1 }
}
With higher-order functions, finding the closest match in an array of values takes a single reduce call. Change the value type as needed.

Converting Array Of String to Double and then calculating the sum in Swift

I have an array of Strings that I would like to convert to Double. Then I would like to add each item in the array together and get the sum.
This is my code so far. After enumerating the array, I'm having trouble adding all of them together.
update: Xcode 10.1 • Swift 4.2.1 or later
let strings = ["1.9","2.7","3.1","4.5","5.0"]
let doubles = strings.compactMap(Double.init)
let sum = doubles.reduce(0, +)
print(sum) // 17.2
If you don't need the intermediary collection:
let sum = strings.reduce(0) { $0 + (Double($1) ?? .zero) }
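Note that both variants above silently skip strings that do not parse. If you would rather know which inputs were ignored, something along these lines might help. It is only a sketch, and the function name sumOfDoubles is hypothetical.

```swift
/// Sums the strings that parse as Double and reports the ones that do not.
func sumOfDoubles(in strings: [String]) -> (sum: Double, rejected: [String]) {
    var sum = 0.0
    var rejected: [String] = []
    for text in strings {
        if let value = Double(text) {
            sum += value
        } else {
            rejected.append(text)   // keep the bad input instead of hiding it
        }
    }
    return (sum, rejected)
}

let outcome = sumOfDoubles(in: ["1.5", "2.5", "abc"])
print(outcome.sum, outcome.rejected)
```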
Just map (iterate over and convert) all values in the array to Double, then reduce the Double values with a start value of 0 and the closure +, which in your case is just an operator.
reduce combines the elements of a collection into a single value by repeatedly applying the provided closure.
let stringDoubles = ["2.9","3.1","1.7","9.5","5.6"]
let sum = stringDoubles.map { Double($0)! }.reduce(0, combine: +)
print(sum) // "22.8". If start value was, for example, 10, print(sum) => "32.8"
Loop through the array.
For each string, convert the string to a Double using:
Double(string)
Then add each to the tally:
var strings:[String] = ["1.3", "1", "8", "5", "bad number"]
var tally = 0.0
for eachString in strings{
// Convert each string to Double
if let num = Double(eachString) { //Double(String) returns an optional.
tally += num
} else {
print("Error converting to Double")
}
// Another way to convert if you don't need error handling
// NSString.doubleValue will just return 0.0 on a bad string.
// let num=(eachString as NSString).doubleValue
// tally += num
}
print(tally)
Here is an answer to your first part of your question in Swift 2.0:
I have an array of Strings that i would like to convert to Double. Then i would like to add each item in the array together and get the sum.
let myStrings = ["2", "3","5.6", "4", "6"]
let doubles = myStrings.map { (s : String) -> Double in
if let d = Double(s){
return d
}
return 0.0
}
let sum = doubles.reduce(0.0, combine: {(sum: Double, item:Double) ->Double in
return sum + item
})
