Create array with non-integer increments - arrays

I am trying to create time stamp arrays in Swift.
So, say I want to go from 0 to 4 seconds, I can use Array(0...4), which gives [0, 1, 2, 3, 4]
But how can I get [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]?
Essentially I want a flexible delta, such as 0.5, 0.05, etc.

You can use stride(from:through:by:):
let a = Array(stride(from: 0.0, through: 4.0, by: 0.5))

An alternative for non-constant increments (even more viable in Swift 3.1)
The stride(from:through:by:) function covered in @Alexander's answer is the fit-for-purpose solution here. For readers of this Q&A who want to construct a sequence (or collection) with non-constant increments, where the linear-sequence-constructing stride(...) falls short, I'll include another alternative.
For such scenarios, sequence(first:next:) is a good choice: it constructs a lazy sequence that can be repeatedly queried for the next element.
E.g., constructing the first 5 ticks for a log10 scale (Double array):
let log10Seq = sequence(first: 1.0, next: { 10*$0 })
let arr = Array(log10Seq.prefix(5)) // [1.0, 10.0, 100.0, 1000.0, 10000.0]
Swift 3.1 is intended to be released in the spring of 2017, and with this (among lots of other things) comes the implementation of the following accepted Swift evolution proposal:
SE-0045: Add prefix(while:) and drop(while:) to the stdlib
prefix(while:) in combination with sequence(first:next:) provides a neat tool for generating sequences, covering everything from simple next closures (such as imitating the simple behaviour of stride(...)) to more advanced ones. The stride(...) example of this question is a good minimal (very simple) example of such usage:
/* this we can do already in Swift 3.0 */
let delta = 0.5
let seq = sequence(first: 0.0, next: { $0 + delta})
/* 'prefix(while:)' soon available in Swift 3.1 */
let arr = Array(seq.prefix(while: { $0 <= 4.0 }))
// [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
// ...
for elem in sequence(first: 0.0, next: { $0 + delta })
    .prefix(while: { $0 <= 4.0 }) {
    // ...
}
Again, this is not in competition with stride(...) for the simple case in this question, but it becomes very viable as soon as the useful but simple applications of stride(...) fall short, e.g. for constructing non-linear sequences.
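For readers more at home in Python, the same lazy "seed plus next function, cut off by a predicate" pattern can be sketched with a generator and itertools. The helper name iterate is made up for this illustration and is not a standard-library function; takewhile and islice play roughly the roles of prefix(while:) and prefix(_:).

```python
from itertools import islice, takewhile


def iterate(first, next_fn):
    """Lazily yield first, next_fn(first), next_fn(next_fn(first)), ... (hypothetical helper)."""
    value = first
    while True:
        yield value
        value = next_fn(value)


# Linear sequence, cut off by a predicate (mirrors prefix(while:)).
delta = 0.5
linear = list(takewhile(lambda x: x <= 4.0, iterate(0.0, lambda x: x + delta)))

# Non-linear sequence: first five ticks of a log10 scale (mirrors prefix(5)).
log_ticks = list(islice(iterate(1.0, lambda x: 10 * x), 5))

print(linear)
print(log_ticks)
```

Because each element is produced by repeated addition, a delta that is not exactly representable in binary (0.05, for instance) can accumulate rounding error, so a cut-off predicate such as x <= 4.0 may include or exclude the final value unexpectedly. This applies to any accumulate-and-compare approach, whatever the language.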

Related

How can I directly modify the weight values in the Julia library Flux?

In the Julia library Flux, we have the ability to take a neural network, let's call it network m and extract the weights of network m with the following code:
params(m)
This returns a Zygote.Params type of object, of the form:
Params([Float32[0.20391908 -0.101616435 0.09610984 -0.1013181 -0.13325627 -0.034813307 -0.13811183 0.27022845 ...]...)
If I wanted to alter each of the weights slightly, how would I be able to access them?
Edit:
As requested, here is the structure for m:
Chain(LSTM(8,10),Dense(10,1))
You can iterate on a Params object to access each set of parameters as an array, which you can manipulate in place.
Supposing you want to change every parameter by 1‰, you could do something like the following:
julia> using Flux
julia> m = Dense(10, 5, σ)
Dense(10, 5, σ)
julia> params(m)
Params([Float32[-0.026854342 -0.57200056 … 0.36827534 -0.39761665; -0.47952518 0.594778 … 0.32624483 0.29363066; … ; -0.22681071 -0.0059174187 … -0.59344876 -0.02679312; -0.4910349 0.60780525 … 0.114975974 0.036513895], Float32[0.0, 0.0, 0.0, 0.0, 0.0]])
julia> for p in params(m)
           p .*= 1.001
       end
julia> params(m)
Params([Float32[-0.026881196 -0.5725726 … 0.3686436 -0.39801428; -0.4800047 0.5953728 … 0.32657108 0.2939243; … ; -0.22703752 -0.0059233364 … -0.5940422 -0.026819913; -0.49152592 0.60841304 … 0.11509095 0.03655041], Float32[0.0, 0.0, 0.0, 0.0, 0.0]])

Efficiently sorting and filtering a JaggedArray by another one

I have a JaggedArray (awkward.array.jagged.JaggedArray) that contains indices that point to positions in another JaggedArray. Both arrays have the same length, but each of the numpy.ndarrays that the JaggedArrays contain can be of different length. I would like to sort the second array using the indices of the first array, at the same time dropping the elements from the second array that are not indexed from the first array. The first array can additionally contain values of -1 (these could also be replaced by None if needed, but that is currently not the case), which mean that there is no match in the second array. In such a case, the corresponding position in the result should be set to a default value (e.g. 0).
Here's a practical example and how I solve this at the moment:
import uproot
import numpy as np
import awkward
def good_index(my_indices, my_values):
    my_list = []
    for index in my_indices:
        if index > -1:
            my_list.append(my_values[index])
        else:
            my_list.append(0)
    return my_list
indices = awkward.fromiter([[0, -1], [3,1,-1], [-1,0,-1]])
values = awkward.fromiter([[1.1, 1.2, 1.3], [2.1,2.2,2.3,2.4], [3.1]])
new_map = awkward.fromiter(map(good_index, indices, values))
The resulting new_map is: [[1.1 0.0] [2.4 2.2 0.0] [0.0 3.1 0.0]].
Is there a more efficient/faster way of achieving this? I was thinking that one could use numpy functionality such as numpy.where, but due to the different lengths of the ndarrays this fails, at least for the ways that I tried.
If all of the subarrays in values are guaranteed to be non-empty (so that indexing with -1 returns the last subelement, not an error), then you can do this:
>>> almost = values[indices] # almost what you want; uses -1 as a real index
>>> almost.content = awkward.MaskedArray(indices.content < 0, almost.content)
>>> almost.fillna(0.0)
<JaggedArray [[1.1 0.0] [2.4 2.2 0.0] [0.0 3.1 0.0]] at 0x7fe54c713c88>
The last step is optional because without it, the missing elements are None, rather than 0.0.
If some of the subarrays in values are empty, you can pad them to ensure they have at least one subelement. All of the original subelements are indexed the same way they were before, since pad only increases the length, if need be.
>>> values = awkward.fromiter([[1.1, 1.2, 1.3], [], [2.1, 2.2, 2.3, 2.4], [], [3.1]])
>>> values.pad(1)
<JaggedArray [[1.1 1.2 1.3] [None] [2.1 2.2 2.3 2.4] [None] [3.1]] at 0x7fe54c713978>

Powershell Comparing Array Elements

I'm likely missing something simple here, so I apologize in advance. I am also aware that there is likely a better approach to this, so I'm open to that as well.
I'm trying to run a PowerShell script that will look at an array of values, comparing them to see the value of the difference between two elements of an array.
Below is a sample data set I'm using to test with that is imported into powershell from CSV:
1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.7, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.7, 2.9, 3.0
What I'm trying to accomplish is running through this list and comparing the second entry with the first, the third with the second, fourth with the third, etc, adding the element to $export ONLY if it has a value that is at least 0.2 greater than the previous element.
Here's what I've tried:
$import = get-content C:/pathtoCSVfile
$count = $import.Length-1;
$current=0;
Do
{
    $current=$current+1;
    $previous=$current-1
    if (($import[$current]-$import[$previous]) -ge 0.2)
    {
        $export=$export+$import[$current]+"`r`n";
    }
}
until ($current -eq $count)
Now I've run this with Trace on and it assigns values to $current and $previous and runs the subtraction of the two as described in the if condition on each loop through, but ONLY for the value of 2.7 ($import[14]-$import[13]) is it registering that the if condition has been met, thus leaving only a single value of 2.7 in $export. I expected other values (1.7, 1.9, and 2.9) to also be added to the $export variable.
Again, this is probably something stupid/obvious I'm overlooking, but I can't seem to figure it out. Thanks in advance for any insight you can offer.
The problem is that most decimal fractions (0.2, 1.7 and so on) have no exact representation in the implicitly used [double] data type, resulting in rounding errors that cause your -ge 0.2 comparison to yield unexpected results.
A simple example with [double] values, which are what PowerShell implicitly uses with number literals that have a decimal point:
PS> 2.7 - 2.5 -ge 0.2
True # OK, but only accidentally so, due to the specific input numbers.
PS> 1.7 - 1.5 -ge 0.2
False # !! Due to the inexact internally binary [double] representation.
If you force your calculations to use the [decimal] type instead, the problem goes away.
Applied to the above example (appending d to a number literal in PowerShell makes it a [decimal]):
PS> 1.7d - 1.5d -ge 0.2d
True # OK - Comparison is now exact, due to [decimal] values.
Applied in the context of a more PowerShell-idiomatic reformulation of your code:
# Sample input; note that floating-point number literals such as 1.0 default to [double]
# Similarly, performing arithmetic on *strings* that look like floating-point numbers
# defaults to [double], and Import-Csv always creates string properties.
$numbers = 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.7, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.7, 2.9, 3.0
# Collect those array elements that exceed their preceding element by at least 0.2
# in output *array* $exports.
$exports = foreach ($ndx in 1..($numbers.Count - 1)) {
    if ([decimal] $numbers[$ndx] - [decimal] $numbers[$ndx-1] -ge 0.2d) {
        $numbers[$ndx]
    }
}
# Output the result array.
# To create a multi-line string representation, use $exports -join "`r`n"
$exports
The above yields:
1.7
1.9
2.7
2.9
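The same effect is not specific to PowerShell. As a rough cross-check, here is a Python sketch that runs the identical filter once with binary floats and once with decimal.Decimal. The function name rising_by and the choice to start from strings (as the values would arrive from a CSV file) are assumptions made for this illustration.

```python
from decimal import Decimal

# Values kept as strings, the way they would arrive from a CSV file.
raw = ["1.0", "1.1", "1.2", "1.3", "1.4", "1.5", "1.7", "1.9", "2.0",
       "2.1", "2.2", "2.3", "2.4", "2.5", "2.7", "2.9", "3.0"]


def rising_by(values, threshold):
    """Return the elements that exceed their predecessor by at least threshold."""
    return [cur for prev, cur in zip(values, values[1:]) if cur - prev >= threshold]


# Binary floating point: subject to the rounding errors described above.
as_float = rising_by([float(v) for v in raw], 0.2)

# Exact decimal arithmetic: differences such as 1.7 - 1.5 are exactly 0.2.
as_decimal = rising_by([Decimal(v) for v in raw], Decimal("0.2"))

print(as_float)
print([str(d) for d in as_decimal])
```

On IEEE 754 doubles the float variant should reproduce the behaviour described in the question, while the Decimal variant returns all four expected elements. Constructing Decimal from strings matters here: Decimal(1.7) would inherit the binary rounding error of the float literal.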

Swift Accelerate for Mean & Standard Deviation

I am looking at Accelerate to compute mean and standard deviation of arrays in Swift.
I can do the mean. How do I do the standard deviation?
let rr: [Double] = [ 18.0, 21.0, 41.0, 42.0, 48.0, 50.0, 55.0, 90.0 ]
var mn: Double = 0.0
vDSP_meanvD(rr, 1, &mn, vDSP_Length(rr.count))
print(mn) // prints correct mean as 45.6250
// Standard Deviation should be 22.3155
You can compute the standard deviation from the mean value and the mean square value (compare https://en.wikipedia.org/wiki/Standard_deviation#Identities_and_mathematical_properties and https://en.wikipedia.org/wiki/Algebraic_formula_for_the_variance):
import Accelerate
let rr: [Double] = [ 18.0, 21.0, 41.0, 42.0, 48.0, 50.0, 55.0, 90.0 ]
var mn: Double = 0.0 // mean value
vDSP_meanvD(rr, 1, &mn, vDSP_Length(rr.count))
var ms: Double = 0.0 // mean square value
vDSP_measqvD(rr, 1, &ms, vDSP_Length(rr.count))
let sddev = sqrt(ms - mn * mn) * sqrt(Double(rr.count)/Double(rr.count - 1))
print(mn, sddev)
// 45.625 22.315513501982
Alternatively (for iOS 9.0 and later or macOS 10.11 and later), use vDSP_normalizeD:
var mn = 0.0
var sddev = 0.0
vDSP_normalizeD(rr, 1, nil, 1, &mn, &sddev, vDSP_Length(rr.count))
sddev *= sqrt(Double(rr.count)/Double(rr.count - 1))
print(mn, sddev)
// 45.625 22.315513501982
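To see why the first approach works without Accelerate at hand, the identity can be checked in plain Python. This is only a sketch of the arithmetic, not of the vDSP API: the variable names are made up, and the standard-library statistics module serves as an independent reference.

```python
import math
import statistics

rr = [18.0, 21.0, 41.0, 42.0, 48.0, 50.0, 55.0, 90.0]
n = len(rr)

mean = sum(rr) / n                        # the quantity vDSP_meanvD returns
mean_square = sum(x * x for x in rr) / n  # the quantity vDSP_measqvD returns

# Population standard deviation via E[x^2] - (E[x])^2.
population_sd = math.sqrt(mean_square - mean * mean)

# Rescale by sqrt(n / (n - 1)) to get the sample standard deviation.
sample_sd = population_sd * math.sqrt(n / (n - 1))

print(mean, sample_sd)
print(statistics.stdev(rr))  # independent reference for the sample value
```

One caveat with this formula: when the mean is large compared with the spread of the data, subtracting two nearly equal numbers can lose precision, so a two-pass computation may be preferable for such inputs.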
An add-on to @Martin R's answer: there is also a vDSP_normalize function for Float (single precision).
func vDSP_normalize(UnsafePointer<Float>, vDSP_Stride, UnsafeMutablePointer<Float>?, vDSP_Stride, UnsafeMutablePointer<Float>, UnsafeMutablePointer<Float>, vDSP_Length)
//Compute mean and standard deviation and then calculate new elements to have a zero mean and a unit standard deviation. Single precision.
func vDSP_normalizeD(UnsafePointer<Double>, vDSP_Stride, UnsafeMutablePointer<Double>?, vDSP_Stride, UnsafeMutablePointer<Double>, UnsafeMutablePointer<Double>, vDSP_Length)
//Compute mean and standard deviation and then calculate new elements to have a zero mean and a unit standard deviation. Double precision.

Easiest way to represent Euclidean Distance in scala

I am writing a data mining algorithm in Scala and I want to write the Euclidean Distance function for a given test and several train instances. I have an Array[Array[Double]] with test and train instances. I have a method which loops through each test instance against all training instances and calculates distances between the two (picking one test and train instance per iteration) and returns a Double.
Say, for example, I have the following data points:
testInstance = Array(Array(3.2, 2.1, 4.3, 2.8))
trainPoints = Array(Array(3.9, 4.1, 6.2, 7.3), Array(4.5, 6.1, 8.3, 3.8), Array(5.2, 4.6, 7.4, 9.8), Array(5.1, 7.1, 4.4, 6.9))
I have a method stub (highlighting the distance function) which returns neighbours around a given test instance:
def predictClass(testPoints: Array[Array[Double]], trainPoints: Array[Array[Double]], k: Int): Array[Double] = {
  for(testInstance <- testPoints)
  {
    for(trainInstance <- trainPoints)
    {
      for(i <- 0 to k)
      {
        distance = euclideanDistanceBetween(testInstance, trainInstance) //need help in defining this function
      }
    }
  }
  return distance
}
I know how to write a generic Euclidean Distance formula as:
math.sqrt(math.pow((x1 - y1), 2) + math.pow((x2 - y2), 2))
I have some pseudo steps as to what I want the method to do with a basic definition of the function:
def distanceBetween(testInstance: Array[Double], trainInstance: Array[Double]): Double = {
  // subtract each element of trainInstance with testInstance
  // for example,
  // iteration 1 will do [Array(3.9, 4.1, 6.2, 7.3) - Array(3.2, 2.1, 4.3, 2.8)]
  // i.e. sqrt((3.9-3.2)^2+(4.1-2.1)^2+(6.2-4.3)^2+(7.3-2.8)^2)
  // return result
  // iteration 2 will do [Array(4.5, 6.1, 8.3, 3.8) - Array(3.2, 2.1, 4.3, 2.8)]
  // i.e. sqrt((4.5-3.2)^2+(6.1-2.1)^2+(8.3-4.3)^2+(3.8-2.8)^2)
  // return result, and so on......
}
How can I write this in code?
So the formula you put in only works for two-dimensional vectors. You have four dimensions, but you should probably write your function so that it works for any number of dimensions: sum the squared differences over every coordinate, then take the square root of that sum.
So what you really want to say is:
for each position i:
    subtract the ith element of Y from the ith element of X
    square it
add all of those up
square root the whole thing
To make this more functional-programming style it will be more like:
square root the:
    sum of:
        zip X and Y into pairs
        for each pair, square the difference
So that would look like:
import math._
def distance(xs: Array[Double], ys: Array[Double]) = {
  sqrt((xs zip ys).map { case (x,y) => pow(y - x, 2) }.sum)
}
val testInstances = Array(Array(5.0, 4.8, 7.5, 10.0), Array(3.2, 2.1, 4.3, 2.8))
val trainPoints = Array(Array(3.9, 4.1, 6.2, 7.3), Array(4.5, 6.1, 8.3, 3.8), Array(5.2, 4.6, 7.4, 9.8), Array(5.1, 7.1, 4.4, 6.9))
distance(testInstances.head, trainPoints.head)
// 3.2680269276736382
As for predicting the class, you can make that more functional too, but it's unclear what the Double is that you are intending to return. It seems like you would want to predict the class for each test instance? Maybe choosing the class c corresponding to the nearest training point?
def findNearestClasses(testPoints: Array[Array[Double]], trainPoints: Array[Array[Double]]): Array[Int] = {
  testPoints.map { testInstance =>
    trainPoints.zipWithIndex.map { case (trainInstance, c) =>
      c -> distance(testInstance, trainInstance)
    }.minBy(_._2)._1
  }
}
findNearestClasses(testInstances, trainPoints)
// Array(2, 0)
Or maybe you want the k-nearest neighbors:
def findKNearestClasses(testPoints: Array[Array[Double]], trainPoints: Array[Array[Double]], k: Int): Array[Int] = {
  testPoints.map { testInstance =>
    val distances =
      trainPoints.zipWithIndex.map { case (trainInstance, c) =>
        c -> distance(testInstance, trainInstance)
      }
    val classes = distances.sortBy(_._2).take(k).map(_._1)
    val classCounts = classes.groupBy(identity).mapValues(_.size)
    classCounts.maxBy(_._2)._1
  }
}
findKNearestClasses(testInstances, trainPoints)
// Array(2, 1)
The formula for the Euclidean distance between two points (x1, y1) and (x2, y2) is as follows:
math.sqrt(math.pow((x1 - x2), 2) + math.pow((y1 - y2), 2))
You can only compare the x coordinate with the x, and y with the y.
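As a cross-check of the numbers quoted in this thread, here is a short Python sketch of the zip-based distance for points of any dimension, plus a nearest-training-point lookup. The function names and the length check are assumptions of this sketch, not part of the Scala answers.

```python
import math


def distance(xs, ys):
    """Euclidean distance between two equal-length coordinate sequences."""
    if len(xs) != len(ys):
        raise ValueError("points must have the same number of dimensions")
    # Pair up coordinates, square each difference, sum, then take the root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)))


def nearest_index(test_point, train_points):
    """Index of the training point closest to test_point."""
    return min(range(len(train_points)),
               key=lambda i: distance(test_point, train_points[i]))


test_instances = [[5.0, 4.8, 7.5, 10.0], [3.2, 2.1, 4.3, 2.8]]
train_points = [[3.9, 4.1, 6.2, 7.3], [4.5, 6.1, 8.3, 3.8],
                [5.2, 4.6, 7.4, 9.8], [5.1, 7.1, 4.4, 6.9]]

print(distance(test_instances[0], train_points[0]))
print([nearest_index(t, train_points) for t in test_instances])
```

The two-dimensional formula is just the special case where each point has exactly two coordinates. On Python 3.8 and later, math.dist(xs, ys) computes the same quantity directly.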
