F# negative indices in array - arrays

In my application I need to precompute and store trigonometric function values for a particular set of angles, ranging from -90 to 180 degrees.
I can create arrays (one each for sine, cosine, etc.) that store the value for the -90 angle at index 0, and add 90 to the angle when retrieving a value.
But is there another way in F# to specify the index range, so that I could use [-90 .. 180] directly
and have a more meaningful implementation?
As an alternative solution, would a dictionary be as fast as simple 2D arrays?

If I understand your problem correctly, you need to retrieve precomputed values by a key/index that is an angle going from -90 to 180. Something like this?
let value = precomputed.[-90]
You could use Map for that. F# maps are implemented as immutable AVL trees, an efficient data structure that forms a self-balancing binary tree. This can be very efficient if you have precomputed data and need to look it up by key frequently. Its immutability in this case ensures that the static data cannot be modified by mistake, and it has little impact on performance since you never need to mutate the map once it is initialized. However, if you need to modify it frequently, I would advise you to use a regular .NET Dictionary, because it is based on a hash table, which performs better than AVL trees.
You could turn a list into a map where the key is the angle and the value is the precomputed result:
let precomputedValues f =
    [for i in -90..180 ->
        i, f(i)]
    |> Map.ofList
Here f is the function doing the precomputation, so you obtain the precomputed map for every angle like this (note that sin takes radians, so the keys below are treated as radians; for degrees, convert with float e * System.Math.PI / 180.0 first):
let sinValues = precomputedValues (fun e -> sin (float e))
And you can access the precomputed sin value like this:
> sinValues.[-90];;
val it : float = -0.8939966636

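A little index arithmetic will be of use:
let inline idx i = (i + 271) % 271
Since it's inline, the overhead is going to be very small, and you can just use myArray.[idx -90]. The range -90..180 holds 271 values, so with a modulus of 271 the angles 0..180 map to indices 0..180 and -90..-1 wrap around to 181..270. (You may have to adjust the modulus for a different range, but you get the picture.)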
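As an illustration of the wrap-around indexing idea, here is a minimal sketch in Python rather than F#. It assumes a table of 271 slots for the angles -90..180 and assumes the angles are in degrees; the names LOW, HIGH, SIZE and table are made up for this example.

```python
import math

LOW, HIGH = -90, 180          # assumed angle range, in degrees
SIZE = HIGH - LOW + 1         # 271 distinct angles

def idx(angle):
    """Map an angle in LOW..HIGH to a slot in 0..SIZE-1 by wrapping negatives."""
    return (angle + SIZE) % SIZE

# Fill the table once; non-negative angles keep their own index,
# negative angles wrap around to the end of the table.
table = [0.0] * SIZE
for angle in range(LOW, HIGH + 1):
    table[idx(angle)] = math.sin(math.radians(angle))

sin_minus_90 = table[idx(-90)]
```

The modulus has to equal the number of distinct angles, otherwise two angles end up sharing a slot.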

The easiest way is to simply make a function which, given some i, returns a precomputed value of sin i:
let array_table =
    let a = Array.init 271 (fun i -> sin <| float (i-90))
    fun i -> a.[i+90]
To look up the sine of, say, 42, you simply call array_table 42.
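The same closure-over-an-array idea can be sketched in Python for comparison. This is only an illustration under a few assumptions: make_table is a hypothetical helper, the angles are taken to be degrees (hence the explicit conversion), and a bounds check is added that the F# version does not have.

```python
import math

def make_table(f, low, high):
    """Precompute f for every integer in low..high and return a lookup function."""
    values = [f(angle) for angle in range(low, high + 1)]

    def lookup(angle):
        # Shift the angle so that `low` lands on index 0.
        if not low <= angle <= high:
            raise IndexError(f"angle {angle} outside {low}..{high}")
        return values[angle - low]

    return lookup

# Assumption: angles are in degrees, so convert before calling sin.
sin_table = make_table(lambda deg: math.sin(math.radians(deg)), -90, 180)
```

The caller only ever sees angles; the offset stays hidden inside the closure, for example sin_table(42).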
anushri and Tomasz both mention using Maps instead of Arrays, but in my experience Maps are not good candidates for storing precomputed values, as they are much slower than Arrays. Let's try:
let map_table =
    let m = Seq.init 271 (fun i -> i-90, sin <| float (i-90)) |> Map.ofSeq
    fun i -> Map.find i m
let no_table =
    fun i -> sin (float i)
// Benchmarking code omitted (100000 lookups of each value in -90..180)
When I run this, array_table is roughly 8 times faster than no_table and 22 times faster than map_table:
> fsharpi --optimize+ memo.fsx
map_table
Real: 00:00:02.922, CPU: 00:00:02.924, GC gen0: 4, gen1: 0
no_table
Real: 00:00:01.091, CPU: 00:00:01.091, GC gen0: 3, gen1: 0
array_table
Real: 00:00:00.130, CPU: 00:00:00.130, GC gen0: 3, gen1: 0

Related

Creating a logarithmic spaced array in IDL

I was looking for a way to generate a logarithmic spaced array in IDL.
From the L3 Harris Geospatial website I came across "arrgen" and was trying to use it for this purpose. However,
arrgen(1,215,/log)
returns the error: Variable is undefined: ARRGEN.
What would be the correct way to do it?
Thanks in advance for your help
Start by defining your lower and upper bounds in whichever log base you prefer. I will use base e for brevity's sake.
lowe = ALOG(low[0])
uppe = ALOG(upp[0])
where low and upp are scalar, numerical values you, the user, define (e.g., 1 and 215 in your example). Then construct an evenly spaced array of n elements, such as:
dinde = DINDGEN(n[0])*(uppe[0] - lowe[0])/(n[0] - 1L) + lowe[0]
where n is a scalar integer. Now convert back to linear space to get:
dind = EXP(dinde)
This will be a logarithmically spaced array. If you want to use base-10 logs, substitute ALOG10 for ALOG and 10d^dinde for EXP(dinde). If you need another base, you can use the logarithmic change-of-base rule:
log_b(x) = log_c(x) / log_c(b)
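For readers who want to check the recipe outside IDL, here is a minimal Python sketch of the same three steps (take logs of the bounds, space evenly, exponentiate). The function name log_spaced is made up for this example; low, upp and n mirror the variables above.

```python
import math

def log_spaced(low, upp, n):
    """Return n values from low to upp that are evenly spaced in log space."""
    if n < 2:
        raise ValueError("need at least two points")
    if low <= 0 or upp <= 0:
        raise ValueError("bounds must be positive")
    lowe = math.log(low)              # lower bound in log space
    uppe = math.log(upp)              # upper bound in log space
    step = (uppe - lowe) / (n - 1)    # even spacing in log space
    return [math.exp(lowe + i * step) for i in range(n)]

values = log_spaced(1.0, 215.0, 5)
```

Consecutive elements of the result differ by a constant factor, which is what "logarithmically spaced" means here.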

Ranking Function in F#

I wrote an algorithm for ranking an array.
let rankFun array =
    let arrayNew = List.toArray array
    let arrayLength = Array.length arrayNew
    let rankedArray = Array.create arrayLength 1.0
    for i in 0 .. arrayLength - 2 do
        for j in (i+1) .. arrayLength - 1 do
            if arrayNew.[i] > arrayNew.[j] then
                rankedArray.[i] <- rankedArray.[i] + 1.0
            elif arrayNew.[i] < arrayNew.[j] then
                rankedArray.[j] <- rankedArray.[j] + 1.0
            else
                rankedArray.[i] <- rankedArray.[i] + 0.0
    rankedArray
What do you think about its performance? I used for loops and was wondering whether there is a better way. Before getting to this, I was sorting my array while keeping the original indexes, ranking, and only afterwards moving each rank back to its original position, which was reeeeeally bad in terms of performance. Now I have this improved version and am looking for some feedback. Any ideas?
Edit: Duplicated elements should have same rank. ;)
Thank you very much in advance. :)
I'm assuming that ranks can be taken from sorting the inputs, since you commented that the question's behavior on duplicates is a bug. It's surprising that the solution with sorting you described ran slower than the code shown in the question. It should be a lot faster.
A simple way to solve this via sorting is to build an ordered set from all values. Sets in F# are always ordered and contain no duplicates, so they can be used to create the ranking.
From the set, create a map from each value to its index in the set (plus one, to keep the ranking that starts with 1). With this, you can look up the rank of each value to fill the output array.
let getRankings array =
    let rankTable =
        Set.ofArray array
        |> Seq.mapi (fun i n -> n, i + 1)
        |> Map.ofSeq
    array |> Array.map (fun n -> rankTable.[n])
This takes an array, rather than a list, because the input parameter in the question was called array. It also uses integers for the ranks, as this is the normal data type for this purpose.
This is much faster than the original algorithm, since all operations are at most O(n*log(n)), while the nested for-loops in the question are O(n^2). (See also: Wikipedia on Big O notation.) For only 10000 elements, the sorting-based solution already runs over 100 times faster on my computer.
(BTW, the statement else rankedArray.[i] <- rankedArray.[i] + 0.0 appears to do nothing. Unless you're doing some sort of black magic with the optimizer, you can just remove it.)
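The set-then-lookup approach described above translates almost line for line into other languages. Here is a small Python sketch of it, assuming duplicates should share a rank and that ranks start at 1; get_rankings is just a name chosen for this example.

```python
def get_rankings(values):
    """Rank each value by its position among the sorted distinct values (1-based)."""
    # sorted(set(...)) plays the role of the ordered F# Set.
    rank_table = {v: i + 1 for i, v in enumerate(sorted(set(values)))}
    # One dictionary lookup per input element.
    return [rank_table[v] for v in values]

ranks = get_rankings([30, 10, 20, 10])
# ranks == [3, 1, 2, 1]
```

The sort dominates the cost at O(n*log(n)), and the two equal inputs receive the same rank.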

Build Dictionary of Arrays Efficiently in julia

I want to save the (x,y) coordinates in a grid network that are visited by different individuals. Let's say I have 1000 individuals and the network size is x = 1:100 and y = 1:100. I am using Dict(), and here is sample code for what I want to do:
individuals = 1:1000
x = 1:100
y = 1:100
function Visited_nodes()
    nodes_of_inds = Dict{Int64, Array{Tuple{Int64, Int64}}}()
    for ind in individuals
        dum_array = Array{Tuple{Int64, Int64}}(0)
        for i in x
            for j in y
                if rand() < 0.2 # some conditions
                    push!(dum_array, (i,j))
                end
            end
        end
        nodes_of_inds[ind] = unique(dum_array)
    end
    return nodes_of_inds
end
@time nodes_of_inds = Visited_nodes()
# result: 1.742297 seconds (12.31 M allocations: 607.035 MB, 6.72% gc time)
But this is not efficient. I appreciate any advice how to make it more efficient.
Please see the performance tips. Very first piece of advice there: avoid global variables. individuals, x, and y are all non-constant global variables. Make them arguments to your function instead. That change alone speeds up your function by an order of magnitude.
By construction, you're not going to have any duplicate tuples in your dum_array, so you don't need to call unique. That shaves off another factor of two.
Finally, Array{T} isn't a concrete type. Julia's arrays also encode the dimensionality as a type parameter, which must be included for the dictionary of arrays to be efficient. Use Array{T, 1} or Vector{T} instead. This isn't a major consideration within the time of this function, though.
The major thing that's left is just the O(length(individuals)*length(x)*length(y)) computational complexity. Doing anything ten million times will add up quickly, no matter how efficient it is.
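To make the first two suggestions concrete, here is the shape of the restructured function, sketched in Python purely to show the structure: inputs arrive as arguments instead of globals, and the deduplication step is dropped because each (i, j) pair is visited at most once per individual. The parameter names p and rng are assumptions added for this example, and the speed-ups quoted above apply to Julia, not to this sketch.

```python
import random

def visited_nodes(individuals, xs, ys, p=0.2, rng=None):
    """Map each individual to the list of (x, y) grid nodes it visited."""
    rng = rng or random.Random()
    nodes_of_inds = {}
    for ind in individuals:
        # Each (i, j) is generated once, so no duplicates can occur
        # and no unique() pass is needed afterwards.
        nodes_of_inds[ind] = [
            (i, j) for i in xs for j in ys if rng.random() < p
        ]
    return nodes_of_inds

# Small grid and a fixed seed, just for demonstration.
result = visited_nodes(range(1, 11), range(1, 21), range(1, 21),
                       rng=random.Random(0))
```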
@Matt B., thanks for your response. About the global variables: I tried a simplified version of my code and it did not help the performance.
Let's say I read my input data from a couple of csv files and I have three functions with different arguments:
function Read_input_data()
    # read input data
    individuals = readcsv("file1")
    x = readcsv("file2")
    y = readcsv("file3")
    A = readcsv("file4")
    B = readcsv("file5") # and a few other files
    # call different functions
    result_1 = Function1(individuals, x, y)
    result_2 = Function2(result_1, y, A, B)
    result_3 = Function3(result_2, individuals, A, B)
    return result_1, result_2, result_3
end
result_1, result_2, result_3 = Read_input_data()
I do not know why the performance is no better than when I define everything as global! I would appreciate any comments on this.

Non-monolithic arrays in Haskell

I have accepted an answer to the question below, but it seems I misunderstood how arrays in Haskell work. I thought they were just beefed-up lists. Keep that in mind when reading the question below.
I've found that monolithic arrays in Haskell are quite inefficient when used for larger arrays.
I haven't been able to find a non-monolithic implementation of arrays in Haskell. What I need is O(1) time lookup on a multidimensional array.
Is there an implementation of arrays that supports this?
EDIT: I seem to have misunderstood the term monolithic. The problem is that arrays in Haskell seem to treat an array like a list. I might be wrong though.
EDIT2: Short example of inefficient code:
fibArray n = a where
    bnds = (0,n)
    a = array bnds [ (i, f i) | i <- range bnds ]
    f 0 = 0
    f 1 = 1
    f i = a!(i-1) + a!(i-2)
This is an array of length n+1 where the i'th field holds the i'th Fibonacci number. But since arrays in Haskell have O(n) time lookup, it takes O(n²) time to compute.
You're confusing linked lists in Haskell with arrays.
Linked lists are the data types that use the following syntax:
[1,2,3,5]
defined as:
data [a] = [] | a : [a]
These are classical recursive data types, supporting O(n) indexing and O(1) prepend.
If you're looking for multidimensional data with O(1) lookup, instead you should use a true array or matrix data structure. Good candidates are:
Repa - fast, parallel, multidimensional arrays -- (Tutorial)
Vector - An efficient implementation of Int-indexed arrays (both mutable and immutable), with a powerful loop optimisation framework. (Tutorial)
HMatrix - Purely functional interface to basic linear algebra and other numerical computations, internally implemented using GSL, BLAS and LAPACK.
Arrays have O(1) indexing. The problem is that each element is calculated lazily. So this is what happens when you run this in ghci:
*Main> :set +s
*Main> let t = 100000
(0.00 secs, 556576 bytes)
*Main> let a = fibArray t
Loading package array-0.4.0.0 ... linking ... done.
(0.01 secs, 1033640 bytes)
*Main> a!t -- result omitted
(1.51 secs, 570473504 bytes)
*Main> a!t -- result omitted
(0.17 secs, 17954296 bytes)
*Main>
Note that lookup is very fast once an element has already been looked up. The array function creates an array of pointers to thunks that will eventually be evaluated to produce a value. The first time you evaluate a value, you pay this cost. Here are the first few expansions of the thunk for evaluating a!t:
a!t -> a!(t-1)+a!(t-2)-> a!(t-2)+a!(t-3)+a!(t-2) -> a!(t-3)+a!(t-4)+a!(t-3)+a!(t-2)
It's not the cost of the calculations per se that's expensive, rather it's the need to create and traverse this very large thunk.
I tried strictifying the values in the list passed to array, but that seemed to result in an endless loop.
One common way around this is to use a mutable array, such as an STArray. The elements can be updated as they're available during the array creation, and the end result is frozen and returned. In the vector package, the create and constructN functions provide easy ways to do this.
-- constructN :: Unbox a => Int -> (Vector a -> a) -> Vector a
import qualified Data.Vector.Unboxed as V
import Data.Int
fibVec :: Int -> V.Vector Int64
fibVec n = V.constructN (n+1) c
  where
    c v | V.length v == 0 = 0
    c v | V.length v == 1 = 1
    c v | V.length v == 2 = 1
    c v = let len = V.length v
          in v V.! (len-1) + v V.! (len-2)
BUT, the fibVec function only works with unboxed vectors. Regular vectors (and arrays) aren't strict enough, leading back to the same problem you've already found. And unfortunately there isn't an Unbox instance for Integer, so if you need unbounded integer types (this fibVec has already overflowed in this test) you're stuck with creating a mutable array in IO or ST to enable the necessary strictness.
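The idea behind constructN, computing each element strictly from the prefix that already exists, can be shown outside Haskell as well. The following Python sketch is only an illustration of that bottom-up fill and not a translation of the vector API; fib_table is a name made up for this example.

```python
def fib_table(n):
    """Build [F(0), ..., F(n)] front to back, each entry from the finished prefix."""
    table = []
    for i in range(n + 1):
        if i < 2:
            table.append(i)                            # F(0) = 0, F(1) = 1
        else:
            table.append(table[i - 1] + table[i - 2])  # both already computed
    return table

fibs = fib_table(50)
# fibs[10] == 55
```

Because every entry is a finished value by the time it is read, no chain of deferred computations builds up, which is the property the mutable-array and unboxed-vector approaches are after.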
Referring specifically to your fibArray example, try this and see if it speeds things up a bit:
-- gradually calculate m-th item in steps of k
-- to prevent STACK OVERFLOW , etc
gradualth m k arr
    | m <= v = pre `seq` arr!m
  where
    pre = foldl1 (\a b-> a `seq` arr!b) [u,u+k..m]
    (u,v) = bounds arr
For me, with let a = fibArray 50000, gradualth 50000 10 a ran in 0.65 of the run time of just calling a!50000 right away.

Growing arrays in Haskell

I have the following (imperative) algorithm that I want to implement in Haskell:
Given a sequence of pairs [(e0,s0), (e1,s1), (e2,s2),...,(en,sn)], where both the "e" and "s" parts are natural numbers, not necessarily distinct, at each time step one element of this sequence is randomly selected, let's say (ei,si), and based on the values of (ei,si), a new element is built and added to the sequence.
How can I implement this efficiently in Haskell? The need for random access would make it bad for lists, while the need for appending one element at a time would make it bad for arrays, as far as I know.
Thanks in advance.
I suggest using either Data.Set or Data.Sequence, depending on what you're needing it for. The latter in particular provides you with logarithmic index lookup (as opposed to linear for lists) and O(1) appending on either end.
"while the need for appending one element at a time would make it bad for arrays" Algorithmically, it seems like you want a dynamic array (aka vector, array list, etc.), which has amortized O(1) time to append an element. I don't know of a Haskell implementation of it off-hand, and it is not a very "functional" data structure, but it is definitely possible to implement it in Haskell in some kind of state monad.
If you know approximately how many elements you will need in total, then you can create an array of that size which is "sparse" at first and put elements into it as needed.
Something like the following can be used to represent this new array:
data MyArray = MyArray (Array Int Int) Int
(where the last Int represents how many elements of the array are in use)
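For reference, the imperative algorithm from the question is short when a dynamic array with amortized O(1) append is available, as mentioned above. Here is a Python sketch (Python lists are dynamic arrays). The rule for building the new element is not given in the question, so make_new is a hypothetical placeholder, and the lambda passed to it below is arbitrary.

```python
import random

def grow(pairs, steps, make_new, rng):
    """Repeatedly pick a random (e, s) pair and append a new pair derived from it."""
    pairs = list(pairs)                            # work on a copy
    for _ in range(steps):
        e, s = pairs[rng.randrange(len(pairs))]    # O(1) random access
        pairs.append(make_new(e, s))               # amortized O(1) append
    return pairs

# make_new is a stand-in: the real rule depends on the application.
grown = grow([(1, 2), (3, 4)], 5, lambda e, s: (e + 1, s + 1),
             random.Random(42))
```

Unlike the Set-based approach, a sequence like this keeps duplicate pairs, so a pair that appears several times is proportionally more likely to be selected.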
If you really need stop-and-start resizing, you could think about using the simple-rope package along with a StringLike instance for something like Vector. In particular, this might accommodate scenarios where you start out with a large array and are interested in relatively small additions.
That said, adding individual elements into the chunks of the rope may still induce a lot of copying. You will need to try out your specific case, but you should be prepared to use a mutable vector as you may not need pure intermediate results.
If you can build your array in one shot and just need the indexing behavior you describe, something like the following may suffice,
import Data.Array.IArray
test :: Array Int (Int,Int)
test = accumArray (flip const) (0,0) (0,20) [(i, f i) | i <- [0..19]]
  where f 0 = (1,0)
        f i = let (e,s) = test ! (i `div` 2) in (e*2,s+1)
Taking a note from ivanm, I think Sets are the way to go for this.
import Data.Set as Set
import System.Random (RandomGen, getStdGen)
startSet :: Set (Int, Int)
startSet = Set.fromList [(1,2), (3,4)] -- etc. Whatever the initial set is
-- grow the set by randomly producing "n" elements.
growSet :: (RandomGen g) => g -> Set (Int, Int) -> Int -> (Set (Int, Int), g)
growSet g s n | n <= 0 = (s, g)
              | otherwise = growSet g'' s' (n-1)
  where s' = Set.insert (x,y) s
        ((x,_), g') = randElem s g
        ((_,y), g'') = randElem s g'
randElem :: (RandomGen g) => Set a -> g -> (a, g)
randElem = undefined
main = do
    g <- getStdGen
    let (grownSet,_) = growSet g startSet 2
    print $ grownSet -- or whatever you want to do with it
This assumes that randElem is an efficient, definable method for selecting a random element from a Set. (I asked this SO question regarding efficient implementations of such a method). One thing I realized upon writing up this implementation is that it may not suit your needs, since Sets cannot contain duplicate elements, and my algorithm has no way to give extra weight to pairings that appear multiple times in the list.

Resources