Why is my recursion faster than Array.exists?

I am pretty new to F# and am trying to understand how to write fast code in it. To that end, I wrote two functions (IsPrime1 and IsPrime2) for benchmarking. My code is:
// Learn more about F# at http://fsharp.net
open System
open System.Diagnostics
#light
let isDivisible n d = n % d = 0
let IsPrime1 n =
    Array.init (n-2) ((+) 2) |> Array.exists (isDivisible n) |> not
let rec hasDivisor n d =
    match d with
    | x when x < n -> (n % x = 0) || (hasDivisor n (d+1))
    | _ -> false
let IsPrime2 n =
    hasDivisor n 2 |> not
let SumOfPrimes max =
    [|2..max|] |> Array.filter IsPrime1 |> Array.sum
let maxVal = 20000
let s = new Stopwatch()
s.Start()
let valOfSum = SumOfPrimes maxVal
s.Stop()
Console.WriteLine valOfSum
Console.WriteLine("IsPrime1: {0}", s.ElapsedMilliseconds)
//////////////////////////////////
s.Reset()
s.Start()
let SumOfPrimes2 max =
    [|2..max|] |> Array.filter IsPrime2 |> Array.sum
let valOfSum2 = SumOfPrimes2 maxVal
s.Stop()
Console.WriteLine valOfSum2
Console.WriteLine("IsPrime2: {0}", s.ElapsedMilliseconds)
Console.ReadKey()
IsPrime1 takes 760 ms while IsPrime2 takes 260 ms for the same result.
What's going on here, and how can I make my code even faster?

IsPrime2 does not construct a large array, so it avoids allocating, traversing, and garbage-collecting one. Remember that SumOfPrimes calls IsPrime1/IsPrime2 max-1 times, so IsPrime1 builds that many arrays. Avoiding explicit intermediate data structures is a useful optimization technique.
Here are some small optimizations which could be done on your code.
1) To check for divisors in hasDivisor, you only have to check up to sqrt(n), and you can skip all even numbers. If no divisor is found, the number is prime.
let rec hasDivisor2 n d =
    match d with
    | x when x <= int(sqrt(float n)) -> (n % x = 0) || (hasDivisor2 n (d+2))
    | _ -> false
let IsPrime3 n =
    n = 2 || (n%2 <> 0 && not (hasDivisor2 n 3))
2) For SumOfPrimes, you could eliminate the intermediate array and also skip all even numbers (they couldn't be prime anyway).
let sumOfPrimes isPrime max =
    [|2..max|] |> Array.filter isPrime |> Array.sum
let sumOfPrimes2 isPrime max =
    let mutable sum = 2L
    for i in 3..2..max do
        if isPrime i then
            sum <- sum + int64 i
    sum
3) I made a small change so that isPrime is passed as an argument. This makes it easier to measure your code:
let time fn =
    let sw = new System.Diagnostics.Stopwatch()
    sw.Start()
    let f = fn()
    sw.Stop()
    printfn "Time taken: %.2f s" <| (float sw.ElapsedMilliseconds)/1000.0
    f
let maxVal = 200000
let p2 = time (fun () -> sumOfPrimes IsPrime2 maxVal)
let p3 = time (fun () -> sumOfPrimes2 IsPrime3 maxVal)
The new sumOfPrimes2 function with IsPrime3 is blazingly fast. It took 0.05 seconds on my machine for maxVal = 200000 while the original version took 7.45 seconds.
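The two optimizations above are independent of F#. As a minimal sketch, here is the same idea in Python (chosen only as a neutral language for illustration): test odd divisors up to sqrt(n) without building an array, then sum in a plain loop. The names is_prime and sum_of_primes are hypothetical, not part of the original code.

```python
import math


def is_prime(n):
    """Trial division by odd candidates up to sqrt(n); no array is allocated."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    limit = math.isqrt(n)  # largest d that needs checking
    d = 3
    while d <= limit:
        if n % d == 0:
            return False
        d += 2  # skip even divisors
    return True


def sum_of_primes(max_value):
    """Sum of all primes <= max_value, visiting only 2 and the odd numbers."""
    total = 2 if max_value >= 2 else 0
    for i in range(3, max_value + 1, 2):
        if is_prime(i):
            total += i
    return total
```

Unlike IsPrime3, this sketch also rejects 1, which only matters if the function is called outside the summing loop.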

The reason for the speed difference is that the slow code does something like this:
if n%a.[0] = 0 || n%a.[1] = 0 ...
whereas the fast code does:
if n%2 = 0 || n%(2+1) = 0 ...
In the fast case we don't need to go to memory to get the next factor. We also avoid having to build the array.
Here is my generic very fast F# code to build up a table of primes (from this answer: https://stackoverflow.com/a/12014908/124259):
#time "on"
let limit = 1000000
//returns an array of all the primes up to limit
let table =
    let table = Array.create limit true //use bools in the table to save on memory
    let tlimit = int (sqrt (float limit)) //max test no for table, ints should be fine
    let mutable curfactor = 1
    while curfactor < tlimit-2 do
        curfactor <- curfactor+2
        if table.[curfactor] then //simple optimisation
            let mutable v = curfactor*2
            while v < limit do
                table.[v] <- false
                v <- v + curfactor
    let out = Array.create (100000) 0 //this needs to be greater than pi(limit)
    let mutable idx = 1
    out.[0]<-2
    let mutable curx=1
    while curx < limit-2 do
        curx <- curx + 2
        if table.[curx] then
            out.[idx]<-curx
            idx <- idx+1
    out
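For readers who want the sieve in compact form, here is a rough Python sketch of the same technique (a sieve of Eratosthenes that only visits odd numbers). It is not a translation of the F# above: the function name primes_up_to is hypothetical, marking starts at factor*factor and steps by 2*factor as a variation of mine, and the result is a list of exactly the primes found rather than a preallocated array.

```python
def primes_up_to(limit):
    """Return all primes strictly below `limit` using an odd-only sieve."""
    if limit <= 2:
        return []
    is_candidate = [True] * limit  # one flag per number, like the bool table
    factor = 3
    while factor * factor < limit:
        if is_candidate[factor]:
            # Smaller multiples were already crossed out by smaller factors;
            # stepping by 2 * factor skips the even multiples.
            for multiple in range(factor * factor, limit, 2 * factor):
                is_candidate[multiple] = False
        factor += 2
    return [2] + [n for n in range(3, limit, 2) if is_candidate[n]]
```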

Related

Random Number between 1 and 0 in matrix ocaml

I am trying to build a matrix in OCaml consisting only of 1 and 0. My current code is
let myArray = Array.make_matrix num num2 (Random.int 2) in
print myArray
However, this fills the entire array with only 0 or only 1, not a combination. Is there a way for me to do what I want?
Yes, Array.make_matrix fills the matrix with a single value that you supply. Random.int 2 is evaluated once, before the call, so every cell receives the same number.
You can write this:
let myArray =
  Array.init num
    (fun _ -> Array.init num2 (fun _ -> Random.int 2))
Here's how it looks:
# let num = 2;;
val num : int = 2
# let num2 = 3;;
val num2 : int = 3
# let myArray =
    Array.init num
      (fun _ -> Array.init num2 (fun _ -> Random.int 2));;
val myArray : int array array = [|[|0; 1; 0|]; [|1; 1; 0|]|]
#
Update
If you want different numbers each time, you need to initialize the random number generator with a different seed each time. An easy way to do this is:
Random.self_init ()
You can call this once at program startup, and you'll get different random numbers in each run of the program.
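The difference between the two calls comes down to when the random number is drawn. A small Python sketch of the same distinction, with hypothetical helpers make_matrix and init_matrix standing in for Array.make_matrix and the nested Array.init:

```python
import random


def make_matrix(rows, cols, value):
    """Mimics Array.make_matrix: `value` was already computed by the caller."""
    return [[value] * cols for _ in range(rows)]


def init_matrix(rows, cols, cell):
    """Mimics nested Array.init: `cell` is called once for every element."""
    return [[cell(r, c) for c in range(cols)] for r in range(rows)]


random.seed(0)  # fixed seed only so that the sketch is repeatable
same = make_matrix(2, 3, random.randrange(2))                # one draw in total
mixed = init_matrix(2, 3, lambda r, c: random.randrange(2))  # one draw per cell
```

In `same` every cell holds the single number that was drawn before the call, while `mixed` gets a fresh draw for each cell.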

F# Finding the Missing element between 2 arrays/lists

So, I am very new to F#, and I hope the issue is simple. I have been doing research and looking around. I get an "Incomplete structured construct at or before this point in expression" error. I feel like it might be something simple, or I am way off.
The objective is:
There is an array of non-negative integers. A second array is formed by shuffling the elements of the first array and deleting a random element. Given these two arrays, find which element is missing in the second array. Linear searching is not allowed.
let FindMiss list =
match list with
| [] ->
[]
|firstElem::otherElements ->
let rand = new Random
let shuffle (arr : 'a array) =
let array = Array.copy arr
let n = array.Length
for x in 1..n do
let i = n-x
let j = rand.Next(i+1)
let tmp = array.[i]
array.[i] <- array.[j]
array.[j] <- tmp
array
return array
array.[rand].delete
|array::list ->
let d=collections.defaultdict(int)
for num in list do
d[num] +=1
for num in array1 do
if d[num]==0 then return num
else d[num]-=1
printfn "The missing Number is: %A" (FindMiss[4;2;1;7;5;6;3;2])
Is the task to reimplement List.except?
If not, just use List.except:
[1;2;3] |> List.except [1;2]
Or was the task "random removal of element in list"? Then this is "the answer": https://stackoverflow.com/a/2889972/5514938
For a start, to improve readability, you can move the shuffling and the deletion of a random element into separate functions. For arrays, they may look like this:
let Shuffle arr =
    let rand = System.Random()
    arr |> Array.sortBy(fun _ -> rand.Next())
let RemoveRandom arr =
    let rand = System.Random()
    let lng = arr |> Array.length
    let index = rand.Next lng
    [|0..lng - 1 |]
    |> Array.choose(fun x -> if x = index then None else Some(arr.[x]))
    |> Shuffle
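Then define the search function. Because the second array holds the same elements minus one, the difference of the two sums is the missing element:
let FindMiss arr1 arr2 =
    let sum1 = arr1 |> Array.sum
    let sum2 = arr2 |> Array.sum
    sum1 - sum2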
Example:
let first = [| 4;2;1;7;5;6;3;2 |]
first |> printfn "%A"
let second = first |> RemoveRandom
second |> printfn "%A"
FindMiss first second |> printfn "Missing value is %i"
Print:
[|4; 2; 1; 7; 5; 6; 3; 2|]
[|2; 2; 3; 7; 1; 5; 6|]
Missing value is 4
Link:
https://dotnetfiddle.net/g6wKUX
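The sum-difference approach from this answer does not depend on the order of either array, which is why no searching is needed. A minimal Python sketch of the same two steps, with hypothetical names remove_random and find_missing:

```python
import random


def remove_random(items, rng):
    """Return a shuffled copy of `items` with one random element dropped."""
    remaining = list(items)
    del remaining[rng.randrange(len(remaining))]
    rng.shuffle(remaining)
    return remaining


def find_missing(first, second):
    """The dropped element is the difference between the two sums."""
    return sum(first) - sum(second)


rng = random.Random(42)  # seeded only so that the sketch is repeatable
first = [4, 2, 1, 7, 5, 6, 3, 2]
second = remove_random(first, rng)
missing = find_missing(first, second)
```

Putting `missing` back into `second` yields the same multiset as `first`, including when the dropped value is one of the duplicated 2s.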

How to Generate A Specific Number of Random Indices for Array Element Removal F#

So sCount is the number of elements in the source array, iCount is the number of elements I want to remove.
let indices = Array.init iCount (fun _ -> rng.Next sCount) |> Seq.distinct |> Seq.toArray |> Array.sort
The problem with the method above is that I need to specifically remove iCount indices, and this doesn't guarantee that.
I've tried stuff like
while indices.Count < iCount do
    let x = rng.Next sCount
    if not (indices.Contains x) then
        indices <- indices.Add x
And a few other similar things...
Every way I've tried has been extremely slow though - I'm dealing with source arrays of sizes up to 20 million elements.
What you're doing should be fine if you need a set of indices of negligible size compared to the array. Otherwise, consider a variation of the Knuth-Fisher-Yates shuffle to get the first i elements of a random permutation of 0 .. n-1:
let rnd = System.Random() // shared generator; the original snippet used rnd without defining it
let rndSubset i n =
    let arr = Array.zeroCreate i
    arr.[0] <- 0
    for j in 1 .. n-1 do
        let ind = rnd.Next(j+1)
        if j < i then arr.[j] <- arr.[ind]
        if ind < i then arr.[ind] <- j
    arr
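To make the loop easier to follow, here is a hedged Python rendering of the same truncated "inside-out" shuffle. It builds a random permutation of 0 .. n-1 one element at a time but stores only the first `count` slots, so memory stays proportional to the number of indices requested. The name random_subset and the explicit rng parameter are illustrative choices, not part of the original answer.

```python
import random


def random_subset(count, n, rng):
    """Pick `count` distinct integers from range(n) in a single pass."""
    if not 0 <= count <= n:
        raise ValueError("count must be between 0 and n")
    chosen = [0] * count  # slot 0 starts out holding element 0
    for j in range(1, n):
        ind = rng.randrange(j + 1)  # uniform position in 0..j for element j
        if j < count:
            chosen[j] = chosen[ind]  # displaced element moves to slot j
        if ind < count:
            chosen[ind] = j          # element j lands in a kept slot
    return chosen
```

Each call costs n random draws regardless of `count`, so this suits the case where the number of indices to remove is a sizeable fraction of the array.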
I won't give you F# code for this (because I don't know F#...), but I'll describe the approach/algorithm that you should use.
Basically, what you want to do is pick n random elements from a given list, called list here. This can be done in pseudocode:
chosen = []
n times:
    index = rng.upto(list.length)
    elem = list.at(index)
    list.remove-at(index)
    chosen.add(elem)
Your list variable should be populated with all possible indices in the source list, and then when you pick n random values from that list of indices, you have random, distinct indices that you can do whatever you want with, including printing values, removing values, knocking yourself out with values, etc...
Is iCount closer to the size of the array or closer to 0? That determines which algorithm you should use.
If it is closer to 0, keep track of the previously generated numbers and check whether each new number has already been generated.
If it is closer to the size of the array, use the method described by @feralin.
let getRandomNumbers =
    let rand = Random()
    fun max count ->
        Seq.initInfinite (fun _ -> rand.Next(max))
        |> Seq.distinct
        |> Seq.take count
let indices = Array.init 100 id
let numToRemove = 10
let indicesToRemove = getRandomNumbers indices.Length numToRemove |> Seq.toList // rand.Next(max) excludes max, so pass the full length
> val indicesToRemove : int list = [32; 38; 26; 51; 91; 43; 92; 94; 18; 35]

What is the immutable version to de/reference array?

How can I reference/dereference the 3 array variables in this code instead of using mutable values?
The code below computes the longest common subsequence (LCS) by traversing the m*n array diagonally.
The arguments are 2 char arrays (s1 and s2 below).
The LCS method should therefore return length 4, as the longest common subsequences are "acbb" and "bcbb".
let private s1 = "ABCDBB".ToCharArray()
let private s2 = "CBACBAABA".ToCharArray()
let public lcs_seq_1d_diags (x:char[]) (y:char[]) =
    let m = x.Length
    let n = y.Length
    let mutable dk2 = Array.create (1+m) 0
    //printfn "\r\n0: %A" dk2
    let mutable dk1 = Array.create (1+m) 0
    //printfn "1: %A" dk1
    let mutable dk = Array.create (1+m) 0
    for k = 2 to m+n do
        let low = max 1 (k-m)
        let high = min (k-1) n
        for j = low to high do
            let i = k - j
            if x.[i-1] = y.[j-1] then
                dk.[i] <- dk2.[i-1] + 1
            else
                dk.[i] <- max dk1.[i] dk1.[i-1]
        let mutable temp = dk2
        dk2 <- dk1
        dk1 <- dk
        dk <- temp
    dk1.[m]
let private res_seq_1d_rows = duration (fun () -> lcs_seq_1d_rows s1 s2)
//res_seq_1d_rows = 4
Take a look at the reference cells http://msdn.microsoft.com/en-us/library/dd233186.aspx
The syntax looks like this:
let a = ref 1 // declaring a reference
a := 2 // changing the reference value
printfn "%i" !a // dereferencing
This might also be interesting: F#: let mutable vs. ref
Arrays are mutable by default. Try using a list instead if you want immutability.
Try starting with this instead:
let s1 = List.ofSeq "ABCDBB"
let s2 = List.ofSeq "CBACBAABA"
The rest I leave as an exercise for the reader :-)
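One way to avoid the three swapped variables is to build a fresh diagonal on every step and pass the two previous diagonals along as loop state. The sketch below shows that shape in Python rather than F#; lcs_length is a hypothetical name, and the inner list is still filled in place, so this removes only the rotating references, not all mutation. In F# the same shape could be expressed as a fold or a recursive function carrying the pair of previous diagonals.

```python
def lcs_length(x, y):
    """Length of the longest common subsequence, computed by anti-diagonals."""
    m, n = len(x), len(y)
    d2 = [0] * (m + 1)  # diagonal k-2, indexed by row i
    d1 = [0] * (m + 1)  # diagonal k-1, indexed by row i
    for k in range(2, m + n + 1):
        low, high = max(1, k - m), min(k - 1, n)
        d = [0] * (m + 1)  # fresh diagonal k; older diagonals stay untouched
        for j in range(low, high + 1):
            i = k - j
            if x[i - 1] == y[j - 1]:
                d[i] = d2[i - 1] + 1
            else:
                d[i] = max(d1[i], d1[i - 1])
        d2, d1 = d1, d  # rebind the names instead of swapping mutable cells
    return d1[m]
```

Allocating a new list per diagonal costs more than reusing three buffers, which is the trade-off for dropping the mutable bindings.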

How to translate this list-based code into using mutable arrays?

EDIT3: I'm writing code to process a very long input list of Ints with only a few hundred non-duplicates. I use two auxiliary lists to maintain cumulative partial sums to calculate some accumulator value; the hows and whys are unimportant. I want to ditch all the lists here and turn this into a nice destructive loop, and I don't know how. I don't need the whole code; skeleton code would be great, where reads and writes are done to two auxiliary arrays and some end result is returned. What I have right now would run for 0.5 hour on the input. I've now coded this in C++, and it runs in 90 seconds for the same input.
I can't understand how to do this at all. This is the list-based code that I have right now (the Map-based code below is clearer):
ins :: (Num b, Ord a) => a -> b -> [(a, b)] -> ([(a, b)], b)
ins n x [] = ( [(n,x)], 0)
ins n x l@((v, s):t) =
    case compare n v of
        LT -> ( (n,s+x) : l , s )
        EQ -> ( (n,s+x) : t , if null t then 0 else snd (head t))
        GT -> let (u,z) = ins n x t
              in ((v,s+x):u,z)
This is used in a loop, to process a list of numbers of known length, (changed it to foldl now)
scanl g (0,([],[])) ns -- ns :: [Int]
g ::
    (Num t, Ord t, Ord a) =>
    (t, ([(a, t)], [(a, t)])) -> a -> (t, ([(a, t)], [(a, t)]))
g (c,( a, b)) n =
    let
        (a2,x) = ins n 1 a
        (b2,y) = if x>0 then ins n x b else (b,0)
        c2 = c + y
    in
        (c2,( a2, b2))
This works, but I need to speed it up. In C, I would keep the lists (a,b) as arrays; use binary search to find the element with the key just above or equal to n (instead of the sequential search used here); and use in-place update to change all the preceding entries.
I'm only really interested in final value. How is this done in Haskell, with mutable arrays?
I tried something, but I really don't know what I'm doing here, and am getting strange and very long error messages (like "can not deduce ... from context ..."):
goarr top = runSTArray $ do
    let sz = 10000
    a <- newArray (1,sz) (0,0) :: ST s (STArray s Int (Integer,Integer))
    b <- newArray (1,sz) (0,0) :: ST s (STArray s Int (Integer,Integer))
    let p1 = somefunc 2 -- somefunc :: Integer -> [(Integer, Int)]
    go1 p1 2 0 top a b
go1 p1 i c top a b =
    if i >= top
        then
            do
                return c
        else
            go2 p1 i c top a b
go2 p1 i c top a b =
    do
        let p2 = somefunc (i+1) -- p2 :: [(Integer, Int)]
        let n = combine p1 p2 -- n :: Int
        -- update arrays and calc new c
        -- like the "g" function is doing:
        -- (a2,x) = ins n 1 a
        -- (b2,y) = if x>0 then ins n x b else (b,0)
        -- c2 = c + y
        go1 p2 (i+1) c2 top a b -- a2 b2??
This doesn't work at all. I don't even know how to encode loops in do notation. Please help.
UPD: the Map based code that runs 3 times slower:
ins3 :: (Ord k, Num a) => k -> a -> Map.Map k a -> (Map.Map k a, a)
ins3 n x a | Map.null a = (Map.insert n x a , 0)
ins3 n x a = let (p,q,r) = Map.splitLookup n a in
    case q of
        Nothing -> (Map.union (Map.map (+x) p)
                       (Map.insert n (x+leftmost r) r) , leftmost r)
        Just s -> (Map.union (Map.map (+x) p)
                      (Map.insert n (x+s) r) , leftmost r)
leftmost r | Map.null r = 0
           | otherwise = snd . head $ Map.toList r
UPD2: The error message is " Could not deduce (Num (STArray s1 i e)) from the context () arising from the literal `0' at filename.hs:417:11"
that's where it says return c in go1 function. Perhaps c is expected to be an array, but I want to return the accumulator value that is built while using the two auxiliary arrays.
EDIT3: I've replaced scanl and (!!) with foldl and take as per Chris's advice, and now it runs in constant space with sane empirical complexity and is actually projected to finish in under 0.5 hour - a.o.t. ... 3 days ! I knew about it of course but was so sure GHC optimizes the stuff away for me, surely it wouldn't make that much of a difference, I thought! And so felt only mutable arrays could help... Bummer.
Still, C++ does same in 90 sec, and I would very much appreciate help in learning how to code this with mutable arrays, in Haskell.
Are the input values ever EQ? If they are not EQ, then the way scanl g (0,([],[])) ns is used means that the first [(,)] array, call it a, always has map snd a == reverse [1..length a] at each stage of g. For example, in a length 10 list the value of snd (a !! 4) is going to be 10-4. Keeping these reversed index values by mutating the second value of each preceding entry in a is quite wasteful. If you need speed, then this is one place to use a better algorithm.
None of this applies to the second [(,)] whose purpose is still mysterious to me. It records all insertions that were not done at the end of a, so perhaps it allows one to reconstruct the initial sequence of values.
You said "I'm only really interested in final value." Do you mean you only care about the last value in list output by the scanl .. line? If so then you need a foldl instead of scanl.
Edit: I am adding a non-mutable solution using a custom Finger Tree. It passes my ad hoc testing (at bottom of code):
{-# LANGUAGE MultiParamTypeClasses #-}
import Data.Monoid
import Data.FingerTree
data Entry a v = E !a !v deriving Show
data ME a v = NoF | F !(Entry a v) deriving Show
instance Num v => Monoid (ME a v) where
    mempty = NoF
    NoF `mappend` k = k
    k `mappend` NoF = k
    (F (E _a1 v1)) `mappend` (F (E a2 v2)) = F (E a2 (v1 + v2))
instance Num v => Measured (ME a v) (Entry a v) where
    measure = F
type M a v = FingerTree (ME a v) (Entry a v)
getV NoF = 0
getV (F (E _a v)) = v
expand :: Num v => M a v -> [(a, v)]
expand m = case viewl m of
    EmptyL -> []
    (E a _v) :< m' -> (a, getV (measure m)) : expand m'
ins :: (Ord a, Num v) => a -> v -> M a v -> (M a v, v)
ins n x m =
    let comp (F (E a _)) = n <= a
        comp NoF = False
        (lo, hi) = split comp m
    in case viewl hi of
        EmptyL -> (lo |> E n x, 0)
        (E v s) :< higher | n < v ->
                              (lo >< (E n x <| hi), getV (measure hi))
                          | otherwise ->
                              (lo >< (E n (s+x) <| higher), getV (measure higher))
g :: (Num t, Ord t, Ord a) =>
     (t, (M a t, M a t)) -> a -> (t, (M a t, M a t))
g (c, (a, b)) n =
    let (a2, x) = ins n 1 a
        (b2, y) = if x>0 then ins n x b else (b, 0)
    in (c+y, (a2, b2))
go :: (Ord a, Num v, Ord v) => [a] -> (v, ([(a, v)], [(a, v)]))
go ns = let (t, (a, b)) = foldl g (0, (mempty, mempty)) ns
        in (t, (expand a, expand b))
up = [1..6]
down = [5,4..1]
see'tests = map go [ up, down, up ++ down, down ++ up ]
main = putStrLn . unlines . map show $ see'tests
Slightly unorthodox, I am adding a second answer using a mutable technique. Since user1308992 mentioned Fenwick trees, I have used them to implement the algorithm. Two STUArray are allocated and mutated during the run. The basic Fenwick tree keeps totals for all smaller indices and the algorithm here needs totals for all larger indices. This change is handled by the (sz-x) subtraction.
import Control.Monad.ST(runST,ST)
import Data.Array.ST(STUArray,newArray)
import Data.Array.Base(unsafeRead, unsafeWrite)
import Data.Bits((.&.))
import Debug.Trace(trace)
import Data.List(group,sort)
{-# INLINE lsb #-}
lsb :: Int -> Int
lsb i = (negate i) .&. i
go :: [Int] -> Int
go xs = compute (maximum xs) xs
-- Require "top == maximum xs" and "all (>=0) xs"
compute :: Int -> [Int] -> Int
compute top xs = runST mutating where
    -- Have (sz - (top+1)) > 0 to keep algorithm simple
    sz = top + 2
    -- Reversed Fenwick tree (no bounds checking)
    insert :: STUArray s Int Int -> Int -> Int -> ST s ()
    insert arr x v = loop (sz-x) where
        loop i | i > sz = return ()
               | i <= 0 = error "wtf"
               | otherwise = do
                   oldVal <- unsafeRead arr i
                   unsafeWrite arr i (oldVal + v)
                   loop (i + lsb i)
    getSum :: STUArray s Int Int -> Int -> ST s Int
    getSum arr x = loop (sz - x) 0 where
        loop i acc | i <= 0 = return acc
                   | otherwise = do
                       val <- unsafeRead arr i
                       loop (i - lsb i) $! acc + val
    ins n x arr = do
        insert arr n x
        getSum arr (succ n)
    mutating :: ST s Int
    mutating = do
        -- Start index from 0 to make unsafeRead, unsafeWrite easy
        a <- newArray (0,sz) 0 :: ST s (STUArray s Int Int)
        b <- newArray (0,sz) 0 :: ST s (STUArray s Int Int)
        let loop [] c = return c
            loop (n:ns) c = do
                x <- ins n 1 a
                y <- if x > 0
                       then
                         ins n x b
                       else
                         return 0
                loop ns $! c + y
        -- Without debugging use the next line
        -- loop xs 0
        -- With debugging use the next five lines
        c <- loop xs 0
        a' <- see a
        b' <- see b
        trace (show (c,(a',b'))) $ do
            return c
    -- see is only used in debugging
    see arr = do
        let zs = map head . group . sort $ xs
        vs <- sequence [ getSum arr z | z <- zs ]
        let ans = filter (\(a,v) -> v>0) (zip zs vs)
        return ans
up = [1..6]
down = [5,4..1]
see'tests = map go [ up, down, up ++ down, down ++ up ]
main = putStrLn . unlines . map show $ see'tests
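The index arithmetic in the reversed Fenwick tree is easier to see in isolation. Below is a minimal Python sketch of a standard Fenwick tree plus the (size - key) mapping that turns its prefix sums into "sum over all keys at least this large". The class and function names are hypothetical, and unlike the Haskell above this version relies on ordinary bounds-checked list access.

```python
class Fenwick:
    """1-based Fenwick (binary indexed) tree over `size` positions."""

    def __init__(self, size):
        self.size = size
        self.tree = [0] * (size + 1)

    def add(self, i, value):
        """Add `value` at position i, where 1 <= i <= size."""
        while i <= self.size:
            self.tree[i] += value
            i += i & -i  # move to the next node whose range covers i

    def prefix_sum(self, i):
        """Sum of positions 1..i."""
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i  # strip the lowest set bit
        return total


def add_key(fw, key, value):
    """Store `value` under `key` (0 <= key < size) at the mirrored position."""
    fw.add(fw.size - key, value)


def sum_keys_at_least(fw, key):
    """Total of all values stored under keys >= `key`."""
    return fw.prefix_sum(fw.size - key)
```

Mirroring the position means larger keys sit at smaller positions, so an ordinary prefix sum covers exactly the larger keys; calling sum_keys_at_least with key + 1 gives the strictly-greater total that the ins function above asks for.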

Resources