I have a name of a file (as a string), and that file contains certain amount (1000000, for example) double-precision floating-point values (stored as binary, 8 bytes for each, obviously).
What would be the best way to read those doubles into a vector?
Here's how I did it in the end:
import qualified Data.Vector.Unboxed as V
import qualified Data.Vector.Unboxed.Mutable as VM
import qualified Data.ByteString.Lazy as BS
import Data.Binary
import Data.Binary.Get
import System.IO.Unsafe (unsafePerformIO)
import Unsafe.Coerce
readDoubles :: Int -> FilePath -> IO (V.Vector Double)
readDoubles n f = BS.readFile f >>= return . runGet (getVector n)
getVector :: Int -> Get (V.Vector Double)
{-# INLINE getVector #-}
getVector n = do
mv <- liftGet $ VM.new n
let fill i
| i < n = do
x <- fmap unsafeCoerce getWord64be
(unsafePerformIO $ VM.unsafeWrite mv i x) `seq` return ()
fill (i+1)
| otherwise = return ()
fill 0
liftGet $ V.unsafeFreeze mv
liftGet :: IO b -> Get b
liftGet = return . unsafePerformIO
Related
I wrote a set of utility functions around the bindings-fluidsynth library:
module FSUtilities where
import Control.Monad
import System.Directory
import Foreign.Ptr
import Foreign.ForeignPtr
import Foreign.C.Types
import Foreign.C.String
import Bindings.FluidSynth
newtype Settings = Settings (ForeignPtr C'fluid_settings_t)
newtype Synth = Synth (ForeignPtr C'fluid_synth_t)
type Channel = Int
type Key = Int
type Velocity = Int
initSynth :: IO Synth
initSynth = createSettings >>=
changeSettingStr "audio.driver" "alsa" >>=
changeSettingInt "synth.polyphony" 64 >>=
(\s -> createSynth s >>= createDriver s) >>=
loadSF "GS.sf2"
createSettings :: IO Settings
createSettings =
c'new_fluid_settings >>=
newForeignPtr p'delete_fluid_settings >>= (pure $!) . Settings
changeSettingStr :: String -> String -> Settings -> IO Settings
changeSettingStr k v (Settings s) =
withForeignPtr s $ \ptr ->
withCAString k $ \cstr ->
withCAString v $ \cstr' ->
c'fluid_settings_setstr ptr cstr cstr' >>
(pure $! Settings s)
changeSettingInt :: String -> Int -> Settings -> IO Settings
changeSettingInt k v (Settings s) =
withForeignPtr s $ \ptr ->
withCAString k $ \cstr ->
c'fluid_settings_setint ptr cstr (fromIntegral v) >>
(pure $! Settings s)
createSynth :: Settings -> IO Synth
createSynth (Settings s) =
withForeignPtr s c'new_fluid_synth >>=
newForeignPtr p'delete_fluid_synth >>= (pure $!) . Synth
createDriver :: Settings -> Synth -> IO Synth
createDriver (Settings set) (Synth syn) =
withForeignPtr set $ \ptr ->
withForeignPtr syn $ \ptr' ->
c'new_fluid_audio_driver ptr ptr' >>=
newForeignPtr p'delete_fluid_audio_driver >>
(pure $! Synth syn)
loadSF :: String -> Synth -> IO Synth
loadSF path (Synth syn) =
withForeignPtr syn $ \s ->
makeAbsolute path >>= \p ->
withCAString p $ \p' ->
c'fluid_synth_sfload s p' 1 >>=
\c -> if c == (-1) then error "loadSF: Could not load SoundFont"
else putStrLn "loadSF: SoundFont loaded" >>
(pure $! Synth syn)
noteOn :: Channel -> Key -> Velocity -> Synth -> IO ()
noteOn c k v (Synth ptr) =
withForeignPtr ptr $ \syn ->
c'fluid_synth_noteon syn c' k' v' >> pure ()
where c' = fromIntegral c
k' = fromIntegral k
v' = fromIntegral v
justPlay :: Channel -> Key -> IO ()
justPlay c k = initSynth >>= noteOn c k 127
justPlay' :: Channel -> Key -> IO Synth
justPlay' c k = initSynth >>= \s -> noteOn c k 127 s >> pure s
The justPlay and justPlay' functions serve to illustrate the issue. When I call justPlay from ghci, I get random segfaults (not consistently, around 30% of the time), while justPlay' never does that (but swiftly fills up my system's memory after a bunch of calls, due to dangling Synths. I think this is because I'm not cleaning up after myself when the Synth is no longer referenced, but I thought the call to newForeignPtr with a finalizer function at the creation of Synth was supposed to take care of that automatically.
I'm new to Haskell and I don't know C, so I'm trying to feel my way through this. What's the proper way to handle such a situation?
It is hard to say what exactly couses the crash, but there is at least one obviuosly wrong thing. Occurding to the documentation:
Other users of a synthesizer instance, such as audio and MIDI drivers, should be deleted prior to freeing the FluidSynth instance.
In your case the order of finalizers is not defined, so synthesizer could be deleted before driver. Probably other objects also has restrictions on their life circle.
To explicitly finalize foreign pointer use finalizeForeignPtr.
Haskell one is implemented using optimized Data.IntSet with complexity O(lg n). However, there is a 15x (previously 30x) speed difference for n = 2000000 despite Haskell code is already optimized for even number cases. I would like to know whether/why my implementation in Haskell is imperfect.
Original Haskell
primesUpTo :: Int -> [Int]
primesUpTo n = 2 : put S.empty [3,5..n]
where put :: S.IntSet -> [Int] -> [Int]
put _ [] = []
put comps (x:xs) =
if S.member x comps
then put comps xs
else x : put (S.union comps multiples) xs
where multiples = S.fromList [x*2, x*3 .. n]
Update
fromDistinctAscList gives a 4x speed increase. 2-3-5-7-Wheel speeds up by another 50%.
primesUpTo :: Int -> [Int]
primesUpTo n = 2 : 3 : 5 : 7 : put S.empty (takeWhile (<=n) (spin wheel 11))
where put :: S.IntSet -> [Int] -> [Int]
put _ [] = []
put comps (x:xs) =
if S.member x comps
then put comps xs
else x : put (S.union comps multiples) xs
where multiples = S.fromDistinctAscList [x*x, x*(x+2) .. n]
spin (x:xs) n = n : spin xs (n + x)
wheel = 2:4:2:4:6:2:6:4:2:4:6:6:2:6:4:2:6:4:6:8:4:2:4:2:4:8:6:4:6:2:4:6:2:6:6:4:2:4:6:2:6:4:2:4:2:10:2:10:wheel
Benchmarking
All time are measured by *nix time command, real space
Haskell original : 2e6: N/A; 2e7: >30s
Haskell optimized: 2e6: 0.396s; 2e7: 6.273s
C++ Set (ordered): 2e6: 4.694s; 2e7: >30s
C++ Bool Array : 2e6: 0.039s; 2e7: 0.421s
Haskell optimized is slower than C++ Bool by 10~15x, and faster than C++ Set by 10x.
Source code
C Compiler options: g++ 5.3.1, g++ -std=c++11
Haskell options: ghc 7.8.4, ghc
C code (Bool array) http://pastebin.com/W0s7cSWi
prime[0] = prime[1] = false;
for (int i=2; i<=limit; i++) { //edited
if (!prime[i]) continue;
for (int j=2*i; j<=n; j+=i)
prime[j] = false;
}
C code (Set) http://pastebin.com/sNpghrU4
nonprime.insert(1);
for (int i=2; i<=limit; i++) { //edited
if (nonprime.count(i) > 0) continue;
for (int j=2*i; j<=n; j+=i)
nonprime.insert(j);
}
Haskell code http://pastebin.com/HuMqwvRW
Code as written above.
I would like to know whether/why my implementation in Haskell is imperfect.
Instead of fromList you better use fromDistinctAscList which performs linearly. You may also add only odd multiples starting with x*x not x*2, because all the smaller odd multiples have already been added. Style-wise, a right fold may fit better than recursion.
Doing so, I get more than 3 times performance improvement for n equal to 2,000,000:
import Data.IntSet (member, union, empty, fromDistinctAscList)
sieve :: Int -> [Int]
sieve n = 2: foldr go (const []) [3,5..n] empty
where
go i run obs
| member i obs = run obs
| otherwise = i: run (union obs inc)
where inc = fromDistinctAscList [i*i, i*(i + 2)..n]
Nevertheless, an array has both O(1) access and cache friendly memory allocation. Using mutable arrays, I see more than 15 times performance improvement over your Haskell code (again n equal to 2,000,000):
{-# LANGUAGE FlexibleContexts #-}
import Data.Array.ST (STUArray)
import Control.Monad (forM_, foldM)
import Control.Monad.ST (ST, runST)
import Data.Array.Base (newArray, unsafeWrite, unsafeRead)
sieve :: Int -> [Int]
sieve n = reverse $ runST $ do
arr <- newArray (0, n) False :: ST s (STUArray s Int Bool)
foldM (go arr) [2] [3,5..n]
where
go arr acc i = do
b <- unsafeRead arr i
if b then return acc else do
forM_ [i*i, i*(i + 2).. n] $ \k -> unsafeWrite arr k True
return $ i: acc
I am trying to perform a series of transforms on graphical files using Haskell and Repa/DevIL. The starting example used was provided by the Haskell wiki page https://wiki.haskell.org/Numeric_Haskell:_A_Repa_Tutorial. I am an imperative programmer of 30 years experience with some erlang for good measure, trying to learn Haskell outside a classroom environment.
The problem is manipulating the data after the file load was first transformed into a Repa array:
import Data.Array.Repa.IO.DevIL (runIL,readImage,writeImage,Image(RGB),IL)
import qualified Data.Array.Repa as R
import Data.Vector.Unboxed as DVU
import Control.Monad
main :: IO ()
main = do
[f] <- getArgs
(RGB a) <- runIL $ Data.Array.Repa.IO.DevIL.readImage f
let
c = (computeP (R.traverse a id rgbTransform)) :: IL (Array U DIM3 Float)
which is successfully cast to type "Array F DIM3 Float" as output from the rgbTransform. From that point on it has been a nightmare to use the data. Flicking the array storage type between F(oreign) and U(nboxed) changes all following call's usability, plus the Repa-added monad layer IL forces use of liftM for nearly every equation following the 1st transform:
let -- continued
sh = liftM R.extent c -- IL DIM3
v = liftM R.toUnboxed c -- IL (Vector Float)
lv = liftM DVU.length v -- IL Int
f = liftM indexed v -- vector of tuples: (Int,a) where Int is idx
k = (Z :. 2) :. 2 :. 0 :: DIM3
These are the routines I can call without error. The IO monad's print command produces no output if placed in or after this 'let' list, due to the IL monad layer.
The game plan for the curious:
read the graphic file (done, via Repa)
resize image (not done, no resize in Repa, must be hand-coded)
transform and convert image from Word8 to Float (done)
get a Stablepointer to the transformed Float data (not done)
transform in-place the Float data as an array of C structs
of {Float a,b,c;}, by an external C routine via FFI (not completely
done). This is done hopefully without marshalling a new graphic
array by passing a pointer to the data
perform more passes over the transformed data to extract more info (partly done).
I am looking for help with issues 4 and 5.
4 -> The type system has been difficult to deal with while attempting to get C-usable memory pointers. Going thru the mountains of haskell library calls has not helped.
5 -> The external C routine is of type:
foreign import ccall unsafe "transform.h xform"
c_xform :: Ptr (CFloat,CFloat,CFloat) ->
CInt ->
IO ()
The Ptr is expected to point to an unboxed flat C array of rgb_t structs:
typedef struct
{
float r;
float g;
float b;
} rgb_t;
Available web-based FFI descriptions of how to deal with array pointers in FFI are non-existent if not downright obscure. The fairly straightforward idea of unfreezing and passing in a C array of floating-point RGB structs, modifying them in-place and then freezing the result is what I had in mind. The external transform is pure in the sense that the same input will produce predictable output, does not use threads, does not use global vars nor depend upon obscure libraries.
Foreign.Marshal.Array seems to provide a way to convert haskell data to C data and other way around.
I tested interfacing C code and haskell using the following files (Haskell + FFI for the first time for me)
hsc2hs rgb_ffi.hsc
ghc main.hs rgb_ffi.hs rgb.c
rgb.h
#ifndef RGB_H
#define RGB_H
#include <stdlib.h>
typedef struct {
float r;
float g;
float b;
} rgb_t;
void rgb_test(rgb_t * rgbs, ssize_t n);
#endif
rgb.h
#include <stdlib.h>
#include <stdio.h>
#include "rgb.h"
void rgb_test(rgb_t * rgbs, ssize_t n)
{
int i;
for(i=0; i<n; i++) {
printf("%.3f %.3f %.3f\n", rgbs[i].r, rgbs[i].g, rgbs[i].b);
rgbs[i].r *= 2.0;
rgbs[i].g *= 2.0;
rgbs[i].b *= 2.0;
}
}
rgb_ffi.hsc
{-# LANGUAGE ForeignFunctionInterface #-}
{-# LANGUAGE CPP #-}
module RGB where
import Foreign
import Foreign.C
import Control.Monad (ap)
#include "rgb.h"
data RGB = RGB {
r :: CFloat, g :: CFloat, b :: CFloat
} deriving Show
instance Storable RGB where
sizeOf _ = #{size rgb_t}
alignment _ = alignment (undefined :: CInt)
poke p rgb_t = do
#{poke rgb_t, r} p $ r rgb_t
#{poke rgb_t, g} p $ g rgb_t
#{poke rgb_t, b} p $ b rgb_t
peek p = return RGB
`ap` (#{peek rgb_t, r} p)
`ap` (#{peek rgb_t, g} p)
`ap` (#{peek rgb_t, b} p)
foreign import ccall "rgb.h rgb_test" crgbTest :: Ptr RGB -> CSize -> IO ();
rgbTest :: [RGB] -> IO [RGB]
rgbTest rgbs = withArray rgbs $ \ptr ->
do
crgbTest ptr (fromIntegral (length rgbs))
peekArray (length rgbs) ptr
rgbAlloc :: [RGB] -> IO (Ptr RGB)
rgbAlloc rgbs = newArray rgbs
rgbPeek :: Ptr RGB -> Int -> IO [RGB]
rgbPeek rgbs l = peekArray l rgbs
rgbTest2 :: Ptr RGB -> Int -> IO ()
rgbTest2 ptr l =
do
crgbTest ptr (fromIntegral l)
return ()
main.hs
module Main (main) where
import RGB
main =
do
let a = [RGB {r = 1.0, g = 1.0, b = 1.0},
RGB {r = 2.0, g = 2.0, b = 2.0},
RGB {r = 3.0, g = 3.0, b = 3.0}]
let l = length a
print a
-- b <- rgbTest a
-- print b
c <- rgbAlloc a
rgbTest2 c l
rgbTest2 c l
d <- rgbPeek c l
print d
return ()
I am trying to pass IO (Data.Vector.Storable.Vector Double) from Haskell to C, but there seem to be a problem as Haskell does not allow that. In this case, should I pass a pointer to the vector back to the C program (then how to wrap this object in Ptr) or is there another way?
Also, Data.Vector.Storable seem to have toList function, but when I try to use it, the program does not compile complaining that the Vector is wrapped in IO Monad.
I have tried to find a way to unwrap Vector from IO Monad, but failed to find anything specific, seems like to do so, the IO wrapped object has to be passed to another function, which would have to return back the IO wrapped object again, which, in my opinion, would lead to the same problem again as IO wrapped objects cannot be passed back to C program from the interface.
{-# LANGUAGE ForeignFunctionInterface #-}
module Safe where
import Foreign
import Foreign.C.Types
import Data.Vector.Storable
import AI.HNN.Recurrent.Network
foreign export ccall process :: Ptr CInt -> Ptr Double -> IO (Data.Vector.Storable.Vector Double) -> IO ()
feed :: [Double] -> IO (Data.Vector.Storable.Vector Double)
feed adj = do
let numNeurons = 3
numInputs = 1
thresholds = Prelude.replicate numNeurons 0.5
inputs = [[0.38], [0.75]]
n <- createNetwork numNeurons numInputs adj thresholds :: IO (Network Double)
output <- evalNet n inputs sigmoid
return output
peekInt :: Ptr CInt -> IO Int
peekInt = fmap fromIntegral . peek
process :: Ptr CInt -> Ptr Double -> IO (Data.Vector.Storable.Vector Double) -> IO ()
process n xs result = do
n <- peekInt n
es <- peekArray n xs
poke result $ (feed es)
Calling and collecting from C
double res = [output_size];
int asize = sizeof(weights)/sizeof(double);
process(&asize, &weights, &res);
I'm having a little Haskell Situation over here. I'm trying to write two functions with monads.
First one is supposed to iterate through a function as long as the condition is true for the input / output of the function. Second one is supposed to use the first one to take a number as input and write it as output until you enter a space.
I'm stuck with this, any help?
module Test where
while :: (a -> Bool) -> (a -> IO a) -> a -> IO a
while praed funktion x = do
f <- praed (funktion x)
if f == True then do
y <- funktion x
while praed funktion y
else return x
power2 :: IO ()
power2 = do putStr (Please enter a number.")
i <- getChar
while praed funktion
where praed x = if x /= ' ' then False else True
funktion = i
import Control.Monad
while :: (a -> Bool) -> (a -> IO a) -> a -> IO a
while praed funktion x
| praed x = do
y <- funktion x
while praed funktion y
| otherwise = return x
power2 :: IO ()
power2 = do
putStr "Please enter a number."
i <- getChar
let praed x = x /= ' '
let f x = do
putChar x
getChar
while praed f '?'
return ()
Some notes:
Using if x then True else False is redundant, it's equivalent to just x.
Similarly if x == True ... is redundant and equivalent to if x ....
You need to distinguish between IO actions and their results. For example, if yo do
do
i <- getChar
...
then in ... i represents the result of the action, a character, so i :: Char. But getChar :: IO Char is the action itself. You can view it as a recipe that returns Char when performed. You can pass the recipe around to functions etc., and it is only performed when executed somewhere.
Your while called funktion twice, which probably isn't what you intend - it would read a character twice, check the first one and return the second one. Remember, your funktion is an action, so each time you "invoke" the action (for example by using <- funktion ... in the do notation), the action is run again. So it should rather be something like
do
y <- funktion x
f <- praed y
-- ...
(My code is somewhat different, it checks the argument that is passed to it.)
For a pure version:
{-# LANGUAGE BangPatterns #-}
while :: (a -> Bool) -> (a -> a) -> a -> a
while p f = go where go !x = if p x then go (f x) else x
test1 :: Int
test1 = while (< 1000) (* 2) 2
-- test1 => 1024
for monadic:
import Control.Monad
whileM :: (Monad m, MonadPlus f) => (a -> m Bool) -> m a -> m (f a)
whileM p f = go where
go = do
x <- f
r <- p x
if r then (return x `mplus`) `liftM` go else return mzero
test2 :: IO [String]
test2 = whileM (return . (/= "quit")) getLine
-- *Main> test2
-- quit
-- []
-- *Main> test2
-- 1
-- 2
-- 3
-- quit
-- ["1","2","3"]
power2 :: IO (Maybe Char)
power2 = whileM (return . (/= 'q')) getChar
-- *Main> power2
-- q
-- Nothing
-- *Main> power2
-- 1
-- 2
-- 3
-- q
-- Just '\n'
see also:
http://hackage.haskell.org/package/monad-loops, http://hackage.haskell.org/package/loop-while, http://hackage.haskell.org/package/control-monad-loop.
http://www.haskellforall.com/2012/01/haskell-for-c-programmers-for-loops.html