Process subarrays asynchronously and reduce the results to a single array

Input: an array of arrays.
let items = [|
    [|"item1"; "item2"|]
    [|"item3"; "item4"|]
    [|"item5"; "item6"|]
    [|"item7"; "item8"|]
    [|"item9"; "item10"|]
    [|"item11"; "item12"|]
|]
Asynchronous action that returns asynchronous result or error
let action (item: string) : Async<Result<string, string>> =
    async {
        return Ok (item + ":processed")
    }
Attempt: process one subarray at a time in parallel
let result =
    items
    |> Seq.map (Seq.map action >> Async.Parallel)
    |> Async.Parallel // wrong? process root items sequentially
    |> Async.RunSynchronously
Expectations:
a) Process one subarray at a time in parallel, then process the second subarray in parallel and so on. (In other words sequential processing for the root items and parallel processing for subitems)
b) Then collect all the results and merge them into a singly dimensioned results array while maintaining the order.
c) Preferably using built-in methods provided by Array, Seq, List, Async etc. instead of any custom operators (that'd be last resort)
d) Optional - If it's not possible to have something within the chain, then as a last resort perhaps convert the result subarrays into single array at the end and return to the caller, if that leads to a cleaner and minimalistic approach which I prefer.
Attempt 2
let result2 =
    items
    |> Seq.map (Seq.map action >> Async.Parallel)
    |> Async.Parallel // wrong? is it processing root items sequentially
    |> Async.RunSynchronously
    |> Array.collect id
Array.iter (fun (item: Result<string, string>) ->
    match item with
    | Ok r -> System.Console.WriteLine(r)
    | Error e -> System.Console.WriteLine(e)
) result2
Edit
let action (item: string) : Async<Result<string, string>> =
    async {
        return Ok (item + ":processed")
    }
let items = [| "item1"; "item2"; "item3"; "item4"; "item5"; "item6"; "item7"; "item8"; "item9"; "item10"|]
let result =
    items
    |> Seq.chunkBySize 2
    |> Seq.map (Seq.map action >> Async.Parallel)
    |> Seq.map Async.RunSynchronously
    |> Seq.toArray
    |> Array.collect id

// Here items is the array of arrays from the question.
let result =
    items
    |> Array.map (Array.map action >> Async.Parallel)
    |> Array.map Async.RunSynchronously
    |> Array.collect id
Edit: Note that the majority of the operations defined on Seq are also available on Array, and vice versa. If you start with an array, you can use Array operations all the way down.
let items = [| "item1"; "item2"; "item3"; "item4"; "item5"; "item6"; "item7"; "item8"; "item9"; "item10"|]
let result =
    items
    |> Array.chunkBySize 2
    |> Array.map (Array.map action >> Async.Parallel >> Async.RunSynchronously)
    |> Array.concat
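The answers above block once per subarray by calling Async.RunSynchronously inside the map. A possible alternative, sketched here under the assumption that FSharp.Core 4.7 or later is available (that is where Async.Sequential was added), keeps the whole pipeline asynchronous and blocks only once at the end. The shortened items array is just sample data for illustration.

```fsharp
// Same action as in the question: wraps each item in an Ok result.
let action (item: string) : Async<Result<string, string>> =
    async { return Ok (item + ":processed") }

// Shortened sample input (assumption: same shape as the question's data).
let items =
    [| [| "item1"; "item2" |]
       [| "item3"; "item4" |]
       [| "item5"; "item6" |] |]

let result : Result<string, string>[] =
    items
    |> Array.map (Array.map action >> Async.Parallel) // one parallel job per subarray
    |> Async.Sequential                                // run those jobs one after another
    |> Async.RunSynchronously                          // block once, at the very end
    |> Array.concat                                    // flatten, preserving order
```

Async.Parallel and Async.Sequential both return results in input order, so the flattened array keeps the original item order.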

Related

F#: How to iterate through a List of Arrays of Strings (string [] list) the functional way

I'm a newbie at F#,
I've got a List that contains arrays, each arrays contains 7 Strings.
I want to loop through the Arrays and do some kind of Array.map later on,
However my problem is that I can't send individual arrays to some other function.
I don't want to use for-loops but focus on the functional way using pipelines and mapping only.
let stockArray =
    [[|"2012-03-30"; "32.40"; "32.41"; "32.04"; "32.26"; "31749400"; "32.26"|];
     [|"2012-03-29"; "32.06"; "32.19"; "31.81"; "32.12"; "37038500"; "32.12"|];
     [|"2012-03-28"; "32.52"; "32.70"; "32.04"; "32.19"; "41344800"; "32.19"|];
     [|"2012-03-27"; "32.65"; "32.70"; "32.40"; "32.52"; "36274900"; "32.52"|];]
let tryout =
    stockArray
    |> List.iter;;
Output complains about List.iter:
error FS0001: Type mismatch. Expecting a
'string [] list -> 'a' but given a
'('b -> unit) -> 'b list -> unit'
The type 'string [] list' does not match the type ''a -> unit'
When trying Array.iter, same difference:
error FS0001: Type mismatch. Expecting a
'string [] list -> 'a' but given a
'('b -> unit) -> 'b [] -> unit'
The type 'string [] list' does not match the type ''a -> unit'
In C# I would simply go about it with a foreach to start treating my arrays one at a time, but with F# I feel real stuck.
Thank you for your help
The question is not clear, even with the extra comments. Anyway, I think you will finally be able to figure out your needs from this answer.
I have implemented parseDate and parseFloat in such a way that I expect it to work on any machine, whatever locale, with the given data. You may want something else for your production application. Also, how theInt is calculated is perhaps not what you want.
List.iter, as you already discovered, converts data to unit, effectively throwing away data. So what's the point in that? It is usually placed last when used in a pipe sequence, often doing some work that involves side effects (e.g. printing out data) or mutable data operations (e.g. filling a mutable list with items). I suggest you study functions in the List, Array, Seq and Option modules, to see how they're used to transform data.
open System
open System.Globalization
let stockArray =
    [
        [| "2012-03-30"; "32.40"; "32.41"; "32.04"; "32.26"; "31749400"; "32.26" |]
        [| "2012-03-29"; "32.06"; "32.19"; "31.81"; "32.12"; "37038500"; "32.12" |]
        [| "2012-03-28"; "32.52"; "32.70"; "32.04"; "32.19"; "41344800"; "32.19" |]
        [| "2012-03-27"; "32.65"; "32.70"; "32.40"; "32.52"; "36274900"; "32.52" |]
    ]
type OutData = { TheDate: DateTime; TheInt: int }
let parseDate s = DateTime.ParseExact (s, "yyyy-MM-dd", CultureInfo.InvariantCulture)
let parseFloat s = Double.Parse (s, CultureInfo.InvariantCulture)
let myFirstMap (inArray: string[]) : OutData =
    if inArray.Length <> 7 then
        failwith "Expected array with seven strings."
    else
        let theDate = parseDate inArray.[0]
        let f2 = parseFloat inArray.[2]
        let f3 = parseFloat inArray.[3]
        let f = f2 - f3
        let theInt = int f
        { TheDate = theDate; TheInt = theInt }
let tryout =
    stockArray
    |> List.map myFirstMap
The following is an alternative implementation of myFirstMap. I guess some would say it's more idiomatic, but I would just say that what you prefer to use depends on what you might expect from a possible future development.
let myFirstMap inArray =
    match inArray with
    | [| sDate; _; s2; s3; _; _; _ |] ->
        let theDate = parseDate sDate
        let f2 = parseFloat s2
        let f3 = parseFloat s3
        let f = f2 - f3
        let theInt = int f
        { TheDate = theDate; TheInt = theInt }
    | _ -> failwith "Expected array with seven strings."
The pipe operator |> lets you write f x as x |> f.
The signature of List.iter is:
action: ('a -> unit) -> list: ('a list) -> unit
You give it an action, then a list, and it returns unit (F#'s counterpart of void).
You can read it thus: when you give List.iter an action, its type will be
list: ('a list) -> unit
a function to which you can pass a list.
So when you write stockArray |> List.iter, what you're actually trying to give it in place of an action is your list - that's the error. So pass in an action:
let tryout = List.iter (fun arr -> printfn "%A" arr) stockArray
which can be rewritten as:
let tryout = stockArray |> List.iter (fun arr -> printfn "%A" arr)
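The partial-application reading described above can be made concrete with a small sketch. The names rows, printRows and lengths are made up for illustration and are not part of the question.

```fsharp
// Small sample list of arrays (hypothetical data).
let rows = [ [| "a"; "b" |]; [| "c"; "d" |] ]

// Supplying only the action yields a function that still expects the list.
let printRows : string[] list -> unit = List.iter (fun arr -> printfn "%A" arr)

printRows rows     // direct application: f x
rows |> printRows  // the same call written with the pipe: x |> f

// The same idea with List.map, which keeps a result for every element.
let lengths : string[] list -> int list = List.map Array.length
let rowLengths = rows |> lengths
```

Writing stockArray |> List.iter on its own fails because the list ends up in the slot where the action is expected.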
However my problem is that I can't send individual arrays to some other function
List.map and similar functions allow you to do precisely this - you don't need to iterate the list yourself.
For example, this will return just the first element of each array in your list:
stockArray
|> List.map (fun x -> x.[0])
You can replace the function passed to List.map with any function that operates on one array and returns some value.
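As a rough illustration of passing each array to a separate named function, the sketch below uses a hypothetical helper, columnDiff, that takes one array and returns the date together with the difference between columns 2 and 3 (the same columns myFirstMap reads in the other answer). What those columns mean is an assumption left to the reader.

```fsharp
open System
open System.Globalization

// Two rows copied from the question's data.
let stockRows =
    [ [| "2012-03-30"; "32.40"; "32.41"; "32.04"; "32.26"; "31749400"; "32.26" |]
      [| "2012-03-29"; "32.06"; "32.19"; "31.81"; "32.12"; "37038500"; "32.12" |] ]

// Hypothetical helper: operates on ONE array; List.map feeds it each array in turn.
let columnDiff (row: string[]) : string * float =
    let parse (s: string) = Double.Parse(s, CultureInfo.InvariantCulture)
    row.[0], parse row.[2] - parse row.[3]

// One (date, difference) pair per input array, in the original order.
let diffs = stockRows |> List.map columnDiff
```

No explicit loop is needed: List.map calls columnDiff once per array and collects the results into a new list.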

F# CSV - for each row create an array from columns data

I have a CSV file where the first column is a title and the next 700+ columns are int data.
Title D1 D2 D3 D4 .. D700
Name1 0 1 7 5 48
I try to use CsvProvider to read the file and then convert data to my custom type
type DigitRecord = { Title:string; Digits:int[] }
The problem is I don't know how to put all the column data (except the first column, which holds the title) into an int[] array.
let dataRecords =
    CSV.Rows
    |> Seq.map (fun record -> {Title = record.Title; Digits = ???})
I want to get a record with Title = "Name1" and Digits = [|0; 1; 7; 5; ...; 48|].
I'm a newbie in F#, and I'd be grateful for any help!
I think the easiest way is to use CsvFile (from FSharp.Data) like this:
let readData (path : string) seps =
    CsvFile.Load(path, seps).Rows
    |> Seq.map
        (fun row -> row.Columns.[0], row.Columns |> Array.skip 1 |> Array.map int)
    |> Seq.map
        (fun (title, digits) -> {Title = title; Digits = digits})
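The per-row transformation in that answer (first column becomes the title, the remaining columns become ints) can be tried without any CSV library. The sketch below is only an illustration: parseRow is a hypothetical helper, and it assumes a simple separator with no quoting, which a real CSV reader would handle for you.

```fsharp
type DigitRecord = { Title: string; Digits: int[] }

// Hypothetical helper: splits one raw line on a separator character.
// Assumes no quoted fields and that every column after the first is an int.
let parseRow (separator: char) (line: string) : DigitRecord =
    let columns = line.Split separator
    { Title = columns.[0]
      Digits = columns |> Array.skip 1 |> Array.map int }

let record = parseRow ',' "Name1,0,1,7,5,48"
```

Array.skip 1 drops the title column, and Array.map int converts each remaining string, throwing if a cell is not a valid integer.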

F# remove duplicates from a string [] list

I have a program that produces a string[] list, and I'm trying to remove near-duplicate arrays from the list. An example of the list is...
[
    [| "Jackson"; "Stentzke"; "22"; "001" |];
    [| "Jackson"; "Stentzke"; "22"; "002" |];
    [| "Alec"; "Stentzke"; "18"; "003" |]
]
Basically I'm trying to write a function that reads over the list and removes all instances of near-identical data. So the final returned string[] list should look like...
[
    [| "Alec"; "Stentzke"; "18"; "003" |]
]
I've tried a number of functions to get this result, or something close to it that I can work with. My current attempt is this...
let removeDuplicates (arrayList: string[] list) =
    let list =
        arrayList |> List.map (fun aL ->
            let a =
                arrayList |> List.map (fun aL2 ->
                    try
                        match (aL.GetValue(0).Equals(aL2.GetValue(0))) && (aL.GetValue(2).Equals(aL2.GetValue(2))) && (aL.GetValue(3).Equals(aL2.GetValue(3))) with
                        | false -> aL2
                        | _ -> [|""|]
                    with
                    | ex -> [|""|]
                )
            a
        )
    list |> List.concat |> List.distinct
But all this returns is a reversed version of the input string[] list.
Does anyone know how to remove near-duplicate arrays from a list?
I believe your code and comments don't match up very well. Considering your comment "the first, second and third values are the same", I believe this can get you on the right track:
let removeDuplicates (arrayList: string[] list) =
    arrayList |> List.distinctBy (fun elem -> (elem.[0], elem.[1], elem.[2]))
The result of this against your input data is a two-element list containing:
[
    [| "Jackson"; "Stentzke"; "22"; "001" |];
    [| "Alec"; "Stentzke"; "18"; "003" |]
]
You should create a dictionary/map keyed on the fields you consider identical, then remove any duplicate occurrences. Here's a simple and mechanical way, assuming xs is the list you specified above:
type DataRec = { key: string
                 fname: string
                 lname: string
                 id1: string
                 id2: string }
let dataRecs = xs |> List.map (fun x -> {key=x.[0]+x.[1]+x.[2];fname=x.[0];lname=x.[1];id1=x.[2];id2=x.[3]})
dataRecs |> Seq.groupBy (fun x -> x.key)
         |> Seq.filter (fun x -> Seq.length (snd x) = 1)
         |> Seq.collect snd
         |> Seq.map (fun x -> [|x.fname;x.lname;x.id1;x.id2|])
         |> Seq.toList
Output:
val it : string [] list = [[|"Alec"; "Stentzke"; "18"; "003"|]]
It basically creates a key from the first three items, groups by it, filters out any group with more than one member, and then maps back to an array.
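The same group-then-filter idea can be sketched without the intermediate record type by grouping directly on a tuple of the first three fields. This is only one possible variant; removeNearDuplicates is a name made up here, and the sample rows are copied from the question.

```fsharp
let rows =
    [ [| "Jackson"; "Stentzke"; "22"; "001" |]
      [| "Jackson"; "Stentzke"; "22"; "002" |]
      [| "Alec"; "Stentzke"; "18"; "003" |] ]

// Keep only the rows whose first three fields occur exactly once in the list.
let removeNearDuplicates (rows: string[] list) : string[] list =
    rows
    |> List.groupBy (fun r -> r.[0], r.[1], r.[2])           // key = first three fields
    |> List.filter (fun (_, group) -> List.length group = 1)  // drop keys seen more than once
    |> List.collect snd                                       // back to a flat list of arrays

let unique = removeNearDuplicates rows
```

Using a tuple as the key also avoids accidental collisions that string concatenation could cause, for example "ab" + "c" versus "a" + "bc".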
Using some Linq:
let comparer (atMost) =
    { new System.Collections.Generic.IEqualityComparer<string[]> with
        member __.Equals(a, b) =
            Seq.zip a b
            |> Seq.sumBy (fun (a',b') -> System.StringComparer.InvariantCulture.Compare(a', b') |> abs |> min 1)
            |> ((>=) atMost)
        member __.GetHashCode(a) = 1
    }
// data is the string[] list from the question
System.Linq.Enumerable.GroupBy(data, id, comparer 1)
|> Seq.choose (fun g -> match Seq.length g with | 1 -> Some g.Key | _ -> None)
The comparer treats two arrays as equal when they differ in at most atMost (an int) positions.

f# Finding the difference between 2 obj[]lists

I have two obj[] lists, list1 and list2. list1 has a length of 8 and list2 has a length of 10. Some arrays exist only in list1, and some only in list2, but there are also arrays that exist in both. I'm wondering how to get the arrays that exist only in list1. At the moment, when I run my code, I get a list of the arrays that exist in both lists, but it's missing the data unique to list1. Any suggestions?
let getProdOnly (index:int)(list1:obj[]list)(list2:obj[]list) =
    let mutable list3 = list.Empty
    for i = 0 to list1.Length-1 do
        for j = 0 to list2.Length-1 do
            if list1.Item(i).GetValue(index).Equals(list2.Item(j).GetValue(index)) then
                System.Diagnostics.Debug.WriteLine("Exists in List 1 and 2")
            else
                list3 <- list1.Item(i)
Something like this:
let ar1 = [|1;2;3|]
let ar2 = [|2;3;4|]
let s1 = ar1 |> Set.ofArray
let s2 = ar2 |> Set.ofArray
Set.difference s1 s2
//val it : Set<int> = set [1]
There are also a bunch of Array-related functions, like compareWith, distinct and exists, if you want to work with arrays directly.
But as was pointed out in previous answers, this type of imperative code is not very idiomatic. Try to avoid mutable variables and loops. It could probably be rewritten with Array.map, for example.
If you want the elements unique to one list, this is the easiest way to do it in F# 4.0:
list1
|> List.except list2
which will remove all the elements of list2 from list1. Note that except also returns only distinct elements, so duplicates within list1 are dropped; you might need to watch out for that.
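The duplicate-dropping behaviour of List.except is easy to see on a small sample. The sketch below uses made-up integer lists and compares it with a plain List.filter, which keeps duplicates.

```fsharp
let list1 = [ 1; 1; 2; 3 ]
let list2 = [ 3; 4 ]

// List.except removes list2's elements AND collapses duplicates in list1.
let viaExcept = list1 |> List.except list2   // [1; 2]

// A plain filter removes list2's elements but keeps duplicates.
let viaFilter = list1 |> List.filter (fun x -> not (List.contains x list2))   // [1; 1; 2]
```

Which of the two is appropriate depends on whether repeated entries in list1 carry meaning.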
First I took your code with minor changes and added some printf debugging to see what it does.
let getProdOnly2 (index:int)(list1:obj[] list)(list2:obj[] list) =
    let mutable list3 : obj[] list = []
    for i = 0 to list1.Length-1 do
        for j = 0 to list2.Length-1 do
            if list1.[i].[index] = list2.[j].[index] then
                printfn "equal"
                System.Diagnostics.Debug.WriteLine("Exists in List 1 and 2")
            else
                printfn "add %A %A" (list1.Item(i)) (list2.Item(j))
                list3 <- list1.Item(i) :: list3
    list3
It adds an element every time it finds an element of list2 that is not equal to the current element of list1.
So my attempt is to take list1 and just keep, or rather filter for, the elements that are not part of list2.
let getProdOnly3 (index:int)(list1:obj[] list)(list2:obj[] list) =
    list1
    |> List.filter (fun el1 ->
        list2
        |> List.fold (fun acc el2 -> acc && (el2<>el1)) true )
I tested the code with the following lists
let list1 =
    [ [| 1;2;3;4|]
      [| 1;2;3;4|]
      [| 2;3;4|]
      [| 3;4;5|] ] |> List.map (fun a -> a |> Array.map (fun e -> box e))
let list2 =
    [ [| 2;3;4|]
      [| 3;4;5|] ] |> List.map (fun a -> a |> Array.map (fun e -> box e))
Unlike s952163's answer, my result keeps duplicate entries if list1 contains duplicates; I do not know whether that is wanted or unwanted behaviour.
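The question compares rows by the value at a given index rather than by whole-array equality. One way that could be sketched is shown below: collect the key values from list2 into a HashSet and filter list1 against it. The names onlyInFirstBy and boxRows are hypothetical, and the sketch assumes the key values have sensible Equals/GetHashCode implementations (boxed ints and strings do).

```fsharp
open System.Collections.Generic

// Keep the rows of list1 whose value at `index` does not appear
// at `index` in any row of list2. Duplicates in list1 are kept.
let onlyInFirstBy (index: int) (list1: obj[] list) (list2: obj[] list) : obj[] list =
    let keysInSecond = HashSet<obj>(list2 |> List.map (fun row -> row.[index]))
    list1 |> List.filter (fun row -> not (keysInSecond.Contains(row.[index])))

// Hypothetical helper to build obj[] rows from typed sample data.
let boxRows rows = rows |> List.map (Array.map box)

let first = boxRows [ [| 1; 2; 3; 4 |]; [| 2; 3; 4 |]; [| 3; 4; 5 |] ]
let second = boxRows [ [| 2; 3; 4 |]; [| 3; 4; 5 |] ]

// Only the row whose first element is 1 has no counterpart in `second`.
let uniqueToFirst = onlyInFirstBy 0 first second
```

Building the set once avoids rescanning list2 for every row of list1.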

Is there a simple way to print each element of an array?

let x=[|15..20|]
let y=Array.map f x
printf "%O" y
Well, I got a type information.
Is there any way to print each element of "y" with delimiter of ",", while not having to use a for loop?
Either use String.Join in the System namespace or F# 'native':
let x = [| 15 .. 20 |]
printfn "%s" (System.String.Join(",", x))
x |> Seq.map string |> String.concat "," |> printfn "%s"
Using String.concat to concatenate the string with a separator is probably the best option in this case (because you do not want to have the separator at the end).
However, if you just wanted to print all elements, you can also use Array.iter:
let nums= [|15..20|]
Array.iter (fun x -> printfn "%O" x) nums // Using function call
nums |> Array.iter (fun x -> printfn "%O" x) // Using the pipe
Adding the separators in this case is harder, but possible using iteri:
nums |> Array.iteri (fun i x ->
    if i <> 0 then printf ", "
    printf "%O" x)
This won't print the entire array if it is large; I think it prints only the first 100 elements. Still, I suspect this is what you're after:
printfn "%A" y
If the array of items is large and you do not want to generate a large string, another option is to generate an interleaved sequence and skip the first item. The following code works assuming the array has at least one element.
One advantage of this approach is that it cleanly separates the act of interleaving the items and that of printing. It also eliminates having to do a check for the first item on every iteration.
let items = [| 15 .. 20 |]
let strInterleaved delimiter items =
    items
    |> Seq.collect (fun item -> seq { yield delimiter; yield item })
    |> Seq.skip 1
items
|> Seq.map string
|> strInterleaved ","
|> Seq.iter (printf "%s")
