I'm curious about the performance characteristics of joined() and .flatMap(_:) in flattening a multidimensional array:
let array = [[1,2,3],[4,5,6],[7,8,9]]
let j = Array(array.joined())
let f = array.flatMap{$0}
They both flatten the nested array into [1, 2, 3, 4, 5, 6, 7, 8, 9]. Should I prefer one over the other for performance? Also, is there a more readable way to write the calls?
TL;DR
When it comes just to flattening 2D arrays (without applying any transformations or separators – see @dfri's answer for more on that aspect), array.flatMap{$0} and Array(array.joined()) are conceptually the same and have similar performance.
The main difference between flatMap(_:) and joined() (note that this isn't a new method; it has just been renamed from flatten()) is that joined() is always lazily applied (for arrays, it returns a special FlattenBidirectionalCollection<Base>).
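To see this concretely (a minimal check – the exact wrapper type is an implementation detail that varies between Swift versions):
let nested = [[1, 2], [3, 4]]
let flattened = nested.joined()
// Nothing has been flattened yet – flattened is just a lazy view over nested.
print(type(of: flattened))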
Therefore in terms of performance, it makes sense to use joined() over flatMap(_:) in situations where you only want to iterate over part of a flattened sequence (without applying any transformations). For example:
let array2D = [[2, 3], [8, 10], [9, 5], [4, 8]]
if array2D.joined().contains(8) {
    print("contains 8")
} else {
    print("doesn't contain 8")
}
Because joined() is lazily applied and contains(_:) stops iterating upon finding a match, only the first two inner arrays have to be 'flattened' to find the element 8 in the 2D array. Although, as @dfri correctly notes below, you are also able to lazily apply flatMap(_:) through the use of a LazySequence/LazyCollection – which can be created through the lazy property. This would be ideal for lazily applying both a transformation and a flattening to a given 2D sequence.
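For example, here is a minimal sketch of that lazy combination (the squaring transform and the > 50 threshold are arbitrary choices for illustration):
let firstLargeSquare = array2D.lazy
    .flatMap { $0 }    // flatten lazily
    .map { $0 * $0 }   // transform lazily
    .first { $0 > 50 }
// Only [2, 3] and the start of [8, 10] are visited before 8 * 8 == 64 is found.
// firstLargeSquare == Optional(64)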
In cases where joined() is iterated fully through, it is conceptually no different from using flatMap{$0}. Therefore, these are all valid (and conceptually identical) ways of flattening a 2D array:
array2D.joined().map{$0}
Array(array2D.joined())
array2D.flatMap{$0}
In terms of performance, flatMap(_:) is documented as having a time-complexity of:
O(m + n), where m is the length of this sequence and n is the length of the result
This is because its implementation is simply:
public func flatMap<SegmentOfResult : Sequence>(
    _ transform: (Iterator.Element) throws -> SegmentOfResult
) rethrows -> [SegmentOfResult.Iterator.Element] {
    var result: [SegmentOfResult.Iterator.Element] = []
    for element in self {
        result.append(contentsOf: try transform(element))
    }
    return result
}
As append(contentsOf:) has a time-complexity of O(n), where n is the length of the sequence to append, we get an overall time-complexity of O(m + n), where m is the length of the 2D sequence and n is the total length of all the sequences appended (matching the documented definitions above).
When it comes to joined(), there is no documented time-complexity, as it is lazily applied. However, the main bit of source code to consider is the implementation of FlattenIterator, which is used to iterate over the flattened contents of a 2D sequence (which will occur upon using map(_:) or the Array(_:) initialiser with joined()).
public mutating func next() -> Base.Element.Iterator.Element? {
    repeat {
        if _fastPath(_inner != nil) {
            let ret = _inner!.next()
            if _fastPath(ret != nil) {
                return ret
            }
        }
        let s = _base.next()
        if _slowPath(s == nil) {
            return nil
        }
        _inner = s!.makeIterator()
    } while true
}
Here _base is the base 2D sequence, _inner is the current iterator from one of the inner sequences, and _fastPath & _slowPath are hints to the compiler to aid with branch prediction.
Assuming I'm interpreting this code correctly & the full sequence is iterated through, this also has a time complexity of O(m + n), where m is the length of the sequence, and n is the length of the result. This is because it goes through each outer iterator and each inner iterator to get the flattened elements.
So, performance wise, Array(array.joined()) and array.flatMap{$0} both have the same time complexity.
If we run a quick benchmark in a debug build (Swift 3.1):
import QuartzCore
func benchmark(repeatCount: Int = 1, name: String? = nil, closure: () -> ()) {
    let d = CACurrentMediaTime()
    for _ in 0..<repeatCount {
        closure()
    }
    let d1 = CACurrentMediaTime() - d
    print("Benchmark of \(name ?? "closure") took \(d1) seconds")
}
let arr = [[Int]](repeating: [Int](repeating: 0, count: 1000), count: 1000)
benchmark {
    _ = arr.flatMap{ $0 } // 0.00744s
}

benchmark {
    _ = Array(arr.joined()) // 0.525s
}

benchmark {
    _ = arr.joined().map{ $0 } // 1.421s
}
flatMap(_:) appears to be the fastest. I suspect that joined() being slower could be due to the branching that occurs within the FlattenIterator (although the hints to the compiler minimise this cost) – just why map(_:) is so slow, though, I'm not too sure. I'd certainly be interested to hear if anyone else knows more about this.
However, in an optimised build, the compiler is able to optimise away this big performance difference, giving all three options comparable speed, although flatMap(_:) is still fastest by a fraction of a second:
let arr = [[Int]](repeating: [Int](repeating: 0, count: 10000), count: 1000)
benchmark {
    let result = arr.flatMap{ $0 } // 0.0910s
    print(result.count)
}

benchmark {
    let result = Array(arr.joined()) // 0.118s
    print(result.count)
}

benchmark {
    let result = arr.joined().map{ $0 } // 0.149s
    print(result.count)
}
(Note that the order in which the tests are performed can affect the results – both of the above results are averages from performing the tests in various different orders.)
From the Swiftdoc.org documentation of Array (Swift 3.0/dev) we read [emphasis mine]:
func flatMap<SegmentOfResult : Sequence>(_: @noescape (Element) throws -> SegmentOfResult)
Returns an array containing the concatenated results of calling the
given transformation with each element of this sequence.
...
In fact, s.flatMap(transform) is equivalent to Array(s.map(transform).flatten()).
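A quick sanity check of that equivalence (a minimal sketch using joined(), the Swift 3 name for flatten(), and an arbitrary doubling transform):
let s = [[1, 2], [3, 4]]
let viaFlatMap = s.flatMap { $0.map { $0 * 2 } }
let viaJoined  = Array(s.map { $0.map { $0 * 2 } }.joined())
// viaFlatMap == viaJoined == [2, 4, 6, 8]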
We may also take a look at the actual implementations of the two in the Swift source code (from which Swiftdoc is generated ...)
swift/stdlib/public/core/Join.swift
swift/stdlib/public/core/FlatMap.swift
Most notably the latter source file, where the flatMap implementations in which the supplied closure (transform) does not yield an optional value (as is the case here) are all described as
/// Returns the concatenated results of mapping `transform` over
/// `self`. Equivalent to
///
/// self.map(transform).joined()
From the above (assuming the compiler can be clever w.r.t. a simple { $0 } transform over self), it would seem as if, performance-wise, the two alternatives should be equivalent, but joined does, imo, better show the intent of the operation.
In addition to intent in semantics, there is one apparent use case where joined is preferable over (and not entirely comparable to) flatMap: using the joined(separator:) variant to join sequences with a separator:
let array = [[1,2,3],[4,5,6],[7,8,9]]
let j = Array(array.joined(separator: [42]))
print(j) // [1, 2, 3, 42, 4, 5, 6, 42, 7, 8, 9]
The corresponding result using flatMap is not quite as neat, as we explicitly need to remove the final additional separator after the flatMap operation (two different use cases, with or without a trailing separator):
let f = Array(array.flatMap{ $0 + [42] }.dropLast())
print(f) // [1, 2, 3, 42, 4, 5, 6, 42, 7, 8, 9]
See also a somewhat outdated post by Erica Sadun discussing flatMap vs. flatten() (note: joined() was named flatten() in Swift < 3):
Erica Sadun- Beta 6: flatten #swiftlang
Related
I have two arrays which I would like to reduce to one array in which at each index you have the sum of the two elements in the original arrays. For example:
val arr1: Array[Int] = Array(1, 1, 3, 3, 5)
val arr2: Array[Int] = Array(2, 1, 2, 2, 1)
val arr3: Array[Int] = sum(arr1, arr2)
// This should result in:
// arr3 = Array(3, 2, 5, 5, 6)
I've seen this post: Element-wise sum of arrays in Scala, and I currently use this approach (zip/map). However, using this for a big data application I am concerned about its performance. Using this approach one has to traverse the array(s) at least twice. Is there a better approach in terms of efficiency?
The most efficient way might well be to do it lazily.
As with anything collection-oriented, Scala 2.12 and 2.13 are going to be different (this code is Scala 2.13, but 2.12 will be similar... might extend IndexedSeqLike, but I don't know for sure)
import scala.collection.IndexedSeq
import scala.math.Numeric
import scala.math.Numeric.Implicits._ // provides the + syntax for T: Numeric

case class SumIndexedSeq[+T: Numeric](seq1: IndexedSeq[T], seq2: IndexedSeq[T]) extends IndexedSeq[T] {
  override val length: Int = seq1.length.min(seq2.length)

  override def apply(i: Int) =
    if (i >= length) throw new IndexOutOfBoundsException
    else seq1(i) + seq2(i)
}
Arrays are implicitly convertible to a subtype of collection.IndexedSeq. This will compute the sum of the corresponding elements on every access (which may be generally desirable as it's possible to use a mutable IndexedSeq).
If you need an Array, you can get one with only a single traversal via
val arr3: Array[Int] = SumIndexedSeq(arr1, arr2).toArray
but SumIndexedSeq can be used anywhere a Seq can be used without a traversal.
As a further optimization, especially if you're sure that the underlying collections/arrays won't mutate, you can add a cache so you don't add the same elements together twice. It can also be generalized, if you so care, to any binary operations on T (in which case the Numeric constraint can be removed).
As Luis noted, for a performance question: experiment and benchmark. It's worth keeping in mind that a cache implementation may well entail boxing every element to put in the cache, so you might need to be accessing the same elements many times in order for the cache to be a win (and a sufficiently large cache may have implications for the stability of a distributed system).
Well, first of all, as with all things related to performance the only answer is to benchmark.
Second, are you sure you need plain mutable, invariant, weird Arrays? Can't you use something like Vector or ArraySeq?
Third, you can just do something like this, or use a while loop, which would be the same.
val result = ArraySeq.tabulate(math.min(arr1.length, arr2.length)) { i =>
  arr1(i) + arr2(i)
}
I want to tile a small array multiple times in a large array. I'm looking for an "official" way of doing this. A naive solution follows:
val arr = Array[Int](1, 2, 3)
val array = {
  val arrBuf = ArrayBuffer[Int]()
  for (_ <- 1 until 10) {
    arrBuf ++= arr
  }
  arrBuf.toArray
}
If you do not know why Arrays are good for performance (meaning you do not really need raw performance in this case), I would recommend you do not use them, and rather stick with List or Vector instead.
Arrays are not proper Scala collections, they are just plain JVM arrays. Meaning, they are mutable, very efficient (especially for unboxed primitives), fixed in memory size, and very restricted. They behave like normal scala collections because of implicit conversions and extension methods. But, due to their mutability and invariance, you really should avoid them unless you have good reasons for using them.
The solution proposed by Andronicus is not ideal for arrays (though it would be a very good solution for any real collection) because, given that arrays are fixed in memory size, this flattening will result in constant reallocations and memory copying under the hood.
Anyways, here is a slight variation of that solution, using lists instead, which is a little bit more efficient.
implicit class ListOps[A](private val list: List[A]) extends AnyVal {
  def times[B >: A](n: Int): List[B] =
    Iterator.fill(n)(list).flatten.toList
}
List(1, 2, 3).times(3)
// res: List[Int] = List(1, 2, 3, 1, 2, 3, 1, 2, 3)
And here is also an efficient version using the new ArraySeq introduced in 2.13, which is an immutable Array.
(Note, you can do this using plain Arrays too)
implicit class ArraySeqOps[A](private val arr: ArraySeq[A]) extends AnyVal {
  def times[B >: A](n: Int): ArraySeq[B] =
    ArraySeq.tabulate(n * arr.length) { i => arr(i % arr.length) }
}
ArraySeq(1, 2, 3).times(3)
// res: ArraySeq[Int] = ArraySeq(1, 2, 3, 1, 2, 3, 1, 2, 3)
You can use Array.fill:
Array.fill(10)(Array(1, 2, 3)).flatten
[1, 2, 3, -1, -2].filter({ $0 > 0 }).count // => 3
[1, 2, 3, -1, -2].lazy.filter({ $0 > 0 }).count // => 3
What is the advantage of adding lazy to the second statement? As per my understanding, when a lazy variable is used, memory for it is initialized only at the time it is first used. How does that make sense in this context?
I'm trying to understand the use of LazySequence in a little more detail. I have used the map, reduce and filter functions on sequences, but never on a lazy sequence. I need to understand why to use this.
lazy changes the way the array is processed. When lazy is not used, filter processes the entire array and stores the results into a new array. When lazy is used, the values in the sequence or collection are produced on demand from the downstream functions. The values are not stored in an array; they are just produced when needed.
Consider this modified example in which I've used reduce instead of count so that we can print out what is happening:
Not using lazy:
In this case, all items will be filtered first before anything is counted.
[1, 2, 3, -1, -2].filter({ print("filtered one"); return $0 > 0 })
    .reduce(0) { (total, elem) -> Int in print("counted one"); return total + 1 }
filtered one
filtered one
filtered one
filtered one
filtered one
counted one
counted one
counted one
Using lazy:
In this case, reduce is asking for an item to count, and filter will work until it finds one, then reduce will ask for another and filter will work until it finds another.
[1, 2, 3, -1, -2].lazy.filter({ print("filtered one"); return $0 > 0 })
    .reduce(0) { (total, elem) -> Int in print("counted one"); return total + 1 }
filtered one
counted one
filtered one
counted one
filtered one
counted one
filtered one
filtered one
When to use lazy:
option-clicking on lazy gives this explanation:
From the Discussion for lazy:
Use the lazy property when chaining operations:
to prevent intermediate operations from allocating storage
or
when you only need a part of the final collection to avoid unnecessary computation
I would add a third:
when you want the downstream processes to get started sooner and not have to wait for the upstream processes to do all of their work first
So, for example, you'd want to use lazy before filter if you were searching for the first positive Int: the search would stop as soon as one was found, saving filter from processing the whole array and saving the allocation of a new filtered array.
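A minimal sketch of that case (the input values are arbitrary):
let firstPositive = [-3, -1, 2, 5, -4].lazy.filter { $0 > 0 }.first
// filter only runs until 2 is found; no intermediate array is allocated.
// firstPositive == Optional(2)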
For the 3rd point, imagine you have a program that displays prime numbers in the range 1...10_000_000 using filter on that range. You would rather show the primes as you find them than have to wait to compute them all before showing anything.
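Something along these lines (a hedged sketch – isPrime is a naive trial-division helper written just for this example):
func isPrime(_ n: Int) -> Bool {
    guard n >= 2 else { return false }
    var i = 2
    while i * i <= n {
        if n % i == 0 { return false }
        i += 1
    }
    return true
}

// With .lazy, each prime is printed as soon as filter produces it;
// without it, filter would build the entire array of primes before
// the loop body ran even once.
for prime in (1...10_000_000).lazy.filter(isPrime) {
    print(prime)
}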
I hadn't seen this before so I did some searching and found it.
The syntax you post creates a lazy collection. A lazy collection avoids creating a whole series of intermediate arrays for each step of your code. It isn't that relevant when you only have a single filter statement; it would have much more effect if you did something like filter.map.map.filter.map, since without the lazy collection a new array is created at each step.
See this article for more information:
https://medium.com/developermind/lightning-read-1-lazy-collections-in-swift-fa997564c1a3
EDIT:
I did some benchmarking, and a series of higher-order functions like maps and filters is actually a little slower on a lazy collection than on a "regular" collection.
It looks like lazy collections give you a smaller memory footprint at the cost of slightly slower performance.
Edit #2:
import Foundation

// (The original snippet references a formatter defined elsewhere;
// a plain NumberFormatter is assumed here.)
let formatter = NumberFormatter()

@discardableResult func timeTest() -> Double {
    let start = Date()
    let array = 1...1000000
    let random = array
        .map { (value) -> UInt32 in
            let random = arc4random_uniform(100)
            //print("Mapping", value, "to random val \(random)")
            return random
        }
    let result = random.lazy //Remove the .lazy here to compare
        .filter {
            let result = $0 % 100 == 0
            //print(" Testing \($0) % 100 == 0:", result)
            return result
        }
        .map { (val: UInt32) -> NSNumber in
            //print(" Mapping", val, "to NSNumber")
            return NSNumber(value: val)
        }
        .compactMap { (number) -> String? in
            //print(" Mapping", number, "to String")
            return formatter.string(from: number)
        }
        .sorted { (lhv, rhv) -> Bool in
            //print(" Sorting strings")
            return (lhv.compare(rhv, options: .numeric) == .orderedAscending)
        }
    let elapsed = Date().timeIntervalSince(start)
    print("Completed in", String(format: "%0.3f", elapsed), "seconds. count = \(result.count)")
    return elapsed
}
In the code above, if you change the line
let result = random.lazy //Remove the .lazy here to compare
to
let result = random //Removes the .lazy here
Then it runs faster. In my benchmark, the .lazy version takes about 1.5 times longer than a straight array.
I want to randomly sample from a Scala list or array (not an RDD), the sample size can be much longer than the length of the list or array, how can I do this efficiently? Because the sample size can be very big and the sampling (on different lists/arrays) needs to be done a large number of times.
I know for a Spark RDD we can use takeSample() to do it, is there an equivalent for Scala list/array?
Thank you very much.
An easy-to-understand version would look like this:
import scala.util.Random
Random.shuffle(list).take(n)
Random.shuffle(array.toList).take(n)
// Seeded version
val r = new Random(seed)
r.shuffle(...)
For arrays:
import scala.util.Random
import scala.reflect.ClassTag
def takeSample[T: ClassTag](a: Array[T], n: Int, seed: Long) = {
  val rnd = new Random(seed)
  Array.fill(n)(a(rnd.nextInt(a.size)))
}
Make a random number generator (rnd) based on your seed. Then, fill an array with n random indices ranging from 0 until the size of your input array.
The last step is applying each random value to the indexing operator of your input array. Using it in the REPL could look as follows:
scala> val myArray = Array(1,3,5,7,8,9,10)
myArray: Array[Int] = Array(1, 3, 5, 7, 8, 9, 10)
scala> takeSample(myArray,20,System.currentTimeMillis)
res0: scala.collection.mutable.ArraySeq[Int] = ArraySeq(7, 8, 7, 3, 8, 3, 9, 1, 7, 10, 7, 10,
1, 1, 3, 1, 7, 1, 3, 7)
For lists, I would simply convert the list to Array and use the same function. I doubt you can get much more efficient for lists anyway.
It is important to note that the same function using lists would take O(n^2) time, whereas converting the list to an array first takes O(n) time.
If you want to sample without replacement – zip with randoms, sort O(n log n), discard the randoms, take:
import scala.util.Random
val l = Seq("a", "b", "c", "d", "e")
val ran = l.map(x => (Random.nextFloat(), x))
  .sortBy(_._1)
  .map(_._2)
  .take(3)
Using a for comprehension, for a given array xs, as follows:
for (i <- 1 to sampleSize; r = (Math.random * xs.size).toInt) yield xs(r)
Note the random generator here produces values within the unit interval, which are scaled to range over the size of the array, and converted to Int for indexing over the array.
Note: For a pure functional random generator, consider for instance the State Monad approach from Functional Programming in Scala, discussed here.
Note: Consider also NICTA, another pure functional random value generator; its use is illustrated for instance here.
Using classical recursion.
import scala.util.Random
def takeSample[T](a: List[T], n: Int): List[T] = {
  n match {
    case n: Int if n <= 0 => List.empty[T]
    case n: Int => a(Random.nextInt(a.size)) :: takeSample(a, n - 1)
  }
}
package your.pkg

import your.pkg.SeqHelpers.SampleOps

import scala.collection.generic.CanBuildFrom
import scala.collection.mutable
import scala.language.{higherKinds, implicitConversions}
import scala.util.Random

trait SeqHelpers {
  implicit def withSampleOps[E, CC[_] <: Seq[_]](cc: CC[E]): SampleOps[E, CC] = SampleOps(cc)
}

object SeqHelpers extends SeqHelpers {

  case class SampleOps[E, CC[_] <: Seq[_]](cc: CC[_]) {

    private def recurse(n: Int, builder: mutable.Builder[E, CC[E]]): CC[E] = n match {
      case 0 => builder.result
      case _ =>
        val element = cc(Random.nextInt(cc.size)).asInstanceOf[E]
        recurse(n - 1, builder += element)
    }

    def sample(n: Int)(implicit cbf: CanBuildFrom[CC[_], E, CC[E]]): CC[E] = {
      require(n >= 0, "Cannot take less than 0 samples")
      recurse(n, cbf.apply)
    }
  }
}
Either:
mix in SeqHelpers (for example, into a Scalatest spec), or
include import your.pkg.SeqHelpers._
Then the following should work:
Seq(1 to 100: _*) sample 10 foreach { println }
Edits to remove the cast are welcome.
Also if there is a way to create an empty instance of the collection for the accumulator, without knowing the concrete type ahead of time, please comment. That said, the builder is probably more efficient.
I did not test for performance, but the following code is a simple and elegant way to do the sampling, and I believe it can help many who come here just to get sampling code. Just change the range according to the size of your end sample. If pseudo-randomness is not enough for your needs, you can use take(1) in the inner list and increase the range.
Random.shuffle((1 to 100).toList.flatMap(x => (Random.shuffle(yourList))))
I have a big array of objects and would like to split it into two arrays containing the objects in alternate order.
Example:
[0, 1, 2, 3, 4, 5, 6]
Becomes these two arrays (they should alternate)
[0, 2, 4, 6] and [1, 3, 5]
There are a ton of ways to split an array. But, what is the most efficient (least costly) if the array is huge.
There are various fancy ways to do it with filter but most would probably require two passes rather than one, so you may as well just use a for-loop.
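For reference, a sketch of the kind of two-pass filter approach being referred to (in later Swift syntax than the answer's own code):
let source = [0, 1, 2, 3, 4, 5, 6]
let evens = source.enumerated().filter { $0.offset % 2 == 0 }.map { $0.element }
let odds  = source.enumerated().filter { $0.offset % 2 != 0 }.map { $0.element }
// evens == [0, 2, 4, 6], odds == [1, 3, 5] – but the source is traversed twice.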
Reserving space up-front could make a big difference in this case: if the source is large, it'll avoid unnecessary re-allocation as the new arrays grow, and the calculation of the space needed is constant-time on arrays.
// could make this take a more generic random-access collection source
// if needed, or just make it an array extension instead
func splitAlternating<T>(source: [T]) -> ([T], [T]) {
    var evens: [T] = [], odds: [T] = []
    evens.reserveCapacity(source.count / 2 + 1)
    odds.reserveCapacity(source.count / 2)

    for idx in indices(source) {
        if idx % 2 == 0 {
            evens.append(source[idx])
        } else {
            odds.append(source[idx])
        }
    }
    return (evens, odds)
}
let a = [0,1,2,3,4,5,6]
splitAlternating(a) // ([0, 2, 4, 6], [1, 3, 5])
If performance is truly critical, you could use source.withUnsafeBufferPointer to access the source elements, to avoid the index bounds checking.
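A hedged sketch of that idea, in later Swift syntax (not benchmarked – whether it actually wins depends on what the optimiser already elides):
func splitAlternatingUnsafe<T>(_ source: [T]) -> ([T], [T]) {
    var evens: [T] = [], odds: [T] = []
    evens.reserveCapacity(source.count / 2 + 1)
    odds.reserveCapacity(source.count / 2)
    source.withUnsafeBufferPointer { buffer in
        // Direct buffer access skips the array's per-subscript bounds checks.
        for idx in buffer.indices {
            if idx % 2 == 0 {
                evens.append(buffer[idx])
            } else {
                odds.append(buffer[idx])
            }
        }
    }
    return (evens, odds)
}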
If the arrays are really huge, and you aren't going to use the resulting data except to sample a small number of elements, you could consider using a lazy view instead (though the std lib lazy filter isn't much use here, as it returns a sequence rather than a collection – you'd possibly need to write your own).
You can use a for in stride loop to fill the two resulting arrays as follows:
extension Array {
    var groupOfTwo: (firstArray: [T], secondArray: [T]) {
        var firstArray: [T] = []
        var secondArray: [T] = []
        for index in stride(from: 0, to: count, by: 2) {
            firstArray.append(self[index])
            if index + 1 < count {
                secondArray.append(self[index + 1])
            }
        }
        return (firstArray, secondArray)
    }
}
[0, 1, 2, 3, 4, 5, 6].groupOfTwo.firstArray // [0, 2, 4, 6]
[0, 1, 2, 3, 4, 5, 6].groupOfTwo.secondArray // [1, 3, 5]
update: Xcode 7.1.1 • Swift 2.1
extension Array {
    var groupOfTwo: (firstArray: [Element], secondArray: [Element]) {
        var firstArray: [Element] = []
        var secondArray: [Element] = []
        for index in 0.stride(to: count, by: 2) {
            firstArray.append(self[index])
            if index + 1 < count {
                secondArray.append(self[index + 1])
            }
        }
        return (firstArray, secondArray)
    }
}
A more concise, functional approach would be to use reduce
let a = [0,1,2,3,4,5,6]
let (evens, odds) = a.enumerate().reduce(([Int](), [Int]())) { (cur, next) in
    let even = next.index % 2 == 0
    return (cur.0 + (even ? [next.element] : []),
            cur.1 + (even ? [] : [next.element]))
}
evens // [0,2,4,6]
odds // [1,3,5]
Big/huge arrays always pose problems when partially processed, as in this case: creating two extra (even if half-sized) arrays can be both time and memory consuming. What if, for example, you just want to compute the mean and standard deviation of the oddly and evenly positioned numbers, but doing so requires calling a dedicated function that takes a sequence as input?
So why not create two sub-collections that, instead of duplicating the array contents, point to the original array, in a manner transparent enough to allow querying them for elements:
extension Collection where Index: Strideable {
    func stride(from: Index, to: Index, by: Index.Stride) -> StridedToCollection<Self> {
        return StridedToCollection(self, from: from, to: to, by: by)
    }
}

struct StridedToCollection<C>: Collection where C: Collection, C.Index: Strideable {
    private let _subscript: (C.Index) -> C.Element
    private let step: C.Index.Stride

    fileprivate init(_ collection: C, from: C.Index, to: C.Index, by: C.Index.Stride) {
        startIndex = from
        endIndex = Swift.max(to, startIndex)
        step = by
        _subscript = { collection[$0] }
    }

    let startIndex: C.Index
    let endIndex: C.Index

    func index(after i: C.Index) -> C.Index {
        let next = i.advanced(by: step)
        return next >= endIndex ? endIndex : next
    }

    subscript(_ index: C.Index) -> C.Element {
        return _subscript(index)
    }
}
The Collection extension and the associated struct create a pseudo-array that you can use to access only the elements you are interested in.
Usage is simple:
let numbers: [Int] = [1, 2, 3, 4]
let stride1 = numbers.stride(from: 0, to: numbers.count, by: 2)
let stride2 = numbers.stride(from: 1, to: numbers.count, by: 2)
print(Array(stride1), Array(stride2))
With the above you can iterate the two strides without worrying you'll double the amount of memory. And if you actually need two sub-arrays, you just Array(stride)-ify them.
Use for loops. If the index value is even, send the element to one array; if the index value is odd, send it to the other array.
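A minimal sketch of that approach (names are arbitrary):
var evens: [Int] = [], odds: [Int] = []
for (index, value) in [0, 1, 2, 3, 4, 5, 6].enumerated() {
    if index % 2 == 0 {
        evens.append(value)
    } else {
        odds.append(value)
    }
}
// evens == [0, 2, 4, 6], odds == [1, 3, 5]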
Here's, in my opinion, the easiest way (the idea shown in Python):
old_list = [0, 1, 2, 3, 4, 5, 6]
new_list1 = []
new_list2 = []

while len(old_list) > 0:
    new_list1.append(old_list.pop(-1))
    if len(old_list) != 0:
        new_list2.append(old_list.pop(-1))

new_list1.reverse()
new_list2.reverse()
I just had to do this where I split an array into two in one place, and three into another. So I built this:
extension Array {
    /// Splits the receiving array into multiple arrays
    ///
    /// - Parameter subCollectionCount: The number of output arrays the receiver should be divided into
    /// - Returns: An array containing `subCollectionCount` arrays. These arrays will be filled round robin style from the receiving array.
    ///   So if the receiver was `[0, 1, 2, 3, 4, 5, 6]` the output would be `[[0, 3, 6], [1, 4], [2, 5]]`. If the receiver is empty the output
    ///   will still be `subCollectionCount` arrays; they will just all be empty. This way it's always safe to subscript into the output.
    func split(subCollectionCount: Int) -> [[Element]] {
        precondition(subCollectionCount > 1, "Can't split the array unless you ask for > 1")
        var output: [[Element]] = []
        (0..<subCollectionCount).forEach { (outputIndex) in
            let indexesToKeep = stride(from: outputIndex, to: count, by: subCollectionCount)
            let subCollection = enumerated().filter({ indexesToKeep.contains($0.offset) }).map({ $0.element })
            output.append(subCollection)
        }
        precondition(output.count == subCollectionCount)
        return output
    }
}
It works on Swift 4.2 and 5.0 (as of 5.0 with Xcode 10.2 beta 2)