Unflatten a flat array in Scala - arrays

I have a flat array like this and another flat array that describes the dimensions:
val elems = Array(0,1,2,3)
val dimensions = Array(2,2)
So now I must be able to unflatten that and return a 2*2 array like this:
val unflattened = Array(Array(0,1), Array(2,3))
The dimensions could be of any order. The only condition is that the length of the flat array equals the product of the dimensions. So for example, if the dimensions are
Array(3,3)
then I expect that the elems flat array will have 9 elements in it! The preconditions will be checked elsewhere, so I do not have to worry about them here. All I need to do is return an unflattened array.
Since this has to work on any dimension size, I think I probably have to define a recursive structure to put my results! Something like this?
case class Elem(elem: Array[Elem])
Could this work?
Any clue on how to go about implementing this function?

Although you should be able to do this with a simple recursive structure, I went along with a structure more suited to the problem.
case class Row(elems: List[Int])
trait Matrix
case class SimpleMatrix(rows: List[Row]) extends Matrix
case class HigherMatrix(matrices: List[Matrix]) extends Matrix
// since your flat arrays are always of proper sizes... we are not handling error cases
// so we are dealing with higher N-dimension matrices with size List(s1, s2, ...,sN)
// I have chosen List for the example (as it's easy to print), you should choose Array
def arrayToMatrix(flat: List[Int], dimension: Int, sizes: List[Int]): Matrix = dimension match {
  case 1 | 2 =>
    // the last size is the row length, so sizes List(3, 2) gives 3 rows of 2 elements
    SimpleMatrix(
      flat
        .grouped(sizes.last)
        .map(Row(_))
        .toList
    )
  case _ =>
    // sizes.head sub-matrices, each holding sizes.tail.product elements
    HigherMatrix(
      flat
        .grouped(sizes.tail.product)
        .map(g => arrayToMatrix(g, dimension - 1, sizes.tail))
        .toList
    )
}
def arrayToSquareMatrix(flat: List[Int], dimension: Int, size: Int): Matrix =
  arrayToMatrix(flat, dimension, List.fill(dimension)(size))
Here are the examples:
val sm_2__2_2 = arrayToSquareMatrix(Range.inclusive(1, 4).toList, 2, 2)
// sm_2__2_2: Matrix = SimpleMatrix(List(Row(List(1, 2)), Row(List(3, 4))))
val m_2__3_2 = arrayToMatrix(Range.inclusive(1, 6).toList, 2, List(3, 2))
// m_2__3_2: Matrix = SimpleMatrix(List(Row(List(1, 2)), Row(List(3, 4)), Row(List(5, 6))))
val sm_3__2_2_2 = arrayToSquareMatrix(Range.inclusive(1, 8).toList, 3, 2)
// sm_3__2_2_2: Matrix = HigherMatrix(List(SimpleMatrix(List(Row(List(1, 2)), Row(List(3, 4)))), SimpleMatrix(List(Row(List(5, 6)), Row(List(7, 8))))))
val m_3__3_2_2 = arrayToMatrix(Range.inclusive(1, 12).toList, 3, List(3, 2, 2))
// m_3__3_2_2: Matrix = HigherMatrix(List(SimpleMatrix(List(Row(List(1, 2)), Row(List(3, 4)))), SimpleMatrix(List(Row(List(5, 6)), Row(List(7, 8)))), SimpleMatrix(List(Row(List(9, 10)), Row(List(11, 12))))))

Here is a solution:
def unflatten(flat: Vector[Any], dims: Vector[Int]): Vector[Any] =
if (dims.length <= 1) {
flat
} else {
val (Vector(dim), rest) = dims.splitAt(1)
flat.grouped(flat.length/dim).map(a => unflatten(a, rest)).toVector
}
I have used Vector because Array is invariant (it is backed by JVM arrays), so an Array[Int] cannot be passed where an Array[Any] is expected, whereas Vector is covariant.
Note that this implements only one of the possible partitions with the given dimensions, so it may or may not be what is required.
This is a version using types based on the Matrix trait in another answer:
trait Matrix
case class SimpleMatrix(rows: Vector[Int]) extends Matrix
case class HigherMatrix(matrices: Vector[Matrix]) extends Matrix
def unflatten(flat: Vector[Int], dims: Vector[Int]): Matrix =
if (dims.length <= 1) {
SimpleMatrix(flat)
} else {
val (Vector(dim), rest) = dims.splitAt(1)
val subs = flat.grouped(flat.length/dim).map(a => unflatten(a, rest)).toVector
HigherMatrix(subs)
}

There is a function grouped on Arrays which does what you want.
scala> Array(0,1,2,3).grouped(2).toArray
res2: Array[Array[Int]] = Array(Array(0, 1), Array(2, 3))
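On its own, grouped only covers the two-dimensional case. A minimal sketch of extending the same idea to any number of dimensions is to apply grouped once per trailing dimension, innermost first. The helper name unflattenWithGrouped is made up for illustration, and it uses Vector[Any] for the result (as in the other answer) because the nesting depth is only known at runtime.

```scala
// Hypothetical helper (not from the answers above): nests a flat Vector by
// applying `grouped` once per trailing dimension, innermost dimension first.
// Assumes flat.length equals the product of dims, as the question states.
def unflattenWithGrouped(flat: Vector[Int], dims: Vector[Int]): Vector[Any] =
  dims.drop(1).reverse.foldLeft(flat: Vector[Any]) { (acc, size) =>
    acc.grouped(size).toVector
  }

val twoByTwo = unflattenWithGrouped(Vector(0, 1, 2, 3), Vector(2, 2))
// Vector(Vector(0, 1), Vector(2, 3))

val threeByTwoByTwo = unflattenWithGrouped((1 to 12).toVector, Vector(3, 2, 2))
// three blocks, each holding two rows of two elements
```

The first dimension never needs its own grouped call: once the trailing dimensions are grouped, the number of top-level groups falls out of the array length.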

Related

Convert an array with n elements to a tuple with n elements in Scala

I have an array of elements (numbers in my case)
var myArray= Array(1, 2, 3, 4, 5, 6)
//myArray: Array[Int] = Array(1, 2, 3, 4, 5, 6)
and I want to obtain a tuple that contains all of them:
var whatIwant= (1,2,3,4,5,6)
//whatIwant: (Int, Int, Int, Int, Int, Int) = (1,2,3,4,5,6)
I tried the following code but it doesn't work:
var tuple = myArray(0)
for (e <- myArray)
{
var tuple = tuple :+ e
}
//error: recursive variable tuple needs type
The trivial answer is this:
val whatIwant =
myArray match {
case Array(a, b, c, d, e, f) => (a, b, c, d, e, f)
case _ => (0, 0, 0, 0, 0, 0)
}
If you want to support different numbers of elements in myArray then you are in for a world of pain because you will lose all the type information associated with a tuple.
If you are using Spark you should probably use its mechanisms to generate the data you want directly rather than converting to Array first.
A tuple cannot hold an arbitrary number of elements: in earlier versions of Scala (up to and including Scala 2), the maximum was 22. Scala treats tuples with different numbers of elements as different classes, so you can't append elements the way you would with a list.
Apart from utilizing metaprogramming techniques such as reflection, tuple objects may only be generated explicitly.
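The pattern-match answer above falls back to a tuple of zeros when the array has the wrong length, which can hide mistakes. A possible variation, sketched here with a made-up helper name toTuple6, is to return an Option so the caller has to handle the mismatch explicitly.

```scala
// Hypothetical helper: Some(tuple) for exactly six elements, None otherwise.
def toTuple6(xs: Array[Int]): Option[(Int, Int, Int, Int, Int, Int)] =
  xs match {
    case Array(a, b, c, d, e, f) => Some((a, b, c, d, e, f))
    case _                       => None
  }

val sixElements = toTuple6(Array(1, 2, 3, 4, 5, 6)) // Some((1,2,3,4,5,6))
val tooShort    = toTuple6(Array(1, 2, 3))          // None
```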

Element-wise sum of arrays in Scala

How do I compute element-wise sum of the Arrays?
val a = new Array[Int](5)
val b = new Array[Int](5)
// assign values
// desired output: Array -> [a(0)+b(0), a(1)+b(1), a(2)+b(2), a(3)+b(3), a(4)+b(4)]
a.zip(b).flatMap(_._1+_._2)
missing parameter type for expanded function
Try:
a.zip(b).map { case (x, y) => x + y }
When you use an underscore as a placeholder in a function definition, it can only appear once (for each function argument position, that is, but in this case flatMap takes a Function1, so there's only one). If you need to refer to an argument more than once, you can't use the placeholder syntax—you'll need to give the argument a name.
As the other answers point out, you can use .map { case (x, y) => x + y } or the tuple accessor version, but it's also worth noting that if you want to avoid a bunch of tuple allocations in an intermediate collection, you can write the following:
scala> (a, b).zipped.map(_ + _)
res5: Array[Int] = Array(0, 0, 0, 0, 0)
Here zipped is a method that's available on pairs of collections that has a special map that takes a Function2, which means the only tuple that gets created is the (a, b) pair. The extra efficiency probably doesn't matter much in most cases, but the fact that you can pass a Function2 instead of a function from pairs means the syntax is often a little nicer as well.
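For reference, here is a small sketch of the two alternatives mentioned above with concrete values. The sample arrays are made up. Note that zipped on tuples is deprecated as of Scala 2.13, where lazyZip is the replacement, so the second form assumes Scala 2.13 or later.

```scala
val a = Array(1, 2, 3)
val b = Array(10, 20, 30)

// Tuple accessor version: name the pair once, then refer to it twice.
val viaAccessors = a.zip(b).map(pair => pair._1 + pair._2)

// Scala 2.13+ only: lazyZip avoids building the intermediate array of pairs
// and lets map take a two-argument function.
val viaLazyZip = a.lazyZip(b).map(_ + _)

// Both hold the element-wise sums 11, 22, 33.
```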
// one-dimensional arrays
val x = Array(1, 2, 3, 40, 55)
val x1 = Array(1, 2, 3, 40, 55)
x.indices.map(i => x(i) + x1(i))
// arrays of pairs
val y = Array((1, 2), (3, 4))
val y1 = Array((3, 5), (5, 7))
y.indices.map(i => (y(i)._1 + y1(i)._1, y(i)._2 + y1(i)._2))

How to randomly sample from a Scala list or array?

I want to randomly sample from a Scala list or array (not an RDD). The sample size can be much larger than the length of the list or array. How can I do this efficiently, given that the sample size can be very big and the sampling (on different lists/arrays) needs to be done a large number of times?
I know for a Spark RDD we can use takeSample() to do it, is there an equivalent for Scala list/array?
Thank you very much.
An easy-to-understand version would look like this:
import scala.util.Random
Random.shuffle(list).take(n)
Random.shuffle(array.toList).take(n)
// Seeded version
val r = new Random(seed)
r.shuffle(...)
For arrays:
import scala.util.Random
import scala.reflect.ClassTag
def takeSample[T:ClassTag](a:Array[T],n:Int,seed:Long) = {
val rnd = new Random(seed)
Array.fill(n)(a(rnd.nextInt(a.size)))
}
Make a random number generator (rnd) based on your seed. Then, fill an array with random numbers from 0 until the size of your array.
The last step is applying each random value to the indexing operator of your input array. Using it in the REPL could look as follows:
scala> val myArray = Array(1,3,5,7,8,9,10)
myArray: Array[Int] = Array(1, 3, 5, 7, 8, 9, 10)
scala> takeSample(myArray,20,System.currentTimeMillis)
res0: scala.collection.mutable.ArraySeq[Int] = ArraySeq(7, 8, 7, 3, 8, 3, 9, 1, 7, 10, 7, 10,
1, 1, 3, 1, 7, 1, 3, 7)
For lists, I would simply convert the list to Array and use the same function. I doubt you can get much more efficient for lists anyway.
It is important to note, that the same function using lists would take O(n^2) time, whereas converting the list to arrays first will take O(n) time
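As a sketch of the "convert first, then index" advice for lists: the helper below copies the list into a Vector once and then draws with replacement. The name sampleWithReplacement is made up, and Vector is used here instead of Array only so that no ClassTag is needed.

```scala
import scala.util.Random

// Hypothetical helper: copies the list once (linear in its length), then draws
// n elements with replacement using fast indexed access.
// Assumes xs is non-empty whenever n > 0; rnd.nextInt(0) would throw otherwise.
def sampleWithReplacement[T](xs: List[T], n: Int, rnd: Random): Vector[T] = {
  val indexed = xs.toVector
  Vector.fill(n)(indexed(rnd.nextInt(indexed.size)))
}

// 20 draws from a 4-element list, seeded for repeatability.
val drawn = sampleWithReplacement(List(1, 3, 5, 7), 20, new Random(42L))
```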
If you want to sample without replacement: zip with randoms, sort (O(n*log(n))), discard the randoms, then take.
import scala.util.Random
val l = Seq("a", "b", "c", "d", "e")
val ran = l.map(x => (Random.nextFloat(), x))
.sortBy(_._1)
.map(_._2)
.take(3)
Using a for comprehension, for a given array xs as follows,
for (i <- 1 to sampleSize; r = (Math.random * xs.size).toInt) yield xs(r)
Note the random generator here produces values within the unit interval, which are scaled to range over the size of the array, and converted to Int for indexing over the array.
Note: for a purely functional random generator, consider for instance the State Monad approach from Functional Programming in Scala, discussed here.
Note: consider also NICTA, another purely functional random value generator, its use illustrated for instance here.
Using classical recursion.
import scala.util.Random
def takeSample[T](a: List[T], n: Int): List[T] = {
n match {
case n: Int if n <= 0 => List.empty[T]
case n: Int => a(Random.nextInt(a.size)) :: takeSample(a, n - 1)
}
}
package your.pkg
import your.pkg.SeqHelpers.SampleOps
import scala.collection.generic.CanBuildFrom
import scala.collection.mutable
import scala.language.{higherKinds, implicitConversions}
import scala.util.Random
trait SeqHelpers {
implicit def withSampleOps[E, CC[_] <: Seq[_]](cc: CC[E]): SampleOps[E, CC] = SampleOps(cc)
}
object SeqHelpers extends SeqHelpers {
case class SampleOps[E, CC[_] <: Seq[_]](cc: CC[_]) {
private def recurse(n: Int, builder: mutable.Builder[E, CC[E]]): CC[E] = n match {
case 0 => builder.result
case _ =>
val element = cc(Random.nextInt(cc.size)).asInstanceOf[E]
recurse(n - 1, builder += element)
}
def sample(n: Int)(implicit cbf: CanBuildFrom[CC[_], E, CC[E]]): CC[E] = {
require(n >= 0, "Cannot take less than 0 samples")
recurse(n, cbf.apply)
}
}
}
Either:
Mix in SeqHelpers, for example, with a ScalaTest spec
Include import your.pkg.SeqHelpers._
Then the following should work:
Seq(1 to 100: _*) sample 10 foreach { println }
Edits to remove the cast are welcome.
Also if there is a way to create an empty instance of the collection for the accumulator, without knowing the concrete type ahead of time, please comment. That said, the builder is probably more efficient.
I did not test for performance, but the following code is a simple and elegant way to do the sampling, and I believe it can help many who come here just looking for sampling code. Just change the "range" according to the size of your final sample. If the pseudo-randomness is not enough for your needs, you can use take(1) on the inner list and increase the range.
Random.shuffle((1 to 100).toList.flatMap(x => (Random.shuffle(yourList))))

Array Set in scala

I have code in scala :
val graph = new Array [Set[Int]] (n)
def addedge(i:Int,j:Int)
{
graph(i)+=j
}
What does graph(i)+=j mean?
Can anybody translate it in any other languages like c, c++ or java?
graph is an Array, just like in C or Java. graph(i) means "access the ith element of graph". Each element in graph is a Set of Ints. The default Set is immutable and has no += method of its own, so the compiler rewrites graph(i) += j as graph(i) = graph(i) + j: it builds a new Set that also contains j and stores it back at index i in graph.
Trying things out in the REPL shows the behavior:
scala> val graph = Array(Set(1,2), Set(2,3), Set(1))
graph: Array[scala.collection.immutable.Set[Int]] = Array(Set(1, 2), Set(2, 3), Set(1))
scala> graph(1) += 4
scala> graph
res0: Array[scala.collection.immutable.Set[Int]] = Array(Set(1, 2), Set(2, 3, 4), Set(1))
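One thing to watch for in the code from the question: new Array[Set[Int]](n) creates an array whose slots are all null, so graph(i) += j would throw a NullPointerException until each slot holds a Set. A minimal sketch of one way around that is below. The buildGraph helper and the sample edges are made up for illustration.

```scala
// Hypothetical helper: builds the adjacency sets from a list of (from, to) edges.
def buildGraph(n: Int, edges: Seq[(Int, Int)]): Array[Set[Int]] = {
  // Start every slot with an empty immutable Set instead of null.
  val graph = Array.fill(n)(Set.empty[Int])
  edges.foreach { case (i, j) =>
    // What graph(i) += j expands to for an immutable Set.
    graph(i) = graph(i) + j
  }
  graph
}

val g = buildGraph(3, Seq((0, 1), (0, 2), (2, 0)))
// g(0) is Set(1, 2), g(1) is empty, g(2) is Set(0)
```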

2d scala array iteration

I have a 2d array of type boolean (not important)
It is easy to iterate over the array in non-functional style.
How to do it FP style?
var matrix = Array.ofDim[Boolean](5, 5)
For example, I would like to iterate through all the rows for a given column and return a list of Int for the rows that match a specific function.
Example: for column 3, iterate through rows 1 to 5 and return 4, 5 if the cells at (4, 3) and (5, 3) match a specific function. Thanks very much.
def getChildren(nodeId: Int) : List[Int] = {
info("getChildren("+nodeId+")")
var list = List[Int]()
val nodeIndex = id2indexMap(nodeId)
for (rowIndex <- 0 until matrix.size) {
val elem = matrix(rowIndex)(nodeIndex)
if (elem) {
println("Row Index = " + rowIndex)
list = rowIndex :: list
}
}
list
}
What about
(0 until 5) filter {i => predicate(matrix(i)(3))}
where predicate is your function?
Note that an array initialized with (5, 5) has indices running from 0 to 4, not 1 to 5.
Update: based on your example
def getChildren(nodeId: Int) : List[Int] = {
info("getChildren("+nodeId+")")
val nodeIndex = id2indexMap(nodeId)
val result = (0 until matrix.size).filter(matrix(_)(nodeIndex)).toList
result.foreach(println)
result
}
You may move the print into the filter if you want, and reverse the list if you want it exactly as in your example.
If you're not comfortable with filters and zips, you can stick with the for-comprehension but use it in a more functional way:
for {
rowIndex <- matrix.indices
if matrix(rowIndex)(nodeIndex)
} yield {
println("Row Index = " + rowIndex)
rowIndex
}
yield builds a new collection from the results of the for-comprehension, so this expression evaluates to the collection you want to return. seq.indices is a method equivalent to 0 until seq.size. The curly braces allow you to span multiple lines without semicolons, but you can make it in-line if you want:
for (rowIndex <- matrix.indices; if matrix(rowIndex)(nodeIndex)) yield rowIndex
Should probably also mention that normally if you're iterating through an Array you won't need to refer to the indices at all. You'd do something like
for {
row <- matrix
elem <- row
} yield f(elem)
but your use-case is a bit unusual in that it requires the indices of the elements, which you shouldn't normally be concerned with (using array indices is essentially a quick and dirty hack to pair a data element with a number). If you want to capture and use the notion of position you might be better off using a Map[Int, Boolean] or a case class with such a field.
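To make the column scan concrete, here is a self-contained sketch of the for-comprehension version on a small made-up matrix. The helper name rowsWithTrue and the sample data are assumptions for illustration, and the logging from the original getChildren is left out.

```scala
// Assumed sample data: a 3x3 matrix where column 1 is true in rows 0 and 2.
val sampleMatrix: Array[Array[Boolean]] = Array(
  Array(false, true,  false),
  Array(false, false, false),
  Array(true,  true,  false)
)

// Hypothetical helper: indices of the rows whose cell in `column` is true,
// in ascending row order.
def rowsWithTrue(m: Array[Array[Boolean]], column: Int): List[Int] =
  (for (rowIndex <- m.indices if m(rowIndex)(column)) yield rowIndex).toList

val children = rowsWithTrue(sampleMatrix, 1) // List(0, 2)
```

Unlike the original loop, which prepends to the list, this keeps the row indices in ascending order.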
def findIndices[A](aa: Array[Array[A]], pred: A => Boolean): Array[Array[Int]] =
aa.map(row =>
row.zipWithIndex.collect{
case (v,i) if pred(v) => i
}
)
You can refactor it to be a bit nicer by extracting a function that finds the indices in a single row only:
def findIndices2[A](xs: Array[A], pred: A => Boolean): Array[Int] =
xs.zipWithIndex.collect{
case (v,i) if pred(v) => i
}
And then write
matrix.map(row => findIndices2(row, pred))
