Convert IndexedSeq to Array

I have a 2D Array, say a 3x3 Matrix, and want to multiply all elements with a given value. This is what I have so far:
val m = Array(
  Array(1, 2, 3),
  Array(4, 5, 6),
  Array(7, 8, 9)
)
val value = 3
for (i <- m.indices) yield m(i).map(x => x * value)
//Result:
//IndexedSeq[Array[Int]] = Vector(Array(3, 6, 9), Array(12, 15, 18), Array(21, 24, 27))
The problem is that I now have an IndexedSeq[Array[Int]], but I need it to be an Array[Array[Int]], just like val m.
I know that, for example, for (i <- Array(1, 2, 3)) yield i results in an Array[Int], but I can't figure out how to put this all together.
Just appending .toArray doesn't work either.

If you want to create new Array instances (copy):
val m = Array(
  Array(1, 2, 3),
  Array(4, 5, 6),
  Array(7, 8, 9)
)
val value = 3
val newM = m.map { array => array.map { x => x * value } }
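For the 3x3 example above, this produces the scaled copy (a quick illustration, not from the original answer):
newM.foreach(row => println(row.mkString(", ")))
// 3, 6, 9
// 12, 15, 18
// 21, 24, 27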
Or, if you want to modify the original arrays "in-place":
val m = Array(
  Array(1, 2, 3),
  Array(4, 5, 6),
  Array(7, 8, 9)
)
val value = 3
for (arr <- m; j <- arr.indices) {
  arr(j) *= value
}
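As an aside (my own sketch, not part of the original answers): the for-comprehension result from the question can also be converted back, as long as .toArray is applied to the whole yielded sequence rather than to the inner arrays:
val converted: Array[Array[Int]] = (for (i <- m.indices) yield m(i).map(x => x * value)).toArray
// converted: Array(Array(3, 6, 9), Array(12, 15, 18), Array(21, 24, 27)) for the original, unscaled m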

Related

How do I get an array of arrays with elements like this? Is there a built-in Scala API that can produce this value (without using combinations)?
e.g.
val inp = Array(1,2,3,4)
Output
Vector(
Vector((1,2), (1,3), (1,4)),
Vector((2,3), (2,4)),
Vector((3,4))
)
My attempt is below. I feel that there should be a more elegant answer than this in Scala.
val inp = Array(1, 2, 3, 4)
val mp = (0 until inp.length - 1).map { x =>
  (x + 1 until inp.length).map { y =>
    (inp(x), inp(y))
  }
}
print(mp)
Edit: Added the combination constraint.
Using combinations(2) and groupBy() on the first element (element 0) of each combination will give you the values and structure you want. Getting the result as a Vector[Vector[...]] will require some conversion using toVector.
scala> inp.combinations(2).toList.groupBy(a => a(0)).values
res11: Iterable[List[Array[Int]]] = MapLike.DefaultValuesIterable
(
List(Array(2, 3), Array(2, 4)),
List(Array(1, 2), Array(1, 3), Array(1, 4)),
List(Array(3, 4))
)
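One possible way (my sketch, assuming the groups should be ordered by their first element) to finish the conversion into the requested Vector[Vector[(Int, Int)]] shape:
val pairs: Vector[Vector[(Int, Int)]] =
  inp.combinations(2).toList
    .groupBy(a => a(0))
    .toVector
    .sortBy(_._1)  // keep groups ordered by their first element
    .map { case (_, group) => group.map(a => (a(0), a(1))).toVector }
// pairs: Vector(Vector((1,2), (1,3), (1,4)), Vector((2,3), (2,4)), Vector((3,4)))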
ORIGINAL ANSWER
Note: This answer is OK only if the elements in the Seq are unique and sorted (according to <). See the edit for the more general case.
With
val v = a.toVector
and by forgoing combinations, I can choose tuples instead and not have to convert at the end
for (i <- v.init) yield { for (j <- v if i < j) yield (i, j) }
or
v.init.map(i => v.filter(i < _).map((i, _)))
I'm not sure if there's a performance hit for using init on a Vector.
EDIT
For non-unique elements, we can use the indices
val v = a.toVector.zipWithIndex
for ((i, idx) <- v.init) yield { for ((j, jdx) <- v if idx < jdx) yield (i, j) }
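For example, with duplicate elements (my illustration, not from the original answer):
val a = Seq(1, 2, 2, 3)
val v = a.toVector.zipWithIndex
val pairs = for ((i, idx) <- v.init) yield { for ((j, jdx) <- v if idx < jdx) yield (i, j) }
// pairs: Vector(Vector((1,2), (1,2), (1,3)), Vector((2,2), (2,3)), Vector((2,3)))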

how to insert element to rdd array in spark

Hi, I've tried to insert an element into an RDD of Array[String] using Scala in Spark.
Here is an example.
val data: RDD[Array[String]] = Array(Array(1,2,3), Array(1,2,3,4), Array(1,2))
I want to make all the arrays in this data have length 4.
If the length of an array is less than 4, I want to fill it with the value NULL.
Here is the code I tried:
val newData = data.map(x =>
  if (x.length < 4) {
    for (i <- x.length until 4) {
      x.union("NULL")
    }
  }
  else {
    x
  }
)
But the result is Array[Any] = Array((), Array(1, 2, 3, 4), ()).
So I tried another way: I used yield in the for loop.
val newData = data.map(x =>
  if (x.length < 4) {
    for (i <- x.length until 4) yield {
      x.union("NULL")
    }
  }
  else {
    x
  }
)
The result is Array[Object] = Array(Vector(Array(1, 2, 3, N, U, L, L)), Array(1, 2, 3, 4), Vector(Array(1, 2, N, U, L, L), Array(1, 2, N, U, L, L)))
These are not what I want. I want to return something like this:
RDD[Array[String]] = Array(Array(1,2,3,NULL), Array(1,2,3,4), Array(1,2,NULL,NULL)).
What should I do?
Is there a method to solve it?
union is a functional operation; it doesn't change the array x. You don't need to do this with a loop, though, and any loop implementation will probably be slower -- it's much better to create one new collection with all the NULL values instead of mutating something every time you add a null. Here's a function that should work for you:
def fillNull(x: Array[Int], desiredLength: Int): Array[String] = {
  x.map(_.toString) ++ Array.fill(desiredLength - x.length)("NULL")
}
val newData = data.map(fillNull(_, 4))
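A quick local check of fillNull outside Spark, using the example rows (my illustration, not from the original answer):
val rows = Array(Array(1, 2, 3), Array(1, 2, 3, 4), Array(1, 2))
rows.map(fillNull(_, 4)).foreach(r => println(r.mkString(",")))
// 1,2,3,NULL
// 1,2,3,4
// 1,2,NULL,NULL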
I solved your use case with the following code:
val initialRDD = sparkContext.parallelize(Array(Array[Any](1, 2, 3), Array[Any](1, 2, 3, 4), Array[Any](1, 2, 3)))
val transformedRDD = initialRDD.map(array =>
  if (array.length < 4) {
    // start from an array of four "NULL"s, then copy the existing elements over it
    val transformedArray = Array.fill[Any](4)("NULL")
    Array.copy(array, 0, transformedArray, 0, array.length)
    transformedArray
  } else {
    array
  }
)
val result = transformedRDD.collect()
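An alternative sketch (not from the original answers, assuming the rows hold numbers as in the example): padTo pads each array to the desired length in a single call.
val padded = data.map(x => x.map(_.toString).padTo(4, "NULL"))
// For the example data: Array(Array(1,2,3,NULL), Array(1,2,3,4), Array(1,2,NULL,NULL))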

Scala: iteration 2d array to do operation

A newbie here.
val arr_one = Array(Array(1, 2), Array(3, 4), Array(5, 6), Array(x, y), ... and so on)
val arr_two = Array(Array(2,3), Array(4, 5), Array(6, 7))
var tempArr = ArrayBuffer[Double]()
I want to multiply arr_one and arr_two, for example:
Iteration 1: Array(1*2+2*3, 1*4+2*5, 1*6+2*7) assigned to tempArr
Iteration 2: Array(3*2+4*3, 3*4+4*5, 3*6+4*7) assigned to tempArr
Iteration 3: Array(5*2+6*3, 5*4+6*5, 5*6+6*7) assigned to tempArr
I know that if
val x = Array(1, 2); val y = Array(Array(2, 3), Array(4, 5), Array(6, 7))
I can use y map { x zip _ map { case (a, b) => a * b } sum }
But if x has the form of arr_one, I don't know how to use a for loop or something else to do that.
I really have no idea.
How can I do this in Scala?
Thanks.
I believe this does what you need, without any mutable state and "iterations". It uses the for-comprehension syntax, which is a kind of non-imperative for loop: instead of changing state in each iteration, it returns a value that is the sequence of results per "iteration":
val result: Array[Array[Int]] = for (arr1 <- arr_one) yield {
  for (arr2 <- arr_two) yield multArrays(arr1, arr2)
}
Assuming that multArrays has the following signature:
def multArrays(arr1: Array[Int], arr2: Array[Int]): Int
That calculates the value for each cell. A naive implementation (assuming arrays have size 2) would be:
def multArrays(arr1: Array[Int], arr2: Array[Int]): Int = {
  arr1(0) * arr2(0) + arr1(1) * arr2(1)
}
But of course this can be generalized to arrays of any size.
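A possible generalization (my sketch, not from the original answer) that pairs elements of any matching length and sums the products:
def multArrays(arr1: Array[Int], arr2: Array[Int]): Int =
  (arr1 zip arr2).map { case (a, b) => a * b }.sum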
Maybe this is what you need:
val tmp = arr_one.map(arr1 => arr_two.map(arr2 => (arr1 zip arr2).map { case (a, b) => a * b }.reduce(_ + _)))
And to get an ArrayBuffer, simply use:
tmpArr = tmp.toBuffer
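For the example arrays in the question, tmp works out to Array(Array(8, 14, 20), Array(18, 32, 46), Array(28, 50, 72)), matching the three iterations listed above (a quick check, not from the original answer).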

Multidimensional Array zip array in scala

I have two arrays like:
val one = Array(1, 2, 3, 4)
val two = Array(4, 5, 6, 7)
var three = one zip two map{case(a, b) => a * b}
It's ok.
But I have a multidimensional Array and a one-dimensional array now, like this:
val mulOne = Array(Array(1, 2, 3, 4),Array(5, 6, 7, 8),Array(9, 10, 11, 12))
val one_arr = Array(1, 2, 3, 4)
I would like to multiply them. How can I do this in Scala?
You could use:
val tmpArr = mulOne.map(_ zip one_arr).map(_.map{case(a,b) => a*b})
This would give you Array(Array(1*1, 2*2, 3*3, 4*4), Array(5*1, 6*2, 7*3, 8*4), Array(9*1, 10*2, 11*3, 12*4)).
Here mulOne.map(_ zip one_arr) zips each internal array of mulOne with the corresponding elements of one_arr to create pairs like: Array(Array((1,1), (2,2), (3,3), (4,4)), ..) (note: I have used placeholder syntax). In the next step, .map(_.map{case(a,b) => a*b}) multiplies the elements of each pair to give output like: Array(Array(1, 4, 9, 16), ..)
Then you could use:
tmpArr.map(_.reduce(_ + _))
to get the sum of each internal array, giving Array(30, 70, 110).
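Putting the two steps together as one pipeline (a sketch using the example arrays from the question):
val mulOne = Array(Array(1, 2, 3, 4), Array(5, 6, 7, 8), Array(9, 10, 11, 12))
val one_arr = Array(1, 2, 3, 4)
val sums = mulOne.map(_ zip one_arr).map(_.map { case (a, b) => a * b }.sum)
// sums: Array(30, 70, 110)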
Try this
mulOne.map{x => (x, one_arr)}.map { case(one, two) => one zip two map{case(a, b) => a * b} }
mulOne.map{x => (x, one_arr)} => for every array inside mulOne, create a tuple with the content of one_arr.
.map { case(one, two) => one zip two map{case(a, b) => a * b} } is basically the operation that you performed in your first example, applied to every tuple that was created in the first step.
Using a for comprehension like this,
val prods =
  for { xs <- mulOne
        zx = one_arr zip xs
        (a, b) <- zx
  } yield a * b
and so
prods.sum
delivers the final result.

Scala logical indexing with for comprehension

I'm trying to translate the following Matlab logical-indexing pattern into Scala code:
% x is an [Nx1] array of Int32
% y is an [Nx1] array of Int32
% myExpensiveFunction() processes batches of unique x.
ux = unique(x);
z = nan(size(x));
for i = 1:length(ux)
    idx = x == ux(i);
    z(idx) = myExpensiveFunction(x(idx), y(idx));
end
Assume I'm working with val x: Array[Int] in Scala. What is the best way to do this?
Edit: To clarify, I'm looking to process batches of (x,y) at a time, grouped by unique x, and return a result (z) with an order corresponding to the initial input. I'm open to sorting x, but eventually need to get back to the original unsorted order. My primary requirement is to handle all the indexing/mapping/sorting in a clear and reasonably efficient way.
Most of this is pretty straightforward in Scala; the only thing that's a bit out of the ordinary is the unique x indices. In Scala you'd do that with a `groupBy`. Since this is a really index-heavy method, I'm just going to give in and go with indices all the way:
val z = Array.fill(x.length)(Double.NaN)
x.indices.groupBy(i => x(i)).foreach{ case (xi, is) =>
  is.foreach(i => z(i) = myExpensiveFunction(xi, y(i)))
}
z
assuming you can live with a lack of vectors going to myExpensiveFunction. If not,
val z = Array.fill(x.length)(Double.NaN)
x.indices.groupBy(i => x(i)).foreach{ case (xi, is) =>
  val xs = Array.fill(is.length)(xi)
  val ys = is.map(i => y(i)).toArray
  val zs = myExpensiveFunction(xs, ys)
  // pair each original index with its corresponding result (zs is indexed per group, not by i)
  (is zip zs).foreach{ case (i, zi) => z(i) = zi }
}
z
This isn't the most natural way to do the computation in Scala, or the most efficient, but you don't care about efficiency if your expensive function is expensive, and it's the closest I can come to a literal translation.
(Translating your Matlab algorithms into almost anything else involves a certain amount of pain or rethinking, since the "natural" computations in Matlab are not like those in most other languages.)
The important point is to get Matlab's unique right. A simple solution would be to use a Set to determine the unique values:
val occurringValues = x.toSet
val z = Array.fill(x.length)(Double.NaN) // pre-allocate z, as in the Matlab code
occurringValues.foreach{ value =>
  val indices = x.indices.filter(i => x(i) == value)
  for (i <- indices) {
    z(i) = myExpensiveFunction(x(i), y(i))
  }
}
Note: I assume that it is possible to change myExpensiveFunction to an element-wise operation...
scala> def process(xs: Array[Int], ys: Array[Int], f: (Seq[Int], Seq[Int]) => Double): Array[Double] = {
| val ux = xs.distinct
| val zs = Array.fill(xs.size)(Double.NaN)
| for(x <- ux) {
| val idx = xs.indices.filter{ i => xs(i) == x }
| val res = f(idx.map(xs), idx.map(ys))
| idx foreach { i => zs(i) = res }
| }
| zs
| }
process: (xs: Array[Int], ys: Array[Int], f: (Seq[Int], Seq[Int]) => Double)Array[Double]
scala> val xs = Array(1,2,1,2,3)
xs: Array[Int] = Array(1, 2, 1, 2, 3)
scala> val ys = Array(1,2,3,4,5)
ys: Array[Int] = Array(1, 2, 3, 4, 5)
scala> val f = (a: Seq[Int], b: Seq[Int]) => a.sum/b.sum.toDouble
f: (Seq[Int], Seq[Int]) => Double = <function2>
scala> process(xs, ys, f)
res0: Array[Double] = Array(0.5, 0.6666666666666666, 0.5, 0.6666666666666666, 0.6)
