Merge two sorted arrays, List vs Array

This function merges two sorted lists. It takes two lists as a parameter and returns one.
def merge(xs : List[Int], ys : List[Int]) : List[Int] = {
(xs, ys) match {
case (Nil, Nil) => Nil
case (Nil, ys) => ys
case (xs, Nil) => xs
case (x :: xs1, y :: ys1) =>
if(x < y) x :: merge(xs1, ys)
else y :: merge(xs, ys1)
}
}
I wanted to rewrite this function by changing the parameter type from List to Array, and it did not work. However, changing List to Seq did work. Could you tell me what goes wrong with Array?
def mergeDontWork(xs : Array[Int], ys : Array[Int]) : Array[Int] = {
(xs, ys) match {
case (Array.empty,Array.empty) => Array.empty
case (Array.empty, ys) => ys
case (xs, Array.empty) => xs
case (x +: xs1, y +: ys1) => if(x < y) x +: merge2(xs1, ys)
else y +: merge2(xs, ys1)
}
}
The error comes from this part of the code: if(x < y) x +: merge2(xs1, ys). The compiler reports that Array[Any] does not conform to the expected type Array[Int].
EDIT
I finally understood how to go from List to Array thanks to the solutions proposed by pedromss and Harald. I modified the function by making it tail recursive.
def mergeTailRecursion(xs : Array[Int], ys : Array[Int]) : Array[Int] ={
def recurse( acc:Array[Int],xs:Array[Int],ys:Array[Int]):Array[Int]={
(xs, ys) match {
case (Array(),Array()) => acc
case (Array(), ys) => acc++ys
case (xs, Array()) => acc++xs
case (a@Array(x, _*), b@Array(y, _*)) =>
if (x < y) recurse(acc:+x, a.tail, b)
else recurse( acc:+y, a, b.tail)
}
}
recurse(Array(),xs,ys)
}

You can't pattern match on Array.empty because it is a method. Use Array() instead.
(x +: xs1, y +: ys1) doesn't appear to be a valid match expression. Change it to (xs1@Array(x, _*), ys1@Array(y, _*))
Compiling version of your code:
object Arrays extends App {
def merge(xs: Array[Int], ys: Array[Int]): Array[Int] = {
(xs, ys) match {
case (Array(), Array()) => Array.empty
case (Array(), ys2) => ys2
case (xs2, Array()) => xs2
case (xs1@Array(x, _*), ys1@Array(y, _*)) =>
if (x < y) x +: merge(xs1.tail, ys)
else y +: merge(xs, ys1.tail)
}
}
merge(Array(1, 2, 3), Array(4, 5, 6)).foreach(println)
}
Refer to "Why can't I pattern match on Stream.empty in Scala?" for the explanation about pattern matching on methods.
And to "How do I pattern match arrays in Scala?" for the explanation about the _*. Basically it will match any number of arguments.
Lastly, about the xs1@, from the documentation (https://www.scala-lang.org/files/archive/spec/2.11/08-pattern-matching.html):
Pattern Binders
Pattern2 ::= varid `@' Pattern3
A pattern binder x@p consists of a pattern variable x and a pattern p. The type of the variable x is the static type T of the pattern p. This pattern matches any value v matched by the pattern p, provided the run-time type of v is also an instance of T, and it binds the variable name to that value.
You could also do it with
case (Array(x, _*), Array(y, _*)) =>
if (x < y) x +: merge(xs.tail, ys)
else y +: merge(xs, ys.tail)
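To make the binder concrete, here is a minimal sketch (the names BinderDemo and describe are made up for illustration and are not from the answers above). It shows that whole @ Array(first, _*) binds first to the head element while whole still refers to the entire array, which is why whole.tail or whole.length can be used on the right-hand side:

```scala
object BinderDemo {
  // Hypothetical helper: reports the head and the size of a non-empty array.
  def describe(arr: Array[Int]): String = arr match {
    // Array() is an extractor pattern that matches only the empty array.
    case Array() => "empty"
    // `whole` is bound to the full array, `first` to its first element;
    // `_*` matches any number of remaining elements without binding them.
    case whole @ Array(first, _*) => s"first=$first, size=${whole.length}"
  }
}
```

With this definition, BinderDemo.describe(Array(7, 8, 9)) returns "first=7, size=3".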

The unapply method of +: in the last pattern matching case doesn't seem to resolve to the right types Int and Array[Int].
You could try something like this:
case (Array(x, xs1 @ _*), Array(y, ys1 @ _*)) =>
if (x<y) x +: mergeDontWork(xs.tail, ys)
else y +: mergeDontWork(xs, ys.tail)
Unfortunately the construct xs1 @ _* results in xs1 being of type Seq[Int], so it can't be passed directly into the recursive call. I used xs.tail as a workaround.
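If pattern matching itself is not the goal, an index-based loop sidesteps the typing problem entirely. Note that this is a different technique from the recursive versions above, not a correction of them: tail and +: on an Array each build a new array, so the recursive merge copies a lot on long inputs. A minimal sketch, assuming both inputs are already sorted (IterativeMerge is a hypothetical name):

```scala
object IterativeMerge {
  // Merges two sorted arrays into a new sorted array in a single pass.
  def merge(xs: Array[Int], ys: Array[Int]): Array[Int] = {
    val out = new Array[Int](xs.length + ys.length)
    var i = 0 // next unread position in xs
    var j = 0 // next unread position in ys
    var k = 0 // next free position in out
    while (i < xs.length && j < ys.length) {
      if (xs(i) < ys(j)) { out(k) = xs(i); i += 1 }
      else { out(k) = ys(j); j += 1 }
      k += 1
    }
    // At most one of the inputs still has elements; copy the leftovers.
    System.arraycopy(xs, i, out, k, xs.length - i)
    k += xs.length - i
    System.arraycopy(ys, j, out, k, ys.length - j)
    out
  }
}
```

For example, IterativeMerge.merge(Array(1, 3, 5), Array(2, 4, 6)) yields an array containing 1, 2, 3, 4, 5, 6.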

Related

Spark sequences [int] comparison with [String] sequence output

I was trying to compare integer wrapped arrays in two different columns and give the ratings as string:
import org.apache.spark.sql.Row
import org.apache.spark.sql.functions._
import scala.collection.mutable.WrappedArray
The DataFrame data has columns A and B, each holding a wrapped array, which I would like to compare:
val data = Seq(
(Seq(1,2,3),Seq(4,5,6),Seq(7,8,9)),
(Seq(1,1,3),Seq(6,5,7),Seq(11,9,8))
).toDF("A","B","C")
And here is what it looks like:
data: org.apache.spark.sql.DataFrame = [A: array<int>, B: array<int> ... 1 more field]
+---------+---------+----------+
| A| B| C|
+---------+---------+----------+
|[1, 2, 3]|[4, 5, 6]| [7, 8, 9]|
|[1, 1, 3]|[6, 5, 7]|[11, 9, 8]|
+---------+---------+----------+
Here is the user-defined function with which I would like to compare the paired elements of the arrays in columns A and B, row by row, and assign ratings using simple logic. For example, if A(1) > B(1) then D(1) is "Top". So for the first row of column D, I hope to have ["Top", "Top", "Top"].
def myToChar(num1: Seq[Int], num2: Seq[Int]): Seq[String] = {
val twozipped = num1.zip(num2)
for ((x,y) <- num1.zip(num2)) {
if (x > y) "Top"
if (x < y) "Well"
if (x == y) "Good"
}}
val udfToChar = udf(myToChar(_: Seq[Int], _: Seq[Int]))
val ouput = data.withColumn("D",udfToChar($"A",$"B"))
However, I kept getting the error "<console>:45: error: type mismatch;". I'm not sure whether my udf() type definition is wrong, and I would appreciate any guidance to correct my mistake.
Your myToChar definition is declared to return a Seq[String] - but its implementation doesn't - it returns Unit, because a for expression (without a yield clause) has Unit type.
You can fix this by fixing the implementation of the function:
Replace the for with a map operation
Replace the last if with an else, otherwise the mapping function also returns Unit for inputs that adhere to none of the if conditions (unlike with pattern matching, the compiler can't conclude that your if conditions are exhaustive - it must assume there's also a possibility none of them would hold true)
So - a correct implementation would be:
def myToChar(num1: Seq[Int], num2: Seq[Int]): Seq[String] = {
num1.zip(num2).map { case (x, y) =>
if (x > y) "Top"
else if (x < y) "Well"
else "Good"
}
}
Or alternatively using pattern matching with guards:
def myToChar(num1: Seq[Int], num2: Seq[Int]): Seq[String] = {
num1.zip(num2).map {
case (x, y) if x > y => "Top"
case (x, y) if x < y => "Well"
case _ => "Good"
}
}
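The point about for without yield can be checked without Spark. The sketch below uses plain Scala collections with made-up input values (ForYieldDemo and its members are hypothetical names); it only illustrates the typing difference and is not the UDF itself:

```scala
object ForYieldDemo {
  // Made-up sample data: pairs (1,4), (2,2), (3,1).
  val pairs: Seq[(Int, Int)] = Seq(1, 2, 3).zip(Seq(4, 2, 1))

  // Without `yield` the loop is translated to foreach, so the strings
  // computed in the body are discarded and the whole expression is Unit.
  val noYield: Unit = for ((x, y) <- pairs) {
    if (x > y) "Top" else if (x < y) "Well" else "Good"
  }

  // With `yield` the loop is translated to map and produces one
  // String per pair.
  val withYield: Seq[String] = for ((x, y) <- pairs) yield {
    if (x > y) "Top" else if (x < y) "Well" else "Good"
  }
}
```

Here ForYieldDemo.withYield is Seq("Well", "Good", "Top"), while ForYieldDemo.noYield carries no result at all.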

For comprehension over Option array

I am getting compilation error:
Error:(64, 9) type mismatch;
found : Array[(String, String)]
required: Option[?]
y <- x
^
in a fragment:
val z = Some(Array("a"->"b", "c" -> "d"))
val l = for(
x <- z;
y <- x
) yield y
Why does a generator over an Array not produce the items of the array? And where does the requirement for an Option come from?
Even more puzzling, if I replace "yield y" with println(y), it does compile.
Scala version: 2.10.6
This is because of the way for expressions are translated into map, flatMap and foreach expressions. Let's first simplify your example:
val someArray: Some[Array[Int]] = Some(Array(1, 2, 3))
val l = for {
array: Array[Int] <- someArray
number: Int <- array
} yield number
In accordance with the relevant part of the Scala language specification, this first gets translated into
someArray.flatMap {case array => for (number <- array) yield number}
which in turn gets translated into
someArray.flatMap {case array => array.map{case number => number}}
The problem is that someArray.flatMap expects a function from Array[Int] to Option[B] for some type B, whereas we've provided a function from Array[Int] to Array[Int].
The reason the compilation error goes away if yield number is replaced by println(number) is that for loops are translated differently from for comprehensions: it will now be translated as someArray.foreach{case array => array.foreach {case item => println(item)}}, which doesn't have the same typing issues.
A possible solution is to begin by converting the Option to the kind of collection you want to end up with, so that its flatMap method will have the right signature:
val l = for {
array: Array[Int] <- someArray.toArray
number: Int <- array
} yield number
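As a sanity check of the translation described above, here is a small sketch that writes the comprehension and its hand-desugared form side by side. It converts to List rather than Array purely to keep the example simple (that choice, and the names ForDesugar, viaFor and viaFlatMap, are assumptions of this sketch):

```scala
object ForDesugar {
  val someArray: Option[Array[Int]] = Some(Array(1, 2, 3))

  // Converting the Option to a List first means the outer flatMap is
  // List.flatMap, which accepts a function returning a List.
  val viaFor: List[Int] = for {
    array  <- someArray.toList
    number <- array.toList
  } yield number

  // The same computation written the way the compiler translates it:
  // the outer generator becomes flatMap, the last one becomes map.
  val viaFlatMap: List[Int] =
    someArray.toList.flatMap(array => array.toList.map(number => number))
}
```

Both values evaluate to List(1, 2, 3).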
It's the usual "option must be converted to mix monads" thing.
scala> for (x <- Option.option2Iterable(Some(List(1,2,3))); y <- x) yield y
res0: Iterable[Int] = List(1, 2, 3)
Compare
scala> for (x <- Some(List(1,2,3)); y <- x) yield y
<console>:12: error: type mismatch;
found : List[Int]
required: Option[?]
for (x <- Some(List(1,2,3)); y <- x) yield y
^
to
scala> Some(List(1,2,3)) flatMap (is => is map (i => i))
<console>:12: error: type mismatch;
found : List[Int]
required: Option[?]
Some(List(1,2,3)) flatMap (is => is map (i => i))
^
or
scala> for (x <- Some(List(1,2,3)).toSeq; y <- x) yield y
res3: Seq[Int] = List(1, 2, 3)

Scala logical indexing with for comprehension

I'm trying to translate the following Matlab logical-indexing pattern into Scala code:
% x is an [Nx1] array of Int32
% y is an [Nx1] array of Int32
% myExpensiveFunction() processes batches of unique x.
ux = unique(x);
z = nan(size(x));
for i = 1:length(ux)
idx = x == ux(i);
z(idx) = myExpensiveFunction(x(idx), y(idx));
end
Assume I'm working with val x: Array[Int] in Scala. What is the best way to do this?
Edit: To clarify, I'm looking to process batches of (x,y) at a time, grouped by unique x, and return a result (z) with an order corresponding to the initial input. I'm open to sorting x, but eventually need to get back to the original unsorted order. My primary requirement is to handle all the indexing/mapping/sorting in a clear and reasonably efficient way.
Most of this is pretty straightforward in Scala; the only thing that's a bit out of the ordinary is the unique x indices. In Scala you'd do that with a groupBy. Since this is a really index-heavy method, I'm just going to give in and go with indices all the way:
val z = Array.fill(x.length)(Double.NaN)
x.indices.groupBy(i => x(i)).foreach{ case (xi, is) =>
is.foreach(i => z(i) = myExpensiveFunction(xi, y(i)))
}
z
This assumes you can live with myExpensiveFunction receiving single elements rather than vectors. If not:
val z = Array.fill(x.length)(Double.NaN)
x.indices.groupBy(i => x(i)).foreach{ case (xi, is) =>
val xs = Array.fill(is.length)(xi)
val ys = is.map(i => y(i)).toArray
val zs = myExpensiveFunction(xs, ys)
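// zs is ordered like is, so pair each original index i with its position k in the batch
is.zipWithIndex.foreach { case (i, k) => z(i) = zs(k) }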
}
z
This isn't the most natural way to do the computation in Scala, or the most efficient, but you don't care about efficiency if your expensive function is expensive, and it's the closest I can come to a literal translation.
(Translating your Matlab algorithms into almost anything else involves a certain amount of pain or rethinking, since the "natural" computations in Matlab are not like those in most other languages.)
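For a version you can paste and run, here is a self-contained sketch of the batch variant. The real myExpensiveFunction is not shown in the question, so batchFn below is a made-up stand-in (each y divided by the sum of y within its batch), and GroupedBatch and process are hypothetical names:

```scala
object GroupedBatch {
  // Hypothetical stand-in for the expensive batch function: returns one
  // result per input element, each y divided by the batch total of ys.
  def batchFn(xs: Array[Int], ys: Array[Int]): Array[Double] = {
    val total = ys.sum.toDouble
    ys.map(_ / total)
  }

  def process(x: Array[Int], y: Array[Int]): Array[Double] = {
    val z = Array.fill(x.length)(Double.NaN)
    // Group the original indices by the value of x at that index.
    x.indices.groupBy(i => x(i)).foreach { case (_, is) =>
      val zs = batchFn(is.map(i => x(i)).toArray, is.map(i => y(i)).toArray)
      // zs(k) belongs to original index is(k); write it back in place.
      is.zipWithIndex.foreach { case (i, k) => z(i) = zs(k) }
    }
    z
  }
}
```

With x = Array(1, 2, 1, 2, 3) and y = Array(1, 2, 3, 4, 5), positions 0 and 2 form one batch and receive 0.25 and 0.75, and position 4 is a batch of its own and receives 1.0, so the results come back in the original order.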
The important point is to get Matlab's unique right. A simple solution would be to use a Set to determine the unique values:
val occurringValues = x.toSet
occurringValues.foreach{ value =>
val indices = x.indices.filter(i => x(i) == value)
for (i <- indices) {
z(i) = myExpensiveFunction(x(i), y(i))
}
}
Note: I assume that it is possible to change myExpensiveFunction to an element-wise operation...
scala> def process(xs: Array[Int], ys: Array[Int], f: (Seq[Int], Seq[Int]) => Double): Array[Double] = {
| val ux = xs.distinct
| val zs = Array.fill(xs.size)(Double.NaN)
| for(x <- ux) {
| val idx = xs.indices.filter{ i => xs(i) == x }
| val res = f(idx.map(xs), idx.map(ys))
| idx foreach { i => zs(i) = res }
| }
| zs
| }
process: (xs: Array[Int], ys: Array[Int], f: (Seq[Int], Seq[Int]) => Double)Array[Double]
scala> val xs = Array(1,2,1,2,3)
xs: Array[Int] = Array(1, 2, 1, 2, 3)
scala> val ys = Array(1,2,3,4,5)
ys: Array[Int] = Array(1, 2, 3, 4, 5)
scala> val f = (a: Seq[Int], b: Seq[Int]) => a.sum/b.sum.toDouble
f: (Seq[Int], Seq[Int]) => Double = <function2>
scala> process(xs, ys, f)
res0: Array[Double] = Array(0.5, 0.6666666666666666, 0.5, 0.6666666666666666, 0.6)

Swapping values in an array using Pattern Matching in Scala

I am trying to solve the following problem from Scala for the Impatient. The question is as follows:
Using pattern matching, write a function swap that swaps the first two elements of an array provided its length is at least two.
My solution is:
def swap(sArr:Array[Int]) = sArr.splitAt(2) match {
case (Array(x,y),Array(z)) => Array(y,x,z)
case (Array(x,y),Array()) => Array(y,x)
case _ => sArr
}
My problem is with the first case statement. I think it would pattern-match something like (Array(1,2),Array(3)) whereas I intend it to pattern-match (Array(1,2),Array(3,4,5.....))
Can somebody point out how that would be possible?
Thanks
The problem with your code is that Array(z) means "match a one-element array". What you want is for z to be the whole array, no matter how many elements:
def swap(sArr: Array[Int]) =
sArr.splitAt(2) match {
case (Array(x, y), z) => Array(y, x) ++ z
case _ => sArr
}
However, I would write it with the sequence-matching syntax _* so that you don't have to manually split the array:
def f(a: Array[Int]) =
a match {
case Array(x, y, z @ _*) => Array(y, x) ++ z
case _ => a
}
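The same pattern should carry over to element types other than Int. The sketch below is a generic variant, which goes beyond what the exercise asks for; SwapDemo is a hypothetical name, and the ClassTag context bound is there because building a new Array[T] needs the element class at run time:

```scala
import scala.reflect.ClassTag

object SwapDemo {
  // Swaps the first two elements if there are at least two; otherwise
  // returns the input array unchanged.
  def swap[T: ClassTag](a: Array[T]): Array[T] = a match {
    // `rest @ _*` binds all elements after the second as a Seq[T].
    case Array(x, y, rest @ _*) => Array(y, x) ++ rest
    case _ => a
  }
}
```

For example, SwapDemo.swap(Array(1, 2, 3, 4, 5)) gives an array containing 2, 1, 3, 4, 5, and a one-element array comes back as it was.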

Create and populate two-dimensional array in Scala

What's the recommended way of creating a pre-populated two-dimensional array in Scala? I've got the following code:
val map = for {
x <- (1 to size).toList
} yield for {
y <- (1 to size).toList
} yield (x, y)
How do I make an array instead of list? Replacing .toList with .toArray doesn't compile. And is there a more concise or readable way of doing this than the nested for expressions?
On Scala 2.7, use Array.range:
for {
x <- Array.range(1, 3)
} yield for {
y <- Array.range(1, 3)
} yield (x, y)
On Scala 2.8, use Array.tabulate:
Array.tabulate(3,3)((x, y) => (x, y))
Among other ways, you can use Array.range and map:
scala> Array.range(0,3).map(i => Array.range(0,3).map(j => (i,j)))
res0: Array[Array[(Int, Int)]] = Array(Array((0,0), (0,1), (0,2)), Array((1,0), (1,1), (1,2)), Array((2,0), (2,1), (2,2)))
Or you can use Array.fromFunction:
scala> Array.fromFunction((i,j) => (i,j))(3,3)
res1: Array[Array[(Int, Int)]] = Array(Array((0,0), (0,1), (0,2)), Array((1,0), (1,1), (1,2)), Array((2,0), (2,1), (2,2)))
Scala 2.8 gives you even more options--look through the Array object. (Actually, that's good advice for 2.7, also....)
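One detail to watch: Array.tabulate passes 0-based indices to the function, whereas the code in the question iterates from 1 to size. A minimal sketch that reproduces the question's 1-based pairs, assuming Scala 2.8 or later (GridDemo and grid are hypothetical names):

```scala
object GridDemo {
  // Builds a size x size array of (row, column) pairs, numbered from 1
  // to match the `1 to size` ranges used in the question.
  def grid(size: Int): Array[Array[(Int, Int)]] =
    Array.tabulate(size, size)((i, j) => (i + 1, j + 1))
}
```

GridDemo.grid(2) contains the rows (1,1), (1,2) and (2,1), (2,2).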
