Replace random elements of Array - arrays

I have two arrays of the same length:
import scala.util.Random
val length = 10
val x = 60 // Selection percentage
val rnd = new Random
var arrayOne = Array.fill(length)(rnd.nextInt(100))
arrayOne: Array[Int] = Array(8, 77, 11, 19, 17, 73, 5, 18, 45, 69)
val arrayTwo = Array.fill(length)(rnd.nextInt(100))
arrayTwo: Array[Int] = Array(96, 21, 85, 70, 28, 5, 31, 56, 27, 76)
I can take the first x percent of the elements of arrayTwo and use them to replace the first x percent of the elements of arrayOne as follows:
arrayOne = arrayTwo.take((length * x / 100).toInt) ++ arrayOne.drop((length * x / 100).toInt)
arrayOne: Array[Int] = Array(96, 21, 85, 70, 28, 5, 5, 18, 45, 69)
Now I want to select a random x percent of the elements of arrayTwo and have them replace a random x percent of the elements of arrayOne. How can I do this?

You can exchange every item with a probability of x percent:
val x = 60D
val exchanged = arrayOne.indices
  .map(i => if (math.random() > x / 100) arrayOne(i) else arrayTwo(i))
But that way there is no guarantee that exactly (length * x / 100).toInt elements will come from arrayTwo. To achieve that, I would use an iterative or recursive algorithm that keeps picking random indices until it has as many distinct ones as required.

You can do it via Random.shuffle:
scala> val randSet = Random.shuffle(arrayOne.indices.toBuffer).take((arrayOne.length * x / 100).toInt).toSet
randSet: scala.collection.immutable.Set[Int] = HashSet(0, 6, 9, 3, 8, 4)
scala> val randMerged = arrayOne.indices.map(i => if(randSet(i)) arrayTwo(i) else arrayOne(i))
randMerged: IndexedSeq[Int] = Vector(96, 77, 11, 70, 28, 73, 31, 18, 27, 76)
randSet holds a random x percent of the indices.
If you do not care about the positions of the numbers, there is a simpler version:
scala> val size = (arrayOne.length * x / 100).toInt
size: Int = 6
scala> Random.shuffle(arrayTwo).take(size) ++ Random.shuffle(arrayOne).drop(size)
res11: scala.collection.mutable.ArraySeq[Int] = ArraySeq(76, 85, 28, 56, 21, 27, 69, 45, 17, 77)

Related

In which array can we find elements in Julia?

Imagine we have the following array of 3 arrays, covering the range 1 to 150:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ... 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]
[51, 52, 53, 54, 55, 56, 57, 58, 59, 60 ... 92, 93, 94, 95, 96, 97, 98, 99, 100, 107]
[71, 73, 84, 101, 102, 103, 104, 105, 106, 108 ... 141, 142, 143, 144, 145, 146, 147, 148, 149, 150]
I want to build an array that stores in which array we find each of the values 1 to 150. The result must then be:
[1 1 1 ... 1 2 2 2 ... 2 3 2 3 2 ... 3 3 3 ... 3],
where each element corresponds to 1, 2, 3, ..., 150. The resulting array then gives the array membership of the elements 1 to 150. The code must work for any number of arrays (not only 3).
You can use an array comprehension. Here is an example with three vectors containing the range 1:10:
A = [1, 3, 4, 5, 7]
B = [2, 8, 9]
C = [6, 10]
Now we can write a comprehension using in:
julia> [x in A ? 1 : x in B ? 2 : 3 for x in 1:10]
10-element Array{Int64,1}:
1
⋮
3
Perhaps also include a fallback error in case the input is wrong:
julia> [x in A ? 1 : x in B ? 2 : x in C ? 3 : error("not found") for x in 1:10]
10-element Array{Int64,1}:
1
⋮
3
Trade memory for search in this case: make an array that records which array each value is in.
# example arrays
N = 100
A = rand(1:N, 30)
B = rand(1:N, 40)
C = rand(1:N, 35)
# record the array containing each value:
# A => 1, B => 2, C => 3, not found => 0
arrayin = zeros(Int32, max(maximum(A), maximum(B), maximum(C)))
arrayin[A] .= 1
arrayin[B] .= 2
arrayin[C] .= 3

Outer product with R arrays

I have an array of dimension 3x1000. In truth each column is what's interesting. I want to use this to compute an array of dimension 3x3x1000, where a slab i is the outer product of the column i of the original array (in other words, v %*% t(v)). Is there a clean way to do this?
Below is a sample input matrix and output array, in the case of a 2x4 matrix.
mat_in <- cbind(c(1, 2), c(3, 4), c(5, 6), c(7, 8))
arr_out <- array(c(1, 2, 2, 4, 9, 12, 12, 16, 25, 30, 30, 36, 49, 56, 56, 64),
dim = c(2, 2, 4))
This gives you the desired result:
mat_in <- cbind(c(1, 2), c(3, 4), c(5, 6), c(7, 8))
array(apply(mat_in, 2, tcrossprod), dim=c(2,2,4))
### test:
arr_out <- array(c(1, 2, 2, 4, 9, 12, 12, 16, 25, 30, 30, 36, 49, 56, 56, 64),
dim = c(2, 2, 4))
arr_out - array(apply(mat_in, 2, tcrossprod), dim=c(2,2,4))

Gzip sequence of tuples and then unzip again into sequence of tuples - issue when unzipping the sequence

I have a sequence of Tuples that I need to gzip for storage. Afterwards I want to be able to extract the compressed content, decompress it and then get the Sequence of tuples back.
I use the following code for de/compressing:
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.util.zip.{GZIPInputStream, GZIPOutputStream}

// Decompress gzip bytes and decode them as a String
def unzip(x: Array[Byte]): String = {
  val inputStream = new GZIPInputStream(new ByteArrayInputStream(x))
  scala.io.Source.fromInputStream(inputStream).mkString
}
// Compress raw bytes with gzip
def gzip(input: Array[Byte]): Array[Byte] = {
  val bos = new ByteArrayOutputStream(input.length)
  val gzip = new GZIPOutputStream(bos)
  gzip.write(input)
  gzip.close()
  val compressed = bos.toByteArray
  bos.close()
  compressed
}
The code is taken from https://gist.github.com/owainlewis/1e7d1e68a6818ee4d50e
Then my routine is more or less the following:
val arr = Seq(("a",1),("b",2))
val arr_bytes = arr.toString.getBytes
val comp = gzip(arr_bytes)
val x = unzip(comp)
The output is the following:
arr: Seq[(String, Int)] = List((a,1), (b,2))
arr_bytes: Array[Byte] = Array(76, 105, 115, 116, 40, 40, 97, 44, 49, 41, 44, 32, 40, 98, 44, 50, 41, 41)
comp: Array[Byte] = Array(31, -117, 8, 0, 0, 0, 0, 0, 0, 0, -13, -55, 44, 46, -47, -48, 72, -44, 49, -44, -44, 81, -48, 72, -46, 49, -46, -44, 4, 0, 35, 120, 118, -118, 18, 0, 0, 0)
x: String = List((a,1), (b,2))
The problem is that x is now a String in the format shown above (including the word List).
For example:
x.toList
res174: List[Char] = List(L, i, s, t, (, (, a, ,, 1, ), ,, , (, b, ,, 2, ), ))
My question is, how do I decompress my exact sequence back, or how do I make x into my previous sequence again?
Solved it by using the Play JSON library to store the content as JSON objects:
import play.api.libs.json.Json
val arr = Json.toJson(Array(Json.obj("name" -> "Bran", "age" -> 13), Json.obj("name" -> "Jon", "age" -> 18)))
val arr_bytes = arr.toString().getBytes
val comp = gzip(arr_bytes)
val x = unzip(comp)
val json = Json.parse(x)

Create a list with millions of elements

I need to create and work with lists of 2**30 elements, but it is too slow. Is there any way to increase the speed?
My code:
sup = []
for i in range(2**30):
    sup.append([i, pow(y, i, N)])
Here pow(y, i, N) == y**i % N, i.e. modular exponentiation.
I tried using a list comprehension, but it isn't fast enough.
Different approach: why do you want to store those numbers in a list?
You have your formula right there: whenever some piece of code needs sup[i], you just compute pow(y, i, N).
In other words: instead of storing the values in a list, compute them when you need them.
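A minimal sketch of this compute-on-demand idea follows. The helper names sup_entry and sup_entries are hypothetical, and y = 2, N = 10 are illustrative values only, since the question does not give them. The generator yields the same [i, pow(y, i, N)] pairs the loop would have appended, but only as they are consumed.

```python
from itertools import islice

def sup_entry(i, y, N):
    """Return the pair the list would have held at index i: [i, y**i % N]."""
    return [i, pow(y, i, N)]

def sup_entries(y, N, stop):
    """Lazily yield the pairs for i in range(stop) without storing them."""
    for i in range(stop):
        yield sup_entry(i, y, N)

# Illustrative values only (assumed; not from the question).
y, N = 2, 10

# Random access: compute a single entry when it is needed.
entry = sup_entry(123456789, y, N)

# Sequential access: consume only as many entries as required.
first_five = list(islice(sup_entries(y, N, 2**30), 5))
print(entry, first_five)
```

Nothing of size 2**30 is ever allocated here; the cost is one pow call per entry that is actually used.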
Edit: since it seems that you have good reasons to store that data in an array, I would say: use the appropriate tool.
Meaning: instead of doing compute-intensive work directly in Python, look into the NumPy library, which is designed for exactly such purposes. Beyond that, I would also look at the way you are storing and preparing your data. For example, you mention that you later look for identical entries in that array. I wonder whether that means you should use a dictionary instead of a list; or did you really intend to check 2**30 entries each time you look for equal pow values?
Going by your comment, and complementing GhostCat's answer: go directly for the data you are looking for, for example like this:
>>> from collections import defaultdict
>>> y = 2
>>> N = 10
>>> data = defaultdict(list)
>>> for i in range(100):
        data[pow(y,i,N)].append(i)
>>> for x in data.items():
        x
(8, [3, 7, 11, 15, 19, 23, 27, 31, 35, 39, 43, 47, 51, 55, 59, 63, 67, 71, 75, 79, 83, 87, 91, 95, 99])
(1, [0])
(2, [1, 5, 9, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, 53, 57, 61, 65, 69, 73, 77, 81, 85, 89, 93, 97])
(4, [2, 6, 10, 14, 18, 22, 26, 30, 34, 38, 42, 46, 50, 54, 58, 62, 66, 70, 74, 78, 82, 86, 90, 94, 98])
(6, [4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64, 68, 72, 76, 80, 84, 88, 92, 96])
>>>
Or, more specifically, since you need a random sample, draw it from the start and don't waste time producing a huge number of values you will not need, for example:
>>> import random
>>> random_data = defaultdict(list)
>>> for i in random.sample(range(2**30), 20):
        random_data[pow(2,i,10)].append(i)
>>> for x in random_data.items():
        x
(8, [633728687, 357300263, 208747091, 456291987, 1028949643, 23961003, 750842555])
(2, [602395153, 215460881, 144481457, 829193705])
(4, [752840814, 26689262])
(6, [423520476, 969809132, 326786996, 736424520, 929123176, 865279408, 338237708])
>>>
Depending on what you do with those i later on, you can instead try a more mathematical approach: uncover the underlying pattern that determines which i give the same value of y**i mod N. That way you can produce as many i as you need for a particular residue class.
For this example it is easy:
2**i = 8 (mod 10) for all i = 3 (mod 4) -> range(3, 2**30, 4)
2**i = 2 (mod 10) for all i = 1 (mod 4) -> range(1, 2**30, 4)
2**i = 4 (mod 10) for all i = 2 (mod 4) -> range(2, 2**30, 4)
2**i = 6 (mod 10) for all i = 0 (mod 4), i > 0 -> range(4, 2**30, 4)
2**i = 1 (mod 10) for i = 0
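The pattern above can be checked empirically on a small range. This is only a sketch: residue_classes is a hypothetical helper, and the bound of 1000 is an arbitrary choice for illustration rather than anything from the question.

```python
def residue_classes(y, N, limit):
    """Group the exponents i in range(limit) by the value of y**i % N."""
    groups = {}
    for i in range(limit):
        groups.setdefault(pow(y, i, N), []).append(i)
    return groups

limit = 1000  # arbitrary small bound, chosen only for illustration
groups = residue_classes(2, 10, limit)

# Each residue class should match the corresponding range from the text.
expected = {
    8: list(range(3, limit, 4)),
    2: list(range(1, limit, 4)),
    4: list(range(2, limit, 4)),
    6: list(range(4, limit, 4)),
    1: [0],
}
print(groups == expected)
```

Once the pattern is confirmed, the range objects themselves can stand in for the stored lists, since a range takes constant memory regardless of its length.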

Printing array in Scala

I am having a problem with the most basic Scala operation and it is driving me crazy.
val a = Array(1,2,3)
println(a)            // prints [I@1e76345
println(a.toString()) // prints [I@1e76345
println(a.toString)   // prints [I@1e76345
Can anyone tell me how to print an array without writing my own function for it? That seems silly. Thanks!
mkString will convert collections (including Array) element-by-element to string representations.
println(a.mkString(" "))
is probably what you want.
You can do the normal thing (see either Rex's or Jiri's answer), or you can:
scala> Array("bob","sue")
res0: Array[String] = Array(bob, sue)
Hey, no fair! The REPL printed it out real nice.
scala> res0.toString
res1: String = [Ljava.lang.String;@63c58252
No joy, until:
scala> runtime.ScalaRunTime.stringOf(res0)
res2: String = Array(bob, sue)
scala> runtime.ScalaRunTime.replStringOf(res0, res0.length)
res3: String =
"Array(bob, sue)
"
scala> runtime.ScalaRunTime.replStringOf(res0, 1)
res4: String =
"Array(bob)
"
I wonder if there's a width setting in the REPL. Update: there isn't. It's fixed at
val maxStringElements = 1000 // no need to mkString billions of elements
But I won't try billions:
scala> Array.tabulate(100)(identity)
res5: Array[Int] = Array(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99)
scala> import runtime.ScalaRunTime.replStringOf
import runtime.ScalaRunTime.replStringOf
scala> replStringOf(res5, 10)
res6: String =
"Array(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
"
scala> res5.take(10).mkString(", ")
res7: String = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Wait, let's make that:
scala> res5.take(10).mkString("Array(", ", ", ")")
res8: String = Array(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
This might be obvious:
scala> var vs = List("1")
vs: List[String] = List(1)
scala> vs = null
vs: List[String] = null
scala> vs.mkString
java.lang.NullPointerException
So instead:
scala> import runtime.ScalaRunTime.stringOf
import runtime.ScalaRunTime.stringOf
scala> stringOf(vs)
res16: String = null
Also, an array doesn't need to be deep to benefit from its stringPrefix:
scala> println(res0.deep.toString)
Array(bob, sue)
Whichever method you prefer, you can wrap it up:
import scala.compat.Platform.EOL // assumed source of EOL; the original snippet does not show its import
implicit class MkLines(val t: TraversableOnce[_]) extends AnyVal {
  def mkLines: String = t.mkString("", EOL, EOL)
  def mkLines(header: String, indented: Boolean = false, embraced: Boolean = false): String = {
    val space = "\u0020"
    val sep = if (indented) EOL + space * 2 else EOL
    val (lbrace, rbrace) = if (embraced) (space + "{", EOL + "}") else ("", "")
    t.mkString(header + lbrace + sep, sep, rbrace + EOL)
  }
}
But arrays will need a special conversion because you don't get the ArrayOps:
implicit class MkArrayLines(val a: Array[_]) extends AnyVal {
  def asTO: TraversableOnce[_] = a
  def mkLines: String = asTO.mkLines
  def mkLines(header: String = "Array", indented: Boolean = false, embraced: Boolean = false): String =
    asTO.mkLines(header, indented, embraced)
}
scala> Console println Array("bob","sue","zeke").mkLines(indented = true)
Array
  bob
  sue
  zeke
Here are two methods.
One is to use foreach:
val a = Array(1,2,3)
a.foreach(println)
The other is to use mkString:
val a = Array(1,2,3)
println(a.mkString(""))
If you use a List instead, the toString() method prints the actual elements (not the hash code):
var a = List(1,2,3)
println(a)
or
var a = Array(1,2,3)
println(a.toList)
For a simple Array of Ints like this, we can convert to a Scala List (scala.collection.immutable.List) and then use List.toString():
var xs = Array(3,5,9,10,2,1)
println(xs.toList.toString)
// => List(3, 5, 9, 10, 2, 1)
println(xs.toList)
// => List(3, 5, 9, 10, 2, 1)
If you can convert to a List earlier and do all your operations with Lists, then you'll probably end up writing more idiomatic Scala, written in a functional style.
Note that List.fromArray is deprecated (and has been removed in 2.12.2).
The method deep in ArrayLike recursively converts multidimensional arrays to WrappedArray, and overrides the long prefix "WrappedArray" with "Array".
def deep: scala.collection.IndexedSeq[Any] = new scala.collection.AbstractSeq[Any] with scala.collection.IndexedSeq[Any] {
  def length = self.length
  def apply(idx: Int): Any = self.apply(idx) match {
    case x: AnyRef if x.getClass.isArray => WrappedArray.make(x).deep
    case x => x
  }
  override def stringPrefix = "Array"
}
Usage:
scala> val arr = Array(Array(1,2,3),Array(4,5,6))
arr: Array[Array[Int]] = Array(Array(1, 2, 3), Array(4, 5, 6))
scala> println(arr.deep)
Array(Array(1, 2, 3), Array(4, 5, 6))
Rather than manually specifying all the parameters for mkString yourself (which is a bit more verbose if you want start and end markers in addition to the delimiter), you can take advantage of the WrappedArray class, which uses mkString internally. Unlike converting the array to a List or some other data structure, WrappedArray just wraps an array reference, so it is created in effectively constant time.
scala> val a = Array.range(1, 10)
a: Array[Int] = Array(1, 2, 3, 4, 5, 6, 7, 8, 9)
scala> println(a)
[I@64a2e69d
scala> println(a: Seq[_]) // implicit
WrappedArray(1, 2, 3, 4, 5, 6, 7, 8, 9)
scala> println(a.toSeq) // explicit
WrappedArray(1, 2, 3, 4, 5, 6, 7, 8, 9)
