How do I set multiple array values by index in Scala?

Suppose I have a sequence of integers and a number n < 30. How can I produce an array (of length n) that is 0 in all places except at the indices specified by the sequence (where it should be 1)?
For instance
Input:
Seq(1, 2, 5)
7
Output:
Array(0, 1, 1, 0, 0, 1, 0)

scala> val a = Array.fill(7)(0)
a: Array[Int] = Array(0, 0, 0, 0, 0, 0, 0)
scala> Seq(1,2,5).foreach(a(_) = 1)
scala> a
res1: Array[Int] = Array(0, 1, 1, 0, 0, 1, 0)

Alternatively,
scala> val is = Set(1, 2, 5)
is: scala.collection.immutable.Set[Int] = Set(1, 2, 5)
scala> Array.tabulate(10)(i => if (is contains i) 1 else 0)
res0: Array[Int] = Array(0, 1, 1, 0, 0, 1, 0, 0, 0, 0)

def makeArray(indices: Seq[Int], size: Int): Array[Int] = Iterable.tabulate(size) {
case idx if indices contains idx => 1
case _ => 0
}.toArray
makeArray(Seq(1, 2, 5), size = 7)

Related

Use the where method to replace all odd numbers in a NumPy array with -1

I'm trying to use the where method to replace all odd numbers from the below array with a -1
np.array([0, 1, 0, 3, 0, 5, 0, 7, 0, 9])
I've tried using the below, but it's not working.
np.where(Q9 % 2 == 1) = - 1
Thanks for any assistance!
When called with only a condition, np.where returns indices, so use them to index the array:
arr = np.array([0, 1, 0, 3, 0, 5, 0, 7, 0, 9])
arr[np.where(arr%2!=0)] = -1
print(arr)
output:
[ 0 -1 0 -1 0 -1 0 -1 0 -1]
If you want to replace values in the original array, np.where is not needed; use simple boolean indexing:
a = np.array([0, 1, 0, 3, 0, 5, 0, 7, 0, 9])
a[a%2==1] = -1
a
For a new array:
b = np.where(a%2==1, -1, a)
output: array([ 0, -1, 0, -1, 0, -1, 0, -1, 0, -1])
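To make the difference between the two call forms concrete, here is a short sketch (the names odd, idx and replaced are only illustrative). With just a condition, np.where returns a tuple of index arrays; with three arguments it builds a new array and leaves the input untouched.

```python
import numpy as np

arr = np.array([0, 1, 0, 3, 0, 5, 0, 7, 0, 9])
odd = arr % 2 == 1                 # boolean mask marking the odd entries

# One-argument form: a tuple with one index array per dimension
idx = np.where(odd)
print(idx[0].tolist())             # [1, 3, 5, 7, 9]

# Three-argument form: pick -1 where the mask is True, arr elsewhere
replaced = np.where(odd, -1, arr)
print(replaced.tolist())           # [0, -1, 0, -1, 0, -1, 0, -1, 0, -1]

# The input array is unchanged by either call
print(arr.tolist())                # [0, 1, 0, 3, 0, 5, 0, 7, 0, 9]
```

This is why assigning to the result of np.where(...) fails: the result is a set of indices (or a fresh array), not a view you can write through.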

Populate Defined Named Range with multi-element array of multi-element arrays

I have defined 5 arrays.
One with undefined dimensions to store the other 4:
Dim outputArr() As Variant
and the rest as follows:
Dim Arr1(5, 0), Arr2(12, 0), Arr3(5, 0), Arr4(12, 0) As Variant
I assign the elements of the latter as follows:
Arr1(0, 0) = [{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}]
Arr1(1, 0) = [{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}]
Arr1(2, 0) = [{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}]
Arr1(3, 0) = [{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}]
Arr1(4, 0) = [{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}]
Arr1(5, 0) = [{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}]
The above is applied to each array.
When I use
ReDim outputArray(3, 0)
outputArr = [{Arr1, Arr2, Arr3, Arr4}]
I get a 'Type Mismatch' error.
When I do not use Evaluate and assign without ReDim
outputArr = Array(Arr1, Arr2, Arr3, Arr4)
I can see the elements and their values in the Watch window, but when I try to populate Defined Named Ranges with the elements of outputArr I get an empty output
Range("nRange1name").Value = outputArr(0)
Range("nRange2name").Value = outputArr(1)
Range("nRange3name").Value = outputArr(2)
Range("nRange4name").Value = outputArr(3)
How can I work around this?
The use of variants in the OP code introduces unnecessary dimensions. I don't understand why two transpose functions are needed, but the following code pastes 2D arrays satisfactorily.
Option Explicit
Sub TestArrays()
Dim outputArr As Variant
Dim Arr1 As Variant
Dim Arr2 As Variant
Dim Arr3 As Variant
Dim Arr4 As Variant
Arr1 = Array(Array(1, 2, 3, 4, 5, 6, 7, 8, 9), Array(1, 2, 3, 4, 5, 6, 7, 8, 9))
Arr2 = Array(Array(1, 2, 3, 4, 5, 6, 7, 8, 9), Array(1, 2, 3, 4, 5, 6, 7, 8, 9))
Arr3 = Array(Array(1, 2, 3, 4, 5, 6, 7, 8, 9), Array(1, 2, 3, 4, 5, 6, 7, 8, 9))
Arr4 = Array(Array(1, 2, 3, 4, 5, 6, 7, 8, 9), Array(1, 2, 3, 4, 5, 6, 7, 8, 9))
outputArr = Array(Arr1, Arr2, Arr3, Arr4)
' For horizontal ranges (2 rows x 9 columns)
Range("A1:I2") = Application.WorksheetFunction.Transpose(Application.WorksheetFunction.Transpose(outputArr(2)))
' For vertical ranges (9 rows x 2 columns)
Range("A4:B12") = Application.WorksheetFunction.Transpose(outputArr(3))
End Sub
You need to construct an actual 2D array to do something like that.
Dim arr(1 to 6, 1 to 12)
dim r as long, c as long
for r = lbound(arr, 1) to ubound(arr, 1)
for c = lbound(arr, 2) to ubound(arr, 2)
arr(r, c) = 0
next c
next r
Range("A1").Resize(ubound(arr, 1), ubound(arr, 2)).value = arr

Numpy Number Patterns

Is there a function in NumPy that allows you to take 4 records at a time and see where they match a second dataset? Once there is a match, move on to the next 4 records of the first dataset. It won't always be every 4 records, but I am using this as an example.
So if dataset one had - 1,5,7,8,10,12,6,1,3,6,8,9
And the second dataset had - 1,5,7,8,11,15,6,1,3,6,10,6
My result will be: 1,5,7,8, 6,1,3,6
POST EDIT:
My second example datasets:
import numpy as np
a =np.array([15,15,0,0,10,10,0,0,2,1,8,8,42,2,4,4,3,1,1,3,5,6,0,9,47,1,1,7,7,0,0,45,12,17,45])
b = np.array ([6,0,0,15,15,0,0,10,10,0,0,2,1,8,8,42,2,4,4,3,3,4,6,0,9,47,1,1,7,7,0,0,45,12,16,1,9,3,30])
Thank you in advance for looking at my question!!
Update: for the more difficult and more interesting alignment problem, it is probably best not to reinvent the wheel but to rely on Python's difflib:
from difflib import SequenceMatcher
import numpy as np
k=4
a = np.array([15,15,0,0,10,10,0,0,2,1,8,8,42,2,4,4,3,1,1,3,5,6,0,9,47,1,1,7,7,0,0,45,12,17,45])
b = np.array ([6,0,0,15,15,0,0,10,10,0,0,2,1,8,8,42,2,4,4,3,3,4,6,0,9,47,1,1,7,7,0,0,45,12,16,1,9,3,30])
sm = SequenceMatcher(a=a, b=b)
matches = sm.get_matching_blocks()
matches = [m for m in matches if m.size >= k]
# [Match(a=0, b=3, size=17), Match(a=21, b=22, size=12)]
consensus = [a[m.a:m.a+m.size] for m in matches]
# [array([15, 15, 0, 0, 10, 10, 0, 0, 2, 1, 8, 8, 42, 2, 4, 4, 3]), array([ 6, 0, 9, 47, 1, 1, 7, 7, 0, 0, 45, 12])]
consfour = [a[m.a:m.a + m.size // k * k] for m in matches]
# [array([15, 15, 0, 0, 10, 10, 0, 0, 2, 1, 8, 8, 42, 2, 4, 4]), array([ 6, 0, 9, 47, 1, 1, 7, 7, 0, 0, 45, 12])]
summary = [np.c_[np.add.outer(np.arange(m.size // k * k), (m.a, m.b)), c]
for m, c in zip(matches, consfour)]
merge = np.concatenate(summary, axis=0)
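As a smaller illustration of the same difflib idea, here is a sketch on the short datasets from the question, using plain lists instead of arrays. The names blocks and runs are only illustrative, and autojunk=False is an assumption made here to keep the matcher's heuristics out of the picture (it only matters for long inputs).

```python
from difflib import SequenceMatcher

# Short datasets from the question, as plain Python lists
a = [1, 5, 7, 8, 10, 12, 6, 1, 3, 6, 8, 9]
b = [1, 5, 7, 8, 11, 15, 6, 1, 3, 6, 10, 6]
k = 4  # minimum run length to keep

# autojunk=False turns off the "popular element" heuristic
sm = SequenceMatcher(a=a, b=b, autojunk=False)

# get_matching_blocks() ends with a zero-size sentinel block,
# which the size filter drops along with any short matches
blocks = [m for m in sm.get_matching_blocks() if m.size >= k]

# Cut the matched runs out of the first dataset
runs = [a[m.a:m.a + m.size] for m in blocks]
print(runs)  # [[1, 5, 7, 8], [6, 1, 3, 6]]
```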
Below is my original solution assuming already aligned and same-length arrays:
Here is a hybrid solution that uses NumPy to find runs of consecutive matches and cut them out, then a list comprehension to apply the length constraint:
import numpy as np
d1 = np.array([7,1,5,7,8,0,6,9,0,10,12,6,1,3,6,8,9])
d2 = np.array([8,1,5,7,8,0,6,9,0,11,15,6,1,3,6,10,6])
k = 4
# find matches
m = d1 == d2
# find switches between match, no match
sw = np.where(m[:-1] != m[1:])[0] + 1
# split
mnm = np.split(d1, sw)
# select matches
ones_ = mnm[1-m[0]::2]
# apply length constraint
res = [blck[i:i+k] for blck in ones_ for i in range(len(blck)-k+1)]
# [array([1, 5, 7, 8]), array([5, 7, 8, 0]), array([7, 8, 0, 6]), array([8, 0, 6, 9]), array([0, 6, 9, 0]), array([6, 1, 3, 6])]
res_no_ovlp = [blck[k*i:k*i+k] for blck in ones_ for i in range(len(blck)//k)]
# [array([1, 5, 7, 8]), array([0, 6, 9, 0]), array([6, 1, 3, 6])]
You can use matrix masking like,
import numpy as np
from scipy.sparse import dia_matrix
a = np.array([1,5,7,8,10,12,6,1,3,6,8,9])
b = np.array([1,5,7,8,11,15,6,1,3,6,10,6])
mask = dia_matrix((np.ones((1, a.size)).repeat(4, axis=0), np.arange(4)),
                  shape=(a.size, b.size), dtype=int)
print(mask.toarray())
matches = a[mask.T.dot(mask.dot(a == b) == 4).astype(bool)]
print(matches)
This will output,
array([[1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]])
[1 5 7 8 6 1 3 6]
You can think about how the matrix multiplication works to get this result.
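One way to read the two matrix products, sketched here in plain Python without NumPy or SciPy: the first product counts matches in each window of 4, and the second spreads every fully matching window back over the positions it covers. The names eq, starts and keep are only illustrative and do not appear in the answer's code.

```python
a = [1, 5, 7, 8, 10, 12, 6, 1, 3, 6, 8, 9]
b = [1, 5, 7, 8, 11, 15, 6, 1, 3, 6, 10, 6]
k = 4  # window length

# Elementwise comparison, the counterpart of (a == b)
eq = [x == y for x, y in zip(a, b)]

# Step 1, like mask.dot(a == b) == 4:
# starts[i] is True when the window beginning at i matches in all k positions
starts = [sum(eq[i:i + k]) == k for i in range(len(eq))]

# Step 2, like mask.T.dot(...):
# position j is kept when at least one fully matching window covers it
keep = [any(starts[max(0, j - k + 1):j + 1]) for j in range(len(eq))]

matches = [x for x, flag in zip(a, keep) if flag]
print(matches)  # [1, 5, 7, 8, 6, 1, 3, 6]
```

Row i of the banded mask has ones in columns i through i + 3, which is why multiplying by it sums a window, and multiplying by its transpose marks every position a selected window touches.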
Scaling
For scaling, I tested with 1e3, 1e5, and 1e7 elements and got,
1e3 - 0.019184964010491967
1e5 - 0.4330314120161347
1e7 - 144.54082221200224
See the gist. Not sure why such a hard jump at 1e7 elements.
This is an exercise in list comprehension. We have the data
data = [1,5,7,8,10,12,6,1,3,6,8,9]
search_data = [1,5,7,8,11,15,6,1,3,6,10,6]
First we can chunk the original data into blocks of length n
n = 4
chunks = [data[i:i + n] for i in range(len(data) - n + 1)]
search_chunks = [search_data[i:i + n] for i in range(len(search_data) - n + 1)]
Now we must select chunks from the first list that appear in the second list
hits = [c for c in chunks if c in search_chunks]
print(hits)
# [[1, 5, 7, 8], [6, 1, 3, 6]]
This may not be the optimal solution for long lists. It may improve performance to use sets if there are likely to be repeated chunks
chunks = set(tuple(data[i:i + n]) for i in range(len(data) - n + 1))
search_chunks = set(tuple(search_data[i:i + n]) for i in range(len(search_data) - n + 1))
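A minimal end-to-end sketch of the set-based variant on the small example follows. The helper windows is hypothetical (introduced only for this illustration). Chunks are converted to tuples because lists are not hashable, and only the search side is turned into a set so that the hits keep their original order.

```python
data = [1, 5, 7, 8, 10, 12, 6, 1, 3, 6, 8, 9]
search_data = [1, 5, 7, 8, 11, 15, 6, 1, 3, 6, 10, 6]
n = 4  # chunk length


def windows(seq, size):
    """Yield every sliding window of length `size` from seq as a tuple."""
    for i in range(len(seq) - size + 1):
        yield tuple(seq[i:i + size])


# A set gives constant-time membership tests on average
search_chunks = set(windows(search_data, n))

# Keep the chunks of `data`, in order, that occur anywhere in `search_data`
hits = [w for w in windows(data, n) if w in search_chunks]
print(hits)  # [(1, 5, 7, 8), (6, 1, 3, 6)]
```

Note that this matches a chunk wherever it occurs in the second list, not only at the same position.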
This can be quite competitive with the NumPy solution above, e.g.
import numpy as np
import time
# Generate data (lists, so they can be sliced and measured with len)
len_ = 10000
max_ = 10
data = list(map(int, np.random.rand(len_) * max_))
search_data = list(map(int, np.random.rand(len_) * max_))
# Time list comprehension
start = time.time()
n = 4
chunks = set(tuple(data[i:i + n]) for i in range(len(data) - n + 1))
search_chunks = set(tuple(search_data[i:i + n]) for i in range(len(search_data) - n + 1))
hits = [c for c in chunks if c in search_chunks]
print(time.time() - start)
# Time numpy
a = np.array(data)
b = np.array(search_data)
mask = 1 * (np.abs(np.arange(a.size).reshape((-1, 1)) - np.arange(a.size) - 0.5) < 2)
start = time.time()
matches = a[mask.T.dot(mask.dot(a == b) == 4).astype(bool)]
print(time.time() - start)
It's typically faster here, but it depends on the number of repeated chunks, etc.

Array Index Out of Bound Exception - Scala

I want to print a two dimensional matrix in Scala and I keep getting Array Index Out of Bound Exception.
I have used breakable code and still I am encountering the issue.
package com.edureka.scala
import scala.util.control.Breaks._
class Pascal
{
val r,c=0
val matrix=Array.ofDim[Int](r,c) //declare a two-dimensional array
def fun
{
breakable
{
for(r <- 0 until 4 ;c <- 0 until 4)
{
println(matrix(r)(c)=r+c)
if(r>3)break
}
}
}
}
object pas1 extends App
{
val pasobj=new Pascal()
pasobj.fun
}
You are creating an empty array:
val matrix = Array.ofDim[Int](0, 0)
matrix: Array[Array[Int]] = Array()
Since there are no entries, retrieving one fails:
scala> matrix(0)(0)
java.lang.ArrayIndexOutOfBoundsException: 0
And assigning to one fails, as well:
scala> matrix(0)(0) = 0
java.lang.ArrayIndexOutOfBoundsException: 0
You need to declare an array of 4x4 dimension:
val matrix = Array.ofDim[Int](4, 4)
matrix: Array[Array[Int]] = Array(Array(0, 0, 0, 0), Array(0, 0, 0, 0), ...)
Then you can assign successfully:
scala> matrix(3)(3) = 3
And retrieve as well:
scala> matrix(3)(3)
res1: Int = 3
You define an empty array of arrays of ints, since you declare r and c as 0
# val m = Array.ofDim[Int](0, 0)
m: Array[Array[Int]] = Array()
And then in your loop you try to access the elements in that array (which do not exist)
# m(0)(0)
java.lang.ArrayIndexOutOfBoundsException: 0
$sess.cmd5$.<init>(cmd5.sc:1)
$sess.cmd5$.<clinit>(cmd5.sc:-1)
Simply creating an array of arrays does not fill it with values, especially when you set its dimensions as 0. You can set the dimensions higher and you will have a populated array:
# val m2 = Array.ofDim[Int](5, 5)
m2: Array[Array[Int]] = Array(
Array(0, 0, 0, 0, 0),
Array(0, 0, 0, 0, 0),
Array(0, 0, 0, 0, 0),
Array(0, 0, 0, 0, 0),
Array(0, 0, 0, 0, 0)
)
# m2(1)(4)
res7: Int = 0

The simplest way to convert an array to a 2D array in Scala

I have a 10 × 10 Array[Int]
val matrix = for {
r <- 0 until 10
c <- 0 until 10
} yield r + c
and want to convert the "matrix" to an Array[Array[Int]] with 10 rows and 10 columns.
What is the simplest way to do it?
val matrix = (for {
r <- 0 until 3
c <- 0 until 3
} yield r + c).toArray
// Array(0, 1, 2, 1, 2, 3, 2, 3, 4)
scala> matrix.grouped(3).toArray
// Array(Array(0, 1, 2), Array(1, 2, 3), Array(2, 3, 4))
If I understand correctly, you can do:
Array.tabulate(10,10)(_+_)
//> res0: Array[Array[Int]] = Array(Array(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), ....)
If you just need a 10 x 10 Array[Int] without any values you can do,
Array.ofDim[Int](10,10)
//> res1: Array[Array[Int]] = Array(Array(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), Array(0
//| , 0, 0, 0, 0, 0, 0, 0, 0, 0), Array(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), Array(0, ....
The code you showed gives you a Vector of Int, not an Array. If a Vector is acceptable and it is okay to generate a new one, you just need to yield twice
val matrix = for (r <- 1 to 10)
yield for(c <- 1 to 10)
yield r+c
If you need to convert the existing Vector to Array[Array[Int]] as you said, use grouped as chris-martin suggested
matrix.grouped(10).toArray.map(_.toArray)
for (x <- (0 until 10).toArray) yield (x until x + 10).toArray

Resources