The simplest way to convert an array to a 2D array in Scala

I have a 10 × 10 Array[Int]
val matrix = for {
r <- 0 until 10
c <- 0 until 10
} yield r + c
and want to convert the "matrix" to an Array[Array[Int]] with 10 rows and 10 columns.
What is the simplest way to do it?

val matrix = (for {
r <- 0 until 3
c <- 0 until 3
} yield r + c).toArray
// Array(0, 1, 2, 1, 2, 3, 2, 3, 4)
scala> matrix.grouped(3).toArray
// Array(Array(0, 1, 2), Array(1, 2, 3), Array(2, 3, 4))

If I understand correctly, you can do:
Array.tabulate(10,10)(_+_)
//> res0: Array[Array[Int]] = Array(Array(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), ....)
If you just need a 10 x 10 Array[Array[Int]] filled with zeros, you can do:
Array.ofDim[Int](10,10)
//> res1: Array[Array[Int]] = Array(Array(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), Array(0
//| , 0, 0, 0, 0, 0, 0, 0, 0, 0), Array(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), Array(0, ....

The code you showed gives you a Vector of Int, not an Array. If a Vector is fine and it is okay to generate a new one, you just need to yield twice:
val matrix = for (r <- 1 to 10)
yield for(c <- 1 to 10)
yield r+c
If you need to convert the existing Vector to Array[Array[Int]] as you said, use grouped as chris-martin suggested
matrix.grouped(10).toArray.map(_.toArray)

for (x <- (0 until 10).toArray) yield (x until x + 10).toArray

Related

Use the where method to replace all odd numbers in a NumPy array with -1

I'm trying to use the where method to replace all odd numbers in the array below with -1:
np.array([0, 1, 0, 3, 0, 5, 0, 7, 0, 9])
I've tried using the below, but it's not working.
np.where(Q9 % 2 == 1) = - 1
Thanks for any assistance!
Called with only a condition, np.where just returns the matching indices, so use them to index the array:
arr = np.array([0, 1, 0, 3, 0, 5, 0, 7, 0, 9])
arr[np.where(arr%2!=0)] = -1
print(arr)
output:
[ 0 -1 0 -1 0 -1 0 -1 0 -1]
If you want to replace values in the original array, np.where is not needed; boolean indexing is enough:
a = np.array([0, 1, 0, 3, 0, 5, 0, 7, 0, 9])
a[a%2==1] = -1
a
For a new array:
b = np.where(a%2==1, -1, a)
output: array([ 0, -1, 0, -1, 0, -1, 0, -1, 0, -1])
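To make the three-argument form concrete, here is a minimal pure-Python sketch of what np.where(cond, x, y) computes element by element. The helper name where_replace is hypothetical and not part of NumPy; it only mirrors the semantics on a plain list and, like the "new array" variant above, leaves the input untouched.

```python
def where_replace(values, predicate, replacement):
    """Return a new list: `replacement` where predicate(v) holds, else v."""
    return [replacement if predicate(v) else v for v in values]


a = [0, 1, 0, 3, 0, 5, 0, 7, 0, 9]
b = where_replace(a, lambda v: v % 2 == 1, -1)
print(b)  # [0, -1, 0, -1, 0, -1, 0, -1, 0, -1]
print(a)  # the original list is unchanged
```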

Populate Defined Named Range with multi-element array of multi-element arrays

I have defined 5 arrays.
One with undefined dimensions to store the other 4:
Dim outputArr() As Variant
and the rest as follows:
Dim Arr1(5, 0), Arr2(12, 0), Arr3(5, 0), Arr4(12, 0) As Variant
I assign the elements of the latter as follows:
Arr1(0, 0) = [{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}]
Arr1(1, 0) = [{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}]
Arr1(2, 0) = [{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}]
Arr1(3, 0) = [{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}]
Arr1(4, 0) = [{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}]
Arr1(5, 0) = [{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}]
The above is applied to each array.
When I use
ReDim outputArray(3, 0)
outputArr = [{Arr1, Arr2, Arr3, Arr4}]
I get a 'Type Mismatch' error.
When I do not use Evaluate and assign without ReDim
outputArr = Array(Arr1, Arr2, Arr3, Arr4)
I can see the elements and their values in the Watch window, but when I try to populate Defined Named Ranges with the elements of outputArr I get an empty output
Range("nRange1name").Value = outputArr(0)
Range("nRange2name").Value = outputArr(1)
Range("nRange3name").Value = outputArr(2)
Range("nRange4name").Value = outputArr(3)
How can I work around this?
The use of variants in the OP's code introduces unnecessary dimensions. I don't understand why two Transpose calls are needed, but the following code pastes 2D arrays satisfactorily.
Option Explicit
Sub TestArrays()
Dim outputArr As Variant
Dim Arr1 As Variant
Dim Arr2 As Variant
Dim Arr3 As Variant
Dim Arr4 As Variant
Arr1 = Array(Array(1, 2, 3, 4, 5, 6, 7, 8, 9), Array(1, 2, 3, 4, 5, 6, 7, 8, 9))
Arr2 = Array(Array(1, 2, 3, 4, 5, 6, 7, 8, 9), Array(1, 2, 3, 4, 5, 6, 7, 8, 9))
Arr3 = Array(Array(1, 2, 3, 4, 5, 6, 7, 8, 9), Array(1, 2, 3, 4, 5, 6, 7, 8, 9))
Arr4 = Array(Array(1, 2, 3, 4, 5, 6, 7, 8, 9), Array(1, 2, 3, 4, 5, 6, 7, 8, 9))
outputArr = Array(Arr1, Arr2, Arr3, Arr4)
' For horizontal ranges (2 rows x 9 columns, matching the 9-element inner arrays)
Range("A1:I2") = Application.WorksheetFunction.Transpose(Application.WorksheetFunction.Transpose(outputArr(2)))
' For vertical ranges (9 rows x 2 columns)
Range("A4:B12") = Application.WorksheetFunction.Transpose(outputArr(3))
End Sub
You need to construct an actual 2D array to do something like that.
Dim arr(1 To 6, 1 To 12)
Dim r As Long, c As Long
For r = LBound(arr, 1) To UBound(arr, 1)
    For c = LBound(arr, 2) To UBound(arr, 2)
        arr(r, c) = 0
    Next c
Next r
Range("A1").Resize(UBound(arr, 1), UBound(arr, 2)).Value = arr

Multiply a list of vectors to different matrices conditioned on the vectors' names

I have a list of length-20 vectors, and I would like to multiply each of them by one of three matrices depending on the vector's name. Here is my unsuccessful attempt. Please suggest how I can improve my code. Any help is much appreciated!
for (i in 1:length(List)){
.$Value=ifelse(names(List) %in% c("a","b","c"),matrixA%*%.$Value,ifelse(names(List) %in% c("d","e"),matrixB%*%.$Value, matrixC%*%.$Value))
}
Part of my list and the matrices are included below.
list(a = structure(c(3, 0, 0, 5, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 1, 10, 0, 0, 1, 1), .Dim = c(20L, 1L)), b = structure(c(2,
0, 0, 0, 0, 9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1, 0, 6, 0), .Dim = c(20L,
1L)))
matrixA <- diag(2,20)
matrixB <- diag(1,20)
matrixC <- diag(4,20)
So... I'm not sure I understand, but it seems that if a list element is named a, b or c you want to multiply it by matrixA; if it's d or e, by matrixB; and otherwise by matrixC.
Let's use your example.
zz <- list(a = structure(c(3, 0, 0, 5, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 1, 10, 0, 0, 1, 1), .Dim = c(20L, 1L)), b = structure(c(2,
0, 0, 0, 0, 9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1, 0, 6, 0), .Dim = c(20L,
1L)))
matrixA <- diag(2,20)
matrixB <- diag(1,20)
matrixC <- diag(4,20)
This is probably not the best solution, but it is a simple one. I just made a few tweaks to your idea so that it works: inside ifelse() a matrix needs to be wrapped in list() (ifelse works on vectors, and a list is a vector), otherwise you only get the first element. This returns a list of the results.
results <- lapply(seq_along(zz), function(i){
ifelse(names(zz[i]) %in% c("a","b","c"),list(matrixA%*%zz[[i]]),
ifelse(names(zz[i]) %in% c("d","e"), list(matrixB%*%zz[[i]]), list(matrixC%*%zz[[i]])))
})
I used lapply to apply the function to each index in seq_along(zz) (1 to the length of zz). For each i, the function looks at the name of the i-th element of zz (zz[i] returns the element together with its name), and if the name satisfies the condition, the content of the i-th element (zz[[i]] returns just the content, without the name) is multiplied by the corresponding predefined matrix.
This also works, and you don't need to protect the matrix with list(), which is kind of a bother.
results <- lapply(seq_along(zz), function(i){
if(names(zz[i]) %in% c("a","b","c")) matrixA%*%zz[[i]] else
if(names(zz[i]) %in% c("d","e")) matrixB%*%zz[[i]]
else matrixC%*%zz[[i]]
})
Edit: @akrun's answer is much shorter and more elegant.
Maybe this helps:
nm1 <- paste0("matrix", toupper(names(lst1)))
Map(crossprod, lst1, mget(nm1))

Numpy Number Patterns

Is there a function in NumPy that lets you take 4 records at a time and see where they match a second dataset? Once there is a match, move on to the next 4 records of the first dataset. It won't always be every 4 records, but I am using this as an example.
So if dataset one had - 1,5,7,8,10,12,6,1,3,6,8,9
And the second dataset had - 1,5,7,8,11,15,6,1,3,6,10,6
My result will be: 1,5,7,8, 6,1,3,6
POST EDIT:
My second example datasets:
import numpy as np
a =np.array([15,15,0,0,10,10,0,0,2,1,8,8,42,2,4,4,3,1,1,3,5,6,0,9,47,1,1,7,7,0,0,45,12,17,45])
b = np.array ([6,0,0,15,15,0,0,10,10,0,0,2,1,8,8,42,2,4,4,3,3,4,6,0,9,47,1,1,7,7,0,0,45,12,16,1,9,3,30])
Thank you in advance for looking at my question!!
Update: for the more difficult and more interesting alignment problem, it is probably best not to reinvent the wheel but to rely on Python's difflib:
from difflib import SequenceMatcher
import numpy as np
k=4
a = np.array([15,15,0,0,10,10,0,0,2,1,8,8,42,2,4,4,3,1,1,3,5,6,0,9,47,1,1,7,7,0,0,45,12,17,45])
b = np.array ([6,0,0,15,15,0,0,10,10,0,0,2,1,8,8,42,2,4,4,3,3,4,6,0,9,47,1,1,7,7,0,0,45,12,16,1,9,3,30])
sm = SequenceMatcher(a=a, b=b)
matches = sm.get_matching_blocks()
matches = [m for m in matches if m.size >= k]
# [Match(a=0, b=3, size=17), Match(a=21, b=22, size=12)]
consensus = [a[m.a:m.a+m.size] for m in matches]
# [array([15, 15, 0, 0, 10, 10, 0, 0, 2, 1, 8, 8, 42, 2, 4, 4, 3]), array([ 6, 0, 9, 47, 1, 1, 7, 7, 0, 0, 45, 12])]
consfour = [a[m.a:m.a + m.size // k * k] for m in matches]
# [array([15, 15, 0, 0, 10, 10, 0, 0, 2, 1, 8, 8, 42, 2, 4, 4]), array([ 6, 0, 9, 47, 1, 1, 7, 7, 0, 0, 45, 12])]
summary = [np.c_[np.add.outer(np.arange(m.size // k * k), (m.a, m.b)), c]
for m, c in zip(matches, consfour)]
merge = np.concatenate(summary, axis=0)
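As a smaller illustration of the same idea, here is a sketch that runs SequenceMatcher on plain lists, using the first example from the question. The helper name common_blocks is hypothetical; passing autojunk=False is my own choice so that difflib's heuristic for long inputs does not interfere, and is not something the answer above uses.

```python
from difflib import SequenceMatcher


def common_blocks(a, b, k):
    """Return runs of at least k consecutive items that a and b share."""
    # autojunk=False disables difflib's "popular element" heuristic,
    # which only kicks in for sequences of 200 or more items.
    sm = SequenceMatcher(a=a, b=b, autojunk=False)
    return [a[m.a:m.a + m.size]
            for m in sm.get_matching_blocks()
            if m.size >= k]


first = [1, 5, 7, 8, 10, 12, 6, 1, 3, 6, 8, 9]
second = [1, 5, 7, 8, 11, 15, 6, 1, 3, 6, 10, 6]
blocks = common_blocks(first, second, k=4)
print(blocks)  # [[1, 5, 7, 8], [6, 1, 3, 6]]
```

Because get_matching_blocks aligns the two sequences itself, this variant does not require the inputs to have the same length or to line up position by position.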
Below is my original solution assuming already aligned and same-length arrays:
Here is a hybrid solution using numpy to find consecutive matches and cutting them out and then list comp to apply length constraints:
import numpy as np
d1 = np.array([7,1,5,7,8,0,6,9,0,10,12,6,1,3,6,8,9])
d2 = np.array([8,1,5,7,8,0,6,9,0,11,15,6,1,3,6,10,6])
k = 4
# find matches
m = d1 == d2
# find switches between match, no match
sw = np.where(m[:-1] != m[1:])[0] + 1
# split
mnm = np.split(d1, sw)
# select matches
ones_ = mnm[1-m[0]::2]
# apply length constraint
res = [blck[i:i+k] for blck in ones_ for i in range(len(blck)-k+1)]
# [array([1, 5, 7, 8]), array([5, 7, 8, 0]), array([7, 8, 0, 6]), array([8, 0, 6, 9]), array([0, 6, 9, 0]), array([6, 1, 3, 6])]
res_no_ovlp = [blck[k*i:k*i+k] for blck in ones_ for i in range(len(blck)//k)]
# [array([1, 5, 7, 8]), array([0, 6, 9, 0]), array([6, 1, 3, 6])]
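For readers without NumPy at hand, the non-overlapping variant above can be sketched with the standard library alone. This is an assumed equivalent, not the answer's own code: itertools.groupby stands in for the switch detection and np.split, and the function name aligned_blocks is hypothetical.

```python
from itertools import groupby


def aligned_blocks(d1, d2, k):
    """Cut runs of position-wise matches into non-overlapping blocks of length k."""
    blocks = []
    # groupby collects consecutive positions that share the same match/no-match flag.
    for is_match, run in groupby(zip(d1, d2), key=lambda pair: pair[0] == pair[1]):
        if not is_match:
            continue
        values = [x for x, _ in run]
        # Keep only full blocks of length k; a shorter remainder is dropped.
        for i in range(len(values) // k):
            blocks.append(values[k * i:k * i + k])
    return blocks


d1 = [7, 1, 5, 7, 8, 0, 6, 9, 0, 10, 12, 6, 1, 3, 6, 8, 9]
d2 = [8, 1, 5, 7, 8, 0, 6, 9, 0, 11, 15, 6, 1, 3, 6, 10, 6]
res = aligned_blocks(d1, d2, k=4)
print(res)  # [[1, 5, 7, 8], [0, 6, 9, 0], [6, 1, 3, 6]]
```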
You can use matrix masking like,
import numpy as np
from scipy.sparse import dia_matrix
a = np.array([1,5,7,8,10,12,6,1,3,6,8,9])
b = np.array([1,5,7,8,11,15,6,1,3,6,10,6])
# np.int and np.bool were removed in NumPy 1.24; the builtins int and bool work instead
mask = dia_matrix((np.ones((1, a.size)).repeat(4, axis=0), np.arange(4)),
shape=(a.size, b.size), dtype=int)
print(mask.toarray())
matches = a[mask.T.dot(mask.dot(a == b) == 4).astype(bool)]
print(matches)
This will output,
array([[1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]])
[1 5 7 8 6 1 3 6]
You can think about how the matrix multiplication works to get this result.
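One way to read the two products, spelled out as a pure-Python sketch under my own interpretation: the first product counts matches in each window of 4 starting at position i, the comparison with 4 keeps only fully matching windows, and the transposed product marks every position covered by such a window. The function name windowed_matches is hypothetical.

```python
def windowed_matches(a, b, k=4):
    """Keep items of a that lie inside a window of k consecutive position-wise matches."""
    eq = [x == y for x, y in zip(a, b)]
    n = len(eq)
    # full[i] is True when the k positions starting at i all match
    # (the role of `mask.dot(a == b) == 4` in the matrix version).
    full = [i + k <= n and all(eq[i:i + k]) for i in range(n)]
    # keep[i] is True when some fully matching window covers position i
    # (the role of `mask.T.dot(...)`).
    keep = [any(full[j] for j in range(max(0, i - k + 1), i + 1)) for i in range(n)]
    return [x for x, kept in zip(a, keep) if kept]


a = [1, 5, 7, 8, 10, 12, 6, 1, 3, 6, 8, 9]
b = [1, 5, 7, 8, 11, 15, 6, 1, 3, 6, 10, 6]
matches = windowed_matches(a, b)
print(matches)  # [1, 5, 7, 8, 6, 1, 3, 6]
```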
Scaling
For scaling, I tested with 1e3, 1e5, and 1e7 elements and got,
1e3 - 0.019184964010491967
1e5 - 0.4330314120161347
1e7 - 144.54082221200224
See the gist. Not sure why such a hard jump at 1e7 elements.
This is an exercise in list comprehension. We have the data
data = [1,5,7,8,10,12,6,1,3,6,8,9]
search_data = [1,5,7,8,11,15,6,1,3,6,10,6]
First we can chunk the original data into blocks of length n
n = 4
chunks = [data[i:i + n] for i in range(len(data) - n + 1)]
search_chunks = [search_data[i:i + n] for i in range(len(search_data) - n + 1)]
Now we must select chunks from the first list that appear in the second list
hits = [c for c in chunks if c in search_chunks]
print(hits)
# [[1, 5, 7, 8], [6, 1, 3, 6]]
This may not be the optimal solution for long lists. It may improve performance to use sets if there are likely to be repeated chunks:
chunks = set(tuple(data[i:i + n]) for i in range(len(data) - n + 1))
search_chunks = set(tuple(search_data[i:i + n]) for i in range(len(search_data) - n + 1))
This can be quite competitive with the numpy solution above, e.g.
import numpy as np
import time
# Generate data
len_ = 10000
max_ = 10
# In Python 3, map returns an iterator, so materialize it as a list for len() and slicing
data = list(map(int, np.random.rand(len_) * max_))
search_data = list(map(int, np.random.rand(len_) * max_))
# Time list comprehension
start = time.time()
n = 4
chunks = set(tuple(data[i:i + n]) for i in range(len(data) - n + 1))
search_chunks = set(tuple(search_data[i:i + n]) for i in range(len(search_data) - n + 1))
hits = [c for c in chunks if c in search_chunks]
print(time.time() - start)
# Time numpy
a = np.array(data)
b = np.array(search_data)
mask = 1 * (np.abs(np.arange(a.size).reshape((-1, 1)) - np.arange(a.size) - 0.5) < 2)
start = time.time()
# np.bool was removed in NumPy 1.24; use the builtin bool
matches = a[mask.T.dot(mask.dot(a == b) == 4).astype(bool)]
print(time.time() - start)
It's typically faster here, but it depends on number of repeated chunks etc.
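The set-based variant can also be written as a small self-contained Python 3 function, shown here on the example data from the question. This is a sketch only: the name shared_chunks is hypothetical, and it uses a set intersection rather than the membership loop above, which drops duplicates and does not preserve order.

```python
def shared_chunks(data, search_data, n):
    """Return the set of length-n windows that occur in both sequences."""
    chunks = {tuple(data[i:i + n]) for i in range(len(data) - n + 1)}
    search_chunks = {tuple(search_data[i:i + n])
                     for i in range(len(search_data) - n + 1)}
    # Set intersection does the membership test for every chunk at once.
    return chunks & search_chunks


data = [1, 5, 7, 8, 10, 12, 6, 1, 3, 6, 8, 9]
search_data = [1, 5, 7, 8, 11, 15, 6, 1, 3, 6, 10, 6]
hits = shared_chunks(data, search_data, 4)
print(sorted(hits))  # [(1, 5, 7, 8), (6, 1, 3, 6)]
```

Unlike the position-wise approaches, this finds a chunk anywhere in the second sequence, so the two inputs do not need to be aligned.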

How do I set multiple array values by index in Scala?

Suppose I have a sequence of integers and a number n < 30. How can I produce an array (of length n) that is 0 in all places except at the indices specified by the sequence (where it should be 1)?
For instance
Input:
Seq(1, 2, 5)
7
Output:
Array(0, 1, 1, 0, 0, 1, 0)
scala> val a = Array.fill(7)(0)
a: Array[Int] = Array(0, 0, 0, 0, 0, 0, 0)
scala> Seq(1,2,5).foreach(a(_) = 1)
scala> a
res1: Array[Int] = Array(0, 1, 1, 0, 0, 1, 0)
Alternatively,
scala> val is = Set(1, 2, 5)
is: scala.collection.immutable.Set[Int] = Set(1, 2, 5)
scala> Array.tabulate(10)(i => if (is contains i) 1 else 0)
res0: Array[Int] = Array(0, 1, 1, 0, 0, 1, 0, 0, 0, 0)
def makeArray(indices: Seq[Int], size: Int): Array[Int] = Iterable.tabulate(size) {
case idx if indices contains idx => 1
case _ => 0
}.toArray
makeArray(Seq(1, 2, 5), size = 7)
