I have 3 arrays below: a and b are interleaved to make a_and_b. a is multiplied by a_multiplier and b is multiplied by b_multiplier. How can I modify a_and_b so that it reflects these multipliers?
Code:
import numpy as np
a_multiplier = 3
b_multiplier = 5
a = np.array([5,32,1,4])
b = np.array([1,5,11,3])
a_and_b = np.array([5,1,32,5,1,11,4,3])
Expected Output:
[15, 5, 96, 25, 3, 55, 12, 15]
First, learn how element-wise multiplication works:
In [187]: a = np.array([5,32,1,4])
In [188]: a*3
Out[188]: array([15, 96, 3, 12])
In [189]: b = np.array([1,5,11,3])
In [190]: b*5
Out[190]: array([ 5, 25, 55, 15])
One way to combine the 2 arrays:
In [191]: np.stack((a*3, b*5),axis=1)
Out[191]:
array([[15,  5],
       [96, 25],
       [ 3, 55],
       [12, 15]])
which can be easily turned into the desired 1d array:
In [192]: np.stack((a*3, b*5),axis=1).ravel()
Out[192]: array([15, 5, 96, 25, 3, 55, 12, 15])
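If a_and_b already exists and should be changed in place, strided slices are another option. This is only a sketch and assumes that a's values sit at the even positions of a_and_b and b's values at the odd positions, as in the question:

```python
import numpy as np

a_multiplier = 3
b_multiplier = 5
a_and_b = np.array([5, 1, 32, 5, 1, 11, 4, 3])

# Even positions hold the values that came from a, odd positions those from b.
a_and_b[0::2] *= a_multiplier
a_and_b[1::2] *= b_multiplier

print(a_and_b.tolist())  # [15, 5, 96, 25, 3, 55, 12, 15]
```

The slices are views, so the multiplication writes straight into a_and_b without building a second array.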
I have a Spark (Python) dataframe with two columns: a user ID and then an array of arrays, which is represented in Spark as a wrapped array like so:
[WrappedArray(9, 10, 11, 12), WrappedArray(20, 21, 22, 23, 24, 25, 26)]
In its usual representation this would look like this:
[[9, 10, 11, 12], [20, 21, 22, 23, 24, 25, 26]]
I want to perform operations on each of the subarrays, for example take a third list and check whether any of its values is in the first sub-array, but I can't seem to find solutions for pyspark 2.0 (only Scala-specific older solutions like this and this).
How does one access (and in general work with) wrapped arrays? What is an efficient way to do what I described above?
You can treat each wrapped array as an individual list. In your example, if you want to find which elements of the 2nd wrapped array are present in the first array, you could do something like this:
# Prepare data
data = [[10001, [9, 10, 11, 12], [20, 10, 9, 23, 24, 25, 26]],
        [10002, [8, 1, 2, 3], [49, 3, 6, 5, 6]],
        ]
rdd = sc.parallelize(data)  # sc is the existing SparkContext
# Append a fourth field: the elements of array2 that also appear in array1
df = rdd.map(
    lambda row: row + [
        [x for x in row[2] if x in row[1]]
    ]
).toDF(["userID", "array1", "array2", "commonElements"])
df.show()
Output:
+------+---------------+--------------------+--------------+
|userID| array1| array2|commonElements|
+------+---------------+--------------------+--------------+
| 10001|[9, 10, 11, 12]|[20, 10, 9, 23, 2...| [10, 9]|
| 10002| [8, 1, 2, 3]| [49, 3, 6, 5, 6]| [3]|
+------+---------------+--------------------+--------------+
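The per-row logic can also be tried without a cluster. The following is a plain-Python sketch of the same membership check; the helper name common_elements is hypothetical, and it builds a set first so that each lookup does not rescan the whole first array. The assumption is that a function like this could be called inside the map above, since each row arrives there as an ordinary list.

```python
def common_elements(array1, array2):
    """Return the items of array2 that also occur in array1, in array2's order."""
    lookup = set(array1)  # set membership is O(1) on average
    return [x for x in array2 if x in lookup]


rows = [
    [10001, [9, 10, 11, 12], [20, 10, 9, 23, 24, 25, 26]],
    [10002, [8, 1, 2, 3], [49, 3, 6, 5, 6]],
]

# Same shape as the mapped rows: the original fields plus the common elements.
augmented = [row + [common_elements(row[1], row[2])] for row in rows]
for row in augmented:
    print(row[0], row[3])
```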
USING IDLE/Python 3.5.1
May I first of all begin by saying I am a reasonably experienced programmer in VBA but am on day 2 of Python. I assure you I have conducted many searches on this question but the 30 or so documents I have read do not seem to explain my problem.
May I also please request that any answers given are properly formatted code for Python 3.5.1 rather than helpful pointers to other documentation or links?
The Problem
I am running a report and outputting results as I go. I need to store the results (presumably in an array) during this so that I can refer to them afterwards. The report (and the populating of the array) can be rerun multiple times so please bear that in mind if using concepts like 'append' when building the array. The array has dimensions [25,4] - a maximum of 25 records with four items in each.
Day X Y Z Total
1 2 3 4 9
2 3 4 5 12 ...
(Purists: The total needs to be recorded rather than calculated because of rounding.)
I could solve the problem myself if someone could translate this bit of code into Python (from VBA for illustration purposes). I do not want to import the arrays module unless it's the only way. Note: Variable l is a loop that makes the array get built twice to demonstrate that the array needs to be capable of rebuilding from scratch rather than being created just the once.
Sub sArray()
    Dim a(25, 4)
    For l = 1 To 2
        For i = 1 To 25
            For j = 1 To 4
                a(i, j) = Int(100 * Rnd(1)) + 1
                Debug.Print a(i, j);
            Next j
        Next i
    Next l
End Sub
Thanks,
Tom
I am not sure I got your question correctly...
If you want to make an array (list is a better term in this case) of size [25,4], this is one way to go:
import random
a = [[int(100*random.random())+1 for j in range(4)] for i in range(25)]
>>> print(a)
[[74, 17, 36, 75],
[1, 79, 33, 90],
[58, 66, 47, 95],
[35, 40, 87, 38],
[43, 46, 34, 66],
[69, 34, 26, 49],
[56, 83, 44, 14],
[2, 44, 54, 97],
[50, 21, 39, 60],
[13, 94, 12, 48],
[36, 13, 2, 71],
[77, 44, 31, 11],
[56, 26, 30, 39],
[17, 13, 83, 84],
[54, 37, 34, 18],
[5, 54, 88, 100],
[22, 77, 70, 21],
[51, 88, 26, 97],
[69, 33, 86, 48],
[42, 66, 38, 78],
[71, 43, 96, 23],
[6, 46, 100, 29],
[32, 86, 15, 48],
[96, 84, 8, 56],
[29, 64, 69, 79]]
if you want to show that "the array needs to be capable of rebuilding from scratch rather than being created just the once" (why would you need this??)
for l in range(2):
    a = [[int(100*random.random())+1 for j in range(4)] for i in range(25)]
Also, the way of generating random numbers is odd (I have translated your method). To get the same result in Python, just use random.randint(1,100), which returns a random integer from 1 to 100 inclusive (I think you do not want to have 0 there); change the upper bound to whatever number you like.
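As a quick sanity check of that claim, here is a small sketch comparing the translated expression with random.randint. The sample size and seed are arbitrary choices for illustration:

```python
import random

random.seed(0)  # arbitrary seed, only to make the run repeatable

# The expression translated from VBA: random.random() is in [0, 1),
# so int(100 * r) is in 0..99 and adding 1 shifts it to 1..100.
translated = [int(100 * random.random()) + 1 for _ in range(10000)]

# The direct form: randint includes both endpoints.
direct = [random.randint(1, 100) for _ in range(10000)]

print(all(1 <= v <= 100 for v in translated))  # True
print(all(1 <= v <= 100 for v in direct))      # True
```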
If I have correctly understood from your comments, this is what you want:
import random

def report(g=25):
    array = []  # a fresh list on every call, so the report can be rerun
    for _ in range(g):
        x = random.randint(1, 10)
        y = random.randint(1, 10)
        z = random.randint(1, 10)
        total = x + y + z
        row = [x, y, z, total]
        print(row)  # output each row as it is computed
        array.append(row)
    return array

>>> result = report()  # prints all the rows while computing; result stores the "array"
[8, 4, 3, 15]
[10, 7, 4, 21]
[2, 4, 5, 11]
[8, 5, 8, 21]
[9, 7, 2, 18]
[2, 2, 3, 7]
[5, 8, 6, 19]
[8, 6, 1, 15]
[7, 6, 4, 17]
[7, 2, 10, 19]
[6, 5, 9, 20]
[3, 8, 8, 19]
[9, 1, 9, 19]
[1, 7, 7, 15]
[6, 6, 2, 14]
[9, 10, 1, 20]
[4, 6, 2, 12]
[6, 1, 6, 13]
[4, 1, 3, 8]
[5, 3, 5, 13]
[7, 5, 2, 14]
[9, 5, 7, 21]
[2, 5, 8, 15]
[3, 10, 4, 17]
[5, 6, 5, 16]
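To show the "rerun, then refer to the results afterwards" part, here is a minimal sketch for Python 3. The function name build_report and its seed parameter are hypothetical; the seed is only there to make the example repeatable. Each call creates a new list, so nothing is left over from a previous run.

```python
import random


def build_report(days=25, seed=None):
    """Build the table from scratch: one [x, y, z, total] row per day."""
    rng = random.Random(seed)
    table = []  # new list on every call, so reruns never accumulate rows
    for _ in range(days):
        x = rng.randint(1, 10)
        y = rng.randint(1, 10)
        z = rng.randint(1, 10)
        table.append([x, y, z, x + y + z])  # total is stored, not recomputed later
    return table


table = build_report(seed=1)
table = build_report(seed=2)  # rerun: the old contents are simply replaced

# Refer back to the stored values afterwards; day 3 is at index 2.
x, y, z, total = table[2]
print(len(table), len(table[0]))  # 25 4
```

Assigning the return value to the same name on each run is what makes the rebuild safe; append is only ever applied to the fresh list inside the function.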
I have 100 3x3x3 matrices that I would like to multiply with another large matrix of size 3x5x5 (similar to convolving one image with multiple filters, but not quite).
For the sake of explanation, this is what my large matrix looks like:
>>> x = np.arange(75).reshape(3, 5, 5)
>>> x
array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24]],

       [[25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34],
        [35, 36, 37, 38, 39],
        [40, 41, 42, 43, 44],
        [45, 46, 47, 48, 49]],

       [[50, 51, 52, 53, 54],
        [55, 56, 57, 58, 59],
        [60, 61, 62, 63, 64],
        [65, 66, 67, 68, 69],
        [70, 71, 72, 73, 74]]])
In memory, I assume all sub matrices in the large matrix are stored in contiguous locations (please correct me if I'm wrong). What I want to do is, from this 3x5x5 matrix, I want to extract 3 5x3 columns from each sub-matrix of the large matrix and then join them horizontally to get a 5x9 matrix (I apologise if this part is not clear, I can explain in more detail if need be). If I were using numpy, I'd do:
>>> k = np.hstack(np.vstack(x)[:, 0:3].reshape(3, 5, 3))
>>> k
array([[ 0,  1,  2, 25, 26, 27, 50, 51, 52],
       [ 5,  6,  7, 30, 31, 32, 55, 56, 57],
       [10, 11, 12, 35, 36, 37, 60, 61, 62],
       [15, 16, 17, 40, 41, 42, 65, 66, 67],
       [20, 21, 22, 45, 46, 47, 70, 71, 72]])
However, I'm not using python so I do not have any access to the numpy functions that I need in order to reshape the data blocks into a form I want to carry out multiplication... I can only directly call the cblas_sgemm function (from the BLAS library) in C, where k corresponds to input B.
Here's my call to cblas_sgemm:
cblas_sgemm( CblasRowMajor, CblasNoTrans, CblasTrans,
100, 5, 9,
1.0,
A, 9,
B, 9, // this is actually wrong, since I don't know how to specify the right parameter
0.0,
result, 5);
Basically, the ldb attribute is the offender here, because my data is not blocked the way I need it to be. I have tried different things, but I am not able to get cblas_sgemm to understand how I want it to read and understand my data.
In short, I don't know how to tell cblas_sgemm to read x like k. Is there a way I can smartly reshape my data in Python before sending it to C, so that cblas_sgemm can work the way I want it to?
I will transpose k by setting CblasTrans, so during multiplication, B is 9x5. My matrix A is of shape 100x9. Hope that helps.
Any help would be appreciated. Thanks!
In short, I don't know how to tell cblas_sgemm to read x like k.
You can't. You'll have to make a copy.
Consider k:
In [20]: k
Out[20]:
array([[ 0,  1,  2, 25, 26, 27, 50, 51, 52],
       [ 5,  6,  7, 30, 31, 32, 55, 56, 57],
       [10, 11, 12, 35, 36, 37, 60, 61, 62],
       [15, 16, 17, 40, 41, 42, 65, 66, 67],
       [20, 21, 22, 45, 46, 47, 70, 71, 72]])
In a two-dimensional array, the spacing of the elements in memory must be constant along each axis. You know from how x was created that the consecutive elements in memory are 0, 1, 2, 3, 4, ..., but your first row of k contains 0, 1, 2, 25, 26, .... There is no gap between 1 and 2 (i.e. the memory address increases by the size of one element of the array), but there is a large jump in memory between 2 and 25. A single leading dimension such as ldb cannot describe both step sizes, so you'll have to make a copy to create k.
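The uneven spacing can be made visible with a short NumPy sketch. It relies on the fact that x holds 0..74 in memory order, so each value doubles as its position in memory; the name k_alt is only illustrative:

```python
import numpy as np

x = np.arange(75).reshape(3, 5, 5)
k = np.hstack(np.vstack(x)[:, 0:3].reshape(3, 5, 3))

# Differences between neighbours in the first row of k are the memory steps
# (in elements) that a reader of x would have to take.
steps = np.diff(k[0]).tolist()
print(steps)  # [1, 1, 23, 1, 1, 23, 1, 1]

# Slicing alone gives a view onto x's memory; k does not share memory with x.
x2 = x[:, :, :3]
print(np.shares_memory(x, x2))  # True
print(np.shares_memory(x, k))   # False

# An equivalent way to build the copy: reorder the axes, then flatten the
# last two. The reshape has to copy because the data is not contiguous.
k_alt = x2.transpose(1, 0, 2).reshape(5, 9)
print(np.array_equal(k, k_alt))  # True
```

Because the steps alternate between 1 and 23, no single row stride describes k inside x's buffer, which is why the copy is needed before handing the data to cblas_sgemm.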
Having said that, there is an alternative method to efficiently achieve your desired final result using a bit of reshaping (without copying) and numpy's einsum function.
Here's an example. First define x and A:
In [52]: x = np.arange(75).reshape(3, 5, 5)
In [53]: A = np.arange(90).reshape(10, 9)
Here's my understanding of what you want to achieve; A.dot(k.T) is the desired result:
In [54]: k = np.hstack(np.vstack(x)[:, 0:3].reshape(3, 5, 3))
In [55]: A.dot(k.T)
Out[55]:
array([[ 1392,  1572,  1752,  1932,  2112],
       [ 3498,  4083,  4668,  5253,  5838],
       [ 5604,  6594,  7584,  8574,  9564],
       [ 7710,  9105, 10500, 11895, 13290],
       [ 9816, 11616, 13416, 15216, 17016],
       [11922, 14127, 16332, 18537, 20742],
       [14028, 16638, 19248, 21858, 24468],
       [16134, 19149, 22164, 25179, 28194],
       [18240, 21660, 25080, 28500, 31920],
       [20346, 24171, 27996, 31821, 35646]])
Here's how you can get the same result by slicing x and reshaping A:
In [56]: x2 = x[:,:,:3]
In [57]: A2 = A.reshape(-1, 3, 3)
In [58]: np.einsum('ijk,jlk', A2, x2)
Out[58]:
array([[ 1392,  1572,  1752,  1932,  2112],
       [ 3498,  4083,  4668,  5253,  5838],
       [ 5604,  6594,  7584,  8574,  9564],
       [ 7710,  9105, 10500, 11895, 13290],
       [ 9816, 11616, 13416, 15216, 17016],
       [11922, 14127, 16332, 18537, 20742],
       [14028, 16638, 19248, 21858, 24468],
       [16134, 19149, 22164, 25179, 28194],
       [18240, 21660, 25080, 28500, 31920],
       [20346, 24171, 27996, 31821, 35646]])
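The equivalence of the two routes can be checked programmatically. This sketch spells out the output indices explicitly ('->il'), which matches what the implicit form produces here: j and k are repeated and therefore summed, leaving i and l in alphabetical order.

```python
import numpy as np

x = np.arange(75).reshape(3, 5, 5)
A = np.arange(90).reshape(10, 9)

# Route 1: build k as a copy, then use an ordinary matrix product.
k = np.hstack(np.vstack(x)[:, 0:3].reshape(3, 5, 3))
expected = A.dot(k.T)

# Route 2: no k at all. A's 9 columns are viewed as 3 blocks of 3, and
# einsum sums over the block index j and the within-block index k.
result = np.einsum('ijk,jlk->il', A.reshape(-1, 3, 3), x[:, :, :3])

print(result.shape)                      # (10, 5)
print(np.array_equal(result, expected))  # True
```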
I have an array:
array = [12, 13, 14, 18, 17, 19, 30, 23]
I need to split this array into arrays of maximum three elements each:
[12, 13, 14] [18, 17, 19] [30, 23]
How can I do this?
Try this...
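Use Enumerable#each_slice to split the array into slices of up to three elements:
array = [12, 13, 14, 18, 17, 19, 30, 23]
array.each_slice(3)       # returns an Enumerator when no block is given
array.each_slice(3).to_a  # => [[12, 13, 14], [18, 17, 19], [30, 23]]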
Take a look at Enumerable#each_slice:
foo.each_slice(3).to_a
#=> [["1", "2", "3"], ["4", "5", "6"], ["7", "8", "9"], ["10"]]
If you're using rails you can also use in_groups_of:
foo.in_groups_of(3)
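By this time, I hope you got your answer. If you are using Rails, you can also use in_groups, which returns an array directly, so you won't have to call to_a explicitly. Note that in_groups(3) splits the array into 3 groups rather than into groups of 3 elements; for this 8-element array the two happen to give the same result: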
array.in_groups(3)
# => [[12, 13, 14], [18, 17, 19], [30, 23, nil]]
array.in_groups(3, false)
# => [[12, 13, 14], [18, 17, 19], [30, 23]]
One more advantage of using in_groups is that it can keep every group the same size: by default it pads the shorter groups with nil (fill_with = nil). Pass false as the second argument, as above, to skip the padding.