NumPy array loses dimension upon assignment/copy, why?

I have the following code:
print(type(a1), a1.shape)
a2 = a1 #.reshape(-1,1,2) this solves my problem
print(type(a2), a2.shape)
The output is:
<class 'numpy.ndarray'> (8, 1, 2)
<class 'numpy.ndarray'> (8, 2)
I know the (commented out) reshape solves my problem, however, I'd like to understand why a simple assignment results in losing the central dimension of the array.
Does anybody know what is going on? Why does referring to the array by another name change its dimensions?

Looking at the OpenCV script mentioned in the comments, the reshape to three dimensions is necessary because a dimension is lost through Boolean indexing, not through the assignment itself.
The names of the arrays in that script which motivated the question are p0 and good_new.
Here is a breakdown of the operations in that script:
1. p0 is a 3D array with shape (17, 1, 2).
2. The line
p1, st, err = cv.calcOpticalFlowPyrLK(old_gray, frame_gray, p0, None, **lk_params)
creates new arrays, with array p1 having shape (17, 1, 2) and array st having shape (17, 1).
3. The assignment good_new = p1[st==1] creates a new array object by a Boolean indexing operation on p1. This 2D array has shape (17, 2) when every entry of st is 1. A dimension has been lost through the indexing operation.
4. The name p0 needs to be assigned back to the array data contained in good_new, but p0 also needs to be 3D. To achieve this, the script uses p0 = good_new.reshape(-1, 1, 2).
For completeness, it is worth summarising why the Boolean indexing operation in step (3) results in a dimension disappearing.
The Boolean array st == 1 has shape (17, 1) which matches the initial dimensions of p1, (17, 1, 2).
This means that the mask selects along the first two dimensions of p1 together: the indexer array st == 1 determines which arrays of shape (2,) end up in the result. Those two dimensions are replaced by a single one, so the final array has shape (n, 2), where n is the number of True values in the Boolean array.
This behaviour is detailed in the NumPy documentation on Boolean array indexing.
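A minimal sketch of the effect, using invented arrays in place of the script's p1 and st (the values, and the single untracked point, are assumptions made purely for illustration):

```python
import numpy as np

# Stand-ins for the arrays returned by cv.calcOpticalFlowPyrLK (values are invented)
p1 = np.arange(17 * 1 * 2, dtype=np.float32).reshape(17, 1, 2)
st = np.ones((17, 1), dtype=np.uint8)
st[3, 0] = 0  # pretend one point was not tracked

mask = st == 1                    # shape (17, 1): covers the first two axes of p1
good_new = p1[mask]               # those two axes collapse into one of length mask.sum()
p0 = good_new.reshape(-1, 1, 2)   # restore the middle axis

print(mask.shape, good_new.shape, p0.shape)
```

With one False entry in the mask, good_new has shape (16, 2), and the reshape brings it back to (16, 1, 2).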

I am not sure why you are getting this; a plain assignment should not change the shape. Can you please share how your a1 was created?
I tried the following but was not able to reproduce it:
a1=np.ones((8,1,2),dtype=np.uint8)
print(type(a1), a1.shape)
<class 'numpy.ndarray'> (8, 1, 2)
a2=a1
print(type(a2), a2.shape)
<class 'numpy.ndarray'> (8, 1, 2)
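A short check along the same lines (a sketch, with a1 built the same way as in the attempt above) shows that assignment only binds a second name to the same array object, so the shape cannot change:

```python
import numpy as np

a1 = np.ones((8, 1, 2), dtype=np.uint8)
a2 = a1           # binds a second name to the same array object; nothing is copied
a3 = a1.copy()    # an explicit copy keeps the shape as well

print(a2 is a1, a2.shape, a3.shape)
```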

Related

How to do a cartesian product of a variable number of lists in Julia?

For each value j in the set {1, 2, ..., n} where the value of n can vary (it is some variable in my program that can be different depending on the inputs from the user), I have an array A_j. I would like to obtain the cartesian product of all the arrays A_j, so that I can then iterate through that cartesian product (taking one element from each A_1, A_2, ... A_n to get a tuple (a_1, a_2, ..., a_n) in A_1 x A_2 x ... x A_n). How would I accomplish this in Julia?
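Use Iterators.product. If the arrays are stored in a collection such as a vector As, splat it into the call with Iterators.product(As...):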
help?> Iterators.product
product(iters...)
Return an iterator over the product of several iterators. Each generated
element is a tuple whose ith element comes from the ith argument iterator.
The first iterator changes the fastest.
Examples
≡≡≡≡≡≡≡≡≡≡
julia> collect(Iterators.product(1:2, 3:5))
2×3 Matrix{Tuple{Int64, Int64}}:
(1, 3) (1, 4) (1, 5)
(2, 3) (2, 4) (2, 5)

Finding the set of all winning tic tac toe board states

Here's my problem. I want to create an algorithm which generates an array of arrays of every possible winning board state for an n-dimensional tic-tac-toe board. Say you have an n = 2 board, meaning 2x2, then the function should return the following array:
wins = [
[1,2],
[1,3],
[1,4],
[2,4]
]
I know this isn't specifically a MATLAB problem, however I'm trying to expand my understanding of how MATLAB works. My general idea is an algorithm that does the following:
generate an n-dimensional board of zeros
1. Go to the first cell, record that index ([1,])
2. Go to the end of the row, and that's your first board state ([1,2])
3. Go to the end of the column, that's your second board state ([1,3])
4. Go to the end of the diagonal, that's your third board state ([2,3])
5. Advance to the next cell, repeat, checking if you have already created that board state first ([2,4] should be the only one it hasn't done)
I think I'm overthinking the problem, but I'm not sure how to approach it. Can someone give me some guidance on how to do this in a MATLAB-y way? My guess is that traversing the matrix and just picking whole rows/columns/diagonals is easy; it's the 'checking if it exists' part that I'm not getting. What would you call this algorithm, in general? Thanks for any help!
Better idea: you don't do this square by square, you do this by dimension. For each dimension on the board, you have these possibilities for the coordinate to vary or not through winning combinations:
iterate through all the possible values, low to high
iterate through all the possible values, high to low
hold constant as the other dimensions iterate, but do so for each value in range, repeating for the other coordinates.
For instance, for a 3^3 board, call the coordinates x1, x2, x3 and look at the last one, x3. Assume that you've already determined that x1 will iterate low to high and x2 is constant at 2. You will now treat x3 in each of these ways:
iterate through all the possible values, low to high
(1, 2, 1), (2, 2, 2), (3, 2, 3)
iterate through all the possible values, high to low
(1, 2, 3), (2, 2, 2), (3, 2, 1)
hold constant as the other dimensions iterate, but do so for each value in range, repeating for the other coordinates.
(1, 2, 1), (2, 2, 1), (3, 2, 1)
(1, 2, 2), (2, 2, 2), (3, 2, 2)
(1, 2, 3), (2, 2, 3), (3, 2, 3)
Does that get you moving?
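The per-dimension idea can be sketched as follows. This is written in Python rather than MATLAB, and the function name winning_lines, the 1-based coordinates, and the de-duplication step are choices made for this illustration, not part of the answer above:

```python
from itertools import product

def winning_lines(k, d):
    """Return all winning lines on a board with d dimensions and side length k.

    Each line is a tuple of k cells; each cell is a tuple of d 1-based coordinates.
    """
    ascending = tuple(range(1, k + 1))
    descending = ascending[::-1]
    # Per-dimension choices: sweep up, sweep down, or stay fixed at one value.
    choices = [ascending, descending] + [(c,) * k for c in range(1, k + 1)]

    lines = set()
    for combo in product(choices, repeat=d):
        # Skip combinations where every coordinate is held constant (a single cell).
        if all(len(set(axis)) == 1 for axis in combo):
            continue
        cells = tuple(zip(*combo))  # k cells, one per step along the line
        # A line and its reverse are the same win; keep one canonical ordering.
        lines.add(min(cells, cells[::-1]))
    return sorted(lines)

print(len(winning_lines(3, 2)))  # 8
print(len(winning_lines(3, 3)))  # 49
```

Collecting the lines in a set takes care of the "checking if it exists" part: every line is produced twice (once in each direction), and the canonical ordering makes the two copies compare equal.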

Binning then sorting arrays in each bin but keeping their indices together

I have two arrays and the indices of these arrays are related. So x[0] is related to y[0], so they need to stay organized. I have binned the x array into two bins as shown in the code below.
x = [1,4,7,0,5]
y = [.1,.7,.6,.8,.3]
binx = [0,4,9]
index = np.digitize(x,binx)
Giving me the following:
In [1]: index
Out[1]: array([1, 2, 2, 1, 2])
So far so good. (I think)
The y array is a parameter telling me how well measured the x data point is, so .9 is better than .2, so I'm using the next code to sort out the best of the y array:
y.sort()
ysorted = y[int(len(y) * .5):]
which gives me:
In [2]: ysorted
Out[2]: [0.6, 0.7, 0.8]
giving me the last 50% of the array. Again, this is what I want.
My question is how do I combine these two operations? From each bin, I need to get the best 50% and put these new values into a new x and new y array. Again, keeping the indices of each array organized. Or is there an easier way to do this? I hope this makes sense.
Many numpy functions have arg... variants that don't operate "by value" but rather "by index". In your case argsort does what you want:
order = np.argsort(y)
# order is an array of indices such that
# y[order] is sorted
top50 = order[len(order) // 2 :]
top50x = np.asarray(x)[top50]  # x must be an array (not a list) to be indexed by an index array
# now top50x are the x corresponding 1-to-1 to the 50% best y
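To combine this with the binning from the question, one option is to apply argsort within each bin. The following is a minimal sketch using the question's data; the names new_x, new_y, members, and best are introduced here for illustration:

```python
import numpy as np

x = np.array([1, 4, 7, 0, 5])
y = np.array([.1, .7, .6, .8, .3])
binx = [0, 4, 9]
index = np.digitize(x, binx)  # array([1, 2, 2, 1, 2])

new_x, new_y = [], []
for b in np.unique(index):
    members = np.flatnonzero(index == b)      # positions of the points in this bin
    order = members[np.argsort(y[members])]   # same positions, sorted by y ascending
    best = order[len(order) // 2:]            # upper half of the bin (rounded up)
    new_x.extend(x[best])
    new_y.extend(y[best])

print(new_x, new_y)
```

Because x and y are always indexed with the same positions, the pairs stay aligned. For a bin with an odd number of points, the slice keeps the larger half, matching the y[int(len(y) * .5):] convention in the question.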
You should make a list of pairs from your x and y lists.
This can be achieved with the zip function:
x = [1,4,7,0,5]
y = [.1,.7,.6,.8,.3]
values = list(zip(x, y))  # list() is needed in Python 3, where zip returns an iterator
values
[(1, 0.1), (4, 0.7), (7, 0.6), (0, 0.8), (5, 0.3)]
To sort such a list of pairs by a specific element of each pair you may use the sort's key parameter:
values.sort(key=lambda pair: pair[1])
[(1, 0.1), (5, 0.3), (7, 0.6), (4, 0.7), (0, 0.8)]
Then you may do whatever you want with this sorted list of pairs.

Looping through slices of Theano tensor

I have two 2D Theano tensors, call them x_1 and x_2, and suppose for the sake of example, both x_1 and x_2 have shape (1, 50). Now, to compute their mean squared error, I simply run:
T.sqr(x_1 - x_2).mean(axis = -1).
However, what I wanted to do was construct a new tensor that consists of their mean squared error in chunks of 10. In other words, since I'm more familiar with NumPy, what I had in mind was to create the following tensor M in Theano:
M = [theano.tensor.sqr(x_1[:, i:i+10] - x_2[:, i:i+10]).mean(axis = -1) for i in xrange(0, 50, 10)]
Now, since Theano doesn't have for loops, but instead uses scan (which map is a special case of), I thought I would try the following:
sequence = T.arange(0, 50, 10)
M = theano.map(lambda i: theano.tensor.sqr(x_1[:, i:i+10] - x_2[:, i:i+10]).mean(axis = -1), sequence)
However, this does not seem to work, as I get the error:
only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices
Is there a way to loop through the slices using theano.scan (or map)? Thanks in advance, as I'm new to Theano!
Similar to what can be done in numpy, a solution would be to reshape your (1, 50) tensor to a (1, 5, 10) tensor (or even a (5, 10) tensor), so that each chunk of 10 consecutive entries lies along the last axis, and then to compute the mean along that last axis.
To illustrate this with numpy, suppose I want to compute means by slices of 2:
x = np.array([0, 2, 0, 4, 0, 6])
x = x.reshape([3, 2])
np.mean(x, axis=1)
outputs
array([ 1., 2., 3.])
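Applied to the shapes in the question, the same idea looks like this in NumPy. It is a sketch with random stand-in data for x_1 and x_2; the reshape-then-mean approach should carry over to Theano tensors, but only the NumPy version is shown here:

```python
import numpy as np

rng = np.random.default_rng(0)
x_1 = rng.normal(size=(1, 50))
x_2 = rng.normal(size=(1, 50))

# Reshape so the last axis holds 10 consecutive entries, then average over it.
sq = np.square(x_1 - x_2)
M = sq.reshape(1, 5, 10).mean(axis=-1)  # shape (1, 5)

# Reference: the explicit loop over slices from the question.
M_loop = np.stack([sq[:, i:i + 10].mean(axis=-1) for i in range(0, 50, 10)], axis=-1)

print(M.shape, np.allclose(M, M_loop))
```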

How do I algorithmically instantiate and manipulate a multidimensional array in Scala

I am trying to write a program to manage a database through a Scala GUI, and have been running into a lot of trouble formatting my data in such a way as to input it into a Table and have the column headers populate. To do this, I have been told I would need to use an Array[Array[Any]] instead of an ArrayBuffer[ArrayBuffer[String]] as I have been using.
My problem is that the way I am trying to fill these arrays is modular: I am trying to use the same function to draw from different tables in a MySQL database, each of which has a different number of columns and entries.
I have been able to (I think) define a 2-D array with
val Data = new Array[Array[String]](numColumns)(numRows)
but I haven't found any ways of editing individual cells in this new array.
Data(i)(j)=Value //or
Data(i,j)=Value
do not work, and give me errors about "Update" functionality.
I am sure this can't possibly be as complicated as I have been making it, so what is the easy way of managing these things in this language?
You don't need to read your data into an Array of Arrays. You just need to convert it to that format when you feed it to the Table constructor, which is easy, as demonstrated in my answer to your other question: How do I configure the Column names in a Scala Table?
If you're creating a 2D array, the idiom you want is
val data = Array.ofDim[String](numColumns, numRows)
(There is also new Array[String](numColumns, numRows), but that's deprecated.)
You access element (i, j) of an Array data with data(i)(j) (remember they start from 0).
But in general you should avoid mutable collections (like Array, ArrayBuffer) unless there's a good reason. Try Vector instead.
Without knowing the format in which you're retrieving data from the database it's not possible to say how to put it into a collection.
Update:
You can alternatively put the type information on the left hand side, so the following are equivalent (decide for yourself which you prefer):
val a: Array[Array[String]] = Array.ofDim(2,2)
val a = Array.ofDim[String](2,2)
To explain the syntax for accessing / updating elements: as in Java, a multi-dimensional array is just an array of arrays. So here, a(i) is element i of a, which is an Array[String], and so a(i)(j) is element j of that array, which is a String.
Luigi's answer is great, but I'd like to shed some light on why your code isn't working.
val Data = new Array[Array[String]](numColumns)(numRows)
does not do what you expect it to do. The new Array[Array[String]](numColumns) part does create an array of arrays of strings with numColumns entries, with all entries (arrays of strings) being null, and returns it. The following (numRows) then just calls the apply function on that returned object, which returns the entry at index numRows in that array, which is null.
You can try that out in the scala REPL: When you input
new Array[Array[String]](10)(9)
you get this as output:
res0: Array[String] = null
Luigi's solution, instead
Array.ofDim[String](2,2)
does the right thing:
res1: Array[Array[String]] = Array(Array(null, null), Array(null, null))
It's rather ugly, but you can update a multidimensional array with update (the assignment data(0)(0) = "foo" is shorthand for the same call):
> val data = Array.ofDim[String](2,2)
data: Array[Array[String]] = Array(Array(null, null), Array(null, null))
> data(0).update(0, "foo")
> data
data: Array[Array[String]] = Array(Array(foo, null), Array(null, null))
Not sure about the efficiency of this technique.
Luigi's answer is great, but I just wanted to point out another way of initialising an Array that is more idiomatic/functional – using tabulate. This takes a function that takes the array cell coordinates as input and produces the cell value:
scala> Array.tabulate[String](4, 4) _
res0: (Int, Int) => String => Array[Array[String]] = <function1>
scala> val data = Array.tabulate(4, 4) {case (x, y) => x * y }
data: Array[Array[Int]] = Array(Array(0, 0, 0, 0), Array(0, 1, 2, 3), Array(0, 2, 4, 6), Array(0, 3, 6, 9))
