Doubling Numpy array into bigger array - arrays

Is there a way to get [1,2,3,1,2,3] from array1 = [1,2,3] without creating a second
variable that stores [1,2,3] and then concatenating the two?
Could I use numpy.repeat for this?
Input:
[2,3,4]
Output:
[2,3,4,2,3,4]

You can use numpy.tile:
>>> np.tile([1,2,3], 2)
array([1, 2, 3, 1, 2, 3])
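To address the numpy.repeat part of the question, here is a small sketch comparing the two functions on the example input. The variable names are illustrative only. np.repeat repeats each element in place, so it does not give the requested order, while np.tile repeats the whole array.

```python
import numpy as np

array1 = np.array([2, 3, 4])

# np.tile repeats the whole array end to end.
tiled = np.tile(array1, 2)

# np.repeat repeats each element individually, which is a different result.
repeated = np.repeat(array1, 2)

# Concatenating the array with itself also works and needs no second variable.
concatenated = np.concatenate([array1, array1])

print(tiled)         # [2 3 4 2 3 4]
print(repeated)      # [2 2 3 3 4 4]
print(concatenated)  # [2 3 4 2 3 4]
```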

Related

Replacing only zeros in an array with another array's values in Julia

I am trying to replace an array with another array, but only in places where the original array is zero. These arrays are equal in dimensions.
Array1 = [0, 2, 4, 2, 5, 0, 0, 2, 5]
Array2 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
I am currently trying to use the code below to superimpose Array2 onto Array1 wherever Array1's values are equal to zero.
Array1[Array1 .== 0] .= Array2
From this I was hoping to get the following...
Array1 = [1, 2, 4, 2, 5, 6, 7, 2, 5]
but instead I get an error...
ERROR: DimensionMismatch("array could not be broadcast to match destination")
Is there a nice way to do this without looping through each list/array one element at a time? I am working with really large arrays and don't know if that would be too slow. Any help is appreciated.
This can be done easily almost the way you have it:
Array1[Array1 .== 0] .= Array2[Array1 .== 0]
The part Array1 .== 0 creates a Boolean array (a BitVector) with the same length as the original one: true in the places where the condition is met and false everywhere else.
When you do something like:
Array1 .= Array2
you are telling Julia to copy Array2 elementwise into Array1; this works because Array1 and Array2 have the same length. But when you try
Array1[Array1 .== 0] .= Array2
you are telling Julia to put every element of Array2 into a subset of the elements of Array1 (the ones that satisfy your condition). This fails because the two sides do not have the same length. If you instead do
Array1[Array1 .== 0] .= Array2[Array1 .== 0]
you are saying "take the elements of Array2 at the indices where Array1 satisfies the condition and copy them into the same places in Array1". Both sides now have the same length.
Note, however, that for loops are fast in Julia, so you will be fine if you use them.
Instead of indexing, you could also broadcast a function over the values. Here I've used 100*Array2 for the replacement values just so you can see which were changed, and @. is equivalent to writing out Array1 .= ifelse.(Array1 .== 0, 100 .* etc. but ensures you won't miss a dot.
julia> @. Array1 = ifelse(Array1==0, 100*Array2, Array1)
9-element Vector{Int64}:
100
2
4
2
5
600
700
2
5
This is likely to be more efficient, as each Array1 .== 0 makes a boolean array, and Array2[Array1 .== 0] makes a temporary copy of this data.
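For comparison with the NumPy questions on this page, the same two ideas (masked assignment and an elementwise choice) can be sketched in Python. This is an analogue, not a translation of the Julia code above, and the lowercase variable names are assumptions made for the example.

```python
import numpy as np

array1 = np.array([0, 2, 4, 2, 5, 0, 0, 2, 5])
array2 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

# Elementwise choice: take array2 where array1 is zero, otherwise keep array1.
replaced = np.where(array1 == 0, array2, array1)

# Masked assignment: index both sides with the same mask so the lengths match.
mask = array1 == 0
array1[mask] = array2[mask]

print(replaced)  # [1 2 4 2 5 6 7 2 5]
print(array1)    # [1 2 4 2 5 6 7 2 5]
```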

Loop to perform operation on i+1 in numpy array

I have a numpy array. I'd like to take the 3 numbers in each row, subtract the next row from them, and store those values in another array.
something like
for i in array:
    a = i - i+1
I know this is very wrong, but at least this gives the idea of what I want.
Obviously i+1 will just result in the value + 1 and then all I have is a = 1,1,1
When I say i+1 I mean the next in line.
So for example:
input = np.array([[4,4,5], [2,3,1],[1,2,0]])
output = np.array([[2,1,4],[1,1,1]]) etc....
What would be the best way to do this iteratively on thousands of rows?
IIUC, instead of looping, you can just shift your arrays 1 up using np.roll, subtract that from your original input, and take all the resulting arrays except the last (because there will be nothing to subtract from the last array):
>>> inp = np.array([[4,4,5], [2,3,1],[1,2,0]])
>>> inp
array([[4, 4, 5],
       [2, 3, 1],
       [1, 2, 0]])
>>> (inp - np.roll(inp,-1,axis=0))[:-1]
array([[2, 1, 4],
       [1, 1, 1]])
Or, a more straightforward way would just be to use numpy indexing:
>>> inp[:-1] - inp[1:]
array([[2, 1, 4],
       [1, 1, 1]])
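Another option, not mentioned in the answer, is np.diff. It computes next row minus current row along the chosen axis, so negating it should give the same result as inp[:-1] - inp[1:]. A minimal sketch:

```python
import numpy as np

inp = np.array([[4, 4, 5], [2, 3, 1], [1, 2, 0]])

# np.diff(inp, axis=0)[i] is inp[i + 1] - inp[i], so flip the sign
# to get "current row minus next row".
result = -np.diff(inp, axis=0)

print(result)
```

The slicing form in the answer is arguably clearer about the direction of the subtraction; np.diff is mainly convenient when the opposite sign is wanted.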

How to append an array to a np.array?

import numpy as np
coordinates = np.empty([0,5])
np.vstack( (coordinates, np.array([1, 2, 3, 4, 5]) ))
print(coordinates) # []
np.append(coordinates, np.array([1, 2, 3, 4, 5]), axis=0)
print(coordinates)
In the code shown above, I tried to append the array, but both approaches failed. In the first approach, the output is still empty, in the second approach, the output is an error saying
ValueError: all the input arrays must have same number of dimensions
What is wrong with my method?
You need to capture the result of numpy.vstack(); it returns a new array and does not modify its inputs.
From the docs:
numpy.vstack(tup)
Returns:
stacked : ndarray
Test Code:
import numpy as np
coordinates = np.empty([0, 5])
x = np.vstack((coordinates, np.array([1, 2, 3, 4, 5])))
print(x)
Results:
[[1. 2. 3. 4. 5.]]
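The np.append attempt from the question has the same problem (the return value is discarded) plus a second one: with axis=0, both arguments must have the same number of dimensions, and the new row is 1-D while coordinates is 2-D. A sketch of one way to fix both, with the name row introduced only for this example:

```python
import numpy as np

coordinates = np.empty([0, 5])
row = np.array([1, 2, 3, 4, 5])

# vstack treats the 1-D row as a single row; keep the returned array.
coordinates = np.vstack((coordinates, row))

# np.append with axis=0 needs matching dimensions, so give the row
# shape (1, 5) first, and again keep the returned array.
coordinates = np.append(coordinates, row[np.newaxis, :], axis=0)

print(coordinates.shape)  # (2, 5)
```

Both calls copy the whole array each time, so when many rows are added in a loop it is usually cheaper to collect them in a Python list and convert once at the end.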

Sort array based on frequency

How can I sort an array by how often its values repeat?
Suppose I have an array [3, 3, 3, 3, 4, 4].
I expect the result [3, 4], since 3 is the most repeated value and 4 is the least repeated.
Is there any way to do it?
Thanks in advance!
Here is one way of doing it:
distinctList: Get all distinct values from the array and store them here.
countArray: For each index i in distinctList, countArray[i] holds the number of occurrences of distinctList[i].
Now sort countArray and apply the same swaps to distinctList simultaneously.
Ex: [3, 3, 4, 4, 4]
distinctList [3,4]
countArray [2,3]
Sorting countArray in descending order gives [3,2]; sorting distinctList at the same time gives [4,3]
Output: [4, 3]
Simple in Python:
data = [3, 2, 3, 4, 2, 1, 3]
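# Count how many times each distinct value occurs.
frequencies = {x: 0 for x in data}
for x in data:
    frequencies[x] = frequencies[x] + 1
# Sort the original data, and the distinct values, by descending frequency.
sorted_with_repetitions = sorted(data, key=lambda x: frequencies[x], reverse=True)
sorted_without_repetitions = sorted(frequencies.keys(), key=lambda x: frequencies[x], reverse=True)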
print(data)
print(sorted_with_repetitions)
print(sorted_without_repetitions)
print(frequencies)
The same approach (an associative container to collect distinct values and count occurrences, used in a custom comparison to sort an array with the original data or only distinct items) is suitable for Java.
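As a shorter variant of the Python answer above, the counting step can be handed to collections.Counter from the standard library. This is a sketch of the same idea, not a different algorithm. Values with equal counts keep the order in which they first appear.

```python
from collections import Counter

data = [3, 2, 3, 4, 2, 1, 3]

# Counter builds the value -> occurrences mapping in one step.
counts = Counter(data)

# most_common() lists (value, count) pairs from most to least frequent.
distinct_by_frequency = [value for value, _ in counts.most_common()]

# Keep the repetitions by sorting the original data with the counts as key.
sorted_with_repetitions = sorted(data, key=lambda value: counts[value], reverse=True)

print(distinct_by_frequency)    # [3, 2, 4, 1]
print(sorted_with_repetitions)  # [3, 3, 3, 2, 2, 4, 1]
```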

How to extract lines in an array, which contain a certain value? (numpy, scipy)

I have a NumPy 2D array and I want to get the values in column c for every row r where the entry at (r, c-1) equals a certain value (int n).
I don't want to iterate over the rows writing something like
for r in range(len(rows)):
    if array[r, c-1] == 1:
        store array[r, c]
because there are 4000 rows and this 2D array is just one of 20 I have to look through.
I found "filter" but don't know how to use it (I found no documentation).
Is there a function that provides such a search?
I hope I understood your question correctly. Let's say you have an array a
import numpy as np
a = np.array(list(range(7)) * 3).reshape(7, 3)
print(a)
[[0 1 2]
 [3 4 5]
 [6 0 1]
 [2 3 4]
 [5 6 0]
 [1 2 3]
 [4 5 6]]
and you want to extract all lines where the first entry is 2. This can be done like this:
print(a[a[:, 0] == 2])
[[2 3 4]]
a[:,0] denotes the first column of the array, == 2 returns a Boolean array marking the entries that match, and then we use advanced indexing to extract the respective rows.
Of course, NumPy needs to iterate over all entries, but this will be much faster than doing it in Python.
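The question asks for column c in the rows where column c-1 equals n, rather than for whole rows. The same Boolean mask can be combined with a column index to get that directly. A minimal sketch, where table, c, and n are made-up example values:

```python
import numpy as np

# Hypothetical data: the first column is the key, the second is the value.
table = np.array([[1, 10],
                  [0, 20],
                  [1, 30],
                  [2, 40]])
c = 1  # column to return
n = 1  # value to look for in column c - 1

# The row mask selects matching rows; the second index picks column c.
selected = table[table[:, c - 1] == n, c]

print(selected)  # [10 30]
```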
NumPy arrays do not maintain a search index. If you need to perform this specific operation more efficiently than linear in the array size, then you need to use something other than NumPy.
