Loop to perform operation on i+1 in numpy array - arrays

I have a numpy array. I'd like to take the 3 numbers in each row, subtract the next row's numbers from them, and store the results in another array.
something like
for i in array:
    a = i - i+1
I know this is very wrong, but at least this gives the idea of what I want.
Obviously i+1 will just result in the value + 1 and then all I have is a = 1,1,1
When I say i+1 I mean the next in line.
So for example:
input = np.array([[4,4,5], [2,3,1],[1,2,0]])
output = np.array([[2,1,4],[1,1,1]]) etc....
What would be the best way to do this iteratively on thousands of rows?

IIUC, instead of looping, you can shift the rows up by one using np.roll, subtract the shifted array from your original input, and take all the resulting rows except the last (np.roll wraps around, so the last row would be the last input row minus the first, which you don't want):
>>> inp = np.array([[4,4,5], [2,3,1],[1,2,0]])
>>> inp
array([[4, 4, 5],
       [2, 3, 1],
       [1, 2, 0]])
>>> (inp - np.roll(inp,-1,axis=0))[:-1]
array([[2, 1, 4],
       [1, 1, 1]])
Or, a more straightforward way would just be to use numpy indexing:
>>> inp[:-1] - inp[1:]
array([[2, 1, 4],
       [1, 1, 1]])
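Another option, not mentioned in the answer above, is np.diff. The following is a minimal sketch that assumes a signed integer or float dtype, since the result is negated to match the "row minus next row" order:

```python
import numpy as np

inp = np.array([[4, 4, 5], [2, 3, 1], [1, 2, 0]])

# np.diff along axis 0 computes inp[1:] - inp[:-1],
# so negate it to get "each row minus the next row".
result = -np.diff(inp, axis=0)
print(result)
```

This gives the same values as inp[:-1] - inp[1:]; which one reads better is mostly a matter of taste.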

Related

How would you split a numpy array where the elements give the partition size?

For a numpy 1d array such as:
In [1]: A = np.array([2,5,1,3,9,0,7,4,1,2,0,11])
In [2]: A
Out[2]: array([2,5,1,3,9,0,7,4,1,2,0,11])
I need to split the array by using the values as a sub-array length.
For the example array:
The first index has a value of 2, so I need the first split to occur at index 0 + 2, so it would result in ([2,5,1]).
Skip to index 3 (since indices 0-2 were gobbled up in step 1).
The value at index 3 = 3, so the second split would occur at index 3 + 3, and result in ([3,9,0,7]).
Skip to index 7
The value at index 7 = 4, so the third and final split would occur at index 7 + 4, and result in ([4,1,2,0,11])
I'm using this simple array as an example, because I think it will help in my actual use case, which is reading data from binary files (either as bytes or unsigned shorts). I'm guessing that numpy will be the fastest way to do it, but I could also use struct/bytearray/lists or whatever would be best.
I hope this makes sense. I had a hard time trying to figure out how best to word the question.
Here is an approach using standard python lists and a while loop:
def custom_partition(arr):
    partitions = []
    i = 0
    while i < len(arr):
        partition_size = arr[i]
        # The slice holds the size value itself plus that many following items.
        next_i = i + partition_size + 1
        partitions.append(arr[i:next_i])
        i = next_i
    return partitions
a = [2, 5, 1, 3, 9, 0, 7, 4, 1, 2, 0, 11]
b = custom_partition(a)
print(b)
Output:
[[2, 5, 1], [3, 9, 0, 7], [4, 1, 2, 0, 11]]
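If the data is already in a numpy array, the same walk can be used to collect the split points and hand them to np.split. This is only a sketch; the function name partition_by_length is made up for illustration, and the loop itself is still plain Python because each split point depends on the previous one:

```python
import numpy as np

def partition_by_length(arr):
    """Split arr so that each piece starts with its own length value."""
    arr = np.asarray(arr)
    starts = []
    i = 0
    while i < len(arr):
        starts.append(i)
        # Jump over the size value and the items it announces.
        i += int(arr[i]) + 1
    # np.split cuts at the given indices; the first start (0) is implicit.
    return np.split(arr, starts[1:])

A = np.array([2, 5, 1, 3, 9, 0, 7, 4, 1, 2, 0, 11])
parts = partition_by_length(A)
print([p.tolist() for p in parts])
```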

Assign 1 and 0 values to numpy array depending on whether values are in list

I am looking for a way to filter numpy arrays based on a list
input_array = [[0,4,6],[2,1,1],[6,6,9]]
list=[9,4]
...
output_array = [[0,1,0],[0,0,0],[0,0,1]]
I am currently flattening the array, and turning it to a list and back. Looks very unpythonic:
list=[9,4]
shape = input_array.shape
input_array = input_array.flatten()
output_array = np.array([int(i in list) for i in input_array])
output_array = output_array.reshape(shape)
We could use np.in1d to get the mask of matches. Now, np.in1d flattens the input to 1D before processing. So, the output from it is to be reshaped back to 2D and then converted to int for an output with 0s and 1s.
Thus, the implementation would be -
np.in1d(input_array, list).reshape(input_array.shape).astype(int)
Sample run -
In [40]: input_array
Out[40]:
array([[0, 4, 6],
       [2, 1, 1],
       [6, 6, 9]])
In [41]: list=[9,4]
In [42]: np.in1d(input_array, list).reshape(input_array.shape).astype(int)
Out[42]:
array([[0, 1, 0],
       [0, 0, 0],
       [0, 0, 1]])
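On newer NumPy versions, np.isin does the same membership test and keeps the shape of the input, so the reshape step is not needed (np.in1d is deprecated as of NumPy 2.0). A short sketch, using the name values instead of list to avoid shadowing the built-in:

```python
import numpy as np

input_array = np.array([[0, 4, 6], [2, 1, 1], [6, 6, 9]])
values = [9, 4]

# np.isin returns a boolean mask with the same shape as input_array.
output_array = np.isin(input_array, values).astype(int)
print(output_array)
```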

Sort array based on frequency

How can I sort an array by its most repeated values?
Suppose I have an array [3, 3, 3, 3, 4, 4].
I expect the result [3, 4], since 3 is the most repeated and 4 is the least repeated.
Is there any way to do it?
Thanks in advance!
Here is one way of doing it:
distinctList: Get all distinct values from the array and store them in this list.
countArray: For each index i in distinctList, countArray[i] holds the number of occurrences of distinctList[i].
Now sort countArray and apply the same swaps to distinctList simultaneously.
Ex: [3, 3, 4, 4, 4]
distinctList [3,4]
countArray [2,3]
Descending sort of countArray gives [3,2]; sorting distinctList at the same time gives [4,3]
Output: [4, 3]
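The steps above can be sketched in Python roughly as follows. The function name sort_by_frequency is made up for illustration, and instead of mirroring swaps by hand, the two lists are zipped and sorted together, which has the same effect:

```python
def sort_by_frequency(values):
    distinct_list = []  # distinct values, in order of first appearance
    count_array = []    # count_array[i] counts distinct_list[i]
    for v in values:
        if v in distinct_list:
            count_array[distinct_list.index(v)] += 1
        else:
            distinct_list.append(v)
            count_array.append(1)
    # Sort (count, value) pairs by descending count; ties keep their order.
    pairs = sorted(zip(count_array, distinct_list),
                   key=lambda pair: pair[0], reverse=True)
    return [value for _, value in pairs]

print(sort_by_frequency([3, 3, 4, 4, 4]))
```

The linear search in distinct_list keeps the sketch close to the description; a dictionary would be the usual choice for larger inputs.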
Simple in Python:
data = [3, 2, 3, 4, 2, 1, 3]
frequencies = {x: 0 for x in data}
for x in data:
    frequencies[x] = frequencies[x] + 1
sorted_with_repetitions = sorted(data, key=lambda x: frequencies[x], reverse=True)
sorted_without_repetitions = sorted(frequencies.keys(), key=lambda x: frequencies[x], reverse=True)
print(data)
print(sorted_with_repetitions)
print(sorted_without_repetitions)
print(frequencies)
The same approach (an associative container to collect distinct values and count occurrences, used in a custom comparison to sort an array with the original data or only distinct items) is suitable for Java.

Calculating repetitive permutations of an array

Let's say I have an array with 5 elements. How can I calculate all possible repetitive permutations of this array in C?
Edit: What I mean is creating all possible arrays using those 5 numbers. So the position matters.
Example:
array = [1,2,3,4,5]
[1,1,1,1,1]
[1,1,1,1,2]
[1,1,1,2,3]
.
.
A common way to generate combinations or permutations is to use recursion: enumerate each of the possibilities for the first element, and prepend those to each of the combinations or permutations for the same set reduced by one element. So, if we say that you're looking for the permutations of n things taken k at a time and we use the notation perms(n, k), you get:
perms(5,5) = {
[1, perms(5,4)]
[2, perms(5,4)]
[3, perms(5,4)]
[4, perms(5,4)]
[5, perms(5,4)]
}
Likewise, for perms(5,4) you get:
perms(5,4) = {
[1, perms(5,3)]
[2, perms(5,3)]
[3, perms(5,3)]
[4, perms(5,3)]
[5, perms(5,3)]
}
So part of perms(5,5) looks like:
[1, 1, perms(5,3)]
[1, 2, perms(5,3)]
[1, 3, perms(5,3)]
[1, 4, perms(5,3)]
[1, 5, perms(5,3)]
[2, 1, perms(5,3)]
[2, 2, perms(5,3)]
...
Defining perms(n, k) is easy. As for any recursive definition, you need two things: a base case and a recursion step. The base case is where k = 0: perms(n, 0) contains a single element, the empty array []. For the recursive step, you generate elements by prepending each of the possible values in your set to all of the elements of perms(n, k-1).
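The recursive definition can be sketched in a few lines. Python is used here only for brevity (the question asks about C, where the same structure applies but the result storage has to be managed by hand), and the function takes the list of values rather than just n:

```python
def perms(values, k):
    """All length-k sequences over values, with repetition allowed."""
    if k == 0:
        # Base case: exactly one sequence of length 0, the empty one.
        return [[]]
    shorter = perms(values, k - 1)
    # Recursive step: prepend every value to every shorter sequence.
    return [[v] + rest for v in values for rest in shorter]

result = perms([1, 2, 3, 4, 5], 5)
print(len(result), result[0], result[-1])
```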
If I get your question correctly, you need to generate all 5 digit numbers with digits 1,2,3,4 and 5. So there is a simple solution - generate all numbers base five up to 44444 and then map the 0 to 1, 1 to 2 and so on. Add leading zeros where needed - so 10 becomes 00010 or [1,1,1,2,1].
NOTE: you don't actually have to generate the numbers themselves; you may just iterate over the numbers up to 5**5 (exclusive) and for each of them find the corresponding sequence by getting its digits in base 5.
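As a rough illustration of the base-5 idea (again in Python for brevity; the helper name sequence_from_index is hypothetical), each index from 0 to 5**5 - 1 is converted to its base-5 digits, padded with leading zeros, and the digits are used as positions in the array:

```python
def sequence_from_index(index, values, length):
    """Map an integer to the sequence given by its digits in base len(values)."""
    base = len(values)
    digits = []
    for _ in range(length):
        index, digit = divmod(index, base)
        digits.append(digit)  # least significant digit first
    digits.reverse()          # most significant digit first, zero-padded
    return [values[d] for d in digits]

values = [1, 2, 3, 4, 5]
# 5 in decimal is 10 in base five, i.e. 00010 after padding.
print(sequence_from_index(5, values, 5))
```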
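#include <stddef.h>

/* Advance dst to the next sequence in the given base, like an odometer.
   Returns 1 on success, 0 once every position has wrapped around. */
int increment(size_t *dst, size_t len, size_t base) {
    if (len == 0) return 0;
    if (dst[len-1] != base-1) {
        ++dst[len-1];
        return 1;
    } else {
        dst[len-1] = 0;
        return increment(dst, len-1, base);
    }
}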
Armed with this function you can iterate over all repetitive permutations of (0 ... 4) starting from {0, 0, 0, 0, 0}. The function will return 0 when it runs out of repetitive permutations.
Then for each repetitive permutation in turn, use the contents as indexes into your array so as to get a repetitive permutation of the array rather than of (0 ... 4).
In your given example, each position could be occupied by any of 1, 2, 3, 4, 5. As there are 5 positions, the total number of possibilities = 5 * 5 * 5 * 5 * 5 = 5 ^ 5 = 3125. In general, it would be N ^ N (where ^ is the exponentiation operator).
To generate these possibilities, in each of the positions, put the numbers 1, 2, 3, 4, 5, one by one, and increment starting from the last position, similar to a 5 digit counter.
Hence, start with 11111. Increment the last position to get 11112 ... until 11115.
Then wrap back to 1, and increment the next digit 11121 continue with 11122 ... 11125, etc. Repeat this till you reach the first position, and you would end at 55555.

How to extract lines in an array, which contain a certain value? (numpy, scipy)

I have a numpy 2D array and I want to return the values in column c for the rows where the entry at (r, c-1) (row r, column c-1) equals a certain value (int n).
I don't want to iterate over the rows writing something like
for r in range(len(rows)):
    if array[r, c-1] == 1:
        store array[r,c]
, because there are 4000 of them and this 2D array is just one of 20 I have to look through.
I found "filter" but don't know how to use it (found no doc).
Is there a function that provides such a search?
I hope I understood your question correctly. Let's say you have an array a
>>> import numpy as np
>>> a = np.array(list(range(7)) * 3).reshape(7, 3)
>>> a
array([[0, 1, 2],
       [3, 4, 5],
       [6, 0, 1],
       [2, 3, 4],
       [5, 6, 0],
       [1, 2, 3],
       [4, 5, 6]])
and you want to extract all lines where the first entry is 2. This can be done like this:
>>> a[a[:,0] == 2]
array([[2, 3, 4]])
a[:,0] denotes the first column of the array, == 2 returns a Boolean array marking the entries that match, and then we use advanced indexing to extract the respective rows.
Of course, NumPy needs to iterate over all entries, but this will be much faster than doing it in Python.
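If only the values from column c are wanted, as in the question, the Boolean mask can be combined with a column index. This is a small sketch; c = 1 and n = 3 are arbitrary values picked for illustration:

```python
import numpy as np

a = np.array(list(range(7)) * 3).reshape(7, 3)
c = 1  # column to return (illustrative choice)
n = 3  # value to look for in column c - 1 (illustrative choice)

# Rows where column c-1 equals n, restricted to column c.
selected = a[a[:, c - 1] == n, c]
print(selected)
```

The result is a 1-D array with one entry per matching row, and it is empty when nothing matches.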
Numpy arrays are not indexed. If you need to perform this specific operation more efficiently than linear in the array size, then you need to use something other than numpy.
