Python Pandas: selecting 1st element in array in all cells

What I am trying to do is select the 1st element of each cell regardless of the number of columns or rows (they may change based on user defined criteria) and make a new pandas dataframe from the data. My actual data structure is similar to what I have listed below.
0 1 2
0 [1, 2] [2, 3] [3, 6]
1 [4, 2] [1, 4] [4, 6]
2 [1, 2] [2, 3] [3, 6]
3 [4, 2] [1, 4] [4, 6]
I want the new dataframe to look like:
0 1 2
0 1 2 3
1 4 1 4
2 1 2 3
3 4 1 4
The code below generates a data set similar to mine. Line d is my unsuccessful attempt to do this for all columns; line c mimics what I have seen in a similar question and works, but only for one column. The similar, but different, question is: Python Pandas: selecting element in array column
import pandas as pd
zz = pd.DataFrame([[[1,2],[2,3],[3,6]],[[4,2],[1,4],[4,6]],
[[1,2],[2,3],[3,6]],[[4,2],[1,4],[4,6]]])
print(zz)
x= zz.dtypes
print(x)
a = pd.DataFrame((zz.columns.values))
b = pd.DataFrame.transpose(a)
c =zz[0].str[0] # this will give the 1st value for each cell in columns 0
d= zz[[b[0]].values].str[0] #attempt to get 1st value for each cell in all columns

You can use apply, and to select the first value of each list use indexing with str:
print (zz.apply(lambda x: x.str[0]))
0 1 2
0 1 2 3
1 4 1 4
2 1 2 3
3 4 1 4
Another solution with stack and unstack:
print (zz.stack().str[0].unstack())
0 1 2
0 1 2 3
1 4 1 4
2 1 2 3
3 4 1 4

I would use applymap, which applies the same function to each individual cell in your DataFrame (in pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map):
zz.applymap(lambda x: x[0])
0 1 2
0 1 2 3
1 4 1 4
2 1 2 3
3 4 1 4

I'm a big fan of stack + unstack
However, @jezrael already put that answer down, so +1 from me.
That said, here is a quicker way: slicing a numpy array.
import numpy as np
pd.DataFrame(
    np.array(zz.values.tolist())[:, :, 0],
    zz.index, zz.columns
)
0 1 2
0 1 2 3
1 4 1 4
2 1 2 3
3 4 1 4
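
As a cross-check, the sketch below rebuilds the example frame and extracts the first elements two ways: an element-wise map, which should also tolerate lists of different lengths, and the NumPy slice from the answer above, which assumes every cell holds a list of the same length. The helper name first_elements is hypothetical and not part of pandas.

```python
import numpy as np
import pandas as pd

# Same example data as in the question: every cell holds a two-element list.
zz = pd.DataFrame([[[1, 2], [2, 3], [3, 6]],
                   [[4, 2], [1, 4], [4, 6]],
                   [[1, 2], [2, 3], [3, 6]],
                   [[4, 2], [1, 4], [4, 6]]])


def first_elements(frame):
    """Hypothetical helper: take cell[0] from every cell, column by column."""
    return frame.apply(lambda col: col.map(lambda cell: cell[0]))


# Element-wise version; does not require equal-length lists.
first_by_map = first_elements(zz)

# NumPy version; builds a (rows, cols, 2) array, so lists must be equal length.
first_by_numpy = pd.DataFrame(
    np.array(zz.values.tolist())[:, :, 0],
    index=zz.index, columns=zz.columns,
)

print(first_by_map)
print(first_by_numpy)
```

Series.map is used here instead of applymap so the sketch does not depend on which pandas version is installed.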

Related

Ordered random numbers in Matlab

I am trying to generate random numbers between 1 and 5 using Matlab's randperm, calling randperm(5).
Each time this gives me a different array let's say for example:
x = randperm(5)
x = [3 2 4 1 5]
I need the vector to be arranged such that 4 and 5 are always next to each other and 2 is always between 1 and 3... so for e.g. [3 2 1 4 5] or [4 5 1 2 3].
So essentially I have two "blocks" of unequal length - 1 2 3 and 4 5. The order of the blocks is not so important, just that 4 & 5 end up together and 2 in between 1 and 3.
I can basically only have 4 possible combinations:
[1 2 3 4 5]
[3 2 1 4 5]
[4 5 1 2 3]
[4 5 3 2 1]
Does anyone know how I can do this?
Thanks
I'm not sure if you want a solution that would somehow generalize to a larger problem, but based on how you've described your problem above it looks like you are only going to have 8 possible combinations that satisfy your constraints:
possible = [1 2 3 4 5; ...
1 2 3 5 4; ...
3 2 1 4 5; ...
3 2 1 5 4; ...
4 5 1 2 3; ...
5 4 1 2 3; ...
4 5 3 2 1; ...
5 4 3 2 1];
You can now randomly select one or more of these rows using randi, and can even create an anonymous function to do it for you:
randPattern = @(n) possible(randi(size(possible, 1), [1 n]), :)
This allows you to select, for example, 5 patterns at random (one per row):
>> patternMat = randPattern(5)
patternMat =
4 5 3 2 1
3 2 1 4 5
4 5 3 2 1
1 2 3 5 4
5 4 3 2 1
You can generate each block and shuffle it, store the blocks as members of a cell array, shuffle the cell array, and finally convert the cell array to a vector.
b45=[4 5]; % block 1
b13=[1 3]; % block 2
r45 = randperm(2); % indices for shuffling block 1
r13 = randperm(2); % indices for shuffling block 2
r15 = randperm(2); % indices for shuffling the cell
blocks = {b45(r45) [b13(r13(1)) 2 b13(r13(2))]}; % shuffle each block (2 stays in the middle) and set them as members of a cell array
result = [blocks{r15}] % shuffle the cell and convert to a vector
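
For comparison, the same block-shuffling idea can be sketched in Python using only the standard library. This is an illustration under the constraints stated in the question, not a translation of either answer; the function name random_block_order is hypothetical.

```python
import random


def random_block_order(rng):
    """Return a permutation of 1..5 with 4 and 5 adjacent and 2 between 1 and 3."""
    # Block one: 4 and 5 in random order.
    block_45 = [4, 5]
    rng.shuffle(block_45)

    # Block two: 1 and 3 in random order, with 2 fixed in the middle.
    ends = [1, 3]
    rng.shuffle(ends)
    block_123 = [ends[0], 2, ends[1]]

    # Shuffle the order of the two blocks, then flatten into one list.
    blocks = [block_45, block_123]
    rng.shuffle(blocks)
    return [value for block in blocks for value in block]


rng = random.Random()  # pass random.Random(seed) for reproducible output
print(random_block_order(rng))
```

Each of the three shuffles has two outcomes, so this can produce at most 2 * 2 * 2 = 8 distinct orderings, matching the eight rows listed in the first answer.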

Find and Replace specific number at specific location in array in MATLAB

I have an array containing numbers.
A = [1 0 5 6 2 4 5 7 8 8 3 2 1 0 0 1 0 0];
I have calculated peaks and locations of these numbers in an array.
pks = [6 8 1 ]
locs = [4 9 16]
Now I want to update the array with the new peaks value that I have calculated and plot it.
Example.
I have received peaks [6, 8, 1] at locations [4, 9, 16].
I have altered the peaks values e.g. (pks-1).
I want to replace the peak values in the original array with the new values [5, 7, 0].
Like this.
% replace: ↓ ↓ ↓
A = [1 0 5 5 2 4 5 7 7 8 3 2 1 0 0 0 0 0];
Is there any trick to do this in MATLAB?
Thanks a lot.
Example Code
A = [1 0 5 6 2 4 5 7 8 8 3 2 1 0 0 1 0 0];
[pks,locs] = findpeaks(A);
for i=1:length(pks)
if (pks(i)==locs(i))
pks_1(i)=(pks(i)-1);
A_copy(A_copy==pks(i))=pks_1(i);
else
goto if
end
end
You can index them directly. Replace your example code with the following:
A = [1 0 5 6 2 4 5 7 8 8 3 2 1 0 0 1 0 0];
% We aren't interested in the actual pks values, so use ~ instead
[~,locs] = findpeaks(A)
% Reduce all values at 'locs' by 1
A(locs) = A(locs) - 1;
Note that there were several errors in your code. For instance,
you are comparing pks(i) == locs(i). Have a think about what that is actually comparing: it checks a peak value against a peak location, which does not tell you when your loop is at a peak. For that you would need a double loop
for jj = 1:numel(pks)
    for ii = 1:numel(A)
        if (ii == locs(jj))
            % Peak is at index ii
        end
    end
end
Better would be
for ii = locs
% Peak is at index ii
end
Even better would be the direct indexing I've shown at the top of this answer!
You are also indexing A_copy and pks_1 before they're defined, so that could cause issues.
Also, I'm not sure what you think the goto statement is doing; MATLAB has no goto.
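
The direct-indexing idea carries over to NumPy almost unchanged. The sketch below is only an illustration: it takes the peak locations [4, 9, 16] quoted in the question as given rather than recomputing them, and converts them from MATLAB's 1-based indexing to Python's 0-based indexing. The variable names are assumptions made for the example.

```python
import numpy as np

A = np.array([1, 0, 5, 6, 2, 4, 5, 7, 8, 8, 3, 2, 1, 0, 0, 1, 0, 0])

# Peak locations as reported in the question (1-based, MATLAB convention).
locs_matlab = np.array([4, 9, 16])

# Convert to 0-based indices for NumPy.
locs = locs_matlab - 1

# Work on a copy so the original array stays available for plotting.
updated = A.copy()

# Integer-array indexing: reduce every value at a peak location by 1.
updated[locs] -= 1

print(updated)
```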

Counting values along an axis in a 3D array that are greater than threshold values from a 2D array

I have a 3D array of dimensions (200,200,3). These are images of dimensions (200,200) stacked using numpy.dstack. I would like to count the number of values along axis=2 that are greater than a corresponding 2D threshold array of dimensions (200,200). The output counts array should have dimensions (200,200). Here is my code so far.
import numpy as np
stacked_images=np.random.rand(200,200,3)
threshold=np.random.rand(200,200)
counts=(stacked_images<threshold).sum(axis=2)
I am getting the following error.
ValueError: operands could not be broadcast together with shapes (200,200,3) (200,200)
The code works if threshold is an integer/float value. For example.
threshold=0.3
counts=(stacked_images<threshold).sum(axis=2)
Is there a simple way to do this if threshold is a 2D array? I guess I am not understanding numpy broadcasting rules correctly.
numpy compares the arrays value by value, so their shapes must be compatible. Broadcasting aligns shapes starting from the last axis, so (200,200,3) against (200,200) pairs 3 with 200 and fails. In your case you want to compare every value along Z (axis=2) with the corresponding x, y value in threshold.
So just make sure threshold has the same shape, by building a 3D threshold using whatever method you prefer. Since you mentioned numpy.dstack:
import numpy as np
stacked_images = np.random.rand(10, 10, 3)
t = np.random.rand(10, 10)
threshold = np.dstack([t, t, t])
counts = (stacked_images < threshold).sum(axis=2)
print(counts)
which results in (the inputs are random, so the values vary from run to run):
[[2 0 3 3 1 3 1 0 1 2]
[0 1 2 0 0 1 0 0 1 3]
[2 1 3 0 3 2 1 3 1 3]
[2 0 0 3 3 2 0 2 0 1]
[1 3 0 0 0 3 0 2 1 2]
[1 1 3 2 3 0 0 3 0 3]
[3 1 0 1 2 0 3 0 0 0]
[3 1 2 1 3 0 3 2 0 2]
[3 1 1 2 0 0 1 0 1 0]
[0 2 2 0 3 0 0 2 3 1]]
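
An alternative that avoids building the stacked threshold is to give threshold a trailing axis of length 1, so that broadcasting can stretch it along axis=2. The sketch below uses > because the question asks for values greater than the threshold (the code in the question and answer uses <). The seeded generator is an assumption added only to make the example reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only for reproducibility
stacked_images = rng.random((200, 200, 3))
threshold = rng.random((200, 200))

# Shape (200, 200, 1) broadcasts against (200, 200, 3) along the last axis.
counts_greater = (stacked_images > threshold[:, :, np.newaxis]).sum(axis=2)

# Same comparison using an explicitly stacked threshold, as a cross-check.
counts_dstack = (stacked_images > np.dstack([threshold] * 3)).sum(axis=2)

print(counts_greater.shape)
print(np.array_equal(counts_greater, counts_dstack))
```

The broadcast version does not allocate a full (200, 200, 3) copy of the threshold, which may matter for larger stacks.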

Extract blocks of numbers from array in Matlab

I have a vector and I would like to extract all the blocks from it:
x = [1 1 1 4 4 5 5 4 6 1 2 4 4 4 9 8 4 4 4 4]
so that I will get vectors or a cell containing the blocks:
[1 1 1], [4 4], [5 5], [4], [6], [1], [2], [4 4 4], [9], [8], [4 4 4 4]
Is there an efficient way to do it without using for loops? Thanks!
You can use accumarray with a custom anonymous function:
y = accumarray(cumsum([true; diff(x(:))~=0]), x(:), [], @(x) {x.'}).';
This gives a cell array of vectors. In your example,
x = [1 1 1 4 4 5 5 4 6 1 2 4 4 4 9 8 4 4 4 4];
the result is
y{1} =
1 1 1
y{2} =
4 4
y{3} =
5 5
y{4} =
4
y{5} =
6
y{6} =
1
y{7} =
2
y{8} =
4 4 4
y{9} =
9
y{10} =
8
y{11} =
4 4 4 4
For loops aren't as slow as you might think, especially in more recent MATLAB versions and especially in this case. Maybe this will help:
x = [1 1 1 4 4 5 5 4 6 1 2 4 4 4 9 8 4 4 4 4];
breakIdx = [0, find(diff(x)), length(x)];
groups = mat2cell(x,1,diff(breakIdx));
We find the group boundaries by applying diff(x) and get their indices with find(). Then it's just a matter of splitting x into the resulting cell array groups with mat2cell.
There is very little edge-case checking here (for example, the code assumes x is a non-empty row vector), so I recommend you add that.
If holding all blocks in a cell array is not so important, but rather having the full information about them, you can use this code:
x = [1 1 1 4 4 5 5 4 6 1 2 4 4 4 9 8 4 4 4 4];
elements = x(diff([0 x])~=0);
block_size = accumarray(cumsum(diff([0 x])~=0).',1).';
blocks = [elements; block_size];
to get a 2-row matrix with the element on the first row, and the block size on the second:
blocks =
1 4 5 4 6 1 2 4 9 8 4
3 2 2 1 1 1 1 3 1 1 4
Then define a function to create those blocks on demand:
getBlock = @(k) ones(1,blocks(2,k))*blocks(1,k);
and call it with the number of the block you want:
getBlock(8)
to get:
ans =
4 4 4
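
For readers working in Python, the same run-splitting can be sketched with itertools.groupby from the standard library, which groups consecutive equal values. This is a rough analogue of the answers above, not a port of any of them; the names blocks and summary are chosen for the example.

```python
from itertools import groupby

x = [1, 1, 1, 4, 4, 5, 5, 4, 6, 1, 2, 4, 4, 4, 9, 8, 4, 4, 4, 4]

# groupby yields one group per run of consecutive equal values.
blocks = [list(run) for _, run in groupby(x)]

# Compact form: (element, block size) pairs, like the 2-row matrix above.
summary = [(block[0], len(block)) for block in blocks]

print(blocks)
print(summary)
```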

Building all possible arrays from vector of subarrays. With recursion [duplicate]

This question already has answers here:
Generate a matrix containing all combinations of elements taken from n vectors
(4 answers)
Closed 8 years ago.
I'm trying to build all possible arrays of length n from a vector of n subarrays, taking one element from each position, where each subarray holds at least 2 integers. I should be getting 2^n combinations, 16 in this case. My code generates only half of them and does not save the output to an array.
allinputs = {[1 2] [2 3] [3 4] [5 6]}
A = []
the command I run is
inputArray = inputBuilder(A,[],allinputs,1)
for the function
function inputArray = inputBuilder(A,currBuild, allInputs, currIdx)
if currIdx <= length(allInputs)
for i = 1:length(allInputs{currIdx})
mybuild = [currBuild allInputs{currIdx}(i)];
inputBuilder(A,mybuild,allInputs,currIdx + 1);
end
if currIdx == length(allInputs)
A = [A mybuild];
%debug output
mybuild
end
if currIdx == 1
inputArray = A;
end
end
end
I want all 16 arrays to get output in a vector. Or some easy way to access them all. How can I do this?
EDIT:
Recursion may be a requirement because allinputs will have subarrays of different lengths.
allinputs = {[1] [2 3] [3 4] [5 6 7]}
with this array there will be 1*2*2*3 = 12 possible arrays built
Not sure exactly if this is what you want, but one way of doing what I think you want to do is as follows:
allinputs = {[1 2] [2 3] [3 4] [5 6]};
comb_results = combn([1 2],4);
A = zeros(size(comb_results));
for rowi = 1:size(comb_results, 1)
indices = comb_results(rowi,:);
for idxi = 1:numel(indices)
A(rowi, idxi) = allinputs{idxi}(indices(idxi));
end
end
This gives:
A =
1 2 3 5
1 2 3 6
1 2 4 5
1 2 4 6
1 3 3 5
1 3 3 6
1 3 4 5
1 3 4 6
2 2 3 5
2 2 3 6
2 2 4 5
2 2 4 6
2 3 3 5
2 3 3 6
2 3 4 5
2 3 4 6
combn is here.
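
Since the question asks about recursion and subarrays of different lengths, here is a minimal Python sketch of a recursive builder that returns its results instead of relying on an argument being modified in place. The recursive calls in the question discard their return value, which appears to be why nothing accumulates there. The function name build_all is hypothetical, and itertools.product is used only as a cross-check.

```python
from itertools import product


def build_all(all_inputs, prefix=()):
    """Return every list formed by picking one value from each sub-list, in order."""
    # Base case: one value has been chosen for every position.
    if len(prefix) == len(all_inputs):
        return [list(prefix)]

    results = []
    # Try each candidate for the next position and collect what the
    # recursive call returns, rather than discarding it.
    for value in all_inputs[len(prefix)]:
        results.extend(build_all(all_inputs, prefix + (value,)))
    return results


all_inputs = [[1], [2, 3], [3, 4], [5, 6, 7]]
combos = build_all(all_inputs)

print(len(combos))  # 1 * 2 * 2 * 3 combinations
print(combos == [list(t) for t in product(*all_inputs)])
```

In practice itertools.product alone covers this case; the recursive version is shown because the question was framed around recursion.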
