How to convert data frame column values into an array without a loop

I have a data frame like this:
df = pd.DataFrame({'A': [10,10,11,14], 'B':[2,3,3,5]})
It looks like this:
A B
0 10 2
1 10 3
2 11 3
3 14 5
I want to convert to this, with A as the row index, and store B's values inside the array or matrix:
10 2 3
11 3
14 5
Is there a Pythonic way of doing this without looping over each row of the data frame df?
Many thanks.

Use groupby:
df.groupby('A')
Then you can (for instance) get the mean of each group with:
df.groupby('A').mean()
which results in:
B
A
10 2.5
11 3.0
14 5.0
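If the goal is the layout from the question (every B value kept per A, rather than an aggregate), one possible sketch is to collect each group into a list. This assumes the same df as in the question; the names grouped and as_dict are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [10, 10, 11, 14], 'B': [2, 3, 3, 5]})

# Collect every B value that shares the same A into one list per group.
grouped = df.groupby('A')['B'].apply(list)
print(grouped)

# A plain dict keyed by A, in case a Series is not convenient.
as_dict = grouped.to_dict()
print(as_dict)  # {10: [2, 3], 11: [3], 14: [5]}
```

The groups have different lengths, so the result is a Series of lists rather than a rectangular matrix.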


Pandas How to Align Two Columns in a DataFrame and NaN empty cells

I'm using Python 3.8.8
I have a DataFrame structured like this:
A B
0 1
1 2
2 1
3 7
4 7
5 8
and an array:
C = [3, 4, 7]
I would like to add an array "C" as a new column to the DataFrame. The problem is this array has a different length of index than the df. I would like to make up for the difference in length in C by filling the empty cells with NaNs. My desired result would look something like:
A B C
0 1 NaN
1 2 NaN
2 1 3
3 7 4
4 7 7
5 8 NaN
What I am looking for specifically is a way to add C starting at a specific index of the df, but I don't know how to work around the discrepancy between the length of the df and array.
Thank you for your time
To get around the 'different length' problem when adding your list to the dataframe, convert it to a pandas Series. You can then assign it to the dataframe, and the rows it does not cover are filled with np.nan.
In your case, you can also set the index when you convert your C list to a Series. Because pandas aligns data on the index, each value lands on the row with the matching index.
Consider using the code below:
c = pd.Series([3, 4, 7],index=[2,3,4])
df['C'] = c
After the assignment, df prints as:
A B C
0 0 1 NaN
1 1 2 NaN
2 2 1 3.0
3 3 7 4.0
4 4 7 7.0
5 5 8 NaN
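To start C at an arbitrary row rather than hard-coding the index, the same idea can be sketched as below. The variable start is a hypothetical name for the first row that should receive a value; the frame is rebuilt from the values shown in the question.

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 1, 2, 3, 4, 5], 'B': [1, 2, 1, 7, 7, 8]})
C = [3, 4, 7]

start = 2  # hypothetical: index label of the first row that should receive a value

# Give the Series the index labels it should occupy in df.
c = pd.Series(C, index=range(start, start + len(C)))

# Assignment aligns on the index; rows without a matching label become NaN.
df['C'] = c
print(df)
```

Because NaN is a float, column C ends up with a float dtype even though the list holds integers.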

How to find minimum value of a column imported from Excel using MATLAB

I have a set of values in the following pattern.
A B C D
1 5 6 11
2 6 5 21
3 7 3 42
4 3 7 22
1 2 3 54
2 3 2 43
3 4 3 27
4 3 2 14
I imported every column into the MATLAB workspace as follows.
A = xlsread('F:\R.xlsx','Complete Data','A2:A43');
B = xlsread('F:\R.xlsx','Complete Data','B2:B43');
C = xlsread('F:\R.xlsx','Complete Data','C2:C43');
D = xlsread('F:\R.xlsx','Complete Data','D2:D43');
I need help with code that, for each value in column A, finds the lowest D value and outputs the corresponding B and C values. I need the output to look like this:
1 5 6 11
2 6 5 21
3 4 3 27
4 3 2 14
I read through related questions and understand that I need to make it a matrix, sort it by the elements of the 4th column using sortrows, and get the indices of the sorted elements. But I am stuck here. Please guide me.
You can import those columns in one go:
ABCD = xlsread('F:\R.xlsx','Complete Data','A2:D43');
Now use sortrows to sort the rows according to the first and the fourth column.
req = sortrows(ABCD, [1 4]);
☆ If every value in the first column occurs exactly twice, then:
req = req(1:2:end,:);
☆ If values in the first column do not necessarily occur exactly twice, then:
[~, ind] = unique(req(:,1));
req = req(ind,:);
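For readers working in Python rather than MATLAB, a rough NumPy analogue of the sort-then-keep-first approach is sketched below. It is not the MATLAB code above; it uses np.lexsort in place of sortrows and np.unique with return_index in place of unique, and it is built on the eight sample rows from the question.

```python
import numpy as np

ABCD = np.array([
    [1, 5, 6, 11],
    [2, 6, 5, 21],
    [3, 7, 3, 42],
    [4, 3, 7, 22],
    [1, 2, 3, 54],
    [2, 3, 2, 43],
    [3, 4, 3, 27],
    [4, 3, 2, 14],
])

# Sort rows by column A, then by column D.
# np.lexsort treats the LAST key as the primary one.
order = np.lexsort((ABCD[:, 3], ABCD[:, 0]))
req = ABCD[order]

# return_index gives the first occurrence of each A value, which after
# sorting is the row with the lowest D for that A.
_, first = np.unique(req[:, 0], return_index=True)
req = req[first]
print(req)
```

This handles any number of rows per A value, so it corresponds to the second (unique-based) variant.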

How to perform transpose to each page of nd array? [duplicate]

I am trying to figure out how to import a large array of data into a 3D matrix in a specific order. I have already asked two questions, but I have not received a reliable answer yet and was downvoted too. Since then I have done some work and was able to import the data into a 3D matrix using the reshape function. Instead of posting the actual problem, here is a simplified simulation of it.
k=1:27 % create a array of 27 data
r=reshape(k,[3,3,3]) % convert the array into 3 x 3 x 3 matrix,
The first and second pages of the resulting matrix are shown below. The data is placed along the columns, but I want it placed along the rows. The transpose function does not work with ND arrays. I tried to use permute but did not get the desired result. One solution would be to transpose each page separately, but that would break the 3D matrix into many 2D matrices.
r(:,:,1) =
1 4 7
2 5 8
3 6 9
r(:,:,2) =
10 13 16
11 14 17
12 15 18
the expected outcome should be,
r(:,:,1) =
1 2 3
4 5 6
7 8 9
Link to the actual problem is,
Thanks
Is this what you want?
result = permute(r, [2 1 3]);
This permutes the first two dimensions. For your example r,
>> k = 1:27;
>> r = reshape(k, [3,3,3]);
>> result = permute(r, [2 1 3]);
>> result
result(:,:,1) =
1 2 3
4 5 6
7 8 9
result(:,:,2) =
10 11 12
13 14 15
16 17 18
result(:,:,3) =
19 20 21
22 23 24
25 26 27

Julia: Sort the columns of a matrix by the values in another vector (in place...)?

I am interested in sorting the columns of a matrix in terms of the values in 2 other vectors. As an example, suppose the matrix and vectors look like this:
M = [ 1 2 3 4 5 6 ;
7 8 9 10 11 12 ;
13 14 15 16 17 18 ]
v1 = [ 2 , 6 , 6 , 1 , 3 , 2 ]
v2 = [ 3 , 1 , 2 , 7 , 9 , 1 ]
I want to sort the columns of M in terms of their corresponding values in v1 and v2, with v1 taking precedence over v2. Additionally, I am interested in sorting the matrix in place, as the matrices I am working with are very large. Currently, my crude solution looks like this:
MM = [ v1' ; v2' ; M ] ; ## concatenate the vectors with the matrix
MM[:,:] = sortcols(MM , by=x->(x[1],x[2]))
M[:,:] = MM[3:end,:]
which gives the desired result:
3x6 Array{Int64,2}:
4 6 1 5 2 3
10 12 7 11 8 9
16 18 13 17 14 15
Clearly my approach is not ideal, as it requires computing and storing intermediate matrices. Is there a more efficient/elegant approach for sorting the columns of a matrix in terms of 2 other vectors? And can it be done in place to save memory?
Previously I have used sortperm for sorting an array in terms of the values stored in another vector. Is it possible to use sortperm with 2 vectors (and in-place)?
I would probably do it this way:
julia> cols = sort!([1:size(M,2);], by=i->(v1[i],v2[i]));
julia> M[:,cols]
3×6 Array{Int64,2}:
4 6 1 5 2 3
10 12 7 11 8 9
16 18 13 17 14 15
This should be pretty fast and uses only one temporary vector and one copy of the matrix. It's not fully in-place, but doing this operation completely in-place is not easy. You would need a sorting function that moves columns as it works, or alternatively a version of permute! that works on columns. You could start with the code for permute!! in combinatorics.jl and modify it to permute columns, reusing a single column-size temporary buffer.

Associating / linking an array column with another column in the array

I have an array that has some calculations done on the second column. I would like the values from the third column to follow/be linked to the second column.
Test Code:
a1= [1,10,-11;
2,70,232;
3,33.2,-33;
4,40,44;]
a2calc=abs(a1(:,2)-max(a1(:,2))) %calculation
a2=[a1(:,1),a2calc,a1(:,3)] %new array
Example:
original a1 Array
1 10 -11
2 70 232
3 33.2 -33
4 40 44
a2 Array after column 2 calculations looks like this
1 60 -11
2 0 232
3 36.8 -33
4 30 44
I'm trying to get the final array to look like this (column 3 values follow / are linked to the second column)
1 60 232
2 0 -11
3 36.8 44
4 30 -33
What I'm having problems with is I'm not sure if I should use the index values of column 2 and if so how I can get it to look like the final output array I included in the question.
I might be wrong here, but it looks to me like the logic is:
After calculating the second column, change the order of the third column so that the third column is sorted the same way as the second. To see what I mean:
This represents the two columns, numbered from highest to lowest:
A = 1 1
4 3
2 2
3 4
If I understand it right, you want the resulting matrix to be
A = 1 1
4 4
2 2
3 3
If this is the right logic then you should check out sort with two outputs. You can use the second output to index the third column.
[~, idx] = sort(A(:, 2));
sorted_3 = sort(A(:, 3));
A(idx, 3) = sorted_3;
The output from this is:
A =
1.00000 60.00000 232.00000
2.00000 0.00000 -33.00000
3.00000 36.80000 44.00000
4.00000 30.00000 -11.00000
Good luck!
