Associating / linking an array column with another column in the array

I have an array that has some calculations done on the second column. I would like the values from the third column to follow/be linked to the second column.
Test Code:
a1= [1,10,-11;
2,70,232;
3,33.2,-33;
4,40,44;]
a2calc=abs(a1(:,2)-max(a1(:,2))) %calculation
a2=[a1(:,1),a2calc,a1(:,3)] %new array
Example:
original a1 Array
1 10 -11
2 70 232
3 33.2 -33
4 40 44
a2 Array after column 2 calculations looks like this
1 60 -11
2 0 232
3 36.8 -33
4 30 44
I'm trying to get the final array to look like this (column 3 values follow / are linked to the second column)
1 60 232
2 0 -11
3 36.8 44
4 30 -33
My problem is that I'm not sure whether I should use the index values of column 2 and, if so, how to get the result to look like the final output array shown above.

I might be wrong here, but it looks to me like the logic is:
After calculating the second column, change the order of the third column so that the third column is sorted the same way as the second. To see what I mean:
This represents the two columns, numbered from highest to lowest:
A = 1 1
4 3
2 2
3 4
If I understand it right, you want the resulting matrix to be
A = 1 1
4 4
2 2
3 3
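If this is the right logic then you should check out sort with two outputs. You can use the second output to index the third column. Here A is the array after the column 2 calculation (a2 in the question).
[~, idx] = sort(A(:, 2));   % idx: row order that sorts column 2 ascending
sorted_3 = sort(A(:, 3));   % column 3 values in ascending order
A(idx, 3) = sorted_3;       % k-th smallest of column 3 goes next to k-th smallest of column 2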
The output from this is:
A =
1.00000 60.00000 232.00000
2.00000 0.00000 -33.00000
3.00000 36.80000 44.00000
4.00000 30.00000 -11.00000
Good luck!
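Note that the output above pairs 0 with -33 and 30 with -11, whereas the table in the question pairs 0 with -11 and 30 with -33. One reading that reproduces the question's table is rank matching: the third-column value that sat next to the k-th smallest entry of the original second column moves next to the k-th smallest entry of the recalculated second column. Below is a minimal Python sketch of that reading, using only the data from the question. The function name relink_by_rank is made up for illustration, and this is an interpretation of the question, not the method used in the answer above.

```python
def relink_by_rank(old_key, new_key, values):
    """Move values[i] from the k-th smallest old_key to the k-th smallest new_key."""
    n = len(values)
    old_order = sorted(range(n), key=lambda i: old_key[i])  # rows ranked by the original column 2
    new_order = sorted(range(n), key=lambda i: new_key[i])  # rows ranked by the recalculated column 2
    result = [None] * n
    for src, dst in zip(old_order, new_order):
        result[dst] = values[src]
    return result


col2 = [10, 70, 33.2, 40]       # original second column of a1
col3 = [-11, 232, -33, 44]      # original third column of a1
new_col2 = [abs(v - max(col2)) for v in col2]  # same calculation as a2calc

new_col3 = relink_by_rank(col2, new_col2, col3)
print(new_col3)  # [232, -11, 44, -33]
```

In MATLAB or Octave the same idea would use the index outputs of two sort calls, one on the original second column and one on the recalculated one.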

Related

How can I set a matrix with different, repeating patterns in every column?

I'm trying to set an nx3 matrix in GNU Octave to scatter plot and compare it to a fitted surface which I already solved for and plotted. However, this matrix has repeating patterns in columns 1 and 2; I could set them by hand, but the number of rows is fairly large and the only column I currently have is the non-repeating one (column 3).
For example:
A=|1 5 z|
|2 5 z|
|3 5 z|
|4 5 z|
|1 10 z|
|2 10 z|
...
And so on. Where z are the values that I already have as a column vector, which I can simply punch into the matrix with:
A(:,3)=z
However, I've tried doing
A(2:4:n)=2;A(3:4:n)=3;A(4:4:n)=4
This actually worked for the first column, but I had no luck with the second one (and I don't think it is the cleanest way to do it). Any ideas?
It seems to me that the pattern in the first two columns corresponds to a grid of coordinates, where x=1:4 and y=5:5:20 (or some other end value).
You can generate these coordinates using meshgrid:
[y, x] = meshgrid(5:5:20, 1:4);
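(Note that x and y are swapped: meshgrid lays its first argument out along the rows and its second argument down the columns, so after column-major flattening with (:) it is the second argument, 1:4, that cycles fastest.) Next, you can put these into a matrix together with the z values you already have as follows: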
A = [x(:), y(:), z];
Alternatively, you can do
A(:,1) = x(:);
A(:,2) = y(:);
Each column repeats in a different way, so you can generate each one differently:
octave:1> col1 = repmat ([1:4].', [3 1]); # repeat matrix
octave:2> col2 = ([5 5 5 5].' .* [1 2 3])(:); # automatic broadcasting
octave:3> col3(1:12, 1) = 42; # on the fly by assignment
octave:4> A = [col1 col2 col3]
A =
1 5 42
2 5 42
3 5 42
4 5 42
1 10 42
2 10 42
3 10 42
4 10 42
1 15 42
2 15 42
3 15 42
4 15 42
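For comparison, the same two repeating columns can be sketched in plain Python. This is only an illustration of the pattern; the placeholder value 42 stands in for the real z column, as in the Octave session above.

```python
from itertools import product

xs = [1, 2, 3, 4]                 # pattern of column 1
ys = [5, 10, 15]                  # pattern of column 2
z = [42] * (len(xs) * len(ys))    # placeholder for the existing z values

# product(ys, xs) varies its last argument fastest,
# so x cycles through 1..4 while y stays fixed.
A = [[x, y, zv] for (y, x), zv in zip(product(ys, xs), z)]

for row in A:
    print(row)
```

Putting ys first in product plays the same role as swapping the arguments of meshgrid: the column that should change most often goes last.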

How to find minimum value of a column imported from Excel using MATLAB

I have a set of values in the following pattern.
A B C D
1 5 6 11
2 6 5 21
3 7 3 42
4 3 7 22
1 2 3 54
2 3 2 43
3 4 3 27
4 3 2 14
I imported each column into the MATLAB workspace as follows.
A = xlsread('F:\R.xlsx','Complete Data','A2:A43');
B = xlsread('F:\R.xlsx','Complete Data','B2:B43');
C = xlsread('F:\R.xlsx','Complete Data','C2:C43');
D = xlsread('F:\R.xlsx','Complete Data','D2:D43');
I need help with code that, for each value in column A, finds the lowest D value and outputs the corresponding B and C values. I need the output to look like this:
1 5 6 11
2 6 5 21
3 4 3 27
4 3 2 14
I read through related questions and understand that I need to make it a matrix, sort it by the element in the 4th column using sortrows, and get the indices of the sorted elements. But I am stuck here. Please guide me.
You can export those columns in one go as:
ABCD = xlsread('F:\R.xlsx','Complete Data','A2:D43');
Now use sortrows to sort the rows according to the first and the fourth column.
req = sortrows(ABCD, [1 4]);
☆ If every value in the first column occurs exactly twice, then:
req = req(1:2:end,:);
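☆ If the values in the first column do not necessarily occur twice, then:
[~, ind] = unique(req(:,1), 'first');   % first row per value, i.e. the one with the lowest D ('first' is explicit because older MATLAB releases return the last occurrence by default)
req = req(ind,:);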
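The underlying logic (for each value of A, keep the row with the smallest D) can also be sketched without sorting. The following Python illustration uses the sample rows from the question; the variable names are chosen for this example only.

```python
rows = [
    (1, 5, 6, 11), (2, 6, 5, 21), (3, 7, 3, 42), (4, 3, 7, 22),
    (1, 2, 3, 54), (2, 3, 2, 43), (3, 4, 3, 27), (4, 3, 2, 14),
]

best = {}  # maps each A value to the row with the lowest D seen so far
for row in rows:
    a, d = row[0], row[3]
    if a not in best or d < best[a][3]:
        best[a] = row

req = [best[a] for a in sorted(best)]
for row in req:
    print(row)
```

This does one pass over the data and works regardless of how many times each A value occurs.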

How do you fill-in missing values despite differences in index values?

Here's my situation. I have predicted values in the form of an array (i.e. ([1,3,1,2,3,...3]) ) and a data frame column with missing values (NAs). The array and the missing part of the column have the same dimensions, but their indices don't match.
For instance, the indices of the predicted array are 0:100.
The indices of the NAs, on the other hand, don't begin with 0; they start at the first index where an NA is observed in the data frame.
Which pandas function will fill in the first missing value with the first element of the predicted array, the second missing value with the second element, and so forth?
Assuming your missing data is represented in the DF as NaN/None values:
import pandas as pd
df = pd.DataFrame({'col1': [2,3,4,5,7,6,5], 'col2': [2,3,None,5,None,None,5],}) # Column 2 has missing values
pred_vals = [11, 22, 33] # Predicted values to be inserted in place of the missing values
print('Original:')
print(df)
missing = df[pd.isnull(df['col2'])].index # Find indices of missing values
df.loc[missing, 'col2'] = pred_vals # Replace missing values in order of appearance
print('\nFilled:')
print(df)
Result:
Original:
col1 col2
0 2 2
1 3 3
2 4 NaN
3 5 5
4 7 NaN
5 6 NaN
6 5 5
Filled:
col1 col2
0 2 2
1 3 3
2 4 11
3 5 5
4 7 22
5 6 33
6 5 5
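The key point is that the predictions are consumed by position, not by index label. A plain-Python sketch of that idea, using the same sample values and None as the missing marker (an assumption of this example), might look like this:

```python
col2 = [2, 3, None, 5, None, None, 5]   # column with missing values
pred_vals = [11, 22, 33]                # one prediction per missing value

preds = iter(pred_vals)
# Take the next prediction each time a missing value is met; keep everything else.
filled = [next(preds) if v is None else v for v in col2]

print(filled)  # [2, 3, 11, 5, 22, 33, 5]
```

As with the pandas version, this relies on there being exactly as many predictions as missing values.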

Adding and multiplying tables' data by values in another table

Say I have a table of subtractions and divisions sorted by date:
tblFactors
dt sub divide
2014-07-01 1 1
2014-06-01 0 5
2014-05-01 2 1
2014-05-01 0 3
I have another table of values, sorted by date:
tblValues
dt val
2014-07-05 4
2014-06-15 5
2014-05-15 21
2014-04-14 31
2014-03-15 71
I need to perform some sequential calculations. For the first value in tblFactors, I need to subtract 1 from every val where tblValues.dt < '2014-07-01'.
Next, I need to process the second row in tblFactors. There is nothing to subtract. However, the divide = 5 means that I need to divide every val by 5 where tblValues.dt < '2014-06-01'. The tricky thing is that I need to do this on the modified val from the row before (divide 20 / 5, not 21 / 5).
Each row in tblFactors would process in this manner, giving a sequence like this:
tblFactors:                Row 1       Row 2        Row 3       Row 4
Dt          Original Val   Subtract 1  Divide by 5  Subtract 2  Divide by 3
7/5/2014    4
6/15/2014   5              4
5/15/2014   21             20          4
4/14/2014   31             30          6            4
3/25/2014   71             70          14           12          4
This would leave me with:
qryValues
dt val
2014-07-05 4
2014-06-15 4
2014-05-15 4
2014-04-14 4
2014-03-15 4
Right now I'm doing vector multiplications over loops in R. I was wondering if there was a clever way to accomplish this in the native sql. I tried doing some aggregations but I've had limited success.
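To make the intended sequence concrete, here is a small Python sketch of the calculation (not a SQL solution). It rests on two assumptions that are not in the tables as posted: the fourth factor row is given the date 2014-04-01 so that, as in the worked sequence, it only affects the oldest value, and within a factor row the subtraction is applied before the division. ISO date strings are compared as text, which orders them chronologically.

```python
factors = [  # (dt, sub, divide), processed in this order
    ("2014-07-01", 1, 1),
    ("2014-06-01", 0, 5),
    ("2014-05-01", 2, 1),
    ("2014-04-01", 0, 3),  # ASSUMED date; the posted table repeats 2014-05-01 here
]
values = [
    ("2014-07-05", 4),
    ("2014-06-15", 5),
    ("2014-05-15", 21),
    ("2014-04-14", 31),
    ("2014-03-15", 71),
]


def apply_factors(values, factors):
    """Apply each factor row, in order, to every value dated before that row."""
    out = []
    for dt, val in values:
        for cutoff, sub, divide in factors:
            if dt < cutoff:
                val = (val - sub) / divide  # works on the already-modified value
        out.append((dt, val))
    return out


qry_values = apply_factors(values, factors)
for dt, val in qry_values:
    print(dt, val)
```

Under these assumptions every row ends up at 4, matching qryValues above.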

Vectorize the sum of unique columns

The same combination of values occurs in several rows of a MATLAB matrix, for example 1 1 in the first and second rows. I want to remove all those duplicates while summing the values in the third column; for 1 1 the sum is 7. Finally, I want to create a similarity matrix as shown in the answer below. I don't mind the doubled values on the diagonal because I will not be using the diagonal elements in further work. The code below does this, but it is not vectorized. Can it be vectorized somehow? An example is given below.
datain = [ 1 1 3;
1 1 4;
1 2 5;
1 2 4;
1 2 3;
1 3 8;
1 3 7;
1 3 12;
2 2 22;
2 2 77;
2 3 111;
2 3 113;
3 3 456;
3 3 568];
cmp1=unique(datain(:,1));
cmp1sz=size(cmp1,1);
cmp2=unique(datain(:,2));
cmp2sz=size(cmp2,1);
thetotal=zeros(cmp1sz,cmp2sz);
for i=1:size(datain,1)
    for j=1:cmp1sz
        for k=1:cmp2sz
            if datain(i,1)==cmp1(j,1) && datain(i,2)== cmp2(k,1)
                thetotal(j,k)=thetotal(j,k)+datain(i,3);
                thetotal(k,j)=thetotal(k,j)+datain(i,3);
            end
        end
    end
end
The answer is
14 12 27
12 198 224
27 224 2048
This is a poster case for using ACCUMARRAY.
thetotal = accumarray(datain(:,1:2),datain(:,3),[],@sum,0);
%# to make the array symmetric, you simply add its transpose
thetotal = thetotal + thetotal'
thetotal =
14 12 27
12 198 224
27 224 2048
EDIT
So what if datain does not contain only integer values? In this case, you can still construct a table, but e.g. thetotal(1,1) will not correspond to datain(1,1:2) == [1 1], but to the smallest entry in the first two columns of datain.
[uniqueVals,~,tmp] = unique(reshape(datain(:,1:2),[],1));
correspondingIndices = reshape(tmp,size(datain(:,1:2)));
thetotal = accumarray(correspondingIndices,datain(:,3),[],@sum,0);
The value at [1 1] now corresponds to the row [uniqueVals(1) uniqueVals(1)] in the first two cols of datain.
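For readers less familiar with accumarray, the grouping it performs can be sketched in plain Python with a dictionary keyed by (row, column). This is only an illustration of the idea using the data from the question; it folds the "add the transpose" step into the loop and maps the sorted unique labels to matrix positions, as in the EDIT above.

```python
from collections import defaultdict

datain = [
    (1, 1, 3), (1, 1, 4), (1, 2, 5), (1, 2, 4), (1, 2, 3),
    (1, 3, 8), (1, 3, 7), (1, 3, 12), (2, 2, 22), (2, 2, 77),
    (2, 3, 111), (2, 3, 113), (3, 3, 456), (3, 3, 568),
]

sums = defaultdict(int)
for r, c, v in datain:
    sums[(r, c)] += v   # accumulate by (row, column) pair
    sums[(c, r)] += v   # mirrored entry; this doubles the diagonal

# Sorted unique labels play the role of uniqueVals.
labels = sorted({x for r, c, _ in datain for x in (r, c)})
thetotal = [[sums[(r, c)] for c in labels] for r in labels]

for row in thetotal:
    print(row)
```

Because the labels are looked up rather than used directly as indices, this version also works when the first two columns hold non-integer values.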

Resources