How to compare and add elements of an array - arrays

I'd like to take a single array lets say 3x5 as follows
1 3 5
1 3 5
2 8 6
4 5 7
4 5 8
and write a code that will create a new array that adds the third column together with the previous number if the numbers in the first and second columns equal the numbers in the row below it.
since the first two values in row 1 and 2, then add the third elements in row 1 and 2 together
so the output from the array above should look like this
1 3 10
2 8 6
4 5 15

The function accumarray(subs,val) accumulate elements of vector val using the subscripts subs. So we can use this function to sum the elements in the third column having the same value in the first and second column. We can use unique(...,'rows') to determine which pairs of value are unique.
%Example data
A = [1 3 5,
1 3 5,
2 3 6,
4 5 7,
4 5 7]
%Get the unique pair of value based on the first two column (uni) and the associated index.
[uni,~,sub] = unique(A(:,1:2),'rows');
%Create the result using accumarray.
R = [uni, accumarray(sub,A(:,3))]
If the orders matters the script would be a little bit more complex:
%Get the unique pair of value based on the first two column (uni) and the associated index.
[uni,~,sub] = unique(A(:,1:2),'rows');
%Locate the consecutive similar row with this small tricks
dsub = diff([0;sub])~=0;
%Create the adjusted index
subo = cumsum(dsub);
%Create the new array
R = [uni(sub(dsub),:), accumarray(subo,A(:,3))]
Or you can get an identical result with a for loop:
R = A(1,:)
for ii = 2:length(A)
if all(A(ii-1,1:2)==A(ii,1:2))
R(end,3) = R(end,3)+A(ii,3)
else
R(end+1,:) = A(ii,:)
end
end
Benchmark:
With an array A of size 100000x3 on the mathworks live editor:
The for loop take about 5.5s (no pre-allocation, so it's pretty slow)
The vectorized method take about 0.012s

Related

Replace the values in a vector with the keys from a map in MATLAB?

I have a vector of nondecreasing data. Here is a sample:
1
1
1
2
2
2
2
2
2
2
2
3
3
4
4
6
Clearly there are duplicates and missing numbers. I can remove the duplicates using unique, so my unique values are:
uniqueVals = unique(sortedData);
So far, so good. Now, I want to change the data so that the values in sortedData are replaced with their index number in uniqueVals. For instance, uniqueVals first 5 elements would be 1,2,3,4,6, with indices 1,2,3,4,5. I want to change sortedData so that 1 maps to 1, 2 maps to 2, 3 to 3, 4 to 4, 6 to 5 and so on.
I know I can create a "map" object, but that seems to just be used to map uniqueVals to its index. How do I apply that mapping so that the entries in sortedData are changed?
I have no need for this to be a particularly fast operation. sortedData contains only a few hundred thousand rows and it only needs to be done once.
You can use the third output from unique
[uniqueVals,~,yourOutput] = unique(sortedData);
yourOutput =
1
1
1
2
2
2
2
2
2
2
2
3
3
4
4
5
You can also use g = findgroups(sortedData);, which will give you the group index, where there is one group per unique value. The 2nd output of this tells you the value itself
[g, gValue] = findgroups( sortedData );

MATLAB - extract array values based on conditions

I have 4x4 matrix A
[1 2 3 4;
2 2 2 3;
5 5 5 5;
4 4 4 4]
I know how to locate all values less than 4. A<4. But I'm not sure how to write an 'if' statement for; three or more values, all which are less than 4, contained in the same row. For instance; see above A(1,:) and A(2,:) satisfies my conditions.
You can basically do A<4 to know which ones are smaller. If you want to know which rows contain N values smaller than 4 then you can do
rows=find(sum(A<4,2)>=3)
This basically does:
find smaller than 4
Count how many of them in each row (sum(_,2))
find if they are 3 or more
give the row index of those find()

Qsort for a particular range in C?

I have to q sort an array but for a given range for ex.
Given array
4 5 3 7 2 1
and then i have the range 2 & 5 this means i have to sort from index 2 all the way through 5.
Resultant array
4 5 2 3 7 1
I know we can set one bound
like
qsort(array,4,sizeof(int),compa)
this will sort the array till 3rd index but will always start from index 0. I want to start the first bound by a desired value. Any Suggestions??
Just pass address to the middle of the array.
qsort(array + 2, 5 - 2, sizeof(int), compa);

Matlab : Matrix indexing Logic

i am doing very simple Matrix indexing examples . where code is as give below
>> A=[ 1 2 3 4 ; 5 6 7 8 ; 9 10 11 12 ]
A =
1 2 3 4
5 6 7 8
9 10 11 12
>> A(end, end-2)
ans =
10
>> A(2:end, end:-2:1)
ans =
8 6
12 10
here i am bit confused . when i use A(end, end-2) it takes difference of two till first column and when there is just one column left there is no further processing , but when i use A(2:end, end:-2:1) it takes 6 10 but how does it print 8 12 while there is just one column left and we have to take difference of two from right to left , Pleas someone explain this simple point
The selection A(end, end-2) reads: take elements in last row of A that appear in column 4(end)-2=2.
The selection A(2:end, end:-2:1) similarly reads: take elements in rows 2 to 4(end) and starting from last column going backwards in jumps of two, i.e. 4 then 2.
To check the indexing, simply substitute the end with size(A,1) or size(A,2) if respectively found in the row and col position.
First the general stuff: end is just a placeholder for an index, namely the last position in a given array dimension. For instance, for an arbitrary array A(end,1) will pick the last element in column 1, and A(1,end) will pick the last element in the first row.
In your example, A(end, end-2) picks an element in the last row two columns before the last one.
To interpret a statement such as
A(2:end, end:-2:1)
it might help to substitute end with the actual index of the last row/column elements, so this is equivalent to
A(2:3, 4:-2:1)
Furthermore 4:-2:1 is equivalent to the list 4,2 since we are instructing to make the list starting at 4, decreasing in steps of 2, up to (minimum) 1. So this is equivalent to
A([2 3],[4 2])
Finally, the following combination of indices is implied by A([2 3],[4 2]):
A(2,4) A(2,2)
A(3,4) A(3,2)

matlab: eliminate elements from array

I have quite big array. To make things simple lets simplify it to:
A = [1 1 1 1 2 2 3 3 3 3 4 4 5 5 5 5 5 5 5 5];
So, there is a group of 1's (4 elements), 2's (2 elements), 3's (4 elements), 4's (2 elements) and 5's (8 elements). Now, I want to keep only columns, which belong to group of 3 or more elements. So it will be like:
B = [1 1 1 1 3 3 3 3 5 5 5 5 5 5 5 5];
I was doing it using for loop, scanning separately 1's, 2's, 3's and so on, but its extremely slow with big arrays...
Thanks for any suggestions how to do it in more efficient way :)
Art.
A general approach
If your vector is not necessarily sorted, then you need to run to count the number of occurrences of each element in the vector. You have histc just for that:
elem = unique(A);
counts = histc(A, elem);
B = A;
B(ismember(A, elem(counts < 3))) = []
The last line picks the elements that have less than 3 occurrences and deletes them.
An approach for a grouped vector
If your vector is "semi-sorted", that is if similar elements in the vector are grouped together (as in your example), you can speed things up a little by doing the following:
start_idx = find(diff([0, A]))
counts = diff([start_idx, numel(A) + 1]);
B = A;
B(ismember(A, A(start_idx(counts < 3)))) = []
Again, note that the vector need not to be entirely sorted, just that similar elements are adjacent to each other.
Here is my two-liner
counts = accumarray(A', 1);
B = A(ismember(A, find(counts>=3)));
accumarray is used to count the individual members of A. find extracts the ones that meet your '3 or more elements' criterion. Finally, ismember tells you where they are in A. Note that A needs not be sorted. Of course, accumarray only works for integer values in A.
What you are describing is called run-length encoding.
There is software for this in Matlab on the FileExchange. Or you can do it directly as follows:
len = diff([ 0 find(A(1:end-1) ~= A(2:end)) length(A) ]);
val = A(logical([ A(1:end-1) ~= A(2:end) 1 ]));
Once you have your run-length encoding you can remove elements based on the length. i.e.
idx = (len>=3)
len = len(idx);
val = val(idx);
And then decode to get the array you want:
i = cumsum(len);
j = zeros(1, i(end));
j(i(1:end-1)+1) = 1;
j(1) = 1;
B = val(cumsum(j));
Here's another way to do it using matlab built-ins.
% Set up
A=[1 1 1 1 2 2 3 3 3 3 4 4 5 5 5 5 5];
threshold=2;
% Get the unique elements of the array
uniqueElements=unique(A);
% Count haw many times each unique element occurs
counts=histc(A,uniqueElements);
% Write which elements should be kept
toKeep=uniqueElements(counts>threshold);
% Make a logical index
indexer=false(size(A));
for i=1:length(toKeep)
% For every unique element we want to keep select the indices in A that
% are equal
indexer=indexer|(toKeep(i)==A);
end
% Apply index
B=A(indexer);

Resources