Counting the occurrence of a unique number in an array - MATLAB - arrays

I have an array that looks something like...
1 0 0 1 2 2 1 1 2 1 0
2 1 0 0 0 1 1 0 0 2 1
1 2 2 1 1 1 2 0 0 1 0
0 0 0 1 2 1 1 2 0 1 2
however my real array is (50x50).
I am relatively new to MATLAB and need to count how many times each unique value occurs in each row and column. For example, there are four '1's in row 2 and three '0's in column 3. I need to be able to do this with my real array.
It would help even more if these counts were stored in arrays of their own.
PLEASE use simple language, or else I will get lost. For example, if representing an array, don't call it x, but perhaps column_occurrences_array... for me please :)

What I would do is iterate over each row of your matrix and calculate a histogram of occurrences for each row. Use histc to calculate the occurrences of each row. The thing that is nice about histc is that you are able to specify where the bins are to start accumulating. These correspond to the unique entries for each row of your matrix. As such, use unique to compute these unique entries.
Now, I would use arrayfun to iterate over all of your rows in your matrix, and this will produce a cell array. Each element in this cell array will give you the counts for each unique value for each row. Therefore, assuming your matrix of values is stored in A, you would simply do:
vals = arrayfun(@(x) [unique(A(x,:)); histc(A(x,:), unique(A(x,:)))], 1:size(A,1), 'uni', 0);
Now, if we want to display all of our counts, use celldisp. Using your example, and with the above code combined with celldisp, this is what I get:
vals{1} =
0 1 2
3 5 3
vals{2} =
0 1 2
5 4 2
vals{3} =
0 1 2
3 5 3
vals{4} =
0 1 2
4 4 3
What the above display is saying is that for the first row, you have 3 zeros, 5 ones and 3 twos. The second row has 5 zeros, 4 ones and 2 twos and so on. These are just for the rows. If you want to do these for columns, you have to modify your code slightly to operate along columns:
vals = arrayfun(@(x) [unique(A(:,x)) histc(A(:,x), unique(A(:,x)))].', 1:size(A,2), 'uni', 0);
By using celldisp, this is what we get:
vals{1} =
0 1 2
1 2 1
vals{2} =
0 1 2
2 1 1
vals{3} =
0 2
3 1
vals{4} =
0 1
1 3
vals{5} =
0 1 2
1 1 2
vals{6} =
1 2
3 1
vals{7} =
1 2
3 1
vals{8} =
0 1 2
2 1 1
vals{9} =
0 2
3 1
vals{10} =
1 2
3 1
vals{11} =
0 1 2
2 1 1
This means that in the first column, we see 1 zero, 2 ones and 1 two, etc. etc.
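If the set of possible values is known in advance (0, 1 and 2 here, which is an assumption about your data), histc can also count along a chosen dimension of the whole matrix at once. That gives plain numeric arrays instead of a cell array. A minimal sketch using the example matrix from the question; row_occurrences_array and column_occurrences_array are illustrative names:

```matlab
% Example matrix from the question
A = [1 0 0 1 2 2 1 1 2 1 0;
     2 1 0 0 0 1 1 0 0 2 1;
     1 2 2 1 1 1 2 0 0 1 0;
     0 0 0 1 2 1 1 2 0 1 2];

% Assumption: these are the only values that can appear in A
values_to_count = [0 1 2];

% Count along dimension 2: one row of counts per row of A (4 x 3)
row_occurrences_array = histc(A, values_to_count, 2);

% Count along dimension 1: one column of counts per column of A (3 x 11)
column_occurrences_array = histc(A, values_to_count, 1);

disp(row_occurrences_array)
disp(column_occurrences_array)
```

Here row_occurrences_array(2,2) is the number of 1s in row 2, and column_occurrences_array(1,3) is the number of 0s in column 3. Unlike the cell-array version, a value that is absent from a row or column shows up as a count of 0.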

I absolutely agree with rayryeng! However, here is some code which might be easier to understand for you as a beginner. It is without cell arrays or arrayfuns and quite self-explanatory:
%% initialize your array randomly for demonstration:
numRows = 50;
numCols = 50;
yourArray = round(10*rand(numRows,numCols));
%% do some stuff of what you are asking for
% find all occurring numbers in yourArray
occVals = unique(yourArray(:));
% unique already returns sorted values, so this sort is only for convenience
occVals = sort(occVals);
% now we could create a matrix occMat_row of dimension |occVals| x numRows
% where occMat_row(i,j) represents how often the ith value occurs in the
% jth row, analogously occMat_col:
occMat_row = zeros(length(occVals),numRows);
occMat_col = zeros(length(occVals),numCols);
for k = 1:length(occVals)
    occMat_row(k,:) = sum(yourArray == occVals(k),2)';
    occMat_col(k,:) = sum(yourArray == occVals(k),1);
end

Related

Transforming a data file into matrix with columns of identical elements?

I have a very large data file which has a format like below:
1 2 3 4 6 7 8
1 2 3 4 6
1 2 3 5 4 6
1 2 3 4 6
1 2 3 4 6
1 2 3 4 6 8
I am trying to load this data into Matlab. My aim is to create a matrix which has identical elements per one column and if some value is missing fill it with zero. So the output will be something like below:
1 2 3 4 0 6 7 8
1 2 3 4 0 6 0 0
1 2 3 4 5 6 0 0
1 2 3 4 0 6 0 0
1 2 3 4 0 6 0 0
1 2 3 4 0 6 0 8
Can someone give me any idea/code-snippets/links to realize this?
OK. Here is how I did it (test.dat is the file name with the input data):
%// The first section reads the dat file and fills missing entries in columns with zeros
fid = fopen('test.dat');
textLine = fgets(fid); % Read first line.
lineCounter = 1;
while ischar(textLine)
    % Parse the line into an array of numbers.
    numbers = sscanf(textLine, '%f ');
    % Put numbers into a cell array IF and only if
    % you need them after the loop has exited.
    % First method - each number in one cell.
    for k = 1 : length(numbers)
        ca{lineCounter, k} = numbers(k);
    end
    % Alternate way where the whole array is in one cell.
    ca2{lineCounter} = numbers;
    % Read the next line.
    textLine = fgets(fid);
    lineCounter = lineCounter + 1;
end
fclose(fid);
emptyIndex = cellfun(@isempty,ca); %# Find indices of empty cells
ca(emptyIndex) = {0}; %# Fill empty cells with 0
A=cell2mat(ca);
%// The second section will create a new matrix AA from matrix A,
%// in which each column holds a single unique value and missing entries are zero
uniq=unique(A);
row=size(A);
row=row(1);
% not considering zero (uniq(1) is the 0 used for padding)
AA=zeros(row,uniq(end));
AA_idx=[];
for x=uniq(2):uniq(end)
    % row index of every occurrence of x, recovered from its linear index
    AA_idxr=mod(find(A==x),row);
    AA_idxr(AA_idxr==0)=row;
    AA_idxc=x*ones(length(AA_idxr),1);
    % AA_idxc(AA_idxc==0)=uniq(end)
    c=[AA_idxr AA_idxc];
    AA_idx=cat(1,AA_idx,c);
    c=[];
end
for i=1:length(AA_idx)
    index=AA_idx(i,:);
    a=index(1);
    b=index(2);
    AA(a,b)=b;
end
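Since each value ends up in the column whose index equals that value, the placement step can possibly be shortened by using the parsed numbers directly as column indices. The following is only a sketch under the assumption that every entry is a positive integer; file_lines is a hypothetical in-memory stand-in for the lines of test.dat, and the other names are illustrative:

```matlab
% Hypothetical stand-in for the lines read from test.dat
file_lines = {'1 2 3 4 6 7 8', '1 2 3 4 6', '1 2 3 5 4 6', ...
              '1 2 3 4 6', '1 2 3 4 6', '1 2 3 4 6 8'};

num_lines = numel(file_lines);
parsed_rows = cell(num_lines, 1);
largest_value = 0;

% First pass: parse each line and remember the largest value seen
for line_index = 1:num_lines
    parsed_rows{line_index} = sscanf(file_lines{line_index}, '%f')';
    largest_value = max(largest_value, max(parsed_rows{line_index}));
end

% Second pass: value v goes into column v of its row, everything else stays 0
aligned_matrix = zeros(num_lines, largest_value);
for line_index = 1:num_lines
    values_in_line = parsed_rows{line_index};
    aligned_matrix(line_index, values_in_line) = values_in_line;
end

disp(aligned_matrix)
```

For the sample data this reproduces the desired output from the question, including the row where 5 appears before 4 in the file.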

Number 0's and 1's blocks in a binary vector

In MATLAB, there is the bwlabel function, that given a binary vector, for instance x=[1 1 0 0 0 1 1 0 0 1 1 1 0] gives (bwlabel(x)):
[1 1 0 0 0 2 2 0 0 3 3 3 0]
but what I want to obtain is
[1 1 2 2 2 3 3 4 4 5 5 5 6]
I know I can negate x to obtain (bwlabel(~x))
[0 0 1 1 1 0 0 2 2 0 0 0 3]
But how can I combine them?
All in one line:
y = cumsum([1,abs(diff(x))])
Namely, abs(diff(x)) marks every change in the binary vector with a 1, and the cumulative sum turns those marks into increasing block numbers. The leading 1 makes the first block start at label 1.
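To see why this works, the one-liner can be split into its intermediate steps. This is just the same computation spelled out on the vector from the question; change_marks and block_labels are illustrative names:

```matlab
x = [1 1 0 0 0 1 1 0 0 1 1 1 0];

% 1 wherever an element differs from its predecessor, 0 otherwise
change_marks = abs(diff(x));

% Start at label 1, then add 1 at every change
block_labels = cumsum([1, change_marks]);

disp(change_marks)   % 0 1 0 0 1 0 1 0 1 0 0 1
disp(block_labels)   % 1 1 2 2 2 3 3 4 4 5 5 5 6
```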
You can still do it using bwlabel by vertically concatenating x and ~x, using 4-connected components for the labeling, then taking the maximum down each column:
>> max(bwlabel([x; ~x], 4))
ans =
1 1 2 2 2 3 3 4 4 5 5 5 6
However, the solution from Bentoy13 is probably a bit faster.
x=[1 1 0 0 0 1 1 0 0 1 1 1 0];
A = bwlabel(x);
B = bwlabel(~x);
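if x(1)==1
    % blocks of ones come first: ones get odd labels, zeros get even labels
    tmp = A>0;
    A(tmp) = 2*A(tmp)-1;
    tmp = B>0;
    B(tmp) = 2*B(tmp);
    C = A+B
elseif x(1)==0
    % blocks of zeros come first: zeros get odd labels, ones get even labels
    tmp = A>0;
    A(tmp) = 2*A(tmp);
    tmp = B>0;
    B(tmp) = 2*B(tmp)-1;
    C = A+B
end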
C =
1 1 2 2 2 3 3 4 4 5 5 5 6
You know the first index should remain 1, but the second index should go from 1 to 2, the third from 2 to 3 etc; thus even indices should be doubled and odd indices should double minus one. This is given by A+A-1 for odd entries, and B+B for even entries. So a simple check for whether A or B contains the even points is sufficient, and then simply add the two arrays.
I found this function that does exactly what I wanted:
https://github.com/davidstutz/matlab-multi-label-connected-components
So, clone the repository and compile it in MATLAB using mex:
mex sp_fast_connected_relabel.cpp
Then,
labels = sp_fast_connected_relabel(x);

How can I find all the cells that have the same values in a multi-dimensional array in octave / matlab

How can I find all the cells that have the same values in a multi-dimensional array?
I can get it partially to work with result=A(:,:,1)==A(:,:,2) but I'm not sure how to also include A(:,:,3)
I tried result=A(:,:,1)==A(:,:,2)==A(:,:,3) but the results come back as all 0 when there should be 1 correct answer
which is where the number 8 is located in the same cell on all the pages of the array. Note: this is just a test the repeating number could be found multiple times and as different numbers.
PS: I'm using octave 3.8.1 which is like matlab
See code below:
clear all, tic
%graphics_toolkit gnuplot %use this for now it's older but allows zoom
A(:,:,1)=[1 2 3; 4 5 6; 7 9 8]; A(:,:,2)=[9 1 7; 6 5 4; 7 2 8]; A(:,:,3)=[2 4 6; 8 9 1; 3 5 8]
[i j k]=size(A)
for ii=1:k
maxamp(ii)=max(max(A(:,:,ii)))
Ainv(:,:,ii)=abs(A(:,:,ii)-maxamp(ii));%the extra max will get the max value of all values in array
end
%result=A(:,:,1)==A(:,:,2)==A(:,:,3)
result=A(:,:,1)==A(:,:,2)
result=double(result); %turns logical index into double to do find
[row col page] = find(result) %gives me the col, row, page
This is the output it gives me:
>>>A =
ans(:,:,1) =
1 2 3
4 5 6
7 9 8
ans(:,:,2) =
9 1 7
6 5 4
7 2 8
ans(:,:,3) =
2 4 6
8 9 1
3 5 8
i = 3
j = 3
k = 3
maxamp = 9
maxamp =
9 9
maxamp =
9 9 9
result =
0 0 0
0 1 0
1 0 1
row =
3
2
3
col =
1
2
3
page =
1
1
1
Use bsxfun to compare every slice against the first slice, then call all along the third dimension to check that the comparison holds for every slice:
B = bsxfun(@eq, A, A(:,:,1));
result = all(B, 3);
If we're playing code golf, a one-liner could be:
result = all(bsxfun(@eq, A, A(:,:,1)), 3);
The beauty of the above approach is that you can have as many slices as you want in the third dimension, other than just three.
Example
%// Your data
A(:,:,1)=[1 2 3; 4 5 6; 7 9 8];
A(:,:,2)=[9 1 7; 6 5 4; 7 2 8];
A(:,:,3)=[2 4 6; 8 9 1; 3 5 8];
B = bsxfun(@eq, A, A(:,:,1));
result = all(B, 3);
... gives us:
>> result
result =
0 0 0
0 0 0
0 0 1
The above makes sense, since the third row and third column is the only position where every slice holds the same value (i.e. 8).
Here's another approach: compute differences along third dimension and detect when all those differences are zero:
result = ~any(diff(A,[],3),3);
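As a quick check, both approaches can be run side by side on the data from the question. This is a small sketch with illustrative variable names; it also shows how to turn the logical result into row and column indices with find:

```matlab
A = zeros(3, 3, 3);
A(:,:,1) = [1 2 3; 4 5 6; 7 9 8];
A(:,:,2) = [9 1 7; 6 5 4; 7 2 8];
A(:,:,3) = [2 4 6; 8 9 1; 3 5 8];

% Approach 1: every slice equals the first slice
same_as_first_slice = all(bsxfun(@eq, A, A(:,:,1)), 3);

% Approach 2: no difference between neighbouring slices
no_change_between_slices = ~any(diff(A, [], 3), 3);

% Both give the same 3x3 logical matrix
disp(isequal(same_as_first_slice, no_change_between_slices))

% Row and column of every position that matches in all slices
[matching_rows, matching_columns] = find(same_as_first_slice);
disp([matching_rows, matching_columns])   % 3 3
```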
You can do
result = A(:,:,1) == A(:,:,2) & A(:,:,1) == A(:,:,3);
Sum the elements along the third dimension and divide by the number of slices, size(A,3). Where the values are the same in every slice, this mean equals the original value; otherwise at least one slice differs from the mean (which is often a decimal). Then find the locations where A equals the mean in every slice.
all( A == sum(A,3)./size(A,3),3)
ans =
0 0 0
0 0 0
0 0 1
or
You could also do
all(A==repmat(sum(A,3)./size(A,3),[1 1 size(A,3)]),3)
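where repmat(sum(A,3)./size(A,3),[1 1 size(A,3)]) makes explicit the broadcasting that happens implicitly when the mean is compared with A.
or
you skip the broadcasting altogether and compare the mean with only the first slice of A
A(:,:,1) == sum(A,3)./size(A,3)
Be aware that this last shortcut can report false positives: a single slice can equal the mean even though the slices differ (e.g. the values 2, 1, 3 have mean 2).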
Explanation
3 represents the third dimension.
sum(A,3) means that we are taking the sum over the third dimension.
Then we divide that sum by the number of slices along that dimension, size(A,3).
It's basically the average value for that position in the third dimension.
If you add three identical values and then divide by three, you get the original value back.
For example, A(3,3,:) is [8 8 8]. (8+8+8)/3 = 8.
If you take another example, i.e. the value above, A(2,3,:) = [6 4 1].
Then (6+4+1)/3=3.667. This is not equal to any of the values in A(2,3,:).
sum(A,3)./size(A,3)
ans =
4.0000 2.3333 5.3333
6.0000 6.3333 3.6667
5.6667 5.3333 8.0000
Therefore, we know that the elements are not the same
throughout the third dimension. This is just a trick I use
to determine that. You also have to remember that
sum(A,3)./size(A,3) is originally a 3x3x1 matrix
that will be automatically expanded (i.e. broadcasted) to a
3x3x3 matrix when we do the comparison with A (A == sum(A,3)./size(A,3)).
The result of that comparison will be a logical array with 1 for the positions that are the same throughout the third dimension.
A == sum(A,3)./size(A,3)
ans =
ans(:,:,1) =
0 0 0
0 0 0
0 0 1
ans(:,:,2) =
0 0 0
1 0 0
0 0 1
ans(:,:,3) =
0 0 0
0 0 0
0 0 1
Then use all(....,3) to get those. The result is a 3x3x1
matrix where a 1 indicates that the value is the same in the
third dimension.
all( A == sum(A,3)./size(A,3),3)
ans =
0 0 0
0 0 0
0 0 1
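One caveat with the mean trick is worth illustrating: comparing only the first slice with the mean is not enough, because one slice can equal the mean while the others differ. A tiny made-up example (the 1x1x3 array small_stack is hypothetical, not from the question); bsxfun is used so the sketch does not rely on implicit broadcasting:

```matlab
% One position, three slices holding 2, 1 and 3
small_stack = cat(3, 2, 1, 3);

% Mean over the third dimension is (2+1+3)/3 = 2
mean_over_slices = sum(small_stack, 3) ./ size(small_stack, 3);

% Comparing only the first slice wrongly reports a match
first_slice_only = (small_stack(:,:,1) == mean_over_slices);

% Comparing every slice with the mean correctly reports no match
every_slice = all(bsxfun(@eq, small_stack, mean_over_slices), 3);

disp(first_slice_only)   % 1
disp(every_slice)        % 0
```

With non-integer data, exact equality against a computed mean can also fail because of floating-point rounding, so the direct slice comparisons are probably the safer choice there.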

Shuffle, then find and replace duplicates in two dimensional array - without sorting

I'm looking for an efficient algorithm (or any at all..) for this tricky thing. I'll simplify my problem. In my application, this array is about 10000 times bigger :)
I have a 2D array like this:
0 2 1 3 4
1 2 0 4 3
0 2 1 3 4
4 1 2 3 0
Yes, every row contains the values 0 to 4, but in a different order. The order matters! I can't just sort it and solve this the easy way :)
Then I shuffle it by choosing random indexes and swapping them, a couple of times. Example result:
0 1 1 1 4
1 2 2 4 3
0 2 3 3 4
4 2 0 3 0
I see duplicates in the rows, and that's not good. The algorithm should find these duplicates and replace each one with a value that will not be another duplicate in that row, for example:
0 1 2 3 4
1 2 0 4 3
0 2 3 1 4
4 2 0 3 1
Can you share your ideas? Maybe there is already very famous algorithm for this problem? I'd be grateful for any hint.
EDIT
Clarification for T_G: After the shuffle, a row can't exchange values with other rows. The algorithm needs to find the duplicates and replace each one with any available value that is left, i.e. one which is not another duplicate.
After shuffling:
0 1 1 1 4
1 2 2 4 3
0 2 3 3 4
4 2 0 3 0
Steps:
I have 0; I don't see another zeros. Next.
I have 1; I see another 1; I should change it (the second one); there is no 2 in this row, so lets change this duplicate 1 to 2.
I have 1; I see another 1. I should change it (the second one); there is no 3 in this row, so lets change this duplicate 1 to 3. etc...
So if you input this row:
0 0 0 0 0 0 0 0 0
You should get:
0 1 2 3 4 5 6 7 8
Try something like this:
// Fragment: needs #include <stdint.h> and #include <string.h>;
// matrix and max_line_num are assumed to be defined elsewhere.
// Iterate matrix lines, line by line
for (uint32_t line_no = 0; line_no < max_line_num; line_no++) {
    // Counters for each symbol 0-4; index is symbol, value is count
    uint8_t counters[5];
    // Clear counters before usage (memset takes destination, value, size)
    memset(counters, 0, sizeof(counters));
    // Compute counters over the 5 entries of the line
    for (int i = 0; i < 5; i++)
        counters[matrix[line_no][i]]++;
    // Index of a possibly unused symbol; start from the largest, 4
    int j = 4;
    // Iterate line in reversed order, so the first occurrence is kept
    for (int i = 4; i >= 0; i--)
        if (counters[matrix[line_no][i]] > 1) { // found dup
            while (counters[j] != 0) // find unused symbol "j"
                j--;
            counters[matrix[line_no][i]]--; // decrease dup counter
            matrix[line_no][i] = j;         // substitute dup with symbol j
            counters[j]++;                  // symbol j is now used
        } // for + if
} // for lines
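The steps listed in the question (keep the first occurrence, replace later duplicates with the smallest unused values) can also be sketched in MATLAB for a single row. This is an illustrative sketch, not a port of the code above, which fills in the largest unused symbols first. It assumes the row has n entries drawn from the symbols 0 to n-1, and all variable names are made up:

```matlab
shuffled_row = [0 1 1 1 4];           % one row after shuffling
num_symbols = numel(shuffled_row);    % assumed: symbols are 0..num_symbols-1

already_seen = false(1, num_symbols); % already_seen(s+1) is true once s was met
duplicate_positions = [];

% Walk left to right: the first occurrence is kept, later ones are recorded
for position = 1:num_symbols
    symbol = shuffled_row(position);
    if already_seen(symbol + 1)
        duplicate_positions(end + 1) = position;
    else
        already_seen(symbol + 1) = true;
    end
end

% Symbols that never appeared, in ascending order
missing_symbols = find(~already_seen) - 1;

% There are exactly as many duplicates as missing symbols, so pair them up
fixed_row = shuffled_row;
fixed_row(duplicate_positions) = missing_symbols;

disp(fixed_row)   % 0 1 2 3 4
```

For the whole matrix, the same steps would be repeated for each row; each row costs time proportional to its length.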

Matlab counting elements in array

Hey guys I just have a quick question regarding counting elements in an array.
the array is something like this
B = [1 0 1 0 0 -1; 1 1 1 0 -1 -1; 0 1 -1 0 0 1]
From this array I want to create an array structure called column counts and another called row counts. I really do want to create an array structure, even if it is a less efficient process.
Basically, I want to go through the array and, for each row and each column, total the number of times these values occur. For instance, for the first row, I want the following output.
Row Counts
-1 0 1
1 3 2
thanks in advance
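You can use the hist function to do this. Passing the bin centers [-1 0 1] explicitly keeps the counts aligned with the header even when a row is missing one of the values.
fprintf('Row counts\n');
disp([-1 0 1])
fprintf('\n')
for row = 1:3
    disp(hist(B(row,:), [-1 0 1]));
end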
yields
Row counts
-1 0 1
1 3 2
2 1 3
1 3 2
I don't fully understand your question, but if you want to count the occurrences of an element in a Matlab array you can do something like:
% Find value 3 in array A
A =[ 1 4 5 3 3 1 2 4 2 3 ];
count = sum( A == 3 )
The comparison A==3 returns a logical array of the same size as A, with 1 at every position where the element equals 3 and 0 everywhere else. Summing that array therefore counts the occurrences.
Edit: you can access the different dimensions like this:
A = [ 1 2 3 4; 1 2 3 4; 1 2 3 4 ]; % 3 rows x 4 columns matrix
count1 = sum( A(:,1) == 2 ); % count occurrences in the first column
count2 = sum( A(:,3) == 2 ); % count occurrences in the third column
count3 = sum( A(2,:) == 2 ); % count occurrences in the second row
You always access given rows or columns like that.
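Putting this together for the matrix B from the question, the counts can be collected in a struct, since an array structure was asked for. This is only a sketch: the field names values, row_counts and column_counts are made up, and the list of values to count is assumed to be -1, 0 and 1:

```matlab
B = [1 0 1 0 0 -1; 1 1 1 0 -1 -1; 0 1 -1 0 0 1];

% Assumption: these are the only values of interest
values_to_count = [-1 0 1];
num_values = numel(values_to_count);

counts = struct('values', values_to_count, ...
                'row_counts', zeros(size(B, 1), num_values), ...
                'column_counts', zeros(num_values, size(B, 2)));

for k = 1:num_values
    is_match = (B == values_to_count(k));       % logical matrix, 1 where B equals the value
    counts.row_counts(:, k)    = sum(is_match, 2);  % one count per row of B
    counts.column_counts(k, :) = sum(is_match, 1);  % one count per column of B
end

disp(counts.row_counts)
disp(counts.column_counts)
```

Each row of counts.row_counts lists how often -1, 0 and 1 occur in the matching row of B, so the first row reads 1 3 2 as in the requested output. In counts.column_counts the values run down the rows and the columns of B run across.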
