2D Array containing only 0 or 1 - c

I have a 2 dimensional array which randomly contains values of 0 or 1.
How can I (most efficiently) determine the lowermost element of value 1 (the largest row index i) and the rightmost element of value 1 (the largest column index j)?
For example:
0 0 1 0
1 0 1 0
0 1 0 0
1 0 0 0
My program should answer i = 3 (assuming the first row is i = 0) and j = 2 (assuming the first column is j = 0).

Here's an idea:
Starting with the bottom-most row, use memrchr (a GNU extension, not standard C) to find the last 1 in each row (I'm assuming you store the numbers as char, i.e. 8-bit integers).
Eventually you will find a row which has a 1. This is your answer for i. We got this far using cache-friendly, row-at-a-time operations because C uses row-major order.
Above, you also now know the lower bound for j (because you found the last 1 in the last row that had any 1s).
For the remaining rows, use memrchr from one past the lower bound for j to the end of each row. If you find any 1s there, update the lower bound. Repeat until you have inspected all the rows.
Of course, if you ever find a 1 in the last column, you can stop right away.
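The steps above can be sketched roughly as follows. This is only an illustration under some assumptions: the matrix is stored row by row as unsigned char, and a small hand-written helper (last_one, a made-up name) stands in for memrchr so that the sketch does not depend on the GNU extension. The function name lowest_and_rightmost is also hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Portable stand-in for memrchr: index of the last 1 in p[lo..hi), or -1. */
static int last_one(const unsigned char *p, int lo, int hi)
{
    for (int j = hi - 1; j >= lo; j--)
        if (p[j] == 1)
            return j;
    return -1;
}

/* Writes the largest row index and the largest column index holding a 1.
   Both stay -1 when the matrix contains no 1 at all. */
void lowest_and_rightmost(const unsigned char *m, int rows, int cols,
                          int *out_i, int *out_j)
{
    assert(m != NULL && out_i != NULL && out_j != NULL);
    *out_i = -1;
    *out_j = -1;
    for (int r = rows - 1; r >= 0; r--) {
        /* Only columns to the right of the current bound can improve j. */
        int lo = *out_j + 1;
        int j = last_one(m + (size_t)r * cols, lo, cols);
        if (j < 0)
            continue;
        if (*out_i < 0)
            *out_i = r;          /* first hit from the bottom fixes i */
        *out_j = j;
        if (j == cols - 1)
            break;               /* a 1 in the last column cannot be beaten */
    }
}
```

For the matrix in the question this yields i = 3 and j = 2. On a system that provides memrchr, the helper could presumably be swapped for a call to it.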

Use a plain loop and simply search from the beginning (or the end, depending on what you want to achieve) and check each element. There is no more efficient way.
As far as C and C++ are concerned, what is efficient and what is not lies in the nature of the implementation. If this is a bit field matrix for example, then you can optimize the code slightly by first comparing each byte against 0, before you start searching through the individual bits.
And as usual, it doesn't make sense to talk about efficiency without specifying what it means. Speed? Memory consumption? Program size? It also doesn't make sense to talk about efficient implementation in C or C++ without a given system in mind.

Here is the naive method - just iterating through all positions in the array. Worst case O(n*m):
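#define WIDTH 4
#define HEIGHT 4
int main (void)
{
    int i, j;
    int col = -1, row = -1;   /* -1 means no 1 has been seen yet */
    int arr[HEIGHT][WIDTH] = {{0,0,1,0},{1,0,1,0},{0,1,0,0},{1,0,0,0}};
    for (j = 0; j < HEIGHT; j++){
        for (i = 0; i < WIDTH; i++){
            if (arr[j][i]){
                row = j > row ? j : row;
                col = i > col ? i : col;
            }
        }
    }
    return 0;
}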
How can we improve this? Well we can start from the end and work backwards, but we will have to do the rows and columns alternately rather than just visiting each cell in turn. We could look for column, and then row, but that would be less efficient.
0. 1. 2. 3.
0. 0 0 1 0
1. 1 0 1 0
2. 0 1 0 0
3. 1 0 0 0
In this example, we search row 3 and column 3 first, and eliminate them from the search. Then row 2 and column 2 up to but not including the eliminated column 3 and row 3. Then row 1...
Of course, we stop searching rows when the bottom most one containing a 1 is found, and stop searching columns when the rightmost one containing a 1 is found.
Code:
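#include <stdio.h>
#define WIDTH 4
#define HEIGHT 4
int main (void)
{
    int i, j;
    int col = -1, row = -1;   /* -1 means "not found yet", so index 0 is a valid answer */
    int current_row = HEIGHT;
    int current_col = WIDTH;
    int arr[HEIGHT][WIDTH] = {{0,0,1,0},{1,0,1,0},{0,1,0,0},{1,0,0,0}};
    while ((row < 0 && current_row > 0) || (col < 0 && current_col > 0))
    {
        int search_row = (row < 0 && current_row > 0);
        int search_col = (col < 0 && current_col > 0);
        if (search_row) current_row--;
        if (search_col) current_col--;
        if (search_row){
            printf("searching row: %d\n", current_row);
            /* columns right of current_col are already known to hold no 1 */
            for (i = 0; i <= current_col; i++){
                if (arr[current_row][i]){
                    row = current_row;
                }
            }
        }
        if (search_col){
            printf("searching col: %d\n", current_col);
            /* rows below current_row are already known to hold no 1 */
            for (j = 0; j <= current_row; j++){
                if (arr[j][current_col]){
                    col = current_col;
                }
            }
        }
    }
    printf("col: %d, row: %d\n", col, row);
    return 0;
}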
Output:
searching row: 3
searching col: 3
searching col: 2
col: 2, row: 3
The worst case is still O(m*n), and is actually slightly worse (you test cells on the diagonal starting from the bottom right twice), but the average case is better.
We scan through the lowest unsearched row for a 1, then search through the rightmost unsearched column for a 1.
When you find the lowest 1 you no longer search each row for more 1's. When you find the rightmost 1 you no longer search each column for more 1's either.
This way we stop the search once we find the answer, and unlike the naive method, this means that we don't usually have to go through each value in the array.

If the row size of the array is up to 32 numbers you can use a single int32_t to represent a whole row: The value of the number is the whole row.
Then your whole array will be a one dimensional array of int32_t's:
int32_t matrix[nRows];
Now you can find the lowermost row by finding the last number of matrix that is not equal to 0 in O(nRows) time with a very simple implementation.
You can also find the rightmost 1 with the following trick:
For each matrix[i], isolate the lowest set bit by calculating matrix[i] & -matrix[i] (an unsigned type such as uint32_t avoids overflow when negating). The log2 of this result is the bit position, which gives you the column of the rightmost 1 in that row, provided the rightmost column is stored in the least significant bit. The largest such column over all matrix[i] is the result you want. (Again O(nRows) time with a very simple implementation.)
Of course, if the row size is larger than 32 values, you have to use more int32_t values per row, but the principle remains the same.
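A minimal sketch of this idea, under assumptions the answer does not spell out: rows are packed into uint32_t, column j of a width-column row lives in bit (width - 1 - j) so that the rightmost column is bit 0, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the last row that contains any 1, or -1 if there is none. */
int last_nonzero_row(const uint32_t *rows, int n_rows)
{
    for (int i = n_rows - 1; i >= 0; i--)
        if (rows[i] != 0)
            return i;
    return -1;
}

/* Largest column index holding a 1 in any row, or -1 if there is none.
   Assumed packing: column j is stored in bit (width - 1 - j). */
int rightmost_one_column(const uint32_t *rows, int n_rows, int width)
{
    assert(width >= 1 && width <= 32);
    int best = -1;
    for (int i = 0; i < n_rows; i++) {
        if (rows[i] == 0)
            continue;
        uint32_t lowest = rows[i] & (~rows[i] + 1u); /* isolate lowest set bit */
        int bit = 0;
        while ((lowest >> bit) != 1u)                /* integer log2 of a power of two */
            bit++;
        int col = width - 1 - bit;
        if (col > best)
            best = col;
    }
    return best;
}
```

With the question's matrix packed as {2, 10, 4, 8} and width 4, the first function returns 3 and the second returns 2.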

Related

Countif the Result of Subtracting Two Arrays Exceeds a Certain Value in Excel

I am new to array formulae and am having trouble with the following scenario:
I have the following matrix:
F G H I J ... R S T U V
1 0 0 1 1
0 1 1 1 2 3 1 2
2 0 2 3 1 2 0 1 0 0
2 1 0 0 1 0 0 3 0 0
My goal is to count the number of rows within which the difference between the sum of columns F:J and the sum of columns R:V is greater than a threshold. Critically, only rows with full data should be included: row 1 (where there are only values for columns F1:J1) and row 2 (where there are only some values for columns F2:J2) should be ignored.
If the threshold = 2.5, then the solution is 1. That is, row 3 is the only row with complete data where the difference between the sum of F3:J3 (8) and the sum of R3:V3 (3) is greater than 2.5 (i.e., 5 > 2.5).
I have tried to put together the following formula, rather pathetically, based on the teachings of @Tom Sharpe and @QHarr:
=COUNT(IF(SUBTOTAL(9,OFFSET(F1,ROW(F1:F4)-ROW(F1),0,1,COLUMNS(F1:J1)))-SUBTOTAL(9,OFFSET(R1,ROW(R1:R4)-ROW(R1),0,1,COLUMNS(R1:V1)))>2.5,IF(AND(SUBTOTAL(2,OFFSET(F1,ROW(F1:F4)-ROW(F1),0,1,COLUMNS(F1:J1)))=COLUMNS(F1:J1),SUBTOTAL(2,OFFSET(R1,ROW(R1:R4)-ROW(R1),0,1,COLUMNS(R1:V1)))=COLUMNS(R1:V1)),SUBTOTAL(9,OFFSET(F1,ROW(F1:F4)-ROW(F1),0,1,COLUMNS(F1:J1)))),IF(AND(SUBTOTAL(2,OFFSET(F1,ROW(F1:F4)-ROW(F1),0,1,COLUMNS(F1:J1)))=COLUMNS(F1:J1),SUBTOTAL(2,OFFSET(R1,ROW(R1:R4)-ROW(R1),0,1,COLUMNS(R1:V1)))=COLUMNS(R1:V1)),SUBTOTAL(9,OFFSET(R1,ROW(R1:V1)-ROW(R1),0,1,COLUMNS(R1:V1))))))
But it seems to always produce a value of 1, even if I edit the matrix such that the difference between the sum of F4:J4 and R4:V4 also exceeds 2.5. Sadly I am struggling to understand why and would appreciate any guidance on the matter.
As an array formula in one cell without volatile functions:
=SUM((MMULT(--(LEN(F2:J5)*LEN(R2:V5)>0),--TRANSPOSE(COLUMN(F2:J2)>0))=5)*(MMULT(F2:J5-R2:V5,TRANSPOSE(--(COLUMN(F2:J2)>0)))>2.5))
should do the trick :D
Maybe, in say X1 (assuming you have labelled your columns):
=COUNTIF(Y:Y,TRUE)
In Y1 whatever your chosen cutoff (eg 2.5) and in Y2:
=((COUNTBLANK(F2:J2)+COUNTBLANK(R2:V2)=0)*SUM(F2:J2)-SUM(R2:V2))>Y$1
copied down to suit.
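For readers who find the row-by-row rule easier to follow outside a spreadsheet, here is a rough C sketch of the same logic: skip any row with a blank cell in either group, then count rows whose difference of sums exceeds the cutoff. It is not an Excel solution. BLANK, GROUP, and the function names are hypothetical, and the sketch assumes the data are non-negative so that -1 can mark an empty cell.

```c
#include <assert.h>
#include <stddef.h>

#define BLANK (-1)   /* hypothetical marker for an empty cell */
#define GROUP 5      /* five columns per group, as in F:J and R:V */

/* Sum of one group of GROUP cells, or -1 if any cell in it is blank. */
static int group_sum(const int *cells)
{
    int sum = 0;
    for (int k = 0; k < GROUP; k++) {
        if (cells[k] == BLANK)
            return -1;
        sum += cells[k];
    }
    return sum;
}

/* Counts rows with complete data where sum(left) - sum(right) > threshold.
   left and right each hold n_rows * GROUP cells, stored row by row. */
int count_rows_over(const int *left, const int *right, int n_rows,
                    double threshold)
{
    assert(left != NULL && right != NULL);
    int count = 0;
    for (int r = 0; r < n_rows; r++) {
        int a = group_sum(left + r * GROUP);
        int b = group_sum(right + r * GROUP);
        if (a < 0 || b < 0)
            continue;            /* incomplete row: ignore it */
        if (a - b > threshold)
            count++;
    }
    return count;
}
```

Fed rows 1, 3 and 4 of the question's matrix (row 2 is left out here because the positions of its blanks cannot be recovered from the post) with a threshold of 2.5, this counts 1, matching the expected answer.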
Try this:
=SUMPRODUCT((MMULT(F1:J4-R1:V4,--(ROW(INDIRECT("1:"&COLUMNS(F1:J4)))>0))>2.5)*(MMULT((LEN(F1:J4)>0)+(LEN(R1:V4)>0),--(ROW(INDIRECT("1:"&COLUMNS(F1:J4)))>0))=(COLUMNS(F1:J4)+COLUMNS(R1:V4))))
I think this will do it, replacing your AND's by multiplies (*):
=SUMPRODUCT(--((SUBTOTAL(9,OFFSET(F1,ROW(F1:F4)-ROW(F1),0,1,COLUMNS(F1:J1)))-SUBTOTAL(9,OFFSET(R1,ROW(R1:R4)-ROW(R1),0,1,COLUMNS(R1:V1)))>2.5)*(SUBTOTAL(2,OFFSET(F1,ROW(F1:F4)-ROW(F1),0,1,COLUMNS(F1:J1)))=COLUMNS(F1:J1))*(SUBTOTAL(2,OFFSET(R1,ROW(R1:R4)-ROW(R1),0,1,COLUMNS(R1:V1)))=COLUMNS(R1:V1))>0))
It could be simplified a bit more but a bit short of time.
Just another option...
=IF(NOT(OR(IFERROR(MATCH(TRUE,ISBLANK(F1:J1),0),FALSE),IFERROR(MATCH(TRUE,ISBLANK(R1:V1),0),FALSE))), SUBTOTAL(9,F1:J1)-SUBTOTAL(9,R1:V1), "Missing Value(s)")
My approach is a little different from what you tried to adapt from @TomSharp, in that I validate that the cells have data (are not blank) and then perform the calculation, otherwise returning an error message. This is still an array formula, so when you enter it, press Ctrl+Shift+Enter.
The condition of the opening IF() checks that none of the cells in either range is blank. If MATCH(TRUE,ISBLANK(range),0) finds a match, a cell is blank (bad).
If there is no match, i.e. no blank cells, MATCH returns an #N/A error (good), which IFERROR turns into FALSE. So FALSE for both ranges means no blank cells were found.
Then the threshold condition becomes:
=COUNTIF(X1:X4,">"&Threshold)
Note: this one is not an array formula.
I gave the threshold (cell W6) a named range for readability.

matlab: how to speed up the count of consecutive values in a cell array

I have the 137x19 cell array Location(1,4).loc and I want to find the number of times that horizontal consecutive values are present in Location(1,4).loc. I have used this code:
x=Location(1,4).loc;
y={x(:,1),x(:,2)};
for ii=1:137
cnt(ii,1)=sum(strcmp(x(:,1),y{1,1}{ii,1})&strcmp(x(:,2),y{1,2}{ii,1}));
end
y={x(:,1),x(:,2),x(:,3)};
for ii=1:137
cnt(ii,2)=sum(strcmp(x(:,1),y{1,1}{ii,1})&strcmp(x(:,2),y{1,2}{ii,1})&strcmp(x(:,3),y{1,3}{ii,1}));
end
y={x(:,1),x(:,2),x(:,3),x(:,4)};
for ii=1:137
cnt(ii,3)=sum(strcmp(x(:,1),y{1,1}{ii,1})&strcmp(x(:,2),y{1,2}{ii,1})&strcmp(x(:,3),y{1,3}{ii,1})&strcmp(x(:,4),y{1,4}{ii,1}));
end
y={x(:,1),x(:,2),x(:,3),x(:,4),x(:,5)};
for ii=1:137
cnt(ii,4)=sum(strcmp(x(:,1),y{1,1}{ii,1})&strcmp(x(:,2),y{1,2}{ii,1})&strcmp(x(:,3),y{1,3}{ii,1})&strcmp(x(:,4),y{1,4}{ii,1})&strcmp(x(:,5),y{1,5}{ii,1}));
end
... continue for all the columns. This code run and gives me the correct result but it's not automated and it's slow. Can you give me ideas to automate and speed up the code?
I think I will write an answer to this since I've not done so for a while.
First convert your cell array to a matrix; this will ease the following steps a lot. Then diff is the way to go:
A = randi(5,[137,19]);
DiffA = diff(A')'; %// Diff creates a matrix that is 137 by 18, where each value has the previous value in its row subtracted from it.
So a 0 in DiffA would represent 2 consecutive numbers in A are equal, 2 consecutive 0s would mean 3 consecutive numbers in A are equal.
idx = DiffA==0;
cnt(:,1) = sum(idx,2);
To do 3 consecutive number counts, you could do something like:
idx2 = abs(DiffA(:,1:end-1))+abs(DiffA(:,2:end)) == 0;
cnt(:,2) = sum(idx2,2);
Or use another diff. The abs is needed so that a negative number plus a positive number that happens to give 0 is not counted; with abs, only 0 + 0 gives 0. You can continue this pattern with:
idx3 = abs(DiffA(:,1:end-2))+abs(DiffA(:,2:end-1))+abs(DiffA(:,3:end)) == 0;
cnt(:,3) = sum(idx3,2);
In loop format:
absDiffA = abs(DiffA);
W = size(absDiffA,2); %// largest number of consecutive differences to test
cnt = zeros(size(A,1),W);
for ii = 1:W
    cnt(:,ii) = sum(absDiffA == 0,2); %// windows of ii zero differences
    absDiffA = absDiffA(:,1:end-1) + absDiffA(:,2:end); %// widen each window by one
end
NOTE: this method counts [0,0,0] twice when evaluating 2 consecutives, and once when evaluating 3 consecutives.
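To make the counting rule concrete outside MATLAB, here is a small C sketch. It is a single-pass alternative, not the diff-based method itself: it tracks the length of the current streak of equal values and counts every position where a window of the requested length ends, so overlapping windows are counted separately just as described in the note. The function name count_equal_runs is made up for illustration, and numeric data are assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the windows of `run` consecutive equal values in row[0..n).
   Overlapping windows are counted separately. */
int count_equal_runs(const int *row, int n, int run)
{
    assert(row != NULL && run >= 2);
    int count = 0;
    int streak = 1;              /* length of the current streak of equal values */
    for (int j = 1; j < n; j++) {
        streak = (row[j] == row[j - 1]) ? streak + 1 : 1;
        if (streak >= run)
            count++;             /* a window of `run` equal values ends at j */
    }
    return count;
}
```

For the row {1, 1, 1, 2} this gives 2 for run = 2 and 1 for run = 3, which matches the behaviour described in the note.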

Finding Horizontal Line In Bitmap Image in C

I am stuck with my assessment. I am given a bitmap with its dimensions (rows and cols) such as this:
4 5
0 0 1 1 1
0 0 1 0 1
1 0 1 1 1
1 1 1 1 1
The task is to find the longest horizontal line, the longest vertical line, and a square (all made of 1s). I just need a first kick with the horizontal line.
I have put the bitmap in a 1D array and I don't know the next step.
The output for the bitmap above is the coordinates of the longest horizontal line:
3 0 3 4
Help is very much appreciated.
You have to extract the header into a structure and the data into an array like a[].
Let R and C be the number of rows and columns of the image. For a given array a[], each row of the image starts at a[C*i], where i is the row number. So you can index the row with i, and access each pixel in that row with C*i+j, where j is less than C. Next you need to process each row to find the length of the longest horizontal line. A small change to this can be used to index column j and find the longest vertical line.
To do the processing I said above, create a structure of point as
struct point
{
int x;
int y;
}p1, p2;
Also create a variable called lenh, which holds the length of the horizontal line currently being scanned, and a variable llenh, which stores the longest length found so far. Set llenh to 0. Visit each pixel (i, j), indexed by C*i+j, row by row, resetting lenh to 0 at the start of each row. On seeing a 1, increment lenh and check whether it is greater than llenh; if it is, update llenh and record the start of the line in p1 and its end in p2. On seeing a 0, set lenh to 0.
I haven't revised this completely. If any errors please comment..
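The row scan described above could look something like this. It is only a sketch: struct hline and longest_hline are names invented here, the bitmap is assumed to be a flat int array stored row by row, and ties are resolved in favour of the first line found.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: row, first and last column, and length of the line. */
struct hline {
    int row;
    int col_start;
    int col_end;
    int length;
};

/* Finds the longest horizontal run of 1s in a rows x cols bitmap stored
   row by row in a[]. length is 0 (and the indices -1) if there is no 1. */
struct hline longest_hline(const int *a, int rows, int cols)
{
    assert(a != NULL);
    struct hline best = { -1, -1, -1, 0 };
    for (int i = 0; i < rows; i++) {
        int len = 0;                      /* length of the current run */
        for (int j = 0; j < cols; j++) {
            if (a[cols * i + j]) {
                len++;
                if (len > best.length) {  /* new longest line ends at (i, j) */
                    best.length = len;
                    best.row = i;
                    best.col_start = j - len + 1;
                    best.col_end = j;
                }
            } else {
                len = 0;                  /* a 0 breaks the run */
            }
        }
    }
    return best;
}
```

For the 4 x 5 bitmap in the question, the result is row 3, columns 0 to 4, which corresponds to the expected output 3 0 3 4. Swapping the roles of i and j in the loops should give the vertical version.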

Given a boolean matrix sorted row wise. Return a row with max number of 1

I came across a problem of Matrices but was trying to figure out the optimal solution.
Problem statement is the question topic itself.
Further see below
Example
Input matrix
0 1 1 1
0 0 1 1
1 1 1 1 // this row has maximum 1s
0 0 0 0
Output: 2
My solution: since the rows are sorted, I thought of performing a binary search in each row for the first occurrence of 1; the count of 1s is then the total number of columns minus the index of the first 1.
This will do it in O(m*logn), but I was curious to know whether this could be done in linear time.
Thank you!
Start a cursor in the top right. In each row, step left until you reach the last 1 in the row. Then step down. If you step down and the cursor points to a 0, step down again. Never go right. You're looking for the row that has a 1 furthest to the left, so you never need to look to the right. The runtime is O(n+m), since you go through every row, stepping down m times, and you make a total of n steps left at most. Here's some pseudocode, assuming that the matrix is a list of lists:
bestrow = 0
leftmost = matrix.width
for i = 0 to matrix.height - 1:
    row = matrix[i]
    while row[leftmost - 1] == 1:
        leftmost--
        bestrow = i
return bestrow
If you translate the code literally, you may have problems with a matrix of all 0's, or if some row has all 1's. These are pretty easy to deal with, and the point of the pseudocode is just to communicate the general algorithm.
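One possible C translation of the walk, with the two edge cases handled: it returns -1 for a matrix of all 0s, and it stops stepping left once a row of all 1s has pushed the cursor to column 0. The function name and the flat row-by-row int storage are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the first row with the most 1s in a matrix whose
   rows are sorted (all 0s before all 1s), or -1 if the matrix holds no 1.
   The matrix is assumed to be stored row by row in a flat array. */
int row_with_most_ones(const int *m, int rows, int cols)
{
    assert(m != NULL);
    int best_row = -1;
    int leftmost = cols;         /* column of the leftmost 1 seen so far */
    for (int i = 0; i < rows && leftmost > 0; i++) {
        /* Step left while the cell to the left of the cursor is a 1. */
        while (leftmost > 0 && m[i * cols + leftmost - 1] == 1) {
            leftmost--;
            best_row = i;
        }
    }
    return best_row;
}
```

The cursor moves left at most n times in total and down at most m times, so the sketch keeps the O(n+m) bound. For the example matrix it returns 2.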
The solution for this problem depends on the number of elements in each row and the number of columns.
Here is an approach.
Step 1:
Simply OR together (||) all elements in each column, starting from the leftmost column, until you get a true, which means you have found a column that has at least one 1. This takes at most n column checks, where n is the number of columns.
Step 2:
Now search for a 1 from top to bottom in that column, which gives you the row with the maximum number of ones. This takes at most m steps, where m is the number of rows.
So overall it takes O(m+n) steps if a whole column can be OR-ed in one step; checking a column element by element makes the worst case O(m*n).
It also helps you find multiple rows, if any, with the same property.

How to structure a cell to store values in a specific format in Matlab?

I have a code that looks for the best combination between two arrays that are less than a specific value. The code only uses one value from each row of array B at a time.
B =
1 2 3
10 20 30
100 200 300
1000 2000 3000
and the code i'm using is :
B=[1 2 3; 10 20 30 ; 100 200 300 ; 1000 2000 3000];
A=[100; 500; 300 ; 425];
SA = sum(A);
V={}; % number of rows for cell V = num of combinations -- column = 1
n = 1;
for k = 1:length(B)
    for idx = nchoosek(1:numel(B), k)'
        rows = mod(idx, length(B));
        if ~isequal(rows, unique(rows)) % two elements come from the same row
            continue                     % so ignore this combination
        end
        B_subset = B(idx);
        if (SA + sum(B_subset) <= 2000) % sum of A + combination is at most 2000
            V(n,1) = {B_subset(:)};     % store the valid combination in cell V
            n = n + 1;
        end
    end
end
However, I would like to display results differently than how this code stores them in a cell.
Instead of displaying results in cell V such as :
[1]
[10]
[300]
[10;200]
[1000;30]
[1;10;300]
This is preferred : (each row X column takes a specific position in the cell)
Here, this means that they should be arranged as cell(1,1)={[B(1,x),B(2,y),B(3,z),B(4,w)]}. Where x y z w are the columns with chosen values. So that the displayed output is :
[1;0;0;0]
[0;10;0;0]
[0;0;300;0]
[0;10;200;0]
[0;30;0;1000]
[1;10;300;0]
In each answer, the combination is determined by choosing a value from the 1st to 4th row of matrix B. Each row has 3 columns, and only one value from each row can be chosen at once. However, if for example B(1,2) cannot be used, it will be replaced with a zero. e.g. if row 1 of B cannot be used, then B(1,1:3) will be a single 0. And the result will be [0;x;y;z].
So, if 2 is chosen from the 1st row, and 20 is chosen from the 2nd row, while the 3rd and 4th rows are NOT included, they should show a 0. So the answer would be [2;20;0;0].
If only the 4th row is used (such as 1000 for example), the answer should be [0;0;0;1000]
In summary I want to implement the following :
Each cell contains length(B) values from every row of B (based on the combination)
Each value not used for the combination should be a 0 and printed in the cell
I am currently trying to implement this but my methods are not working .. If you require more info, please let me know.
edit
I have tried to implement the code in the dfb's answer below but having difficulties, please take a look at the answer as it contains half of the solution.
My MATLAB is super rusty, but doesn't something like this do what you need?
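arr = zeros(size(B,1),1);
r = mod(idx-1,size(B,1))+1; % row in B of each chosen element (linear indices run down the columns)
arr(r) = B_subset(:);
V(n,1) = {arr};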

Resources