array based sumif in excel by formula - arrays

So I m trying to find some alternative to sumifs in excel where each condition needs to be checked in a 2D range instead of a 1D range.
For example, in the below table I want the sum of values in column V for rows where A12 ("IJ") is present in range A2:C8 (P), B12 ("NM") is present in the range D2:F8 (S) and C12 ("XX") is present in range G2:I8 (A)
I am trying to find a solution involving an array-based formula (without VBA).
Like for example in the below-given formulas,
SUMPRODUCT((B2:B8'=A12)*J2:J8) will give an array-based calculation as follows
SUMPRODUCT({TRUE;FALSE;TRUE;FALSE;FALSE;TRUE;FALSE}*{22;79;45;67;43;72;52})
= SUMPRODUCT({22;0;45;0;0;72;0})
=139
It is easy when there is only one condition needs to be checked but like sumifs, I intend to check multiple conditions, but as soon as I add other conditions, the array becomes multidimensional and gives the wrong answer.
Example:
SUMPRODUCT((A2:C8=A12)*(D2:F8=B12)*J2:J8) breaks down to
=SUMPRODUCT(
{FALSE,TRUE,FALSE;FALSE,FALSE,FALSE;FALSE,TRUE,FALSE;FALSE,FALSE,FALSE;FALSE,FALSE,FALSE;FALSE,TRUE,FALSE;FALSE,FALSE,FALSE}*
{TRUE,FALSE,FALSE;FALSE,FALSE,FALSE;TRUE,FALSE,FALSE;FALSE,FALSE,FALSE;FALSE,FALSE,FALSE;TRUE,FALSE,FALSE;FALSE,FALSE,FALSE}
*J2:J8)
in the background what is happening is (example for 3rd row)
SUMPRODUCT( ({FALSE, TRUE ,FALSE} * {TRUE,FALSE,FALSE}) * 45 )
= SUMPRODUCT({FALSE,FALSE,FALSE} *45 )
=0
SUMPRODUCT(({FALSE,TRUE ,FALSE} + {TRUE,FALSE,FALSE}) * 45 )
= SUMPRODUCT({TRUE,TRUE,FALSE} *45 )
= 90
#expected answer =45
Can someone help me understand where I am going wrong or what I am missing?
If there is any other way then suggestions are always welcome.
Please note this is a dummy data actual data is very big for each header (P,S,A) there are values in 10 columns respectively and the number of rows is also very large.

Try this...
=SUMPRODUCT( ((A2:A8=A12)+(B2:B8=A12)+(C2:C8=A12)) * ((D2:D8=B12)+(E2:E8=B12)+(F2:F8=B12)) * ((G2:G8=C12)+(H2:H8=C12)+(I2:I8=C12)) * J2:J8 )
For SUMPRODUCT to work, the shape of the Boolean array needs to match the shape of the array you wish to conditionally sum.
J2:J8 is seven rows tall by one column wide.
The above formula creates an array of 1s and 0s from your three criteria ranges and shapes it into seven rows tall by one column wide.
At that point, SUMPRODUCT can do it's normal thing because the criteria array matches the dimension of the sum array J2:J8.

Related

Multiply two ranges in Excel to get an array result

Excel 365 allows to multiply ranges to get an array as a result.
Example:
#
A
B
C
1
1
0
1
2
0
1
1
Entering in A3
= A1:C1 * A2:C2
will evaluate to {1,0,1} * {0,1,1}
and return an array {0,0,1} spilling in A3:C3
#
A
B
C
3
0
0
1
This operation can also be used in formulas, especially useful in FILTER(), SUMPRODUCT() etc.
Is there a formula in Excel 365 that can take as arguments an arbitrary number of 1-D ranges, multiply them, and return a 1-D array in the same way as above?
For what I found out so far, SUMPRODUCT() and MMULT() can return only a single value, not a 1-D array.
Alternatively, I can write a LAMBDA, but would like to avoid it, if there is a ready-made formula for it.
I am not 100% what do you mean, I would assume you want to multiply all rows of the same column and return a row array with the result per column. You can achieve it in cell E1 using the following formula:
=BYCOL(A1:C3, LAMBDA(col, PRODUCT(col)))
and here is the output:
If you have only positives numbers, then you can use MMULT, based on the following mathematical properties:
Putting in excel terms using EXP/LN functions in our case it would be:
=EXP(MMULT(TOROW(ROW(A1:C3)/ROW(A1:C3)), LN(A1:C3)))
or using LET to avoid repetitions:
=LET(rng, A1:C3, rows, ROW(rng), u, TOROW(rows/rows), EXP(MMULT(u, LN(rng))))
You get the same result.
Note: rows/rows just returns the unit vector with the same number of rows as A1:C3.

Mapping a 2D array into 1D array with variable column width

I know mapping 2D array into 1D array has been asked many times, but I did not find a solution that would fit a where the column count varies.
So I want get a 1-dimensional index from this 2-dimensional array
Col> _0____1____2__
Row 0 |_0__|_1__|_2__|
V 1 |_3__|_4__|
2 |_5__|_6__|_7__|
3 |_8__|_9__|
4 |_10_|_11_|_12_|
5 |_13_|_14_|
The normal formula index = row * columns + column does not work, since after the 2nd row the index is out of place.
What is the correct formula here?
EDIT:
The specific issue is that I have a list of items in with the layout like in the grid, but a one dimensional array for the data. So while looping through the elements in the UI, I need to get the correct data, but can only get the row and column for that element. I need to find a way to turn a row/column value into an index for the data-array
Bad picture trying to explain it
A truly optimal answer (or even a provably correct one) will depend on the language you are using and how it lays out memory for such arrays.
However, taking your question simply at face value, you have to know what the actual length of each row is in order to calculate a 1D index.
So either the row length follows some pattern that can be inferred from the data, or you have (or can write) a rlen = rowLength( 2dTable, RowNumber) function.
Then, depending on how big the tables are and how fast you need to run, you can calculate a 1D index from the 2d table by adding all the previous row lengths until the current row length is less than the 2d column index.
or build a 1d table of the row lengths (or commulative rowlengths) so you can scan it and so only call your rowlength function for each row only once.
With a better description of your problem, you might get a better answer...
For your example which alternates between 3 and 2 columns you can construct a formula:
index = (row / 2) * (3 + 2) + (row % 2 ? 3 : 0) + column
(C-like syntax, assuming integer division)
In general though, the one and only way to implement what you're doing here, jagged arrays, is to make an array of arrays, a.k.a. an Iliffe vector. That means, use the row number as index into an array of pointers which point to the individual row arrays containing the actual data.
You can have an additional 1D array having the length of the columns say "length". Then your formula is index=sum {length(i)}+column. i runs from 0 to row.

Split matrix into several depending on value in Matlab

I have a cell array that I need to split into several matrices so that I can take the sum of subsets of the data. This is a sample of what I have:
A = {'M00.300', '1644.07';...
'M00.300', '9745.42'; ...
'M00.300', '2232.88'; ...
'M00.600', '13180.82'; ...
'M00.600', '2755.19'; ...
'M00.600', '15800.38'; ...
'M00.900', '18088.11'; ...
'M00.900', '1666.61'};
I want the sum of the second columns for each of 'M00.300', 'M00.600', and 'M00.900'. For example, to correspond to 'M00.300' I would want 1644.07 + 9745.42 + 2232.88.
I don't want to just hard code it because each data set is different, so I need the code to work for different size cell arrays.
I'm not sure of the best way to do this, I was going to begin by looping through A and comparing the strings in the first column and creating matrices within that loop, but that sounded messy and not efficient.
Is there a simpler way to do this?
Classic use of accumarray. You would use the first column as an index and the second column as the values associated with each index. accumarray works where you group values that belong to the same index together and you apply a function to those values. In your case, you'd use the default behaviour and sum things together.
However, you'll need to convert the first column into numeric labels. The third output of unique will help you do this. You'll also need to convert the second column into a numeric array and so str2double is a perfect way to do this.
Without further ado:
[val,~,id] = unique(A(:,1)); %// Get unique values and indices
out = accumarray(id, str2double(A(:,2))); %// Aggregate the groups and sum
format long g; %// For better display of precision
T = table(val, out) %// Display on a nice table
I get this:
>> T = table(val, out)
T =
val out
_________ ________
'M00.300' 13622.37
'M00.600' 31736.39
'M00.900' 19754.72
The above uses the table class that is available from R2013b and onwards. If you don't have this, you can perhaps use a for loop and print out each cell and value separately:
for idx = 1 : numel(out)
fprintf('%s: %f\n', val{idx}, out(idx));
end
We get:
M00.300: 13622.370000
M00.600: 31736.390000
M00.900: 19754.720000

Excel: Fill a range of cells with a value or formula depending on only one cell

We have a project on a certain math subject and I am done with the computations and it works just fine. So the task is, let's say you have a system of linear equations of certain number of unknowns, you input the number of unknowns, and fill in the values, and using matrix computations, find all the value of unknowns.
To make this short, I already finished the "find the value of unknowns" along with the computation, I checked it, and it seems fine. I can put 6 as the number of unknowns and it automatically computes the inverse of a 6x6 matrix and it will return the 6 unknown values using Index INDIVIDUALLY.
(Note: We aren't allowed to use VBA or Macros since we haven't discussed that yet.
The problem is, I don't know how to automatically fill a RANGE of cells with a VALUE or A FORMULA based on a SINGLE cell value.
For example, In cell A1, I will input 5 (which indicates the number of unknowns), then upon inputting this and hitting enter, let's say a range of cells A2 to A6 (which is 5 cells) will be automatically filled with incremented letters, like for A2 -> A ; A3 -> B ; ... A6 -> E, of which these letters indicate the 5 unknowns.
PROBLEM 2.
Another follow up question, let's say I input again 5, which again stands for the number of missing values/unknowns, in A1, besides the column of the variables A,B,C,D,E (5 unknowns), I want to automatically fill column B respectively with values from an array.
This is just the same with my first problem but this time, instead of incremented letters, it would be Incremented Index function.
For example: I input 5
*Column A will automatically be filled with the variables/letters
*Column B will automatically be filled with the values from an array that's computed using a formula but is not shown independently on cells.
I already have the formula
INDEX(Formula I created, Row number of the answer from the Formula I created , Column number of the answer from the formula I created)
The answers from the formula I made myself is also an array, an "n" rows and 1 column array. If I put the Index formula on a SINGLE cell, it returns specified row number value from the array that resulted in the computation from my formula
What I want is for example, for 5 unknowns
**A | B**
1|.......5..........................
2|.......A..............Some Value 1
3|.......B..............Some Value 2
4|.......C..............Some Value 3
5|.......D..............Some Value 4
6|.......E..............Some Value 5
Wherein the "Some Value" is the Arrayed Answer from my formula and the "1,2,3,4,5" specifies the row number from that arrayed answer.
This is upon inputting the matrix values, inputting the number of unknowns "n" in A1, and automatically filling a range of cells A2 to A"n" with letters A up to what letter "n" corresponds, and automatically filling a range of Cells B2 to B"n" with my formula but with incremented row number for every row in the Index(Formula, Row number , Column number) function.
Note: I hope there's a way to do this using excel functions only since we haven't discussed VBA or Macros yet so we can't use those, and even If we can, I have no knowledge for that. haha. :D
THANK YOU THANK YOU THANK YOU SO MUCH IN ADVANCED! Cheers. :D
Here's a formula for column A:A (write this in cell A2) and drag down:
=IF(ROW()-1<=$A$1,CHAR(ROW()+63),"")

Split array into smaller unequal-sized arrays dependend on array-column values

I'm quite new to MatLab and this problem really drives me insane:
I have a huge array of 2 column and about 31,000 rows. One of the two columns depicts a spatial coordinate on a grid the other one a dependent parameter. What I want to do is the following:
I. I need to split the array into smaller parts defined by the spatial column; let's say the spatial coordinate are ranging from 0 to 500 - I now want arrays that give me the two column values for spatial coordinate 0-10, then 10-20 and so on. This would result in 50 arrays of unequal size that cover a spatial range from 0 to 500.
II. Secondly, I would need to calculate the average values of the resulting columns of every single array so that I obtain per array one 2-dimensional point.
III. Thirdly, I could plot these points and I would be super happy.
Sadly, I'm super confused since I miserably fail at step I. - Maybe there is even an easier way than to split the giant array in so many small arrays - who knows..
I would be really really happy for any suggestion.
Thank you,
Arne
First of all, since you wish a data structure of array of different size you will need to place them in a cell array so you could try something like this:
res = arrayfun(#(x)arr(arr(:,1)==x,:), unique(arr(:,1)), 'UniformOutput', 0);
The previous code return a cell array with the array splitted according its first column with #(x)arr(arr(:,1)==x,:) you are doing a function on x and arrayfun(function, ..., 'UniformOutput', 0) applies function to each element in the following arguments (taken a single value of each argument to evaluate the function) but you must notice that arr must be numeric so if not you should map your values to numeric values or use another way to select this values.
In the same way you could do
uo = 'UniformOutput';
res = arrayfun(#(x){arr(arr(:,1)==x,:), mean(arr(arr(:,1)==x,2))), unique(arr(:,1)), uo, 0);
You will probably want to flat the returning value, check the function cat, you could do:
res = cat(1,res{:})
Plot your data depends on their format, so I can't help if i don't know how the data are, but you could try to plot inside a loop over your 'res' variable or something similar.
Step I indeed comes with some difficulties. Once these are solved, I guess steps II and III can easily be solved. Let me make some suggestions for step I:
You first define the maximum value (maxValue = 500;) and the step size (stepSize = 10;). Now it is possible to iterate through all steps and create your new vectors.
for k=1:maxValue/stepSize
...
end
As every resulting array will have different dimensions, I suggest you save the vectors in a cell array:
Y = cell(maxValue/stepSize,1);
Use the find function to find the rows of the entries for each matrix. At each step k, the range of values of interest will be (k-1)*stepSize to k*stepSize.
row = find( (k-1)*stepSize <= X(:,1) & X(:,1) < k*stepSize );
You can now create the matrix for a stepk by
Y{k,1} = X(row,:);
Putting everything together you should be able to create the cell array Y containing your matrices and continue with the other tasks. You could also save the average of each value range in a second column of the cell array Y:
Y{k,2} = mean( Y{k,1}(:,2) );
I hope this helps you with your task. Note that these are only suggestions and there may be different (maybe more appropriate) ways to handle this.

Resources