Excel - rolling average across columns without zeros, no array

I am currently using this formula to calculate a rolling average score across 12 columns for either the last 3 or 6 months.
=SUM(SUMIFS($E$54:P54,$E$54:P54,LARGE(IF($E$54:P54>0,$E$54:P54),{1,2,3})))
This is an array formula and is entered via CTRL + SHIFT + ENTER.
The problem now is that I need to deploy my workbook on older machines. On these ancient office computers (we are talking Windows XP and Office 2003...), the array formula is killing the entire workbook. I have already taken steps to speed up the workbook via VBA (disabling events, manual formula calculation, etc.), but I need a way to convert the above array formula into a non-array formula that does NOT count zeros or empty cells as part of the average.
I tried the formula below but couldn't get it to work with the zeros / empty cells.
=SUM(OFFSET($E68,0,COUNT($E68:$P68)-IF(COUNT($E68:$P68)>3,3,COUNT($E68:$P68)),1,IF(COUNT($E68:$P68)>3,3,COUNT($E68:$P68))))
Picture of the sample data attached below.

Since an average is a "Sum" divided by a "Count", this can be accomplished with simple non-array formulas. This formula excludes both zeros and blank cells:
=IF(SUM(A2:C2)>0,SUM(A2:C2)/(COUNT(A2:C2)-COUNTIF(A2:C2,0)),0)
If the row contains only zero values, it shows the numerical average as zero, which is correct. If you want to avoid displaying zero averages in cells, use this slightly different formula:
=IF(SUM(A3:C3)>0,SUM(A3:C3)/(COUNT(A3:C3)-COUNTIF(A3:C3,0)),"")
Of course, you will need to adjust the cell ranges for the 6-month and YTD averages; these formulas cover 3-month ranges.
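As a rough illustration of what the SUM / (COUNT - COUNTIF) formula computes, here is a minimal Python sketch. The function names and the use of None for a blank cell are assumptions made for the example; this is a model of the logic, not something Excel runs.

```python
def is_number(cell):
    """True for numeric cells; blanks (None), text and booleans are not counted."""
    return isinstance(cell, (int, float)) and not isinstance(cell, bool)


def average_ignoring_zeros(cells):
    """Model of =IF(SUM(r)>0, SUM(r)/(COUNT(r)-COUNTIF(r,0)), 0)."""
    numbers = [c for c in cells if is_number(c)]  # the cells COUNT would count
    total = sum(numbers)
    if total <= 0:
        # Same guard as the formula: nothing positive to average, so return 0
        return 0
    nonzero_count = len(numbers) - numbers.count(0)  # COUNT minus COUNTIF(..., 0)
    return total / nonzero_count


print(average_ignoring_zeros([3, 0, 6]))
```

Note that, like the formula, the sketch returns 0 whenever the sum is not positive, so a row of negative values would also show 0.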

For last 3-Month range average: =SUM(OFFSET($A3,0,COUNT($A3:$L3)-IF(COUNT($A3:$L3)>3,3,COUNT($A3:$L3)),1,IF(COUNT($A3:$L3)>3,3,COUNT($A3:$L3))))/(COUNT(OFFSET($A3,0,COUNT($A3:$L3)-IF(COUNT($A3:$L3)>3,3,COUNT($A3:$L3)),1,IF(COUNT($A3:$L3)>3,3,COUNT($A3:$L3))))-COUNTIF(OFFSET($A3,0,COUNT($A3:$L3)-IF(COUNT($A3:$L3)>3,3,COUNT($A3:$L3)),1,IF(COUNT($A3:$L3)>3,3,COUNT($A3:$L3))),0))
For last 6-month average: =SUM(OFFSET($A2,0,COUNT($A2:$L2)-IF(COUNT($A2:$L2)>6,6,COUNT($A2:$L2)),1,IF(COUNT($A2:$L2)>6,6,COUNT($A2:$L2))))/(COUNT(OFFSET($A2,0,COUNT($A2:$L2)-IF(COUNT($A2:$L2)>6,6,COUNT($A2:$L2)),1,IF(COUNT($A2:$L2)>6,6,COUNT($A2:$L2))))-COUNTIF(OFFSET($A2,0,COUNT($A2:$L2)-IF(COUNT($A2:$L2)>6,6,COUNT($A2:$L2)),1,IF(COUNT($A2:$L2)>6,6,COUNT($A2:$L2))),0))
These are quite lengthy, but they exclude the zero-value cells that your original non-array formula included. The data was assumed to begin in the range A2:L2.
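The long OFFSET formulas are easier to follow when broken into steps: count the numeric cells, take a window of at most N cells ending at the last one, then average the non-zero values in that window. The Python sketch below models this under the assumption that the row is filled from the left without gaps (which the COUNT-based offset also relies on); the names and the use of None for blanks are hypothetical.

```python
def is_number(cell):
    """True for numeric cells; blanks (None), text and booleans are not counted."""
    return isinstance(cell, (int, float)) and not isinstance(cell, bool)


def rolling_average(row, window):
    """Average of the non-zero values among the last `window` filled cells."""
    count = sum(1 for c in row if is_number(c))   # COUNT($A2:$L2)
    width = min(window, count)                    # IF(COUNT>window, window, COUNT)
    start = count - width                         # column offset passed to OFFSET
    tail = row[start:start + width]               # the range OFFSET returns
    nonzero = [c for c in tail if is_number(c) and c != 0]
    if not nonzero:
        # The worksheet formula would show an error here (division by zero)
        return None
    return sum(nonzero) / len(nonzero)


print(rolling_average([5, 0, 7, 9, None, None], 3))
```

In this example the window is the three cells 0, 7, 9, and the zero is dropped from both the sum and the count.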

Related

Makearray function in Office Excel unable to generate proper amount of columns for the array

I am currently using Office 365 and I want to build a visualization tool using the MAKEARRAY function.
For example, if I want to display a sequence of 32 items, I would display it in this way:
I use the following MAKEARRAY formula to generate the custom array:
Note: The formula is entered in cell F3.
=MAKEARRAY(ROUNDUP(B2/B3,0),IF(E3#=ROUNDUP(B2/B3,0),MOD(B2,B3),B3),LAMBDA(row,col,"O"))
After debugging, it seems that this part of the formula is causing the problem:
IF(E3#=ROUNDUP(B2/B3,0),MOD(B2,B3),B3)
When I debug this part separately, as shown in the picture below, it generates the correct number of columns.
Note: It generates the full number of columns when the row number does not match;
it generates the modulus remainder when the row number matches the rounded-up number of items divided by the number of columns.
But when I put that problematic part back into the MAKEARRAY function, it gives only a single column, which seems wrong.
May I know why it displays a single column even though it should display the correct number of columns?
What about:
Formula in C1:
=WRAPROWS(INDEX("O",SEQUENCE(A1,,,0)),A2,"")
Or rather:
=WRAPROWS(EXPAND("O",A1,,"O"),A2,"")
MAKEARRAY does not expect an array for the number of columns. It takes a single fixed number. It iterates over the number of rows and the number of columns to create the array. The result is always rectangular, never jagged.
So you need to do the math to change the value:
=MAKEARRAY(ROUNDUP(B2/B3,0),B3,LAMBDA(rw,clm,IF(B3*(rw-1)+clm>B2,"","O")))
Now as soon as the position exceeds the item count in B2 (32 here), it puts in "" instead of "O".
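To make the idea concrete, here is a small Python sketch of the same approach: build a full rectangular grid and decide per cell whether its row-major position is still within the item count. The function name and parameters are hypothetical, and the 1-based row/column indices mirror what MAKEARRAY passes to its LAMBDA.

```python
import math


def make_grid(total, columns, mark="O", blank=""):
    """Rectangular grid with `mark` in the first `total` positions (row-major)."""
    rows = math.ceil(total / columns)  # ROUNDUP(total / columns, 0)
    return [
        [mark if columns * (r - 1) + c <= total else blank
         for c in range(1, columns + 1)]
        for r in range(1, rows + 1)
    ]


for line in make_grid(32, 10):
    print(" ".join(cell or "." for cell in line))
```

Every row has the same width; the last row is simply padded with blanks, which is how a "jagged" shape can be imitated inside a rectangular array.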

Accumulated value sum with Arrayformula

I have a row of values B2:F2 that I want to SUM as I did in B3:F3, but using ArrayFormula.
formulas in row 3 with locked $B column:
Month | Jan | Feb | Mar | Apr | May
Value | 15,106 | 15,559 | 10,875 | 21,679 | 18,118
Simple Cell formula | =SUM($B2:B2) | =SUM($B2:C2) | =SUM($B2:D2) | =SUM($B2:E2) | =SUM($B2:F2)
Progress: I tried this formula but it outputs the SUM of the entire range B2:F2 at once in the entire range B4:F4.
=ArrayFormula(IF(B2:F2="",,SUM(B2:$F2)))
Month | Jan | Feb | Mar | Apr | May
Value | 15,106 | 15,559 | 10,875 | 21,679 | 18,118
Progress | =ArrayFormula(IF(B2:F2="",,SUM(B2:$F2))) | 81,336 | 81,336 | 81,336 | 81,336
What is the best formula to get the same result in B3:F3 but using Arrayformula?
Make a copy of the example sheet.
Update
When trying to roll forward, I discovered the case where cells in the value row are empty, as in column J. If possible, please address this case in the answer.
The standard transposed running total formula will do:
=INDEX(TRANSPOSE(MMULT(TRANSPOSE((SEQUENCE(5)<=SEQUENCE(1, 5))*
FLATTEN(B2:F2)), SEQUENCE(5, 1, 1, 0))))
fully dynamic and maximally lightweight:
=INDEX(IF(C2:2="",,TRANSPOSE(MMULT(TRANSPOSE((
SEQUENCE( MAX(COLUMN(C2:2)*(C2:2<>""))-COLUMN(C2)+1)<=
SEQUENCE(1, MAX(COLUMN(C2:2)*(C2:2<>""))-COLUMN(C2)+1))*
FLATTEN(INDIRECT("C2:"&ADDRESS(2, MAX(COLUMN(C2:2)*(C2:2<>"")))))),
SEQUENCE( MAX(COLUMN(C2:2)*(C2:2<>""))-COLUMN(C2)+1, 1, 1, 0)))))
A simple way to calculate cumulative sum:
=ArrayFormula(IF(B1:1="",,SUMIF(COLUMN(B2:2),"<="&COLUMN(B2:2),B2:2)))
=arrayformula(mmult(if(isblank(B2:F2),0,B2:F2),if(column(B2:F2)>=transpose(column(B2:F2)),1,0)))
can produce a running sum in a row vector and can accommodate empty entries in the input range.
If you want to auto detect the number of columns in the input range, you can
replace B2:F2 with array_constrain(B2:2,1,max(arrayformula(if(isblank(B2:2),,column(B2:2))))-1) and
replace column(B2:F2) with array_constrain(column(B2:2),1,max(arrayformula(if(isblank(B2:2),,column(B2:2))))-1)
which is to say, trim the range to the number of columns given by the max column index of the occupied cells in our range; minus 1 because the range starts at column 2.
(Also, as long as there is one arrayformula wrapping the whole formula, you can omit them in the nested inputs, as long as you preserve the () brackets.)
Nonetheless, there would be a (computational) efficiency concern.
To keep everything in one formula, the above solution first creates a filter for each desired entry in the running sum vector: 1,0,0,... for the 1st entry, 1,1,0,... for the 2nd, 1,1,1,0,... for the 3rd, etc. Then, effectively, we apply a sum(filter(...)) by multiplying by 1 or 0 using mmult. The array creation costs extra. The multiplication costs extra. And compared to iterated formulas that update cell by cell, we are not saving the multiply-by-0 parts.
It may not end up being more than double or triple the runtime compared to iterated formulas. And you can experiment case by case. Small scale application is always fine. But for larger datasets, computational efficiency is something to keep in mind whenever we introduce extra computational steps, and potentially squaring the original amount when using mmult solutions.
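The mask-and-multiply idea, and why it does more work than a cell-by-cell running total, can be sketched in Python. This is only a model of the technique with hypothetical function names; None stands in for an empty cell and is treated as 0, as the isblank() wrapper does.

```python
from itertools import accumulate


def running_total_mask(values):
    """Running total via a 0/1 mask, mirroring the mmult approach (n*n work)."""
    cleaned = [0 if v is None else v for v in values]
    n = len(cleaned)
    # mask[i][j] is 1 when input column i contributes to output column j
    mask = [[1 if i <= j else 0 for j in range(n)] for i in range(n)]
    return [sum(cleaned[i] * mask[i][j] for i in range(n)) for j in range(n)]


def running_total_iterative(values):
    """Cell-by-cell running total (n additions), like the =SUM($B2:B2) row."""
    return list(accumulate(0 if v is None else v for v in values))


data = [15106, 15559, 10875, 21679, 18118]
print(running_total_mask(data))
print(running_total_iterative(data))
```

Both functions return the same list; the mask version builds and multiplies an n-by-n table to get there, which is the extra cost described above.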

Excel array Formula that copies only cells containing a string

I would prefer to use an excel array formula (but if it can only be done in VBA, so be it) that copies ALL cells from a column array that contains specific text. The picture below shows what I am after and what I have tried. I'm getting close (thanks to similar, but different questions) but can't quite get to where I want. At the moment, I am getting only the first cell instead of all the cells. In my actual application, I am searching through about 20,000 cells and will have a few hundred search terms. I expect most search terms to give me about 8 - 12 cells with that value.
formula I am using:
=INDEX($A$4:$A$10,MATCH(FALSE,ISERROR(SEARCH($C$1,$A$4:$A$10)),0))
To make this work efficiently, I recommend having a separate cell holding the results count (I used cell C2) which has this formula:
=COUNTIF(A:A,"*"&C1&"*")
Then in cell C4 and copied down use this array formula (The -3 is just because the header row is row 3. If the header row was row 1, it would be -1):
=IF(ROW(A1)>$C$2,"",INDEX($A$4:$A$21000,SMALL(IF(ISNUMBER(SEARCH($C$1,$A$4:$A$21000)),ROW($A$4:$A$21000)-3),ROW(C1))))
I tested this with 21000 rows of data in column A with an average of 30 results per search string and the formula is copied down for 60 cells in column C. With that much data, this takes about 1-2 seconds to finish recalculating. Recalculation time can vary widely depending on other factors in your workbook (additional formulas, nested dependencies, use of volatile functions, etc) as well as your hardware.
Alternately, you could just use the built-in Filter functionality, but I hope this helps.
You need to get the ROWS. Put this in C4 and copy down.
=IFERROR(AGGREGATE(15,6, IF(SEARCH($C$1, $A$4:$A$10)>0, ROW($A$4:$A$10)), ROW($C4)-ROW($A$4)+1), "")
This is an array formula, so enter it with Ctrl+Shift+Enter.
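The SMALL(IF(...)) pattern boils down to "collect the positions of all matching cells, then pick the k-th one". A minimal Python sketch of that logic follows; the names are hypothetical, and it only models the case-insensitive substring match of SEARCH, not its wildcard support.

```python
def kth_match(cells, term, k):
    """Return the k-th cell (1-based) containing `term`, or "" if there is none."""
    needle = term.lower()
    # Positions of matching cells, in order: the list SMALL(IF(...)) draws from
    positions = [i for i, cell in enumerate(cells) if needle in str(cell).lower()]
    if k > len(positions):
        return ""  # same role as the IF(ROW(A1)>$C$2,"",...) guard
    return cells[positions[k - 1]]


column_a = ["apple pie", "banana", "Pineapple", "cherry"]
print([kth_match(column_a, "apple", k) for k in (1, 2, 3)])
```

Copying the worksheet formula down one row corresponds to increasing k by one.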

Optimization of array function that calculates products

I have the following array formula that calculates the returns on a particular stock in a particular year:
=IF(AND(NOT(E2=E3),H2=H3),PRODUCT(IF($E$2:E2=E1,$O$2:O2,""))-1,"")
But since I have 500,000 row entries as soon as I hit row 50,000 I get an error from Excel stating that my machine does not have enough resources to compute the values.
How shall I optimize the function so that it actually works?
E column refers to a counter to check the years and ticker values of stocks. If year is different from the previous value the function will output 1. It will also output 1 when the name of stock has changed. So for example you may have values for year 1993 and the next value is 1993 too but the name of stock is different, so clearly the return should be calculated anew, and I use 1 as an indication for that.
Then I have another column that keeps a cumulative sum of those 1s. When a new 1 is encountered in that previous column, I add 1 to the running total and keep printing the same number until I see a new one. This makes it possible to use the array formula: when the next value in the column that contains the running total (E column) differs from the current one, I use my twist on SUMIF, but with PRODUCT and IF. This returns the product of the column O values for all rows with the same running total in column E.
The source of the inefficiency, I believe, is in the steady increase with row number of the number of cells that must be examined in order to evaluate each successive array formula. In row 50,000, for example, your formula must examine cells in all the rows above it.
I'm a big fan of array formulas, so it pains me to say this, but I wouldn't do it this way. Instead, use additional columns to compute, in each row, the pieces of your formula that are needed to return the desired result. By taking that approach, you're exploiting Excel's very efficient recalculation engine to compute only what's needed.
As for the final product, compute that from a cumulative running product in an auxiliary column, and that resets to the value now in column O when column P in the row above contains a number. This approach is much more "local" and avoids formulas that depend on large numbers of cells.
I realize that text is not the best language for describing this, and my poor writing skills might be adding to the challenge, so please let me know if more detail is needed.
Interesting problem, thanks.
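The helper-column suggestion can be sketched as a single pass: keep a running product that resets whenever a new group starts, and report the return only on the last row of each group. The Python below is a model under assumptions: `group_ids` plays the role of the running total in column E, `factors` the values in column O, and the extra H2=H3 check from the original formula is left out.

```python
def group_returns(group_ids, factors):
    """Return product(factors in group) - 1 on each group's last row, else None."""
    results = [None] * len(factors)
    running = 1.0
    for i, (group, factor) in enumerate(zip(group_ids, factors)):
        if i == 0 or group != group_ids[i - 1]:
            running = factor       # new group: reset to this row's value
        else:
            running *= factor      # same group: extend the running product
        is_last_row_of_group = i == len(factors) - 1 or group_ids[i + 1] != group
        if is_last_row_of_group:
            results[i] = running - 1
    return results


print(group_returns([1, 1, 2, 2, 2], [1.1, 1.2, 0.5, 2.0, 3.0]))
```

Each row only looks at its neighbours, so the work grows linearly with the number of rows instead of with the square of it.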
Could I suggest a really quick and [very] dirty vba? Something like the below. Obviously, have a backup of your file before running this. This assumes you want to start calculating from row 13.
Sub calculateP()
    'start on row 13, column P:
    Cells(13, 16).Select
    'loop through every row as long as column A is populated:
    Do
        If ActiveCell(1, -14).Value = "" Then Exit Do 'column A not populated so exit loop
        'enter formula:
        Selection.FormulaR1C1 = _
            "=IF(AND(NOT(RC[-11]=R[1]C[-11]),RC[-8]=R[1]C[-8]),PRODUCT(IF(R[-11]C5:RC[-11]=R[-1]C[-11],R2C15:RC[-1],""""))-1,"""")"
        'convert cell value to value only (remove formula):
        ActiveCell.Value = ActiveCell.Value
        'select next row:
        ActiveCell(2, 1).Select
    Loop
End Sub
Sorry, this is definitely not a great answer for you... in fact, even this method could be achieved more elegantly using Range objects... but the quick and dirty approach may help you in the interim.

Sumproduct formula returns a #VALUE! error when the last array refers to a column with formulas in every row. MS Excel 2010

I am trying to find an easy way to calculate commissions off of sales on multiple sheets within a workbook. Each month, I need to find the total net profit for only items sold within the specified month.
The formula I am currently using is:
=SUMPRODUCT((TEXT('Sheet Name'!$P$3:P24,"MY")=TEXT($G$4,"MY"))*'Sheet Name'!$M$3:M24)
Column P shows the Sold Date,
Column M includes a formula in each row to calculate the net profit, and
cell G4 is where I would enter the month & year I am currently working with.
I have come to the conclusion that it only gives me the #VALUE! error because of the formula in each row of Column M (example: =IF(OR(F15=0,G15=0)," ",(F15-L15)) ).
When I reference a different column (in place of Column M) that does not contain formulas it works perfectly (example: =SUMPRODUCT((TEXT('Sheet Name'!$P$3:P24,"MY")=TEXT($G$4,"MY"))*'Sheet Name'!$G$3:G24) ). Also, changing the asterisk to a comma causes the formula to calculate incorrectly, and adding the (--(TEXT double negative does not fix the problem.
How do I get this array to calculate without removing the formulas from Column M?
Thanks for your attention.
I presume it is giving you a #VALUE error because your formula results in text (a space) and it errors when trying to multiply a space by a number (aka True or False). I think you would be better served changing your M column formula to =IF(OR(F15=0,G15=0),0,(F15-L15)). Do you have a specific reason to not make it evaluate to 0? Also is there a reason you are converting to text to do your month/year check?
Try something like this: =SUMPRODUCT(--(MONTH('Sheet Name'!$P$3:P24)=MONTH($G$4)),--(YEAR('Sheet Name'!$P$3:P24)=YEAR($G$4)),'Sheet Name'!$M$3:M24). Of course this is dependent on entering the dates as actual dates. The -- is used to change a logical/boolean (true/false) into a 1 or 0. It won't do anything useful to text. For example, it should also work as =SUMPRODUCT((MONTH('Sheet Name'!$P$3:P24)=MONTH($G$4))*(YEAR('Sheet Name'!$P$3:P24)=YEAR($G$4))*'Sheet Name'!$M$3:M24) since the multiplication converts the truthy statements to numbers. The trick is to make sure when everything else evaluates, you have =sumproduct(numbers,numbers,numbers). Your instance is one array of =sumproduct(numbers/text).
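A small Python model may help show what the comma-separated SUMPRODUCT is doing: each month/year test becomes 1 or 0, and that flag is multiplied by the profit. The names below are hypothetical. The sketch treats a non-numeric profit (such as the " " produced by the column M formula) as 0, which is how SUMPRODUCT handles non-numeric entries passed as a separate array argument.

```python
from datetime import date


def monthly_net_profit(sold_dates, profits, target):
    """Sum profits whose sold date falls in the month and year of `target`."""
    total = 0
    for sold, profit in zip(sold_dates, profits):
        # The two -- terms: month test and year test, each coerced to 1 or 0
        month_flag = int(sold is not None and sold.month == target.month)
        year_flag = int(sold is not None and sold.year == target.year)
        # Text such as " " is counted as 0 rather than raising an error
        is_numeric = isinstance(profit, (int, float)) and not isinstance(profit, bool)
        value = profit if is_numeric else 0
        total += month_flag * year_flag * value
    return total


dates = [date(2024, 3, 5), date(2024, 3, 9), date(2024, 4, 1), date(2023, 3, 7)]
profits = [100, " ", 50, 25]
print(monthly_net_profit(dates, profits, date(2024, 3, 1)))
```

Only the first row contributes here: the second matches the month but holds text, the third is a different month, and the fourth is a different year.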
