Sum across rows and count rows less than 0 - arrays

I am trying to identify inventory shortages for each store and the type of bread I am supplying.
Example table showing the demand for each type of bread by store.
Row 8 is what I am trying to accomplish with a formula.
Based on the quantity on hand, I can conditionally format to highlight cells red or green based on shortages.
I can't get a formula to count the number of red cells.
I am thinking I need to use sumproduct.
=SUMPRODUCT(--($B9:$B11<C9:C11))
The above works for identifying shortages for Store A (Column C). However, when dragged to column D and E it doesn't remember what the other stores needed.
I can't get the rows to sum without adding ALL rows in the range. I need each row to be individually compared to the qty on hand. I assume an array needs to be used.

This is the formula I mentioned. It uses a standard method with MMULT to get the row totals of the matrix, then compares them with the amounts available:
=SUM(--($B9:$B11<MMULT($C9:C11,TRANSPOSE(COLUMN($C9:C11)^0))))
Enter it in C8 and drag it across. It must be entered as an array formula using Ctrl+Shift+Enter.
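For anyone who wants to check the logic outside Excel, here is a minimal Python sketch of what the MMULT formula computes as it is dragged across. The function name shortages_by_column and the sample numbers are made up for illustration and are not from the original workbook.

```python
def shortages_by_column(on_hand, demand):
    """For each store column, count items whose cumulative demand
    (this store plus all stores to its left) exceeds the stock on hand.

    on_hand: one quantity per bread type (like $B9:$B11)
    demand:  one row per bread type, one column per store (like C9:E11)
    """
    n_stores = len(demand[0])
    counts = []
    for last in range(n_stores):
        count = 0
        for stock, row in zip(on_hand, demand):
            # Row total of the first column through the current one, which is
            # what MMULT with a column of ones returns for $C9:C11
            running_total = sum(row[: last + 1])
            if stock < running_total:
                count += 1
        counts.append(count)
    return counts


# Hypothetical data: 3 bread types, 3 stores
on_hand = [10, 5, 8]
demand = [
    [4, 4, 4],  # cumulative 4, 8, 12 -> short at the third store
    [6, 0, 1],  # cumulative 6, 6, 7  -> short from the first store on
    [2, 3, 3],  # cumulative 2, 5, 8  -> never short
]
print(shortages_by_column(on_hand, demand))  # [1, 1, 2]
```

The anchored start of the range ($C9) is what makes each column "remember" the stores to its left: the running total always begins at the first store.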
EDIT
OP has commented that it shouldn't be listed as a shortage if a store doesn't need a particular item, even if the stock of that item has been exhausted.
So there should be an extra condition for it to be registered as a shortage only if the current column has a number >0 in a particular row as well as the row sum being greater than the amount available:
=SUM((C9:C11>0)*($B9:$B11<MMULT($C9:C11,TRANSPOSE(COLUMN($C9:C11)^0))))
If you wanted to select only some of the rows as well it would look like
=SUM(ISNUMBER(MATCH($A9:$A11,{"Wheat","Rye"},0))*(C9:C11>0)*($B9:$B11<MMULT($C9:C11,TRANSPOSE(COLUMN($C9:C11)^0))))
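The two extra conditions multiply into the sum as 1/0 flags. A rough Python equivalent, assuming made-up item names and quantities (filtered_shortages is a hypothetical helper, not something from the workbook):

```python
def filtered_shortages(names, on_hand, demand, store, wanted=None):
    """Count shortages for one store column (0-based index `store`).

    A row counts only if the store actually needs the item (demand > 0),
    the cumulative demand up to that store exceeds the stock, and, when
    `wanted` is given, the item name is in that set.
    """
    count = 0
    for name, stock, row in zip(names, on_hand, demand):
        needs_item = row[store] > 0
        short = stock < sum(row[: store + 1])
        selected = wanted is None or name in wanted
        if needs_item and short and selected:
            count += 1
    return count


# Hypothetical data
names = ["Wheat", "Rye", "White"]
on_hand = [10, 5, 8]
demand = [[4, 4, 4], [6, 0, 1], [2, 3, 9]]

print(filtered_shortages(names, on_hand, demand, store=1))  # 0
print(filtered_shortages(names, on_hand, demand, store=2))  # 3
print(filtered_shortages(names, on_hand, demand, store=2, wanted={"Wheat", "Rye"}))  # 2
```

In the second store Rye is already exhausted, but that store needs none of it, so it is not counted.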

EDIT: FIXED TO WORK FOR CONDITIONAL FORMATTING:
Paste this into the module:
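Public Function CellColour(addr)
    ' DisplayFormat returns the colour actually shown, including conditional formatting
    CellColour = Range(addr).DisplayFormat.Interior.Color
End Function

Function FINDRED(rng As Range)
    ' Counts the cells in rng that are displayed in pure red, RGB(255, 0, 0)
    Dim cell As Range
    FINDRED = 0
    For Each cell In rng
        ' DisplayFormat does not work directly inside a worksheet function,
        ' so CellColour is called indirectly through Evaluate
        If cell.Parent.Evaluate("CellColour(""" & cell.Address & """)") = RGB(255, 0, 0) Then
            FINDRED = FINDRED + 1
        End If
    Next cell
End Function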
You can then use FINDRED like a normal Excel formula, passing it the range you want to check.

Related

How to use an ARRAYFORMULA in Google Sheets that references cells in the same column

After getting sick and tired of having to copy formulas back into my sheet any time I needed to add a row (one of my gripes with Google Sheets, where Excel is much better), I've decided to try using ARRAYFORMULA in row 2 of all my sheets to basically make column formulas. (The Google Support pages suggest this is an exact replacement for the functionality in Excel; it's not.) Note that I don't think either Excel or Google does column formulas well, but Excel definitely does it better than Google Sheets in this case.
Background
Just using ARRAYFORMULA with a known range works well anytime I add a row in the MIDDLE of that range. However, it doesn't work well when I add a new row to the end of my range that I want to be included. I have to manually change the last row in my ARRAYFORMULA formula if I add a row to the end, or I have to make my last row a "dummy" row with a note that says - don't add new rows, always add to the middle and hope other people using the sheet (or even myself) remember to abide by it. Using large sheets with lots of data, one person not following the rule can majorly screw it up for everyone sharing it. I like to have as much automated as possible to minimize costly mistakes.
I tried using ARRAYFORMULA with whole columns (e.g. A:A, B:B, etc.), but if it's a formula where I need a result output to each row (simple example: = ARRAYFORMULA( C:C - 1 )), I get an #N/A result in the cell and the following error text:
Result was not automatically expanded, please insert more rows
UPDATE: This error was because the formula was in row 2 and therefore full columns (A:A, B:B, C:C, G:G) were always one more row than what was available in the sheet. Using C$2:C ($ before 2 is necessary), G$2:G, etc. solves that issue.
My workaround for that was to add a cell in a hidden column on my sheets with the following formula:
= ARRAYFORMULA( MAX( IF( LEN(A:A), ROW(A:A), ) ) )
Note: Whole columns work here because I'm using the MAX function which then returns a single value.
I then name that cell something to the effect of last_XXXX_row where XXXX is a short version of the name of the sheet so I have a constant I can reference and know what the last active row of the sheet is. Then I protect the cell and hide it.
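As a sanity check on what that hidden cell computes, here is a small Python sketch of the same idea. The function name last_nonblank_row and the sample column are made up for illustration.

```python
def last_nonblank_row(column):
    """Return the 1-based row number of the last non-empty cell, or 0.

    Mirrors MAX(IF(LEN(A:A), ROW(A:A), )): rows whose text length is zero
    are skipped and the largest remaining row number wins.
    """
    rows = [
        i
        for i, value in enumerate(column, start=1)
        if value is not None and len(str(value)) > 0
    ]
    return max(rows, default=0)


# Hypothetical column with a gap in the middle and trailing blanks
print(last_nonblank_row(["id", "a", "", "b", "", ""]))  # 4
```

Note that a gap in the middle does not matter; only the last filled row is reported.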
It gets a little annoying as now I have to use INDIRECT everywhere and the formulas get long, but for the most part it works. For example:
= ARRAYFORMULA( ( $C$2:INDIRECT( "$C$" & last_unit_row) = 1 ) )
on my "unit" sheet returns TRUE or FALSE based on whether the value in column C is equal to 1 or not and returns the corresponding result in each row of the column I put this in. It's kind of long, but now at least I don't have to enter the formula in every row and then re-enter the formula every time I add a row - whether in the middle or the end of the sheet, it automatically updates the column as I add them. Yay.
NOTE: Logic-wise, using $C$2:$C works and is a much shorter equation. However, I discovered that as you add data, it bogs the spreadsheet down significantly (and it's even slower without the $), so I still recommend using INDIRECT as per my example above, which works much faster.
Issue
Some formulas do not work as a direct analog when using ARRAYFORMULA. For instance, I've learned that the INDEX function inside of ARRAYFORMULA prevents ARRAYFORMULA from executing on the entire array, so I have to avoid that. There are probably a few others I haven't tried yet.
My particular issue is in a column that needs to know something in the column above it. In both Excel and Google Sheets I often use a count up / reset column to track how many entries there are in a given category. For example, such a formula in column B dependent on a category value in column G typically looks like this:
= IF (G2 <> G1, 0, B1 + 1)
Then when I fill down with that formula, it automatically changes all cell references to the needed rows. It's checking a category label in column G - and if that label changes, it resets to 0 (sometimes I reset to 1, depending), otherwise it increments the value in column B. This is helpful when there isn't a uniform number of entries for each category and each entry needs a subindex.
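To make the intended behaviour concrete, here is a minimal Python sketch of what the filled-down formula produces. The function name sub_index and the sample categories are illustrative only.

```python
def sub_index(categories):
    """Fill-down equivalent of =IF(G2<>G1, 0, B1+1), one result per row."""
    result = []
    previous = object()  # sentinel, so the first row always resets to 0
    for category in categories:
        if category != previous:
            result.append(0)               # label changed: reset
        else:
            result.append(result[-1] + 1)  # same label: count up
        previous = category
    return result


print(sub_index(["A", "A", "A", "B", "B", "A"]))  # [0, 1, 2, 0, 1, 0]
```

Each result depends on the result directly above it, which is exactly the self-reference that a single array formula in the same column has trouble with.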
I can't seem to get this to work using ARRAYFORMULA.
Attempted Solutions
I tried this:
= ARRAYFORMULA( IF( $G2:INDIRECT( "$G$" & last_item_row ) <> $G1:INDIRECT( "$G$" & ( last_item_row - 1 ) ), 0, $B1:INDIRECT( "$B$" & ( last_item_row -1 ) ) ) )
And I get a #REF result in the cell with the error text:
Circular Dependency Detected. To resolve with iterative calculation, see File > Settings
So... it sort of makes sense, as it appears that there's a reference to the cell the formula is in inside the range that's created by INDIRECT. However, if the formula was executed properly, it would always calculate based on the cell ABOVE it and never actually use its own cell as part of the calculation.
If I could use INDEX instead of INDIRECT I ought to be able to avoid this, but I can't.
UPDATE: This formula is basically correct mathematically:
= ARRAYFORMULA ( IF( $G$1:INDIRECT( "$G$" & ( last_item_row - 1 ) ) <> $G$2:INDIRECT( "$G$" & ( last_item_row ) ), 0, ($B$1:INDIRECT( "$B$" & ( last_item_row - 1 ) ) + 1 ) ) )
However, it requires that Iterative calculations are turned on, and it has a maximum value it will "max out" at based on the number of iterations allowed - and there are diminishing returns as the number of iterations is increased. At 100 iterations, it maxes at 10 - my real data has some categories that have 25 sub indices and the spreadsheet gets slower to calculate as iterations goes up, so this isn't a viable solution.
Am I overcomplicating this? Is there a simpler solution I'm not seeing? I'm trying to use COUNTIF as well [ Non-Array version of the formula that works when filled down: =COUNTIF($G$1:$G1,$G2) ], but haven't gotten it to work.
Closest ARRAYFORMULA version I have is this:
=ARRAYFORMULA( COUNTIF($G$1:($G1:INDIRECT( "$G$" & ( last_item_row - 1 ) ) ), $G2:INDIRECT( "$G$" & last_item_row ) ) )
I'm surprised that even worked at all - it returns array values, but it gets me the total number of times that category appears in every row, instead of just the ones leading up to that row.
Example
The example above, which uses the formula = if( B2<>B1, 0, A1 + 1 ) in Cell A2 and filled down to cell A13, shows example input (Category) and the desired output (Sub Index). With this formula, however, if I add to cell B14, A14 will not populate unless I copy and paste or fill the formula down to the next row. I want an ARRAYFORMULA in Cell A2 that will automatically fill the cells below it when I add additional data in column B (whether below or by adding a row in between) without having to touch the formula again.
I tried using ARRAYFORMULA using whole columns (e.g. A:A, B:B, etc.) but if it's a formula where I need a result output to each row
you can always freeze it like:
=INDIRECT("A:A")
This way you can add rows anywhere you want (as long as you don't add a new row above the row that holds the formula; fitting A:A into A2:A would be troublesome).
that the INDEX function just does not work with ARRAYFORMULA at all
INDEX is already an ARRAYFORMULA type of formula. The analogy here: you need a car to get from A to B, where INDEX is a blue car with 3 doors and ARRAYFORMULA is a red car with 5 doors. It doesn't matter what color you have, you just need a car.
= IF (G2 <> G1, 0, B1 + 1)
While this is direct logic, there are several ways to achieve the same thing. Proper usage would require not putting such a formula, as an ARRAYFORMULA, in column G or B, to avoid circular dependency errors. For a simple resetting count-up, try:
=ARRAYFORMULA(COUNTIFS(B1:B7, B1:B7,
SEQUENCE(ROWS(B1:B7)),"<="&SEQUENCE(ROWS(B1:B7))))
feel free to change B1:B7 to open range or frozen range...
update:
=INDEX(IF(B2:B="";; COUNTIFS(B2:B; B2:B;
SEQUENCE(ROWS(B2:B)); "<="&SEQUENCE(ROWS(B2:B)))-1))
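A rough Python sketch of what the COUNTIFS/SEQUENCE pattern counts may help here. The function name running_count and the sample data are made up for illustration.

```python
def running_count(categories):
    """For each row, count the rows at or above it that hold the same value.

    Same idea as COUNTIFS(range, range, SEQUENCE(n), "<="&SEQUENCE(n)):
    match on the value, and only look at row numbers up to the current one.
    """
    seen = {}
    counts = []
    for value in categories:
        seen[value] = seen.get(value, 0) + 1
        counts.append(seen[value])
    return counts


cats = ["A", "A", "B", "A", "B"]
print(running_count(cats))                   # [1, 2, 1, 3, 2]
print([c - 1 for c in running_count(cats)])  # zero-based: [0, 1, 0, 2, 1]
```

One thing the sketch makes visible: the count continues when a category reappears later (the third "A" gets 3, not 1), so this matches the fill-down IF formula only when each category sits in one contiguous block.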
You should use the INDEX function with MATCH instead of INDIRECT in your ARRAYFORMULA.
For example, if your data is in the range A1:A11, and you want to retrieve the value at row 5 through INDIRECT, you would use:
=INDIRECT("A" & 5)
If you want to retrieve the value at row 5 through INDEX, you would use:
=INDEX(A:A, 5)
Now, if you want to use the INDEX function with ARRAYFORMULA, you will need to use the MATCH function to find the row number of the value you want, since INDEX only takes a row number as its second argument.
For example, if you want to retrieve the value at row 5 through ARRAYFORMULA and INDEX/MATCH, you would use:
=ARRAYFORMULA(INDEX(A:A, MATCH(5, ROW(A:A))))
This formula will return the value in A5.
Note: In your example, you're trying to match $G2 with $G1:$G1. If you want to match $G2 with $G1:$G3, you would use:
=ARRAYFORMULA(INDEX(A:A, MATCH($G2, $G1:$G3)))
This formula will return the value from A1, A2, or A3 depending on the value of $G2.
Use SCAN to get a running total with OFFSET to get the previous range.
=SCAN(B1,B2:INDEX(B:B,COUNTA(B:B)),LAMBDA(a,c,if(c=OFFSET(c,-1,0),1+a,0)))
i   Cat
0   A
1   A
2   A
3   A
0   B
1   B
2   B
3   B
4   B
0   A
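The SCAN approach is a fold that carries the previous result forward. Here is a minimal Python sketch of the same idea using itertools.accumulate; the function name scan_sub_index is hypothetical, and the starting accumulator is 0 here rather than the header cell, which is an assumption that does not change the output as long as the first data cell differs from the header.

```python
from itertools import accumulate


def scan_sub_index(column):
    """column[0] is the header cell; the rest are category values."""
    # Pair each cell with the cell above it, like OFFSET(c, -1, 0)
    pairs = list(zip(column[1:], column[:-1]))

    def step(acc, pair):
        current, above = pair
        # Same label as the row above: count up, otherwise reset to 0
        return acc + 1 if current == above else 0

    # accumulate() emits the initial value first, so drop it
    return list(accumulate(pairs, step, initial=0))[1:]


column = ["Cat", "A", "A", "A", "A", "B", "B", "B", "B", "B", "A"]
print(scan_sub_index(column))  # [0, 1, 2, 3, 0, 1, 2, 3, 4, 0]
```

Unlike a plain running count per category, the index resets when "A" comes back after the "B" block.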

Index/Match with IF Statement

As you can see, I have a database table on the left. I want to add an IF statement that allows me to look up the [Code], [Name] and [Amount] of the top 5 of Company A ONLY, then do a top 5 for Company B, and so on. I have managed to look up the top 5 out of ALL companies but cannot seem to add a criterion to target a specific company.
Here are my formulas so far:
Formula in Column K [Company]: = INDEX(Database,MATCH(N3,sales,0),1)
Formula in Column L [Code]: = INDEX(Database,MATCH(N3,sales,0),2)
Formula in Column M [Name]: = INDEX(Database,MATCH(N3,sales,0),2)
Formula in Column N [Amount]: = LARGE(sales,ROW(1:20))
The intended result is to show the top 5 sales person in each company along with their [Code], [Name] and [Amount], feel free to suggest any edits to the worksheet.
Here's an alternative if you know the code is unique. After putting A into K3:K7
First get the highest amounts for Company A starting in N3
=AGGREGATE(14,6,Database[Amount]/(Database[Company]=K3),ROWS(N$1:N1))
Then find the code which matches the amount, but only if it hasn't been used before (this assumes that the code is unique) starting in L3
=INDEX(Database[Code],MATCH(1,INDEX((Database[Company]="A")*(Database[Amount]=N3)*ISNA(MATCH(Database[Code],L$2:L2,0)),0),0))
Then find the matching name with a normal INDEX/MATCH starting in M3
=INDEX(Database[Name],MATCH(L3,Database[Code],0))
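For comparison, here is what those three formulas build up row by row, written as a short Python sketch. The function name top_n_for_company and the sample records are invented for illustration and are not the asker's data.

```python
def top_n_for_company(rows, company, n=5):
    """rows: list of (company, code, name, amount) tuples.

    Returns the n largest amounts for one company together with the
    matching code and name.
    """
    matching = [r for r in rows if r[0] == company]
    # sorted() is stable, so equal amounts keep their original table order
    ranked = sorted(matching, key=lambda r: r[3], reverse=True)
    return [(code, name, amount) for _, code, name, amount in ranked[:n]]


# Hypothetical records
rows = [
    ("A", "a1", "Ann", 300),
    ("B", "b1", "Bob", 900),
    ("A", "a2", "Al", 500),
    ("A", "a3", "Amy", 500),
    ("B", "b2", "Bea", 100),
]
print(top_n_for_company(rows, "A", n=2))  # [('a2', 'Al', 500), ('a3', 'Amy', 500)]
```

The tie between the two 500 amounts is the case the "hasn't been used before" test in the Code formula is there to handle.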
Okay, I have achieved this with the use of a helper column, which you can hide. Please note, though, that this will only work as long as there are not more than 9 identical totals for any one company. I don't think you should have that issue, but if it occurs, the digits being added by the helper column would need to be tweaked.
First Helper Column:
Adds a digit to the end of the total representing the number of times that amount already exists above for that company. This formula is =CONCATENATE([#Amount],COUNTIFS($A$1:A1,A2,$D$1:D1,D2))*1
This is multiplied by 1 to keep the number format for LARGE to work with.
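Here is a small Python sketch of what the first helper column produces, in case the CONCATENATE/COUNTIFS combination is hard to picture. The function name helper_values and the sample rows are assumptions for illustration.

```python
def helper_values(rows):
    """rows: list of (company, amount) pairs in table order.

    Appends one tie-break digit to each amount: the number of times the
    same company/amount pair has already appeared above that row.
    """
    seen = {}
    result = []
    for company, amount in rows:
        prior = seen.get((company, amount), 0)
        # Text concatenation, then back to a number (the "*1" in the formula)
        result.append(int(f"{amount}{prior}"))
        seen[(company, amount)] = prior + 1
    return result


rows = [("A", 2000), ("A", 2000), ("B", 2000), ("A", 2000), ("A", 1500)]
print(helper_values(rows))  # [20000, 20001, 20000, 20002, 15000]
```

Because the digit is appended as text, the scheme stops ranking correctly once the count needs two digits, which is where the limit on identical totals comes from.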
Second Helper Column:
This is an array formula and will need to be input by using Ctrl+Shift+Enter while still in the formula bar.
The formula for this one is:
=LARGE(IF(Company="A",Helper),ROW(1:1))
What this formula does as an array formula is produce a list of results based on the IF statement that LARGE can use. Rather than the entire column being ranked largest to smallest, we can now single out the rows that have company "A" like so:
=LARGE({20000;20001;20002;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;15000;14000;30000;FALSE;FALSE;FALSE;FALSE},ROW(1:1))
LARGE will only work with numeric values, so the FALSE values produced where column A does not match "A" will be ignored. Notice why I have used the helper column here: it eliminates duplicate values without affecting the top 5.
ROW(1:1) has been used as this will automatically update when the formula is dragged down to produce the next highest result in this array.
The main formula for top 5 array
Again this is an Array formula so will need to be input by using Ctrl+Shift+Enter while still in the formula bar.
=INDEX(Database,SMALL(IF(Company="A",IF(Helper=$O3,ROW(Company))),1)-1,COLUMN(A:A))
With array formulas for some unknown reason IF(AND()) just does not work for me so I have nested two IF's instead.
Notice how I am again checking whether the first column matches "A" and then whether the last column matches the result of the second formula. What will happen is where both of these conditions match in the array (as in both produce TRUE for the same row) I wanted the row number to be returned.
IF({TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE},IF({FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;TRUE;FALSE;TRUE;FALSE;FALSE},{2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20}))
It looks like a mess I know, but the position where both TRUEs align gives us the row 16 as a result.
{FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;16;FALSE;FALSE;FALSE;FALSE}
As I know that there can only be one match possible for this, I use SMALL to grab the first smallest number to use in the INDEX formula for row and deduct 1 as we are not considering the headers in the INDEX formula so we actually want the 15th result.
Again, COLUMN(A:A) has been used for the column number to return as this will automatically update when the formula is dragged across.
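If it helps, the nested-IF-plus-SMALL lookup can be sketched in Python as "collect the positions where both conditions hold, then take the smallest". The function name find_record, the dictionary keys, and the sample table are all hypothetical.

```python
def find_record(table, company, helper_value):
    """Return the first record whose company and helper value both match.

    table: list of dicts with at least 'company' and 'helper' keys.
    Returns None when nothing matches.
    """
    positions = [
        i
        for i, rec in enumerate(table)
        if rec["company"] == company and rec["helper"] == helper_value
    ]
    # SMALL(..., 1) picks the smallest matching row number
    return table[min(positions)] if positions else None


# Hypothetical table
table = [
    {"company": "A", "code": "a1", "helper": 20000},
    {"company": "B", "code": "b1", "helper": 30000},
    {"company": "A", "code": "a2", "helper": 30000},
]
print(find_record(table, "A", 30000)["code"])  # a2
```

Checking the company as well as the helper value matters: company B has the same helper value earlier in the table.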
If you are struggling with my explanation and want me to provide more clarity, feel free to reach out and I will try my best to explain the logic in more detail

Excel formula to associate two data sets based on exact match, closest match, and no match

I have a sheet with a column of dates, some of which may repeat, and amounts in the column next to those dates. On another sheet I also have a column of dates, some of which may repeat, and other amounts. I need a formula that will go through the first sheet's dates, find the nearest amount for that date on the second sheet, and bring me back the difference between the two. The amount on Sheet 1 will always be greater than or equal to what is on Sheet 2 for the proper match-up. An example of my problem is detailed below.
Sheet 1 will have
09/08/2014 $3,838
09/08/2014 $564
09/08/2014 $1023
09/09/2014 $645
Sheet 2 will have
09/08/2014 $561
09/08/2014 $1023
09/09/2014 $35
Basically, the end result should be the difference between what is on Sheet 1 and what is on Sheet 2. The formula that compares Sheet 1 and Sheet 2 cannot be based on a parameter, for example "this is 90% of this, so these are to be matched". This is because there will be situations where they should be matched up even though what's on Sheet 2 is only 10% of what's on Sheet 1, but it's the only instance of that number on Sheet 1, so they must be matched up. The final results should look like this:
09/08/2014 $3,838 = $3,838
09/08/2014 $564 = $3
09/08/2014 $1023 = 0
09/09/2014 $645 = $610
I believe what I am looking for is going to involve indexing, among other things. I have tried doing this a ton of ways already, including an extensive nested IF and VLOOKUP, and also adding some SUMIFS in a separate column and concatenating dates and amounts, but that route didn't lead to any success either. Feel free to add helper columns to work from.
The first rule in the formula, I think, would be to match the exact numbers, then match the next closest numbers and subtract them; anything left over unmatched would be just the number from Sheet 1. Numbers on Sheet 2 can only be used once.
I think I understand what you want but please correct me if I have it wrong. Based on what I think you want then how about this.
I assume:
There are two columns of data in Sheet 1 and two columns of data in Sheet 2
In both cases there are dates in column A and money amounts in column B
The data starts in row 1 and continues in contiguous rows down the sheet
The rows of both sheets are sorted first by date (column A) oldest to newest and then by value (column B) smallest to largest
Then put this formula in cell C1 in Sheet 1 and enter it as an array formula (With the cell in edit mode press CTRL+SHIFT+ENTER instead of just ENTER). You'll know if you did this correctly since the formula in the formula bar will then be wrapped in braces, i.e. {=}.
=MIN(ABS(OFFSET(Sheet2!$A$1,MATCH($A1,Sheet2!$A$1:$A$13,0)-1,1,COUNTIF(Sheet2!$A$1:$A$13,"="&Sheet1!$A1),1)-$B1))
Then copy that formula down column C for all rows.
That should do what you want.
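To see what the formula returns on the sample data from the question, here is a minimal Python sketch of the same calculation (smallest absolute gap to any Sheet 2 amount on the same date). The function name nearest_difference is made up; the dates and amounts are the ones posted above.

```python
def nearest_difference(date, amount, sheet2):
    """Smallest absolute gap between `amount` and any Sheet 2 amount on `date`.

    sheet2: list of (date, amount) pairs. Returns None when the date has no
    entries on Sheet 2 (the spreadsheet formula returns an error there).
    """
    gaps = [abs(other - amount) for d, other in sheet2 if d == date]
    return min(gaps) if gaps else None


sheet1 = [("09/08/2014", 3838), ("09/08/2014", 564),
          ("09/08/2014", 1023), ("09/09/2014", 645)]
sheet2 = [("09/08/2014", 561), ("09/08/2014", 1023), ("09/09/2014", 35)]

for date, amount in sheet1:
    print(date, amount, nearest_difference(date, amount, sheet2))
# 09/08/2014 3838 2815
# 09/08/2014 564 3
# 09/08/2014 1023 0
# 09/09/2014 645 610
```

Three of the four rows agree with the desired output. The first row gives 2815 rather than 3838 because nothing in this calculation stops a Sheet 2 amount from being used more than once, so the "use each number only once" rule would need extra logic.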

Optimization of array function that calculates products

I have the following array formula that calculates the returns on a particular stock in a particular year:
=IF(AND(NOT(E2=E3),H2=H3),PRODUCT(IF($E$2:E2=E1,$O$2:O2,""))-1,"")
But since I have 500,000 row entries as soon as I hit row 50,000 I get an error from Excel stating that my machine does not have enough resources to compute the values.
How shall I optimize the function so that it actually works?
E column refers to a counter to check the years and ticker values of stocks. If year is different from the previous value the function will output 1. It will also output 1 when the name of stock has changed. So for example you may have values for year 1993 and the next value is 1993 too but the name of stock is different, so clearly the return should be calculated anew, and I use 1 as an indication for that.
Then I have another column that runs a cumulative sum of those 1s. When a new 1 in that previous column is encountered I add 1 to the running total and keep printing same number until I observe a new one. This makes possible use of the array function, if the column that contains running total values (E column) has a next value that is different from previous I use my twist on SUMIF but with PRODUCT IF. This will return the product of all the corresponding running total E column values.
The source of the inefficiency, I believe, is in the steady increase with row number of the number of cells that must be examined in order to evaluate each successive array formula. In row 50,000, for example, your formula must examine cells in all the rows above it.
I'm a big fan of array formulas, so it pains me to say this, but I wouldn't do it this way. Instead, use additional columns to compute, in each row, the pieces of your formula that are needed to return the desired result. By taking that approach, you're exploiting Excel's very efficient recalculation engine to compute only what's needed.
As for the final product, compute that from a cumulative running product in an auxiliary column, and that resets to the value now in column O when column P in the row above contains a number. This approach is much more "local" and avoids formulas that depend on large numbers of cells.
I realize that text is not the best language for describing this, and my poor writing skills might be adding to the challenge, so please let me know if more detail is needed.
Interesting problem, thanks.
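The running-product idea described above can be sketched in a few lines of Python. This assumes column O holds per-row growth factors (1 + return) and column E holds the running group counter; the function name group_returns and the numbers are made up for illustration.

```python
def group_returns(group_ids, factors):
    """Return (group_id, product_of_factors - 1) for each run of equal ids.

    Each row only looks at the running product from the row above, so the
    total work grows linearly with the number of rows.
    """
    results = []
    running = None
    last_index = len(group_ids) - 1
    for i, (gid, factor) in enumerate(zip(group_ids, factors)):
        if i == 0 or gid != group_ids[i - 1]:
            running = factor      # new group: reset the running product
        else:
            running *= factor     # same group: extend it
        # Report only on the last row of each group
        if i == last_index or group_ids[i + 1] != gid:
            results.append((gid, running - 1))
    return results


# Hypothetical data: two groups
print(group_returns([1, 1, 2, 2, 2], [2.0, 1.5, 1.0, 0.5, 4.0]))
# [(1, 2.0), (2, 1.0)]
```

The original array formula rescans every row above it for each result, which is quadratic overall; the running product avoids that.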
Could I suggest a really quick and [very] dirty VBA macro? Something like the below. Obviously, have a backup of your file before running this. This assumes you want to start calculating from row 13.
Sub calculateP()
    'start on row 13, column P:
    Cells(13, 16).Select
    'loop through every row as long as column A is populated:
    Do
        If ActiveCell(1, -14).Value = "" Then Exit Do 'column A not populated so exit loop
        'enter formula:
        Selection.FormulaR1C1 = _
            "=IF(AND(NOT(RC[-11]=R[1]C[-11]),RC[-8]=R[1]C[-8]),PRODUCT(IF(R[-11]C5:RC[-11]=R[-1]C[-11],R2C15:RC[-1],""""))-1,"""")"
        'convert cell value to value only (remove formula):
        ActiveCell.Value = ActiveCell.Value
        'select next row:
        ActiveCell(2, 1).Select
    Loop
End Sub
Sorry, this is definitely not a great answer for you... in fact, even this method could be achieved more elegantly using range... but, the quick and dirty approach may help you in the interim ??

Excel arrays: count totals using criteria from multiple ranges (or sheets)

What I would like to do is count the number of lines that match criteria to be verified in two arrays.
I can't use VBA or add new columns (for instance a new column with a VLOOKUP formula), and I would prefer to use arrays.
I have two separate ranges, each with a ID column for the identifier and other fields with data.
For instance, range 1:
Range 2:
If I had only to check the first range I would do:
={SUM((D4:D7="Red") * (E4:E7="Big"))}
But I don't know how to check also using data from the other range.
How, for example, to count the number of items that are Red, Big and Round by using both Ranges ?
Put this in the cell F4:
=IF((VLOOKUP(C4,$C$11:$D$12,2)="Round")*(D4="Red")*(E4="Big"),1,"")
Note that VLOOKUP without a fourth argument does an approximate match: it finds the largest value that is less than or equal to the lookup value. Since there's no 1 in your second dataset, this first cell is going to show "#N/A", which I don't know how to solve, but when you extend this formula down to also compare the other sample data in the first set, ID numbers 2 and 4 will show 1.
Edit: You wanted a count of this list. So after this, it should be easy to get a count of cells in this column using the COUNT function.
Try this array formula
=SUM((D4:D7="Red")*(E4:E7="Big")*ISNUMBER(MATCH(C4:C7,IF(D12:D13="Round",C12:C13),0)))
The last part is the added criterion you want - the IF function returns {2,4} [IDs where Data 3 is "Round"] and then you can use MATCH to compare C4:C7 against that. If there is a match you get a NUMBER (instead of #N/A) so you can then use ISNUMBER to get TRUE/FALSE and that feeds in to your original formula - result should be 2
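The same two-step logic (first collect the IDs that are "Round", then test membership) can be sketched in Python. The function name count_matches and the sample rows are made up; they are only chosen so that IDs 2 and 4 are the round ones, as in the explanation above.

```python
def count_matches(range1, range2):
    """range1: list of (id, colour, size); range2: list of (id, shape).

    Counts rows of range1 that are Red and Big and whose id is listed as
    Round in range2.
    """
    # Equivalent of IF(D12:D13="Round", C12:C13): the ids that are Round
    round_ids = {item_id for item_id, shape in range2 if shape == "Round"}
    # ISNUMBER(MATCH(...)) becomes a simple membership test
    return sum(
        1
        for item_id, colour, size in range1
        if colour == "Red" and size == "Big" and item_id in round_ids
    )


# Hypothetical data
range1 = [(1, "Red", "Big"), (2, "Red", "Big"), (3, "Blue", "Big"), (4, "Red", "Big")]
range2 = [(2, "Round"), (4, "Round")]
print(count_matches(range1, range2))  # 2
```

ID 1 is Red and Big but has no entry in the second range, so it is simply not counted rather than producing an error.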

Resources