Excel, counting within a formulatext on array


Imagine rows A2:A11 hold the customer names and columns B1:AE1 hold the days of the month.
To make it easy:
Each day we tally each customer's purchases (quantities) and separate them with a + so the cell shows that day's total purchase. For example, on the 2nd day of the month (C2:C5):
Abe =44+54+10
John =22+10+40
Sara =40
Mary =10+10
We also need to count the total number of “sales cases” for the whole day (in the example above, 3+3+1+2 = 9) and show it in the last row for that day (B12 in this example).
The logic is something like:
=SUMPRODUCT(LEN(FORMULATEXT(C2:C5))-LEN(SUBSTITUTE(FORMULATEXT(C2:C5),"+","")))
But I’m getting #N/A.
Reminder: when there are no "+" signs and the value is greater than zero, it should count as 1.
Any help?

There is a trick to do this, though it may be a bit convoluted to get the end result you want.
First, define a new name in Name Manager (on the Formulas tab):
Name: FormulaText
Refers to: =GET.CELL(6,OFFSET(INDIRECT("RC",FALSE),0,-1))
Now, if cell B3 contains the formula =10+20+30, enter =FormulaText in cell C3 and you will get the text of that formula. (The OFFSET(...,0,-1) in the definition makes the name read the cell immediately to the left of the one it is entered in.)
You can now count the + symbols in that formula using =LEN(C3)-LEN(SUBSTITUTE(C3,"+",""))
In your specific circumstances, I would put all of this to the right of your data, say 35 columns over, in which case you will need to change the column offset in the definition of FormulaText accordingly (for example, -35 instead of -1).
A fair bit of set-up work, but the result should work automagically.
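For anyone who wants to check the counting rule outside Excel, here is a minimal Python sketch of the same logic (count the addends in the formula text, with a lone non-zero value counting as 1). The function name count_cases and the sample list are made up for illustration; the list mirrors the day-2 example from the question.

```python
def count_cases(formula_text: str) -> int:
    """Count the addends in formula text such as '=44+54+10'."""
    body = formula_text.strip().lstrip("=")
    if not body:
        return 0  # empty cell: no sales cases
    terms = [t for t in body.split("+") if t.strip()]
    if len(terms) == 1:
        # No "+" sign: counts as one case only when the value is non-zero
        return 1 if float(terms[0]) != 0 else 0
    return len(terms)


# Day 2 from the question (C2:C5)
day_two = ["=44+54+10", "=22+10+40", "=40", "=10+10"]
total_cases = sum(count_cases(f) for f in day_two)
print(total_cases)
```

This is the same idea as LEN minus LEN(SUBSTITUTE(...)), plus 1 for any cell holding a non-zero value.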

Related

EXCEL: How to count if data matches in specific columns where header is within the date range?

I would like to get a count on the total number of lessons of a specific length given within a specific date range.
I can figure out how to get the number of a specific type of lesson on a specific day with something like:
=countifs(INDEX($E:$V,,MATCH($A8,$E$1:$V$1,1)),"=30")
But I can't figure out how to find all of the values for, say, dates <=A8 (for row 8), or dates >A8 and <=A9 (for row 9).
I am looking to get the data to output like the yellow section.

In Excel:
=countifs(INDEX($E:$V,,MATCH($A8,$E$1:$V$1,1)):INDEX($E:$V,,IFERROR(MATCH($A9,$E$1:$V$1,1)-1,COLUMN($V$1))),"=30")
But note, this will not work in Google Sheets.

I'm not sure this is the most elegant solution. I created a hidden column (now E:E) to calculate the number of days between the dates in column A.
Example for cell E9:
=SUMPRODUCT(($F$1:$W$1>A8)*($F$1:$W$1<A9))
This returns the number of header dates that fall strictly between the dates in A8 and A9.
I then modified Scott Craner's formula to this (Example for cell B9):
=COUNTIFS(INDEX($F:$W,,MATCH($A9-$E9,$F$1:$W$1,1)):INDEX($F:$W,,IFERROR(MATCH($A9,$F$1:$W$1,),COLUMN($W$1))),"=30")
Remember that after adding a column, my dates shifted from E1:V1 to F1:W1.
These two steps solved my issue with double counting on certain dates.
Lastly, I went back and tested this in Google Sheets, and it appears to work there as well.
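The double-counting issue comes down to how the date windows are bounded. As a rough Python sketch (not a translation of either formula): if each window is half-open, start < date <= end, then every header date falls in exactly one window. The function name count_lessons and the sample data are hypothetical.

```python
from datetime import date


def count_lessons(header_dates, rows, start, end, length=30):
    """Count cells equal to `length` in columns whose header date is in (start, end]."""
    cols = [i for i, d in enumerate(header_dates) if start < d <= end]
    return sum(1 for row in rows for i in cols if row[i] == length)


# Hypothetical sheet: one header date per column, one row per student
header_dates = [date(2020, 1, 1), date(2020, 1, 2), date(2020, 1, 3), date(2020, 1, 4)]
rows = [
    [30, 45, 30, None],
    [None, 30, 30, 60],
]

# First window: everything up to and including Jan 2
first = count_lessons(header_dates, rows, date.min, date(2020, 1, 2))
# Second window: after Jan 2, up to and including Jan 4
second = count_lessons(header_dates, rows, date(2020, 1, 2), date(2020, 1, 4))
print(first, second)
```

Because the boundary date belongs only to the earlier window, the two counts always add up to the total for the whole range.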

Is there a way to consolidate multiple formulas into one

All:
I am trying to design a shared worksheet that measures salespeople's performance over a period of time. In addition to calculating the number of units, sales price, and profit, I am trying to calculate how many new accounts were sold in the month (ideally, I'd like to be able to change the timeframe so I can calculate larger time periods like quarter, year, etc.).
In essence, I want to find out whether a customer was sold to in the 12 months before the present month and, if not, see the customer number and the salesperson who sold to them.
So far, I was able to calculate that by adding three columns that each calculate a part of the process (see screenshot below):
Column H (SoldLastYear) - Shows customers that were sold in the year before this current month: =IF(AND(B2>=(TODAY()-365),B2<(TODAY()-DAY(TODAY())+1)),D2,"")
Column I (SoldNow) - Shows the customers that were sold this month, and if they are NOT found in column H, show "New Cust": =IFNA(IF(B2>TODAY()-DAY(TODAY()),VLOOKUP(D2,H:H,1,FALSE),""),"New Cust")
Column J (NewCust) - If Column I shows "New Cust", show me the customer number: =IF(I2="New Cust",D2,"")
Column K (SalesName) - if Column I shows "New Cust", show me the salesperson name: =IF(I2="New Cust",C2,"")
Does anyone have an idea how I can make this more efficient? Could an array formula work here, or will it get stuck in a loop since it's referring to other lines in the same column?
Any help would be appreciated!!
EDIT: Here is what I'm trying to achieve:
Instead of:
having Column H showing me what was sold in the 12 months before the 1st day of the current month (for today's date: 8/1/19-7/31/20);
Having Column I showing me what was sold in August 2020; and
Column I searching column H to see if that customer was sold in the timeframe specified in Column H
I want to have one column that does all three: it flags all sales made over the 12 months before the beginning of the current month plus the current month to date (so, 8/1/19 to 8/27/20), compares sales made in the current month (August) with the sales made before it, and lets me know the first time a customer shows up in the current month IF it doesn't appear in the 12 months prior. In other words, it finds the new customers after a dormant period of 12 months.
I'm really just trying to find a way to make the formula better and less resource-intensive. With a large dataset, the three columns (copied a few times for different timeframes) really slow down Excel...
Here is an example of the end result:
Example of final product
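To pin down the logic being asked for, here is a minimal Python sketch of the single-pass version: build the set of customers sold to in the 12 months before the current month, then report the first sale this month for any customer not in that set. The tuple layout (date, salesperson, customer) and the function name new_customers are assumptions for illustration, and the lookback uses a calendar year from the first of the month rather than TODAY()-365.

```python
from datetime import date


def new_customers(sales, today):
    """Return {customer: salesperson} for customers first sold to this month
    who have no sale in the 12 months before the current month.

    sales: list of (sale_date, salesperson, customer) tuples (hypothetical layout).
    """
    month_start = today.replace(day=1)
    # Assumption: lookback window starts one calendar year before month_start
    window_start = month_start.replace(year=month_start.year - 1)
    recent = {c for d, _, c in sales if window_start <= d < month_start}

    result = {}
    for d, rep, cust in sorted(sales):  # oldest sale first
        if month_start <= d <= today and cust not in recent and cust not in result:
            result[cust] = rep
    return result


sales = [
    (date(2020, 3, 5), "Ann", "C1"),   # C1 bought within the last 12 months
    (date(2020, 8, 3), "Bob", "C1"),
    (date(2020, 8, 10), "Ann", "C2"),  # C2 is brand new
    (date(2020, 8, 20), "Bob", "C2"),
    (date(2019, 6, 1), "Ann", "C3"),   # C3 last bought more than 12 months ago
    (date(2020, 8, 15), "Cy", "C3"),
]
found = new_customers(sales, date(2020, 8, 27))
print(found)
```

The lookback set is built once, so each row needs only a set lookup instead of a VLOOKUP against a whole helper column.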

EXCEL: Create Array Formula out of INDEX/MATCH with multiple results

My aim is to convert a massive Excel sheet with different projects, employees and hours worked per month into an overview per employee. It should be possible to display the projects the employee is involved in and how many hours they worked per project per month.
The original sheet looks something like this:
I managed to find the projects Person A worked on using the INDEX/MATCH function. I applied the formula to the whole row where the employees are listed and there are multiple matching projects. My question is how to turn the formula into something more effective that copies all of the matched results (projects) into a column (see 1).
This is what I have so far. It matches the employee name within a given range and returns the first project that employee is involved in:
=INDEX(B2:J3;1;MATCH("Person A";Sheet1!B3:E3;0))
How can I copy this to the bottom cells to copy all of the matched results? Does it help to create an array formula with this?

You can use the following formula in cell B9:
=IFERROR(INDEX($2:$2,SMALL(IF($3:$3=$B$8,COLUMN($3:$3)-COLUMN(INDEX($3:$3,1,1))+1),ROWS(A$1:A1))),"")
It indexes row 2 using the column position of the first match in row 3 that equals the value in B8 (Person A). When you drag the formula down, ROWS(A$1:A1) becomes ROWS(A$1:A2) = 2, so SMALL returns the second match, and so on. Once there are no more matches, IFERROR returns an empty string.
For Person B you can use this formula in cell B14:
=IFERROR(INDEX($2:$2,SMALL(IF($3:$3=$B$13,COLUMN($3:$3)-COLUMN(INDEX($3:$3,1,1))+1),ROWS(A$1:A1))),"")
I hope this is what you were looking for.
PS
If you paste the following formula in cell C9, you will get the summed hours for Person A on Project XY in month 10 of 2019:
=IF(OR($B9="",C$8=""),"",SUMPRODUCT(($B$2:$K$2=$B9)*($B$3:$K$3=$B$8)*($A$4:$A$6=C$8),B4:K6))
Note: That is provided that the value in cell C8 equals the value in cell A4.
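If the SMALL(IF(...)) pattern is hard to follow, this small Python sketch shows the same k-th match idea: collect the positions where the employee row matches, then pick the k-th one. The function name kth_match and the sample rows are made up for illustration and are not taken from the asker's sheet.

```python
def kth_match(projects, people, person, k):
    """Return the project in the k-th column (1-based) where people[i] == person.

    Returns "" when there are fewer than k matches, like the IFERROR wrapper.
    """
    positions = [i for i, p in enumerate(people) if p == person]
    return projects[positions[k - 1]] if 1 <= k <= len(positions) else ""


# Hypothetical rows 2 and 3 of the sheet
projects = ["XY", "XY", "AB", "CD"]
people = ["Person A", "Person B", "Person A", "Person B"]

first = kth_match(projects, people, "Person A", 1)
second = kth_match(projects, people, "Person A", 2)
third = kth_match(projects, people, "Person A", 3)
print(first, second, repr(third))
```

Here k plays the role of ROWS(A$1:A1), which grows by one each time the formula is dragged down a row.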

Iterating over a MultiIndex: a groupby.value_counts() object iterates only through values and not through the original date index

I want to know the percentage of males in the ER (emergency room) during days that I defined as overcrowded.
I have a DataFrame named eda with each row representing one entry to the ER. One column states whether the entry occurred on an overcrowded day (1 means overcrowded) and another column states the gender of the person who entered.
So far I have managed to get a series with the overcrowded days as the index and a sub-index for gender, with the number of entries for that gender as the value.
I used this code:
eda[eda.over_crowd==1].groupby(eda[eda.over_crowd==1].index.date).gender.value_counts()
and got the following result:
My question is: what is the most 'pandas-ian' way to get the percentage of males/females in general? Or, how should I continue from the point where I stopped?
As shown at the bottom of the screenshot, when I iterate over the elements, the values alternate between male and female. I want to iterate over dates so I can write a cleaner loop that produces another column with the male percentage.

I found a pretty elegant solution. I'm sure there are others, but maybe it can help someone else.
I defined a MultiIndex series with the counts of females and males for every date, then used .loc to select each gender's count across all dates and compute the percentage of males per day. Finally, I extract only the days where over_crowd==1.
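import numpy as np
# counts per (date, gender); the gender codes 1 and 2 form the inner index level
temp = eda.groupby(eda.index.date).gender.value_counts()
crowding['male_percent'] = np.divide(100 * temp.loc[:, 1], temp.loc[:, 2] + temp.loc[:, 1])
crowding.loc[crowding.over_crowd == 1, 'male_percent']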
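Another way to get the same number, sketched here with a tiny made-up stand-in for eda (a DatetimeIndex, with gender assumed to be coded 1 = male and 2 = female as the code above implies): the mean of a boolean column is the share of True values, so grouping a "is male" flag by date gives the male percentage directly, with no need to iterate over the MultiIndex.

```python
import pandas as pd

# Hypothetical stand-in for `eda`; gender coding (1 = male, 2 = female) is an assumption
eda = pd.DataFrame(
    {"gender": [1, 1, 2, 1, 2, 2], "over_crowd": [1, 1, 1, 0, 0, 0]},
    index=pd.to_datetime(
        [
            "2020-01-01 08:00", "2020-01-01 09:30", "2020-01-01 17:00",
            "2020-01-02 10:00", "2020-01-02 11:00", "2020-01-02 12:00",
        ]
    ),
)

# Keep only entries on overcrowded days
crowded = eda[eda.over_crowd == 1]

# Mean of a boolean Series = fraction of True, so this is the male share per day
male_percent = crowded.gender.eq(1).groupby(crowded.index.date).mean() * 100
print(male_percent)
```

The result is a Series indexed by date, so it can be assigned to a per-day frame or iterated with .items() if a loop is still wanted.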

Issue with a multiple criteria INDEX MATCH formula

I used this array formula with INDEX/MATCH:
{=INDEX(ENTRIES!$F$4:$F$28;MATCH(C4&F4&G4;ENTRIES!$C$4:$C$28&ENTRIES!$G$4:$G$28&ENTRIES!$H$4:$H$28;0))}
Here is the thing: I am trying to display the price from the "entries" sheet on the "sales" sheet. The problem comes up when there are different prices for one "Code" (product) over time. I tried to solve it with the INDEX/MATCH formula above, which matches the price of the code (product) with the month and the year, but it doesn't return the price or any value for the months between price updates. See picture.
Example: for month 6 it should return the price of month 5, because there has been no update or change. Likewise, month 9 should get the same price as month 8 for that product. How can I do that?

Looks to me like it's throwing those errors because it can't find those months. Where a month is missing, at least with your data, you could tell the formula to pick the highest row in the data whose month is less than or equal to your search month, using MAX().
Furthermore, matching multiple criteria by concatenating cells and columns can get tricky once numeric values/dates are involved, and could return wrong or unexpected results. Try something along these lines instead:
MATCH(1,((Criteria1)*(Criteria2)*(Criteria3))...
So the whole thing would look like:
=INDEX(ENTRIES!$F$1:$F$28;MAX(((ENTRIES!$C$4:$C$28=C4)*(ENTRIES!$G$4:$G$28<=G4)*(ENTRIES!$H$4:$H$28=H4)*ROW(ENTRIES!$F$4:$F$28))))
Entered with Ctrl+Shift+Enter.
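To make the intended lookup explicit, here is a minimal Python sketch of "latest known price on or before the search month". Note that it is not a literal translation of the formula: the formula takes the highest matching row number, while this sketch takes the highest matching month, so it does not depend on the entries being sorted. The tuple layout and the name price_for are assumptions for illustration, and like the formula it only looks within the same year.

```python
def price_for(entries, code, month, year):
    """Return the most recent price for `code` in `year` with entry month <= `month`.

    entries: list of (code, month, year, price) tuples (hypothetical layout).
    Returns None when no earlier or equal month exists for that code and year.
    """
    candidates = [
        (m, p) for c, m, y, p in entries if c == code and y == year and m <= month
    ]
    if not candidates:
        return None
    # Pick the entry with the latest month, not the highest price
    return max(candidates, key=lambda mp: mp[0])[1]


entries = [
    ("A1", 5, 2019, 10.0),
    ("A1", 8, 2019, 12.5),
    ("B2", 5, 2019, 7.0),
]

june = price_for(entries, "A1", 6, 2019)       # no update in June: falls back to May
september = price_for(entries, "A1", 9, 2019)  # falls back to August
april = price_for(entries, "A1", 4, 2019)      # nothing on or before April
print(june, september, april)
```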

@JVDV's answer was helpful, but it didn't work for me because it gives me a higher value rather than the latest known price, i.e. the price from the last month with data, not the next one.
Anyway, looking at your formula, I finally came up with this:
{=IFERROR((INDEX(ENTRIES!$F$4:$F$28;MATCH(C8&J8&K8;ENTRIES!$C$4:$C$28&ENTRIES!$G$4:$G$28&ENTRIES!$H$4:$H$28;0)));INDEX(ENTRIES!$F$4:$F$28;MATCH(1;(ENTRIES!$C$4:$C$28=C8)*(ENTRIES!$G$4:$G$28=J8-1)*(ENTRIES!$H$4:$H$28=K8);0)))}
The first part is my original formula, but now when it throws an error it falls back to the second INDEX, which finds the price for the month before the one with no price in the data. Of course, it isn't perfect either, because I have to "update" the price at least every two months.
I tried another way with a <= sign, but it didn't work either.