Long calculation times with XLOOKUP vs INDEX-MIN-COLUMN - arrays

I'm using this formula =IF(B24="","",IFERROR(INDEX(Sheet3!$C$3:$EE$3,,MIN(IF(Sheet3!$C$4:$EE$23=(Sheet2!C24&$K$18),COLUMN(Sheet3!$C:$EE)))-2),"NF")) to return a cell value in the top row of an array - a date in this case.
The search criteria is a combination of a unique project number and a 2 digit status alphanumerical code for the project. The array consists of 23 rows where combinations of the unique numbers are found, each with different status codes.
So essentially, I'm building a FILTERED project status dashboard that returns dates linked to the relevant project status.
The code above is inspired by ( LINK ), which uses a very similar layout but with town suburbs linked to postal codes instead of project numbers and status codes. The formula works well (though not entered as an array formula), but I don't have a single formula in the sheet; I have 3 300 occurrences of this formula.
The problem comes in when the user changes the FILTER - Excel recalculates the entire dashboard and that takes anywhere from 2 to 5 minutes to run. You hit the escape button and cancel the calculation after setting the filter, but Excel just starts calculating again after a few seconds. After that, Excel's response is sluggish and almost unusable. Yes - our hardware is pretty weak ...
I tried XLOOKUP as well, but can't set the "lookup_array" to an array ( Sheet3!$C$4:$EE$23 ) because it doesn't match the "return-array" ( Sheet3!$C$3:$EE$3 ) Concatenating the lookup arrays with & works, but then you'd have to do that for all 23 rows, and again, multiply that by 3 300.
I thought of creating a UDF, but the function will still be called every time Excel recalculates after filtering... 3 300 calls ...
Any ideas on how to make the INDEX version run faster, or make the XLOOKUP accept the lookup_array as Sheet3!$C$4:$EE$23 in the hopes that it'll run faster?
Thank you!

Not really an elegant solution, but it works.
I imported the dataset into a helper sheet, where I combined the cell value with the corresponding value in Column A for each row ( a name in this case ) and the date from row 1 for each column, using underscore as a delimiter.
This new data range was then given a unique name, EE in this case.
On a second helper sheet, I use this formula =INDEX(Filtered,1+INT((ROW('Sheet1'!C3)-1)/COLUMNS(Filtered)),MOD(ROW('Sheet1'!C3)-1+COLUMNS(Filtered),COLUMNS(Filtered))+1) and drag it down until it returns a #REF! error, then go back one row before the error.
This transposes all the data into a single column G. Using =UNIQUE(SORT(FILTER(B3:B3240,B3:B3240<> "",""))) then gives me a filtered list of unique values in column H that I then run
=IF(H3="","",LEFT(H3, SEARCH("_",H3,1)-1)) for the first data value in I, and
=IF(H3="","",MID(H3, SEARCH("_",H3) + 1, SEARCH("_",H3,SEARCH("_",H3)+1) - SEARCH("_",H3) - 1)) for the middle data value in J, and
=IF(H3="","",IFERROR(TEXT(RIGHT(H3,5),"yyyy-mm-dd"),"NF")) for the last data value in K.
Then just run XLOOKUP across columns I, J and K.
Runs quick and easy and solves a few of the other issues I had as well.
The second data set has just over 35 000 rows - still works well and fast.
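The idea behind the helper sheets (flatten the grid once, then do cheap lookups against the flat list) can be sketched outside Excel as follows. This is only an illustration in Python, not the workbook itself; the function name `build_lookup` and the sample project numbers, status codes and dates are made up for the example.

```python
from datetime import date

def build_lookup(header_dates, grid):
    """Map each non-empty cell value to the date heading its column.

    If a value occurs in several columns, the leftmost column wins,
    which mirrors MIN(COLUMN(...)) in the original INDEX formula.
    """
    first_col = {}
    for row in grid:
        for col, value in enumerate(row):
            if value != "" and col < first_col.get(value, len(row)):
                first_col[value] = col
    return {key: header_dates[col] for key, col in first_col.items()}

# Hypothetical data: one header row of dates and two data rows of
# "project number + status code" keys.
header_dates = [date(2023, 1, 9), date(2023, 1, 16), date(2023, 1, 23)]
grid = [
    ["1001AA", "", "1001BB"],
    ["1002AA", "1001AA", ""],
]

lookup = build_lookup(header_dates, grid)      # built once
print(lookup.get("1001" + "BB", "NF"))         # each lookup is a dictionary hit
print(lookup.get("9999" + "ZZ", "NF"))         # missing keys fall back to "NF"
```

The grid is scanned once instead of once per formula, which is presumably why the helper-sheet version recalculates so much faster.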

Related

Excel formula to return a value if two values are true and one of two other things are true

I'm trying to write a formula in Excel where if value in B is “Eligible/Previously Eligible” and value in C is more than 365 days before today and value in D contains either 10 or 20, then return the value in A. I’ve been searching around and have written this:
=IFERROR(INDEX($A:$A,SMALL(IF((COUNTIFS($B:$B,"Eligible/Previously Eligible",$C:$C,"<"&TODAY()-365,$D:$D,{"*10*","*40*"})),ROW($A:$D)-MIN(ROW($A:$D))+1),ROW(A1)),COLUMN(A1)),"")
And have activated with the CTRL+Shift+Enter combo, but it just pulls in everything from A regardless of what is in B, C, or D:
@Solar Mike and @Scott Craner, thanks! This has gotten me closer but not quite there. I have a formula now that works to return the ID numbers that meet the criteria:
=IF(AND(B2="Eligible/Previously Eligible",D2<TODAY()-365,D2<>"",OR(SUM(COUNTIF(C2,{"*10*","*20*"})))),A2,"")
But I still can't get it to give me a list without white space. So, I can get what's in the "ID Numbers with Problems" column, but what do I need to write to get it to show the way I've done it manually in the "What I want" column?
Add a helper column:
=AND(ISNUMBER(SEARCH("elig",B2)),NOW()-C2>=365,D2/10<=2)
Drag down to test each row. Then a results table based on those that give true.
Or use sumproduct() with column A to give the IDs.
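For comparison, the "list without white space" the question asks for is just a filter over the rows. A rough Python sketch of that logic, assuming the layout described in the question (A = ID, B = status, C = date, D = code); the function name `ids_with_problems` and the sample rows are invented for illustration:

```python
from datetime import date, timedelta

def ids_with_problems(rows, today):
    """Return the IDs of rows that meet all three criteria, with no gaps."""
    cutoff = today - timedelta(days=365)
    return [
        row_id
        for row_id, status, last_date, code in rows
        if status == "Eligible/Previously Eligible"
        and last_date is not None          # skip rows with no date
        and last_date < cutoff             # more than 365 days before today
        and ("10" in str(code) or "20" in str(code))
    ]

# Hypothetical sample rows: (ID, status, date, code)
rows = [
    ("A1", "Eligible/Previously Eligible", date(2022, 6, 1), "10"),
    ("A2", "Not Eligible", date(2022, 6, 1), "20"),
    ("A3", "Eligible/Previously Eligible", date(2023, 9, 1), "20"),
    ("A4", "Eligible/Previously Eligible", date(2021, 1, 1), "30"),
    ("A5", "Eligible/Previously Eligible", date(2020, 3, 3), "20"),
]

result = ids_with_problems(rows, today=date(2024, 1, 1))
print(result)
```

In Excel the helper column plays the role of the `if` clause, and the results table collects only the rows where it is TRUE.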

How to count unique occurrences with criteria in excel

I'm using the below array formula to count the unique occurrences of text in column C using the agent name in column G as the reference. This is giving me multiple issues.
=SUM( --(FREQUENCY(IF(G3:G100000 = J5,MATCH(C3:C100000,C3:C100000,0)),ROW(C3:C100000) - ROW(C3) + 1) > 0))
Depending on the data set I'm using multiple agents will return a #N/A result and I can't figure out why.
Each dataset I'm using is 20k to 30k lines, so the formulas take a long time to process.
Any ideas how I could do this faster or better? Also any ideas why some agents get bad returns?
I am assuming that you are looking for the number of unique combinations of columns C and G.
Create a pivot table and check the box to add this data to the data model.
Drag both column headers to the Rows section, and also drag one (of those same two) into the Values section.
Click on the field in the Values section > Value Field Settings > Summarize Values By > choose Distinct Count. This removes all duplicates.
Click the Row Labels filter and uncheck the blanks.
You can drop in new data then right-click on the pivot and refresh to see the new results. See the image.
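What Distinct Count computes here can be sketched in a few lines of Python: one pass over the data, collecting the distinct column C values per agent and skipping blanks. The function name `distinct_counts` and the sample rows are assumptions made for the example.

```python
from collections import defaultdict

def distinct_counts(rows):
    """Count distinct text values per agent in a single pass, ignoring blanks."""
    seen = defaultdict(set)
    for text, agent in rows:
        if text == "" or agent == "":
            continue
        seen[agent].add(text)
    return {agent: len(values) for agent, values in seen.items()}

# Hypothetical sample rows: (column C text, column G agent)
rows = [
    ("x", "Ann"),
    ("x", "Ann"),
    ("y", "Ann"),
    ("x", "Bob"),
    ("", "Bob"),   # blank text is skipped rather than counted
]

counts = distinct_counts(rows)
print(counts)
```

A single pass like this scales linearly with the number of rows, whereas the array formula repeats a full-column MATCH for every row.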

Vlookup from multiple criteria to display nearest answer

I was hoping someone can help me. I have hit a solid wall.
I have a table with product information included and I am building a calculator which should spit out a number of options based on set criteria which is in the table. I am failing at just pulling through a code. I feel rather embarrassed asking about how to do a vlookup here. But basically I have a vlookup which depends on multiple criteria and for the calc to cough out the nearest match (if applicable) based on this criteria.
Criteria 1 = Product
Criteria 2 = Type
Criteria 3 = Height
Criteria 4 = Min
I have created a search key in the table to concatenate all of these columns and then done a vlookup, which is =Vlookup(Criteria1 & Criteria2 & Criteria3 & Criteria4, Table Data, Code Required) But this does not appear to be giving me results, it either coughs out an error or the incorrect product. Below is my data and my calc I am hoping to complete. Can someone please help?
Here is an example looking for a closest match on Min. It demonstrates the principle so you can extend.
The closest match formula part is:
MATCH(MIN(ABS(E2:E4-K2)),ABS(E2:E4-K2),0)
Column E is the column holding the Min values, and K2 is the target Min. This is an array formula entered with Ctrl+Shift+Enter. You would adjust the range E2:E4 to suit your data.
The multiple criteria part is using:
=MATCH(lookup_value_1&lookup_value_2&lookup_value_3, lookup_array_1&lookup_array_2&lookup_array_3, match_type)
Where you are concatenating your parameters and searching for a match of the concatenation of those parameters in the table (you could do this against the key column if the key is made up of the same parameters).
Overall formula with some test data (using one estimate figure):
=INDEX(F:F,MATCH(K1&K5&J5&INDEX(E2:E4,MATCH(MIN(ABS(E2:E4-K2)),ABS(E2:E4-K2),0)),B:B&C:C&D:D&E:E,0))
Remember that the combined formula above is an array formula, so enter it with Ctrl+Shift+Enter. You can reduce the ranges from entire columns to only those rows holding data.
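The two-step principle (exact match on the first criteria, closest match on Min) can be written out in Python as a rough sketch. The field names `product`, `type`, `height`, `min` and `code`, the function name `closest_code` and the sample rows are all assumptions for illustration; they are not taken from the poster's table.

```python
def closest_code(rows, product, type_, height, target_min):
    """Exact match on product/type/height, then closest match on Min."""
    candidates = [
        r for r in rows
        if (r["product"], r["type"], r["height"]) == (product, type_, height)
    ]
    if not candidates:
        return None  # nothing matches the exact criteria
    # min() keeps the first row on ties, like MATCH(..., 0) on the ABS array
    best = min(candidates, key=lambda r: abs(r["min"] - target_min))
    return best["code"]

# Hypothetical sample table
rows = [
    {"product": "Door", "type": "A", "height": 2000, "min": 600, "code": "D-A-600"},
    {"product": "Door", "type": "A", "height": 2000, "min": 800, "code": "D-A-800"},
    {"product": "Door", "type": "B", "height": 2000, "min": 700, "code": "D-B-700"},
]

print(closest_code(rows, "Door", "A", 2000, 750))
```

Filtering on the exact criteria first means the closest-match step only ever sees rows for the right product, which is the part that is easy to get wrong when the closest Min is computed over the whole column.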
Test data:
I am not typing all that out from picture so here is a quick n dirty
I tried with the QHarr's solution but it didn't work with all the rows.
My solution is:
Add a column with:
=IF(E2 < $K$2, E2, 0) and copy for all rows
In L5 create the formula:
{=INDEX(F2:F19,MATCH($K$1&K5&$J$5&INDEX(E2:E19,MATCH(MAX(IF(B2:B19=$K$1,1,0)*IF(C2:C19=K5,1,0)*IF(D2:D19=$J$5,1,0)*G2:G19,0),E2:E19,0)),B2:B19&C2:C19&D2:D19&E2:E19,0))}
Copy the formula to L6 and L7
Originally marked this as answered and it did work initially but as I added more products it began to fail. I did manage to (after much trial and error) find a simple solution {=INDEX(Calc!$I$2:$I$189,MATCH(Output!$H$7,IF(Calc!$B$2:$B$189=Output!A12,Calc!$H$2:$H$189),1))}

Get column header of last non-empty cell in unknown row

I have a table in a sheet called "DATA" with the following headers:
Country, Code, Series, 2000, 2001, 2002, 2003, 2004, 2005, 2006.
In each row I have data for all columns always, except for years. Some rows have data for some years only, others all years.
In sheet "DATA AVAILABILITY" I want to build a formula which returns the most recent year for which there is available information in sheet "DATA", given a certain country and code. The relevant country and codes are in cells E2 and A3 of "DATA AVAILABILITY". Let's say, for argument's sake, that these are Country: Angola; Code: 3.
I have first built an array MATCH formula with two criteria:
{=MATCH(1,('DATA AVAILABILITY'!E$2=Data!$B$1:$B$104701)*('DATA AVAILABILITY'!$A3=Data!$D$1:$D$104701),0)}
This has successfully given me the row in "DATA" in which there is information for Angola and code 3, which is row 1776.
Now I would like to get the header for the last non-empty cell of row 1776 in sheet "DATA". For this, I started by building a formula that would give me the column number of that cell:
=LOOKUP(2,1/(Data!1776:1776<>""),COLUMN(Data!1776:1776))
It successfully returned the number 53 which, after verifying on sheet "Data" is the correct number. I then added to the formula so that it would return the header, i.e., the year, instead of the column number:
=INDEX(Data!$A$1:$BE$104701,1,LOOKUP(2,1/(Data!1776:1776<>""),COLUMN(Data!1776:1776)))
Finally, I would like to combine both formulas (the MATCH and the INDEX formulas) so that the final result is returned by a single formula. However, when I try to do it, something goes wrong: I am not even able to enter the formula. When I press ENTER, Excel reports that there is a problem with the formula. What I have tried to do is replace "Data!1776:1776", in the LOOKUP within the INDEX, with the array MATCH formula that returns the row in which the information is (row 1776 in my example). The final formula, which is not working, is as follows:
=INDEX(Data!$A$1:$BE$104701,1,LOOKUP(2,1/(MATCH(1,('DATA AVAILABILITY'!E$2=Data!$B$1:$B$104701)*('DATA AVAILABILITY'!$A3=Data!$D$1:$D$104701)<>""),COLUMN(MATCH(1,('DATA AVAILABILITY'!E$2=Data!$B$1:$B$104701)*('DATA AVAILABILITY'!$A3=Data!$D$1:$D$104701))))
What may I be doing wrong?
Thank you
Hard to tell what is going on without at least some sample data (as a table or linked workbook -- NOT as a screenshot), and I would do it a bit differently.
You can simplify your formula to get the Header of the column that contains the last data in row 1776:
=LOOKUP(2,1/(Data!1776:1776<>""),Data!$1:$1)
To return the column number:
=LOOKUP(2,1/(Data!1776:1776<>""),COLUMN(Data!$1:$1))
To return the Appropriate Row Number (enter with CSE):
=MAX(($E$2=Data!$B$1:$B$104701)*(A3=Data!$D$1:$D$104701)*ROW($A$1:$A$104701))
To return the last filled in value, in the row that matches Country and Code, we make use of the fact that using 0 for the column number in the INDEX function returns all the columns in the designated row:
=LOOKUP(2,1/(INDEX(Data!$B$1:$BE$104701,MAX(($E$2=Data!$B$1:$B$104701)*(A3=Data!$D$1:$D$104701)*ROW($A$1:$A$104701)),0)<>""),INDEX(Data!$B$1:$BE$104701,MAX(($E$2=Data!$B$1:$B$104701)*(A3=Data!$D$1:$D$104701)*ROW($A$1:$A$104701)),0))
entered with CSE.
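The logic the question is after (find the row for a country and code, then walk back from the last year column to the first non-empty cell and report its header) can be sketched in Python. This is an illustration under assumptions: the function name `latest_year` is made up, the column order follows the headers listed in the question, and the first matching row is used, as in the question's MATCH formula.

```python
def latest_year(headers, rows, country, code, first_year_col=3):
    """Return the header of the last non-empty year cell for country/code."""
    for row in rows:
        if row[0] == country and row[1] == code:
            years = headers[first_year_col:]
            values = row[first_year_col:]
            # Scan from the most recent year backwards
            for header, value in zip(reversed(years), reversed(values)):
                if value != "":
                    return header
            return None  # row found, but it has no year data
    return None  # no row for this country and code

# Hypothetical sample data
headers = ["Country", "Code", "Series", 2000, 2001, 2002]
rows = [
    ["Angola", 3, "GDP", 1.0, 2.5, ""],
    ["Angola", 4, "Pop", "", "", ""],
    ["Brazil", 3, "GDP", "", 7, 8],
]

print(latest_year(headers, rows, "Angola", 3))
```

Restricting the scan to the year columns avoids returning a non-year header such as "Series" when a row has no data for any year.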

Remove duplicate values based on timestamp

I need your help with an SQL query that has to remove duplicate entries from a table, mostly using the datestamp column as the criterion, in two passes.
The DBMS in question is Microsoft SQL Server.
Here are a few more details:
Terminology: Module is basically a group of single machine workplaces onto which users operate.
Table:
ModNam column is fixed, there are 15 modules from M A01 to M A15, then goes the B row M B01 ... M B15 and so on until row F.
Pos column is irrelevant at the moment.
MdCod column represents a code of the machine being added to the position in the certain module. It can be replaced by another machine at any given time.
I have one query that will be inserting data into this table by copying entries from another table, every time a new machine is added to one of the positions.
Tricky part for me is a second query that should be comparing records in two phases and if:
1) Inside same module (first pass of the query represented with red color in the example pic attached):
ModNam value is the same and MdCod matches between the entries, then the most recent datestamp decides the single one to stay and the other duplicates get deleted.
2) Inside other module (second pass of the query represented with purple color in the example pic attached):
ModNam values are different and MdCod matches between the entries, then the most recent datestamp decides the single one to stay and the other duplicates get deleted.
Please help and advise.
Example pic (updated):
Thank you all in advance.
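As a way of pinning down the rule before writing the query: if the description is read literally, running pass 1 (same ModNam, same MdCod) and then pass 2 (different ModNam, same MdCod) leaves exactly one row per MdCod, the one with the most recent datestamp. A small Python sketch of that reading follows. The row layout `(ModNam, Pos, MdCod, stamp)`, the function name `keep_latest` and the sample data are assumptions, since the table definition is only shown in the picture.

```python
from datetime import datetime

def keep_latest(rows):
    """Keep, for each MdCod, only the row with the most recent datestamp."""
    latest = {}
    for mod_nam, pos, md_cod, stamp in rows:
        kept = latest.get(md_cod)
        if kept is None or stamp > kept[3]:
            latest[md_cod] = (mod_nam, pos, md_cod, stamp)
    # Sort by datestamp only to make the output order predictable
    return sorted(latest.values(), key=lambda r: r[3])

# Hypothetical sample rows: (ModNam, Pos, MdCod, datestamp)
rows = [
    ("M A01", 1, "X100", datetime(2020, 1, 1, 8, 0)),
    ("M A01", 2, "X100", datetime(2020, 1, 2, 8, 0)),   # same module, newer
    ("M B03", 1, "X100", datetime(2020, 1, 3, 8, 0)),   # other module, newest
    ("M A02", 1, "Y200", datetime(2020, 1, 1, 9, 0)),
]

survivors = keep_latest(rows)
for row in survivors:
    print(row)
```

In T-SQL the same idea is commonly written by numbering the rows with ROW_NUMBER() OVER (PARTITION BY MdCod ORDER BY the datestamp column DESC) and deleting every row whose number is greater than 1.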
