Change INDEX MATCH formula to an array formula in Excel

I am trying to create an INDEX MATCH formula that searches through a column containing a list of jpegs and returns all jpegs that start with a particular string and converts them to a hyperlink.
Currently my formula returns just the first instance, but I'd like it to return all matches.
The list of jpegs is in column F (F1:F1000) on Sheet 2 of the workbook. The string that is being used in the search is the product SKU in column A, sheet 1.
Here is the working non-array version which I have entered in C2 on sheet 1 and filled down:
=IFERROR(
HYPERLINK(
CONCATENATE(sku_url,INDEX(Sheet2!$F$1:$F$1000,
MATCH(A2&"*",Sheet2!$F$1:$F$1000,0),1))),
"image not found")
This works for column C, but how can I fill this formula to the right so that column D contains the second image for each SKU, E contains the third, and so forth? I plan to have no more than six images for each SKU, so I have assigned columns C through H to product image URLs. If a SKU doesn't have six images, the extra columns should be empty.

Assuming use of Excel 2010 or later:
=IF(COLUMNS($A:A)>COUNTIF(Sheet2!$F$1:$F$1000,$A2&"*"),"",IFERROR(HYPERLINK(CONCATENATE(sku_url,INDEX(Sheet2!$F:$F,AGGREGATE(15,6,ROW(Sheet2!$F$1:$F$1000)/(LEFT(Sheet2!$F$1:$F$1000,LEN($A2))=$A2),COLUMNS($A:A))))),"image not found"))
By way of explanation, the initial IF clause, i.e.:
IF(COLUMNS($A:A)>COUNTIF(Sheet2!$F$1:$F$1000,$A2&"*"),""
is straightforward enough:
COUNTIF(Sheet2!$F$1:$F$1000,$A2&"*")
simply gives a count of the total number of rows which match that condition, and since:
COLUMNS($A:A)
is equal to 1 and becomes, when copied to the right, successively:
COLUMNS($A:B)
(which is equal to 2)
COLUMNS($A:C)
(which is equal to 3)
etc., etc., this clause will be equivalent to, in successive columns:
IF(1>COUNTIF(Sheet2!$F$1:$F$1000,$A2&"*"),""
IF(2>COUNTIF(Sheet2!$F$1:$F$1000,$A2&"*"),""
IF(3>COUNTIF(Sheet2!$F$1:$F$1000,$A2&"*"),""
etc., etc., and so a blank will be returned in cells where that initial clause is TRUE.
The only other clause of note is that which generates an array of successive row numbers for when this condition is met. Unfortunately the COUNTIF statement above is, for technical reasons, not employable within our AGGREGATE construction.
Fortunately we can reproduce the results of that COUNTIF statement using another set-up with LEFT.
Reducing the range in question temporarily from F1:F1000 to F1:F10 to aid the explanation, this part:
LEFT(Sheet2!$F$1:$F$10,LEN($A2))=$A2
will simply generate an array of Boolean TRUE/FALSE returns as to the result of that statement for each of the entries in F1:F10. We might have, for example:
{FALSE;TRUE;FALSE;TRUE;TRUE;FALSE;TRUE;FALSE;TRUE;FALSE}
When we then divide the corresponding row numbers for each of those entries by this array of Booleans, i.e. perform:
ROW(Sheet2!$F$1:$F$10)/(LEFT(Sheet2!$F$1:$F$10,LEN($A2))=$A2)
we have:
{1;2;3;4;5;6;7;8;9;10}/{FALSE;TRUE;FALSE;TRUE;TRUE;FALSE;TRUE;FALSE;TRUE;FALSE}
and since, when subjected to any suitable mathematical operation (of which division is one), Boolean TRUE/FALSE values are coerced into their numerical equivalents (TRUE=1, FALSE=0), the above becomes:
{#DIV/0!;2;#DIV/0!;4;5;#DIV/0!;7;#DIV/0!;9;#DIV/0!}
Since AGGREGATE, with a first parameter of 15 is instructed to find the smallest value within an array, and with a second parameter of 6 is instructed to ignore any error values within that array, all that is left is to set the fourth parameter within that function, k, which determines whether the first smallest, second smallest, etc. value should be returned.
Again, by using:
COLUMNS($A:A)
for this parameter, which we know will generate a series of consecutive integers (1, 2, 3, etc.) as copied to the right, we thus guarantee that we will return the required row number to each version of the formula.
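The row-selection logic described above can be modelled outside Excel. The following is only a rough Python sketch of what the AGGREGATE(15,6,ROW(...)/(LEFT(...)=...),k) construction and the COUNTIF guard do; the function name and the file names are made up for illustration, and the matching rows are placed at positions 2, 4, 5, 7 and 9 to mirror the example array above.

```python
def kth_matching_row(values, prefix, k):
    """Return the 1-based row of the k-th value that starts with prefix, or None."""
    # Models ROW(range)/(LEFT(range, LEN(prefix))=prefix) with errors ignored:
    # only rows whose text starts with the prefix survive.
    # Excel's "=" comparison on text is case-insensitive, hence lower().
    rows = [row for row, value in enumerate(values, start=1)
            if value[:len(prefix)].lower() == prefix.lower()]
    # Models the IF(COLUMNS($A:A)>COUNTIF(...),"" guard: no k-th match, so blank.
    if k > len(rows):
        return None
    # Models AGGREGATE(15, 6, ..., k): the k-th smallest surviving row number.
    return sorted(rows)[k - 1]


# Hypothetical file list standing in for Sheet2!F1:F10.
files = ["x.jpg", "sku1_a.jpg", "y.jpg", "sku1_b.jpg", "sku1_c.jpg",
         "z.jpg", "sku1_d.jpg", "w.jpg", "sku1_e.jpg", "v.jpg"]

# k plays the role of COLUMNS($A:A) as the formula is filled right.
print([kth_matching_row(files, "sku1", k) for k in range(1, 8)])
# [2, 4, 5, 7, 9, None, None]
```

Each successive k corresponds to one more column to the right, and the None entries correspond to the blank cells returned once the matches run out.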
Regards

Use the AGGREGATE¹ function with the SMALL sub-function (15). Adjust the k parameter to increase with COLUMN as you fill right.
The standard (non-array) formula in B2 is,
=IFERROR(
HYPERLINK(
CONCATENATE(sku_url, INDEX(Sheet2!$F:$F,
AGGREGATE(15, 6, ROW($1:$999)/(LEFT(Sheet2!$F$1:$F$999, LEN($A2))=$A2), COLUMN(A:A))))),
"image not found")
Fill right as necessary.
      
¹ The AGGREGATE function was introduced with Excel 2010. It is not available in earlier versions.

Related

Index/Match with IF Statement

As you can see, I have a database table on the left. I want to add an IF statement that allows me to look up the [Code], [Name] and [Amount] of the top 5 of Company A ONLY, then do a top 5 for Company B and so on. I have managed to look up the top 5 out of ALL companies but cannot seem to add a criterion to target a specific company.
Here are my formulas so far:
Formula in Column K [Company]: = INDEX(Database,MATCH(N3,sales,0),1)
Formula in Column L [Code]: = INDEX(Database,MATCH(N3,sales,0),2)
Formula in Column M [Name]: = INDEX(Database,MATCH(N3,sales,0),3)
Formula in Column N [Amount]: = LARGE(sales,ROW(1:20))
The intended result is to show the top 5 salespeople in each company along with their [Code], [Name] and [Amount]. Feel free to suggest any edits to the worksheet.
Here's an alternative if you know the code is unique. After putting A into K3:K7
First get the highest amounts for Company A starting in N3
=AGGREGATE(14,6,Database[Amount]/(Database[Company]=K3),ROWS(N$1:N1))
Then find the code which matches the amount, but only if it hasn't been used before (this assumes that the code is unique) starting in L3
=INDEX(Database[Code],MATCH(1,INDEX((Database[Company]="A")*(Database[Amount]=N3)*ISNA(MATCH(Database[Code],L$2:L2,0)),0),0))
Then find the matching name with a normal INDEX/MATCH starting in M3
=INDEX(Database[Name],MATCH(L3,Database[Code],0))
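For readers who want to check the expected output of these three formulas, here is a minimal Python sketch of the same idea: filter to one company, rank by amount, and resolve ties in table order (which is what the "not used before" test on the code achieves). The function name and sample rows are invented for illustration and are not the asker's data.

```python
def top_n_for_company(rows, company, n):
    """rows: list of (company, code, name, amount) tuples in table order.

    Returns the n rows with the largest amounts for one company.
    """
    # Models the Database[Company]=K3 filter.
    matching = [r for r in rows if r[0] == company]
    # Models AGGREGATE(14, 6, ..., k) for k = 1..n. Python's sort is stable,
    # so rows with equal amounts keep their table order, which mirrors
    # picking the first code that has not already been listed.
    return sorted(matching, key=lambda r: r[3], reverse=True)[:n]


# Hypothetical data: codes are unique, two salespeople tie on 700.
rows = [
    ("A", "c1", "Ann", 500),
    ("B", "c2", "Bob", 900),
    ("A", "c3", "Cy", 700),
    ("A", "c4", "Dee", 700),
    ("B", "c5", "Eve", 100),
    ("A", "c6", "Flo", 300),
]

for company, code, name, amount in top_n_for_company(rows, "A", 3):
    print(code, name, amount)
# c3 Cy 700
# c4 Dee 700
# c1 Ann 500
```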
Okay, I have achieved this with the use of a helper column which you can hide. Please note though that this will only work as long as there are no more than 9 identical totals for any one company. I don't think you should have that issue, but if it does occur, the digits being added by the helper column would need to be tweaked.
First Helper Column:
Adds a digit to the end of the total representing the number of times that amount already exists above for that company. This formula is =CONCATENATE([#Amount],COUNTIFS($A$1:A1,A2,$D$1:D1,D2))*1
This is multiplied by 1 to convert the text back to a number for LARGE to work with.
Second Helper Column:
This is an array formula and will need to be input by using Ctrl+Shift+Enter while still in the formula bar.
The formula for this one is:
=LARGE(IF(Company="A",Helper),ROW(1:1))
What this formula does as an array formula is produce a list of results based on the IF statement that LARGE can use. Rather than the entire column being ranked largest to smallest, we can now single out the rows that have company "A" like so:
=LARGE({20000;20001;20002;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;15000;14000;30000;FALSE;FALSE;FALSE;FALSE},ROW(1:1))
LARGE will only work with numeric values, so the FALSEs produced where column A does not match "A" will be ignored. This is why I have used the helper column here: it makes duplicate amounts distinct without affecting the order of the top 5.
ROW(1:1) has been used as this will automatically update when the formula is dragged down to produce the next highest result in this array.
The main formula for top 5 array
Again this is an Array formula so will need to be input by using Ctrl+Shift+Enter while still in the formula bar.
=INDEX(Database,SMALL(IF(Company="A",IF(Helper=$O3,ROW(Company))),1)-1,COLUMN(A:A))
With array formulas, for some unknown reason IF(AND()) just does not work for me, so I have nested two IFs instead.
Notice how I am again checking whether the first column matches "A" and then whether the last column matches the result of the second formula. What will happen is where both of these conditions match in the array (as in both produce TRUE for the same row) I wanted the row number to be returned.
IF({TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;TRUE;TRUE;TRUE;FALSE;FALSE;FALSE;FALSE},IF({FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;TRUE;FALSE;TRUE;FALSE;FALSE},{2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20}))
It looks like a mess I know, but the position where both TRUEs align gives us the row 16 as a result.
{FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;16;FALSE;FALSE;FALSE;FALSE}
As I know that there can only be one match possible for this, I use SMALL to grab the first smallest number to use in the INDEX formula for row and deduct 1 as we are not considering the headers in the INDEX formula so we actually want the 15th result.
Again, COLUMN(A:A) has been used for the column number to return as this will automatically update when the formula is dragged across.
If you are struggling with my explanation and want me to provide more clarity, feel free to reach out and I will try my best to explain the logic in more detail
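To make the helper-column idea easier to test, here is a small Python sketch of it. It assumes, as in the example array above, that Company A has three identical amounts of 2000 plus 1500, 1400 and 3000; the function names and the Company B row are invented for illustration.

```python
def helper_keys(rows):
    """rows: list of (company, amount) in table order.

    Models CONCATENATE(amount, COUNTIFS(...))*1: append the number of earlier
    rows with the same company and amount, then convert back to a number.
    """
    seen = {}
    keys = []
    for company, amount in rows:
        count = seen.get((company, amount), 0)
        # One appended digit only distinguishes a limited number of repeats,
        # which is the limitation mentioned in the answer.
        keys.append(int(f"{amount}{count}"))
        seen[(company, amount)] = count + 1
    return keys


def top_keys(rows, company, n):
    """Models LARGE(IF(Company=company, Helper), k) for k = 1..n."""
    keys = helper_keys(rows)
    own = [key for (comp, _), key in zip(rows, keys) if comp == company]
    return sorted(own, reverse=True)[:n]


rows = [("A", 2000), ("A", 2000), ("A", 2000), ("B", 2000),
        ("A", 1500), ("A", 1400), ("A", 3000)]

print(helper_keys(rows))
# [20000, 20001, 20002, 20000, 15000, 14000, 30000]
print(top_keys(rows, "A", 5))
# [30000, 20002, 20001, 20000, 15000]
```

Because every helper value is unique within a company, each one identifies exactly one row, which is what lets the final INDEX formula assume a single match.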

Returning multiple adjacent cell results from an min array which may include multiple duplicate values

I'm trying to set up a formula that will return the contents of a related cell (my related cell is on another sheet) for the smallest 2 results in an array. This is what I'm using right now.
=INDEX('Sheet1'!$A$40:'Sheet1'!$A$167,MATCH(SMALL(F1:F128,1),F1:F128,0),1)
And
=INDEX('Sheet1'!$A$40:'Sheet1'!$A$167,MATCH(SMALL(F1:F128,2),F1:F128,0),1)
The problem I've run into is twofold.
First, if there are multiple lowest results I get whichever one appears first in the array for both entries.
Second, if the second lowest result is duplicated but the first is not I get whichever one shows up on the list first, but any subsequent duplicates are ignored. I would like to be able to display the names associated with the duplicated scores.
You will have to adjust the k parameter of the SMALL function to raise the k according to duplicates. The COUNTIF function should be sufficient for this. Once all occurrences of the top two scores are retrieved, standard 'lookup multiple values' formulas can be applied. Retrieving successive row positions with the AGGREGATE¹ function and passing those into an INDEX of the names works well.
    
The formulas in H2:I2 are,
=IF(SMALL(F$40:F$167, ROW(1:1))<=SMALL(F$40:F$167, 1+COUNTIF(F$40:F$167, MIN(F$40:F$167))), SMALL(F$40:F$167, ROW(1:1)), "") '◄ H2
=IF(LEN(H40), INDEX(A$40:A$167, AGGREGATE(15, 6, ROW($1:$128)/(F$40:F$167=H40), COUNTIF(H$40:H40, H40))), "") '◄ I2
Fill down as necessary. The scores are designed to terminate after the last second place so it would be a good idea to fill down several rows more than is immediately necessary for future duplicates.
¹ The AGGREGATE function was introduced with Excel 2010². It is not available in earlier versions.
² Related article for pre-xl2010 functions - see Multiple Ranked Returns from INDEX().
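As a cross-check on what the two formulas are meant to produce, the following Python sketch lists every score up to and including the second-lowest distinct value, with the names in sheet order within each score. The function name and the sample data are hypothetical. One difference worth noting: if all scores are identical, the sketch returns them all, whereas the SMALL(..., 1+COUNTIF(...)) cutoff in the formula would have no value to find.

```python
def lowest_two_with_ties(names, scores):
    """Return (name, score) pairs for every score <= the second-lowest distinct score."""
    distinct = sorted(set(scores))
    # Models SMALL(F, 1+COUNTIF(F, MIN(F))): the first value above the minimum.
    cutoff = distinct[1] if len(distinct) > 1 else distinct[0]
    # Keep the original position so ties come out in sheet order, as the
    # AGGREGATE(15, 6, ROW/..., COUNTIF(...)) lookup does.
    kept = [(score, pos, name)
            for pos, (name, score) in enumerate(zip(names, scores))
            if score <= cutoff]
    return [(name, score) for score, pos, name in sorted(kept)]


names = ["Ann", "Bob", "Cy", "Dee", "Eve"]
scores = [3, 1, 2, 1, 2]

print(lowest_two_with_ties(names, scores))
# [('Bob', 1), ('Dee', 1), ('Cy', 2), ('Eve', 2)]
```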
The following formula will do what I think you want:
=IF(OR(ROW(1:1)=1,COUNTIF($E$1:$E1,INDEX(Sheet1!$A$40:$A$167,MATCH(SMALL($F$1:$F$128,ROW(1:1)),$F$1:$F$128,0)))>0,ROW(1:1)=2),INDEX(Sheet1!$A$40:$A$167,MATCH(1,INDEX(($F$1:$F$128=SMALL($F$1:$F$128,ROW(1:1)))*(COUNTIF($E$1:$E1,Sheet1!$A$40:$A$167)=0),),0)),"")
NOTE:
This is an array formula and must be confirmed with Ctrl-Shift-Enter.
There are two references to $E$1:$E1. This formula assumes that it will be entered in E2 and copied down. If it is going in a different column, change these two references. It must go in the second row or it will throw a circular reference.
What it will do
If there is a tie for first place it will only list those teams that are tied for first.
If there is only one first place but multiple tied for second places it will list all those in second.
So make sure you copy the formula down far enough to cover all possible ties. It will put "" in any that do not fill, so err on the high side.
To get the Scores use this simple formula, I put mine in Column F:
=IF(E2<>"",SMALL($F$1:$F$128,ROW(1:1)),"")
Again change the E reference to the column you use for the output.

Excel table lookup matching values of two columns

I'd like to create a table lookup formula that matches two columns. For instance, suppose I'd like to find the value of the Letter column at the row where the Type column is Biennial and the Result column is Warning.
    A        B          C
1   Letter   Type       Result
2   A        Annual     Exceeds
3   B        Biennial   Warning
4   C        Biennial   DevelopmentNeeded
5   D        Biennial   PartiallyMeets
6   E        Annual     Meets
What would the formula look like to accomplish this?
The SUMPRODUCT() formula is really apt for situations where you want to lookup a value with multiple criteria. It is most convenient when wanting to look up numeric values, but it can be adjusted to look up string values as well. As a bonus, you can avoid having to use array formulas.
This particular problem can be tackled with the following formula (indentation added for legibility, which you can do in Excel formulas using ALT + ENTER):
=INDEX(
$A$2:$A$6,
SUMPRODUCT(
($B$2:$B$6 = "Biennial") *
($C$2:$C$6 = "Warning") *
ROW($A$2:$A$6)
) - 1
)
First, SUMPRODUCT() is used to filter out the proper rows using ($B$2:$B$6 = "Biennial") and ($C$2:$C$6 = "Warning"); the multiplication operator * functions as an AND operator (the + operator would function as an OR operator).
Then the result is multiplied by ROW($A$2:$A$6) to find the particular row that has the combination. SUMPRODUCT() then adds everything up, which in this case gives us 3, the worksheet row of the match. Because INDEX() counts positions within $A$2:$A$6, which starts one row below the column headings, the value sought is at position 2, so we subtract 1. By applying the INDEX() function, we get the desired result: B.
Beware though that this is the case if and only if the combination sought is unique. If the combination sought exists more than once, this will break down.
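The arithmetic, and the reason a duplicate combination breaks it, can be seen in a short Python sketch. It uses the table from the question; the function name and the duplicated-data variant are invented for illustration.

```python
letters = ["A", "B", "C", "D", "E"]
types = ["Annual", "Biennial", "Biennial", "Biennial", "Annual"]
results = ["Exceeds", "Warning", "DevelopmentNeeded", "PartiallyMeets", "Meets"]


def sumproduct_row(types, results, want_type, want_result, first_row=2):
    """Models SUMPRODUCT((B=type)*(C=result)*ROW(A2:A6))."""
    # Python booleans multiply as 1/0, like Excel's coerced TRUE/FALSE,
    # so each row contributes its row number only when both tests pass.
    return sum((t == want_type) * (r == want_result) * row
               for row, (t, r) in enumerate(zip(types, results), start=first_row))


row = sumproduct_row(types, results, "Biennial", "Warning")
position = row - 1                 # the "- 1" in the formula (INDEX is 1-based)
print(row, letters[position - 1])  # 3 B

# Hypothetical variant where rows 3 and 4 both match: the row numbers are
# summed (3 + 4), and 7 is not the row of either match.
dup_results = ["Exceeds", "Warning", "Warning", "PartiallyMeets", "Meets"]
print(sumproduct_row(types, dup_results, "Biennial", "Warning"))  # 7
```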
Another method that avoids array entry is:
=INDEX($A$2:$A$6,MATCH(2,index(1/(($B$2:$B$6="Biennial")*($C$2:$C$6="Warning")),0)))
It exploits the fact that the match function ignores certain errors and that index manages arrays naturally.
You can use an array formula if you like:
=INDEX($A$2:$A$6,MATCH(1,($B$2:$B$6="Biennial")*($C$2:$C$6="Warning"),0))
Enter in with Ctrl+Shift+Enter
If you want to do this without array formulas, one way you could do it is by creating a helper column.
Column D to have the formula:
=B2&C2
Copied down
Then the new formula could be:
=INDEX($A$2:$A$6,MATCH("BiennialWarning",$D$2:$D$6,0))
It's just a play on the text, really.

Excel arrays count totals using criteria from multiple ranges (or sheets)

What I would like to do is count the number of rows that match criteria to be verified in two arrays.
I can't use VBA or add new columns (for instance a new column with a VLOOKUP formula), and I would prefer to use array formulas.
I have two separate ranges, each with an ID column for the identifier and other fields with data.
For instance, range 1:
Range 2:
If I had only to check the first range I would do:
={SUM((D4:D7="Red") * (E4:E7="Big"))}
But I don't know how to check also using data from the other range.
How, for example, to count the number of items that are Red, Big and Round by using both Ranges ?
Put this in the cell F4:
=IF((VLOOKUP(C4,$C$11:$D$12,2)="Round")*(D4="Red")*(E4="Big"),1,"")
Note that VLOOKUP without a fourth argument performs an approximate match: it returns the row for the largest ID that is less than or equal to the lookup value (so the IDs must be sorted in ascending order). Since there's no 1 in your second dataset, this first cell is going to show "#N/A", which I don't know how to solve, but when you extend this formula down to also compare the other sample data in the first set, the rows for ID numbers 2 and 4 will show 1.
Edit: You wanted a count of this list. So after this, it should be easy to get a count of cells in this column using the COUNT function.
Try this array formula
=SUM((D4:D7="Red")*(E4:E7="Big")*ISNUMBER(MATCH(C4:C7,IF(D12:D13="Round",C12:C13),0)))
The last part is the added criterion you want - the IF function returns {2,4} [IDs where Data 3 is "Round"] and then you can use MATCH to compare C4:C7 against that. If there is a match you get a NUMBER (instead of #N/A) so you can then use ISNUMBER to get TRUE/FALSE and that feeds in to your original formula - result should be 2
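The same cross-range count can be sketched in Python. The original sample ranges were not preserved here, so the data below is hypothetical, chosen only so that IDs 2 and 4 are "Round" and the count comes out as 2, as stated above; the function name is also invented.

```python
def count_matching(items, shapes):
    """items: (id, colour, size) rows from range 1; shapes: (id, shape) rows from range 2."""
    # Models IF(D12:D13="Round", C12:C13): the IDs whose shape is Round.
    round_ids = {id_ for id_, shape in shapes if shape == "Round"}
    # Models SUM((colour="Red")*(size="Big")*ISNUMBER(MATCH(id, round_ids, 0))).
    return sum(1 for id_, colour, size in items
               if colour == "Red" and size == "Big" and id_ in round_ids)


# Hypothetical stand-ins for the two ranges.
items = [(1, "Red", "Big"), (2, "Red", "Big"), (3, "Blue", "Small"), (4, "Red", "Big")]
shapes = [(2, "Round"), (4, "Round")]

print(count_matching(items, shapes))  # 2
```

ID 1 is Red and Big but has no entry in the second range, so it is simply not counted; this corresponds to MATCH returning #N/A and ISNUMBER turning that into FALSE.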

How do you extract a subarray from an array in a worksheet function?

Is there some way of getting an array in Excel of a smaller size than a starting array in a cell worksheet function?
So if I had:
{23, "", 34, 46, "", "16"}
I'd end up with:
{23, 34, 46, 16}
which I could then manipulate with some other function.
Conclusion: If I was to do a lot of these I would definitely use jtolle's UDF comb solution. The formula that PPC uses is close, but diving in and testing, I found it gives errors in the empty slots, misses the first value, and there is an easier way to get the row numbers, so here is my final solution:
=IFERROR(INDEX($A$1:$A$6, SMALL(IF(($A$1:$A$6<>""),ROW($A$1:$A$6)),ROW(1:6))),"")
Which must be entered as an array formula (CTRL-SHIFT-ENTER). If being displayed then it must be entered in at least an area as big as the resultset to show all results.
If all you want to do is grab a subset of an array, and you already know the positions of the elements you want, you can just use INDEX with an array for the index argument. That is:
=INDEX({11,22,33,44,55},{2,3,5})
returns {22,33,55}. But that's usually not very useful because you don't know the positions, and I don't know any way to get them without a UDF.
What I have done for this kind of in-worksheet array filtration is to write a UDF with the following form:
'Filters an input sequence based on a second "comb" sequence.
'Non-False-equivalent, non-error values in the comb represent the positions of elements
'to be kept.
Public Function combSeq(seqToComb, seqOfCombValues)
'various library calls to work with 1xn or nx1 arrays or ranges as well as 1-D arrays
'iterate the "comb" and collect positions of keeper elements
'create a new array of the right length and copy in the keeper elements
End Function
I only posted pseudocode because my actual code is all calls to library functions, including the collect-positions and copy-from-positions operations. It would probably obscure the basic idea, which is pretty simple.
You'd call such a UDF like so:
=combSeq({23, "", 34, 46, "", "16"}, {23, "", 34, 46, "", "16"} <> "")
or
=combSeq(Q1:Q42, SIN(Z1:Z42) > 0.5)
and use Excel's normal array mechanics to generate the "comb". It's a lightweight, Excel-friendly way to get a lot of the benefits of the more standard filter(list-to-filter, test-function) function you might see in other programming systems.
I use the name "comb" because "filter" usually means "filter with this function", and with Excel you have to apply the test function before calling the filtration function. Also it can be useful to compute one "comb" as an intermediate result and then use it to...er, comb...multiple lists.
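To illustrate the "comb" idea outside VBA, here is a minimal Python sketch. It is not the author's UDF (which was not posted); it only shows the general shape, using Python truthiness as a stand-in for "non-False-equivalent" and leaving error values unmodelled.

```python
def comb_seq(seq_to_comb, comb):
    """Keep the elements of seq_to_comb whose matching comb entry is truthy."""
    if len(seq_to_comb) != len(comb):
        raise ValueError("sequence and comb must have the same length")
    return [item for item, keep in zip(seq_to_comb, comb) if keep]


data = [23, "", 34, 46, "", "16"]

# The comb is computed first, like the {...} <> "" array expression in Excel.
comb = [value != "" for value in data]
print(comb_seq(data, comb))
# [23, 34, 46, '16']

# The same comb can then be reused on a second, parallel list.
labels = ["a", "b", "c", "d", "e", "f"]
print(comb_seq(labels, comb))
# ['a', 'c', 'd', 'f']
```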
There is an answer on this site: http://www.mrexcel.com/forum/showthread.php?t=112002. Not much explanation though.
Assuming you have data with blank cells on column A and you put this in column B; that will retrieve data in the same order skipping the blanks
=INDEX( $A$1:$A$6,
SMALL(
IF(
($A$2:$A$6<>""),
ROW($A$2:$A$6)
),
ROW()-ROW($B$1)
)
)
Here is the explanation:
ROW()-ROW($B$1) is just a trick that will give you an incrementing number when the formula starts in B2, just below the header row (i.e. 1 in B2, 2 in B3...)
IF (... , ROW($A$2:$A$6) ) is the main part of the trick: it builds an array of the row numbers where the IF condition is true (note that the IF has no 'else' value)
SMALL(..) will return the Xth smallest value of that array (in our case the number of the Xth nonblank row), where X is that incrementing number (1 in B2 ...)
INDEX will then translate from the row number to its value
Note that INDEX and ROW start one row above the actual table to always have an offset > 0 (INDEX does not like zeros)
The above answers all give brittle formulas that cannot be moved to different locations on the sheet and are very sensitive to inserted rows and columns.
Here is a version that is not sensitive and can be moved around to any row:
=INDEX($A$10:$A$39, SMALL(IF(B$10:B$39,ROW(INDIRECT("1:30"))),ROW(INDIRECT("1:30"))))
In this example the original array values are placed in $A$10:$A$39 (perhaps by using the array formula {TRANSPOSE(originalArray)} if the original data was a row instead of a column).
Column B$10:B$39 contains boolean flags (TRUE or FALSE) that determine if this array element should be preserved in the result (TRUE) or not (FALSE). You can populate this column using any function you want. To create the test mentioned in the OP, <>"", B$10 should be filled with: =A10<>"" (and then copied down through B$39). Column A has absolute column references and column B has relative column references, so the formula can be copied over into columns further to the right, allowing you to create other types of attributes and sub-arrays, which will be governed by boolean tests you put in columns C and D etc.
This example will handle an original array of up to 30 elements. For a larger array, adjust the ranges $A$10:$A$39 and B$10:B$39 (which represent 30 rows) and also adjust the two occurrences of "1:30" to suit.
A possible worksheet function solution:
=INDEX(A1:A6,N(IF(1,MODE.MULT(IF(A1:A6<>"",ROW(1:6)*{1,1})))))
The MODE.MULT function returns a reduced array of indices and N(IF(1,.)) is inserted so that the array is passed by-reference to the INDEX function.
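The reason MODE.MULT yields the non-blank positions may not be obvious: multiplying by {1,1} writes every surviving position twice, so each of them is a mode, while the FALSE entries are ignored. A rough Python sketch of that step, using statistics.multimode as a stand-in for MODE.MULT (the function name is invented for illustration):

```python
from statistics import multimode


def nonblank_positions(values):
    """Models MODE.MULT(IF(values<>"", ROW(1:n)*{1,1}))."""
    doubled = []
    for pos, value in enumerate(values, start=1):
        if value != "":
            # ROW(1:n)*{1,1}: the position appears once in each of two columns.
            doubled.extend([pos, pos])
        # Blank cells give FALSE, which MODE.MULT ignores, so nothing is added.
    # Every surviving position occurs exactly twice, so all of them are modes.
    return multimode(doubled)


data = [23, "", 34, 46, "", "16"]
positions = nonblank_positions(data)
print(positions)                         # [1, 3, 4, 6]
print([data[p - 1] for p in positions])  # [23, 34, 46, '16']
```

The last line plays the role of the outer INDEX, which turns the reduced list of positions back into values.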

Resources